H. Akaike, Information theory and an extension of the maximum likelihood principle, 2nd International Symposium on Information Theory, pp.267-281, 1973.

P. Alquier and K. Lounici, PAC-Bayesian bounds for sparse regression estimation with exponential weights, Electronic Journal of Statistics, vol.5, pp.127-145, 2011.
DOI : 10.1214/11-EJS601

URL : https://hal.archives-ouvertes.fr/hal-00465801

F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with sparsity-inducing penalties, Foundations and Trends in Machine Learning, vol.4, issue.1, pp.1-106, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00613125

F. R. Bach, Bolasso: model consistent Lasso estimation through the bootstrap, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.33-40, 2008.
DOI : 10.1145/1390156.1390161

URL : https://hal.archives-ouvertes.fr/hal-00271289

M. Baragatti and D. Pommeret, A study of variable selection using g-prior distribution with ridge parameter, Computational Statistics & Data Analysis, vol.56, issue.6, pp.1920-1934, 2012.
DOI : 10.1016/j.csda.2011.11.017

URL : https://hal.archives-ouvertes.fr/hal-01293963

C. M. Bishop, Pattern recognition and machine learning, 2006.

C. Bouveyron, E. Côme, and J. Jacques, The discriminative functional mixture model for a comparative analysis of bike sharing systems, The Annals of Applied Statistics, vol.9, issue.4, 2015.
DOI : 10.1214/15-AOAS861

URL : https://hal.archives-ouvertes.fr/hal-01024186

L. Breiman and J. H. Friedman, Estimating Optimal Transformations for Multiple Regression and Correlation, Journal of the American Statistical Association, vol.80, issue.391, pp.580-598, 1985.
DOI : 10.1080/01621459.1985.10478157

URL : http://www.dtic.mil/get-tr-doc/pdf?AD=ADA123908

F. Bunea, A. B. Tsybakov, and M. H. Wegkamp, Aggregation for Gaussian regression, The Annals of Statistics, vol.35, issue.4, pp.1674-1697, 2007.
DOI : 10.1214/009053606000001587

URL : http://arxiv.org/abs/0710.3654

R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, A Limited Memory Algorithm for Bound Constrained Optimization, SIAM Journal on Scientific Computing, vol.16, issue.5, pp.1190-1208, 1995.
DOI : 10.1137/0916069

E. Candès, Mathematics of sparsity (and a few other things), Proceedings of the International Congress of Mathematicians, 2014.

E. Candès and T. Tao, The Dantzig selector: Statistical estimation when p is much larger than n, The Annals of Statistics, vol.35, issue.6, pp.2313-2351, 2007.

S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic Decomposition by Basis Pursuit, SIAM Journal on Scientific Computing, vol.20, issue.1, pp.33-61, 1998.
DOI : 10.1137/S1064827596304010

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, The Annals of Statistics, vol.32, issue.2, pp.407-499, 2004.

W. J. Fu, Penalized regressions: the bridge versus the lasso, Journal of Computational and Graphical Statistics, vol.7, issue.3, pp.397-416, 1998.

E. I. George and D. P. Foster, Calibration and empirical Bayes variable selection, Biometrika, vol.87, issue.4, pp.731-747, 2000.
DOI : 10.1093/biomet/87.4.731

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

E. I. George and R. E. McCulloch, Variable Selection via Gibbs Sampling, Journal of the American Statistical Association, vol.88, issue.423, pp.881-889, 1993.
DOI : 10.1080/01621459.1993.10476353

Y. Grandvalet, J. Chiquet, and C. Ambroise, Sparsity by worst-case quadratic penalties

I. Guyon, A. Saffari, G. Dror, and G. Cawley, Model selection: Beyond the Bayesian/frequentist divide, The Journal of Machine Learning Research, vol.11, pp.61-87, 2010.

D. Hernández-Lobato, J. M. Hernández-Lobato, and P. Dupont, Generalized spike-and-slab priors for Bayesian group feature selection using expectation propagation, The Journal of Machine Learning Research, vol.14, issue.1, pp.1891-1945, 2013.

H. Ishwaran and J. S. Rao, Spike and Slab Gene Selection for Multigroup Microarray Data, Journal of the American Statistical Association, vol.100, issue.471, pp.764-780, 2005.
DOI : 10.1198/016214505000000051

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

H. Ishwaran and J. S. Rao, Spike and slab variable selection: frequentist and Bayesian strategies, The Annals of Statistics, vol.33, issue.2, pp.730-773, 2005.
DOI : 10.1214/009053604000001147

URL : http://arxiv.org/abs/math/0505633

H. Ishwaran, U. Kogalur, and J. S. Rao, spikeslab: Prediction and variable selection using spike and slab regression, R Journal, vol.2, issue.2, 2010.

N. Kraemer, J. Schaefer, and A. Boulesteix, Regularized estimation of large-scale gene regulatory networks using Gaussian graphical models, BMC Bioinformatics, vol.10, p.384, 2009.

F. Liang, R. Paulo, G. Molina, M. A. Clyde, and J. O. Berger, Mixtures of g-priors for Bayesian variable selection, Journal of the American Statistical Association, vol.103, issue.481, 2008.

D. J. MacKay, Bayesian Interpolation, Neural Computation, vol.4, issue.3, pp.415-447, 1992.
DOI : 10.1162/neco.1992.4.3.415

D. J. MacKay, Comparison of Approximate Methods for Handling Hyperparameters, Neural Computation, vol.11, issue.5, pp.1035-1068, 1999.
DOI : 10.1162/089976699300016331

T. L. Markham, Oppenheim's Inequality for Positive Definite Matrices, The American Mathematical Monthly, vol.93, issue.8, pp.642-644, 1986.
DOI : 10.2307/2322329

G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions

N. Meinshausen and P. Bühlmann, Stability selection, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.72, issue.4, pp.417-473, 2010.
DOI : 10.1111/j.1467-9868.2010.00740.x

T. J. Mitchell and J. J. Beauchamp, Bayesian Variable Selection in Linear Regression, Journal of the American Statistical Association, vol.83, issue.404, pp.1023-1036, 1988.
DOI : 10.1080/01621459.1988.10478694

B. K. Natarajan, Sparse Approximate Solutions to Linear Systems, SIAM Journal on Computing, vol.24, issue.2, pp.227-234, 1995.
DOI : 10.1137/S0097539792240406

A. Njato-randriamanamihaga, E. Côme, L. Oukhellou, and G. Govaert, Clustering the Vélib' dynamic origin/destination flows using a family of Poisson mixture models, Neurocomputing, 2014.

R. B. O'Hara and M. J. Sillanpää, A review of Bayesian variable selection methods: what, how and which, Bayesian Analysis, vol.4, issue.1, pp.85-117, 2009.
DOI : 10.1214/09-BA403

A. Oppenheim, Inequalities Connected with Definite Hermitian Forms, Journal of the London Mathematical Society, vol.s1-5, issue.2, pp.114-119, 1930.
DOI : 10.1112/jlms/s1-5.2.114

URL : http://jlms.oxfordjournals.org/cgi/content/short/s1-5/2/114

S. Petrone, J. Rousseau, and C. Scricciolo, Bayes and empirical Bayes: do they merge?, Biometrika, vol.101, issue.2, 2014.
DOI : 10.1093/biomet/ast067

URL : https://hal.archives-ouvertes.fr/hal-00767467

B. M. Pötscher and H. Leeb, On the Distribution of Penalized Maximum Likelihood Estimators: The LASSO, SCAD, and Thresholding, Journal of Multivariate Analysis, vol.100, issue.9, pp.2065-2082, 2009.
DOI : 10.2139/ssrn.1027629

P. Rigollet and A. Tsybakov, Exponential screening and optimal rates of sparse estimation, The Annals of Statistics, vol.39, issue.2, pp.731-771, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00606059

C. P. Robert and G. Casella, Monte Carlo statistical methods, 2004.

V. Ročková and E. I. George, EMVS: The EM Approach to Bayesian Variable Selection, Journal of the American Statistical Association, vol.109, issue.506, pp.828-846, 2014.
DOI : 10.1080/01621459.2013.869223

T. E. Scheetz, K. A. Kim, R. E. Swiderski, A. R. Philp, T. Braun et al., Regulation of gene expression in the mammalian eye and its relevance to eye disease, Proceedings of the National Academy of Sciences, vol.103, issue.39, pp.14429-14434, 2006.
DOI : 10.1073/pnas.0602562103

J. G. Scott and J. O. Berger, Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem, The Annals of Statistics, vol.38, issue.5, pp.2587-2619, 2010.
DOI : 10.1214/10-AOS792

T. Skeggs, Special report, visitor figures 2013, The Art Newspaper, 2014.

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B. (Statistical Methodology), vol.58, issue.1, pp.267-288, 1996.

M. E. Tipping, Sparse Bayesian learning and the relevance vector machine, The Journal of Machine Learning Research, vol.1, pp.211-244, 2001.

D. P. Wipf and S. S. Nagarajan, A new view of automatic relevance determination, Advances in neural information processing systems, pp.1625-1632, 2008.

C. F. Wu, On the convergence properties of the EM algorithm, The Annals of Statistics, vol.11, issue.1, pp.95-103, 1983.

L. Xu and M. I. Jordan, On Convergence Properties of the EM Algorithm for Gaussian Mixtures, Neural Computation, vol.8, issue.1, pp.129-151, 1996.
DOI : 10.1162/neco.1996.8.1.129

L. Yengo, J. Jacques, and C. Biernacki, Variable clustering in high dimensional linear regression models, Journal de la Société Française de Statistique, pp.38-56, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00764927

L. Yengo, J. Jacques, C. Biernacki, and M. Canouil, Variable clustering in high-dimensional linear regression: The R package clere, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00940929

P. Zhao and B. Yu, On model selection consistency of lasso, The Journal of Machine Learning Research, vol.7, pp.2541-2563, 2006.

H. Zou, The Adaptive Lasso and Its Oracle Properties, Journal of the American Statistical Association, vol.101, issue.476, pp.1418-1429, 2006.
DOI : 10.1198/016214506000000735

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005.
DOI : 10.1111/j.1467-9868.2005.00503.x

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=