H. Akaike, A new look at the statistical model identification. Automatic Control, IEEE Transactions on, vol.19, issue.6, pp.716-723, 1974.

J. L. Andrews and P. D. Mcnicholas, Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions, Statistics and Computing, vol.1, issue.4, pp.1021-1029, 2012.
DOI : 10.1007/s11222-011-9272-x

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2001.
DOI : 10.1109/34.865189

N. Bouguila, D. Ziou, and J. Vaillancourt, Novel Mixtures Based on the Dirichlet Distribution: Application to Data and Image Classification, Machine Learning and Data Mining in Pattern Recognition, pp.172-181, 2003.
DOI : 10.1007/3-540-45065-3_15

C. Bouveyron and C. Brunet, Simultaneous model-based clustering and visualization in the Fisher discriminative subspace, Statistics and Computing, vol.20, issue.2, pp.301-324, 2012.
DOI : 10.1007/s11222-011-9249-9

URL : https://hal.archives-ouvertes.fr/hal-00492406

C. Bouveyron and C. Brunet-saumard, Model-based clustering of high-dimensional data: A review, Computational Statistics & Data Analysis, vol.71, pp.52-78, 2013.
DOI : 10.1016/j.csda.2012.12.008

URL : https://hal.archives-ouvertes.fr/hal-00750909

C. Bouveyron and S. Girard, Robust supervised classification with mixture models: Learning from data with uncertain labels, Pattern Recognition, vol.42, issue.11, pp.2649-2658, 2009.
DOI : 10.1016/j.patcog.2009.03.027

URL : https://hal.archives-ouvertes.fr/hal-00325263

C. Bouveyron and J. Jacques, Model-based clustering of time series in group-specific functional subspaces Advances in Data Analysis and Classification, pp.281-300, 2011.

C. Bouveyron, S. Girard, and C. Schmid, High-dimensional discriminant analysis Communication in Statistics: Theory and Methods, pp.2607-2623, 2007.

C. Bouveyron, S. Girard, and C. Schmid, High-dimensional data clustering, Computational Statistics & Data Analysis, vol.52, issue.1, pp.502-519, 2007.
DOI : 10.1016/j.csda.2007.02.009

URL : https://hal.archives-ouvertes.fr/inria-00548573

S. Canu, Y. Grandvalet, V. Guigue, and A. Rakotomamonjy, SVM and kernel methods matlab toolbox, Perception Systemes et Information, 2005.

A. Caponnetto, C. A. Micchelli, M. Pontil, and Y. Ying, Universal multi-task kernels, Journal of Machine Learning Research, vol.68, pp.1615-1646, 2008.

G. Celeux and G. Govaert, Clustering criteria for discrete data and latent class models, Journal of Classification, vol.4, issue.4, pp.157-176, 1991.
DOI : 10.1007/BF02616237

URL : https://hal.archives-ouvertes.fr/inria-00075437

O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning, 2006.
DOI : 10.7551/mitpress/9780262033589.001.0001

J. Couto, Kernel K-Means for Categorical Data, Advances in Intelligent Data Analysis VI, pp.739-739, 2005.
DOI : 10.1007/11552253_5

M. Cuturi and J. P. Vert, The context-tree kernel for strings, Neural Networks, vol.18, issue.8, pp.1111-1123, 2005.
DOI : 10.1016/j.neunet.2005.07.010

URL : https://hal.archives-ouvertes.fr/hal-00433583

A. Dempster, N. Laird, and D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol.39, issue.1, pp.1-38, 1977.

M. M. Dundar and D. A. Landgrebe, Toward an Optimal Supervised Classifier for the Analysis of Hyperspectral Data, IEEE Transactions on Geoscience and Remote Sensing, vol.42, issue.1, pp.271-277, 2004.
DOI : 10.1109/TGRS.2003.817813

T. Evgeniou, C. A. Micchelli, and M. Pontil, Learning multiple tasks with kernel methods, Journal of Machine Learning Research, vol.6, pp.615-637, 2005.

R. A. Fisher, THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS, Annals of Eugenics, vol.59, issue.2, pp.179-188, 1936.
DOI : 10.1111/j.1469-1809.1936.tb02137.x

F. Forbes and D. Wraith, A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering, Statistics and Computing, vol.94, issue.1, 2014.
DOI : 10.1007/s11222-013-9414-4

B. C. Franczak, R. P. Browne, and P. D. Mcnicholas, Mixtures of Shifted AsymmetricLaplace Distributions, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.6, pp.1149-1157, 2014.
DOI : 10.1109/TPAMI.2013.216

M. Girolami, Mercer kernel-based clustering in feature space, IEEE Transactions on Neural Networks, vol.13, issue.3, pp.780-784, 2002.
DOI : 10.1109/TNN.2002.1000150

M. Gönen and E. Alpaydin, Multiple kernel learning algorithms, Journal of Machine Learning Research, vol.12, pp.2211-2268, 2011.

T. Hofmann, B. Schölkopf, and A. Smola, Kernel methods in machine learning. The Annals of Statistics, pp.1171-1220, 2008.

H. Kadri, A. Rakotomamonjy, F. Bach, and P. Preux, Multiple Operator-valued Kernel Learning, Neural Information Processing Systems (NIPS), pp.1172-1080, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00677012

M. Kuss and C. Rasmussen, Assessing approximate inference for binary Gaussian process classification, Journal of Machine Learning Research, vol.6, pp.1679-1704, 2005.

S. Lee and G. J. Mclachlan, Finite mixtures of multivariate skew t-distributions: some recent and new results, Statistics and Computing, vol.82, issue.4, pp.181-202, 2013.
DOI : 10.1007/s11222-012-9362-4

R. Lehoucq and D. Sorensen, Deflation Techniques for an Implicitly Restarted Arnoldi Iteration, SIAM Journal on Matrix Analysis and Applications, vol.17, issue.4, pp.789-821, 1996.
DOI : 10.1137/S0895479895281484

T. I. Lin, Robust mixture modeling using multivariate skew t??distributions, Statistics and Computing, vol.14, issue.3, pp.343-356, 2010.
DOI : 10.1007/s11222-009-9128-9

T. I. Lin, J. C. Lee, and W. J. Hsieh, Robust mixture modeling using the skew t distribution, Statistics and Computing, vol.14, issue.2, pp.81-92, 2007.
DOI : 10.1007/s11222-006-9005-8

P. Mahé and J. P. Vert, Graph kernels based on tree patterns for molecules, Machine Learning, vol.21, issue.Suppl.??1, pp.3-35, 2009.
DOI : 10.1007/s10994-008-5086-2

G. Mclachlan, Discriminant Analysis and Statistical Pattern Recognition, 1992.
DOI : 10.1002/0471725293

G. Mclachlan, D. Peel, and R. Bean, Modelling high-dimensional data by mixtures of factor analyzers, Computational Statistics & Data Analysis, vol.41, issue.3-4, pp.379-388, 2003.
DOI : 10.1016/S0167-9473(02)00183-4

P. Mcnicholas and B. Murphy, Parsimonious Gaussian mixture models, Statistics and Computing, vol.61, issue.3, pp.285-296, 2008.
DOI : 10.1007/s11222-008-9056-0

S. Mika, G. Ratsch, J. Weston, B. Schölkopf, and K. R. Müllers, Fisher discriminant analysis with kernels, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No.98TH8468), pp.41-48, 1999.
DOI : 10.1109/NNSP.1999.788121

T. Minka, Expectation propagation for approximate bayesian inference, Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence, pp.362-369, 2001.

A. Montanari and C. Viroli, Heteroscedastic factor mixture analysis. Statistical Modeling: An International journal, pp.441-460, 2010.

T. B. Murphy, N. Dean, and A. E. Raftery, Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications, The Annals of Applied Statistics, vol.4, issue.1, pp.219-223, 2010.
DOI : 10.1214/09-AOAS279SUPP

A. Murua and N. Wicker, Kernel-based mixture models for classification, Computational Statistics, vol.42, issue.7, 2014.
DOI : 10.1007/s00180-014-0535-9

E. Pekalska and B. Haasdonk, Kernel Discriminant Analysis for Positive Definite and Indefinite Kernels, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.6, pp.1017-1032, 2009.
DOI : 10.1109/TPAMI.2008.290

J. O. Ramsay and B. W. Silverman, Functional Data Analysis. Springer Series in Statistics, Biometrical Journal, vol.40, issue.1, 2005.
DOI : 10.1002/(SICI)1521-4036(199804)40:1<56::AID-BIMJ56>3.0.CO;2-#

C. Rasmussen and C. Williams, Gaussian processes for machine learning matlab toolbox, 2006.

C. Rasmussen and C. Williams, Gaussian Processes in Machine Learning, 2006.
DOI : 10.1162/089976602317250933

B. Scholkopf and A. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization , and Beyond, 2001.

B. Schölkopf, A. Smola, and K. Müller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.20, issue.5, pp.1299-1319, 1998.
DOI : 10.1007/BF02281970

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

J. Shawe-taylor and N. Cristianini, Kernel Methods for Pattern Analysis, 2004.
DOI : 10.1017/CBO9780511809682

G. R. Shorack and J. A. Wellner, Empirical Processes with Applications to Statistics, 1986.
DOI : 10.1137/1.9780898719017

A. Smola and R. Kondor, Kernels and Regularization on Graphs, Proc. Conf. on Learning Theory and Kernel Machines, pp.144-158, 2003.
DOI : 10.1007/978-3-540-45167-9_12

J. Wang, J. Lee, and C. Zhang, Kernel Trick Embedded Gaussian Mixture Model, Proceedings of the 14th international conference on algorithmic learning theory, pp.159-174, 2003.
DOI : 10.1007/978-3-540-39624-6_14

Z. Xu, K. Huang, J. Zhu, I. King, and M. R. Lyu, A novel kernel-based maximum a posteriori classification method, Neural Networks, vol.22, issue.7, pp.977-987, 2009.
DOI : 10.1016/j.neunet.2008.11.005