a1 Institut de Mathématiques de Toulouse, INSA de Toulouse, Université de Toulouse, 135 avenue de Rangueil, 31077 Toulouse Cedex 4, France; firstname.lastname@example.org
a2 Laboratoire de Statistique Théorique et Appliquée, Université Paris 6, 175 rue du Chevaleret, 75013 Paris, France; email@example.com
Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23 (2003)] is used to obtain the penalty function form. This theorem requires to control the bracketing entropy of Gaussian mixture families. The ordered and non-ordered variable selection cases are both addressed in this paper.
(Received September 19 2008)
(Revised February 13 2009)
(Online publication January 5 2012)
Mathematics Subject Classification: