Issue | ESAIM: PS, Volume 17, 2013 |
---|---|
Page(s) | 650 - 671 |
DOI | https://doi.org/10.1051/ps/2012016 |
Published online | 04 November 2013 |
An ℓ1-oracle inequality for the Lasso in finite mixture Gaussian regression models
Laboratoire de Mathématiques, Faculté des Sciences d’Orsay, Université Paris-Sud, 91405 Orsay, France
caroline.meynet@math.u-psud.fr
Received: 7 January 2012
Revised: 17 July 2012
We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We shall provide an ℓ1-oracle inequality satisfied by this Lasso estimator with the Kullback–Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso to obtain such an oracle inequality. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ1-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality shall be deduced from a finite mixture Gaussian regression model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation, which is inspired by Vapnik’s method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart in [11].
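For orientation, the kind of estimator described above can be sketched as follows. This is an illustrative formulation only: the symbols $K$ (number of mixture components), $\pi_k$, $\beta_k$, $\sigma_k$, $\theta$ and $\lambda$ are notation assumed here, and the exact parameterization and penalty used in the article may differ.

```latex
% Assumed notation: observations (x_i, y_i), i = 1, ..., n, with x_i in R^p and p possibly much larger than n.
% Conditional density of a K-component mixture of Gaussian regressions:
\[
  s_\theta(y \mid x) \;=\; \sum_{k=1}^{K} \frac{\pi_k}{\sqrt{2\pi}\,\sigma_k}
  \exp\!\left( -\frac{\bigl(y - \beta_k^{\top} x\bigr)^2}{2\sigma_k^2} \right),
  \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
\]
% l1-penalized maximum likelihood (Lasso) estimator with regularization parameter lambda > 0;
% the penalty is taken here on the regression coefficients (an assumption of this sketch):
\[
  \hat{\theta}(\lambda) \;\in\; \operatorname*{arg\,min}_{\theta}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln s_\theta(y_i \mid x_i)
  \;+\; \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_1 \right\}.
\]
```

In this reading, the condition mentioned in the abstract is a lower bound on $\lambda$ under which the Kullback–Leibler risk of $s_{\hat{\theta}(\lambda)}$ can be controlled; its precise form is given in the article and is not reproduced here.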
Mathematics Subject Classification: 62G08 / 62H30
Key words: Finite mixture of Gaussian regressions model / Lasso / ℓ1-oracle inequalities / model selection by penalization / ℓ1-balls
© EDP Sciences, SMAI, 2013