Volume 27, 2023
Page(s): 402-460
Published online: 08 March 2023
Robust estimation in finite mixture models*
Department of Mathematics, University of Luxembourg, Maison du Nombre,
6 Avenue de la Fonte,
4364 Esch-sur-Alzette, Grand Duchy of Luxembourg
** Corresponding author: firstname.lastname@example.org
Accepted: 27 January 2023
We observe an n-sample whose distribution is assumed to belong, or at least to be close enough, to a given mixture model. We propose an estimator of this distribution that belongs to our model and possesses some robustness properties with respect to a possible misspecification of the model. We establish a non-asymptotic deviation bound for the Hellinger distance between the target distribution and its estimator when the model consists of a mixture of densities that belong to VC-subgraph classes. Under suitable assumptions, and when the mixture model is well-specified, we derive risk bounds for the parameters of the mixture. Finally, we design a statistical procedure that allows us to select from the data the number of components as well as suitable models for each of the densities involved in the mixture. These models are chosen among a collection of candidate ones, and we show that our selection rule combined with our estimation strategy results in an estimator that satisfies an oracle-type inequality.
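As an illustration of the loss used above, the following minimal sketch computes the Hellinger distance between two one-dimensional mixture densities by numerical integration. It is not the estimator proposed in the article. The Gaussian components, the parameter values, the integration range, and the normalization h^2(P, Q) = (1/2) ∫ (√p − √q)^2 dx are all assumptions made for this example, and the helper names (`normal_pdf`, `mixture_pdf`, `hellinger`) are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite mixture of normals (weights are assumed to sum to 1)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def hellinger(p, q, lo=-30.0, hi=30.0, n=20000):
    """Hellinger distance between densities p and q on [lo, hi].

    Assumed convention: h^2 = (1/2) * integral of (sqrt(p) - sqrt(q))^2,
    approximated with the midpoint rule on n cells.
    """
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        d = math.sqrt(p(x)) - math.sqrt(q(x))
        total += d * d
    return math.sqrt(0.5 * total * step)


# Hypothetical target mixture and a nearby candidate estimate.
def target(x):
    return mixture_pdf(x, [0.3, 0.7], [-2.0, 1.0], [1.0, 0.5])


def estimate(x):
    return mixture_pdf(x, [0.35, 0.65], [-1.9, 1.1], [1.0, 0.5])


distance = hellinger(target, estimate)
print(f"Hellinger distance: {distance:.4f}")
```

With this normalization the distance lies between 0 and 1, and for two normal densities with a common standard deviation σ it has the closed form h^2 = 1 − exp(−(μ1 − μ2)^2 / (8σ^2)), which gives a simple check of the numerical integration.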
Mathematics Subject Classification: 62G05 / 62G35 / 62F35 / 62G07
Key words: Finite mixture model / robust estimation / supremum of an empirical process
© The authors. Published by EDP Sciences, SMAI 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.