Issue | ESAIM: PS, Volume 18, 2014
---|---
Page(s) | 584 - 612
DOI | https://doi.org/10.1051/ps/2013041
Published online | 15 October 2014
Nonparametric estimation of the density of the alternative hypothesis in a multiple testing setup. Application to local false discovery rate estimation
1 Laboratoire de Mathématiques d’Orsay, Université Paris Sud, UMR CNRS 8628, Bâtiment 425, 91405 Orsay cedex, France. nvanhanh@genopole.cnrs.fr
2 Laboratoire Statistique et Génome, Université d’Évry Val d’Essonne, UMR CNRS 8071, USC INRA, 23 bvd de France, 91037 Évry, France. catherine.matias@genopole.cnrs.fr
Received: 24 October 2012
Revised: 29 March 2013
In a multiple testing context, we consider a semiparametric mixture model with two components, where one component is known and corresponds to the distribution of p-values under the null hypothesis and the other component f is nonparametric and stands for the distribution under the alternative hypothesis. Motivated by the issue of local false discovery rate estimation, we focus here on the estimation of the nonparametric unknown component f in the mixture, relying on a preliminary estimator of the unknown proportion θ of true null hypotheses. We propose two different estimators for this unknown component and study their asymptotic properties. The first estimator is a randomly weighted kernel estimator. We establish an upper bound for its pointwise quadratic risk, exhibiting the classical nonparametric rate of convergence over a class of Hölder densities. To our knowledge, this is the first result establishing convergence, together with a corresponding rate, for the estimation of the unknown component in this nonparametric mixture. The second estimator is a maximum smoothed likelihood estimator. It is computed through an iterative algorithm, for which we establish a descent property. In addition, these estimators are used in a multiple testing procedure in order to estimate the local false discovery rate. Their respective performances are then compared on synthetic data.
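To make the setup concrete, the following is a minimal Python sketch of a weighted kernel estimate of f and the resulting local false discovery rate. It is an illustration only, not the authors' exact estimator: it assumes that null p-values are uniform on [0, 1], so the mixture density is g(x) = θ + (1 − θ) f(x) and the local false discovery rate is θ / g(x); it assumes weights of the form 1 − θ̂ / ĝ(X_i), clipped to [0, 1]; and it uses a Gaussian kernel with a fixed bandwidth and no boundary correction. The function names, the bandwidth `h`, and the value of `theta_hat` are all hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice of kernel)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def weighted_kde(x, sample, h, weights=None):
    """Weighted kernel density estimate evaluated at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    if weights is None:
        weights = np.ones_like(sample)
    total = weights.sum()
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    u = (x[:, None] - sample[None, :]) / h
    return (gaussian_kernel(u) * (weights / total)[None, :]).sum(axis=1) / h


def estimate_alternative_density(x, pvalues, theta_hat, h):
    """Sketch of a weighted kernel estimate of the alternative density f.

    Assumption: each p-value is weighted by an estimate of its posterior
    probability of coming from the alternative, 1 - theta_hat / g_hat(X_i),
    where g_hat is an ordinary kernel estimate of the mixture density.
    """
    g_at_sample = weighted_kde(pvalues, pvalues, h)
    tau = np.clip(1.0 - theta_hat / g_at_sample, 0.0, 1.0)
    return weighted_kde(x, pvalues, h, weights=tau)


def local_fdr(x, pvalues, theta_hat, h):
    """Plug-in local FDR theta / g(x), assuming uniform null p-values."""
    f_hat = estimate_alternative_density(x, pvalues, theta_hat, h)
    g_hat = theta_hat + (1.0 - theta_hat) * f_hat
    return theta_hat / g_hat


# Synthetic example: 70% uniform nulls, 30% Beta(1, 8) alternatives.
rng = np.random.default_rng(0)
pvalues = np.concatenate([rng.uniform(size=700), rng.beta(1.0, 8.0, size=300)])
lfdr = local_fdr([0.01, 0.8], pvalues, theta_hat=0.7, h=0.05)
```

In this sketch, small p-values should receive a lower estimated local false discovery rate than large ones. Because the Gaussian kernel ignores the boundaries of [0, 1], the estimate is biased near 0 and 1; a boundary-corrected kernel would be a natural refinement.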
Mathematics Subject Classification: 62G07 / 62G20
Key words: False discovery rate / kernel estimation / local false discovery rate / maximum smoothed likelihood / multiple testing / p-values / semiparametric mixture model
© EDP Sciences, SMAI, 2014