Volume 19, 2015
|Page(s)||115 - 134|
|Published online||24 June 2015|
Sharp variable selection of a sparse submatrix in a high-dimensional noisy matrix
1 Université Paris-Est, LAMA (UMR 8050), UPEMLV, UPEC, CNRS, 77454, Marne-la-Vallée, France
2 CREST, Timbre J340 3, av. Pierre Larousse, 92240 Malakoff Cedex, France
3 St. Petersburg National Research University of Information Technologies, Mechanics and Optics, 49 Kronverkskiy pr., 197101 St. Petersburg, Russia
Received: 18 February 2014
Revised: 22 May 2014
We observe a N × M matrix of independent, identically distributed Gaussian random variables which are centered except for elements of some submatrix of size n × m where the mean is larger than some a> 0. The submatrix is sparse in the sense that n/N and m/M tend to 0, whereas n,m,N and M tend to infinity. We consider the problem of selecting the random variables with significantly large mean values, as was also considered by [M. Kolar, S. Balakrishnan, A. Rinaldo and A. Singh, NIPS (2011)]. We give sufficient conditions on a as a function of n,m,N and M and construct a uniformly consistent procedure in order to do sharp variable selection. We also prove the minimax lower bounds under necessary conditions which are complementary to the previous conditions. The critical values a∗ separating the necessary and sufficient conditions are sharp (we show exact constants), whereas [M. Kolar, S. Balakrishnan, A. Rinaldo and A. Singh, NIPS (2011)] only prove rate optimality and focus on suboptimal computationally feasible selectors. Note that rate optimality in this problem leaves out a large set of possible parameters, where we do not know whether consistent selection is possible.
Mathematics Subject Classification: 62G05 / 62G20
Key words: Estimation / minimax testing / large matrices / selection of sparse signal / sharp selection bounds / variable selection
© EDP Sciences, SMAI, 2015
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.