| Issue | ESAIM: PS, Volume 19, 2015 |
|---|---|
| Page(s) | 725 - 745 |
| DOI | https://doi.org/10.1051/ps/2015013 |
| Published online | 11 December 2015 |
Randomized pick-freeze for sparse Sobol indices estimation in high dimension
Université Paris-Sud, Laboratoire de Mathématiques d’Orsay, Bâtiment 425, Université Paris-Sud, 91405 Orsay, France
alexandre.janon@math.u-psud.fr
Received: 21 September 2014
Revised: 30 March 2015
This article investigates variable selection in high dimension for a non-parametric regression model. In many concrete situations, we are concerned with estimating a non-parametric regression function f that may depend on a large number p of input variables. Unlike standard procedures, we do not assume that f belongs to a class of regular functions (Hölder, Sobolev, ...); we only assume that f is square-integrable with respect to a known product measure. Furthermore, we observe that, in some situations, only a small number s of the coordinates actually affect f, in an additive manner. In this context, we prove that, with only 𝒪(s log p) random evaluations of f, one can identify the relevant input variables with overwhelming probability. Our proposed method is an unconstrained ℓ1-minimization procedure based on Sobol’s method. One step of this procedure relies on support recovery using ℓ1-minimization and thresholding. More precisely, we use a thresholded LASSO to faithfully uncover the significant input variables. In this framework, we prove that one can relax the mutual incoherence property (known to require 𝒪(s² log p) observations) and still ensure faithful recovery from 𝒪(s^α log p) observations for any 1 ≤ α ≤ 2.
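To illustrate the support-recovery step mentioned in the abstract (ℓ1-minimization followed by thresholding), here is a minimal sketch in Python. It is not the authors' procedure: it uses a generic Gaussian design matrix rather than randomized pick-freeze evaluations of f, and the solver (plain proximal gradient, ISTA), the function names, the dimensions n, p, s, the penalty `lam` and the threshold `tau` are all illustrative assumptions.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimize (1/(2n))*||y - X b||_2^2 + lam*||b||_1 with ISTA."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the quadratic term.
    step = n / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft-thresholding: proximal operator of the l1 penalty.
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b

def thresholded_lasso_support(X, y, lam, tau):
    """Return indices whose LASSO coefficient exceeds tau in absolute value."""
    b = lasso_ista(X, y, lam)
    return np.flatnonzero(np.abs(b) > tau)

# Toy problem (all values are assumptions chosen for illustration):
# p = 200 candidate inputs, s = 5 relevant ones, n = 100 observations.
rng = np.random.default_rng(0)
n, p, s = 100, 200, 5
X = rng.standard_normal((n, p))
true_support = np.sort(rng.choice(p, size=s, replace=False))
beta = np.zeros(p)
beta[true_support] = 1.0
y = X @ beta + 0.01 * rng.standard_normal(n)

recovered = thresholded_lasso_support(X, y, lam=0.1, tau=0.5)
print("true support:     ", true_support)
print("recovered support:", recovered)
```

The threshold removes small spurious coefficients that the ℓ1 penalty alone may leave in place, which is the role thresholding plays in a thresholded LASSO. The number of observations needed for this to succeed in the paper's setting is the subject of the article's results and is not demonstrated by this toy example.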
Mathematics Subject Classification: 62G08 / 62G35 / 65H10 / 93A30 / 93B35
Key words: Sensitivity analysis / Sobol indices / high-dimensional statistics / LASSO / Monte-Carlo method
© EDP Sciences, SMAI, 2015