Non-asymptotic analysis of stochastic approximation algorithms for streaming data | ESAIM: Probability and Statistics (ESAIM: P&S)

Open Access

Issue		ESAIM: PS Volume 27, 2023


Page(s)		482 - 514
DOI		https://doi.org/10.1051/ps/2023006
Published online		12 April 2023

