The case for and against fixed step-size: Stochastic approximation algorithms in optimization and machine learning

by Caio Kalil Lauand and one other author

Abstract: Theory and application of stochastic approximation (SA) have become increasingly relevant, due in part to its role in optimization and reinforcement learning. This paper takes a new look at SA with constant step-size $\alpha>0$, defined by the recursion $$\theta_{n+1} = \theta_{n}+ \alpha f(\theta_n,\Phi_{n+1})$$ in which $\theta_n\in\mathbb{R}^d$ and $\{\Phi_{n}\}$ is a Markov chain. The goal is to approximately solve the root-finding problem $\bar{f}(\theta^*)=0$, where $\bar{f}(\theta)=\mathbb{E}[f(\theta,\Phi)]$ and $\Phi$ has the steady-state distribution of $\{\Phi_{n}\}$.
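As a concrete illustration of this recursion, the sketch below runs constant step-size SA on a scalar toy problem driven by a two-state Markov chain. The function $f$, the transition matrix, and all numerical values are assumptions chosen for illustration only; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar toy problem (an assumption, not the paper's example):
# f(theta, phi) = phi - theta, so bar{f}(theta) = E[Phi] - theta and the
# root theta* equals the steady-state mean of the Markov chain {Phi_n}.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])        # transition matrix; steady state (0.5, 0.5)
values = np.array([0.0, 2.0])     # Phi takes values in {0, 2}, so theta* = 1

def f(theta, phi):
    return phi - theta

alpha = 0.05                      # constant step-size
theta, s = 0.0, 0                 # theta_0 and initial state of the chain
for n in range(20_000):
    s = rng.choice(2, p=P[s])                      # draw Phi_{n+1}
    theta = theta + alpha * f(theta, values[s])    # theta_{n+1}

print(theta)  # hovers near theta* = 1 with fluctuations of order sqrt(alpha)
```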

The following conclusions are obtained under an ergodicity assumption on the Markov chain, compatible assumptions on $f$, and for $\alpha>0$ sufficiently small:

$\textbf{1.}$ The pair process $\{(\theta_n,\Phi_n)\}$ is geometrically ergodic in a topological sense.

$\textbf{2.}$ For every $1\le p\le 4$, there is a constant $b_p$ such that $\limsup_{n\to\infty}\mathbb{E}[\|\theta_n-\theta^*\|^p]\le b_p \alpha^{p/2}$ for each initial condition.

$\textbf{3.}$ The Polyak-Ruppert-style averaged estimates $\theta^{\text{PR}}_n=n^{-1}\sum_{k=1}^{n}\theta_k$ converge almost surely and in mean square to a limit $\theta^{\text{PR}}_\infty$, which satisfies $\theta^{\text{PR}}_\infty=\theta^*+\alpha \bar{\Upsilon}^*+O(\alpha^2)$ for an identified non-random $\bar{\Upsilon}^*\in\mathbb{R}^d$. Moreover, the covariance is approximately optimal: the limiting covariance matrix of $\theta^{\text{PR}}_n$ is approximately minimal in a matricial sense.
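The following sketch extends the same toy example to illustrate conclusions 2 and 3 empirically: the raw iterates fluctuate around $\theta^*$ at scale $\sqrt{\alpha}$, while the Polyak-Ruppert average $\theta^{\text{PR}}_n$ settles much closer to $\theta^*$. Again, the model and all constants are illustrative assumptions, not examples from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same illustrative toy model as above (an assumption): Phi is a two-state
# Markov chain on {0, 2} with steady-state mean 1, f(theta, phi) = phi - theta,
# so theta* = 1.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
values = np.array([0.0, 2.0])
alpha = 0.05

N, burn_in = 200_000, 10_000
theta, s = 0.0, 0
iterates = np.empty(N)
for n in range(N):
    s = rng.choice(2, p=P[s])
    theta += alpha * (values[s] - theta)
    iterates[n] = theta

tail = iterates[burn_in:]            # discard the transient
theta_pr = tail.mean()               # Polyak-Ruppert averaged estimate

# Empirical mean-square error of the raw iterates is of order alpha
# (conclusion 2); the averaged estimate is near theta* (conclusion 3).
print("mean of (theta_n - theta*)^2:", np.mean((tail - 1.0) ** 2))
print("theta_PR:", theta_pr)
```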

The two main takeaways for practitioners are application-dependent. It is argued that, in applications to optimization, constant-gain algorithms may be preferable even when the objective has multiple local minima, while a vanishing-gain algorithm is preferable in applications to reinforcement learning because of the presence of bias.

Submission history

From: Caio Kalil Lauand
[v1]
Wed, 6 Sep 2023 12:22:32 UTC (190 KB)
[v2]
Sun, 17 Sep 2023 17:16:32 UTC (1,830 KB)
[v3]
Wed, 3 Sep 2025 02:47:11 UTC (8,767 KB)

