The case for and against fixed step-size: Stochastic approximation algorithms in optimization and machine learning

by Caio Kalil Lauand and one other author

Abstract: Theory and application of stochastic approximation (SA) have become increasingly relevant, due in part to its role in optimization and reinforcement learning. This paper takes a new look at SA with constant step-size $\alpha>0$, defined by the recursion $$\theta_{n+1} = \theta_{n}+ \alpha f(\theta_n,\Phi_{n+1})$$ in which $\theta_n\in\mathbb{R}^d$ and $\{\Phi_{n}\}$ is a Markov chain. The goal is to approximately solve the root-finding problem $\bar{f}(\theta^*)=0$, where $\bar{f}(\theta)=\mathbb{E}[f(\theta,\Phi)]$ and $\Phi$ has the steady-state distribution of $\{\Phi_{n}\}$.
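As a concrete illustration of this recursion, the sketch below runs constant step-size SA on a scalar toy problem driven by a two-state Markov chain. The function $f$, the transition matrix, and all numerical values are assumptions chosen for illustration only; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar toy problem (an assumption, not the paper's example):
# f(theta, phi) = phi - theta, so bar{f}(theta) = E[Phi] - theta and the
# root theta* equals the steady-state mean of the Markov chain {Phi_n}.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])        # transition matrix; steady state (0.5, 0.5)
values = np.array([0.0, 2.0])     # Phi takes values in {0, 2}, so theta* = 1

def f(theta, phi):
    return phi - theta

alpha = 0.05                      # constant step-size
theta, s = 0.0, 0                 # theta_0 and initial state of the chain
for n in range(20_000):
    s = rng.choice(2, p=P[s])                      # draw Phi_{n+1}
    theta = theta + alpha * f(theta, values[s])    # theta_{n+1}

print(theta)  # hovers near theta* = 1 with fluctuations of order sqrt(alpha)
```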

The following conclusions are obtained under an ergodicity assumption on the Markov chain, compatible assumptions on $f$, and for $\alpha>0$ sufficiently small:

$\textbf{1.}$ The pair process $\{(\theta_n,\Phi_n)\}$ is geometrically ergodic in a topological sense.

$\textbf{2.}$ For every $1\le p\le 4$, there is a constant $b_p$ such that $\limsup_{n\to\infty}\mathbb{E}[\|\theta_n-\theta^*\|^p]\le b_p \alpha^{p/2}$ for each initial condition.

$\textbf{3.}$ The Polyak-Ruppert-style averaged estimates $\theta^{\text{PR}}_n=n^{-1}\sum_{k=1}^{n}\theta_k$ converge almost surely and in mean square to a limit $\theta^{\text{PR}}_\infty$, which satisfies $\theta^{\text{PR}}_\infty=\theta^*+\alpha \bar{\Upsilon}^*+O(\alpha^2)$ for an identified non-random $\bar{\Upsilon}^*\in\mathbb{R}^d$. Moreover, the covariance is approximately optimal: the limiting covariance matrix of $\theta^{\text{PR}}_n$ is approximately minimal in a matricial sense.
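The following sketch extends the same toy example to illustrate conclusions 2 and 3 empirically: the raw iterates fluctuate around $\theta^*$ at scale $\sqrt{\alpha}$, while the Polyak-Ruppert average $\theta^{\text{PR}}_n$ settles much closer to $\theta^*$. Again, the model and all constants are illustrative assumptions, not examples from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same illustrative toy model as above (an assumption): Phi is a two-state
# Markov chain on {0, 2} with steady-state mean 1, f(theta, phi) = phi - theta,
# so theta* = 1.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
values = np.array([0.0, 2.0])
alpha = 0.05

N, burn_in = 200_000, 10_000
theta, s = 0.0, 0
iterates = np.empty(N)
for n in range(N):
    s = rng.choice(2, p=P[s])
    theta += alpha * (values[s] - theta)
    iterates[n] = theta

tail = iterates[burn_in:]            # discard the transient
theta_pr = tail.mean()               # Polyak-Ruppert averaged estimate

# Empirical mean-square error of the raw iterates is of order alpha
# (conclusion 2); the averaged estimate is near theta* (conclusion 3).
print("mean of (theta_n - theta*)^2:", np.mean((tail - 1.0) ** 2))
print("theta_PR:", theta_pr)
```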

The two main takeaways for practitioners are application-dependent. It is argued that, in applications to optimization, constant-gain algorithms may be preferable even when the objective has multiple local minima, while a vanishing-gain algorithm is preferable in applications to reinforcement learning because of the presence of bias.

Submission history

From: Caio Kalil Lauand
[v1]
Wed, 6 Sep 2023 12:22:32 UTC (190 KB)
[v2]
Sun, 17 Sep 2023 17:16:32 UTC (1,830 KB)
[v3]
Wed, 3 Sep 2025 02:47:11 UTC (8,767 KB)

