Volume 52, pp. 214-229, 2020.
ADMM-Softmax: an ADMM approach for multinomial logistic regression
Samy Wu Fung, Sanna Tyrväinen, Lars Ruthotto, and Eldad Haber
Abstract
We present ADMM-Softmax, an alternating direction method of multipliers (ADMM) for solving multinomial logistic regression (MLR) problems. Our method is geared toward supervised classification tasks with many examples and features. It decouples the nonlinear optimization problem in MLR into three steps that can be solved efficiently. In particular, each iteration of ADMM-Softmax consists of a linear least-squares problem, a set of independent small-scale smooth, convex problems, and a trivial dual variable update. The solution of the least-squares problem can be accelerated by pre-computing a factorization or preconditioner, and the smooth, convex problems are separable and can be solved in parallel across examples. For two image classification problems, we demonstrate that ADMM-Softmax leads to improved generalization compared to a Newton-Krylov method, a quasi-Newton method, and a stochastic gradient descent method.
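To make the three-step iteration concrete, the following minimal NumPy/SciPy sketch shows one possible realization of the scheme described above. The splitting Z = W D, the penalty parameter rho, the Tikhonov regularizer alpha, and the fixed-step gradient inner solver are illustrative assumptions, not details taken from the paper; D is an n-by-m feature matrix and C a c-by-m one-hot label matrix.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def softmax(Z):
    # Column-wise softmax with the usual max-shift for numerical stability;
    # Z has shape (c, m): one column of class scores per example.
    Z = Z - Z.max(axis=0, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)

def admm_softmax(D, C, rho=1.0, alpha=1e-3, iters=50, inner_steps=10):
    # D: (n, m) feature matrix, C: (c, m) one-hot label matrix.
    # rho (ADMM penalty) and alpha (Tikhonov regularizer) are assumed
    # hyperparameters, not values from the paper.
    n, m = D.shape
    c = C.shape[0]
    W = np.zeros((c, n))   # weights
    Z = np.zeros((c, m))   # auxiliary variable enforcing Z = W @ D
    U = np.zeros((c, m))   # scaled dual variable

    # Pre-compute a Cholesky factorization of the iteration-independent
    # normal-equations matrix so each W-update is a cheap triangular solve.
    chol = cho_factor(rho * (D @ D.T) + alpha * np.eye(n))

    for _ in range(iters):
        # Step 1: linear least-squares update of W,
        #   min_W  rho/2 ||W D - (Z - U)||_F^2 + alpha/2 ||W||_F^2.
        rhs = rho * (Z - U) @ D.T
        W = cho_solve(chol, rhs.T).T

        # Step 2: independent smooth, convex problems, one per example j:
        #   min_z  crossentropy(z, c_j) + rho/2 ||z - (W d_j + u_j)||^2.
        # Solved here with a few fixed-step gradient steps on all columns
        # at once; the columns are independent, so this is trivially parallel.
        V = W @ D + U
        for _ in range(inner_steps):
            grad = softmax(Z) - C + rho * (Z - V)
            Z -= grad / (0.5 + rho)   # 0.5 bounds the cross-entropy Hessian

        # Step 3: trivial (scaled) dual variable update.
        U += W @ D - Z
    return W

# Tiny synthetic run (hypothetical data) to exercise the sketch:
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 100))               # 20 features, 100 examples
C = np.eye(3)[rng.integers(0, 3, size=100)].T    # 3 classes, one-hot columns
W = admm_softmax(D, C)
print(W.shape)   # (3, 20)

In a practical solver, the inner gradient loop would likely be replaced by a few Newton steps per column, which stays cheap because each subproblem has only c unknowns.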
Key words
machine learning, nonlinear optimization, alternating direction method of multipliers, classification, multinomial regression
AMS subject classifications
65J22, 90C25, 49M27