Please use this identifier to cite or link to this item: http://hdl.handle.net/11718/27004
Title: A gradient-based bilevel optimization approach for tuning regularization hyperparameters
Authors: Sinha, Ankur; Khandait, Tanmay; Mohanty, Raja
Keywords: Bilevel optimization; Hyperparameter tuning; Machine learning; Model regularization; Augmented Lagrangian method
Issue Date: 29-Sep-2023
Publisher: Springer
Abstract: Hyperparameter tuning in machine learning is often carried out with naive techniques such as random search and grid search. However, these methods seldom lead to an optimal set of hyperparameters and often become very expensive. The hyperparameter optimization problem is inherently a bilevel optimization task, and some studies have applied bilevel solution methodologies to it. These techniques often assume a unique set of weights that minimizes the loss on the training set, an assumption that is violated by deep learning architectures. We propose a bilevel solution method for the hyperparameter optimization problem that does not suffer from the drawbacks of the earlier studies. The proposed method is general and can be easily applied to any class of machine learning algorithms that involve continuous hyperparameters. The idea is based on approximating the lower-level optimal value function mapping, which helps reduce the bilevel problem to a single-level constrained optimization task. The single-level constrained optimization problem is then solved using the augmented Lagrangian method. We perform an extensive computational study on three datasets that confirms the efficiency of the proposed method. A comparative study against grid search, random search, Tree-structured Parzen Estimator, and Quasi Monte Carlo Sampler shows that the proposed algorithm is multiple times faster and leads to models that generalize better on the testing set.
URI: http://hdl.handle.net/11718/27004
ISSN: 1862-4480
Appears in Collections: Journal Articles
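
The reduction described in the abstract can be written out in generic textbook notation. This is only a sketch: the symbols below (F for the validation loss, f for the regularized training loss, w for the weights, \lambda for the hyperparameters, \hat{\varphi} for the approximated value function, \mu and \rho for the multiplier and penalty parameter) are assumptions made for illustration, and this record does not state the paper's exact formulation, approximation scheme, or update rules.

```latex
% Bilevel form (notation assumed, not taken from the paper)
\[
\min_{\lambda,\, w}\; F(w, \lambda)
\quad \text{s.t.} \quad
w \in \operatorname*{arg\,min}_{w'} f(w', \lambda)
\]

% Lower-level optimal value function
\[
\varphi(\lambda) = \min_{w'} f(w', \lambda)
\]

% Single-level reformulation, with \hat{\varphi} an approximation of \varphi
\[
\min_{\lambda,\, w}\; F(w, \lambda)
\quad \text{s.t.} \quad
g(w, \lambda) := f(w, \lambda) - \hat{\varphi}(\lambda) \le 0
\]

% Standard augmented Lagrangian for one inequality constraint
\[
\mathcal{L}_{\rho}(w, \lambda; \mu)
= F(w, \lambda)
+ \frac{1}{2\rho}\Bigl( \max\{0,\; \mu + \rho\, g(w, \lambda)\}^{2} - \mu^{2} \Bigr)
\]

% Standard multiplier update after minimizing \mathcal{L}_{\rho} at iteration k
\[
\mu^{k+1} = \max\{0,\; \mu^{k} + \rho\, g(w^{k}, \lambda^{k})\}
\]
```

With the exact value function, the constraint f(w, \lambda) \le \varphi(\lambda) is satisfied by every w that attains the lower-level optimal value, so this form does not need the training-loss minimizer to be unique, which is consistent with the motivation given in the abstract.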

Items in IIMA Institutional Repository are protected by copyright, with all rights reserved, unless otherwise indicated.