dc.contributor.author: Sinha, Ankur
dc.contributor.author: Khandait, Tanmay
dc.contributor.author: Mohanty, Raja
dc.date.accessioned: 2024-01-03T09:14:33Z
dc.date.available: 2024-01-03T09:14:33Z
dc.date.issued: 2023-09-29
dc.identifier.issn: 18624480
dc.identifier.uri: http://hdl.handle.net/11718/27004
dc.description.abstract: Hyperparameter tuning in the area of machine learning is often achieved using naive techniques, such as random search and grid search. However, most of these methods seldom lead to an optimal set of hyperparameters and often get very expensive. The hyperparameter optimization problem is inherently a bilevel optimization task, and there exist studies that have attempted bilevel solution methodologies to solve this problem. These techniques often assume a unique set of weights that minimizes the loss on the training set. Such an assumption is violated by deep learning architectures. We propose a bilevel solution method for solving the hyperparameter optimization problem that does not suffer from the drawbacks of the earlier studies. The proposed method is general and can be easily applied to any class of machine learning algorithms that involve continuous hyperparameters. The idea is based on the approximation of the lower level optimal value function mapping that helps in reducing the bilevel problem to a single-level constrained optimization task. The single-level constrained optimization problem is then solved using the augmented Lagrangian method. We perform extensive computational study on three datasets that confirm the efficiency of the proposed method. A comparative study against grid search, random search, Tree-structured Parzen Estimator and Quasi Monte Carlo Sampler shows that the proposed algorithm is multiple times faster and leads to models that generalize better on the testing set. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Springer [en_US]
dc.relation.ispartof: Optimization Letters [en_US]
dc.subject: Bilevel optimization [en_US]
dc.subject: Hyperparameter tuning [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Model regularization [en_US]
dc.subject: Augmented Lagrangian method [en_US]
dc.title: A gradient-based bilevel optimization approach for tuning regularization hyperparameters [en_US]
dc.type: Article [en_US]


Files in this item

There are no files associated with this item.
