Hyperparameter tuning

Choosing the right hyperparameters for machine learning or deep learning models is one of the best ways to squeeze the last bit of performance out of them.

The difference between parameter and hyperparameter

What is hyperparameter tuning and why is it important?

Hyperparameter tuning (or hyperparameter optimization) is the process of finding the combination of hyperparameters that maximizes model performance. It works by running multiple trials in a single training process. Each trial is a complete execution of your training application with values for your chosen hyperparameters, set within the limits you specify. Once finished, the process gives you the set of hyperparameter values that performed best across the trials.

Needless to say, it is an important step in any machine learning project, since it can noticeably improve a model’s results. If you wish to see it in action, here’s a research paper that examines the importance of hyperparameter optimization by experimenting on datasets: https://arxiv.org/pdf/2007.07588 (see also https://towardsdatascience.com/hyperparameter-tuning-c5619e7e6624 ).

How to do hyperparameter tuning? How to find the best hyperparameters?

Choosing the right combination of hyperparameters requires an understanding of the hyperparameters and the business use-case. However, technically, there are two ways to set them.

Manual hyperparameter tuning

Manual hyperparameter tuning involves experimenting with different sets of hyperparameters by hand, i.e. you run each trial with a given set of hyperparameters yourself. This technique requires a robust experiment tracker that can track a variety of data, from images and logs to system metrics.

There are a few experiment trackers that tick all the boxes. neptune.ai is one of them. It offers an intuitive interface and an open-source package neptune-client to facilitate logging into your code. You can easily log hyperparameters and see all types of data results like images, metrics, etc. Head over to the docs to see how you can log different metadata to Neptune.

Alternative solutions include W&B, Comet, and MLflow.

Advantages of manual hyperparameter optimization:

Tuning hyperparameters manually means more control over the process.
If you’re researching or studying tuning and how it affects the network weights, doing it manually makes sense.

Disadvantages of manual hyperparameter optimization:

Manual tuning is tedious, since there can be many trials, and keeping track of them can prove costly and time-consuming.
This isn’t a very practical approach when there are a lot of hyperparameters to consider.

Automated hyperparameter tuning

Automated hyperparameter tuning uses existing algorithms to automate the process. The steps you follow are:

First, specify a set of hyperparameters and limits on those hyperparameters’ values (note: each algorithm or library expects this set in a specific data structure; dictionaries are a common choice).
Then the algorithm does the heavy lifting for you. It runs the trials and returns the set of hyperparameters that gave the best results.
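
To make the first step concrete, here is a minimal sketch of a search space written as a dictionary, plus a helper that draws one trial configuration inside the declared limits. The hyperparameter names, the ranges, and the `sample_config` helper are all made up for illustration; real tuning libraries each define their own format.

```python
import random

# Hypothetical search space: each key is a hyperparameter name, each value
# describes the limits a tuner must respect.
search_space = {
    "learning_rate": ("float", 1e-4, 1e-1),    # continuous range [low, high]
    "num_layers": ("int", 1, 4),               # integer range, inclusive
    "optimizer": ("choice", ["sgd", "adam"]),  # categorical options
}


def sample_config(space, rng):
    """Draw one trial configuration that stays inside the declared limits."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "float":
            config[name] = rng.uniform(spec[1], spec[2])
        elif kind == "int":
            config[name] = rng.randint(spec[1], spec[2])
        elif kind == "choice":
            config[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown spec kind: {kind}")
    return config


rng = random.Random(0)
trial_configs = [sample_config(search_space, rng) for _ in range(3)]
for cfg in trial_configs:
    print(cfg)
```

Each printed dictionary is one candidate trial. The tuning algorithm decides which candidates to run and keeps the one with the best score.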

In this blog, we will talk about some of the algorithms and tools you can use for automated tuning. Let’s get to it.

Hyperparameter tuning methods

In this section, I will introduce the hyperparameter optimization methods that are popular today.

Random Search

In the random search method, we create a grid of possible values for hyperparameters. Each iteration tries a random combination of hyperparameters from this grid and records its performance. After a fixed number of iterations, the search returns the combination of hyperparameters that provided the best performance.
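
As a rough illustration, random search fits in a few lines of plain Python. The grid values and the `validation_loss` function below are toy assumptions that stand in for training and evaluating a real model.

```python
import random

# Hypothetical grid of candidate values for two hyperparameters.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "batch_size": [16, 32, 64, 128],
}


def validation_loss(params):
    """Toy stand-in for 'train the model and return its validation loss'."""
    return (params["learning_rate"] - 0.01) ** 2 + (params["batch_size"] - 64) ** 2 / 1e4


def random_search(grid, objective, n_iter, seed=0):
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_iter):
        # Pick one value per hyperparameter, uniformly at random.
        params = {name: rng.choice(values) for name, values in grid.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(param_grid, validation_loss, n_iter=8)
print(best_params, best_loss)
```

With 8 iterations over a 16-cell grid, the search may or may not hit the best cell. That is the trade-off: fewer trials in exchange for no guarantee of covering every combination.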

Grid Search

In the grid search method, we create a grid of possible values for hyperparameters. Each iteration tries one combination of hyperparameters, in a fixed order. The search fits the model on every possible combination and records the model performance, so the number of trials grows multiplicatively with each hyperparameter added. Finally, it returns the best model with the best hyperparameters.
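
A minimal sketch of grid search, assuming the same kind of toy objective in place of real model training (the grid values and `validation_loss` are illustrative, not taken from any library):

```python
from itertools import product

# Hypothetical grid of candidate values for two hyperparameters.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "batch_size": [16, 32, 64, 128],
}


def validation_loss(params):
    """Toy stand-in for 'train the model and return its validation loss'."""
    return (params["learning_rate"] - 0.01) ** 2 + (params["batch_size"] - 64) ** 2 / 1e4


def grid_search(grid, objective):
    names = list(grid)
    results = []
    # product() walks every combination in a fixed, reproducible order.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((objective(params), params))
    best_loss, best_params = min(results, key=lambda item: item[0])
    return best_params, best_loss, len(results)


best_params, best_loss, n_trials = grid_search(param_grid, validation_loss)
print(n_trials, best_params, best_loss)
```

Here 4 x 4 values mean 16 trials. Adding a third hyperparameter with 4 values would already mean 64.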

Bayesian Optimization

Tuning and finding the right hyperparameters for your model is an optimization problem. We want to minimize the loss function of our model by changing its hyperparameters. Bayesian optimization tries to find the minimum in as few steps as possible: it fits a probabilistic surrogate model to the results of past trials and uses an acquisition function that directs sampling to areas where an improvement over the current best observation is likely.
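
One common acquisition function is expected improvement (EI). The sketch below computes EI for a minimization problem, assuming the surrogate model predicts a normal distribution (mean and standard deviation) for the loss at each candidate point. The candidate names and the predicted values are invented for illustration; in a real setup they would come from a fitted surrogate such as a Gaussian process.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_loss):
    """EI for minimization: E[max(best_loss - Y, 0)] with Y ~ N(mean, std^2)."""
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(best_loss - mean, 0.0)
    z = (best_loss - mean) / std
    return (best_loss - mean) * normal_cdf(z) + std * normal_pdf(z)


# Hypothetical surrogate predictions (mean, std) at three candidate points.
candidates = {
    "lr=0.001": (0.30, 0.02),
    "lr=0.01": (0.26, 0.05),
    "lr=0.1": (0.40, 0.10),
}
best_loss_so_far = 0.25

scores = {
    name: expected_improvement(mean, std, best_loss_so_far)
    for name, (mean, std) in candidates.items()
}
next_point = max(scores, key=scores.get)
print(scores, next_point)
```

EI rewards both a low predicted loss and high uncertainty, which is how the search balances exploiting known good regions against exploring unknown ones.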

Tree-structured Parzen estimators (TPE)

The idea of tree-structured Parzen estimators is similar to Bayesian optimization. Instead of modeling p(y|x), where y is the function to be minimized (e.g., validation loss) and x is the value of the hyperparameters, TPE models p(x|y) and p(y). One of the great drawbacks of tree-structured Parzen estimators is that they do not model interactions between the hyperparameters. That said, TPE works extremely well in practice and has been battle-tested across most domains.
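
To give a feel for what modeling p(x|y) means, here is a heavily simplified one-dimensional sketch. It splits past trials into a "good" group (the best fraction `gamma` of losses) and a "bad" group, fits a density to each, and proposes the candidate where the good density is highest relative to the bad one. The fixed-bandwidth Gaussian kernel density, the constants, and the toy `loss` function are assumptions chosen for brevity. This is not how any particular library implements TPE.

```python
import math
import random


def kde(points, bandwidth):
    """Return a Gaussian kernel density estimate built from `points`."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


def tpe_suggest(history, rng, gamma=0.25, n_candidates=50, bandwidth=0.5):
    """history is a list of (x, loss) pairs; returns the next x to try."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(history, key=lambda pair: pair[1])
    n_good = min(max(1, math.ceil(gamma * len(ordered))), len(ordered) - 1)
    good = [x for x, _ in ordered[:n_good]]   # trials with the lowest losses
    bad = [x for x, _ in ordered[n_good:]]    # everything else
    good_density, bad_density = kde(good, bandwidth), kde(bad, bandwidth)
    # Sample candidates from the "good" density: pick a good point, perturb it.
    candidates = [rng.gauss(rng.choice(good), bandwidth) for _ in range(n_candidates)]
    # The small constant avoids division by zero far away from all bad points.
    return max(candidates, key=lambda x: good_density(x) / (bad_density(x) + 1e-12))


def loss(x):
    """Toy objective with its minimum at x = 2."""
    return (x - 2.0) ** 2


rng = random.Random(0)
history = [(x, loss(x)) for x in (rng.uniform(-5.0, 5.0) for _ in range(5))]
initial_best = min(value for _, value in history)
for _ in range(20):
    x_next = tpe_suggest(history, rng)
    history.append((x_next, loss(x_next)))
best_x, best_loss = min(history, key=lambda pair: pair[1])
print(best_x, best_loss)
```

Because each hyperparameter would get its own pair of densities in this scheme, interactions between hyperparameters are not captured, which matches the drawback mentioned above.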

Hyperparameter tuning algorithms

These are the algorithms developed specifically for doing hyperparameter tuning.

Hyperband

Hyperband is a variation of random search that uses explore-exploit theory to find the best time allocation for each of the configurations. See the original Hyperband paper (Li et al.) for further reference.
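
The building block behind Hyperband is successive halving: start many random configurations on a small budget, keep the best fraction, and give the survivors a larger budget. The sketch below shows only that subroutine. Full Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget. The learning-curve function and the `final_loss` field are toy assumptions.

```python
import random


def partial_validation_loss(config, budget):
    """Toy stand-in: loss after training `config` for `budget` epochs."""
    # Hypothetical learning curve that approaches the config's final loss.
    return config["final_loss"] + 1.0 / budget


def successive_halving(configs, min_budget=1, eta=2):
    budget = min_budget
    survivors = list(configs)
    rounds = []  # (number of configs evaluated, budget per config)
    while len(survivors) > 1:
        rounds.append((len(survivors), budget))
        scored = sorted(survivors, key=lambda c: partial_validation_loss(c, budget))
        # Keep the best 1/eta of the configurations, give them eta times the budget.
        survivors = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return survivors[0], rounds


rng = random.Random(0)
configs = [{"id": i, "final_loss": rng.uniform(0.1, 1.0)} for i in range(8)]
winner, rounds = successive_halving(configs)
print(winner, rounds)
```

Most of the total budget ends up spent on the few configurations that looked promising early, which is where the time savings over plain random search come from.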

Population-based training (PBT)

This technique is a hybrid of two of the most commonly used search techniques, random search and manual tuning, applied to neural network models.

PBT starts by training many neural networks in parallel with random hyperparameters. But these networks aren’t fully independent of each other.

PBT uses information from the rest of the population to refine the hyperparameters and determine which hyperparameter values to try next.
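
A toy sketch of the idea: each population member trains for a while, then the worst performers copy the state of a top performer (exploit) and perturb its hyperparameters (explore). Here the "model state" is reduced to a running score and the only hyperparameter is a learning rate. The `train_step` function, the perturbation factors, and the 25% cutoff are illustrative assumptions.

```python
import random


def train_step(member):
    """Toy stand-in for a few training steps; progress is best near lr = 0.1."""
    member["score"] += 1.0 / (1.0 + 10.0 * abs(member["lr"] - 0.1))


def pbt(pop_size=8, n_rounds=10, seed=0):
    rng = random.Random(seed)
    # Start with random hyperparameters, as in random search.
    population = [{"lr": rng.uniform(0.001, 0.5), "score": 0.0} for _ in range(pop_size)]
    for _ in range(n_rounds):
        for member in population:
            train_step(member)
        population.sort(key=lambda m: m["score"], reverse=True)
        cutoff = max(1, pop_size // 4)
        top, bottom = population[:cutoff], population[-cutoff:]
        for loser in bottom:
            winner = rng.choice(top)
            # Exploit: copy the winner's state (represented here by its score).
            loser["score"] = winner["score"]
            # Explore: perturb the copied hyperparameter.
            loser["lr"] = winner["lr"] * rng.choice([0.8, 1.2])
    return max(population, key=lambda m: m["score"]), population


best_member, population = pbt()
print(best_member)
```

Unlike the other methods, the hyperparameters change during training, so the result is a schedule of values and not a single fixed configuration.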

BOHB

BOHB (Bayesian Optimization and HyperBand) combines the Hyperband algorithm with Bayesian optimization. See this article for further reference: https://neptune.ai/blog/hyperband-and-bohb-understanding-state-of-the-art-hyperparameter-optimization-algorithms

Tools for hyperparameter optimization

Now that you know the methods and algorithms, let’s talk about tools. There are a lot of them out there.

Some of the best hyperparameter optimization libraries are:

optuna

Reference List

https://neptune.ai/blog/hyperparameter-tuning-in-python-complete-guide

Boyang Yan
