Hyperparameter Tuning in Deep Neural Networks

Jul 21, 2019 on Tutorial

One way to make your deep learning model more accurate and generate better results is to tune your model’s hyperparameters. By doing so, you can speed up your training process and optimize the outputs provided by the model. In this post, we try to figure out some ways to make sure that we choose the right hyperparameters every time we train a deep neural network.

1. Tuning Process

The most common hyperparameters one needs to choose for training their neural networks are as follows:
- Learning rate (alpha: “\(\alpha\)”)
- Mini-batch size
- Number of hidden units for each layer
- Momentum term (beta: “\(\beta\)”); generally \(\beta\) = 0.9 is taken
- Number of layers
- Learning rate decay (changing \(\alpha\) as training progresses)
- For the Adam optimizer: \(\beta_{1}\), \(\beta_{2}\), and \(\epsilon\)
  - Generally, \(\beta_{1}\) = 0.9, \(\beta_{2}\) = 0.999, and \(\epsilon\) = \(10^{-8}\)

The hyperparameters are listed roughly in order of their significance for tuning a deep neural network, but the order may vary according to the requirements.

When tuning hyperparameters, sample their values at random so that we can find the ones that perform best for our model.
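
As a rough illustration of random sampling, here is a minimal sketch that draws a few candidate configurations. The function name `sample_config` and all of the ranges (batch sizes, hidden units, number of layers) are assumptions chosen for the example, not recommended values. The learning rate is drawn on a logarithmic scale, which section 2 explains.

```python
import numpy as np

# Seeded generator so the sketch is reproducible.
rng = np.random.default_rng(0)

def sample_config(rng):
    """Draw one hyperparameter configuration at random.

    All ranges below are illustrative assumptions, not recommendations.
    """
    return {
        # Learning rate sampled on a log scale between 1e-4 and 1 (see section 2).
        "learning_rate": float(10 ** rng.uniform(-4, 0)),
        # Mini-batch size picked from a small set of candidate values.
        "batch_size": int(rng.choice([32, 64, 128, 256])),
        # Hidden units per layer, integer in [50, 200].
        "hidden_units": int(rng.integers(50, 201)),
        # Number of layers, integer in [2, 5].
        "num_layers": int(rng.integers(2, 6)),
    }

# Draw five candidate configurations to train and compare.
configs = [sample_config(rng) for _ in range(5)]
for config in configs:
    print(config)
```

Each configuration would then be used to train a model, and the one with the best validation score is kept.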

2. Using an appropriate scale to pick hyperparameters

When we sample values for a hyperparameter like the learning rate (\(\alpha\)), we need to choose carefully the scale on which the random values are drawn.
For example, an acceptable learning rate can be anywhere between 0 and 1, but we know that values below 0.1 are more plausible than higher values.
- In such a case, we can sample uniformly on a logarithmic scale, so that each power of ten between \(10^{-4}\) and 1 is equally likely to be drawn
  - e.g. r = -4 * np.random.rand(); r → (-4, 0]
  - \(\alpha\) = \(10^{r}\); \(\alpha\) → (\(10^{-4}\), 1]
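
The difference between the two scales can be checked numerically. The sketch below is an illustration under assumed settings (the helper name `sample_learning_rate`, the seed, and the sample count are all hypothetical). It compares sampling \(\alpha\) on a log scale with sampling it uniformly between 0 and 1.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_learning_rate(rng, low_exp=-4.0, high_exp=0.0, size=None):
    """Sample alpha = 10**r with r drawn uniformly from [low_exp, high_exp)."""
    r = rng.uniform(low_exp, high_exp, size=size)
    return 10.0 ** r

n = 10_000
log_scale = sample_learning_rate(rng, size=n)
linear_scale = rng.uniform(0.0, 1.0, size=n)

# Count how many log-scale samples fall in each decade:
# [1e-4, 1e-3), [1e-3, 1e-2), [1e-2, 1e-1), [1e-1, 1].
counts, _ = np.histogram(np.log10(log_scale), bins=[-4, -3, -2, -1, 0])
print("samples per decade (log scale):", counts)

# Fraction of samples below 0.1 under each scheme.
frac_log = np.mean(log_scale < 0.1)
frac_linear = np.mean(linear_scale < 0.1)
print("fraction below 0.1, log scale:   ", frac_log)
print("fraction below 0.1, linear scale:", frac_linear)
```

With log-scale sampling, each decade receives about a quarter of the samples, so roughly three quarters of the candidates fall below 0.1. With uniform sampling on [0, 1), only about one in ten does.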

Updated Apr 13, 2020

machine_learning

This post is written by Ashish Jaiswal
