Using keras-tuner to tune hyperparameters of a TensorFlow model

In this article, I am going to show how to use the random search hyperparameter tuning method with Keras. I decided to use the keras-tuner project, which at the time of writing has not been officially released yet, so I have to install it directly from its GitHub repository.

# remove the ! if you are not running this in a Jupyter Notebook
!git clone https://github.com/keras-team/keras-tuner.git
!pip install ./keras-tuner

As an example, I will use the Fashion-MNIST dataset, so the goal is to perform a multiclass classification of images. First, I have to load the training and test dataset. Fashion-MNIST is available as one of the Keras built-in datasets, so the following code downloads everything I need.

import tensorflow as tf
from tensorflow import keras
import numpy as np

fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

The images have already been preprocessed, so the dataset contains a single channel (gray-scale) with color values in the range 0-255. I want to scale the values to the range between 0 and 1, so I divide them by 255.

train_images = train_images / 255.0
test_images = test_images / 255.0

I am going to reshape the dataset to add the channel dimension, so I can use it as input to the convolutional layer.

train_images = train_images.reshape(len(train_images), 28, 28, 1)
test_images = test_images.reshape(len(test_images), 28, 28, 1)
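
A quick shape check confirms the new channel dimension (Fashion-MNIST contains 60,000 training images and 10,000 test images).

print(train_images.shape)  # (60000, 28, 28, 1)
print(test_images.shape)   # (10000, 28, 28, 1)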

Parameters

Keras-tuner needs a function that accepts a set of hyperparameters and returns a compiled model, so I have to define such a function.

There are four kinds of parameters available: range, linear, choice, and fixed.

Range

The range returns integer values between the given minimum and maximum. The values are incremented by the step parameter.

hp.Range('conv_1_filter', min_value=64, max_value=128, step=16)

Linear

The linear parameter is similar to the range but works with floating-point numbers. In this case, the step is called resolution.

hp.Linear('learning_rate', min_value=0.01, max_value=0.1, resolution=0.1)

Choice

The choice parameter is much simpler. We give it a list of values, and it returns one of them.

hp.Choice('learning_rate', values=[1e-2, 1e-3])

Fixed

Finally, we can set a constant as the parameter value. It is useful when we want to let keras-tuner tune all parameters except one. The fixed parameter works only with the predefined models: Xception and ResNet.

hp.Fixed('learning_rate', value=1e-4)

How to define the model

Here is my function that builds a neural network using the parameters given by keras-tuner. Even though it is not necessary in this case, I will parameterize all layers and the learning rate, to show that it is possible.

def build_model(hp):
  model = keras.Sequential([
    keras.layers.Conv2D(
        filters=hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
        kernel_size=hp.Choice('conv_1_kernel', values=[3, 5]),
        activation='relu',
        input_shape=(28,28,1)
    ),
    keras.layers.Conv2D(
        filters=hp.Range('conv_2_filter', min_value=32, max_value=64, step=16),
        kernel_size=hp.Choice('conv_2_kernel', values=[3, 5]),
        activation='relu'
    ),
    keras.layers.Flatten(),
    keras.layers.Dense(
        units=hp.Range('dense_1_units', min_value=32, max_value=128, step=16),
        activation='relu'
    ),
    keras.layers.Dense(10, activation='softmax')
  ])

  model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-2, 1e-3])),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

  return model
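
Before configuring the tuner, I can sanity-check that the function builds a valid model. This is a minimal sketch, assuming the HyperParameters class is exposed at the package root and that every parameter falls back to its default value when it is used outside of a trial:

from kerastuner import HyperParameters

# an empty container, so every parameter uses its default value
hp = HyperParameters()
model = build_model(hp)
# print the architecture produced by the default values
model.summary()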

Configure the tuner

When the function is ready, I have to configure the tuner. I need to specify the objective, which is the metric used to compare models. In this case, I want to use the validation set accuracy.

The other important parameter is the number of trials. That parameter tells the tuner how many hyperparameter combinations it has to test.

I must also specify the project name and the output directory. They tell the tuner where it should store the debugging data.

Note that I passed the function defined above as the first parameter!

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    directory='output',
    project_name='FashionMNIST')
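
Before starting the search, I can print a summary of the search space to verify the parameter definitions. The sketch below assumes the search_space_summary method is available in this pre-release version:

# prints the name, type, and range of every registered hyperparameter
tuner.search_space_summary()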

Hyperparameter tuning

Now, I have a configured tuner, and it is time to run it. I need the training dataset and the number of epochs for every trial. I must also specify the validation dataset or the percentage of the training dataset that will be used for validation.

I call the search function, and eventually, I will get the results of the tuning.

tuner.search(train_images, train_labels, epochs=2, validation_split=0.1)
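
After the search finishes, keras-tuner can print a short report of the completed trials. Again, this assumes the results_summary method is available in this pre-release version:

# shows the tried hyperparameter combinations and their scores
tuner.results_summary()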

Using the model

When the search is done, I can get the best model and either start using it or continue training.

model = tuner.get_best_models(num_models=1)[0]
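
If I only need the winning hyperparameter values rather than the trained model, I can retrieve them too. This is a sketch assuming the get_best_hyperparameters method is available in this pre-release version:

best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
# a dictionary mapping hyperparameter names to the winning values
print(best_hp.values)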

In this example, I trained the model for only two epochs, so I will continue training it, starting from the third epoch.

model.fit(train_images, train_labels, epochs=10, validation_split=0.1, initial_epoch=2)
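
Finally, I can evaluate the fully trained model on the test set that was loaded at the beginning, using the standard Keras evaluate method.

test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_accuracy)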