In this article, I am going to show how to use the random search hyperparameter tuning method with Keras. I decided to use the keras-tuner project, which at the time of writing the article has not been officially released yet, so I have to install it directly from the GitHub repository.
# remove the leading ! if you are not running this in a Jupyter Notebook
!git clone https://github.com/keras-team/keras-tuner.git
!pip install ./keras-tuner
As an example, I will use the Fashion-MNIST dataset, so the goal is to perform a multiclass classification of images. First, I have to load the training and test datasets. Fashion-MNIST is available as one of the Keras built-in datasets, so the following code downloads everything I need.
import tensorflow as tf
from tensorflow import keras
import numpy as np
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
The images have already been preprocessed, so the dataset contains a single gray-scale channel with values in the range 0-255. I want to scale the values to the range between 0 and 1, so I divide them by 255.
train_images = train_images / 255.0
test_images = test_images / 255.0
I am going to reshape the dataset, adding an explicit channel dimension, so that I can use it as the input of a convolutional layer.
train_images = train_images.reshape(len(train_images), 28, 28, 1)
test_images = test_images.reshape(len(test_images), 28, 28, 1)
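To check what these two steps do without downloading the dataset, here is a minimal sketch on a synthetic batch. The array `fake_images` and its size are made up for illustration; only the NumPy operations mirror the code above.

```python
import numpy as np

# Hypothetical stand-in for Fashion-MNIST: 4 gray-scale images, 28x28, values 0-255.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by 255.0 converts the integers to floats and maps them into [0, 1].
scaled = fake_images / 255.0

# Add a trailing channel dimension so each image has the shape (28, 28, 1).
reshaped = scaled.reshape(len(scaled), 28, 28, 1)

print(scaled.dtype, reshaped.shape)
```

The reshape does not change any pixel values. It only tells the convolutional layer that every image has exactly one channel.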
Parameters
Keras-tuner needs a function that accepts the set of parameters and returns a compiled model, so I have to define such a function.
There are four kinds of parameters available: range, choice, linear, and fixed.
Range
The range returns integer values between the given minimum and maximum. The values are incremented by the step parameter.
hp.Range('conv_1_filter', min_value=64, max_value=128, step=16)
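To make the candidate values concrete, here is a plain-Python sketch that lists them. The helper `range_values` is hypothetical and not part of keras-tuner, and it assumes that the maximum is included in the range.

```python
def range_values(min_value, max_value, step):
    """Hypothetical helper: integers a range parameter could take (inclusive maximum assumed)."""
    return list(range(min_value, max_value + 1, step))


conv_1_filter_values = range_values(64, 128, 16)
print(conv_1_filter_values)  # [64, 80, 96, 112, 128]
```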
Linear
The linear parameter is similar to the range but works with float numbers. In this case, the step is called resolution.
hp.Linear('learning_rate', min_value=0.01, max_value=0.1, resolution=0.1)
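If the resolution really behaves like a step size (an assumption, I have not verified it against the keras-tuner source), the candidate values can be listed with a few lines of plain Python. The helper `linear_values` is hypothetical. Under that assumption, a resolution of 0.1 on the interval from 0.01 to 0.1 leaves only the minimum, so the sketch also shows a finer, made-up resolution of 0.01.

```python
def linear_values(min_value, max_value, resolution):
    """Hypothetical helper: floats from min_value up to max_value in steps of resolution."""
    # The small epsilon guards against floating-point error in the division.
    steps = int((max_value - min_value) / resolution + 1e-9)
    # Compute every value from its index to avoid accumulating rounding errors.
    return [round(min_value + i * resolution, 10) for i in range(steps + 1)]


coarse = linear_values(0.01, 0.1, 0.1)
fine = linear_values(0.01, 0.1, 0.01)
print(coarse)
print(fine)
```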
Choice
The choice parameter is much simpler. We give it a list of values, and it returns one of them.
hp.Choice('learning_rate', values=[1e-2, 1e-3])
Fixed
Finally, we can set a constant as the parameter value. It is useful when we want to let keras-tuner tune all parameters except one. The fixed parameter works only with the predefined models: Xception and ResNet.
hp.Fixed('learning_rate', value=1e-4)
How to define the model
Here is my function that builds a neural network using the parameters given by keras-tuner. Even though it is not necessary in this case, I will parameterize all layers and the learning rate, to show that it is possible.
def build_model(hp):
    model = keras.Sequential([
        keras.layers.Conv2D(
            filters=hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
            kernel_size=hp.Choice('conv_1_kernel', values=[3, 5]),
            activation='relu',
            input_shape=(28, 28, 1)
        ),
        keras.layers.Conv2D(
            filters=hp.Range('conv_2_filter', min_value=32, max_value=64, step=16),
            kernel_size=hp.Choice('conv_2_kernel', values=[3, 5]),
            activation='relu'
        ),
        keras.layers.Flatten(),
        keras.layers.Dense(
            units=hp.Range('dense_1_units', min_value=32, max_value=128, step=16),
            activation='relu'
        ),
        keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-2, 1e-3])),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
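It is worth knowing how large the search space defined by this function is. The sketch below counts the combinations in plain Python, assuming that the maximum of every range is inclusive. The dictionary `search_space` is my own restatement of the parameters above, not something keras-tuner produces.

```python
from math import prod

# Candidate values per hyperparameter, assuming range maxima are inclusive.
search_space = {
    'conv_1_filter': list(range(64, 128 + 1, 16)),   # 5 values
    'conv_1_kernel': [3, 5],                         # 2 values
    'conv_2_filter': list(range(32, 64 + 1, 16)),    # 3 values
    'conv_2_kernel': [3, 5],                         # 2 values
    'dense_1_units': list(range(32, 128 + 1, 16)),   # 7 values
    'learning_rate': [1e-2, 1e-3],                   # 2 values
}

total_combinations = prod(len(values) for values in search_space.values())
print(total_combinations)  # 840
```

With several hundred possible combinations, training a model for each of them would be expensive, which is why sampling only a few of them at random is attractive.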
Configure the tuner
When the function is ready, I have to configure the tuner. We need to specify the objective, which is the metric used to compare models. In this case, I want to use validation set accuracy.
The other important parameter is the number of trials. That parameter tells the tuner how many hyperparameter combinations it has to test.
I must also specify the project name and the output directory. They tell the tuner where it should store the debugging data.
Note that I passed the function defined above as the first parameter!
from kerastuner.tuners import RandomSearch
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    directory='output',
    project_name='FashionMNIST')
Hyperparameter tuning
Now, I have a configured tuner. It is time to run it. I need the training dataset and the number of epochs in every trial. I must also specify the validation dataset or the percentage of the training dataset that will be used for validation.
I call the search function, and eventually, I will get the results of the tuning.
tuner.search(train_images, train_labels, epochs=2, validation_split=0.1)
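Conceptually, a random search draws a few random combinations, scores each one, and keeps the best. The following sketch illustrates that idea in plain Python. It is not the keras-tuner implementation: `toy_objective` is a made-up score that stands in for training a model and reading its validation accuracy, and the small `space` dictionary is a hypothetical subset of the parameters used in this article.

```python
import random

# Hypothetical, reduced search space used only for this illustration.
space = {
    'dense_1_units': list(range(32, 128 + 1, 16)),
    'conv_1_kernel': [3, 5],
    'learning_rate': [1e-2, 1e-3],
}


def sample_configuration(space, rng):
    """Pick one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in space.items()}


def toy_objective(config):
    """Made-up score standing in for validation accuracy; higher is better."""
    return -abs(config['dense_1_units'] - 96) - 1000 * abs(config['learning_rate'] - 1e-3)


def random_search(space, objective, max_trials, seed=0):
    """Evaluate max_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float('-inf')
    for _ in range(max_trials):
        config = sample_configuration(space, rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(space, toy_objective, max_trials=5)
print(best_config, best_score)
```

In the real tuner, the objective is far more expensive because every trial trains a network for the requested number of epochs, so the number of trials controls most of the running time.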
Using the model
When the search is done, I can get the best model and either start using it or continue training.
model = tuner.get_best_models(num_models=1)[0]
In this example, I trained the model for only two epochs, so I will continue training it, starting from the third epoch.
model.fit(train_images, train_labels, epochs=10, validation_split=0.1, initial_epoch=2)
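Note that when `initial_epoch` is set, Keras treats `epochs` as the index of the final epoch, not as the number of additional epochs. The arithmetic can be sketched without Keras:

```python
initial_epoch = 2  # epochs already completed during the search
epochs = 10        # index of the final epoch, not the number of extra epochs

# Epoch numbers as Keras prints them in its progress output (counted from 1).
epochs_to_run = list(range(initial_epoch + 1, epochs + 1))
print(epochs_to_run)       # [3, 4, 5, 6, 7, 8, 9, 10]
print(len(epochs_to_run))  # 8
```

So the call above trains the model for eight more epochs, not ten.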