Data augmentation refers to the process of expanding a dataset’s size by generating new data points from existing ones. This is achieved through various modifications and transformations of the original data. The primary objective of data augmentation is to enhance the dataset’s variety and richness, thereby reducing the likelihood of overfitting. Overfitting happens when a model becomes too attuned to the specific details and noise in the training data, impairing its ability to perform well on unseen data. By introducing a more diverse set of data through augmentation, models can be trained to generalize better to new, unseen data.
There are several common techniques for data augmentation, including:
1. Random rotation: randomly rotating the image by a certain angle to increase the diversity of the dataset. For example, rotating an image of a dog by 30 degrees. This helps the model generalize better by learning to recognize the same object in different orientations.
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

# Load CIFAR-10 and rotate images randomly by up to 40 degrees
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
datagen = ImageDataGenerator(rotation_range=40)
2. Random flipping: randomly flipping the image horizontally or vertically. For example, flipping an image of a cat horizontally so that the cat appears to face the opposite direction. This helps the model generalize better by learning to recognize the same object regardless of its orientation.
datagen = ImageDataGenerator(horizontal_flip=True)  # random horizontal flips
datagen = ImageDataGenerator(vertical_flip=True)    # random vertical flips
3. Random cropping: randomly cropping a portion of the image to increase the diversity of the dataset. For example, cropping an image of a car to show only its front. This helps the model generalize better by learning to recognize the object regardless of its background or surrounding context. ImageDataGenerator has no direct crop option, but random width and height shifts give a similar effect by moving the subject within the frame:
datagen = ImageDataGenerator(width_shift_range=0.2, height_shift_range=0.2)  # shift by up to 20%
4. Random brightness and contrast: randomly adjusting the brightness and contrast of the image to increase the diversity of the dataset. For example, making an image of a sunset brighter or darker. This helps the model generalize better by learning to recognize the object regardless of the lighting conditions. Note that ImageDataGenerator only exposes brightness directly; contrast changes would need a custom preprocessing_function.
datagen = ImageDataGenerator(brightness_range=[0.2, 1.0])  # 1.0 keeps the original brightness; lower values darken
5. Random zoom: randomly zooming in or out of the image to increase the diversity of the dataset. For example, zooming in on an image of a flower to show only its center. This helps the model generalize better by learning to recognize the object regardless of its size or scale.
datagen = ImageDataGenerator(zoom_range=[0.5, 1.0])  # factors below 1.0 zoom in
6. Random shearing: randomly applying a shear transformation to an image. For example, tilting an image of a building to make it look as if it is leaning. This helps the model generalize better by learning to recognize the object regardless of its orientation.
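ImageDataGenerator supports shearing directly through its shear_range argument (the shear intensity in degrees), so the matching one-liner is:
datagen = ImageDataGenerator(shear_range=20)  # shear by up to 20 degrees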
Let's walk through a real example using the CIFAR-10 dataset.
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Convert the data to float and scale it between 0 and 1
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# One-hot encode the integer labels (required by categorical_crossentropy)
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
# Create the data generator
datagen = ImageDataGenerator(
    rotation_range=40,       # randomly rotate the image by up to 40 degrees
    width_shift_range=0.2,   # randomly shift the image horizontally by up to 20%
    height_shift_range=0.2,  # randomly shift the image vertically by up to 20%
    horizontal_flip=True,    # randomly flip the image horizontally
    fill_mode='nearest')     # fill in newly created pixels with the nearest pixel value
# Fit the data generator to the training data (only strictly needed for featurewise options, but harmless here)
datagen.fit(x_train)
# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# Fit the model to the augmented data (fit() accepts generators in modern Keras)
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32, epochs=100)
This code uses the CIFAR-10 dataset, which consists of 60,000 32×32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
1. The CIFAR-10 dataset is loaded, the training and test data are scaled to values between 0 and 1, and the integer labels are one-hot encoded.
2. An ImageDataGenerator object is created, with several data augmentation techniques specified:
- rotation_range: randomly rotate the image by a random angle of up to 40 degrees.
- width_shift_range and height_shift_range: randomly shift the image horizontally and vertically by up to 20% of its width and height.
- horizontal_flip: randomly flip the image horizontally.
- fill_mode: fill in newly created pixels with the nearest pixel value.
3. The data generator is fit to the training data.
4. A convolutional neural network model is created using the Keras Sequential API. It consists of several convolutional layers, activation layers, max pooling layers, and dropout layers. The output of the model is a probability distribution over the 10 classes.
5. The model is compiled with categorical cross-entropy loss, the Adam optimizer, and accuracy as the metric.
6. The model is trained on the augmented data using the fit() method, which takes the data generator as the first argument, along with the number of steps per epoch and the number of epochs (here, 100).
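If you want to see what the generator is actually feeding the network, a short optional sketch like the one below (not part of the original listing; it assumes matplotlib is installed) plots a single augmented batch:
import matplotlib.pyplot as plt

# Preview one augmented batch of 9 images from the generator
for batch_x, batch_y in datagen.flow(x_train, y_train, batch_size=9):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(batch_x[i])  # pixel values are already scaled to [0, 1]
        plt.axis('off')
    plt.show()
    break  # one batch is enough for a preview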
The output of this code is the model's training accuracy and loss for each epoch as it trains on the augmented data. With data augmentation, this model should achieve better accuracy and generalization than one trained on non-augmented data. The expected result is an improvement in accuracy on the test set, since the model is exposed to many variations of the training data. You can observe this by comparing the accuracy of a model trained on the original data without augmentation to that of the model trained on the augmented data.
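As a minimal sketch of that comparison (assuming the model and the preprocessed test data from the listing above), evaluate the trained model on the held-out test set, then repeat the run with the augmentation options removed and evaluate again:
# Evaluate the trained model on the untouched test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy: %.4f' % test_acc)
Running the same evaluation for a model trained without the ImageDataGenerator options gives the baseline to compare against.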
Try this simple code yourself and see the difference.