Linear + Softmax Classifier + Stochastic Gradient Descent (SGD) Lab

Here we will implement a linear classifier using a softmax function and negative log likelihood loss. These terms will become clearer as we work through this lecture. The idea is that you will learn these concepts by attending lectures, doing background reading, and completing this lab. Pytorch contains a powerful set of libraries for training complex machine learning and deep learning models, but for this lab we will also be implementing things from scratch.
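
As a quick preview (a minimal sketch only, not the lab's final implementation), the softmax turns a vector of class scores into probabilities that sum to one, and the negative log likelihood loss penalizes assigning a low probability to the correct class:

import torch

# Sketch only: softmax and negative log likelihood for a single example with 3 classes.
scores = torch.tensor([2.0, 0.5, -1.0])               # unnormalized class scores
probs = torch.exp(scores) / torch.exp(scores).sum()   # softmax: probabilities summing to 1
correct_class = 0                                      # assumed label for this example
loss = -torch.log(probs[correct_class])                # negative log likelihood loss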

1. First let's load some training data.

We will be using the CIFAR-10 dataset. CIFAR-10 consists of 50k training images belonging to 10 categories; a validation set with 10k images is also provided. The images are rescaled to a size of 32x32 pixels. This is a relatively small dataset that can be loaded entirely into memory, so it is very convenient to experiment with. You will probably read several papers reporting results on this dataset during this class, but most state-of-the-art methods run their experiments on much larger datasets with millions of images. In those cases you need to be more clever about data loading and reading, and pytorch offers parallel data loaders and other useful tools to help.
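
For example (a sketch only; the batch size and worker count below are arbitrary choices, not part of this lab), torch.utils.data.DataLoader can batch, shuffle, and read any Dataset with multiple worker processes:

import torch
from torchvision.datasets import CIFAR10
import torchvision.transforms as transforms

# Sketch: a parallel data loader over CIFAR-10.
dataset = CIFAR10(root='./data', train=True, download=True, transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True, num_workers=2)
images, labels = next(iter(loader))  # images has shape [64, 3, 32, 32], labels has shape [64]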

Pytorch already has a Dataset class for CIFAR10, so we just have to learn to use it. You should also check the ImageFolder dataset class, which could be useful for some of your projects. You should also learn to create your own Dataset classes by inheriting from torch.utils.data.Dataset; you only have to implement two methods: __getitem__ and __len__. Pytorch is open source, so you can check the code for other Dataset classes, including CIFAR10: https://github.com/pytorch/vision/blob/master/torchvision/datasets/cifar.py
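
As a rough sketch (the class name and the data passed to it are made up for illustration), a custom dataset only needs those two methods:

import torch
from torch.utils.data import Dataset

# Hypothetical custom Dataset wrapping a tensor of images and a tensor of labels.
class TensorImageDataset(Dataset):
    def __init__(self, images, labels):
        self.images = images  # e.g. a float tensor of shape [N, 3, 32, 32]
        self.labels = labels  # e.g. a long tensor of shape [N]

    def __getitem__(self, index):
        return self.images[index], self.labels[index]

    def __len__(self):
        return len(self.labels)

# Usage: toy_data = TensorImageDataset(torch.rand(10, 3, 32, 32), torch.zeros(10, dtype=torch.long))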

In [1]:
import torch, torchvision, PIL
from torchvision.datasets import CIFAR10 
from PIL import Image
import numpy as np
from io import BytesIO
import torchvision.transforms as transforms
import IPython.display
import lab_utils # Stuff from previous labs.
from lab_utils import pil2tensor, tensor2pil
%load_ext autoreload
%autoreload 2

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
class2id = {name: idx for (idx, name) in enumerate(classes)}
trainset = CIFAR10(root='./data', train = True, download = True)

# Datasets need to implement the __len__ method.
print('\nThe training data has %d samples' % len(trainset))

# Datasets need to implement the __getitem__ method.
image_index = 0
img, label = trainset[image_index]
print('Image %d is a %s' % (image_index, classes[label]));
lab_utils.show_image(img.resize((128, 128)));  # Make bigger to visualize.
Files already downloaded and verified

The training data has 50000 samples
Image 0 is a frog

Let's show a group of images from the dataset to get a better idea of what they look like.

In [2]:
sample_imgs = list()
for i in range(0, 100):
    img, _ = trainset[i]  # We don't do anything with the labels so we just use underscore _
    sample_imgs.append(img)
    
# Lots of hacks to show a list of PIL images as a grid of images.
def show_images(images, zoom = 1.0, ncols = 10):
    imgs = images if type(images[0]) == torch.Tensor else [pil2tensor(img) for img in images]
    grid = torchvision.utils.make_grid(imgs, nrow = ncols)  # Single [3, H, W] tensor tiling all images.
    # Convert the CHW float tensor in [0, 1] to an HWC uint8 array and then to a PIL image.
    pil_image = Image.fromarray(np.transpose(grid.mul(255).byte().numpy(), (1, 2, 0)))
    target_size = (int(zoom * pil_image.width), int(zoom * pil_image.height))
    pil_image = pil_image.resize(target_size, resample = PIL.Image.BILINEAR)
    bio = BytesIO(); pil_image.save(bio, format = 'png')
    IPython.display.display(IPython.display.Image(bio.getvalue()))

# Show images below at 2.0x zoom level. Remember they are 32x32 pixels only.
show_images(sample_imgs, zoom = 2.0, ncols = 20)