In this assignment, we will be exploring deep learning. This is NOT a machine learning class, so very little math is expected in this assignment; however, thinking about the questions mathematically and exercising proper debugging practices will help.
This assignment can be completed in an IPython notebook. The starting code is available in a notebook to download here (you might need to right-click and "Save As"). Feel free to use Google Colab or an editor on your own machine. Be warned that part of this assignment involves neural network training, which is computationally expensive and may take a long time on a laptop.
This assignment will be graded by the staff, so make sure to comment your code and document design considerations. You are allowed to use external libraries if you wish (e.g.
In this section, we will be optimizing the parameters for neural network training, specifically for an image classification task. Note: if you are using Google Colab make sure to change the Runtime Type to GPU to speed up training!
These are some common hyperparameters to experiment with when training neural networks:
batch_size – smaller batch size = more parameter updates per epoch
Dropout(p) – randomly sets activations to 0 with probability p to prevent overfitting
optimizer – change from SGD (stochastic gradient descent) to Adadelta or another optimizer here
lr – larger learning rate = larger step per update, less granularity
epochs – more training passes through the data
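To build intuition for the Dropout(p) entry above, here is a small NumPy sketch (illustration only, not Keras code) of "inverted" dropout: each activation is zeroed with probability p during training, and the survivors are rescaled by 1/(1-p) so the expected value is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
activations = np.ones(10_000)
p = 0.9  # drop probability, matching Dropout(0.9) in the starter code

# Zero each activation with probability p, keep with probability 1 - p,
# then rescale survivors so the expected activation is preserved.
mask = rng.random(activations.shape) >= p
dropped = activations * mask / (1 - p)

print(round(mask.mean(), 2))     # fraction of activations kept (~1 - p)
print(round(dropped.mean(), 2))  # mean activation after rescaling (~1.0)
```

With p = 0.9 only about 10% of activations survive each step, which is why a dropout rate that high can make it very hard for the network to learn.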
TODO: Try out at least three different parameter alterations and note each final accuracy. To receive full credit, you must change 3 different hyperparameters (e.g. batch size, learning rate, and epochs), and it may be informative to change a hyperparameter multiple times. In addition, write a sentence or two describing your high-level intuition for the performance difference (if any). If you're coding in the notebook environment, feel free to write your explanation in a text cell. We're not looking for any particular accuracy; we just want an exploration of different hyperparameter tunings and observations of the change in performance.
There is a lot of jargon involved, so you should consult the Keras documentation or other external resources and ask questions on Piazza. Feel free to make architecture changes (e.g.
model.add() new layers) if you’re feeling adventurous! However, just changing three hyperparameters (with explanation) is sufficient for full credit.
''' Trains a simple convnet on the MNIST dataset. Credit: Keras Team '''
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

# DEFINING *SOME* HYPERPARAMETERS
batch_size = 256
num_classes = 10
epochs = 2

# DATA CLEAN-UP
img_rows, img_cols = 28, 28
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# MODEL DEFINITION (some hardcoded hyperparameters)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.9))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.9))
model.add(Dense(num_classes, activation='softmax'))

# MODEL COMPILATION AND TRAINING
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.001),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

# MODEL EVALUATION
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
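One way to organize the TODO above is a small sweep loop that records the final test accuracy of each setting. This is a bookkeeping sketch only: `train_and_evaluate` is a hypothetical helper that would wrap the script above (re-building and re-training the model with the given hyperparameters and returning score[1] from model.evaluate); here it is stubbed so the sketch is self-contained.

```python
def train_and_evaluate(batch_size, lr, epochs):
    """Hypothetical wrapper around the training script above.

    Stub: replace the body with the Keras code, parameterized by the
    arguments, and return the test accuracy from model.evaluate.
    """
    return 0.0

# One baseline plus three single-hyperparameter changes, as the TODO asks.
trials = [
    {"batch_size": 256, "lr": 0.001, "epochs": 2},  # baseline
    {"batch_size": 64,  "lr": 0.001, "epochs": 2},  # smaller batches
    {"batch_size": 256, "lr": 0.01,  "epochs": 2},  # larger step size
    {"batch_size": 256, "lr": 0.001, "epochs": 5},  # more passes over data
]

results = []
for setting in trials:
    acc = train_and_evaluate(**setting)
    results.append((setting, acc))
    print(setting, "-> test accuracy:", acc)
```

Keeping the settings and accuracies together like this makes it easy to paste the table into a text cell alongside your intuition for each change.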
In this section, we will explore different word embeddings systems using the Magnitude package and pre-trained word embeddings.
A brief installation (works in Google Colab):
pip install pymagnitude
We can now instantiate a query-able Magnitude vectors object as follows:
from pymagnitude import *

file_path = "GoogleNews-vectors-negative300.magnitude"
vectors = Magnitude(file_path)
print(vectors.distance("cat", "dog"))
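To build intuition for what these queries compare, here is a toy cosine-similarity sketch over made-up 3-d vectors (real GoogleNews embeddings are 300-dimensional, and Magnitude's exact distance metric is described in its documentation). The assumption illustrated is simply that related words have more similar vectors.

```python
import numpy as np

# Toy "embeddings" -- made-up numbers for illustration only.
emb = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.8, 0.9, 0.2]),
    "car": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(u, v):
    # Cosine of the angle between u and v: 1.0 = same direction.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(emb["cat"], emb["dog"]))  # high: related words
print(cosine_similarity(emb["cat"], emb["car"]))  # lower: unrelated words
```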
For this question, create a text cell in your IPython notebook answering the following questions by referring to the Magnitude documentation:
['tissue', 'papyrus', 'manila', 'newsprint', 'parchment', 'gazette']
X is to throw".
alumni in the vocabulary? What about