Spring 2026 CS 444
Assignment 3: Self-Supervised and Transfer Learning
Due date: Wednesday, April 1st, 11:59:59 PM
The goals of this assignment are to help you gain experience with PyTorch, learn how to use
pre-trained models provided by the deep learning community, and adapt
these models to new tasks and losses.
You will use a simple
self-supervised rotation prediction task to pre-train a model on the CIFAR10 dataset without using class labels, and
then fine-tune this model for CIFAR10 classification.
Before You Begin
Since this assignment requires the use of PyTorch, we strongly suggest that you self-study the official PyTorch tutorial. You can also check out the Introduction to PyTorch YouTube Series.
Starter Code
Download the starter code here -- and see setup instructions below under Assignment Setup.
Self-Supervised Learning by Rotation Prediction on CIFAR10

Source: Gidaris et al. (2018)
You will use PyTorch to train a model on a
self-supervised task, fine-tune a subset of the model’s weights, and
train a model in a fully supervised setting with different weight
initializations. You will be using the CIFAR10 dataset, which is a
dataset of small (32x32) images belonging to 10 different object
classes. For self-supervised training, you will ignore the provided
labels; however, you will use the class labels for fine-tuning and fully
supervised training.
The model architecture you will use is ResNet18. We will use the PyTorch ResNet18 implementation, so you do not need to create it from scratch.
The self-supervised training task is image rotation prediction, as proposed by Gidaris et al.
in 2018. For this task, all training images are randomly rotated by 0,
90, 180, or 270 degrees. The network is then trained to classify the
rotation of each input image using cross-entropy loss by treating each
of the 4 possible rotations as a class. This task can be treated as
pre-training, and the pre-trained weights can then be fine-tuned on the
supervised CIFAR10 classification task.
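The rotation step described above can be sketched as follows. This is a minimal example, not the starter code's interface: the helper name `rotate_batch` is hypothetical, and it assumes square images stored as a `(N, C, H, W)` tensor. Each image gets a random label `k` in {0, 1, 2, 3} and is rotated by `k` quarter turns with `torch.rot90`.

```python
import torch

def rotate_batch(images, generator=None):
    """Rotate each image by a random multiple of 90 degrees.

    images: tensor of shape (N, C, H, W) with H == W.
    Returns (rotated_images, labels), where labels[i] in {0, 1, 2, 3}
    is the number of quarter turns applied to images[i].
    """
    labels = torch.randint(0, 4, (images.size(0),), generator=generator)
    rotated = torch.stack([
        # For a single (C, H, W) image, rotate in the spatial plane (dims 1 and 2).
        torch.rot90(img, k=int(k), dims=(1, 2))
        for img, k in zip(images, labels)
    ])
    return rotated, labels

# Toy usage on random data standing in for a CIFAR10 batch.
gen = torch.Generator().manual_seed(0)
batch = torch.rand(8, 3, 32, 32, generator=gen)
rotated, rot_labels = rotate_batch(batch, generator=gen)
```

The returned labels can be fed directly to `torch.nn.CrossEntropyLoss` together with the 4-way output of the network.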
The top-level notebook (mp3_rotation.ipynb)
will guide you through all the steps of training a ResNet for the
rotation task and fine-tuning on the classification task. You will
implement the data loader, training and fine-tuning steps in PyTorch based on the starter code. In detail, you will complete the following five experiments:
- Train a ResNet18 on the Rotation task. Based on the
CIFAR10 dataloader, you will first generate the rotated images and
labels for the rotation task. You will train a ResNet18 on the rotation
task, report the test performance and store the model for the
fine-tuning tasks. For the test performance, find the lowest test loss
and report the corresponding accuracy. The expected rotation prediction accuracy on the test set should be around 78%.
- Initializing from the pretrained Rotation model, fine-tune only the weights of the final block of convolutional
layers and the final linear layer on the supervised CIFAR10 classification task and report the test results. (Note that training only the final linear layer while keeping the rest of the network frozen gives very poor results. You're welcome to verify this yourself, but it's not required for the submission.)
The expected test set accuracy for fine-tuning the specified layers of the pre-trained model should be around 60%.
- Initializing from random weights, train the full network on the supervised CIFAR10 classification task. Report the test results. The expected test set accuracy for training the entire randomly initialized model should be around 80%.
- Initializing from the pretrained Rotation model, train the full network on the supervised CIFAR10 classification task. Report the test results and compare the performance of all models across Parts 2, 3, and 4. The expected test set accuracy for fine-tuning the whole pre-trained model should be around 82%.
- Now come up with an alternative architecture (either custom-created
or off-the-shelf) together with any combination of "bells and whistles"
(optimization, data augmentation, normalization, etc.) to try to get the
highest accuracy you can on CIFAR10 classification. You should aim for at least a 3% improvement in accuracy over your best model among Parts 2, 3, and 4. You can either (i) train your advanced model from scratch, or (ii) pre-train your model on the rotation prediction task and then fine-tune either part or all of your model on classification. However, you are not allowed to initialize your model from any off-the-shelf pretrained weights. In your report, be sure to specify all the details of your architecture and hyperparameters and discuss your improvements compared to the baseline.
Extra Credit
- Perform an ablation study of your "bells and whistles" from Part 5, provided there are at least 3 changes from the default settings that may have improved performance. The changes need to be conceptually significant, not, for example, just hyperparameter tuning. The goal is to measure how much each component contributes. You may (a) comprehensively test all combinations of components, (b) start from the baseline and add one component at a time, or (c) start from your full approach and remove one component at a time. Keep all non-ablated settings fixed, and report a table of results with performance deltas relative to the baseline. Discuss your observations.
- In Figure 5(b) from the Gidaris et al. paper,
the authors show a plot of CIFAR10 classification performance vs.
number of training examples per category for a supervised CIFAR10 model
vs. a RotNet model with the final layers fine-tuned on CIFAR10. The plot
shows that pre-training on the Rotation task can be advantageous when
only a small amount of labeled data is available. Using your RotNet
fine-tuning code and supervised CIFAR10 training code from the main
assignment, try to create a similar plot by performing supervised
fine-tuning/training on only a subset of CIFAR10.
- If you have a good amount of compute at your disposal, try to train a rotation prediction model on the larger ImageNette dataset (still smaller than ImageNet, though).
- Try any other improvements or advanced techniques not suggested above. To get extra credit, you must clearly describe what you did in the report and document your results and code.
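For the limited-label experiment modeled on Figure 5(b), a class-balanced subset of the training set is needed. The sketch below shows one way to pick a fixed number of examples per category from a list of labels; the helper name `per_class_indices` and the toy labels are hypothetical. With the torchvision CIFAR10 dataset, the labels are available as the `targets` attribute, and the returned indices could be wrapped with `torch.utils.data.Subset`.

```python
import torch

def per_class_indices(labels, n_per_class, seed=0):
    """Return dataset indices with n_per_class random examples of each class."""
    labels = torch.as_tensor(labels)
    gen = torch.Generator().manual_seed(seed)
    keep = []
    for c in labels.unique().tolist():
        # Indices of all examples belonging to class c.
        idx = torch.nonzero(labels == c, as_tuple=True)[0]
        # Shuffle them reproducibly and keep the first n_per_class.
        shuffled = idx[torch.randperm(idx.numel(), generator=gen)]
        keep.extend(shuffled[:n_per_class].tolist())
    return keep

# Toy labels standing in for a dataset's label list: 3 classes, 50 examples each.
toy_labels = [0, 1, 2] * 50
subset_idx = per_class_indices(toy_labels, n_per_class=5)
# With a real dataset, something like:
#   small_train = torch.utils.data.Subset(trainset, per_class_indices(trainset.targets, 100))
```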
Assignment Setup
To complete this assignment in a reasonable amount of time,
you'll need to use a GPU. This can be your personal GPU, or Google
Colab or Colab Pro with a GPU enabled. Note: If you are using Colab Pro for this assignment, we recommend using the cheaper T4 GPU (the default GPU on free Colab) to avoid using up your compute credits -- training time should still be reasonable.
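A quick way to confirm that PyTorch can see a GPU, and to fall back to the CPU otherwise, is sketched below. The variable names are illustrative; the starter notebook may already define its own device handling.

```python
import torch

# Use the GPU if one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

# Tensors (and models) must be moved to the chosen device before use.
x = torch.zeros(2, 3, 32, 32).to(device)
```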
Environment Setup
If you will be working on the assignment on a local machine,
you will need a Python environment set up with the appropriate
packages. Unless you have a machine with a GPU, running this
assignment on your local machine will be very slow and is not
recommended.
We suggest that you use Anaconda to manage Python package dependencies (https://www.anaconda.com/download). This guide provides useful information on how to use Conda: https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html.
Ensure that IPython is installed (https://ipython.org/install). You may then navigate to the assignment directory in a terminal and start a local notebook server using the jupyter notebook command.
Data Setup
The CIFAR10 dataset is part of PyTorch's TorchVision library. Follow the instructions in the starter code to download and use the dataset.
Submission Instructions:
The assignment deliverables are as follows. If you are
working in a pair, only one designated student should make the
submission. Indicate your Kaggle leaderboard team name and your team members in the report.
- You must submit your output Kaggle CSV files (generated by "mp3_rotation_report_kaggle.ipynb" or "mp3_classify_report_kaggle.ipynb") for each part to their corresponding Kaggle competition webpages:
- Part 1
- Part 2
- Part 3
- Part 4
- Part 5
- Upload three files to Canvas:
- netid_mp3_output.pdf: Your IPython notebook with output cells converted to PDF format
- netid_mp3_code.zip: All of your code (Python files and .ipynb file) in a single ZIP file
- netid_mp3_report.pdf: Your assignment report (using this template) in PDF format
Please refer to course policies on collaborations, late submission, etc.