Spring 2022 CS444

Extra Credit: Adversarial Attacks

Due date: Monday, May 9th, 11:59:59PM -- has to be done individually, no late days allowed!


The goal of this extra credit assignment is to implement adversarial attacks on pre-trained classifiers. This topic was actually omitted from this semester's lectures, so you will need to self-study the slides and lecture recording from last year.

Fast Gradient Sign Method (up to 20 pts)

To start out, you should implement the Fast Gradient Sign Method (FGSM) described in the above lecture and this paper. We also recommend you go through the PyTorch tutorial for FGSM.
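For orientation, here is a minimal FGSM sketch in PyTorch. The names (`model`, `images`, `labels`), the epsilon value, and the assumption that inputs live in [0, 1] are all illustrative; adapt the clamping if your pipeline normalizes with ImageNet statistics before the forward pass.

```python
# Minimal FGSM sketch. Assumptions: `model` is a pretrained classifier,
# `images` is a batch of input tensors in [0, 1], and `labels` holds the
# ground-truth class indices.
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """Craft adversarial examples with the Fast Gradient Sign Method."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Take one step of size epsilon in the direction that increases the loss.
    adv = images + epsilon * images.grad.sign()
    return torch.clamp(adv, 0, 1).detach()
```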

Rather than performing your attacks on the full ImageNet dataset, which is >10 GB of data, we recommend running on Imagenette, a small subset of ImageNet that contains only 10 classes. The description of the dataset can be found here. You can use any of the pretrained models as the model under attack. The pretrained models are trained on the whole ImageNet dataset, which has 1000 labels. To attack the relevant labels in those models, simply select the model outputs corresponding to the classes that appear in Imagenette. You may find this helpful for selecting those classes.
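One way to do this class selection is sketched below: run the full 1000-way classifier and keep only the logits at the Imagenette class indices. The index list shown is the one commonly cited for Imagenette; verify it against the class mapping linked above before relying on it, and note that ResNet-18 is just an example model.

```python
# Sketch of restricting a 1000-way ImageNet classifier to the 10 Imagenette
# classes. The index list is an assumption -- double-check it against the
# official ImageNet class mapping.
import torch
from torchvision import models

# ImageNet indices commonly listed for the Imagenette classes:
# tench, English springer, cassette player, chain saw, church,
# French horn, garbage truck, gas pump, golf ball, parachute
IMAGENETTE_CLASSES = [0, 217, 482, 491, 497, 566, 569, 571, 574, 701]

model = models.resnet18(pretrained=True).eval()

def imagenette_logits(model, images):
    """Full 1000-way forward pass, keeping only the 10 Imagenette outputs."""
    logits = model(images)                # shape: (batch, 1000)
    return logits[:, IMAGENETTE_CLASSES]  # shape: (batch, 10)
```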

Feel free to use the PyTorch tutorial code to get started. What you submit for your report is up to you, but the more explanations, experiments, visualizations, and analysis you provide, the more extra credit you will receive!

Anything else (up to 20 more points)

You may implement any other adversarial attack or defense method you want. Here we provide some suggestions based on the same lecture linked above:

Attacks

  1. Iterative gradient sign (a rough sketch of this, together with the least-likely-class variant, follows this list)
  2. Least likely class
  3. Some source/target misclassification method (maximizing response for a specific target class)
  4. Any other adversarial methods
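Here is a rough sketch of the iterative gradient sign attack, with an optional least-likely-class variant. The function name, step sizes, and the [0, 1] pixel range are illustrative assumptions, not a prescribed implementation.

```python
# Rough sketch of iterative gradient sign, with an optional least-likely-class
# mode. All names and defaults here are illustrative.
import torch
import torch.nn.functional as F

def iterative_fgsm(model, images, labels, epsilon=0.03, alpha=0.005, steps=10,
                   least_likely=False):
    x_adv = images.clone().detach()
    if least_likely:
        # Target the class the model currently finds least probable.
        with torch.no_grad():
            target = model(images).argmin(dim=1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target if least_likely else labels)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            if least_likely:
                x_adv = x_adv - alpha * grad.sign()  # move toward the target
            else:
                x_adv = x_adv + alpha * grad.sign()  # move away from the label
            # Project back into the epsilon-ball and a valid pixel range.
            x_adv = torch.max(torch.min(x_adv, images + epsilon), images - epsilon)
            x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv.detach()
```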

Defenses against the adversarial attacks you implemented

  1. SafetyNet
  2. Robust architectures (feature denoising)
  3. Preprocessing inputs (input transformations; see the sketch after this list)
  4. Any other defenses
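As one example of item 3, an input-transformation defense can be as simple as re-encoding each image before classification to destroy high-frequency adversarial noise. The sketch below uses a JPEG round trip; the choice of JPEG and the quality setting are assumptions, and other transformations (blurring, bit-depth reduction, cropping/rescaling) are equally valid starting points.

```python
# Minimal sketch of an input-transformation defense: round-trip each [0, 1]
# image tensor through JPEG compression before feeding it to the classifier.
import io
import torch
from PIL import Image
from torchvision import transforms

to_pil = transforms.ToPILImage()
to_tensor = transforms.ToTensor()

def jpeg_defense(images, quality=75):
    """Re-encode a batch of [0, 1] image tensors as JPEG and decode them back."""
    defended = []
    for img in images:
        buf = io.BytesIO()
        to_pil(img.cpu()).save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        defended.append(to_tensor(Image.open(buf)))
    return torch.stack(defended).to(images.device)
```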

Surprise us with anything else!

  1. Study the transferability of methods between models (see the sketch after this list)
  2. Apply adversarial attacks to different problems (object detection, segmentation, image captioning, etc.)
  3. Implement/run other adversarial methods
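For item 1, a transferability study can be as simple as crafting adversarial examples on one pretrained model and measuring how often they fool a different one. The sketch below assumes the `fgsm_attack` helper from the earlier sketch, a data loader that yields ImageNet-index labels, and two arbitrarily chosen torchvision models.

```python
# Sketch of a transferability check: craft adversarial examples on a source
# model and measure the misclassification rate of a different target model.
# Assumes `fgsm_attack` from the earlier sketch and ImageNet-index labels.
import torch
from torchvision import models

source = models.resnet18(pretrained=True).eval()
target = models.vgg16(pretrained=True).eval()

def transfer_success_rate(loader, epsilon=0.03):
    fooled, total = 0, 0
    for images, labels in loader:
        adv = fgsm_attack(source, images, labels, epsilon)
        with torch.no_grad():
            preds = target(adv).argmax(dim=1)
        fooled += (preds != labels).sum().item()
        total += labels.numel()
    return fooled / total
```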

How you choose to show your work is up to you! Just be thorough with explanations and visualizations and we will be flexible with grading. Roughly speaking, each substantial additional technique you implement, and present with sufficient documentation and analysis, will be worth around 5 points, up to a maximum of 20.

Submission Instructions

  1. Submit all of your code (Python files and any .ipynb files) in a single ZIP file. The filename should be netid_ec_code.zip.
  2. Submit a brief report in PDF format (no template; present your results however you like). The filename should be netid_ec_report.pdf.

If you wish to receive extra credit, this assignment must be turned in on time -- no late days allowed!