CS 6501-009: Computational Visual Recognition

Instructor: Vicente Ordóñez R (vicente at virginia.edu)

Instructor's Office Hour: Tuesdays 3pm to 4pm at Rice Hall 310
TA: Tianlu Wang (tw8cb at virginia.edu) -- TA Office Hour: Wednesdays 5pm to 6pm at Rice 430 (desk 12)
TA: Siva Sitaraman (ks6cq at virginia.edu) -- TA Office Hour: Fridays 3pm to 4pm at Rice 204
Time: Tuesday & Thursday between 11:00AM and 12:15PM, at Olsson Hall 005.
Discussion Forum: http://piazza.com/virginia/fall2017/cs6501009/home

How can we use computers to recognize objects, people, actions, animals, places, etc. from images? This seemingly trivial task, which people perform without much effort, has remained one of the core problems in Computer Vision. In this class we will study, play with, and implement algorithms for computational visual recognition using machine learning and deep learning. The class sessions will consist of lectures by the instructor on the most foundational topics, and several student-led paper review sessions to study more recent developments. After this class you will be able to use computational visual recognition for problems ranging from classifying images to detecting and outlining every object in an image. In summary, after successful completion of this course you should be able to teach a robot how to distinguish dogs from cats.

More about this class

Topics: Sign up for this class if you are interested in any of the following:

Prerequisites: This course requires no previous background in computer vision or machine learning, but knowledge of either will be helpful. You need to know about matrices, calculating derivatives, and probabilities (Bayes' rule). You will also need to be at least a moderately proficient programmer in Python. There will be several lab assignments. These assignments will show you the basics of modern general visual recognition algorithms and models, and will give you the tools for implementing more advanced ones. There will also be a couple of quizzes directly related to the assignments and the material covered during class. Finally, we will have a class project where you can work on something beyond your assignments, with more freedom to pursue a focused problem that interests you and better matches your background. We will be using Python and PyTorch in the lecture notes, so it helps to have completed a few projects in Python before the class starts. You should install Python, Jupyter, and PyTorch, and complete the following notebook before our first day of class: [pytorch_tensors].
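
As a rough self-check of the background listed above (matrices, derivatives, Bayes' rule, and basic Python), the short sketch below may be useful. It is not part of the course materials or of the [pytorch_tensors] notebook. The function names (matmul, numeric_derivative, bayes_posterior) and the example numbers are made up for illustration, and it uses only the Python standard library.

```python
# Prerequisite self-check sketch: plain Python, no third-party packages.
# All names and numbers here are illustrative assumptions, not course code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def numeric_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) via Bayes' rule, where B is a positive observation.

    prior = P(A), likelihood = P(B | A),
    false_positive_rate = P(B | not A).
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

if __name__ == "__main__":
    # 2x2 matrix product
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
    # d/dx of x^2 at x = 3 is 6
    print(round(numeric_derivative(lambda x: x ** 2, 3.0), 4))  # 6.0
    # 1% prior, 90% detection rate, 5% false positives
    print(round(bayes_posterior(0.01, 0.9, 0.05), 4))  # 0.1538
```

If each line of this sketch is easy to follow, and you could write something similar yourself, you likely have the programming and math background assumed here.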

Grading: Labs: 30pts (Lab-1: 5pts, Lab-2: 5pts, Lab-3: 10pts, Lab-4: 10pts), Paper presentation and summaries: 10pts, Quiz: 20pts, Project: 40pts.


Date     Topic
Tues, August 22nd Lecture: Introduction [slides]
  • Welcome
  • Why is Visual Recognition hard?
  • Challenges in Computer Vision
  • Problems and applications
  • Lab-1 (Due Tuesday August 29th 11:59pm) -- Image Processing Lab: [preview] [download]
Thurs, August 24th Lecture: Image Processing Basics & Image Features [slides]
  • Overview of the field
  • Basic Image Processing
  • Convolutions and filtering
Reading: Szeliski Book, Chapter 3.
Tues, August 29th Lecture: Machine Learning for Vision I [slides]
  • Discussion: Supervised vs Unsupervised Learning
  • Supervised learning: k-Nearest neighbors
  • Pedestrian Detection using Histogram of Oriented Gradients
  • Unsupervised learning: Clustering
Thurs, August 31st Lecture: Machine Learning for Vision II [slides]
  • Supervised learning: Linear models
  • Gradient Descent
  • Stochastic Gradient Descent
  • Regularization
  • Lab-2 (Due Tuesday September 12th 11:59pm) -- Softmax Classifier Lab: [preview] [download]
Tues, September 5th Lecture: Deep Learning for Vision I [no slides, only chalkboard]
  • More on Softmax Classifier
  • More on Stochastic Gradient Descent
Thurs, September 7th TA Lecture: Categorization and the Perceptron Model
Tues, September 12th No class this day -- Please use this time to work on your Lab.
Thurs, September 14th Lecture: Deep Learning for Vision II [some slides, mostly chalkboard]
  • Lab Review
  • Perceptron
  • Multi-layer Perceptron
  • Neural Networks
Supplementary Reading: Neural Networks by Steve Renals
Tues, September 19th Lecture: Deep Learning for Vision III: Intro to Convnets [slides]
  • Neural Networks
  • ImageNet and Big Data
  • Convolutional Neural Networks
  • Lab-3 (Due Thursday September 28th 11:59pm) -- Deep Learning Lab: [preview] [download]
Thurs, September 21st Lecture: Deep Learning for Vision IV: Classification [slides]
  • Convolutional Network Architectures I
  • LeNet and AlexNet
Extra readings: [AlexNet paper], [VGG-16 slides], [VGG-16 paper]
Tues, September 26th Lecture: Deep Learning for Vision V: Detection [slides]
  • Convolutional Network Architectures II
  • VGGNet, GoogLeNet, ResNet
Extra readings: [GoogLeNet]
Thurs, September 28th Lecture: Deep Learning for Vision VI: Segmentation [slides]
  • R-CNN, Fast R-CNN, Faster R-CNN
Extra readings: [R-CNN]
Tues, October 3rd No class this day -- Reading Days / Fall Break.
Thurs, October 5th Lecture: Deep Learning for Vision VII [see previously posted slides / chalkboard / in-class python demonstration]
  • Convolutional Networks with Variable-sized Inputs
  • Intro to YOLO - Single Shot Object Detection
Submit a 1 or 2 page project proposal in PDF on UVA Collab (Deadline: Thursday October 5th at 5pm).
Tues, October 10th Lecture: Deep Learning for Vision VIII [slides]
  • YOLO continuation / SSD
  • Fully Convolutional Networks (FCNs)
  • Convolutional Networks for Segmentation
  • Intro to Recurrent Neural Networks (RNNs) I
  • Long Short-Term Memory networks (LSTMs)
Extra Reading: [Image-Captioning]
Thurs, October 12th Lecture: Deep Learning for Vision IX
  • Unrolled LSTMs
  • Recurrent Neural Networks (RNNs) II
  • Bi-Directional Long Short-Term Memory networks (LSTMs)
  • Sequence-to-sequence models
  • Image Captioning
Extra Reading: [Generative Adversarial Networks]
  • Check this tutorial on how to implement style transfer in PyTorch: [here]
Tues, October 17th Lecture: Generative Adversarial Networks [slides]
  • Generating Adversarial Examples
  • Generative Adversarial Networks
  • Style-transfer Networks
  • Here is some PyTorch code you might want to try for adversarially learning to generate samples from any image collection: [here]
Thurs, October 19th Student Paper Review: Style-transfer Models
  • Perceptual Losses for Real-Time Style Transfer and Super-Resolution, ECCV 2016. [arxiv] by Justin Johnson, Alexandre Alahi, Li Fei-Fei
  • Deep Feature Interpolation for Image Content Changes, CVPR 2017. [arxiv] by Paul Upchurch, Jacob Gardner, Geoff Pleiss, Robert Pless, Noah Snavely, Kavita Bala, Kilian Weinberger
  • Check this Mobile App that does something like what is shown in the second paper: [here]
Tues, October 24th Student Paper Review: Unsupervised learning of Deep Neural Networks.
  • Unsupervised Visual Representation Learning by Context Prediction, ICCV 2015. [arxiv] by Carl Doersch, Abhinav Gupta, Alexei A. Efros
  • Learning Visual Groups From Co-occurrences in Space and Time, ICLR 2016. [arxiv] by Phillip Isola, Daniel Zoran, Dilip Krishnan, Edward H. Adelson
Thurs, October 26th Student Paper Review: Recent Advances in Generative Adversarial Networks
  • Unsupervised representation learning with deep convolutional generative adversarial networks, ICLR 2016. [arxiv] by Alec Radford, Luke Metz, Soumith Chintala
  • Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [arxiv] by Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros
Submit a 2 or 3 page project progress report in PDF on UVA Collab (Deadline: Thursday October 26th at 5pm). Use this template.
Tues, October 31st Student Paper Review: People Recognition
  • Deep Face Recognition, BMVC 2015. [pdf] by Omkar M. Parkhi, Andrea Vedaldi, and Andrew Zisserman
  • Stacked Hourglass Networks for Human Pose Estimation, ECCV 2016. [arxiv] by Alejandro Newell, Kaiyu Yang, Jia Deng
Thurs, November 2nd In-Class Activity: Quiz Preparation.
Tues, November 7th Student Paper Review: Motion, Tracking, and Video
  • Two-Stream Convolutional Networks for Action Recognition in Videos, NIPS 2014. [arxiv] by Karen Simonyan and Andrew Zisserman
  • Re3: Real-Time Recurrent Regression Networks for Object Tracking. [arxiv] by Daniel Gordon, Ali Farhadi, Dieter Fox
Thurs, November 9th Quiz (20 pts)
Tues, November 14th No class this day -- Please use this time to work on your projects.
Thurs, November 16th Lecture: Course Recap
  • Course Overview and recap
  • Scholarship and Ethics in AI
Tues, November 21st Guest Lecture
Thurs, November 23rd Thanksgiving recess -- no classes this day.
Tues, November 28th Project Presentations
Thurs, November 30th Project Presentations
Tues, December 5th Project Presentations
Submit a 4 to 5 page final project report in PDF on UVA Collab + link to your code (Deadline: Tuesday December 5th). Use this template.

    Academic Integrity

    "The School of Engineering and Applied Science relies upon and cherishes its community of trust. We firmly endorse, uphold, and embrace the University’s Honor principle that students will not lie, cheat, or steal, nor shall they tolerate those who do. We recognize that even one honor infraction can destroy an exemplary reputation that has taken years to build. Acting in a manner consistent with the principles of honor will benefit every member of the community both while enrolled in the Engineering School and in the future. Students are expected to be familiar with the university honor code, including the section on academic fraud."

    Instructor's Note: In this class particularly, lab assignments are individual. You can still discuss them in a group or with your friends, but you should not be straight up copying somebody else's solution or code. Not even a single line of code. You might be tempted to think, well, in how many ways could I write c = c + c * c - 2? You are probably right, but what if that's actually a spectacularly wrong solution, and only two students turn in a solution with this unlikely expression in it? If there are two assignments where I notice something even slightly as suspicious as this, I, the instructor, Vicente, will refer the case to the Honor Code system where the outcome, if the academic misconduct is proven, will probably be a harsh dismissal from the university. Also, do not try to get solutions from previous versions of this class; I keep those solutions on file and I am good at remembering code I have seen before. The UVA Honor Code system is harsh indeed; there are not as many possible outcomes as in other systems. I strongly advise you not to do anything bad. It is not worth it. Most of the grade in this course will be the course project in any case. Not turning in a lab assignment is much preferable to turning in something that contains academic misconduct. Beyond the possible academic consequences that this might entail, it will be incredibly disappointing to me if I find any traces of this in lab assignments. Be clear about what your original contributions are in the class project, and enjoy doing the work on your lab assignments. So let's just all enjoy the class, and avoid this.

    Other similar courses that might be of interest:

    Department of Computer Science, University of Virginia, 2017.