Computer Vision
CSCI 5520G
Fall 2022
Faisal Qureshi
faisal.qureshi@ontariotechu.net

News

Dec 15, 2022
And that's a wrap.
Nov 4, 2022
The midterm will take place in class on Nov. 15. Please see the course website for the list of topics covered in this midterm.
Oct 27, 2022
Second set of paper presentations will take place in class on Nov. 8.
Oct 27, 2022
First set of paper presentations will take place in class on Nov. 1.
Oct 4, 2022
Paper assignments for in-class presentations are now available. Check the course Piazza.
Sep 20, 2022
Project proposals are due by Oct 7. Check the course Canvas.
Aug 16, 2022
Website is now online.

Course Info

Lectures

Communication

http://piazza.com/uoit.ca/fall2022/202209computervision44115/home

Office hours

Syllabus

Canvas (requires login)

Labs and in-class exercises will be submitted through the course Canvas site.

Course notes

Computer vision notes are available here. The course covers a selection of topics from these notes.

Additional notes are available here.

Description

This is an introductory graduate course in computer vision. The course will focus on computer vision theory and applications.

Computer vision deals with processing and analyzing digital images to extract useful properties about the real world. Computer vision, for example, can be used to extract 3D scene structure from a given set of photos, recognize people in images, identify actions in a video sequence, etc. Computer vision has also been used in specialized domains, such as medical imaging, say for analyzing CT or MRI scans, and satellite imaging, say for analyzing the health of an ecosystem. Computer vision has also found widespread use in the entertainment and gaming industries.

Solving computer vision, it turns out, is a tough problem. Digital images, after all, are little more than a collection of pixels. Recent advances in machine learning, especially in deep learning, have opened up new avenues for computer vision research. The goal is simple: design algorithms and systems that will enable a computer to “learn to see” by “looking” at example pictures and videos. With this in mind, this course will also briefly explore machine learning approaches that have found widespread use in computer vision applications.

This course will mix lectures on a selection of topics with paper reading and discussion. The topics are selected to help you understand and implement the papers that you are asked to read, present, and discuss. The first 45 minutes of most classes will be devoted to lectures on one of the selected topics. The remaining time will be used for paper presentation and discussion. The course will cover the following topics:

These topics provide a decent basis for understanding the papers that we plan to read and discuss in this course.

Pre-requisites

The course assumes that students are comfortable with statistics, basic linear algebra, and programming.

We will be using Python for the programming part of this course. For Python, I recommend the Anaconda distribution, which comes pre-loaded with nearly all the packages that we will be using in this course. Of course, you are welcome to use any variant/distribution of Python that suits you.

The course also assumes that students are willing to read and comprehend a large volume of technical papers, and that they have some experience with technical report writing.

Grading

Important dates

Ontario Tech University’s academic calendar, which lists important dates (and deadlines), is available here.

Course calendar

The list of assigned papers will be available after the first week of classes. Please check the course website for details.

Midterm preparation

The midterm will cover the topics discussed in the following set of notes. This list will be updated after each lecture.

Computer Vision Papers

Find a collection of computer vision papers at https://github.com/jbhuang0604/awesome-computer-vision. The collection is organized by topic. Please find at least five papers, spread across two different areas that interest you. At least three of the five papers should be recent.

See course canvas for more instructions.

Course Work

Midterm

Presentation

Each student will be assigned one or more recent papers to read and present. The student will be responsible for leading the discussion for each paper that they present.

Instructions for the presenter

Instructions for the participants

Project

The course project is an independent exploration of a specific problem within the context of this course. A project can be implementation oriented—where a student implements a computer vision system—or application oriented—where a student attempts to solve a problem (of suitable difficulty) by applying machine learning techniques. The project topic will be selected in consultation with the instructor.

The project grade will depend on the ideas, how well you present them in the report, how well you position your work in the related literature, how thorough your experiments are, and how thoughtful your conclusions are.

The course project is typically an individual effort.

Project topics

Projects must be related to computer vision theory, methods, and systems. A project that simply uses a pre-trained deep learning model, say YOLO or a network pre-trained on ImageNet, to solve some larger “task” is not appropriate. Such a project simply applies a pre-built system to the task at hand. I want us to have an opportunity to implement the computer vision systems that underpin all these different applications.

Possible topics are:

In many cases it is difficult to deal with real cameras and hardware. In these situations it is possible to implement and evaluate your algorithms using simulated data. E.g., you can use a game engine to simulate traffic images captured at a road intersection.
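As a rough illustration of why simulated data is convenient, the sketch below renders a toy grayscale "intersection" in pure Python and scores a trivial detector against the known ground truth. Everything in it is an assumption made for illustration: the function names (render_scene, detect_bright_pixels, ground_truth_pixels), the image size, and the use of bright rectangles as stand-ins for vehicles. A real project would replace the renderer with a game engine or simulator and the threshold with an actual algorithm.

```python
import random


def render_scene(width=32, height=24, n_boxes=2, seed=0):
    """Render a toy scene: bright rectangles ("vehicles") on a dark road.

    Returns (image, boxes). image is a list of rows of ints in [0, 255];
    boxes holds ground-truth (x0, y0, x1, y1) with x1 and y1 exclusive.
    """
    rng = random.Random(seed)
    image = [[40] * width for _ in range(height)]  # dark background
    boxes = []
    for _ in range(n_boxes):
        w, h = rng.randint(4, 8), rng.randint(3, 6)
        x0 = rng.randint(0, width - w)   # keep the box inside the image
        y0 = rng.randint(0, height - h)
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                image[y][x] = 220        # bright "vehicle" pixel
        boxes.append((x0, y0, x0 + w, y0 + h))
    return image, boxes


def detect_bright_pixels(image, threshold=128):
    """Placeholder detector: every pixel brighter than the threshold."""
    return {(x, y)
            for y, row in enumerate(image)
            for x, value in enumerate(row)
            if value > threshold}


def ground_truth_pixels(boxes):
    """Set of pixels covered by at least one ground-truth box."""
    return {(x, y)
            for x0, y0, x1, y1 in boxes
            for y in range(y0, y1)
            for x in range(x0, x1)}


image, boxes = render_scene()
detected = detect_bright_pixels(image)
truth = ground_truth_pixels(boxes)

# Intersection over union between detected and ground-truth pixels.
iou = len(detected & truth) / len(detected | truth)
print(f"boxes={boxes} IoU={iou:.2f}")
```

Because the simulator placed the objects itself, the ground truth comes for free and the evaluation is exact. In this noise-free toy scene the threshold detector matches the ground truth pixel for pixel; adding noise, occlusion, or lighting changes to the renderer is what makes such an evaluation informative.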

Project proposal

Progress Report

Final in-class Presentation

Final Report

For your final project write-up you must use the ACM SIG Proceedings Template (available at the ACM website). The project report may be at most 12 pages long, plus extra pages for references. Your report must be of “publishable quality,” i.e., free of typos and grammatical errors.

The final deadline for project report submission is the 11th of December, midnight EST. This is a firm deadline. You will incur a penalty of 40% if you do not meet this deadline. These strict rules mimic the conference submission process:

Reading material

You will find the following computer vision books useful.

The following books are good resources for machine learning, especially deep learning:

These resources will not only help you understand the assigned papers but may also prove invaluable for your course projects.

Programming Resources

Here you’ll find a number of tutorials showcasing Python use in machine learning. I strongly recommend that you become comfortable with the following four Python packages/environments: