Advanced topics in high-performance computing
(MCSC 6230G/7230G)
Fall 2017
Faisal Qureshi
faisal.qureshi@uoit.net

News

Nov 29, 2017
Last lecture.
Nov 13, 2017
Important information about the project presentations and report has been posted on the course Slack channel.
Nov 13, 2017
Assignment 3 is now available.
Oct 19, 2017
Assignment 2 is now available.
Oct 10, 2017
One-page project proposals are due Oct. 31.
Oct 10, 2017
Code examples are available on GitHub. See below.
Oct 5, 2017
Paper presentation schedule is now available. Please check the course Slack.
Sep 29, 2017
Assignment 1 is now available.
Sep 26, 2017
Reading paper list is now available.
Aug 28, 2017
Website is now online.

Course Info

Instructor

Faisal Qureshi

Email: faisal.qureshi@uoit.net
Office: UA4032

Slack channel

We will be using Slack for online communication. Please ensure that you are enrolled in the following Slack channel:

mcsc-ml-f17-uoit.slack.com.

Lectures

  • Wed, 12:40 - 3:30 pm in ERC3027

Office hours

  • Tue, 1 - 2 pm in UA4032
  • Or by appointment

Syllabus

Description

This is an introductory graduate course in machine learning. This course will focus on both supervised and unsupervised learning methods, covering both theory and practice. The course is geared towards students who wish to develop a working knowledge of the recent advances in machine learning, and how these are applied in various domains.

Machine learning deals with how to design computer programs that learn from “experience.” Residing at the intersection of computer science and statistics, machine learning aims to extract useful information from data (often referred to as the training data) and to leverage this information to create computer models capable of carrying out useful, non-trivial tasks, such as driving cars autonomously, filtering junk email, and supporting disease diagnosis. By many accounts machine learning is the “greatest export” of computer science (and statistics) to other disciplines.

The course will cover the following topics:

Prerequisites

The course assumes that students are comfortable with statistics, basic linear algebra, and programming.

Reading material

We live in exciting times. A copious amount of information about machine learning is available on the internet. Check out the awesome machine learning list on GitHub for a list of machine learning courses and free, open source books.

Lectures

Important: Each lecture will include a programming activity. Please bring your laptops to the lectures. Also ensure that your laptop has Python, Numpy, Scipy, Matplotlib, and Sklearn installed. The easiest way to achieve this is to download the Anaconda Python distribution.
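One quick way to confirm the setup before class is to ask Python whether each package can be found. The sketch below assumes the usual import names (`numpy`, `scipy`, `matplotlib`, `sklearn`); the helper name `check_packages` is made up for this illustration and is not part of the course code.

```python
import importlib.util

# Import names of the packages listed above.
# Assumption: scikit-learn is imported under the name "sklearn".
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        status = "ok" if found else "MISSING"
        print(f"{name:12s} {status}")
```

Any package reported as missing can then be installed through Anaconda or whichever Python distribution you use.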

Week 1

Tony Joseph will lead the first lecture. I am away at a conference in Berlin.

Week 2

Exercise

The goal is to use k-means or mean shift to cluster the "make circles" dataset into two clusters. [Code]
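As a starting point for this exercise, here is a minimal sketch of k-means (Lloyd's algorithm) on two concentric circles. It is not the linked course code: it uses only the standard library so that it runs anywhere, and `make_two_circles`, `kmeans`, and `agreement` are names invented for this illustration. In class you would more likely use `sklearn.datasets.make_circles` and `sklearn.cluster.KMeans`. Mean shift is not covered here.

```python
import math
import random


def make_two_circles(n=200, inner_radius=0.5, noise=0.05, seed=0):
    """Generate n noisy points on two concentric circles.

    Label 0 is the outer circle (radius 1), label 1 the inner circle.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n):
        label = i % 2
        radius = inner_radius if label == 1 else 1.0
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x = radius * math.cos(theta) + rng.gauss(0.0, noise)
        y = radius * math.sin(theta) + rng.gauss(0.0, noise)
        points.append((x, y))
        labels.append(label)
    return points, labels


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k=2, max_iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centers[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centers, assignments


def agreement(labels, assignments):
    """Fraction of points whose cluster matches the true circle (two clusters only).

    Cluster ids are arbitrary, so the better of the two labelings is reported.
    """
    matches = sum(1 for l, a in zip(labels, assignments) if l == a)
    fraction = matches / len(labels)
    return max(fraction, 1.0 - fraction)


if __name__ == "__main__":
    points, labels = make_two_circles()
    centers, assignments = kmeans(points, k=2)
    print("agreement with true circles:", agreement(labels, assignments))
```

Because k-means with two centers separates points with a straight boundary, it cannot recover the inner and outer circles, so expect the agreement to stay well below 1. Seeing why is part of the point of the exercise.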

Week 3

Class notes

Exercise

Experiments with linear regression. [Code] [Data]
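For reference, a minimal sketch of ordinary least squares for a single input variable is given below. It is not the linked course code, the function name `fit_line` is invented here, and the data in the usage lines are toy values chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution: slope = cov(x, y) / var(x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Toy data lying exactly on the line y = 2x + 1 (illustrative only).
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    print(fit_line(xs, ys))
```

With several input variables the same idea is usually expressed with matrices, for example through Numpy's least-squares routines.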

Week 4

Class notes

Week 5

Class notes

Week 6

Week 7

Week 8

Week 9

Class notes

Week 10

Week 11

We will continue our discussion of Gaussian processes.

Week 12

Papers

Each student needs to select a relevant machine learning paper and give a 20-minute presentation outlining the contributions, strengths, and weaknesses of that paper. To get the process moving I have started to put together a list of papers. Each of you is asked to select "one" paper that catches your interest. I will use FIFO to resolve ties.

Feel free to suggest another relevant machine learning paper.

Reading papers

Presentation

Each paper presentation is roughly 20 minutes long, followed by a discussion. All of you are expected to have read the paper before coming to the lecture. The presentation should focus on the "key contribution" of the paper and how the topics covered in the paper fit into the larger machine learning landscape. Pay close attention to how the paper is written, how ideas are presented, how methods are developed, how arguments are structured, and how results are used to bolster the key idea of the paper.

Presentation schedule and papers are available via course Slack.

Project

Students can work on projects individually or in pairs. The project can be an interesting topic that the student comes up with independently or with the help of the instructor. The grade will depend on the ideas, how well you present them in the report, how well you position your work in the related literature, how thorough your experiments are, and how thoughtful your conclusions are. (Taken from Raquel Urtasun's CSCI 2515 project description.)

Administration

  1. In class, Wed., Nov. 29
  2. 10 min total time: 7 min presentation and demo, 3 min Q&A and discussion
  3. To ensure timely proceedings, please upload project presentations in PDF format via Blackboard by Tue., Nov. 28 midnight.
  4. Project report due on Fri., Dec. 8, 11:59 pm

Assignments

Resources

I recommend reading Part 1 of “Deep Learning” by I. Goodfellow, Y. Bengio and A. Courville to brush up on linear algebra and statistics. The book is available here.

We will be using Python for the programming part of this course. For Python, I recommend the Anaconda distribution, which comes pre-loaded with nearly all the packages that we will be using in this course. Of course, you are welcome to use any variant/distribution of Python that suits you.

Here you’ll find a number of tutorials showcasing Python use in machine learning. I strongly recommend that you become comfortable with the following four Python packages/environment:

Books

Code

Code examples used in this course are available on GitHub (https://github.com/uoit-ml/mcsc-ml).