Faisal Qureshi

http://www.vclab.ca

I visualise a time when we will be to robots what dogs are to humans, and I'm rooting for the machines.

- Claude Shannon, Father of Information Theory

The key to artificial intelligence has always been the representation.

- Jeff Hawkins, Founder of Palm Computing

- Computational models of Neurons
- Pre-deep learning
- ImageNet 2012
- Takeaways
- What
- How
- Why now?
- Impact

- Ethical and social implications

- McCulloch and Pitts (1943) proposed a model of nervous systems as a network of threshold units.
- Connections between simple units performing elementary operations give rise to intelligence.

- Artificial neuron

- Hebbian learning (Donald Hebb, 1949) proposes to learn patterns by strengthening the connections between neurons that tend to fire together.
- Biologically plausible, but rarely used in practice
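Hebb's rule can be sketched in a few lines: the weight between two units grows in proportion to their joint activity. The learning rate and the toy activity pattern below are illustrative assumptions, not from the text.

```python
# Hebbian update: strengthen the weight between units that fire together.
# delta_w = eta * pre * post  (eta is an assumed learning rate)

def hebbian_update(w, pre, post, eta=0.1):
    """Return updated weights for one pre/post activity pair."""
    return [wi + eta * pre_i * post for wi, pre_i in zip(w, pre)]

w = [0.0, 0.0, 0.0]
# Inputs 0 and 2 are active together with the output unit; input 1 is silent.
for _ in range(5):
    w = hebbian_update(w, pre=[1.0, 0.0, 1.0], post=1.0)

print(w)  # weights grow only where pre- and post-synaptic units co-fire
```

Note that nothing ever decreases a weight here, which is one reason the plain rule is unstable and rarely used as-is.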

- First artificial neural network consisting of 40 neurons (Marvin Minsky, 1951)
- Uses Hebbian Learning

- Frank Rosenblatt (1958) built the perceptron to classify 20x20-pixel images
- The perceptron is a neural network comprising a single neuron
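A single-neuron perceptron and its classic learning rule fit in a few lines. The AND problem, learning rate, and epoch count below are illustrative choices, not details from the slides.

```python
# A perceptron: one neuron with a threshold activation, trained with the
# perceptron learning rule (update weights only on misclassified examples).

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(data, epochs=10, lr=1.0):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            err = y - predict(w, b, x)          # 0 when correct, +/-1 when wrong
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Toy linearly separable problem: logical AND.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```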

- David Hubel and Torsten Wiesel studied the cat visual cortex and showed that visual information goes through a series of processing steps: 1) edge detection; 2) edge combination; 3) motion perception; etc. (Hubel and Wiesel, 1959)

- Backpropagation for artificial neural networks (Paul Werbos, 1982)
- An application of the chain rule from differential calculus

- Fukushima (1980) implemented the Neocognitron, which was capable of handwritten character recognition.
- This model was based upon the findings of Hubel and Wiesel.
- This model can be seen as a precursor of modern convolutional networks.

- Rumelhart et al. (1988) used backpropagation to train a network similar to Neocognitron.
- Units in hidden layers learn meaningful representations

- In 1989, LeCun et al. proposed LeNet, a convolutional neural network very similar to the networks we see today
- Capable of recognizing hand-written digits
- Trained using backpropagation

- A large amount of training data is critical to the success of deep learning methods
- The ImageNet challenge was devised to measure the performance of various image recognition methods
- 1 million images belonging to 1000 different classes
- Its size was key to the development of early deep learning models

- Datasets used for deep learning model development are divided into three sets:
- The training set is used to train the deep learning model;
- The validation set is used to tune hyperparameters, implement early stopping, etc.; and
- The test set is used to evaluate model performance.
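A conventional way to produce the three sets is to shuffle the data once and slice it. The 80/10/10 ratio and fixed seed below are common defaults assumed for illustration, not prescribed by the text.

```python
import random

def split_dataset(examples, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle once, then slice into train / validation / test sets."""
    examples = examples[:]                    # avoid mutating the caller's list
    random.Random(seed).shuffle(examples)     # fixed seed -> reproducible split
    n_train = int(train_frac * len(examples))
    n_val = int(val_frac * len(examples))
    train = examples[:n_train]
    val = examples[n_train:n_train + n_val]
    test = examples[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(list(range(1000)))
print(len(train), len(val), len(test))  # 800 100 100
```

Keeping the test set untouched until the very end is what makes its error estimate honest: any decision made while looking at it (including hyperparameter tuning) belongs to the validation set.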

- Krizhevsky et al. trained a convolutional network, similar to LeNet-5 but containing far more layers, neurons, and connections, on the ImageNet Challenge using Graphics Processing Units (GPUs). This model beat the state-of-the-art image classification methods by a large margin.
- GPUs are critical to the success of deep learning methods.

- Models may outperform humans!?

- Large datasets and vast GPU compute infrastructures led to larger and more complex deep learning models for solving problems in a variety of domains ranging
- from computer vision to speech recognition,
- from medical imaging to text understanding,
- from computer graphics to industrial design,
- from autonomous driving to drug discovery, etc.

- Deep learning is a natural extension of artificial neural networks of the 90s.
- Extracts useful patterns from data
- Learns powerful representations
- Reduces the "semantic gap"

- Chain rule (or backpropagation)
- Computes how error (or more generally, the quantity to optimize) changes when model parameters change

- Stochastic gradient descent
- Iteratively update network parameters to "minimize the error"
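The update loop can be sketched on a one-parameter least-squares problem: compute the gradient of the per-example error, then step against it. The toy data, learning rate, and epoch count are illustrative assumptions.

```python
import random

# Stochastic gradient descent on fitting y = w * x to noiseless data
# generated with w = 3. Each step uses the gradient of ONE example's
# squared error, visiting examples in a shuffled order each epoch.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w, lr = 0.0, 0.1

for epoch in range(100):
    random.Random(epoch).shuffle(data)
    for x, y in data:
        grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
        w -= lr * grad               # step opposite the gradient

print(round(w, 3))  # converges to 3.0
```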
- Convolutions
- Bakes in the intuition that signals are structured and often have stationary properties
- Allows processing of large signals
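Weight sharing is what makes this work: in the 1-D sketch below, the same small kernel slides over the whole signal, so the parameter count is independent of signal length. The edge-detecting kernel is an illustrative choice.

```python
# 1-D convolution (correlation form, no padding): one small kernel is
# reused at every position of the input signal.
def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A [-1, 1] kernel responds to changes in the signal: a crude edge detector.
print(conv1d([0, 0, 1, 1, 1, 0], [-1, 1]))  # [0, 1, 0, 0, -1]
```

The output is nonzero exactly where the signal steps up or down, echoing the edge-detection stage Hubel and Wiesel observed in the visual cortex.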

- Hidden layers

- GPUs that support vectorized processing (tensor operations)
- Large datasets

- Computationally speaking, a deep learning model can be formalized as a graph of tensor operations:
- Nodes perform tensor operations; and
- Results propagate along edges between nodes.
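The graph view can be made concrete with a minimal (scalar-valued) sketch: nodes hold an operation, edges are the inputs, and evaluating the output node pulls results through the graph. Class names and the example expression are assumptions for illustration.

```python
# A model as a graph of operations: each Node applies its op to the
# values of its input nodes; Leaf nodes hold raw values.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def value(self):
        # Recursively evaluate inputs, then apply this node's operation.
        return self.op(*(n.value() for n in self.inputs))

class Leaf(Node):
    def __init__(self, v):
        self.v = v

    def value(self):
        return self.v

# Graph for the expression (x * w) + b
x, w, b = Leaf(2.0), Leaf(3.0), Leaf(1.0)
mul = Node(lambda a, c: a * c, x, w)
add = Node(lambda a, c: a + c, mul, b)
print(add.value())  # 7.0
```

In a real framework the values flowing along the edges are tensors rather than scalars, but the graph structure is the same.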

- Provides new ways of thinking about deep learning models.
- Recursive nature: each node is capable of sophisticated, non-trivial computation, perhaps leveraging another neural network

- Autodiff
- Techniques to evaluate the "derivative of a computer program"
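One way to take the "derivative of a program" is forward-mode autodiff with dual numbers: every value carries a pair (v, dv), and each arithmetic operation also propagates the derivative via the chain rule. A minimal sketch, not a full autodiff system; the test function is an illustrative choice.

```python
# Forward-mode automatic differentiation with dual numbers.
class Dual:
    def __init__(self, v, dv=0.0):
        self.v, self.dv = v, dv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.v + other.v, self.dv + other.dv)

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.v * other.v,
                    self.dv * other.v + self.v * other.dv)

def f(x):
    return x * x + x * 3 + 1   # f(x) = x^2 + 3x + 1, so f'(x) = 2x + 3

y = f(Dual(2.0, 1.0))          # seed with dx/dx = 1
print(y.v, y.dv)               # 11.0 7.0  (f(2) and f'(2))
```

Frameworks like PyTorch instead use reverse-mode autodiff (backpropagation), which computes gradients with respect to many parameters in one backward pass, but the chain-rule bookkeeping is the same idea.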

- Deep learning frameworks
- PyTorch
- TensorFlow
- etc.

- Image classification
- Face recognition
- Speech recognition
- Text-to-speech generation
- Handwriting transcription
- Medical image analysis and diagnosis
- Ads
- Cars: lane-keeping, automatic cruise control

- Myth
- Killer robots will enslave us

- Reality
- Deep learning (and more generally, artificial intelligence) will have a profound effect on our society
- Legal, social, philosophical, political, and personal

