The gradient descent algorithm we saw in an earlier post only gives you a rough idea of how the method works. We've seen the update rule
$$W \leftarrow W - \alpha E'(W)$$
This is just a rough formula to get you started. In practice, $E'(W)$ is the average of the gradients contributed by every point in your training data. This variant is known as vanilla gradient descent or batch gradient descent. Datasets with millions of instances are commonplace nowadays, and that presents a challenge for batch gradient descent: the weights are updated only after the gradient has been averaged over every data point, so each update is expensive and training time grows significantly.
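To make this concrete, here is a minimal sketch of batch gradient descent for a linear model with a mean-squared-error loss. The dataset, loss function, and hyperparameters are illustrative assumptions, not taken from the earlier post; the point is that every update touches the full training set.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.01, epochs=100):
    """Vanilla (batch) gradient descent for linear regression,
    minimizing E(W) = (1/2n) * sum((X @ W - y)^2)."""
    n, d = X.shape
    W = np.zeros(d)
    for _ in range(epochs):
        # E'(W): gradient averaged over *every* training point
        predictions = X @ W
        grad = X.T @ (predictions - y) / n
        # One weight update per full pass over the dataset
        W = W - alpha * grad
    return W

# Toy usage: recover y = 2*x from a tiny synthetic dataset
X = np.arange(1, 6, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel()
print(batch_gradient_descent(X, y, alpha=0.05, epochs=500))  # ~[2.0]
```

Note how the inner loop computes the gradient over all `n` rows before a single update is applied; with millions of rows, that one update becomes very costly.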
Let's look at some optimizations that keep this training cost under control.