All Matters AI


Optimizing Gradient Descent

Posted on 2018-06-12 | Post modified: 2018-06-14 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:03

The gradient descent algorithm we saw in an earlier post gives only a rough idea of how gradient descent works. We've seen that

$$W \leftarrow W - \alpha E'(W)$$

This is just a rough formula to get you started. In practice, $E'(W)$ is the average of the gradients contributed by every point in your training data. This variant is known as vanilla gradient descent or batch gradient descent. Large datasets with millions of instances are easily available nowadays, and they pose a challenge for vanilla gradient descent: the weights are updated only after the gradient has been averaged over every data point, which significantly increases training time.
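To make the averaging concrete, here is a minimal NumPy sketch of one batch gradient descent step for a linear model with a squared-error cost; the names (`batch_gradient_step`, `X`, `y`, `lr`) are illustrative and not taken from the original post.

```python
import numpy as np

def batch_gradient_step(W, X, y, lr=0.1):
    """One vanilla (batch) gradient descent update: the gradient of the
    squared error is averaged over ALL training points before W changes."""
    preds = X @ W                       # predictions for every training point
    grad = X.T @ (preds - y) / len(y)   # average gradient E'(W) over the dataset
    return W - lr * grad                # W <- W - alpha * E'(W)

# Toy usage: fit y = x1 + x2 on four points.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 2.])
W = np.zeros(2)
for _ in range(200):
    W = batch_gradient_step(W, X, y)
print(W)  # approaches [1., 1.]
```

Note that every pass through the loop touches the entire dataset once, which is exactly what becomes expensive when the dataset has millions of instances.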

Let’s see some optimizations to keep these under control.

Read more »

Regularization

Posted on 2018-06-06 | Post modified: 2018-06-06 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:05

In the last post we looked at overfitting and underfitting, and we saw that achieving a perfect classifier is a tedious task. In this post let's see some common tricks and methods to control the generalization of a network.

Read more »

Overfitting and Underfitting

Posted on 2018-06-02 | Post modified: 2018-06-04 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:05

Just to recap what we’ve seen so far,

  • Part 1: Neuron and the perceptron algorithm
  • Part 2: Multilayer Perceptrons and Activation Function
  • Part 3: Cost Function and Gradient Descent
  • Part 4: Forward and Backward Propagation
  • Part 5: Loss function and Cross-entropy
Read more »

Loss function and Cross Entropy

Posted on 2018-05-30 | Post modified: 2018-05-30 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:06

In the gradient descent post, you've seen what an error function is and what the characteristics of a good loss function are. Let's take a deep dive into how we achieve this.

Read more »

Forward and Backward Propagation

Posted on 2018-05-26 | Post modified: 2018-05-28 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:10

So far we've seen

  • How a Multilayer Perceptron works
  • How Gradient Descent helps Multilayer Perceptron

Now let's see some tricks that extend these ideas to deep neural networks.

Read more »

Cost Function and Gradient Descent

Posted on 2018-05-21 | Post modified: 2018-07-16 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:04

In the last post we discussed how a multilayer perceptron works. Now let's see how to search for and update a set of weights to achieve a good classifier.

Read more »

Multilayer Perceptrons and Activation function

Posted on 2018-04-09 | Post modified: 2018-05-19 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:05

The XOR problem

In the previous post you’ve seen how a perceptron works. Now let’s dive into some more interesting problems in deep learning. What follows is the classic XOR problem. Develop a method for the correct classification of the following points.

Read more »

Neuron and the Perceptron Algorithm

Posted on 2018-03-28 | Post modified: 2018-05-24 | In AI, Deep Learning Fundamentals | Reading time ≈ 0:06

This is the first part of the series “Deep Learning Fundamentals”. The goal of this series is to explore the mechanisms of artificial neural networks. The focus is on presenting an intuitive way of understanding neural networks, so you can expect an emphasis on how and why things work rather than on what merely gets the job done. More often than not I'll try to use simple math without dwelling on notation. Let's jump into the fundamental unit of most neural networks: the neuron.

Read more »
© 2018 Yeshwanth Arcot
Unless otherwise noted, all posts in All Matters AI by Yeshwanth Arcot are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.