Optimization and Gradient Descent

Learn about the fundamental algorithm behind machine learning training: gradient descent.

In our 2D example, the loss function can be thought of as a bowl-shaped (parabolic) function that reaches its minimum at a certain pair of weights $w_1$ and $w_2$. Visually, we have:

To find these weights, the core idea is simply to follow the slope of the curve. Although we don’t know the actual shape of the loss, we can calculate its slope (the gradient) at any given point and then take a step in the downhill direction. Repeating this step moves the weights closer and closer to the minimum.
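The idea of repeatedly stepping downhill can be sketched in a few lines of Python. This is a minimal illustration rather than the lesson's own implementation: the specific loss (a bowl with its minimum at $w_1 = 3$, $w_2 = -1$), the function names, the learning rate, and the number of steps are all assumptions chosen for the example.

```python
def loss(w1, w2):
    # Hypothetical bowl-shaped loss; its minimum is at w1 = 3, w2 = -1.
    return (w1 - 3.0) ** 2 + (w2 + 1.0) ** 2


def gradient(w1, w2):
    # Slope of the loss at the point (w1, w2), one component per weight.
    return 2.0 * (w1 - 3.0), 2.0 * (w2 + 1.0)


def gradient_descent(w1, w2, learning_rate=0.1, steps=100):
    # Repeatedly step against the gradient, i.e. in the downhill direction.
    for _ in range(steps):
        g1, g2 = gradient(w1, w2)
        w1 -= learning_rate * g1
        w2 -= learning_rate * g2
    return w1, w2


# Start from an arbitrary point and walk downhill.
w1, w2 = gradient_descent(0.0, 0.0)
print(w1, w2, loss(w1, w2))
```

Note that the loop only ever uses the slope at the current point; it never needs the overall shape of the loss. The learning rate controls the step size: too small and progress is slow, too large and the updates can overshoot the minimum.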

You can think of the loss function as a mountain. The current loss gives us information about the ...