Optimization and Gradient Descent
Learn about the fundamental algorithm behind machine learning training: gradient descent.
In our 2D example, the loss function can be thought of as a parabolic-shaped function that reaches its minimum at a certain pair of weights. Visually, we have:
To find these weights, the core idea is simply to follow the slope of the curve. Although we don't know the overall shape of the loss, we can calculate its slope at the current point and then take a step in the downhill direction. Repeating this step moves the weights closer and closer to the minimum.
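The idea of following the slope can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bowl-shaped `loss` below, its minimum at (3, -1), the starting point, the learning rate, and the step count are all made up for the example, and the names `w1`, `w2`, `slope`, and `gradient_descent` are hypothetical. The slope is estimated numerically from nearby loss values, mirroring the fact that we only probe the loss locally.

```python
def loss(w1, w2):
    # Hypothetical bowl-shaped loss; its minimum at (3, -1) is an assumption
    # chosen for illustration, not a value taken from the lesson.
    return (w1 - 3.0) ** 2 + (w2 + 1.0) ** 2


def slope(f, w1, w2, eps=1e-6):
    # Estimate the slope along each weight with central finite differences,
    # using only loss values near the current point.
    d1 = (f(w1 + eps, w2) - f(w1 - eps, w2)) / (2 * eps)
    d2 = (f(w1, w2 + eps) - f(w1, w2 - eps)) / (2 * eps)
    return d1, d2


def gradient_descent(f, w1, w2, learning_rate=0.1, steps=200):
    # Repeatedly take a small step against the slope, i.e. downhill.
    for _ in range(steps):
        d1, d2 = slope(f, w1, w2)
        w1 -= learning_rate * d1
        w2 -= learning_rate * d2
    return w1, w2


# Start from an arbitrary point and walk downhill.
w1, w2 = gradient_descent(loss, 0.0, 0.0)
print(w1, w2, loss(w1, w2))
```

With these assumed settings the weights should end up very close to the minimum of the toy loss. A learning rate that is too large can overshoot the minimum, while one that is too small makes progress slow, so in practice this value usually needs tuning.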
You can think of the loss function as a mountain. The current loss gives us information about the ...