Gradient descent is a fundamental optimization algorithm for minimizing a function — typically a loss or error function — by iteratively adjusting a model's parameters in the direction of the negative gradient. In machine learning, it drives learning by reducing the discrepancy between a model's predictions and the observed outcomes.
The algorithm works by computing the gradient (the vector of partial derivatives) of the function with respect to its parameters and updating each parameter with a small step in the direction opposite the gradient: θ ← θ − η∇L(θ). The step size is controlled by the learning rate η. If the learning rate is too small, training is slow; if it is too large, the updates can overshoot the minimum and the optimization may diverge.
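As a concrete sketch of that update rule, here is a minimal Python example on an illustrative quadratic loss; the function names and the loss itself are assumptions chosen for demonstration, not anything prescribed above:

```python
# Minimal sketch of gradient descent: theta <- theta - lr * grad(theta).
# The quadratic loss L(theta) = (theta - 3)^2 is illustrative; its gradient
# is 2 * (theta - 3), and the minimum sits at theta = 3.

def gradient_descent(grad, theta0, learning_rate=0.1, steps=50):
    """Repeatedly step opposite the gradient and return the final parameter."""
    theta = theta0
    for _ in range(steps):
        theta -= learning_rate * grad(theta)
    return theta

grad = lambda theta: 2.0 * (theta - 3.0)

print(gradient_descent(grad, theta0=0.0, learning_rate=0.1))  # converges near 3.0
print(gradient_descent(grad, theta0=0.0, learning_rate=1.1))  # too large: diverges
```

The two calls illustrate the learning-rate trade-off from the paragraph above: with a rate of 0.1 the iterates contract toward the minimum, while a rate of 1.1 makes each step overshoot farther than the last.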
Gradient descent comes in several variants, including batch, stochastic (SGD), and mini-batch, depending on how many examples are used to compute each update (see the sketch below). In reinforcement learning, gradient descent is often used to train value functions or neural-network function approximators. It also appears indirectly in policy gradient methods, where the policy parameters are moved along an estimate of the gradient of expected return computed from sampled rewards. Gradient descent is the workhorse behind most modern machine learning systems.
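To make the variants concrete, here is a hedged mini-batch SGD sketch for a simple linear regression; the synthetic dataset, batch size, and variable names are all illustrative assumptions:

```python
# Mini-batch SGD on synthetic linear-regression data (y = 2x + 1 plus noise).
# Each update uses the mean-squared-error gradient over one mini-batch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + 0.1 * rng.standard_normal(200)

w, b = 0.0, 0.0
learning_rate, batch_size = 0.1, 16

for epoch in range(100):
    perm = rng.permutation(len(X))            # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        xb, yb = X[idx, 0], y[idx]
        err = w * xb + b - yb                  # prediction error on the batch
        grad_w = 2.0 * np.mean(err * xb)       # d(MSE)/dw over the mini-batch
        grad_b = 2.0 * np.mean(err)            # d(MSE)/db over the mini-batch
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(w, b)  # should land near the true values 2.0 and 1.0
```

Setting batch_size to 1 recovers plain SGD, while setting it to len(X) recovers full-batch gradient descent; sizes in between trade gradient noise against per-update cost.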