The Core Principle of Neural Network Learning: Gradient Descent
Gradient descent is an algorithm used in machine learning and deep learning to find the optimal weights of an AI model.
Gradient descent is often compared to descending a mountain to find the lowest point.
To minimize the difference between predicted and actual values, it measures the slope (gradient) at the current position and repeatedly takes steps in the direction of steepest descent.
AI models repeat this process to optimize weights for increasingly accurate predictions.
- Loss Function = Height of the Mountain
- Weight Adjustment = Adjusting the Descent Direction
- Gradient = Indicates how steep it is
- Learning Rate = Determines how much to move in one step
How Gradient Descent Works
Gradient descent optimizes weights by repeating the following steps:
1. Calculate the Loss Function
Compute the difference between predicted values and actual values with the current weights.
Use a loss function to quantify the error.
Actual Value: 1.0, Predicted Value: 0.6
Loss (MSE) = (1.0 - 0.6)^2 = 0.16
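For a single sample, MSE reduces to the squared error. A minimal Python sketch of the calculation above:

```python
# Squared error for a single prediction (the example above).
def squared_error(actual, predicted):
    return (actual - predicted) ** 2

print(squared_error(1.0, 0.6))  # ~0.16
```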
2. Calculate the Gradient
Differentiate the loss function to find the gradient. The gradient points in the direction in which the loss increases fastest, so moving in the opposite direction reduces the loss fastest.
Current Weight: 0.5
Gradient: -0.3 (negative: increasing the weight decreases the loss)
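As a sketch, the gradient can be computed numerically for a one-parameter model. The model form prediction = w * x and the input x = 1.2 are assumptions chosen so that w = 0.5 reproduces the 0.6 prediction from step 1 (the -0.3 above is purely illustrative):

```python
# Numerical gradient of the loss with respect to a single weight.
# Assumed model: prediction = w * x, with x = 1.2 and target 1.0.
def loss(w):
    x, actual = 1.2, 1.0
    predicted = w * x
    return (actual - predicted) ** 2

def gradient(f, w, eps=1e-6):
    # Central finite difference; real frameworks use backpropagation instead.
    return (f(w + eps) - f(w - eps)) / (2 * eps)

print(gradient(loss, 0.5))  # ~-0.96: negative, so increasing w lowers the loss
```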
3. Update the Weight
Adjust the weight based on the gradient.
The step size is determined by multiplying the gradient by the learning rate (α). The formula is as follows:

New Weight = Current Weight - α × Gradient

Current Weight: 0.8, Gradient: -0.2, Learning Rate: 0.1
New Weight: 0.8 - (0.1 × -0.2) = 0.82
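In code, the update is a single line; here it is with the numbers from the example (floating-point output is approximately 0.82):

```python
# One weight update, using the numbers from the example above.
weight, grad, learning_rate = 0.8, -0.2, 0.1
weight = weight - learning_rate * grad
print(weight)  # ~0.82
```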
Repeating this process gradually moves the weights closer to their optimal values, resulting in more accurate predictions by the neural network.
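Putting the three steps together, here is a minimal, self-contained sketch of the full loop. The model and data (prediction = w * x, x = 1.2, target 1.0) are illustrative assumptions, not part of the lesson's examples:

```python
# A complete toy gradient descent loop: one weight, one data point.
x, actual = 1.2, 1.0
w = 0.5
learning_rate = 0.1

for step in range(30):
    predicted = w * x
    loss = (actual - predicted) ** 2        # Step 1: calculate the loss
    grad = 2 * (predicted - actual) * x     # Step 2: analytic gradient dLoss/dw
    w = w - learning_rate * grad            # Step 3: update the weight

print(w, w * x)  # w approaches ~0.833, prediction approaches ~1.0
```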
Gradient descent is the key method by which neural networks find optimal weights, and choosing an appropriate learning rate is crucial.
A learning rate that's too large may overshoot the optimal value, while one that's too small could slow down learning.
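A quick way to see both failure modes is to run the same toy problem with several learning rates; the model and numbers below are the same illustrative assumptions as before:

```python
# Same assumed toy problem (prediction = w * x), three learning rates.
x, actual = 1.2, 1.0

def run(learning_rate, steps=20):
    w = 0.5
    for _ in range(steps):
        grad = 2 * (w * x - actual) * x   # dLoss/dw
        w -= learning_rate * grad
    return w

print(run(0.01))  # too small: still far from the optimum (~0.833)
print(run(0.1))   # appropriate: converges close to 0.833
print(run(1.5))   # too large: each step overshoots and the weight diverges
```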
To address this, various gradient descent variants such as Stochastic Gradient Descent (SGD), Batch Gradient Descent (BGD), Momentum, and Adam are used.
In the next lesson, we will explore stochastic gradient descent in detail.