![](https://cloudyard.in/wp-content/uploads/2021/06/Gradient-Descent-and-Cost-Function-image.jpg)
In continuation of the previous post, Gradient Descent and Cost Function, we touched upon the following points:
- Forward propagation to calculate the Loss.
- Backward propagation to compute the derivatives needed to obtain the new values of w and b for subsequent gradient descent iterations.
Now it's time to dive deeper and see how these values are derived for one GD iteration.
Case Study:
Let's assume we have a basic neural network that takes two features, x1 and x2, as input and tries to predict the value of Y.
![Neural Network](https://cloudyard.in/wp-content/uploads/2021/06/Neural-Network.jpg)
Backward Propagation Steps:
In the first step of backward propagation, we calculate the derivative of the loss function with respect to Ŷ as shown below (refer to Fig 1).
![Fig 1](https://cloudyard.in/wp-content/uploads/2021/06/Fig-1.jpg)
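Assuming the binary cross-entropy loss from the previous post (the standard choice for this setup), the derivative in Fig 1 works out as:

```latex
L(\hat{y}, y) = -\big(y \log \hat{y} + (1 - y)\log(1 - \hat{y})\big),
\qquad
\frac{\partial L}{\partial \hat{y}} = -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}
```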
Now we can move to the second step of backward propagation and calculate the next derivative of the loss function in the chain (refer to Fig 2).
![Fig 2](https://cloudyard.in/wp-content/uploads/2021/06/Fig-2.jpg)
Or we can simply calculate it using the chain rule of differentiation.
![Chain Rule of Derivation](https://cloudyard.in/wp-content/uploads/2021/06/Chain-Rule-of-Derivation-1.jpg)
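With a sigmoid activation ŷ = σ(z), for which dŷ/dz = ŷ(1 − ŷ) (an assumption matching the standard setup of this network), the chain rule collapses to a remarkably simple expression:

```latex
\frac{\partial L}{\partial z}
= \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z}
= \left(-\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}\right) \hat{y}\,(1 - \hat{y})
= \hat{y} - y
```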
Now we move to the third step of backward propagation, where we compute the derivatives of the loss with respect to w1, w2, and b (refer to Fig 3).
![Fig 3](https://cloudyard.in/wp-content/uploads/2021/06/Fig-3.jpg)
Similarly, we can calculate:
![](https://cloudyard.in/wp-content/uploads/2021/06/Fig-4-300x69.jpg)
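Since z = w1·x1 + w2·x2 + b, each of these derivatives is simply the input that multiplies the corresponding parameter, times ∂L/∂z (writing dz as shorthand for ŷ − y):

```latex
\frac{\partial L}{\partial w_1} = x_1 \, dz, \qquad
\frac{\partial L}{\partial w_2} = x_2 \, dz, \qquad
\frac{\partial L}{\partial b} = dz,
\quad \text{where } dz = \hat{y} - y
```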
Now we can easily calculate the new parameter values for the gradient descent iteration, based on the update rule from the first post.
![](https://cloudyard.in/wp-content/uploads/2021/06/Fig-5-1.jpg)
For a single training example with two features x1 and x2, we can write the new values of the w's and b for one GD iteration as follows:
![](https://cloudyard.in/wp-content/uploads/2021/06/Fig-6-1.jpg)
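Putting the pieces together, one GD step for a single example can be sketched in Python. This is a minimal illustration, assuming a sigmoid unit with binary cross-entropy loss; the function name and the learning rate alpha are choices for this sketch, not from the post:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gd_step_single(x1, x2, y, w1, w2, b, alpha=0.1):
    """One gradient descent step for a single training example,
    assuming a sigmoid unit with binary cross-entropy loss."""
    # Forward pass
    z = w1 * x1 + w2 * x2 + b
    y_hat = sigmoid(z)

    # Backward pass: the derivatives derived above
    dz = y_hat - y      # dL/dz
    dw1 = x1 * dz       # dL/dw1
    dw2 = x2 * dz       # dL/dw2
    db = dz             # dL/db

    # Parameter update
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db
    return w1, w2, b
```

For example, starting from w1 = w2 = b = 0 with x1 = 1, x2 = 2, y = 1, the forward pass gives ŷ = 0.5, so dz = −0.5 and the parameters move toward the positive label.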
Now let's extend the same logic to a problem with m training examples, where we want values of w1, w2, and b that work well for the entire training set.
We have two loops: an outer loop over the m training examples and an inner loop for calculating the parameter values w1, w2, and so on.
![](https://cloudyard.in/wp-content/uploads/2021/06/Fig-7-1.jpg)
Since we are calculating the average of the loss over all m examples:
Loss = Loss/m
And similarly, the derivatives for w and b are:
![](https://cloudyard.in/wp-content/uploads/2021/06/Fig-8-1.jpg)
This way we arrive at the parameter values for the entire training set after one round of the gradient descent algorithm.
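The two-loop procedure above can be sketched in Python as one full GD iteration. This is a hedged sketch under the same assumptions as before (sigmoid unit, binary cross-entropy loss); the inner loop over parameters is unrolled here as the two accumulators dw1 and dw2:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gd_iteration(X, Y, w1, w2, b, alpha=0.1):
    """One gradient descent iteration over m training examples.
    X is a list of (x1, x2) pairs; Y is the matching list of 0/1 labels."""
    m = len(X)
    loss, dw1, dw2, db = 0.0, 0.0, 0.0, 0.0

    # Outer loop over the m training examples
    for (x1, x2), y in zip(X, Y):
        z = w1 * x1 + w2 * x2 + b
        y_hat = sigmoid(z)
        loss += -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
        dz = y_hat - y
        dw1 += x1 * dz     # accumulate dL/dw1
        dw2 += x2 * dz     # accumulate dL/dw2
        db += dz           # accumulate dL/db

    # Average the loss and the derivatives over all m examples
    loss, dw1, dw2, db = loss / m, dw1 / m, dw2 / m, db / m

    # Single parameter update for this GD round
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db
    return w1, w2, b, loss
```

Running this repeatedly, with the updated w1, w2, b fed back in each time, performs multiple rounds of gradient descent over the whole training set.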
![Output](https://cloudyard.in/wp-content/uploads/2021/06/Output.jpg)
To understand Gradient Descent and Cost Function from scratch, click here.