These are some of my notes from the first course of the Deep Learning Specialization (Neural Networks and Deep Learning) by deeplearning.ai. They might help you gain insight into how a neural network is formed: Logistic Regression forms one single neuron, and such neurons are stacked up to form a neural network. Further down, there is a discussion on how vectorization helps us improve the efficiency of code and algorithms in Python using NumPy.
1. Binary Classification
2. Logistic Regression
- In the following figure, the loss function (L) computes the error for a single training example; the cost function (J) is the average of the loss functions of the entire training set.
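For quick reference, these are the standard definitions used in the course, where ŷ = σ(wᵀx + b) is the model's prediction:

```latex
% Loss for a single training example
L(\hat{y}, y) = -\left( y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \right)

% Cost over the entire training set of m examples
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L\!\left(\hat{y}^{(i)}, y^{(i)}\right)
```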
3. Gradient Descent
- Gradient descent is a method for finding the optimal values of the parameters: using the derivative (slope) of the cost function, it repeatedly moves the parameters downhill, with the step size controlled by a learning rate
- For logistic regression, the parameters w & b are optimized as follows, where alpha is the learning rate
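Restating the update rule from the course, repeated until convergence:

```latex
w := w - \alpha \, \frac{\partial J(w, b)}{\partial w}
\qquad
b := b - \alpha \, \frac{\partial J(w, b)}{\partial b}
```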
4. Computing Derivatives for Gradient Descent
- The following computation graph is made by breaking down the formula for evaluating the cost function of the model
- Forward pass: Evaluation of the cost function
- Backward pass: Evaluation of the derivatives of the final output w.r.t. the different model variables, which gives us the gradients needed to find the best-fit parameters
- One backward step in a computation graph gives us the derivative of the current variable with respect to the one we are stepping back to.
- e.g. Moving from block J to block v, we find the derivative of J w.r.t. v
- We apply the chain rule while propagating back through the graph
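A minimal sketch of this, using the lecture's example J = 3(a + b*c) broken into the intermediate variables u and v (the values a = 5, b = 3, c = 2 are the ones used in the lecture):

```python
# Forward pass: break J = 3(a + b*c) into elementary steps
a, b, c = 5.0, 3.0, 2.0
u = b * c        # u = 6
v = a + u        # v = 11
J = 3 * v        # J = 33

# Backward pass: apply the chain rule, stepping right to left through the graph
dJ_dv = 3.0            # dJ/dv
dJ_du = dJ_dv * 1.0    # dJ/du = dJ/dv * dv/du
dJ_da = dJ_dv * 1.0    # dJ/da = dJ/dv * dv/da
dJ_db = dJ_du * c      # dJ/db = dJ/du * du/db = 3 * 2 = 6
dJ_dc = dJ_du * b      # dJ/dc = dJ/du * du/dc = 3 * 3 = 9
```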
5. Logistic Regression Derivatives
- The following image shows gradient descent for logistic regression using only one data point
- That is why we take the loss function (L) instead of the cost function (J): we only have one data point for now
- After we calculate “da = dL/da” and “dz = dL/dz” (in the course's shorthand, “da” stands for the derivative of L w.r.t. a, and so on), we get dw1 = x1*dz and dw2 = x2*dz, and we can use the following formulas to update w1 and w2:
- w1 = w1 - alpha * dw1
- w2 = w2 - alpha * dw2
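A minimal runnable sketch of this single-example update for two features (the concrete values of x1, x2, y and the initial parameters are mine, chosen just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# One training example with two features (illustrative values)
x1, x2, y = 1.0, 2.0, 1.0
w1, w2, b = 0.1, -0.2, 0.0
alpha = 0.01  # learning rate

# Forward pass
z = w1 * x1 + w2 * x2 + b
a = sigmoid(z)

# Backward pass: derivatives of the loss L w.r.t. each variable
dz = a - y        # dL/dz, the simplified form of dL/da * da/dz
dw1 = x1 * dz     # dL/dw1
dw2 = x2 * dz     # dL/dw2
db = dz           # dL/db

# Gradient-descent update
w1 = w1 - alpha * dw1
w2 = w2 - alpha * dw2
b = b - alpha * db
```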
6. Logistic Regression on “m” examples
- Algorithm to implement logistic regression for a dataset (m input samples) with 2 features only
- Although the algorithm below uses a for loop, we will need vectorization techniques to simplify and optimize the code so that it works on large datasets.
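A runnable version of that loop-based algorithm, written as one gradient-descent step (a sketch: it assumes inputs X of shape (2, m) and labels Y of shape (1, m), and the function name is mine):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def logistic_step_with_loops(X, Y, w1, w2, b, alpha=0.01):
    """One gradient-descent step over m examples with 2 features, using explicit loops."""
    m = X.shape[1]
    J, dw1, dw2, db = 0.0, 0.0, 0.0, 0.0
    for i in range(m):
        z = w1 * X[0, i] + w2 * X[1, i] + b
        a = sigmoid(z)
        J += -(Y[0, i] * np.log(a) + (1 - Y[0, i]) * np.log(1 - a))
        dz = a - Y[0, i]
        dw1 += X[0, i] * dz
        dw2 += X[1, i] * dz
        db += dz
    # Average over the m examples
    J, dw1, dw2, db = J / m, dw1 / m, dw2 / m, db / m
    # Update the parameters
    return w1 - alpha * dw1, w2 - alpha * dw2, b - alpha * db, J
```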
7. Python and Vectorization
- Here, np.dot(w.T, X) is simply evaluating the matrix product wᵀ · X
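The lecture motivates vectorization with a timing comparison along these lines (a sketch; the exact numbers depend on your machine, but the vectorized version is typically orders of magnitude faster):

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Vectorized dot product
tic = time.time()
c = np.dot(a, b)
toc = time.time()
print(f"Vectorized: {1000 * (toc - tic):.2f} ms")

# Explicit for loop computing the same dot product
tic = time.time()
c = 0.0
for i in range(n):
    c += a[i] * b[i]
toc = time.time()
print(f"For loop:   {1000 * (toc - tic):.2f} ms")
```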
8. Vectorizing Logistic Regression
- In the last equation, although “b” is a single real number, Python automatically converts it to a (1, m) row vector. This is called broadcasting
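In code, the vectorized forward pass then collapses to two lines (a sketch, assuming X of shape (n_x, m), w of shape (n_x, 1), and a scalar b; the sizes below are just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

n_x, m = 2, 5                 # 2 features, 5 examples (illustrative sizes)
X = np.random.randn(n_x, m)
w = np.zeros((n_x, 1))
b = 0.0

Z = np.dot(w.T, X) + b        # shape (1, m); the scalar b is broadcast across all m columns
A = sigmoid(Z)                # predictions for all m examples at once
```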
9. Implementing Logistic Regression
- Finally, the code to implement logistic regression in Python is given below (right part of the image)
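A compact sketch of that implementation, using the vectorized forward and backward passes from the course (the function name and default hyperparameters are mine):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_logistic_regression(X, Y, alpha=0.01, iterations=1000):
    """X: (n_x, m) inputs, Y: (1, m) labels. Returns the learned w and b."""
    n_x, m = X.shape
    w = np.zeros((n_x, 1))
    b = 0.0
    for _ in range(iterations):
        # Forward pass over all m examples at once
        Z = np.dot(w.T, X) + b
        A = sigmoid(Z)
        # Backward pass: vectorized gradients
        dZ = A - Y
        dw = np.dot(X, dZ.T) / m
        db = np.sum(dZ) / m
        # Gradient-descent update
        w = w - alpha * dw
        b = b - alpha * db
    return w, b
```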
10. Broadcasting in Python
- When axis=0, NumPy operations are carried out along the vertical axis (down the columns); with axis=1 they run horizontally (across the rows)
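The lecture illustrates broadcasting with a food-calorie matrix; a sketch of the same idea with illustrative values:

```python
import numpy as np

# Calories from carbs, protein, and fat (rows) in four foods (columns)
A = np.array([[56.0,   0.0,  4.4, 68.0],
              [ 1.2, 104.0, 52.0,  8.0],
              [ 1.8, 135.0, 99.0,  0.9]])

cal = A.sum(axis=0)                       # axis=0: sum vertically, one total per column
percentage = 100 * A / cal.reshape(1, 4)  # (3, 4) divided by (1, 4): the row is broadcast down
print(percentage)
```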
11. Python and Numpy Tips
- Do not use rank-1 arrays (shape (n,)) as shown in the image below; use explicit (n, 1) column vectors or (1, n) row vectors instead
- Use assert() statements in your code to validate the dimensions of the arrays/vectors you are working with
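A short sketch of the pitfall these two tips guard against:

```python
import numpy as np

a = np.random.randn(5)         # rank-1 array: shape (5,), neither a row nor a column vector
print(a.shape)                 # (5,)
print(np.array_equal(a, a.T))  # True: transposing a rank-1 array does nothing

b = np.random.randn(5, 1)      # explicit column vector
print(b.shape)                 # (5, 1)
assert(b.shape == (5, 1))      # catches shape bugs early

a = a.reshape(5, 1)            # fix a rank-1 array by reshaping it explicitly
```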
12. Explanation (derivation) of Loss function for Logistic Regression
- Loss Function
- Cost Function
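The gist of the derivation, restated from the lecture: interpret ŷ as the probability that y = 1 given x, and the loss falls out of maximum likelihood.

```latex
% Both cases (y = 0 and y = 1) collapse into one expression:
p(y \mid x) = \hat{y}^{\,y} (1 - \hat{y})^{1 - y}

% Taking logs:
\log p(y \mid x) = y \log \hat{y} + (1 - y) \log(1 - \hat{y}) = -L(\hat{y}, y)

% Maximizing the likelihood is therefore the same as minimizing the loss.
% For m i.i.d. examples the log-likelihood becomes a sum; negating and
% scaling by 1/m gives the cost function:
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L\!\left(\hat{y}^{(i)}, y^{(i)}\right)
```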