- Initializing parameters
- Calculating the cost function and its gradient
- Using an optimization algorithm (gradient descent)
- Gathering all three functions above into a main model function, in the right order (see the model sketch further below)
- Use a virtual environment to avoid installing packages directly on your host machine
- The following packages will be used: numpy, h5py, matplotlib, and PIL/scipy (to test the model)
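Assuming those packages are installed inside the virtual environment, the imports used throughout would look roughly like this (a sketch, not a definitive list):

```python
import numpy as np                # vectorized array math
import h5py                       # reading the HDF5 datasets
import matplotlib.pyplot as plt   # plotting costs and images
from PIL import Image             # loading an image to test the model
```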
When loading the data (see the sketch below the list), we have:
1. A training set of shape (209, 64, 64, 3), with 209 images labeled as cat (y=1) or non-cat (y=0)
2. A test set of shape (50, 64, 64, 3), with 50 images labeled as cat or non-cat
3. Each image is of shape (num_px, num_px, 3), where in this case num_px is 64 and the number of channels is 3
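As a minimal sketch of the loading step, assuming the data ships as HDF5 files with the (hypothetical) file names and dataset keys used below; adjust them to your actual files:

```python
import numpy as np
import h5py

def load_dataset():
    # File names and dataset keys are assumptions; adapt them to your data.
    with h5py.File("datasets/train_catvnoncat.h5", "r") as train_file:
        train_set_x = np.array(train_file["train_set_x"])  # (209, 64, 64, 3)
        train_set_y = np.array(train_file["train_set_y"])  # (209,)
    with h5py.File("datasets/test_catvnoncat.h5", "r") as test_file:
        test_set_x = np.array(test_file["test_set_x"])     # (50, 64, 64, 3)
        test_set_y = np.array(test_file["test_set_y"])     # (50,)

    # Flatten each (64, 64, 3) image into a column and scale pixels to [0, 1].
    X_train = train_set_x.reshape(train_set_x.shape[0], -1).T / 255.0  # (12288, 209)
    X_test = test_set_x.reshape(test_set_x.shape[0], -1).T / 255.0     # (12288, 50)
    Y_train = train_set_y.reshape(1, -1)  # (1, 209)
    Y_test = test_set_y.reshape(1, -1)    # (1, 50)
    return X_train, Y_train, X_test, Y_test
```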
The goal is to build a logistic regression classifier using a neural network mindset.
In order to do so, we have to consider the following mathematical expressions for one example $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)}) \tag{2}$$
$$\mathcal{L}(a^{(i)}, y^{(i)}) = - y^{(i)} \log(a^{(i)}) - (1 - y^{(i)}) \log(1 - a^{(i)}) \tag{3}$$
The cost is then computed by summing over all training examples: $$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}$$
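Equations (1)-(3) and (6), together with the standard logistic-regression gradients $dw = \frac{1}{m} X (A - Y)^T$ and $db = \frac{1}{m} \sum_{i=1}^m (a^{(i)} - y^{(i)})$, can be vectorized over all $m$ examples. A minimal sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """One forward and backward pass.

    w -- weights, shape (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data, shape (num_px * num_px * 3, m)
    Y -- labels (0 or 1), shape (1, m)
    """
    m = X.shape[1]

    # Forward propagation: equations (1), (2), and (6).
    A = sigmoid(np.dot(w.T, X) + b)                              # (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

    # Backward propagation: gradients of the cost w.r.t. w and b.
    dw = np.dot(X, (A - Y).T) / m   # same shape as w
    db = np.sum(A - Y) / m          # scalar

    return {"dw": dw, "db": db}, cost
```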
The steps from here are the following:
- Initialize the parameters of the model, $w$ and $b$
- Learn the parameters of the model by minimizing the cost
- Use the learned parameters to make predictions (sketched below)
- Analyze the results and conclude
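A sketch of the prediction step, reusing the sigmoid from the propagation sketch above and thresholding the activation at 0.5:

```python
import numpy as np

def predict(w, b, X):
    # Compute activations, then threshold at 0.5 to get hard 0/1 labels.
    A = sigmoid(np.dot(w.T, X) + b)   # (1, m)
    return (A > 0.5).astype(float)    # 1 = cat, 0 = non-cat
```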
- Define the model structure
- Initialize the model's parameters
- Loop where (see the sketch after this list):
  - the current loss is calculated (forward propagation)
  - the current gradient is calculated (backward propagation)
  - the parameters are updated (gradient descent)
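Putting the loop together with the helpers sketched above gives the main model function. This is a sketch under the same assumptions; zero initialization is one simple choice that works for logistic regression, and the default hyperparameter values are illustrative:

```python
import numpy as np

def optimize(w, b, X, Y, num_iterations=2000, learning_rate=0.005):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)   # forward + backward pass
        w = w - learning_rate * grads["dw"]   # gradient-descent update
        b = b - learning_rate * grads["db"]
        if i % 100 == 0:
            costs.append(cost)                # record the cost periodically
    return w, b, costs

def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.005):
    # Zero initialization is sufficient for logistic regression.
    w = np.zeros((X_train.shape[0], 1))
    b = 0.0
    w, b, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate)

    # Report accuracy on both sets with the learned parameters.
    train_acc = 100 - np.mean(np.abs(predict(w, b, X_train) - Y_train)) * 100
    test_acc = 100 - np.mean(np.abs(predict(w, b, X_test) - Y_test)) * 100
    print(f"train accuracy: {train_acc:.1f}% | test accuracy: {test_acc:.1f}%")
    return {"w": w, "b": b, "costs": costs, "learning_rate": learning_rate}
```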
- Different learning rates give different costs and thus different prediction results.
- If the learning rate is too large (e.g., 0.01), the cost may oscillate up and down, and it may even diverge (though in this example, using 0.01 still eventually reaches a good cost value).
- A lower cost doesn't necessarily mean a better model: you have to check for overfitting, which happens when the training accuracy is a lot higher than the test accuracy.
- In deep learning, we usually recommend that you:
- Choose the learning rate that best minimizes the cost function.
- If your model overfits, use other techniques to reduce overfitting.
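To see the effect of the learning rate in practice, one can train the sketched model with several rates and compare the cost curves (assuming the model and load_dataset sketches above):

```python
import matplotlib.pyplot as plt

X_train, Y_train, X_test, Y_test = load_dataset()

# Train with several learning rates and plot each cost curve.
for lr in [0.01, 0.001, 0.0001]:
    result = model(X_train, Y_train, X_test, Y_test,
                   num_iterations=1500, learning_rate=lr)
    plt.plot(result["costs"], label=f"lr = {lr}")

plt.xlabel("iterations (per hundreds)")
plt.ylabel("cost")
plt.legend()
plt.show()
```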