A small library of neural networks built from scratch with Python and NumPy.
This project was created during my Master's course at Leiden University; its purpose was to gain an in-depth understanding of gradient descent and backpropagation.
The model architecture allows for the creation of a neural network with an arbitrary number of inputs, hidden nodes and output nodes.
The network has been tested by simulating the XOR logic gate.
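As a rough illustration of what the library does, the sketch below trains a tiny 2-h-1 sigmoid network on the XOR truth table with plain batch gradient descent. The function name, layer sizes, and hyper-parameters are illustrative assumptions, not the exact API of XOR_experiment.py.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_xor(hidden_nodes, lr=0.5, epochs=5000, seed=0):
    # Hypothetical minimal trainer, not the repository's actual API.
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    # Random weight initialization, zero biases
    W1 = rng.normal(0, 1, (2, hidden_nodes))
    b1 = np.zeros(hidden_nodes)
    W2 = rng.normal(0, 1, (hidden_nodes, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        # Forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: MSE loss, sigmoid derivatives
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # Gradient descent updates
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h
        b1 -= lr * d_h.sum(axis=0)
    return out, np.mean((out - y) ** 2)

out, mse = train_xor(hidden_nodes=4)
```

With four hidden nodes the loss typically falls well below that of a constant predictor (MSE 0.25), i.e. the network learns XOR.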
- Python 3
- numpy >= 1.19
To run a single experiment, run the file XOR_experiment.py:
python XOR_experiment.py
To run multiple experiments and plot the results, use the file experiments.py:
python experiments.py
It is interesting to note that when running the network with only two hidden nodes (the minimum required to emulate the XOR logic gate), the network sometimes fails to learn XOR.
In general, this is the result of bad (unlucky) weight initialization that pushes the network towards local minima to which it converges quickly. The more neurons we add to the hidden layer, the less influence weight initialization has on the learning of the model. In addition, the more weights are used in the hidden layer, the smoother and faster the descent of the loss becomes over epochs.
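The effect of initialization can be probed by repeating training across several random seeds and counting how often the network actually solves XOR. The sketch below does this with a self-contained tiny 2-h-1 sigmoid network; the threshold, seed range, and hyper-parameters are assumptions for illustration, not the settings of experiments.py.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def final_mse(hidden_nodes, seed, lr=0.5, epochs=3000):
    # Hypothetical minimal trainer; returns the final MSE on the XOR table.
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    W1, b1 = rng.normal(0, 1, (2, hidden_nodes)), np.zeros(hidden_nodes)
    W2, b2 = rng.normal(0, 1, (hidden_nodes, 1)), np.zeros(1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)            # forward pass
        out = sigmoid(h @ W2 + b2)
        d_out = (out - y) * out * (1 - out)  # backprop through MSE + sigmoid
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h
        b1 -= lr * d_h.sum(axis=0)
    return np.mean((out - y) ** 2)

# Count runs whose final MSE is well below the failure plateau.
results = {}
for hn in (2, 8):
    results[hn] = sum(final_mse(hn, s) < 0.05 for s in range(10))
    print(f"hidden nodes = {hn}: solved {results[hn]}/10 runs")
```

With only two hidden nodes some seeds are expected to stall at a high MSE, while wider hidden layers tend to succeed across more seeds.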
An example of these behaviours can be seen in the plots generated by experiments.py with the seed set via np.random.seed(0) (enabled by default).
The image below illustrates the above observations. The labels show the parameters of each experiment:
- hn_int : the number of hidden nodes in the hidden layer of the network.
- lr_float : the learning rate used during training.
- ok_bool : boolean value indicating whether the network produced the correct results when tested after training.
The experiment with two hidden nodes was not able to learn the XOR logic gate properly, converging to a mean squared error of around 0.15.
- Implement multiple hidden layers in the network.
- Create a bash script to run experiments.
The following literature was used as reference:
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron