This repository contains a neural network written from scratch in Rust using the nalgebra crate. It is part of a blog series currently in progress here. Reading the posts will help you understand the code, but I hope it is readable enough to follow without them.
This project can currently only be run by using cargo:
git clone https://github.com/max-amb/number_recognition.git
cd number_recognition
cargo run
If you are using Nix, there is no need to install dependencies, as they are all contained in flake.nix.
You can enter the dev shell as follows:
git clone https://github.com/max-amb/number_recognition.git
cd number_recognition
nix develop
This section details how I obtained my testing data. If you are unsure of anything, I recommend reading my blog post, which walks through the mathematics!
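We start with a network with 3 layers (one input, one hidden and one output) of sizes 3, 2, and 3. The weights look like this:
$$ \omega^{[1]} = \begin{bmatrix} -0.75 & 0.5 & -0.25 \\ 0.25 & -0.5 & 0.75 \end{bmatrix}, \quad \omega^{[2]} = \begin{bmatrix} -0.75 & 0.75 \\ 0.5 & -0.5 \\ -0.25 & 0.25 \end{bmatrix} $$
Then we will have biases:
$$ b^{[1]} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad b^{[2]} = \begin{bmatrix} 0.75 \\ 0.5 \\ 0.25 \end{bmatrix} $$
We will have mock data:
$$ a^{[0]} = \begin{bmatrix} 0.25 \\ 0.5 \\ 0.75 \end{bmatrix} $$
and expect an output of
$$ y = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$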
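For the activation functions, we use a leaky ReLU (with a slope of 0.2 for negative inputs, as the worked values in this section imply) on the hidden layer and a sigmoid on the output layer.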
Everything is rounded to 3 decimal places for conciseness, but the full output can be found in the Wolfram outputs.
Layer 1:
$$
\begin{aligned}
a^{[1]} & = ReLU((\omega^{[1]} \times a^{[0]}) + b^{[1]}) \\
& = ReLU(\begin{bmatrix} -0.75 & 0.5 & -0.25 \\ 0.25 & -0.5 & 0.75 \end{bmatrix} \times \begin{bmatrix} 0.25 \\ 0.5 \\ 0.75 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}) \\
& = \begin{bmatrix} -0.025 \\ 1.375 \end{bmatrix}
\end{aligned}
$$
Layer 2:
$$
\begin{aligned}
a^{[2]} & = \sigma((\omega^{[2]} \times a^{[1]}) + b^{[2]}) \\
& = \sigma(\begin{bmatrix} -0.75 & 0.75 \\ 0.5 & -0.5 \\ -0.25 & 0.25 \end{bmatrix} \times \begin{bmatrix} -0.025 \\ 1.375 \end{bmatrix} + \begin{bmatrix} 0.75 \\ 0.5 \\ 0.25 \end{bmatrix}) \\
& = \begin{bmatrix} 0.858 \\ 0.450 \\ 0.646 \end{bmatrix}
\end{aligned}
$$
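The two forward steps above can be reproduced with a few lines of Rust. This is only a minimal sketch: it uses plain arrays rather than nalgebra so that it runs with the standard library alone, the function names (`layer`, `forward`, and so on) are hypothetical and not taken from this repository, and the leaky ReLU slope of 0.2 is an assumption inferred from the worked values.

```rust
/// Leaky ReLU; the 0.2 slope for non-positive inputs is assumed from the worked values.
fn leaky_relu(x: f64) -> f64 {
    if x > 0.0 { x } else { 0.2 * x }
}

/// Logistic sigmoid.
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Computes activation(weights * input + bias) for one dense layer.
fn layer<const IN: usize, const OUT: usize>(
    weights: &[[f64; IN]; OUT],
    bias: &[f64; OUT],
    input: &[f64; IN],
    activation: fn(f64) -> f64,
) -> [f64; OUT] {
    let mut out = [0.0; OUT];
    for (i, row) in weights.iter().enumerate() {
        // Dot product of one weight row with the input, plus the bias.
        let z: f64 = row.iter().zip(input).map(|(w, a)| w * a).sum::<f64>() + bias[i];
        out[i] = activation(z);
    }
    out
}

/// Runs the 3-2-3 test network on the mock data and returns (a1, a2).
fn forward() -> ([f64; 2], [f64; 3]) {
    let w1 = [[-0.75, 0.5, -0.25], [0.25, -0.5, 0.75]];
    let b1 = [0.0, 1.0];
    let w2 = [[-0.75, 0.75], [0.5, -0.5], [-0.25, 0.25]];
    let b2 = [0.75, 0.5, 0.25];
    let a0 = [0.25, 0.5, 0.75];

    let a1 = layer(&w1, &b1, &a0, leaky_relu);
    let a2 = layer(&w2, &b2, &a1, sigmoid);
    (a1, a2)
}

fn main() {
    let (a1, a2) = forward();
    println!("a1 = {:.3?}", a1);
    println!("a2 = {:.3?}", a2);
}
```

The returned activations should agree with the Layer 1 and Layer 2 results above to 3 decimal places.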
Wolfram input: 2 * ({(1/(1+e^{-1.8})),(1/(1+e^{0.2})),(1/(1+e^{-0.6}))} - {0,1,0}) * ({(1/(1+e^{-1.8}))*(1-(1/(1+e^{-1.8}))), (1/(1+e^{0.2}))*(1-(1/(1+e^{0.2}))), (1/(1+e^{-0.6}))*(1-(1/(1+e^{-0.6})))})
Output: {{0.208924}, {-0.272186}, {0.295432}}
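The Wolfram input above multiplies 2(a - y) elementwise by the sigmoid derivative a(1 - a), evaluated at the output-layer pre-activations 1.8, -0.2 and 0.6. A minimal Rust sketch of the same calculation follows; `output_delta` is a hypothetical name, and the factor of 2 is copied from the Wolfram input rather than from the repository's code.

```rust
/// Logistic sigmoid.
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Output-layer error: 2 * (a - y) * sigmoid'(z), using sigmoid'(z) = a * (1 - a).
fn output_delta(z: &[f64; 3], target: &[f64; 3]) -> [f64; 3] {
    std::array::from_fn(|i| {
        let a = sigmoid(z[i]);
        2.0 * (a - target[i]) * a * (1.0 - a)
    })
}

fn main() {
    // Pre-activations of the output layer and the expected output.
    let delta = output_delta(&[1.8, -0.2, 0.6], &[0.0, 1.0, 0.0]);
    println!("{:.6?}", delta);
}
```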
Wolfram input: {{0.208924}, {-0.272186}, {0.295432}} * {{-0.025, 1.375}}
Output: {{-0.0052231, 0.287271}, {0.00680465, -0.374256}, {-0.0073858, 0.406219}}
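The product above is an outer product: a column vector of layer errors times the row vector of the previous layer's activations, giving one gradient entry per weight. A minimal sketch in plain Rust, with the hypothetical helper name `outer` and no nalgebra dependency:

```rust
/// Outer product: grad[i][j] = delta[i] * prev_activation[j].
fn outer<const R: usize, const C: usize>(
    delta: &[f64; R],
    prev_activation: &[f64; C],
) -> [[f64; C]; R] {
    let mut grad = [[0.0; C]; R];
    for (i, d) in delta.iter().enumerate() {
        for (j, a) in prev_activation.iter().enumerate() {
            grad[i][j] = d * a;
        }
    }
    grad
}

fn main() {
    // Output-layer error and hidden-layer activations from the steps above.
    let delta2 = [0.208924, -0.272186, 0.295432];
    let a1 = [-0.025, 1.375];
    for row in &outer(&delta2, &a1) {
        println!("{:.6?}", row);
    }
}
```

Calling the same helper with the hidden-layer error and the input (0.25, 0.5, 0.75) gives the first-layer weight gradient.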
Wolfram input: ({{-0.75, 0.5, -0.25}, {0.75, -0.5, 0.25}} * {{0.208924}, {-0.272186}, {0.295432}}) * {{0.2}, {1}}
Output: {{-0.0733288}, {0.366644}}
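This step pushes the output-layer error back through the transposed second-layer weights and scales it by the leaky ReLU derivative (0.2 and 1 here). A rough Rust sketch is below. The name `backpropagate_delta` is hypothetical, the 0.2 slope is inferred from the Wolfram input, and the hidden-layer pre-activations (-0.125, 1.375) are values I computed from the weights, input and biases used in the Layer 1 step.

```rust
/// Derivative of leaky ReLU; the 0.2 slope for non-positive inputs is an assumption.
fn leaky_relu_prime(z: f64) -> f64 {
    if z > 0.0 { 1.0 } else { 0.2 }
}

/// delta_prev[j] = (sum_i weights[i][j] * delta_next[i]) * leaky_relu'(z_prev[j]),
/// i.e. (W^T * delta_next) multiplied elementwise by the activation derivative.
fn backpropagate_delta<const PREV: usize, const NEXT: usize>(
    weights: &[[f64; PREV]; NEXT],
    delta_next: &[f64; NEXT],
    z_prev: &[f64; PREV],
) -> [f64; PREV] {
    std::array::from_fn(|j| {
        // Column j of the weight matrix dotted with the next layer's error.
        let weighted: f64 = weights
            .iter()
            .zip(delta_next)
            .map(|(row, d)| row[j] * d)
            .sum();
        weighted * leaky_relu_prime(z_prev[j])
    })
}

fn main() {
    let w2 = [[-0.75, 0.75], [0.5, -0.5], [-0.25, 0.25]];
    let delta2 = [0.208924, -0.272186, 0.295432];
    let z1 = [-0.125, 1.375];
    let delta1 = backpropagate_delta(&w2, &delta2, &z1);
    println!("{:.7?}", delta1);
}
```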
Wolfram input: {{-0.0733288}, {0.366644}} * {{0.25, 0.5, 0.75}}
Output: {{-0.0183322, -0.0366644, -0.0549966}, {0.091661, 0.183322, 0.274983}}
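Read together, the four Wolfram inputs above appear to evaluate the usual backpropagation equations. The summary below is a sketch under one assumption: the factor of 2 in the first input suggests a cost of the form $C = \sum_i (a^{[2]}_i - y_i)^2$, but the cost function is not stated in this section. Here $z^{[l]}$ denotes the pre-activation of layer $l$, $y$ the expected output, and $\odot$ elementwise multiplication.

$$
\begin{aligned}
\delta^{[2]} & = 2(a^{[2]} - y) \odot \sigma'(z^{[2]}) \\
\frac{\partial C}{\partial \omega^{[2]}} & = \delta^{[2]} \times (a^{[1]})^{T} \\
\delta^{[1]} & = ((\omega^{[2]})^{T} \times \delta^{[2]}) \odot ReLU'(z^{[1]}) \\
\frac{\partial C}{\partial \omega^{[1]}} & = \delta^{[1]} \times (a^{[0]})^{T}
\end{aligned}
$$

Under the same assumption, the bias gradients are the errors themselves: $\frac{\partial C}{\partial b^{[l]}} = \delta^{[l]}$.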