# Understanding Autograd

Thank you for taking our course. Completing the following tasks will prepare you for the exercise sessions in the coming week. The exercise sessions use workstations running Ubuntu Linux, so we highly recommend using a Linux system instead of Windows.

## Task 0: Python and VSCode setup

If you are unfamiliar with GitHub or our exercise setup, please follow Tasks 1 to 5 here.

## Task 1: Operator overloading

This exercise studies the implementation of an algorithmic differentiation engine via operator overloading.

Python supports overloading plus (`+`) and times (`*`) via the magic methods `__add__` and `__mul__`. Both are vital for this project. Navigate to the `src` folder and open `src/autograd.py`. The TODOs mark the parts of the code that require your attention.
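As a quick reminder of how this dispatch works, consider the toy class below. It is illustrative only and not part of the exercise code.

```python
class Point:
    """Toy class demonstrating operator overloading via magic methods."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Invoked when Python evaluates `self + other`.
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        # Invoked when Python evaluates `self * scalar`.
        return Point(self.x * scalar, self.y * scalar)


p = Point(1, 2) + Point(3, 4)  # dispatches to Point.__add__
q = Point(1, 2) * 10           # dispatches to Point.__mul__
print(p.x, p.y)                # 4 6
print(q.x, q.y)                # 10 20
```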

Run `nox -s test` to check your code after implementing the `ADiffFloat` class. If all checks pass, move on to the `src/fit_neuron.py` module.

When overloading `__add__`, please consider

$$ \begin{align} y = x_1 + x_2 & \\ & \rightarrow \delta x_1 = \frac{\partial (x_1 + x_2)}{\partial x_1} \cdot \delta y = 1 \cdot \delta y \\ & \rightarrow \delta x_2 = \frac{\partial (x_1 + x_2)}{\partial x_2} \cdot \delta y = 1 \cdot \delta y , \end{align} $$

with $\delta y$ as the inner derivative or seed value.

When overloading `__mul__`, please consider

$$ \begin{align} y = x_1 \cdot x_2 & \\ & \rightarrow \delta x_1 = \frac{\partial (x_1 \cdot x_2)}{\partial x_1} \cdot \delta y = x_2 \cdot \delta y \\ & \rightarrow \delta x_2 = \frac{\partial (x_1 \cdot x_2)}{\partial x_2} \cdot \delta y = x_1 \cdot \delta y. \end{align} $$

Finally, for element-wise functions,

$$ \begin{align} y = f(x) & \\ & \rightarrow \delta x = f'(x)\delta y . \end{align} $$
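To make the three rules above concrete, here is a minimal reverse-mode sketch. The class name `Value`, its attributes, and the recursive `backward` are assumptions for illustration only; the `ADiffFloat` class in `src/autograd.py` has its own TODO-guided structure.

```python
import math


class Value:
    """Minimal reverse-mode AD scalar (illustration only; ADiffFloat differs)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0          # accumulates delta x over all paths
        self._parents = parents  # pairs of (input node, local derivative)

    def __add__(self, other):
        # y = x1 + x2: both local derivatives are 1.
        return Value(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # y = x1 * x2: local derivatives are x2 (w.r.t. x1) and x1 (w.r.t. x2).
        return Value(self.value * other.value,
                     ((self, other.value), (other, self.value)))

    def tanh(self):
        # Element-wise rule: delta x = f'(x) * delta y, with f'(x) = 1 - tanh(x)^2.
        t = math.tanh(self.value)
        return Value(t, ((self, 1.0 - t * t),))

    def backward(self, seed=1.0):
        # Push the seed (the inner derivative delta y) to every input,
        # scaled by the local derivative of this elemental operation.
        self.grad += seed
        for parent, local in self._parents:
            parent.backward(local * seed)


x1, x2 = Value(2.0), Value(3.0)
y = x1 * x2
y.backward()             # seed delta y = 1
print(x1.grad, x2.grad)  # 3.0 and 2.0, matching the rule above
```

The recursive `backward` is the simplest scheme that works for small expressions; a traversal in topological order would avoid revisiting shared subexpressions in larger graphs.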

## Task 2: Gradient descent

Now we want to use the autograd engine from the previous exercise to solve a simple optimisation problem using gradient descent. Move on to `src/fit_neuron.py` and resolve all TODOs.

To do so, recall that the multivariate chain rule requires us to sum up the contributions from each path. More formally, for an input $x_j$ we compute

$$ \delta x_j = \sum_i \frac{\partial y_i}{\partial x_j} \delta y_i. $$

Here, $\frac{\partial y_i}{\partial x_j}$ can itself be a chain of multiple elemental operations.
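Continuing the toy `Value` sketch from Task 1, the `+=` in `backward` is exactly this sum: when a variable fans out into several paths, each path deposits its own contribution.

```python
# x feeds the multiplication twice, i.e. two paths from y back to x.
x = Value(3.0)
y = x * x
y.backward()
print(x.grad)  # 6.0 = x + x, the contributions of both paths summed up
```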

You can now execute the script with `python src/fit_neuron.py`. If everything is implemented correctly, the training process should result in a test accuracy of 1.0 after 10 epochs.
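For intuition, the same machinery drives a full gradient-descent loop. The schematic below fits a 1D linear model with the toy `Value` class from above; the model, loss, data, and hyperparameters in `src/fit_neuron.py` differ, so treat this purely as a sketch.

```python
# Schematic gradient descent with the toy Value class (not the exercise code).
w, b = Value(0.0), Value(0.0)
lr = 0.1
data = [(1.0, 3.0), (2.0, 5.0)]  # samples from y = 2x + 1

for step in range(500):
    # Fresh nodes each step so the accumulated gradients start at zero.
    w, b = Value(w.value), Value(b.value)
    loss = Value(0.0)
    for x, target in data:
        err = w * Value(x) + b + Value(-target)
        loss = loss + err * err  # squared error
    loss.backward()              # seed delta loss = 1
    w.value -= lr * w.grad       # step against the gradient
    b.value -= lr * b.grad

print(w.value, b.value)  # converges towards 2 and 1
```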

Optional further reading:
