
Commit 14b6ace

jstac and mmcky authored
keras lecture first build (#196)
* keras lecture first build
* Update lectures/keras.md
* Update lectures/keras.md
* fix incomplete sentence
* add hide-output tag to pip install
* add as a list

---------

Co-authored-by: Matt McKay <mmcky@users.noreply.github.com>
Co-authored-by: mmcky <mamckay@gmail.com>
1 parent 0a64ed6 commit 14b6ace

File tree

2 files changed: +294 −0 lines changed


lectures/_toc.yml

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ parts:
   numbered: true
   chapters:
   - file: mle
+  - file: keras
 - caption: Other
   numbered: true
   chapters:

lectures/keras.md

Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.16.4
kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
  name: python3
---

# Simple Neural Network Regression with Keras and JAX

```{include} _admonition/gpu.md
```

In this lecture we show how to implement one-dimensional nonlinear regression using a neural network.

We will use the popular deep learning library [Keras](https://keras.io/), which provides a simple and elegant interface to deep learning.

The emphasis in Keras is on providing an intuitive API, while the heavy lifting is done by another library.

Currently the backend library can be TensorFlow, PyTorch, or JAX.

In this lecture we will use JAX.

The objective of this lecture is to provide a very simple introduction to deep learning in a regression setting.

We begin with some standard imports.

```{code-cell} ipython3
import numpy as np
import matplotlib.pyplot as plt
```

Let's install Keras.

```{code-cell} ipython3
:tags: [hide-output]

!pip install keras
```

Now we specify that the desired backend is JAX.

```{code-cell} ipython3
import os
os.environ['KERAS_BACKEND'] = 'jax'
```

Next we import some tools from Keras.

```{code-cell} ipython3
import keras
from keras.models import Sequential
from keras.layers import Dense
```

```{code-cell} ipython3
Dense?
```

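A `Dense` layer applies an affine map to each row of its input, so a layer with `units=2` sends a batch of shape `(3, 4)` to an output of shape `(3, 2)`. Here is a minimal sketch (assuming, as in Keras 3, that a layer can be called directly on array data; the weights are randomly initialized, so only the output shape is meaningful).

```{code-cell} ipython3
# Sketch: call an untrained Dense layer on a small batch and inspect the output shape
layer = Dense(units=2)
sample_batch = np.ones((3, 4), dtype='float32')
layer(sample_batch).shape
```
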
## Data

First let's write a function to generate some data.

The data has the form

$$
y_i = f(x_i) + \epsilon_i,
\qquad i=1, \ldots, n
$$

The map $f$ is specified inside the function and $\epsilon_i$ is an independent draw from a fixed normal distribution.

Here's the function that creates vectors `x` and `y` according to the rule above.

```{code-cell} ipython3
def generate_data(x_min=0, x_max=5, data_size=400):
    x = np.linspace(x_min, x_max, num=data_size)
    x = x.reshape(data_size, 1)
    ϵ = 0.2 * np.random.randn(*x.shape)
    y = x**0.5 + np.sin(x) + ϵ
    x, y = [z.astype('float32') for z in (x, y)]
    return x, y
```

Now we generate some data to train the model.

```{code-cell} ipython3
x, y = generate_data()
```

Here's a plot of the training data.

```{code-cell} ipython3
fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()
```

We'll also use data from the same process for cross-validation.

```{code-cell} ipython3
x_validate, y_validate = generate_data()
```

## Models

We supply functions to build two types of models.

The first implements linear regression.

This is achieved by constructing a neural network with just one layer that maps to a single dimension (since the prediction is real-valued).

The input `model` will be an instance of `keras.Sequential`, which is used to group a stack of layers into a single prediction model.

```{code-cell} ipython3
def build_regression_model(model):
    model.add(Dense(units=1))
    model.compile(optimizer=keras.optimizers.SGD(),
                  loss='mean_squared_error')
    return model
```

In the function above you can see that we use stochastic gradient descent to train the model, and that the loss is mean squared error (MSE).

MSE is the standard loss function for ordinary least squares regression.

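For a sample of size $n$ with predictions $\hat y_i$, this loss is

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^n (y_i - \hat y_i)^2
$$

and each step of gradient descent adjusts the layer's weight and bias in the direction that reduces it.
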
The second function creates a dense (i.e., fully connected) neural network with three hidden layers, where each hidden layer maps to a $k$-dimensional output space.

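In matrix-vector notation, writing $\sigma$ for the activation function (here $\tanh$) and $W_j, b_j$ for the weights and biases created by the layers, the network computes

$$
\hat y = W_4 \, \sigma \bigl( W_3 \, \sigma \bigl( W_2 \, \sigma ( W_1 x + b_1 ) + b_2 \bigr) + b_3 \bigr) + b_4
$$

where $W_1$ is $k \times 1$, $W_2$ and $W_3$ are $k \times k$, and $W_4$ is $1 \times k$.
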
```{code-cell} ipython3
def build_nn_model(model, k=10, activation_function='tanh'):
    # Construct network
    model.add(Dense(units=k, activation=activation_function))
    model.add(Dense(units=k, activation=activation_function))
    model.add(Dense(units=k, activation=activation_function))
    model.add(Dense(1))
    # Embed training configurations
    model.compile(optimizer=keras.optimizers.SGD(),
                  loss='mean_squared_error')
    return model
```

The following function will be used to plot the MSE of the model during the training process.

Initially the MSE will be relatively high, but it should fall at each iteration, as the parameters are adjusted to better fit the data.

```{code-cell} ipython3
def plot_loss_history(training_history, ax):
    ax.plot(training_history.epoch,
            np.array(training_history.history['loss']),
            label='training loss')
    ax.plot(training_history.epoch,
            np.array(training_history.history['val_loss']),
            label='validation loss')
    ax.set_xlabel('Epoch')
    ax.set_ylabel('Loss (Mean squared error)')
    ax.legend()
```

## Training

Now let's go ahead and train our models.

### Linear regression

We'll start with linear regression.

First we create a `Model` instance using `Sequential()`.

```{code-cell} ipython3
model = Sequential()
regression_model = build_regression_model(model)
```

Now we train the model using the training data.

Since `batch_size` is set to the number of observations, each update step uses the full data set.

```{code-cell} ipython3
training_history = regression_model.fit(
    x, y, batch_size=x.shape[0], verbose=0,
    epochs=4000, validation_data=(x_validate, y_validate))
```

Let's have a look at the evolution of MSE as the model is trained.

```{code-cell} ipython3
fig, ax = plt.subplots()
plot_loss_history(training_history, ax)
plt.show()
```

Let's print the final MSE on the cross-validation data.

```{code-cell} ipython3
print("Testing loss on the validation set.")
regression_model.evaluate(x_validate, y_validate)
```

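As a cross-check (a minimal sketch; `evaluate` reports the same quantity), we can recompute this MSE directly from the model's predictions:

```{code-cell} ipython3
# Recompute the validation MSE by hand from the fitted model's predictions
y_hat = regression_model.predict(x_validate, verbose=0)
np.mean((y_validate - y_hat)**2)
```
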
Here are the model's predictions on the cross-validation data.

```{code-cell} ipython3
y_predict = regression_model.predict(x_validate)
```

We use the following function to plot our predictions along with the data.

```{code-cell} ipython3
def plot_results(x, y, y_predict, ax):
    ax.scatter(x, y)
    ax.plot(x, y_predict, label="fitted model", color='black')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
```

Let's now call the function on the cross-validation data.

```{code-cell} ipython3
fig, ax = plt.subplots()
plot_results(x_validate, y_validate, y_predict, ax)
plt.show()
```

### Deep learning

Now let's switch to a neural network with multiple layers.

We implement the same steps as before.

```{code-cell} ipython3
model = Sequential()
nn_model = build_nn_model(model)
```

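For reference, with the default $k=10$ and a one-dimensional input, these layers contain $(1 \times 10 + 10) + (10 \times 10 + 10) + (10 \times 10 + 10) + (10 \times 1 + 1) = 251$ trainable parameters, compared with just two for the linear model.
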
```{code-cell} ipython3
training_history = nn_model.fit(
    x, y, batch_size=x.shape[0], verbose=0,
    epochs=4000, validation_data=(x_validate, y_validate))
```

```{code-cell} ipython3
fig, ax = plt.subplots()
plot_loss_history(training_history, ax)
plt.show()
```

Here's the final MSE for the deep learning model.

```{code-cell} ipython3
print("Testing loss on the validation set.")
nn_model.evaluate(x_validate, y_validate)
```

You will notice that this loss is much lower than the one we achieved with linear regression, suggesting a better fit.

To confirm this, let's look at the fitted function.

```{code-cell} ipython3
y_predict = nn_model.predict(x_validate)
```

```{code-cell} ipython3
def plot_results(x, y, y_predict, ax):
    ax.scatter(x, y)
    ax.plot(x, y_predict, label="fitted model", color='black')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
```

```{code-cell} ipython3
fig, ax = plt.subplots()
plot_results(x_validate, y_validate, y_predict, ax)
plt.show()
```
