The first two exercises handle training neural networks on GPUs instead of CPUs. Even though this is extremely important for reducing the training time, we postponed it to the exercises because some course participants may not have a compatible GPU for training. If anyone is not able to do these two exercises, we apologize.
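Whether a CUDA-compatible GPU is available can be checked directly from Julia; a small sketch (not part of the original text):

```julia
using CUDA

# returns true when a CUDA-capable GPU and a working driver are available
CUDA.functional()
```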
```@raw html
<div class = "exercise-body">
<header class = "exercise-header">Exercise:</header><p>
```
While most computer operations are performed on CPUs (central processing units), neural networks are trained on other hardware such as GPUs (graphics processing units) or on specialized hardware such as TPUs (tensor processing units).
To use GPUs, include the packages Flux and CUDA. Then generate a random matrix ``A\in \mathbb{R}^{100\times 100}`` and a random vector ``b\in \mathbb{R}^{100}``. They will be stored in the memory (RAM), and the computation will be performed on the CPU. To move them to the GPU memory and allow computations on the GPU, use ```gpu(A)``` or the more commonly used ```A |> gpu```.
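A minimal sketch of this setup (the names `A_g` and `b_g` are our choice):

```julia
using Flux, CUDA

A = randn(100, 100)   # stored in RAM; operations run on the CPU
b = randn(100)

A_g = A |> gpu        # moved to GPU memory as a CuArray
b_g = gpu(b)          # the same conversion written as a plain function call
```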
Investigate how long the multiplication ``Ab`` takes if both objects are on the CPU, if both are on the GPU, and if each is stored on a different device. Check that the multiplications result in the same vector.
```@raw html
</p></div>
<details class = "solution-body">
<summary class = "solution-header">Solution:</summary><p>
```

To test the time, we measure the time for multiplication; the first run prints:

```julia
0.806913 seconds (419.70 k allocations: 22.046 MiB)
0.709140 seconds (720.01 k allocations: 34.860 MiB, 1.53% gc time)
```
We see that all three times are different. Can we infer anything from it? No! The problem is that during the first call to a function, some compilation usually takes place. We should always compare only the second measurement.
```julia
@time A*b;
@time A_g*b_g;
@time A*b_g;   # mixed-device multiplication (this line is a reconstruction; the excerpt omits it)
```

```julia
0.000154 seconds (11 allocations: 272 bytes)
0.475280 seconds (10.20 k allocations: 957.125 KiB)
```
We conclude that while the computation on CPU and GPU takes approximately the same time, it takes much longer when using mixed types.
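One caveat not mentioned in the original text: GPU operations in CUDA.jl are asynchronous, so `@time` may return after the kernel is launched but before it finishes. Wrapping the call in `CUDA.@sync` gives a more faithful measurement; the BenchmarkTools package and its `@btime` macro additionally handle warm-up and multiple samples automatically.

```julia
using CUDA

# CUDA.@sync blocks until all queued GPU work has finished, so the
# measured time covers the multiplication itself, not just its launch
@time CUDA.@sync A_g * b_g;
```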
To compare the results, the first idea would be to run

```julia
using LinearAlgebra

# the exact comparison is not shown in this excerpt; a natural choice is the
# norm of the difference, after moving the GPU result back to the CPU
norm(A*b - cpu(A_g*b_g))
```

The difference is small but not zero. Checking the element types, we realize that one of the arrays is stored in ```Float64``` while the second one in ```Float32```: the ```gpu``` function converts arrays to ```Float32``` by default.
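A quick way to check the element types (a small sketch using the arrays from above):

```julia
eltype(A)     # Float64: CPU arrays use double precision by default
eltype(A_g)   # Float32: gpu converts to single precision, which is faster on GPUs
```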
The previous exercise did not show any differences when performing a matrix-vector multiplication. The probable reason was that the running times were too short. The following exercise shows the time difference when applied to a larger problem.
```julia
m = Chain(
    # ... (the network layers are not shown in this excerpt)
)

file_name = joinpath("data", "mnist.bson")
train_or_load!(file_name, m)

m_g = m |> gpu
X_test_g = X_test |> gpu
```
Now we can measure the evaluation time. Remember that we need to compile all the functions by evaluating at least one sample before doing so.
```julia
m(X_test[:,:,:,1:1])
m_g(X_test_g[:,:,:,1:1])
```
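The measurement itself is not shown in this excerpt; it would look along these lines:

```julia
# compare evaluation time on the whole test set, CPU versus GPU
@time m(X_test);
@time m_g(X_test_g);
```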
For the next exercise, we again create the network and load the trained parameters:

```julia
m = Chain(
    # ... (the same layers as before; not shown in this excerpt)
)

file_name = joinpath("data", "mnist.bson")
train_or_load!(file_name, m)
```
When creating a table, we specify that its entries are ```Int```. We save the predictions ```y_hat``` and the labels ```y```. Since we do not use the second argument to ```onecold```, the entries of ```y_hat``` and ```y``` are between 1 and 10. Then we run a for loop over all misclassified samples and increment the corresponding error counts.
```@example gpuu
# reconstructed from the description above; the original code is not shown in this excerpt
y_hat = onecold(m(X_test))       # predicted classes in the 1:10 format
y = onecold(y_test)              # true classes in the 1:10 format (y_test is the assumed label array)

errors = zeros(Int, 10, 10)      # assumed layout: errors[true class, predicted class]
for k in findall(y_hat .!= y)
    errors[y[k], y_hat[k]] += 1
end
```

Plot all images which are ``9`` but were classified as ``7``.

```@raw html
<details class = "solution-body">
<summary class = "solution-header">Solution:</summary><p>
```
To plot all these misclassified images, we find their indices and use the function `imageplot`. Since the entries of `y` are stored in the 1:10 format, we need to specify `classes`.
```@example gpuu
using ImageInspector

# the digits 9 and 7 correspond to indices 10 and 8 in the 1:10 format
ii = findall((y .== 9 + 1) .& (y_hat .== 7 + 1))

# the exact plotting call is not shown in this excerpt; we assume the basic
# imageplot(images, indices) form of the ImageInspector function named above
imageplot(X_test, ii)
```
Before plotting, we perform a for loop over the digits. Then ```onecold(y_train, classes) .== i``` creates a ```BitArray``` with ones where the condition is satisfied and zeros elsewhere. Then ```findall(???)``` selects all ones, and ```???[1:5]``` finds the first five indices. Since we need to plot the original image, and the images after the second and fourth layer (there is always a convolutional layer before the pooling layer), we save these values into ```z1```, ```z2``` and ```z3```. Then we need to access the desired channels and plot them via the `ImageInspector` package.
```@example gpuu
using ImageInspector

classes = 0:9
plts = []
for i in classes
    jj = 1:5
    ii = findall(onecold(y_train, classes) .== i)[jj]

    z1 = X_train[:,:,:,ii]
    z2 = m[1:2](X_train[:,:,:,ii])
    z3 = m[1:4](X_train[:,:,:,ii])

    # the rest of the loop body is not shown in this excerpt; the previous
    # version plotted the first and the last channel of z1, z2 and z3, which
    # we mirror here with imageplot (call signature assumed)
    push!(plts, imageplot(z1[:,:,1,:]))
    push!(plts, imageplot(z2[:,:,1,:]))
    push!(plts, imageplot(z2[:,:,end,:]))
    push!(plts, imageplot(z3[:,:,1,:]))
    push!(plts, imageplot(z3[:,:,end,:]))
end
```