Description
I'm trying to run the Deep Learning demo notebook, and training is taking a very long time. It also doesn't look like it's using the GPU. I'm on an Amazon EC2 g2.2xlarge with an NVIDIA Corporation GK104GL [GRID K520] (rev a1). I tried some of the solutions from karpathy/char-rnn#89, like
require 'cunn'
require 'cutorch'
in the interpreter, and running th -l cutorch and th -l cunn from the command line. However, when I run the line
trainer:train(trainset)
it just sits there in progress and doesn't go anywhere. I also checked GPU usage with nvidia-smi, and it looks like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 361.77 Driver Version: 361.77 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GRID K520 Off | 0000:00:03.0 Off | N/A |
| N/A 31C P8 26W / 125W | 121MiB / 4036MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7379 C /home/ubuntu/torch/install/bin/luajit 119MiB |
+-----------------------------------------------------------------------------+
Memory usage jumps and the luajit PID appears right after require 'cutorch', but memory usage never increases after that, and GPU-Util sits at 0%. I have CUDA installed; nvcc --version gives:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
This is running on Ubuntu 16.04. I verified that the CUDA samples work, and CUDA isn't giving any errors.
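To double-check that cutorch actually sees the card, I can also query it directly from the th interpreter; this is just a sanity check, not a fix:

```lua
require 'cutorch'

-- How many GPUs cutorch can see (should be 1 on a g2.2xlarge)
print(cutorch.getDeviceCount())

-- Details (name, memory, compute capability) for device 1
print(cutorch.getDeviceProperties(1))
```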
Any ideas why it wouldn't be using the GPU?
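From reading the char-rnn thread, my understanding is that requiring cunn/cutorch alone isn't enough; the network, criterion, and data all have to be converted to CUDA tensors explicitly, or training silently stays on the CPU. A minimal sketch of what I think should happen, assuming the demo's net, criterion, and trainset names (they may differ in the notebook):

```lua
require 'nn'
require 'cunn'
require 'cutorch'

-- Move the model and the loss function onto the GPU
net = net:cuda()
criterion = criterion:cuda()

-- The training data must also be converted, or each batch stays on the CPU
trainset.data = trainset.data:cuda()
trainset.label = trainset.label:cuda()

-- Train as before; nvidia-smi should now show nonzero GPU-Util
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5
trainer:train(trainset)
```

Is a conversion step like this what's missing, or is something else keeping the work on the CPU?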