Skip to content

cuDNN error: CUDNN_STATUS_EXECUTION_FAILED #35

@Tclz

Description

@Tclz

Hi, i meet a problem like this:
File "train.py", line 131, in
output_dict = model(**batch)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "../models/multiatt/model.py", line 157, in forward
obj_reps = self.detector(images=images, boxes=boxes, box_mask=box_mask, classes=objects, segms=segms)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "../utils/detector.py", line 111, in forward
img_feats = self.backbone(images)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torchvision/models/resnet.py", line 98, in forward
out = self.conv2(out)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/r2c_1/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

The environment i use:
python3.6.6
cuda9.0.176
cudnn7.5.1
torch1.1.0
torchvision0.3.0
and i have tried several environment configs(like cudnn7.4, torch1.0, etc) but none of them works.
what should i do?
thank you :)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions