RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)` #55

@acerhp

Description

I am using the Docker image provided by the authors.
The GPU is a Tesla P100-PCIE 16GB.
Running `./scripts/text2image.sh --debug` fails with the following error:
```
Generate Samples
WARNING: No training data specified
using world size: 1 and model-parallel size: 1

using dynamic loss scaling
initializing model parallel with size 1
initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234
padded vocab (size: 58219) with 21 dummy tokens (new size: 58240)
prepare tokenizer done
building CogView2 model ...
number of parameters on model parallel rank 0: 3928849920
current device: 1
Load model file pretrained/cogview/cogview-base/142000/mp_rank_00_model_states.pt
Working on No. 0 on 0...
show raw text: 一只可爱的小猫。
Traceback (most recent call last):
File "generate_samples.py", line 329, in
main()
File "generate_samples.py", line 326, in main
generate_images_continually(model, args)
File "generate_samples.py", line 221, in generate_images_continually
generate_images_once(model, args, raw_text, seq, num=args.batch_size, output_path=output_path)
File "generate_samples.py", line 166, in generate_images_once
output_tokens_list.append(filling_sequence(model, seq.clone(), args))
File "/root/cogview/generation/sampling.py", line 128, in filling_sequence
logits, *mems = model(tokens, position_ids, attention_mask, txt_indices_bool, img_indices_bool, is_sparse=args.is_sparse, *mems)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/fp16/fp16.py", line 65, in forward
return fp16_to_fp32(self.module(*(fp32_to_fp16(inputs)), **kwargs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/model/gpt2_modeling.py", line 112, in forward
transformer_output = self.transformer(embeddings, position_ids, attention_mask, txt_indices_bool, img_indices_bool, is_sparse, *mems)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/mpu/sparse_transformer.py", line 604, in forward
hidden_states = layer(*args, mem=mem_i)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/mpu/sparse_transformer.py", line 322, in forward
attention_output = self.attention(layernorm_output1, ltor_mask, pivot_idx, is_sparse, mem)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/mpu/sparse_transformer.py", line 166, in forward
output = self.dense(context_layer)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/cogview/mpu/layers.py", line 319, in forward
output_parallel = F.linear(input_parallel, self.weight)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)
```
I hope someone can help me with this problem. Thank you.
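To narrow things down, here is a minimal check I can run inside the same container to see whether a half-precision `F.linear` (the call that fails in the traceback) works at all on this GPU, outside of CogView. This is a hypothetical diagnostic sketch, not part of the CogView scripts; the function name `check_fp16_linear` and the tensor shapes are made up for illustration:

```python
# Minimal repro of the failing call: an FP16 linear layer on the GPU.
# If this also raises CUBLAS_STATUS_EXECUTION_FAILED, the problem is the
# driver/CUDA/cuBLAS setup or the GPU itself, not the CogView code.
import torch
import torch.nn.functional as F


def check_fp16_linear(device: str = "cuda:0") -> str:
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    # Small half-precision GEMM, same dtype pattern as the failing cublasGemmEx call
    x = torch.randn(4, 512, dtype=torch.float16, device=device)
    w = torch.randn(1024, 512, dtype=torch.float16, device=device)
    try:
        y = F.linear(x, w)
        torch.cuda.synchronize()  # force the kernel to actually run
        return f"fp16 GEMM ok, output shape {tuple(y.shape)}"
    except RuntimeError as e:
        return f"fp16 GEMM failed: {e}"


if __name__ == "__main__":
    print(check_fp16_linear())
```

One thing I noticed in the log is `current device: 1` while the script otherwise assumes a single GPU; running the check on each visible device (or restricting with `CUDA_VISIBLE_DEVICES=0`) might also be informative.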
