

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)` #55

Open
acerhp opened this issue Mar 31, 2022 · 0 comments



acerhp commented Mar 31, 2022

I am using the Docker image provided by the authors.
The GPU is a Tesla P100-PCIE 16GB.
Running `./scripts/text2image.sh --debug` fails with an error.
The error output is as follows:
```
Generate Samples
WARNING: No training data specified
using world size: 1 and model-parallel size: 1

using dynamic loss scaling
initializing model parallel with size 1
initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234
padded vocab (size: 58219) with 21 dummy tokens (new size: 58240)
prepare tokenizer done
building CogView2 model ...
number of parameters on model parallel rank 0: 3928849920
current device: 1
Load model file pretrained/cogview/cogview-base/142000/mp_rank_00_model_states.pt
Working on No. 0 on 0...
show raw text: 一只可爱的小猫。
Traceback (most recent call last):
  File "generate_samples.py", line 329, in <module>
    main()
  File "generate_samples.py", line 326, in main
    generate_images_continually(model, args)
  File "generate_samples.py", line 221, in generate_images_continually
    generate_images_once(model, args, raw_text, seq, num=args.batch_size, output_path=output_path)
  File "generate_samples.py", line 166, in generate_images_once
    output_tokens_list.append(filling_sequence(model, seq.clone(), args))
  File "/root/cogview/generation/sampling.py", line 128, in filling_sequence
    logits, *mems = model(tokens, position_ids, attention_mask, txt_indices_bool, img_indices_bool, is_sparse=args.is_sparse, *mems)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/fp16/fp16.py", line 65, in forward
    return fp16_to_fp32(self.module(*(fp32_to_fp16(inputs)), **kwargs))
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/model/gpt2_modeling.py", line 112, in forward
    transformer_output = self.transformer(embeddings, position_ids, attention_mask, txt_indices_bool, img_indices_bool, is_sparse, *mems)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/mpu/sparse_transformer.py", line 604, in forward
    hidden_states = layer(*args, mem=mem_i)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/mpu/sparse_transformer.py", line 322, in forward
    attention_output = self.attention(layernorm_output1, ltor_mask, pivot_idx, is_sparse, mem)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/mpu/sparse_transformer.py", line 166, in forward
    output = self.dense(context_layer)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/cogview/mpu/layers.py", line 319, in forward
    output_parallel = F.linear(input_parallel, self.weight)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)
```
I hope someone can help me with this. Thanks!
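For context while debugging: the failing call is an ordinary half-precision `F.linear` (which cuBLAS services via `cublasGemmEx`), and the Tesla P100 is compute capability 6.0 with no Tensor Cores. A minimal, hedged sketch (assuming only that PyTorch is installed) to check whether fp16 GEMM works on the device at all, independent of CogView; on a machine without CUDA it falls back to float32 on CPU:

```python
import torch
import torch.nn.functional as F

# Pick the GPU if available, otherwise run the same shapes on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# fp16 mirrors the failing path; CPU fallback uses float32.
dtype = torch.half if device.type == "cuda" else torch.float

# Shapes mimic a linear layer: (batch, in_features) x (out_features, in_features).
x = torch.randn(64, 128, device=device, dtype=dtype)
w = torch.randn(256, 128, device=device, dtype=dtype)

# If this raises CUBLAS_STATUS_EXECUTION_FAILED on the P100, the problem is
# the device/driver/cuBLAS fp16 GEMM path rather than the CogView code.
out = F.linear(x, w)
print(out.shape)
```

If this isolated matmul also fails, the issue likely lies in the CUDA/cuBLAS build inside the Docker image relative to the P100, not in the model code.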
