Hello,

I am trying to use OPUS-MT together with DeepSpeed compression (examples can be found at https://github.com/microsoft/DeepSpeedExamples under `model_compression`). I am running into an issue where the exact same code works if I use `t5-small`, but if I switch to `Helsinki-NLP/opus-mt-zh-en` it no longer works. The error is:
```
Traceback (most recent call last):
  File "translation/run_translation.py", line 686, in <module>
    main()
  File "translation/run_translation.py", line 603, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/trainer.py", line 1504, in train
    ignore_keys_for_eval=ignore_keys_for_eval,
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/trainer.py", line 1742, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/trainer.py", line 2486, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/trainer.py", line 2518, in compute_loss
    outputs = model(**inputs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/models/marian/modeling_marian.py", line 1455, in forward
    return_dict=return_dict,
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/models/marian/modeling_marian.py", line 1229, in forward
    return_dict=return_dict,
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/transformers/models/marian/modeling_marian.py", line 751, in forward
    embed_pos = self.embed_positions(input_shape)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/deepspeed/compression/basic_layer.py", line 130, in forward
    self.sparse)
  File "/home/CORP/r.lenain/miniconda3/envs/mt_opus-mt/lib/python3.7/site-packages/torch/nn/functional.py", line 2199, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not torch.Size
```
Has anyone ever encountered this issue?
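In case it helps narrow things down, here is a minimal sketch of what the last frames of the traceback seem to show. Marian's `MarianSinusoidalPositionalEmbedding` subclasses `nn.Embedding` but overrides `forward` to take the *shape* of `input_ids` (a `torch.Size`) and build the position indices itself. If the compression pass swaps every `nn.Embedding` for a layer whose `forward` passes its argument straight to `F.embedding` (which is what the `deepspeed/compression/basic_layer.py` frame suggests), the `torch.Size` lands in `torch.embedding` and raises exactly this `TypeError`. Note that `CompressedEmbedding` below is a hypothetical stand-in written only to illustrate the mismatch, not DeepSpeed's actual class:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MarianSinusoidalPositionalEmbedding(nn.Embedding):
    # Simplified from transformers/models/marian/modeling_marian.py:
    # forward receives the *shape* of input_ids, not token indices.
    def forward(self, input_ids_shape: torch.Size) -> torch.Tensor:
        bsz, seq_len = input_ids_shape[:2]
        positions = torch.arange(seq_len, dtype=torch.long, device=self.weight.device)
        return super().forward(positions)


class CompressedEmbedding(nn.Embedding):
    # Hypothetical stand-in for the swapped-in layer: it forwards its
    # argument straight to F.embedding, which only accepts an index tensor.
    def forward(self, input):
        return F.embedding(input, self.weight, self.padding_idx, self.max_norm,
                           self.norm_type, self.scale_grad_by_freq, self.sparse)


shape = torch.Size([2, 10])                        # (batch, seq_len), as Marian passes it
ok = MarianSinusoidalPositionalEmbedding(512, 64)
print(ok(shape).shape)                             # works: torch.Size([10, 64])

swapped = CompressedEmbedding(512, 64)             # what the replacement appears to produce
swapped(shape)  # TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not torch.Size
```

If this reading is correct, it would also explain why `t5-small` is unaffected: its embedding layers are always called with real index tensors, so a replacement with the standard `forward` signature behaves the same.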