
bug report: empty batch when using --max-tokens < 128 #347

Closed
robertodessi opened this issue Nov 5, 2018 · 0 comments
I noticed that when setting --max-tokens < 128, an error occurs during the dummy step that runs before the actual training starts.
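For reference, a command along these lines triggers it (the dataset path and architecture here are placeholders; any --max-tokens value below 128 should reproduce it):

python train.py data-bin/iwslt14.tokenized.de-en --arch lstm --max-tokens 25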

| training on 1 GPUs
| max tokens per GPU = 25 and max sentences per GPU = None
Traceback (most recent call last):
  File "train.py", line 358, in <module>
    main(args)
  File "train.py", line 78, in main
    trainer.dummy_train_step([dummy_batch])
  File "/home/roberto.dessi/new_fairseq/fairseq/trainer.py", line 326, in dummy_train_step
    self.train_step(dummy_batch, dummy_batch=True)
  File "/home/roberto.dessi/new_fairseq/fairseq/trainer.py", line 176, in train_step
    ignore_grad
  File "/home/roberto.dessi/new_fairseq/fairseq/tasks/fairseq_task.py", line 169, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/home/roberto.dessi/.virtualenvs/work/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/roberto.dessi/new_fairseq/fairseq/criterions/cross_entropy.py", line 30, in forward
    net_output = model(**sample['net_input'])
TypeError: 'NoneType' object is not subscriptable

The error is caused by https://github.com/robertodessi/fairseq/blob/master/fairseq/data/language_pair_dataset.py#L195

It seems to me that a small value of max_tokens causes that division to evaluate to 0, which creates an empty dummy batch and raises the above error.

It works when I replace that line with:

bsz = max(num_tokens // max(src_len, tgt_len), 1)
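A minimal sketch of what goes wrong, using the variable names from language_pair_dataset.py (the concrete numbers below are illustrative):

# get_dummy_batch computes the dummy batch size roughly like this:
num_tokens = 25                # the value passed via --max-tokens
src_len, tgt_len = 128, 128    # default dummy sequence lengths

# original line: integer division yields 0 whenever num_tokens < max(src_len, tgt_len)
bsz = num_tokens // max(src_len, tgt_len)   # 25 // 128 == 0 -> empty dummy batch

# proposed fix: clamp the dummy batch to at least one sentence
bsz = max(num_tokens // max(src_len, tgt_len), 1)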