[Solved] ValueError: not enough values to unpack (expected 2, got 1) [BUG in Model Inference Stage] #2

Open
M7un opened this issue Jul 1, 2022 · 0 comments
Labels
good first issue Good for newcomers

Comments

Owner

M7un commented Jul 1, 2022

When the model makes a prediction, the following error occurs:

outputs = model(texts)
......
batch_size, seq_length = input_shape
ValueError: not enough values to unpack (expected 2, got 1)

  • The data fed into the model appears to have the wrong shape. More precisely, one particular batch has the wrong number of dimensions.
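To illustrate how this message arises, here is a minimal sketch that uses plain Python tuples in place of tensor shapes. The helper name unpack_shape and the example shapes are hypothetical and not taken from the actual run; the assumption is that the failing batch reaches the model with a 1-D shape.

```python
def unpack_shape(input_shape):
    # Mirrors the failing line from the traceback.
    batch_size, seq_length = input_shape
    return batch_size, seq_length


# A normal batch has a 2-D shape: (batch_size, seq_length).
print(unpack_shape((32, 128)))

# A 1-D shape (for example, from an empty batch) cannot be unpacked into two names.
try:
    unpack_shape((0,))
except ValueError as err:
    message = str(err)
    print(message)
```

Calling unsqueeze(0) on the tokenizer output does not help in that case, because the faulty shape is produced later, when the batch is assembled.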

Some sources recommend calling unsqueeze(0) on the ids and masks returned by the BERT tokenizer to add a dimension, but that does not work here.
Eventually I found that the bug is in the DatasetIterator stage: if your dataset size is an exact integer multiple of the batch size, the following code in class DatasetIterator(object) needs to be modified.

def __next__(self):
    if self.residue and self.index == self.n_batches:
        batches = self.dataset[self.index * self.batch_size : len(self.dataset)]
        self.index += 1
        batches = self._to_tensor(batches)
        return batches

    elif self.index > self.n_batches:
        self.index = 0
        raise StopIteration

    else:
        batches = self.dataset[self.index * self.batch_size : (self.index + 1) * self.batch_size]
        self.index += 1
        batches = self._to_tensor(batches)
        return batches
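  • Line 8, elif self.index > self.n_batches, should be changed to elif self.index >= self.n_batches. Otherwise, when the dataset size is an exact multiple of the batch size, the iterator falls through to the else branch once more at self.index == self.n_batches, and that last batch is an empty tensor.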
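A minimal sketch of the iterator with this change applied, runnable without PyTorch. The constructor is an assumption, since the issue only shows __next__: n_batches is taken to be the number of full batches and residue to flag a smaller final batch. _to_tensor is replaced by a pass-through stand-in.

```python
class DatasetIterator(object):
    """Sketch of the batch iterator with the corrected stop condition."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size
        # Assumption: n_batches counts only the full batches.
        self.n_batches = len(dataset) // batch_size
        # Assumption: residue marks a final batch smaller than batch_size.
        self.residue = len(dataset) % batch_size != 0
        self.index = 0

    def _to_tensor(self, batches):
        # Stand-in: the real method builds tensors; here the list passes through.
        return batches

    def __iter__(self):
        return self

    def __next__(self):
        if self.residue and self.index == self.n_batches:
            # Final, smaller batch holding the leftover samples.
            batches = self.dataset[self.index * self.batch_size : len(self.dataset)]
            self.index += 1
            return self._to_tensor(batches)
        elif self.index >= self.n_batches:  # was: self.index > self.n_batches
            self.index = 0
            raise StopIteration
        else:
            start = self.index * self.batch_size
            batches = self.dataset[start : start + self.batch_size]
            self.index += 1
            return self._to_tensor(batches)


# Dataset size is an exact multiple of the batch size: no empty trailing batch.
even_batches = list(DatasetIterator(list(range(6)), batch_size=2))
# Dataset size leaves a remainder: the leftover sample forms the last batch.
odd_batches = list(DatasetIterator(list(range(7)), batch_size=2))
print(even_batches)
print(odd_batches)
```

With the original > comparison, the first case would yield a fourth batch sliced from past the end of the dataset, which is empty.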

There is a lot of code on GitHub for dataset processing before model training; it may be worth reviewing these basic utilities if you have time :)

@M7un M7un added the good first issue Good for newcomers label Jul 2, 2022