Describe the bug
I get an error when I run the training step for instruction fine-tuning in this notebook. The training job starts properly, but after about 10 minutes it fails and raises:
ErrorMessage "raise RuntimeError( RuntimeError: Could not find response key [1, 32002] in token IDs tensor([ 1, 20811, 349, ..., 302, 15637, 266])
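For context, the message reads as if the training code searches each tokenized example for the token ID sequence [1, 32002] and raises when it is absent. The snippet below is only a sketch of that kind of check, not the actual JumpStart training code: `find_response_key` is a hypothetical helper, and the example token lists are made up from the IDs visible in the error. One possible cause (an assumption, not confirmed by the logs) is that an example gets truncated before the response marker, so the search finds nothing.

```python
from typing import Optional, Sequence

# Response key taken from the error message above.
RESPONSE_KEY = [1, 32002]


def find_response_key(
    token_ids: Sequence[int], response_key: Sequence[int]
) -> Optional[int]:
    """Return the index where response_key starts in token_ids, or None.

    Hypothetical helper: a plain subsequence search, written only to
    illustrate the kind of check the error message hints at.
    """
    n, k = len(token_ids), len(response_key)
    for start in range(n - k + 1):
        if list(token_ids[start:start + k]) == list(response_key):
            return start
    return None


# Made-up token IDs (not real training data): the response key sits at index 3.
full_example = [1, 20811, 349, 1, 32002, 302, 15637, 266]

# The same example cut off before the response key, as truncation to a
# maximum sequence length might do (assumption).
truncated_example = full_example[:3]

print(find_response_key(full_example, RESPONSE_KEY))       # 3
print(find_response_key(truncated_example, RESPONSE_KEY))  # None
```

Running a check like this over the tokenized training set could show which examples lack the response key, assuming the tokenizer and prompt template match the ones used by the training job.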
To reproduce
Upload the notebook to a SageMaker notebook
Run every cell. The error appears when running the instruction fine-tuning training job (section 1.3, Starting Training)
Logs
Attaching some screenshots of the logs
Any idea how to fix this?
louishourcade changed the title from "[Bug Report]" to "[Bug Report] RuntimeError when running instruction fine-tuning on mistral 7b, Sagemaker Jumpstart" on May 3, 2024
Link to the notebook
https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/mistral-7b-instruction-domain-adaptation-finetuning.ipynb