memory leak when running mistralai/Mistral-7B-Instruct-v0.1 #1321
Hi @captify-sivakhno, thanks for reporting the bug. Could you share your …

@WoosukKwon please find attached the prompt - it's a simple summary prompt …

I have the same issue running inference on multiple GPUs.
The issue seems to occur when the prompt is too long; I estimate the input_id length threshold is somewhere above 2048.
@WoosukKwon Hi, is there any progress on this issue? |
Closing in preference to #1556 |
Watch out - Mistral v0.1 has a sliding window of 4096. If your text is longer than that, it will run into #1556 - either patch the config.json (set `"sliding_window": null`) or make sure you stay below 4k.
@manzke `Mistral-7B-Instruct-v0.3` also has this problem, and editing the config.json (set `"sliding_window": null`) doesn't help.
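For anyone wanting to script the workaround above, here is a minimal sketch of the config.json patch. The file path is hypothetical (point it at the `config.json` in your local model snapshot); this only edits the JSON on disk, it does not change vLLM itself.

```python
import json
from pathlib import Path

def disable_sliding_window(config_path: str) -> None:
    """Set "sliding_window" to null in a Hugging Face model config.json."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    # null in JSON (None in Python) disables the sliding-window attention setting
    config["sliding_window"] = None
    path.write_text(json.dumps(config, indent=2))

# Hypothetical usage (adjust the path to your model snapshot):
# disable_sliding_window("/models/Mistral-7B-Instruct-v0.1/config.json")
```

Note that, per the comment above, this workaround reportedly does not help for `Mistral-7B-Instruct-v0.3`.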
I hit an error suggesting a memory leak when running:

`ValueError: Double free! PhysicalTokenBlock(device=Device.GPU, block_number=415, ref_count=0) is already freed.`

Any suggestions on what to look into would be most appreciated!
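To make the error message above concrete: it signals a violated reference-counting invariant in the KV-cache block allocator. The following is a simplified sketch (class and field names modeled on the error text, not vLLM's actual implementation) showing the invariant that trips the `Double free!` check: a block whose `ref_count` is already 0 must never be freed again.

```python
class PhysicalTokenBlock:
    """A GPU KV-cache block tracked by a reference count (simplified sketch)."""

    def __init__(self, block_number: int) -> None:
        self.block_number = block_number
        self.ref_count = 0


class BlockAllocator:
    """Minimal ref-counted allocator illustrating the double-free check."""

    def __init__(self, num_blocks: int) -> None:
        self.free_blocks = [PhysicalTokenBlock(i) for i in range(num_blocks)]

    def allocate(self) -> PhysicalTokenBlock:
        block = self.free_blocks.pop()
        block.ref_count = 1
        return block

    def free(self, block: PhysicalTokenBlock) -> None:
        if block.ref_count == 0:
            # The condition behind the reported error: freeing a block
            # whose reference count has already dropped to zero.
            raise ValueError(
                f"Double free! block {block.block_number} is already freed."
            )
        block.ref_count -= 1
        if block.ref_count == 0:
            self.free_blocks.append(block)
```

So the traceback means some code path released the same block twice, i.e. the scheduler's bookkeeping and the allocator's ref counts went out of sync, rather than a classic C-style memory leak.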