../aten/src/ATen/native/cuda/IndexKernel.cu:93: operator(): block: [8320,0,0], thread: [64,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
#9965
Comments
cc @WoosukKwon |
also cc @fyabc: if you can provide any insight on this that would be much appreciated since |
@ywang96 Yes, so it's difficult for me to debug this CUDA error due to its particularity. Specifically, I implemented get_input_vanilla_positions by following this function:
vllm/vllm/worker/model_runner.py Line 672 in b67feb1
|
@Wiselnn570 Hi, can you add Also, can you check |
Thank you; it looks like your suggestion was spot-on. Increasing the size of the cos_sin_cache resolved the issue. |
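For readers hitting the same assertion, the fix above is consistent with a lookup table that has fewer rows than the largest position id. The following is a minimal pure-Python sketch of that failure mode, not vLLM's implementation: `build_cos_sin_cache`, `lookup`, and the sizes 512 and 1024 are hypothetical names and values chosen only for illustration.

```python
import math

def build_cos_sin_cache(max_position, rotary_dim, base=10000.0):
    """Build one row of [cos..., sin...] per position id in [0, max_position)."""
    # Standard RoPE inverse frequencies: base^(-2i/d) for i in [0, d/2).
    inv_freq = [base ** (-2.0 * i / rotary_dim) for i in range(rotary_dim // 2)]
    cache = []
    for pos in range(max_position):
        angles = [pos * f for f in inv_freq]
        cache.append([math.cos(a) for a in angles] + [math.sin(a) for a in angles])
    return cache

def lookup(cache, positions):
    """Gather rows by position id, failing loudly on out-of-range ids."""
    limit = len(cache)
    bad = [p for p in positions if not 0 <= p < limit]
    if bad:
        raise IndexError(f"position {max(bad)} is outside a cache of {limit} rows")
    return [cache[p] for p in positions]

# Hypothetical sizes: a cache with 512 rows.
small_cache = build_cos_sin_cache(max_position=512, rotary_dim=8)

mrope_like = [0, 150, 300]    # small ids, similar to the ~300 maximum reported
vanilla_like = [0, 300, 600]  # ids that grow with the sequence length

rows = lookup(small_cache, mrope_like)  # every id fits in the cache

try:
    lookup(small_cache, vanilla_like)   # 600 >= 512, so this is out of bounds
    overflowed = False
except IndexError:
    overflowed = True

# Building a larger cache is the analogue of the fix reported in this thread.
larger_cache = build_cos_sin_cache(max_position=1024, rotary_dim=8)
rows_after_fix = lookup(larger_cache, vanilla_like)
```

On the GPU the same mistake surfaces as a device-side assertion rather than a Python exception, so a host-side range check like the one in `lookup` can make the cause easier to spot.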
I recently ran into an issue while modifying the positional encoding in the mrope_input_positions section of the Qwen2-VL code, and I have not been able to resolve it. In short, I want to explore how the Qwen2-VL 7B model performs when extrapolating to a 60k context, using video data for testing. I tried replacing this section (
vllm/vllm/worker/model_runner.py
Line 672 in 3bb4bef
I have already tested the original M-RoPE, which produces correct output with a 60k context, and its maximum mrope_input_positions value is around 300. So I am wondering whether the position values are now too large and run past the bounds of whatever they index into. How should I modify the code to support vanilla-RoPE (or perhaps some other 3D positional encoding where the position values are quite large) for evaluation? Thanks!
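To make the gap between the two schemes concrete, here is a rough sketch that compares the largest position id each one produces for the same video. It is a simplified model, not the Qwen2-VL or vLLM code: the helper names and the video shape (300 frames on a 14 x 14 patch grid) are assumptions picked so that the token count lands near 60k.

```python
def vanilla_positions(num_tokens):
    """Vanilla RoPE: one scalar id per token, counting up from 0."""
    return list(range(num_tokens))

def mrope_like_video_positions(frames, grid_h, grid_w):
    """Simplified 3D ids: one (t, h, w) triple per vision token."""
    return [
        (t, h, w)
        for t in range(frames)
        for h in range(grid_h)
        for w in range(grid_w)
    ]

# Hypothetical video shape, chosen only for illustration.
frames, grid_h, grid_w = 300, 14, 14

triples = mrope_like_video_positions(frames, grid_h, grid_w)
num_tokens = len(triples)                             # 58800 tokens

max_mrope = max(max(p) for p in triples)              # 299
max_vanilla = max(vanilla_positions(num_tokens))      # 58799
```

Under these assumptions the 3D scheme never asks for an id above 299, while the vanilla scheme asks for ids close to the token count, so any table indexed by position id has to be sized for the latter.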
p.s. I noticed that this function (
vllm/vllm/worker/model_runner.py
Line 637 in 3bb4bef
using environment
Name: vllm
Version: 0.6.3.post2.dev171+g890ca360
Originally posted by @Wiselnn570 in #9875 (comment)