AssertionError: Prefix caching is currently not supported with sliding window attention #3355
Comments
But when I just predict two examples, it can finish.
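For context, a minimal sketch of the kind of offline-inference call that appears to hit this assertion; the model name, prompts, batch size, and sampling settings are assumptions, not taken from the issue:

```python
# Hypothetical reproduction sketch; model name, prompts, and batch size are assumptions.
from vllm import LLM, SamplingParams

prompts = [f"Question {i}: what is prefix caching?" for i in range(100)]
sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

# With prefix caching enabled and a model whose config still declares a
# sliding window (e.g. an unpatched Qwen1.5 checkpoint), vLLM raises:
# "AssertionError: Prefix caching is currently not supported with sliding window attention"
llm = LLM(model="Qwen/Qwen1.5-7B-Chat", enable_prefix_caching=True)
outputs = llm.generate(prompts, sampling_params)
```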
I ran into the same problem. I also want to confirm two questions.
My use case has sped up by more than four times; my batch size is 100.
Yesterday I encountered and resolved the same problem as well. However, I only submitted the PR today, not realizing someone had already fixed it. This community is incredibly active.
Thanks for your reply.
Just add more data to the prompts list. The first prompt in the list will be quite slow, but subsequent ones will be much faster, which is not entirely consistent with the description in the official example's code comments. It seems the warm-up during the first run did not take effect as illustrated in the example.
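To make that observation concrete, here is a rough timing sketch along the lines of the comment above; the model name, shared prefix, prompt count, and sampling settings are all assumptions:

```python
# Illustrative timing sketch; model name, shared prefix, and prompt counts are assumptions.
import time
from vllm import LLM, SamplingParams

shared_prefix = "You are a helpful assistant. Answer concisely.\n"
prompts = [shared_prefix + f"Question {i}: summarize prefix caching in one line." for i in range(20)]
params = SamplingParams(temperature=0.0, max_tokens=32)

llm = LLM(model="Qwen/Qwen1.5-7B-Chat", enable_prefix_caching=True)

# The first request pays the full prefill cost and populates the prefix cache.
t0 = time.time()
llm.generate(prompts[:1], params)
print(f"first prompt: {time.time() - t0:.2f}s")

# Subsequent requests can reuse cached KV blocks for the shared prefix,
# so they are expected to run noticeably faster.
t0 = time.time()
llm.generate(prompts[1:], params)
print(f"remaining {len(prompts) - 1} prompts: {time.time() - t0:.2f}s")
```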
Hi @Chenghao-Jia, @Limingxing00, @a516072575. FYI, as a temporary fix, we can modify the config.json of Qwen1.5 and set:
{
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 11008,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 32,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": null, // replace with null here
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.37.0",
"use_cache": true,
"use_sliding_window": false, // turn off sliding_window
"vocab_size": 151936
}
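If it helps, here is a hedged sketch of one way to locate and patch that config.json programmatically, assuming the checkpoint was downloaded through the Hugging Face Hub cache (the model ID is a placeholder):

```python
# Hypothetical patch helper; assumes the checkpoint lives in the Hugging Face Hub cache.
import json
import os
from huggingface_hub import snapshot_download

model_id = "Qwen/Qwen1.5-7B-Chat"  # placeholder model ID
local_dir = snapshot_download(model_id)  # path of the cached snapshot
config_path = os.path.join(local_dir, "config.json")

with open(config_path) as f:
    config = json.load(f)

# Apply the workaround described above.
config["sliding_window"] = None
config["use_sliding_window"] = False

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

print(f"Patched {config_path}")
```

If the model was downloaded manually instead, the same two keys can simply be edited in the config.json inside the local model directory.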
Hello @chenxu2048, what's the path of this config.json file?