can model Qwen/Qwen-VL-Chat work well? #962
same question here
Same issue. I think this is because
same issue
Has this been resolved yet?
same issue
Please stop saying "same issue"; just react to the original message to show your support.
For text-only inputs, we can run the model with this patch: #5710.
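For reference, a minimal text-only invocation might look like the sketch below, assuming a vLLM build that includes that patch; the prompt and sampling settings are illustrative, not taken from the patch itself.

```python
from vllm import LLM, SamplingParams

# Minimal sketch: text-only inference with Qwen-VL-Chat, assuming a vLLM
# build that includes the patch from #5710 (illustrative settings).
llm = LLM(model="Qwen/Qwen-VL-Chat", trust_remote_code=True)
params = SamplingParams(temperature=0, max_tokens=128)

outputs = llm.generate(["Hello! Who are you?"], params)
print(outputs[0].outputs[0].text)
```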
I am looking into adding support for image inputs for Qwen-VL/Qwen-VL-Chat 😄
@alex-jw-brooks thanks for the upcoming contribution! When you have updates, please post them here, as I've closed the other issues as duplicates.
Great, thank you @hmellor - it's almost ready. I've been able to load and get reasonable-looking output from qwen-vl/qwen-vl-chat; I just need to work through some cleanup, small fixes, and tests. I will open the PR in the next couple of days 🤞
PR #8029 |
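Once image inputs land, usage should presumably follow vLLM's standard multi-modal API. A rough sketch, with the caveat that the Qwen-VL prompt format and the image path here are assumptions (check vLLM's vision-language examples for the authoritative form):

```python
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen-VL-Chat", trust_remote_code=True)

# Qwen-VL marks image slots with <img></img> tags in the prompt; this exact
# format is an assumption -- see vLLM's vision-language examples.
prompt = "Picture 1: <img></img>\nWhat is shown in this image?"
image = Image.open("example.jpg")  # hypothetical local image file

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```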
When I use Qwen/Qwen-VL-Chat, it throws an error and I do not know why:
```
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    model = LLM(model=model_path, tokenizer=model_path, tokenizer_mode='slow', tensor_parallel_size=1, trust_remote_code=True)
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 66, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 220, in from_engine_args
    engine = cls(*engine_configs,
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 101, in __init__
    self._init_workers(distributed_init_method)
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 133, in _init_workers
    self._run_workers(
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 470, in _run_workers
    output = executor(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/worker/worker.py", line 67, in init_model
    self.model = get_model(self.model_config)
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 57, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
  File "/usr/local/miniconda3/lib/python3.8/site-packages/vllm/model_executor/models/qwen.py", line 308, in load_weights
    param = state_dict[name]
KeyError: 'transformer.visual.positional_embedding'
```
The code is:

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
import time

model_path = "Qwen/Qwen-VL-Chat"

# Loading the model is where the KeyError above is raised.
model = LLM(
    model=model_path,
    tokenizer=model_path,
    tokenizer_mode='slow',
    tensor_parallel_size=1,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, legacy=True, trust_remote_code=True)
sampling_params = SamplingParams(temperature=0, max_tokens=8096)

start = time.time()
prompts = ["你好!"]
outputs = model.generate(prompts, sampling_params)
end = time.time()

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    length = len(generated_text)
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

print(end - start)
cost = end - start
print(f"{length/cost} tokens/s")
```