Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q4 2024
#9006 opened Oct 1, 2024 by simon-mo
Open 19
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 9

Issues list

[Doc]: Compare LMDeploy vs vLLM AWQ Triton kernels documentation Improvements or additions to documentation
#10420 opened Nov 18, 2024 by casper-hansen
1 task done
[Bug]: NCCL error with 2-way pipeline parallelism. bug Something isn't working
#10419 opened Nov 18, 2024 by Pl4tiNuM
1 task done
[Bug]: KV Cache Quantization with GGUF turns out quite poorly. bug Something isn't working
#10411 opened Nov 18, 2024 by phazei
1 task done
[Bug]: Deploying Qwen2vl with vllm and transformer gives inconsistent output for the same image bug Something isn't working
#10408 opened Nov 18, 2024 by Apricot1225
1 task done
[New Model]: fishaudio/fish-speech-1.4 new model Requests to new models
#10404 opened Nov 17, 2024 by cavities
1 task done
[Bug]: Hermes tool parser output error stream arguments in some cases. bug Something isn't working
#10395 opened Nov 16, 2024 by xiyuan-lee
1 task done
[Bug]: Granite 3.0 disconnect between parser and example template bug Something isn't working
#10379 opened Nov 15, 2024 by wilbry
1 task done
[Feature]: NVIDIA Triton GenAI Perf Benchmark feature request good first issue Good for newcomers help wanted Extra attention is needed
#10377 opened Nov 15, 2024 by simon-mo
1 task done
[Bug]: Guided Decoding Broken in Streaming mode bug Something isn't working
#10376 opened Nov 15, 2024 by JC1DA
1 task done
[Bug]: Torch profiling does not stop and cannot get traces for all workers bug Something isn't working
#10365 opened Nov 15, 2024 by ruisearch42
1 task done
[Usage]: cuda oom when serving multi task on same server usage How to use vllm
#10345 opened Nov 15, 2024 by reneix
1 task done
[Misc]: Snowflake Arctic out of memory error with TP-8 bug Something isn't working
#10344 opened Nov 14, 2024 by rajagond
1 task done
[Bug]: Out of Memory (OOM) Issues During MMLU Evaluation with lm_eval bug Something isn't working
#10325 opened Nov 14, 2024 by wchen61
1 task done
[Installation]: Request to include vllm==0.6.2 for cuda 11.8 installation Installation problems
#10319 opened Nov 14, 2024 by amew0
1 task done