Why is max_model_len only 8192 when running inference with vLLM for DeepSeek-V2-Chat? #72

Open
ybdesire opened this issue Jul 18, 2024 · 0 comments


ybdesire commented Jul 18, 2024

Also, what is the maximum supported value of max_model_len for DeepSeek-V2-Chat?

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# context length capped at 8192 tokens, tensor parallelism across 8 GPUs
max_model_len, tp_size = 8192, 8
```
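
For what it's worth, max_model_len is not a limit baked into DeepSeek-V2 itself; it is the context cap passed to vLLM's engine, and 8192 is simply the value chosen in this snippet. DeepSeek-V2 is advertised with a 128K-token context window, so larger values should be accepted as long as the GPUs can hold the correspondingly larger KV cache. A minimal sketch, assuming the same model name and 8-GPU setup as above (32768 is an illustrative choice, not a verified maximum):

```python
from vllm import LLM

# Sketch only: raise the context cap beyond 8192. The real ceiling is the
# model's trained context window (advertised as 128K for DeepSeek-V2) and,
# in practice, the GPU memory available for the KV cache.
llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Chat",
    tensor_parallel_size=8,
    max_model_len=32768,      # illustrative value above 8192
    trust_remote_code=True,   # DeepSeek-V2 ships custom modeling code
    enforce_eager=True,       # disables CUDA graph capture
)
```

If vLLM reports an out-of-memory error at the larger length, lowering max_model_len or adjusting gpu_memory_utilization is the usual remedy.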

ybdesire changed the title from “Why max_model_len only 8192 when inferencing with vLLM?” to “Why max_model_len only 8192 when inferencing with vLLM for DeepSeek-V2-Chat?” on Jul 18, 2024