🚀 The feature, motivation and pitch

Currently the `generate` method supports inference based on `prompt_token_ids`:

```python
def generate(
    self,
    prompts: Optional[Union[str, List[str]]] = None,
    sampling_params: Optional[SamplingParams] = None,
    prompt_token_ids: Optional[List[List[int]]] = None,
    use_tqdm: bool = True,
    lora_request: Optional[LoRARequest] = None,
) -> List[RequestOutput]:
    ...
```
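For example, prompts can already be supplied as pre-tokenized IDs (a minimal sketch; the model name and token IDs below are placeholders):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model

# Token IDs produced elsewhere, e.g. by an external tokenizer service.
prompt_token_ids = [[2, 100, 200, 300]]

outputs = llm.generate(
    prompt_token_ids=prompt_token_ids,
    sampling_params=SamplingParams(max_tokens=16),
)
```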
That means the tokenizer is optional to the LLM engine. However, initializing an LLM engine always calls `_init_tokenizer`, which effectively makes the tokenizer required: the engine cannot be initialized without a valid tokenizer argument.

In our application, we would love to use vLLM's powerful engine for inference, but we want to keep the tokenizer as a separate service.
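What we are asking for, sketched loosely, is an engine-level way to opt out of tokenizer initialization. The `skip_tokenizer_init` flag below is hypothetical, not an existing argument; it only illustrates the intent:

```python
from vllm import LLM

# Hypothetical flag, not part of the current API: the engine would never
# load a tokenizer and would accept and return token IDs only.
llm = LLM(
    model="facebook/opt-125m",  # placeholder model
    skip_tokenizer_init=True,
)
```

Tokenization and detokenization would then live entirely in our separate tokenizer service.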
Alternatives

No response

Additional context

No response
I think the main blocker is that the tokenizer is also used during decode. See #3635.
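For context, the generated token IDs are already available on the request output, so a token-in/token-out caller could in principle detokenize externally (a sketch continuing the example above; whether the engine can avoid producing `text` internally is exactly the open question):

```python
# `outputs` is the List[RequestOutput] returned by llm.generate(...) above.
for request_output in outputs:
    completion = request_output.outputs[0]
    generated_ids = completion.token_ids  # consumable without a local tokenizer
    # completion.text is also populated today, which is why the engine
    # currently needs a tokenizer during decode.
```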