
Decompose sample_from_logits for clarity and further development #190

Merged
masahi merged 3 commits into octoml:batch-serving from vc/tvm-model-refactor on Feb 2, 2024

Conversation

vvchernov

  • Refactor of model_common.py: the duplicated code in sample_from_logits is split out into two additional functions. This makes the function more readable and helps with the further loglikelihood development (a sketch is given after this list).
  • Add a generalized type. I also plan to add a new Request type for loglikelihood, and it is convenient to keep the union of all request types in one place.
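
Roughly, the idea looks like this. This is only a minimal sketch of the decomposition; the helper names and request classes below are hypothetical and the actual model_common.py code differs.

```python
# Minimal sketch of the decomposition idea only; helper names and request
# classes are hypothetical and differ from the actual model_common.py code.
from dataclasses import dataclass
from typing import List, Union

import numpy as np


@dataclass
class PrefillRequest:
    temperature: float = 1.0


@dataclass
class DecodeRequest:
    temperature: float = 1.0


# Keeping the union of all request types in one place makes it easy to extend,
# e.g. with a future loglikelihood request type.
RequestType = Union[PrefillRequest, DecodeRequest]


def _prepare_temperatures(requests: List[RequestType]) -> np.ndarray:
    # First extracted helper: collect per-request sampling parameters.
    return np.array([r.temperature for r in requests], dtype=np.float32)


def _sample_rows(probs: np.ndarray) -> np.ndarray:
    # Second extracted helper: draw one token id per request.
    return np.array([np.random.choice(len(p), p=p) for p in probs])


def sample_from_logits(logits: np.ndarray, requests: List[RequestType]) -> np.ndarray:
    # With the duplicated pieces pulled out, the main function reads as a
    # short pipeline over the two helpers.
    temps = _prepare_temperatures(requests)
    scaled = logits / temps[:, None]
    probs = np.exp(scaled - scaled.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return _sample_rows(probs)
```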

@masahi
Member

masahi commented Feb 2, 2024

Sorry, the PR that added sample_from_logits was reverted in #189 due to a bug; I'm debugging it.

@masahi
Member

masahi commented Feb 2, 2024

OK, fixed by #191; rebasing is recommended.

@vvchernov vvchernov force-pushed the vc/tvm-model-refactor branch from 9332aaf to 4a6156c on February 2, 2024 09:27
@vvchernov vvchernov force-pushed the vc/tvm-model-refactor branch 2 times, most recently from d97c662 to 01c8fb6 on February 2, 2024 09:47
@vvchernov vvchernov force-pushed the vc/tvm-model-refactor branch from 01c8fb6 to fc13568 on February 2, 2024 09:49
@vvchernov
Author

Hello @masahi! Please review.

@masahi masahi merged commit c36d47c into octoml:batch-serving Feb 2, 2024
1 check passed
@vvchernov vvchernov deleted the vc/tvm-model-refactor branch February 2, 2024 10:53
@sunggg
Member

sunggg commented Feb 2, 2024

Hey, @vvchernov. Thank you for the refactoring!
I think a request type for loglikelihood is interesting, but I'm unsure what benefit it can bring.
If there is a mixture of non-logprob and logprob requests, how can this benefit us?
Currently, all requests in the batch make progress in a lockstep fashion, so it is unclear to me how this could change without a significant change in the engine design.
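To illustrate what I mean by lockstep, here is a purely conceptual sketch; none of these names come from the actual engine code.

```python
# Purely conceptual sketch of a lockstep decode loop; the names `model`,
# `decode`, and `append_token` are placeholders, not real engine APIs.
def engine_step(batch, model, sample_from_logits):
    logits = model.decode(batch)                 # one forward pass for the whole batch
    tokens = sample_from_logits(logits, batch)   # sampling for all requests at once
    for request, token in zip(batch, tokens):
        request.append_token(token)              # every request advances by exactly one step
    # A logprob/loglikelihood request that needs extra post-processing here
    # would delay the entire batch, not just itself.
```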

Also, just a heads-up: I'm about to send a large refactoring of the sampler. It will probably be ready for review early next week.

@vvchernov
Author

Hello @sunggg! Thank you for the comments!
Regarding the loglikelihood request, you can find the details in #69. Since the pipeline is quite different (I also added a new packed func on the topology side), a new request type was a reasonable solution.
It should be noted that the logprobs support in #82 and loglikelihood are not the same or even close, although the latter uses the logprob output API.
As for a mixture of requests with and without logprobs, it can be a problem (I wrote about it in #82) because logprob processing takes a lot of time. Such requests could also be separated into a new request regime, but then they should either be handled asynchronously here, or streams with different request types should be separated at a higher level.
One last thing: where can I follow the sampler refactoring? It looks like the loglikelihood feature will also be ready next week.

@sunggg
Member

sunggg commented Feb 2, 2024

I see. Thank you for the explanation. Let me follow up on #69, as I'm not familiar with it yet.
Here is my WIP PR: #192. I expect it to be ready by early next week.
It aims to vectorize the sampling computation and utilize the CPU to prepare sampling metadata asynchronously.
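Roughly the direction, as an illustrative sketch only (this is not the code in #192; the function names and the use of torch here are my simplification):

```python
# Illustrative sketch: sample all requests in one vectorized call while the
# sampling metadata (per-request temperatures here) is prepared on the CPU
# in a worker thread so it can overlap with other work.
from concurrent.futures import ThreadPoolExecutor

import torch


def prepare_metadata(temperatures):
    # CPU-side preparation that can overlap with GPU work.
    return torch.tensor(temperatures, dtype=torch.float32).unsqueeze(-1)


def sample_batch(logits: torch.Tensor, temps: torch.Tensor) -> torch.Tensor:
    # One vectorized sampling call for the whole batch instead of a Python loop.
    probs = torch.softmax(logits / temps, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)


if __name__ == "__main__":
    batch, vocab = 4, 32000
    logits = torch.randn(batch, vocab)
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(prepare_metadata, [0.7, 1.0, 0.9, 1.2])
        # ... the engine could do other work here while metadata is prepared ...
        temps = future.result()
    print(sample_batch(logits, temps))
```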

Lunderberg pushed a commit to Lunderberg/mlc-llm that referenced this pull request Feb 27, 2024
filesystem path cannot be implicitly converted to string