Add tests for models #922
Conversation
Thanks! Left some comments.
`max_new_tokens=max_tokens)`
`@pytest.fixture`
What is the main reason to add the `@pytest.fixture` decorator, compared to using these runners directly in the test function?
To my understanding, there's no good way to import these runners into tests. Please let me know if there is.
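A minimal sketch of the fixture pattern under discussion, assuming a `conftest.py`-style setup (the `generate_greedy` method name and model string here are hypothetical illustrations, not the PR's actual API):

```python
import pytest


class HfRunner:
    """Stand-in for the real HF runner class from the PR."""

    def __init__(self, model_name: str):
        self.model_name = model_name

    def generate_greedy(self, prompts, max_tokens):
        # Placeholder: the real runner would call the HF model's
        # generate(..., max_new_tokens=max_tokens) with greedy decoding.
        return [f"{p} <generated>" for p in prompts]


@pytest.fixture
def hf_runner():
    # Exposing the runner class through a fixture lets every test module
    # request it by argument name, without importing the class directly.
    return HfRunner
```

Returning the class (rather than an instance) keeps construction lazy, so each test can instantiate a runner for whichever model it needs.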
@zhuohan123 Thanks for your comments. I've updated the PR. PTAL.
LGTM! Thanks for the PR!
This PR adds basic tests for the vLLM models that can run on a single GPU. The tests compare the outputs of HF and vLLM when using greedy sampling. NOTE: vLLM currently does not pass some of these tests.
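The comparison strategy described above can be sketched as follows. This is a hedged illustration, not the PR's actual test code: the two `fake_*_generate` functions are stand-ins for the real HF and vLLM runners, and the exact-match check assumes deterministic greedy decoding on both sides.

```python
def fake_hf_generate(prompt: str, max_tokens: int) -> str:
    # Stand-in for the HF runner's greedy generation.
    return prompt + " world"


def fake_vllm_generate(prompt: str, max_tokens: int) -> str:
    # Stand-in for the vLLM runner's greedy generation.
    return prompt + " world"


def compare_greedy_outputs(prompts, max_tokens=16):
    """Return the list of prompts whose greedy outputs diverge."""
    mismatches = []
    for p in prompts:
        hf_out = fake_hf_generate(p, max_tokens)
        vllm_out = fake_vllm_generate(p, max_tokens)
        if hf_out != vllm_out:
            # Record both outputs so a failing test shows the divergence.
            mismatches.append((p, hf_out, vllm_out))
    return mismatches
```

With greedy sampling both backends should produce the same token sequence, so any non-empty mismatch list indicates a real discrepancy rather than sampling noise.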
@zhuohan123 I think you can extend `HfRunner` and `VllmRunner` for the tests for #857.

TODOs (in future PRs):
- Tests with `tensor_parallel_size` > 1