feat: improve qwen2-vl startup #2802
Merged
drbh force-pushed the improve-qwen2-vl-warmup branch from 066addd to 60b9c18 (December 6, 2024 17:40)
I do not understand why we should impose anything on the user for the images. If 20px x 20px is not supported we should:
drbh force-pushed the improve-qwen2-vl-warmup branch 2 times, most recently from 32a9564 to a3049f1 (December 9, 2024 21:32)
drbh force-pushed the improve-qwen2-vl-warmup branch from a3049f1 to d671f6e (January 7, 2025 22:35)
drbh changed the title from "feat: tokenize each request individually and increase warmup image size" to "feat: improve qwen2-vl startup" (Jan 8, 2025)
drbh force-pushed the improve-qwen2-vl-warmup branch from d671f6e to 320b520 (January 13, 2025 18:50)
drbh force-pushed the improve-qwen2-vl-warmup branch from 35b528e to bd59f96 (January 16, 2025 16:04)
Optimistically merging this PR: all tests pass, comments have been addressed, this image has been tested and deployed in production, and it fixes a bug when starting TGI with qwen2-vl. Will watch for regressions and roll back if needed.
drbh added a commit that referenced this pull request (Jan 17, 2025)
This PR resolves some small issues with qwen2-vl.

- Increase WARMUP_IMAGE_BASE64 from 20x20px to 40x40px (meets Qwen's minimum requirement without a hacky fix).
- Pass r.truncate for each request; previously it was not respected when one of the requests was smaller than the others in the batch.
- Set max_s to the max of max_s and the input size. This is required so that the rotary embedding can create self._cos_cached at the correct size in relation to the position ids.

These changes resolve a startup issue reproducible with:

(note: the underlying issue triggers when max-input-tokens is less than max-batch-prefill-tokens)
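The truncation and max_s changes described above can be illustrated with a small, self-contained sketch. This is not the TGI implementation: `Request`, `truncate_each`, `RotaryCache`, and `rotary_for_input` are hypothetical names invented for illustration, left-side truncation is an assumption, and "input size" is simplified to the number of position ids. Only `truncate`, `max_s`, and `_cos_cached` are names taken from the PR description.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    """Hypothetical stand-in for a request; only `truncate` is named in the PR."""
    input_ids: List[int]
    truncate: Optional[int] = None


def truncate_each(requests: List[Request]) -> List[List[int]]:
    """Apply each request's own truncate limit instead of one batch-wide value.

    Assumption: truncation keeps the most recent (right-most) tokens.
    """
    out = []
    for r in requests:
        ids = r.input_ids
        if r.truncate is not None and r.truncate < len(ids):
            ids = ids[len(ids) - r.truncate:]
        out.append(ids)
    return out


class RotaryCache:
    """Toy cos/sin table with one row per position id (assumed layout)."""

    def __init__(self, dim: int, base: float = 10000.0):
        self.inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
        self._cos_cached: List[List[float]] = []
        self._sin_cached: List[List[float]] = []

    def update(self, seqlen: int) -> None:
        # Rebuild the tables only when they are too short for `seqlen`.
        if seqlen > len(self._cos_cached):
            self._cos_cached = [
                [math.cos(p * f) for f in self.inv_freq] for p in range(seqlen)
            ]
            self._sin_cached = [
                [math.sin(p * f) for f in self.inv_freq] for p in range(seqlen)
            ]

    def lookup(self, position_ids: List[int]) -> List[List[float]]:
        # Raises IndexError if a position id lies beyond the cached rows.
        return [self._cos_cached[p] for p in position_ids]


def rotary_for_input(cache: RotaryCache, position_ids: List[int], max_s: int):
    # The adjustment from the PR description: never size the cache
    # below the input it is about to be indexed with.
    max_s = max(max_s, len(position_ids))
    cache.update(max_s)
    return cache.lookup(position_ids)
```

With a cache sized only from a smaller max_s, indexing by the position ids of a longer input fails; taking the max first avoids that. For example, `rotary_for_input(RotaryCache(8), list(range(10)), max_s=4)` builds ten rows before the lookup, whereas calling `update(4)` followed by the same lookup raises an IndexError in this sketch.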