HuggingFaceM4/idefics2: TGI would crash when I set do_image_splitting to False #2029
Closed
danieldk added commits referencing this issue (Jun 17 – Jun 27, 2024):
Before this change, the number of reserved image tokens was not the same as the number of images. Fixes #2029. While at it, also remove all the image token handling duplication in `prepare_input`.
glegendre01 pushed a commit referencing this issue (Jul 2, 2024) with the same message.
yuanwu2017 pushed a commit to yuanwu2017/tgi-gaudi referencing this issue (Sep 26, 2024) with the same message (Fixes huggingface#2029).
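The commit message describes a mismatch between the number of reserved image tokens and the number of images. A minimal sketch of the accounting involved, using hypothetical names (not TGI's actual code) and the idefics2 defaults of 64 image tokens per sub-image and 5 sub-images (4 crops plus the original) when splitting is enabled:

```python
# Illustrative sketch only; function and variable names are hypothetical,
# not taken from text-generation-inference.
def reserved_image_tokens(num_images: int,
                          image_seq_len: int = 64,
                          do_image_splitting: bool = False) -> int:
    # With splitting, idefics2 feeds each image as 4 crops plus the original,
    # i.e. 5 sub-images, so 5x as many image tokens must be reserved.
    sub_images = 5 if do_image_splitting else 1
    return num_images * sub_images * image_seq_len

assert reserved_image_tokens(1, do_image_splitting=True) == 320
assert reserved_image_tokens(1, do_image_splitting=False) == 64  # the case that crashed
```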
System Info
When I try to start a service with HuggingFaceM4/idefics2-8b-base,
I encounter the following error:
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 90, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 253, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/lib/python3.10/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
return await self.intercept(
Considering that 320 / 64 = 5, I suspect this is caused by HuggingFaceM4/idefics2-8b-base setting do_image_splitting to False.
HuggingFaceM4/idefics2-8b, by contrast, has do_image_splitting = True, which I think is why TGI works with that checkpoint.
I also have a fine-tuned model based on HuggingFaceM4/idefics2-8b; I set the processor's do_image_splitting to True when saving the processor, and got the same error.
However, do_image_splitting consumes a huge amount of VRAM, which is unnecessary most of the time.
Please help!
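For reference, a quick way to confirm which setting each checkpoint ships with is to load its processor with transformers (the attribute location is my assumption based on the Idefics2 image processor; adjust if it differs in your version):

```python
from transformers import AutoProcessor

# Compare the image-splitting setting saved with each checkpoint.
base = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b-base")
chat = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")

print(base.image_processor.do_image_splitting)  # reportedly False -> crashes TGI
print(chat.image_processor.do_image_splitting)  # reportedly True  -> works
```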
Reproduction
CUDA_VISIBLE_DEVICES=0 docker run \
    --gpus "device=0" \
    --shm-size 10g -p 8000:80 ghcr.io/huggingface/text-generation-inference:2.0.3 \
    --model-id HuggingFaceM4/idefics2-8b-base \
    --dtype bfloat16 \
    --max-total-tokens 32768 \
    --max-input-tokens 32767 \
    --sharded false
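Once the container is up, a request along these lines exercises the image path (a sketch assuming TGI's markdown-style image embedding in the prompt; the image URL is a placeholder):

```python
import requests

# Port 8000 on the host maps to the container's port 80 in the command above.
payload = {
    "inputs": "![](https://example.com/cat.png)Describe the image.",
    "parameters": {"max_new_tokens": 32},
}
resp = requests.post("http://localhost:8000/generate", json=payload, timeout=120)
print(resp.json())
```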
Expected behavior
TGI should work with do_image_splitting = False.