scripts/bench.py:141: in <module>
bench_captions(
scripts/bench.py:63: in bench_captions
seconds, text = duration(
scripts/bench.py:48: in duration
result = callable()
scripts/bench.py:64: in <lambda>
lambda: caption(
scripts/bench.py:22: in caption
inputs = processor(prompt, image, return_tensors="pt")
/home/louis/miniconda3/envs/uform/lib/python3.11/site-packages/transformers/models/instructblip/processing_instructblip.py:89: in __call__
text_encoding = self.tokenizer(
/home/louis/miniconda3/envs/uform/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2802: in __call__
encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
/home/louis/miniconda3/envs/uform/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2860: in _call_one
raise ValueError(
E ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/louis/miniconda3/envs/uform/lib/python3.11/site-packages/transformers/tokenization_utils_base.py(2860)_call_one()
I expanded this code out (no lambda) and it still gives the same error, but the data flow is clearer.
I've tried running the code and found what looks like a bug in the benchmark script; I'm diagnosing it now.
The traceback (captured by pytest, shown above) seems to point to the type of the `image` parameter at line 68.
The traceback points into the InstructBLIP model's processor `__call__`.
A similar error was reported but not resolved in transformers (huggingface/transformers#21366), though I think that one is unrelated.
The bug seems to be that we are passing unnamed (positional) arguments, and they're getting mis-assigned as a result: the `InstructBlipProcessor` signature is `__call__(self, images, text)`, so in `processor(prompt, image, return_tensors="pt")` the prompt is bound to `images` and the image to `text`. The docs show this is the `__call__` that should be getting invoked, and debugging in PDB confirms that's what is happening.
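A minimal sketch of the mis-assignment, using a stand-in for `InstructBlipProcessor` (not the real class; the error message is abridged):

```python
# Stand-in mimicking InstructBlipProcessor.__call__(self, images, text)
# and the tokenizer's type check on `text` (error message abridged).
def fake_processor(images=None, text=None, return_tensors=None):
    if not isinstance(text, (str, list)):
        raise ValueError("text input must of type `str` ...")
    return {"images": images, "text": text}

class FakeImage:  # placeholder for a PIL.Image
    pass

prompt, image = "Describe the image.", FakeImage()

# The benchmark's positional call: the prompt lands in `images`,
# the image lands in `text`, and the tokenizer's type check raises.
try:
    fake_processor(prompt, image, return_tensors="pt")
except ValueError as exc:
    print("reproduced:", exc)
```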
Does this reproduce for you?
Cause
Update: I found the cause is indeed passing positional args; if you print each processor's parameter names, they differ in both order and naming.
I'm surprised this benchmark was working before
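The parameter names can be checked with `inspect.signature`; here is a sketch with stand-in functions whose signatures mirror the two processors as described above (the real classes come from transformers and uform):

```python
import inspect

# Stand-ins mirroring the two processors' __call__ signatures:
# InstructBlipProcessor takes (images, text); uform's VLMProcessor takes
# (texts, images) -- a different order *and* a different keyword name.
def instructblip_call(self, images=None, text=None): ...
def vlm_call(self, texts=None, images=None): ...

for fn in (instructblip_call, vlm_call):
    names = [p for p in inspect.signature(fn).parameters if p != "self"]
    print(fn.__name__, names)
# instructblip_call ['images', 'text']
# vlm_call ['texts', 'images']
```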
Solution
Since the parameter order varies between processors, you can't use positional args; but the keyword names differ too: `text` vs. `texts`. The odd one out here is uform itself (`texts`), so that's what should change, and then a keyword-based call will work everywhere.
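A sketch of what the fix buys us, with hypothetical harmonized stand-in signatures (not the real classes): once every processor accepts the same `text`/`images` keywords, one keyword call works regardless of each processor's parameter order.

```python
# Hypothetical harmonized signatures (stand-ins, not the real classes):
def instructblip_call(images=None, text=None):   # (images, text) order
    return {"images": images, "text": text}

def vlm_call(text=None, images=None):            # renamed from `texts`
    return {"images": images, "text": text}

prompt, image = "Describe the image.", "<PIL.Image placeholder>"

# One keyword-based call works for both, whatever the positional order:
for processor in (instructblip_call, vlm_call):
    enc = processor(text=prompt, images=image)
    assert enc == {"images": image, "text": prompt}
```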
You can't just pass `images=image` on its own (`InstructBlipProcessor` will then get multiple values for the argument `images`). Nor can this be solved by passing `text=text` to UForm-Gen's `VLMProcessor`: that leads to a later error in the `model.generate` step. It looks like switching the order of these arguments in `VLMProcessor` is the best solution. If I patch it, everything works (but that's not to say don't fix the `VLMProcessor` argument order!).

Environment details
(Full pip list and conda list were collapsed in the original issue and are omitted here.)