ValueError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).
#21366 · Closed · alhuri opened this issue on Jan 30, 2023 · 4 comments
I am trying to run the zero-shot learning evaluation of M-CLIP found in this Colab notebook.
The model is loaded with the code below:
if MODEL_TYPE == 'mClip':
    from sentence_transformers import SentenceTransformer

    # Here we load the multilingual CLIP model. Note: this model can only encode text.
    # If you need embeddings for images, you must load the 'clip-ViT-B-32' model.
    se_language_model = SentenceTransformer('clip-ViT-B-32-multilingual-v1')
    se_image_model = SentenceTransformer('clip-ViT-B-32')

    language_model = lambda queries: se_language_model.encode(queries, convert_to_tensor=True, show_progress_bar=False).cpu().detach().numpy()
    image_model = lambda images: se_image_model.encode(images, batch_size=1024, convert_to_tensor=True, show_progress_bar=False).cpu().detach().numpy()
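For reference, here is a minimal sketch (not from the notebook) of how the 'clip-ViT-B-32' SentenceTransformer is normally given images: encode() expects PIL.Image objects rather than raw tensors. The path 'cat.jpg' is just a placeholder.

    from PIL import Image
    from sentence_transformers import SentenceTransformer

    se_image_model = SentenceTransformer('clip-ViT-B-32')

    # encode() accepts a list of PIL images for this model; 'cat.jpg' is a placeholder path
    pil_images = [Image.open('cat.jpg')]
    image_embeddings = se_image_model.encode(pil_images, convert_to_tensor=True, show_progress_bar=False)
    print(image_embeddings.shape)  # (1, 512) for ViT-B/32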
When I run the prediction cell below, I get an error.
top_ns = [1, 5, 10, 100]
acc_counters = [0. for _ in top_ns]
n = 0.

for i, (images, target) in enumerate(tqdm(loader)):
    images = images
    target = target.numpy()

    # predict
    image_features = image_model(images)
    image_features = image_features / np.linalg.norm(image_features, axis=-1, keepdims=True)
    logits = 100. * image_features @ zeroshot_weights

    # measure accuracy
    accs = accuracy(logits, target, topk=top_ns)
    for j in range(len(top_ns)):
        acc_counters[j] += accs[j]
    n += images.shape[0]

tops = {f'top{top_ns[i]}': acc_counters[i] / n * 100 for i in range(len(top_ns))}
print(tops)
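The accuracy helper used above is defined elsewhere in the notebook; this is only a minimal sketch of what it is assumed to do (a top-k hit count over numpy logits), not the notebook's actual definition.

    import numpy as np

    # Hypothetical top-k accuracy helper (the notebook's own definition is not shown here):
    # for each k, count how many targets fall inside the k highest-scoring classes.
    def accuracy(logits, target, topk=(1,)):
        order = np.argsort(-logits, axis=-1)  # class indices sorted by descending score
        return [float((order[:, :k] == target[:, None]).any(axis=-1).sum()) for k in topk]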
The full traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-3500c9b4df73> in <module>
11 target = target.numpy()
12 # predict
---> 13 image_features = image_model(images)
14 image_features = image_features / np.linalg.norm(image_features, axis=-1, keepdims=True)
15 logits = 100. * image_features @ zeroshot_weights
6 frames
<ipython-input-39-f2cc72683291> in <lambda>(images)
6 se_image_model = SentenceTransformer('clip-ViT-B-32')
7 language_model = lambda queries: se_language_model.encode(queries, convert_to_tensor=True, show_progress_bar=False).cpu().detach().numpy()
----> 8 image_model = lambda images: se_image_model.encode(images, batch_size=64, convert_to_tensor=False, show_progress_bar=False).cpu().detach().numpy()
9 elif MODEL_TYPE == 'bothclip':
10 import jax
/usr/local/lib/python3.8/dist-packages/sentence_transformers/SentenceTransformer.py in encode(self, sentences, batch_size, show_progress_bar, output_value, convert_to_numpy, convert_to_tensor, device, normalize_embeddings)
159 for start_index in trange(0, len(sentences), batch_size, desc="Batches", disable=not show_progress_bar):
160 sentences_batch = sentences_sorted[start_index:start_index+batch_size]
--> 161 print("sentences_batch")
162 print(sentences_batch)
163 features = self.tokenize(sentences_batch)
/usr/local/lib/python3.8/dist-packages/sentence_transformers/SentenceTransformer.py in tokenize(self, texts)
317 def tokenize(self, texts: Union[List[str], List[Dict], List[Tuple[str, str]]]):
318 """
--> 319 Tokenizes the texts
320 """
321 return self._first_module().tokenize(texts)
/usr/local/lib/python3.8/dist-packages/sentence_transformers/models/CLIPModel.py in tokenize(self, texts)
69 images = None
70
---> 71 inputs = self.processor(text=texts_values, images=images, return_tensors="pt", padding=True)
72 inputs['image_text_info'] = image_text_info
73 return inputs
/usr/local/lib/python3.8/dist-packages/transformers/models/clip/processing_clip.py in __call__(self, text, images, return_tensors, **kwargs)
97
98 if text is not None:
---> 99 encoding = self.tokenizer(text, return_tensors=return_tensors, **kwargs)
100
101 if images is not None:
/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py in __call__(self, text, text_pair, text_target, text_pair_target, add_special_tokens, padding, truncation, max_length, stride, is_split_into_words, pad_to_multiple_of, return_tensors, return_token_type_ids, return_attention_mask, return_overflowing_tokens, return_special_tokens_mask, return_offsets_mapping, return_length, verbose, **kwargs)
2525 if not self._in_target_context_manager:
2526 self._switch_to_input_mode()
-> 2527 encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
2528 if text_target is not None:
2529 self._switch_to_target_mode()
/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py in _call_one(self, text, text_pair, add_special_tokens, padding, truncation, max_length, stride, is_split_into_words, pad_to_multiple_of, return_tensors, return_token_type_ids, return_attention_mask, return_overflowing_tokens, return_special_tokens_mask, return_offsets_mapping, return_length, verbose, **kwargs)
2583
2584 if not _is_valid_text_input(text):
-> 2585 raise ValueError(
2586 "text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) "
2587 "or `List[List[str]]` (batch of pretokenized examples)."
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).
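For context, a minimal sketch of the input contract this error describes, assuming the standard openai/clip-vit-base-patch32 tokenizer: strings and lists of strings are accepted, while anything else (such as a tensor batch of images) triggers the same ValueError.

    import torch
    from transformers import CLIPTokenizerFast

    tokenizer = CLIPTokenizerFast.from_pretrained("openai/clip-vit-base-patch32")

    tokenizer("a photo of a cat")                        # str: accepted
    tokenizer(["a photo of a cat", "a photo of a dog"])  # List[str]: accepted

    # An image batch (torch.Tensor) is none of the accepted text types,
    # so the tokenizer raises the ValueError shown above.
    try:
        tokenizer(torch.zeros(2, 3, 224, 224))
    except ValueError as err:
        print(err)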
How can this be fixed?
@alhuri Could you provide a working notebook with a reproduction script? This one does not work for me (missing packages, etc.) with the config you are using. Thanks!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.