[OOM] Fine tuning CLIP #1573

Open
AndrMoura opened this issue Jun 1, 2022 · 7 comments
@AndrMoura

AndrMoura commented Jun 1, 2022

Hello, I'm trying to fine-tune a CLIP model on my own data (image-description pairs) on a GPU, but I run out of RAM mid-training. RAM usage climbs steadily during training until the process hits OOM.

This is a sample from my code:

import os

from PIL import Image
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("sentence-transformers/clip-ViT-B-32")

# NOTE: every image is opened up front here, so a PIL Image object
# for each row of the dataset stays alive for the entire run
train_examples = [InputExample(texts=[Image.open(os.path.join(img_path, row['img_name'])), row['description']])
                  for _, row in train_captions.iterrows()]

train_dataloader = DataLoader(train_examples,
                              shuffle=True,
                              batch_size=16)

train_loss = losses.MultipleNegativesRankingLoss(model=model)
# setup evaluator
...

model.fit([(train_dataloader, train_loss)],
          show_progress_bar=True,
          epochs=10,
          output_path=output_path)

I believe the problem lies in the train_examples list comprehension. When I change train_examples to load text only:

train_examples = [InputExample(texts=[row['description'], row['description']]) for _, row in train_captions.iterrows()]

the model trains without any memory issues! I must be doing something wrong with the image loading. What is the proper way to load images into the train_examples variable?

Thank you.
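
A likely explanation: Image.open is lazy, but once the preprocessor touches an image during training, PIL caches the decoded pixel data on the Image object, and every such object stays alive inside train_examples for the whole run, so RAM grows until it is exhausted. A minimal sketch of a lazy alternative, assuming train_captions has img_name and description columns and reusing the imports from the snippet above:

from torch.utils.data import Dataset

class LazyCaptionDataset(Dataset):
    """Opens each image only when its batch is built, so decoded
    pixel data never accumulates across the whole dataset."""
    def __init__(self, df, img_path):
        self.rows = list(df[['img_name', 'description']].itertuples(index=False))
        self.img_path = img_path

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        img_name, description = self.rows[idx]
        image = Image.open(os.path.join(self.img_path, img_name))
        return InputExample(texts=[image, description])

train_dataloader = DataLoader(LazyCaptionDataset(train_captions, img_path),
                              shuffle=True,
                              batch_size=16)

model.fit installs its own collate function on whatever DataLoader it is given, as long as the items are InputExample objects, so the rest of the training code stays unchanged.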

@jpzhangvincent

I'm also looking for examples of fine-tuning the CLIP model with sentence-transformers. Thanks!

@rhkenne

rhkenne commented Aug 3, 2022

Hey @nreimers, congrats on your move/promotion to cohere.ai. I would like to open a PR to address this issue. Any pointers on how to approach it?

@yash-120304

yash-120304 commented Jan 19, 2023

@AndrMoura did you solve your problem?
Can you tell me why you did not put a label in your InputExample?

@AndrMoura
Author

@AndrMoura did you solve your problem? Can you tell me why you did not put a label in your InputExample?

I didn't. I used the HF library to train my own CLIP.

As for the label, check MultipleNegativesRankingLoss: it doesn't take labels, because it uses the other examples in the batch as negatives.
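
For reference, a minimal sketch of the pair format MultipleNegativesRankingLoss expects; `pairs` here is a hypothetical iterable of (image, caption) tuples:

# No label is passed: for each anchor, the loss treats the
# positives of the other examples in the batch as negatives.
train_examples = [InputExample(texts=[image, caption])
                  for image, caption in pairs]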

@yash-120304

yash-120304 commented Jan 19, 2023

import os

from PIL import Image
from tqdm import tqdm
from sentence_transformers import InputExample, util

images = mapping.keys()
captions = mapping.values()
train_samples = []
for img_name, caps in tqdm(zip(images, captions)):
    img = Image.open(os.path.join(img_dir, img_name + '.jpg'))
    image_emb = clip.encode([img], convert_to_tensor=True, show_progress_bar=False)
    for cap in caps:
        cap_emb = clip.encode([cap], convert_to_tensor=True, show_progress_bar=False)
        # semantic_search returns a list of hit lists, one per query;
        # pull out the single hit's cosine score as a float
        score = util.semantic_search(image_emb, cap_emb)[0][0]['score']
        train_samples.append(InputExample(texts=[img, cap], label=score))

This is how I am computing my score. Is it correct?
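
For reference, util.semantic_search returns a list of hit lists (one per query) rather than a bare number, while util.cos_sim returns the cosine similarity directly; for a single image-caption pair the two scores below are the same value:

hits = util.semantic_search(image_emb, cap_emb)  # [[{'corpus_id': 0, 'score': ...}]]
score = hits[0][0]['score']

score = util.cos_sim(image_emb, cap_emb).item()  # 1x1 tensor -> float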

@yash-120304

HF library to train my own CLIP

Oh, how can I do that?

Because for now I am simply using the sentence-transformers library to load my CLIP model and I am getting good results, but I can't evaluate it, and this is where I am stuck.
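
One simple way to evaluate a CLIP model loaded through sentence-transformers is plain retrieval accuracy: encode images and captions, then check whether each image retrieves its own caption. A minimal sketch, assuming `clip` is the loaded model and `images`/`captions` are aligned lists where images[i] matches captions[i]:

from sentence_transformers import util

img_embs = clip.encode(images, convert_to_tensor=True, show_progress_bar=False)
cap_embs = clip.encode(captions, convert_to_tensor=True, show_progress_bar=False)

# for each image, retrieve the closest caption and count exact matches
hits = util.semantic_search(img_embs, cap_embs, top_k=1)
recall_at_1 = sum(h[0]['corpus_id'] == i for i, h in enumerate(hits)) / len(hits)
print(f"image->text recall@1: {recall_at_1:.3f}")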

@httplups

httplups commented Dec 3, 2024

Hi, I am having the same memory error.
