
Features: float16 if GPU available, float32 if CPU only #30

Closed
woctezuma opened this issue Jan 31, 2021 · 3 comments
woctezuma commented Jan 31, 2021

Hello,

I have noticed a discrepancy between the dtype of the features (both for images and texts) depending on the availability of a GPU.

  • If I run the code with CPU only on Colab, image_features.dtype returns torch.float32.
    This happens if I do not install the package properly, and then do not notice that device is set to cpu:
    %pip install git+https://github.com/openai/CLIP.git
  • If I run the code with GPU on Colab, image_features.dtype returns torch.float16.
    This happens if I follow the installation process properly and install the proper versions of PyTorch (1.7.1+cu101) for Colab:
    torch==1.7.1+cu101
    torchvision==0.8.2+cu101

Q1: Is there a good reason why both versions do not return the same dtype? Is it due to AMP with GPU?

Moreover, if I wanted to store normalized features, float16 would let me halve the file size, so I would like to make sure that casting the float32 results (obtained with CPU only) to float16 would not lead to a loss of precision.

Q2: Would casting the results to float16 be totally safe? Or would it be safer to cast to float32 instead?
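One way to check this empirically (a sketch of my own, not from the thread; shapes are made up) is to measure the round-trip error of casting L2-normalized float32 features down to float16 and back:

```python
import torch

# Sketch: quantify the error introduced by storing L2-normalized
# float32 features as float16. Shapes (1000 vectors of dim 512)
# are illustrative, not the actual CLIP feature dimensions.
features = torch.randn(1000, 512)
features = features / features.norm(dim=-1, keepdim=True)

roundtrip = features.half().float()  # float32 -> float16 -> float32

max_abs_err = (features - roundtrip).abs().max().item()
print(max_abs_err)  # small: fp16 keeps roughly 3 significant decimal digits
```

For unit-norm entries the worst-case error stays well below 1e-3, which is usually negligible for similarity search, though whether it is "totally safe" depends on the downstream use.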

Finally, the discrepancy can be slightly confusing for people who pre-compute features on a machine with a GPU, and then use those pre-computed features alongside features computed on the fly in a CPU-only web app. This is how I noticed the discrepancy, when running this line:

logits = 100. * image_features @ zeroshot_weights

where image_features were computed on the fly (float32) and zeroshot_weights had been pre-computed (float16).
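For reference, the mismatch can be reproduced with plain tensors, and casting one operand fixes it (a sketch with stand-in shapes, not the actual CLIP features):

```python
import torch

# Stand-ins for the situation above: float32 features computed on the
# fly on CPU vs. float16 weights pre-computed on a GPU machine.
image_features = torch.randn(1, 512)            # torch.float32
zeroshot_weights = torch.randn(512, 10).half()  # torch.float16

# image_features @ zeroshot_weights fails with a dtype-mismatch
# RuntimeError on CPU in the PyTorch versions discussed here;
# casting one side makes the dtypes agree.
logits = 100. * image_features @ zeroshot_weights.float()
print(logits.dtype)  # torch.float32
```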

woctezuma added a commit to woctezuma/match-steam-banners that referenced this issue Jan 31, 2021
jongwook (Collaborator) commented Feb 1, 2021

Q1: This part of the loader code patches all fp16 operations to fp32 when loading on CPU. The main reason is that PyTorch does not support all fp16 operations in CPU mode.

Q2: In general, there can be a slight loss of accuracy compared to fp32, and training with fp16 weights can become unstable. Empirically, we have been doing CLIP inference in fp16 without much problem, and that's how the model was trained anyway.

AFAIK, there's no runtime performance benefit to using fp16 operations on CPUs even where they are supported, because they are cast to fp32 anyway. BF16 SIMD was introduced in Cooper Lake last year, so this might change in the future.

woctezuma (Author)

Thank you for the clarification! :)

ProGamerGov (Contributor) commented Feb 3, 2022

For anyone else who comes across this issue and wants to use torch.float32 with the GPU: you can load a CLIP model with torch.device("cpu") for the device argument, so that it stays in float32, and then move the model onto the GPU after it has been downloaded and loaded:

import clip
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clip_model = clip.load('RN50x4', jit=False, device=torch.device("cpu"))[0].to(device)
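The key point is that .to(device) moves parameters without changing their dtype, so the float32 weights obtained by loading on CPU remain float32 on the GPU. A torch-only sketch of that behavior (a hypothetical small module, not CLIP itself):

```python
import torch

# A toy module standing in for the CLIP model: parameters
# are created as torch.float32 by default.
layer = torch.nn.Linear(4, 4)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = layer.to(device)  # changes the device, not the dtype

print(next(layer.parameters()).dtype)  # torch.float32 on CPU or GPU
```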
