Features: float16 if GPU available, float32 if CPU only #30
Hello,

I have noticed a discrepancy between the `dtype` of the features (both for images and texts) depending on the availability of a GPU.

`image_features.dtype` returns `torch.float32`. This happens if I do not install the package properly and then do not pay attention that `device` is set to `cpu`.

`image_features.dtype` returns `torch.float16`. This happens if I follow the installation process properly and install the proper version of PyTorch (1.7.1+cu101) for Colab.

Q1: Is there a good reason why both versions do not return the same `dtype`? Is it due to AMP on the GPU?

Moreover, if I wanted to store normalized features, `float16` would allow me to cut the file size in half, so I would like to ensure that casting the `float32` results (obtained with CPU only) to `float16` would not actually lead to a loss of precision.

Q2: Would casting the results to `float16` be totally safe? Or would it be safer to cast to `float32` instead?

Finally, the discrepancy can be slightly confusing for people who pre-compute features on a machine with a GPU and then use them alongside features computed on the fly in a web app with CPU only. This is how I noticed the discrepancy, when running a line where `image_features` were computed on the fly (`float32`) and `zeroshot_weights` had been pre-computed (`float16`).
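A minimal sketch reproducing the discrepancy, assuming the `clip` package from this repository is installed; the model name and image path are placeholders:

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    image_features = model.encode_image(image)

# Prints torch.float16 with a GPU available, torch.float32 on CPU only.
print(image_features.dtype)
```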
Comments

Q1: This part of the loader code patches all fp16 operations to fp32 when loading the model on CPU. The main reason for this is that PyTorch does not support all fp16 operations in CPU mode.
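A small illustration of that answer (my own sketch, not the repository's patching code; the exact set of failing ops depends on the PyTorch version):

```python
import torch

x = torch.randn(4, 4).half()  # an fp16 tensor on CPU

# Several fp16 kernels are missing in CPU mode (as of PyTorch 1.7); e.g. a
# half-precision matmul raises a "not implemented for 'Half'" RuntimeError:
#     y = x @ x
# Casting to fp32 first works everywhere, which is the effect of the loader's
# patch when the model is placed on CPU:
y = x.float() @ x.float()
print(y.dtype)  # torch.float32
```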
Q2: In general, there can be a slight loss of accuracy compared to fp32, and training with fp16 weights can become unstable. Empirically, we have been doing CLIP inference in fp16 without much problem, and that is how the model was trained anyway. AFAIK, there is no runtime performance benefit to using fp16 operations on CPUs even where they are supported, because they are cast to fp32 anyway. BF16 SIMD was introduced in Cooper Lake last year, so this might change in the future.
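As a rough answer to Q2 in code, a sketch with placeholder shapes and file name: storing features in fp16 halves the file size at the cost of roughly three decimal digits of precision, and upcasting both operands to fp32 avoids the dtype mismatch described in the issue:

```python
import torch

# Pre-computed on a GPU machine, hence fp16 (shapes and file name are
# illustrative; 512 is the ViT-B/32 embedding width).
zeroshot_weights = torch.randn(512, 1000).half()
torch.save(zeroshot_weights, "zeroshot_weights_fp16.pt")  # half the fp32 size

# Computed on the fly on a CPU-only machine, hence fp32.
image_features = torch.randn(1, 512)

# Upcast to a common dtype before mixing: fp16 -> fp32 is exact, whereas
# fp32 -> fp16 rounds to ~3 significant decimal digits (10-bit mantissa).
logits = image_features @ torch.load("zeroshot_weights_fp16.pt").float()
print(logits.dtype)  # torch.float32
```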
Thank you for the clarification! :)

For anyone else who comes across this feature and wants to use