Dear author, during training the data type of the image embedding is float16, but the input type for LLaMA is bf16. I am wondering whether this misalignment leads to precision loss during training, since float16 is more precise than bf16.
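For reference, the trade-off between the two 16-bit formats can be inspected directly in PyTorch (a minimal sketch, independent of this repo's code): fp16 has more mantissa bits (finer precision) while bf16 has a much wider dynamic range, so casting fp16 activations to bf16 rounds away some mantissa bits.

```python
import torch

# float16: 10 mantissa bits -> finer precision, narrow range (max ~65504)
# bfloat16: 7 mantissa bits -> coarser precision, fp32-like range (max ~3.4e38)
for dtype in (torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    print(f"{dtype}: eps={info.eps}, max={info.max}")

# Casting fp16 -> bf16 rounds the mantissa from 10 to 7 bits.
x = torch.tensor(1.0009765625, dtype=torch.float16)  # 1 + 2**-10, exact in fp16
print(x.to(torch.bfloat16))                          # rounds to 1.0 in bf16
```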
I think it's OK to use fp16 for the image embedding since it reduces GPU memory, and it's common practice in other MLLMs such as Video-LLaMA. However, it may be possible to improve performance by using bf16.
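One way to remove the mismatch without changing the visual encoder's compute dtype is to cast the projected embeddings to the LLM's parameter dtype before they enter LLaMA. A hypothetical sketch below; `visual_encoder`, `proj`, and `llama_model` are stand-ins for the actual modules, not names from this repo:

```python
import torch

def encode_image(image, visual_encoder, proj, llama_model):
    # Visual features are produced in fp16 in the setup described above.
    with torch.no_grad():
        img_embeds = visual_encoder(image)
    img_embeds = proj(img_embeds)
    # Cast explicitly to the LLM's dtype (bf16 here) so the hand-off is a
    # single deliberate conversion rather than an implicit one inside LLaMA.
    target_dtype = next(llama_model.parameters()).dtype
    return img_embeds.to(target_dtype)
```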