Fix fp16 overflow #532

TechxGenus · 2024-07-02T16:31:43Z

This PR tries to fix the remaining precision problems with fp16.
Previous PRs fixed some cases, such as division by 0, but an error may still be reported when a value exceeds the maximum range of fp16, for example when a value overflows after weight scaling. This PR modifies two places:

Cancel the scaling when it causes an overflow
Perform the calculation in pseudo_quantize_tensor in fp32

(However, I doubt whether this extreme case will actually occur, since the model may not work at all under fp16 at that point.)
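
As a rough illustration of the two changes described above, here is a minimal PyTorch sketch. It is not the code from this PR: the helper names `scale_weight_safely` and `pseudo_quantize_fp32`, the parameters `n_bit` and `group_size`, and the asymmetric min/max quantization scheme are all assumptions made for the example.

```python
import torch

# Largest finite value representable in fp16 (65504).
FP16_MAX = torch.finfo(torch.float16).max


def scale_weight_safely(weight, scales):
    """Hypothetical helper: apply scales, but cancel the scaling if the
    result would not fit in fp16. Returns (tensor, applied_flag)."""
    # Compute the product in fp32 so the overflow check itself cannot overflow.
    scaled = weight.float() * scales.float()
    if (not torch.isfinite(scaled).all()) or scaled.abs().max() > FP16_MAX:
        return weight, False  # keep the unscaled weight
    return scaled.to(weight.dtype), True


def pseudo_quantize_fp32(weight, n_bit=4, group_size=4):
    """Hypothetical group-wise fake quantization (asymmetric min/max scheme,
    assumed for this sketch). All arithmetic runs in fp32; only the final
    result is cast back to the input dtype."""
    orig_shape, orig_dtype = weight.shape, weight.dtype
    w = weight.float().reshape(-1, group_size)
    max_val = w.amax(dim=1, keepdim=True)
    min_val = w.amin(dim=1, keepdim=True)
    max_int = 2 ** n_bit - 1
    # In fp16, (max_val - min_val) could already exceed 65504 and become inf.
    scale = (max_val - min_val).clamp(min=1e-5) / max_int
    zero = (-torch.round(min_val / scale)).clamp(0, max_int)
    q = torch.clamp(torch.round(w / scale) + zero, 0, max_int)
    w = (q - zero) * scale
    return w.reshape(orig_shape).to(orig_dtype)
```

With this sketch, a weight such as 60000 scaled by 2 is left unscaled because 120000 does not fit in fp16, and a group whose values span -60000 to 60000 still dequantizes to finite numbers, because the range 120000 is only ever held in fp32.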

TechxGenus added 2 commits July 2, 2024 16:11

fix fp16 overflow

4eecfb0

adjust position

be15142

TechxGenus marked this pull request as draft July 2, 2024 16:33

TechxGenus mentioned this pull request Jul 2, 2024

add deepseek v2 support #508

Merged
