Avoid compiling kernels for double data type #933

WoosukKwon · 2023-09-01T14:36:37Z

This PR fixes the dispatch logic for our custom CUDA kernels. Currently, vLLM uses AT_DISPATCH_FLOATING_TYPES_AND2, which also instantiates every kernel for the double data type even though double is never used. This unnecessarily increases compilation time and binary size. The PR solves this by limiting the dispatched data types to float, half, and bfloat16.
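For readers unfamiliar with type dispatch, here is a minimal, self-contained sketch of the idea. It is not the code from this PR and does not use ATen: the ScalarType enum, the half_t and bfloat16_t structs, and both functions are hypothetical stand-ins. The point is that a templated kernel is compiled once per type named in the dispatch switch, so leaving double out of the switch means no double instantiation is ever generated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a tensor dtype tag (not the real at::ScalarType).
enum class ScalarType { Float, Half, BFloat16, Double };

// Hypothetical 16-bit storage types, used only to have distinct template arguments.
struct half_t { std::uint16_t bits; };
struct bfloat16_t { std::uint16_t bits; };

// Stand-in for a templated kernel launcher. Each distinct scalar_t that is
// referenced somewhere produces a separate instantiation in the binary.
template <typename scalar_t>
std::size_t kernel_element_size() {
  return sizeof(scalar_t);
}

// Dispatch restricted to float, half, and bfloat16. Because no case names
// double, kernel_element_size<double> is never instantiated.
inline std::size_t dispatch_element_size(ScalarType dtype) {
  switch (dtype) {
    case ScalarType::Float:
      return kernel_element_size<float>();
    case ScalarType::Half:
      return kernel_element_size<half_t>();
    case ScalarType::BFloat16:
      return kernel_element_size<bfloat16_t>();
    default:
      // Unsupported dtypes fail at run time instead of costing compile time.
      throw std::invalid_argument("unsupported dtype");
  }
}

// Small usage example: a supported dtype dispatches to its instantiation.
inline void example_usage() {
  assert(dispatch_element_size(ScalarType::Float) == sizeof(float));
}
```

The trade-off in this sketch is that passing a double tensor becomes a run-time error rather than a supported path, which is acceptable only because that type is never used.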

zhuohan123

LGTM! Thanks for the fix!

WoosukKwon added 2 commits September 1, 2023 14:27

Fix dispatch logic

e486f52

Minor

46d9982

WoosukKwon requested a review from zhuohan123 September 1, 2023 14:36

zhuohan123 approved these changes Sep 2, 2023

WoosukKwon merged commit 8ce9c50 into main Sep 2, 2023
2 checks passed

WoosukKwon deleted the fix-kernels branch September 2, 2023 05:59

liuyanyi pushed a commit to liuyanyi/vllm that referenced this pull request Sep 12, 2023

Avoid compiling kernels for double data type (vllm-project#933)

8c0766a

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Avoid compiling kernels for double data type (vllm-project#933)

18f673d

sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024

Avoid compiling kernels for double data type (vllm-project#933)

f10696b
