[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5) #5136

alexm-neuralmagic · 2024-05-30T14:19:49Z

PTX 8.5, released on May 9, 2024, introduced a new modifier, ::ordered_metadata, for the mma.sp 2:4 warp-level sparse instruction. The modifier requires the indices in the sparsity metadata to be sorted in increasing order, starting from the LSB. Our encoding in format_24.py (lines 96-102) already meets this requirement:

    # Encoding quadruples of True/False values as follows:
    #     [True,  True,  False, False] -> 0b0100
    #     [True,  False, True,  False] -> 0b1000
    #     [False, True,  True,  False] -> 0b1001
    #     [True,  False, False, True ] -> 0b1100
    #     [False, True,  False, True ] -> 0b1101
    #     [False, False, True,  True ] -> 0b1110

Therefore, we can simply add the new modifier and nothing else is required.
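As a quick sanity check of the claim above, here is a minimal sketch that decodes each 4-bit metadata value into its two 2-bit indices (low bits first) and confirms they are in increasing order. The table is copied from the comment quoted above; the helper names decode_indices and is_ordered are made up for illustration and are not part of format_24.py.

```python
# Metadata nibbles copied from the encoding comment quoted above.
ENCODINGS = {
    (True, True, False, False): 0b0100,
    (True, False, True, False): 0b1000,
    (False, True, True, False): 0b1001,
    (True, False, False, True): 0b1100,
    (False, True, False, True): 0b1101,
    (False, False, True, True): 0b1110,
}


def decode_indices(nibble):
    """Split a 4-bit metadata value into (low, high) 2-bit indices.

    Hypothetical helper: the low two bits hold the first kept position,
    the high two bits hold the second kept position.
    """
    return nibble & 0b11, (nibble >> 2) & 0b11


def is_ordered(nibble):
    """True if the index in the low bits is smaller than the one in the high bits."""
    low, high = decode_indices(nibble)
    return low < high


for mask, nibble in ENCODINGS.items():
    # Positions of the two kept (True) elements in the quadruple.
    kept = tuple(i for i, keep in enumerate(mask) if keep)
    assert decode_indices(nibble) == kept
    assert is_ordered(nibble)
```

Every entry passes: the index stored in the low two bits is always the smaller one, which is the ordering the modifier asks for.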


tlrmchlsmth

Thanks for the fix!


simon-mo · 2024-05-31T04:27:38Z

Hi @alexm-neuralmagic, I'm going to revert this PR because it doesn't compile on CUDA 12.1 and 11.8 and our release pipeline is currently stuck there. Sorry!

https://github.com/vllm-project/vllm/actions/runs/9311899829/job/25631836467#step:8:1873

ptxas /tmp/tmpxft_0000a2c3_00000000-9_marlin_24_cuda_kernel.compute_80.ptx, line 557; fatal   : Parsing error near ':': syntax error
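The parse error is what an older ptxas reports when it meets the "::" in the new opcode. One way to avoid it is to emit the modifier only when the toolkit is new enough. The sketch below illustrates that decision in Python; it is not the change made in this PR or in any follow-up. The function name mma_sp_opcode is hypothetical, and the CUDA 12.5 cutoff is an assumption based on PTX 8.5 shipping with that toolkit.

```python
def mma_sp_opcode(cuda_version):
    """Return the mma.sp opcode prefix for a (major, minor) CUDA version.

    Hypothetical helper for illustration only.
    """
    # Assumption: PTX 8.5, which adds ::ordered_metadata, first ships
    # with CUDA 12.5, so older toolkits must use the plain opcode.
    if cuda_version >= (12, 5):
        return "mma.sp::ordered_metadata"
    return "mma.sp"


# The two toolkits named in the revert comment fall back to plain mma.sp.
for version in [(11, 8), (12, 1), (12, 5)]:
    print(version, mma_sp_opcode(version))
```

In the kernel itself the same choice would be made at compile time rather than in Python.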


alexm-neuralmagic added 2 commits May 30, 2024 14:10

marlin_24: Ensure the mma.sp instruction is using the ::ordered_metad…

71dc27c

…ata modifier (introduced with PTX 8.5)

clang-format

071bb64

robertgshaw2-neuralmagic approved these changes May 30, 2024


robertgshaw2-neuralmagic enabled auto-merge (squash) May 30, 2024 14:44

tlrmchlsmth approved these changes May 30, 2024


simon-mo disabled auto-merge May 31, 2024 02:02

simon-mo merged commit 6d21fa1 into vllm-project:main May 31, 2024
56 of 64 checks passed

blinkbear pushed a commit to blinkbear/vllm that referenced this pull request May 31, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

8c542fa

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

simon-mo mentioned this pull request May 31, 2024

Revert "[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5)" #5149

Merged

dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request May 31, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

0318cbe

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

blinkbear pushed a commit to blinkbear/vllm that referenced this pull request Jun 6, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

5cc29ef

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 8, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

dcaf819

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

tlrmchlsmth mentioned this pull request Jun 11, 2024

[Kernel] Suppress mma.sp warning on CUDA 12.5 and later #5401

Merged

joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

adcf9cb

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jul 14, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

1324c62

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)

Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024

[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::orde…

79be4df

…red_metadata modifier (introduced with PTX 8.5) (vllm-project#5136)
