
Refactor GELU and Sigmoid epilogue to use a common template (and add SiLu, Hardswish epilogue) #379

Merged: 12 commits, Dec 18, 2021

Conversation

masahi (Contributor) commented Dec 13, 2021

Since the Sigmoid and GELU epilogue functors are almost identical, I propose to refactor them. The motivation is to make it easier to add other functors.

Now, the Sigmoid and GELU epilogue functors are defined as:

using LinearCombinationSigmoid = LinearCombinationGeneric<Sigmoid, ElementOutput_, Count, ElementAccumulator_,
                                                          ElementCompute_, FloatRoundStyle::round_to_nearest>;

using LinearCombinationGELU = LinearCombinationGeneric<GELU, ElementOutput_, Count, ElementAccumulator_,
                                                       ElementCompute_, FloatRoundStyle::round_to_nearest>;

I'm going to add two new functors for the SiLU and HardSwish activations, which can be implemented in the same manner.
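For illustration, a sketch of what the two new aliases could look like under the same template; the SiLu and HardSwish functor names are placeholders for the activation structs this PR would add:

using LinearCombinationSilu = LinearCombinationGeneric<SiLu, ElementOutput_, Count, ElementAccumulator_,
                                                       ElementCompute_, FloatRoundStyle::round_to_nearest>;

using LinearCombinationHardSwish = LinearCombinationGeneric<HardSwish, ElementOutput_, Count, ElementAccumulator_,
                                                            ElementCompute_, FloatRoundStyle::round_to_nearest>;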

What do you think? @hwu36

masahi (Contributor, Author) commented Dec 14, 2021

OK, I've pushed the two new activations using this common template.

masahi changed the title from "Refactor GELU and Sigmoid epilogue to use a common template" to "Refactor GELU and Sigmoid epilogue to use a common template (and add SiLu, Hardswish epilogue)" on Dec 14, 2021
d-k-b (Collaborator) commented Dec 14, 2021

Thanks for helping clean this up! Can you add documentation with a short description and links to resources to the SiLu and HardSwish structs for future reference?

masahi (Contributor, Author) commented Dec 14, 2021

@d-k-b Thanks for the feedback, the doc is updated.
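For reference, a sketch of the kind of documentation requested. The formulas and links are the standard definitions of these activations; the scalar functors below are illustrative rather than the exact code added in this PR, and they assume the Sigmoid<T> functor and CUTLASS_HOST_DEVICE macro already provided by CUTLASS:

/// SiLU (Sigmoid Linear Unit, also known as Swish): silu(x) = x * sigmoid(x)
/// Reference: https://arxiv.org/abs/1710.05941
template <typename T>
struct SiLu {
  CUTLASS_HOST_DEVICE
  T operator()(T const &x) const {
    Sigmoid<T> sigmoid_op;           // reuse the existing scalar Sigmoid functor
    return x * sigmoid_op(x);
  }
};

/// HardSwish: hardswish(x) = x * relu6(x + 3) / 6
/// Reference: https://arxiv.org/abs/1905.02244 (MobileNetV3)
template <typename T>
struct HardSwish {
  CUTLASS_HOST_DEVICE
  T operator()(T const &x) const {
    T y = x + T(3);                  // relu6(x + 3): clamp to [0, 6]
    y = y < T(0) ? T(0) : (y > T(6) ? T(6) : y);
    return x * y / T(6);
  }
};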

hwu36 (Collaborator) commented Dec 17, 2021

Overall, looks good to me. I think just change FloatRoundStyle::round_to_nearest to Round everywhere.
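For context, a sketch of the requested fix: forward the alias's Round template parameter instead of hard-coding the rounding style. The parameter name follows this comment; the other names follow the snippet in the opening comment:

using LinearCombinationSilu = LinearCombinationGeneric<SiLu, ElementOutput_, Count, ElementAccumulator_,
                                                       ElementCompute_, Round>;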

masahi (Contributor, Author) commented Dec 18, 2021

Oops, thanks for spotting this; fixed.

masahi (Contributor, Author) commented Dec 18, 2021

Not sure why CI has failed.

hwu36 (Collaborator) commented Dec 18, 2021

These 4 activations look expensive to me. You can try to set kIsHeavy to true, check that the generated code is different, and compare the performance to verify it.

If you set them to true, the epilogue will not be fully unrolled, which is good for the I$. Usually, complex activations such as GELU need to be set to true.

masahi (Contributor, Author) commented Dec 18, 2021

> These 4 activations look expensive to me. You can try to set kIsHeavy to true, check that the generated code is different, and compare the performance to verify it.
>
> If you set them to true, the epilogue will not be fully unrolled, which is good for the I$. Usually, complex activations such as GELU need to be set to true.

My patch respects the current choices for the existing epilogues: kIsHeavy = true for GELU, and whatever the default is for Sigmoid, which I assumed to be false:
// If the epilogue functor does not define `kIsHeavy` or if it is `false`, then
// the behavior from CUTLASS 2.5 and before is retained. The epilogue is fully
// unrolled and inlined.
//

For the two new activations, my reasoning was: false for SiLU, because it is just sigmoid + mul, and also false for HardSwish, since it is similar to ReLU.
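A minimal sketch of how a functor opts into the non-unrolled path, per the CUTLASS comment quoted above; the functor itself is hypothetical and only the kIsHeavy member matters here:

struct SomeExpensiveActivationEpilogue {
  // Defining kIsHeavy = true keeps the epilogue loop from being fully unrolled,
  // trading a little loop overhead for less instruction-cache pressure.
  static bool const kIsHeavy = true;

  // ... operator(), parameter structs, etc. ...
};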

hwu36 (Collaborator) commented Dec 18, 2021

SGTM.

kIsHeavy was newly introduced. We benchmarked gelu, but not sigmoid. If you find sigmoid is expensive in the future, you can create a PR and flip the bool.

masahi (Contributor, Author) commented Dec 18, 2021

Making sigmoid and SiLU kIsHeavy = true did improve performance on EfficientNetV2 slightly: 8.58 msec -> 8.26 msec, where 8.58 was the number from my PR apache/tvm#9746, so I pushed that change. The change didn't help when the vectorized variant of sigmoid is used.

hwu36 (Collaborator) commented Dec 18, 2021

> The change didn't help when the vectorized variant of sigmoid is used.

The vectorized one has fewer instructions and puts less burden on the I$.

hwu36 (Collaborator) commented Dec 18, 2021

> Making sigmoid and SiLU kIsHeavy = true did improve performance on EfficientNetV2 slightly: 8.58 msec -> 8.26 msec, where 8.58 was the number from my PR apache/tvm#9746.

You can update your TVM PR now. :)

hwu36 merged commit 0dc3ba6 into NVIDIA:master on Dec 18, 2021