[Paddle Inference ]use python to generate cutlass code #50603

zhoutianzi666 · 2023-02-17T06:39:12Z

PR types

Others

PR changes

Others

Describe

Use a Python script to generate the CUTLASS conv code.

paddle-bot · 2023-02-17T06:39:15Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

MARD1NO · 2023-02-24T06:54:26Z

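I see one small problem: although the kernel is generated, I think this PR should include the generated .cu files.

Suppose I am developing another feature and re-run cmake. cmake reaches your kernel-generation logic and generates conv2d_bias_act.cu, so when I git commit I have to take care not to commit that .cu file.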

zhoutianzi666 · 2023-02-28T03:04:13Z

Thanks for the review! The generated file has been added to .gitignore.

zhangjun

What this PR does: replaces the previously merged code with template-based generation.

zhangjun · 2023-03-03T06:13:10Z

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_residual.py

+class CbrAct(enum.Enum):
+    Identity = 1
+    Relu = 2
+    Silu = 3
+
+
+ActCutlassTag = {
+    CbrAct.Identity: 'cutlass::epilogue::thread::Identity',
+    CbrAct.Silu: 'cutlass::epilogue::thread::SiLu',
+    CbrAct.Relu: 'cutlass::epilogue::thread::ReLu',
+}
+
+# some global variables used, now we only support these residual blocks
+EpiResBlocks = [
+    (CbrAct.Silu, "cutlass::plus", CbrAct.Identity),
+    (CbrAct.Identity, "cutlass::plus", CbrAct.Relu),
+]
+
+UnderScoreName = {
+    EpiResBlocks[0]: "conv2d_bias_silu_add",
+    EpiResBlocks[1]: "conv2d_bias_add_relu",
+}
+
+CamelName = {
+    EpiResBlocks[0]: "Conv2dBiasSiluAdd",
+    EpiResBlocks[1]: "Conv2dBiasAddRelu",
+}


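Could these enum definitions be pulled out and shared?

They are used to produce the function names when the kernel code is generated. Each epilogue maps to a different function name.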
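As a minimal sketch of how such a name table can drive code generation, the snippet below repeats the relevant definitions from the diff and emits one C++ declaration per residual block. The `declare` helper and the `Params` type name are hypothetical and not taken from the PR. The reading of each tuple as (activation before the add, binary op, activation after the add) is inferred from the generated names.

```python
import enum


class CbrAct(enum.Enum):
    Identity = 1
    Relu = 2
    Silu = 3


# Inferred layout: (activation before the add, binary op, activation after the add)
EpiResBlocks = [
    (CbrAct.Silu, "cutlass::plus", CbrAct.Identity),
    (CbrAct.Identity, "cutlass::plus", CbrAct.Relu),
]

CamelName = {
    EpiResBlocks[0]: "Conv2dBiasSiluAdd",
    EpiResBlocks[1]: "Conv2dBiasAddRelu",
}


def declare(block, params_type="Params"):
    """Return a C++ declaration for the entry point of `block`.

    `params_type` is a placeholder type name (an assumption, not from the PR).
    """
    return f"void {CamelName[block]}(const {params_type}& params);"


# One declaration per supported residual block.
declarations = [declare(block) for block in EpiResBlocks]
```

Because the tuples are hashable, they can key the name dictionaries directly, so supporting a new epilogue combination means adding one tuple and one name per table.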


zhangjun · 2023-03-03T06:29:11Z

Add these to .gitignore:

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.cu
paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_residual.cu

zhangjun · 2023-03-03T06:35:06Z

paddle/phi/kernels/CMakeLists.txt

+  execute_process(
+    COMMAND ${sh_cmd} ${sh_arg0}
+    WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/fusion/cutlass")


Change this to the add_custom_target form, as in:

Paddle/paddle/fluid/eager/auto_code_generator/generator/CMakeLists.txt

Lines 61 to 70 in c3d1e7e

add_custom_target(
  eager_python_c_codegen
  COMMAND
    "${PYTHON_EXECUTABLE}"
    "${PADDLE_SOURCE_DIR}/paddle/fluid/eager/auto_code_generator/generator/python_c_gen.py"
    "--api_yaml_path=${api_yaml_path},${fwd_api_yaml_path}"
    "--output_path=${tmp_python_c_output_path}"
  COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_python_c_output_path}
          ${python_c_output_path}
  VERBATIM)

Because of how the kernel_declare function works, the files must already exist when cmake runs, so for now execute_process is the only way to generate them.
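Since generation runs every time cmake is invoked, one way to limit unnecessary rebuilds is for the generator script itself to mimic `copy_if_different` and leave an up-to-date file untouched. The sketch below illustrates that idea only. `write_if_different` is a hypothetical helper, and nothing in this thread says the PR's scripts work this way.

```python
from pathlib import Path


def write_if_different(path, content):
    """Write `content` to `path` only when it differs from what is on disk.

    Returns True if the file was (re)written, False if it was left untouched.
    Hypothetical helper; not part of the PR.
    """
    path = Path(path)
    if path.exists() and path.read_text() == content:
        return False  # unchanged: keep the old timestamp
    path.write_text(content)
    return True
```

Leaving the timestamp alone when nothing changed means the build system has no reason to recompile the generated .cu file.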

zhoutianzi666 · 2023-03-06T06:10:38Z

Re: adding the generated .cu files to .gitignore: done!

zhoutianzi666 · 2023-03-07T00:30:16Z

.gitignore

@@ -96,4 +96,6 @@ paddle/fluid/prim/api/generated/prim_api/*
 paddle/fluid/framework/__init__.py
 paddle/phi/api/profiler/__init__.py
 python/paddle/incubate/fleet/parameter_server/pslib/ps_pb2.py
+paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.cu


The generated file is added to .gitignore.

zhoutianzi666 · 2023-03-07T00:31:34Z

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.py

+
+# this is used for leaky_relu, this activation need a fuse_alpha parameter
+
+cba_kernel_alpha = cba_kernel_no_alpha.replace(


Some activation functions, such as leaky_relu, need an alpha parameter, so cba_kernel_alpha is defined for them.
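The derivation of the alpha variant can be sketched as plain string replacement on the no-alpha template. The template text below is a heavily shortened, hypothetical stand-in for the real one, and the first `replace` call is an assumption. Only the second `replace` appears in this PR's diff.

```python
# Hypothetical, shortened stand-in for the real kernel template.
cba_kernel_no_alpha = """
void ${kernel_func_name}(const Params& params) {
  typename ImplicitGemm::Arguments arguments{problem_size, {1.f, 1.f}};
}
"""

# Derive the alpha variant textually:
#  1. (assumed) pass `alpha` as an extra epilogue argument;
#  2. (from the diff) read `alpha` from the params struct before it is used.
cba_kernel_alpha = cba_kernel_no_alpha.replace(
    "{1.f, 1.f}", "{1.f, 1.f, alpha}"
).replace(
    "typename ImplicitG", "float alpha = params.alpha; typename ImplicitG"
)
```

Deriving one template from the other keeps the two kernels in sync, at the cost of depending on the exact substrings being present in the base template.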

zhoutianzi666 · 2023-03-07T00:32:19Z

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.py

+    "epi_part": "${epi_func}< ${element_c}, ${epilogue_vector_length}, ${element_accum}, ${element_epilogue}>",
+}
+
+cba_kernel_no_alpha = (


This part is the code that passes parameters to the conv kernel.
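To illustrate how a `${...}` pattern like `epi_part` expands into a concrete C++ type, here is a minimal sketch that uses Python's standard `string.Template`. That choice is an assumption: the PR's scripts may use their own substitution helper, and the values filled in are illustrative only.

```python
from string import Template

# The pattern quoted in the diff above.
epi_part = Template(
    "${epi_func}< ${element_c}, ${epilogue_vector_length}, "
    "${element_accum}, ${element_epilogue}>"
)

# Illustrative values only; the PR's actual configurations may differ.
values = {
    "epi_func": "cutlass::epilogue::thread::LinearCombinationRelu",
    "element_c": "cutlass::half_t",
    "epilogue_vector_length": "8",
    "element_accum": "float",
    "element_epilogue": "float",
}

# substitute() raises KeyError if any placeholder is left without a value.
epilogue_type = epi_part.substitute(values)
```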

zhoutianzi666 · 2023-03-07T00:52:36Z

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_util.cu

@@ -62,12 +62,14 @@ __global__ void naive_conv2d_kernel(const half *input,
                                    int dilation_w,
                                    int oh,
                                    int ow,
+                                    int groups,


This baseline function now supports group conv, in preparation for future CUTLASS group conv and depthwise conv support.
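For readers unfamiliar with what the `groups` parameter changes, the sketch below shows the standard grouped-convolution rule for which input channels an output channel reads. It is a generic illustration in Python, not a transcription of the CUDA baseline, whose exact indexing is not shown in this thread. The function name is hypothetical.

```python
def input_channels_for(oc, in_channels, out_channels, groups):
    """Return the range of input channels that output channel `oc` reads.

    Standard grouped-convolution semantics: channels are split into `groups`
    equal slices, and each output slice only sees its own input slice.
    """
    assert in_channels % groups == 0 and out_channels % groups == 0
    ic_per_group = in_channels // groups
    oc_per_group = out_channels // groups
    group = oc // oc_per_group
    return range(group * ic_per_group, (group + 1) * ic_per_group)
```

With `groups == 1` every output channel reads all input channels. With `groups == in_channels == out_channels` (depthwise conv) each output channel reads exactly one.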

qingqing01

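1. Are there unit tests?
2. Documentation will be needed later: for each new template-generated kernel, which files have to be modified? It looks like both the C++ and the Python need changes.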

qingqing01 · 2023-03-07T08:12:36Z

paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.py

+).replace(
+    "typename ImplicitG", "float alpha = params.alpha; typename ImplicitG"
+)
+


This file needs more comments so that it is easy for others to maintain and update.


zhoutianzi666 · 2023-03-08T00:32:58Z

1. The unit tests are in python/paddle/fluid/tests/unittests/ir/inference/test_cutlass_conv2d_fusion_op.py.
2. The C++ files need almost no changes. The C++ code only serves as a baseline for verifying the correctness of the various CUTLASS kernels.


qingqing01

Going forward, code like this needs more detailed comments that spell out usage restrictions and so on.

zhoutianzi666 · 2023-03-11T23:20:06Z

OK.

zhangjun

LGTM

Follow-up TODOs:
the groups > 1 case;
support for padding_algorithm values other than EXPLICIT.

use python to generate cutlass code

83f469a

paddle-bot added the "contributor" (External developers) and "status: proposed" labels on Feb 17, 2023

zhoutianzi666 added 8 commits February 22, 2023 00:14

make code look good

19032b1

merge develop

04cb56e

refine code

7907607

refine CommonConvKernelPart1, CommonConvKernelPart2

4e7d1d1

refine code

590313f

remove useless code in generate_cutlass_code.sh

e342adb

add more config in conv2d_residual

d6fce71

CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2

f58b372

add group conv support in util.cu

618ed8f

zhangjun reviewed Mar 3, 2023


zhoutianzi666 added 11 commits March 3, 2023 08:54

remove .sh

aefdb26

revert to execute_process

72403c9

revert to execute_process

ae0472b

revert to execute_process

02e0724

merge develop

991d4ed

remove std::cout

97b5909

refine code

d3c9b54

fix typo

d564574

refine name

74f356e

make name goodgit status!

c36436d

merge develop

52633af

zhoutianzi666 commented Mar 7, 2023


add fuse_alpha

a9c1f7c

qingqing01 reviewed Mar 7, 2023


make code easy to understand

a0df170

zhoutianzi666 added 2 commits March 8, 2023 01:44

Merge branch 'develop' into add_cutlass_template

9300f74

mot fopen generate in py

65e54a0

MARD1NO reviewed Mar 8, 2023


paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_bias_act.py (resolved)

MARD1NO reviewed Mar 9, 2023


paddle/phi/kernels/fusion/cutlass/conv2d/conv2d_common.py (outdated, resolved)

qingqing01 previously approved these changes Mar 9, 2023


zhoutianzi666 dismissed qingqing01’s stale review via 7e93dbc March 9, 2023 23:31

use python script to generate conv2d,group=1 cutlass code

06b647e

zhoutianzi666 force-pushed the add_cutlass_template branch from 7e93dbc to 06b647e Compare March 10, 2023 00:46

zhoutianzi666 closed this Mar 10, 2023

zhoutianzi666 reopened this Mar 10, 2023

zhoutianzi666 added 5 commits March 10, 2023 05:51

revert &

7553aba

Merge branch 'develop' into add_cutlass_template

48efef8

revert no use &

f70f91e

use const &

cb1a5d3

use const & && use python script to generate conv2d/group=1 code

31a500d

qingqing01 approved these changes Mar 13, 2023


zhangjun approved these changes Mar 13, 2023


zhangjun merged commit 4e9e23c into PaddlePaddle:develop Mar 13, 2023
