Multihead matmul sparse #5

Closed
wants to merge 19 commits into from
minghaoBD

PR types

Performance optimization

PR changes

OPs

Describe

This PR makes several optimizations:

  1. Reduce runtime memory consumption by using shared_ptr in spmm_plugin.cu.
  2. Support FP16 inference by converting the bias from FP32 to FP16 when running in FP16 mode.
  3. Apply spmm_plugin to the fused multihead_matmul op to further improve inference performance.
  4. Add a corresponding unit test (UT).
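
As a rough illustration of item 1, the sketch below shows one way a plugin's weight buffer can be held in a std::shared_ptr so that cloning the plugin copies a pointer rather than the weights. The class name SparsePluginSketch and its members are hypothetical and are not taken from spmm_plugin.cu; the actual change in this PR may be structured differently.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TensorRT-style plugin; NOT the real spmm plugin.
class SparsePluginSketch {
 public:
  explicit SparsePluginSketch(std::vector<float> weights)
      : weights_(std::make_shared<const std::vector<float>>(std::move(weights))) {
    assert(!weights_->empty());  // assumption: a plugin always has weights
  }

  // clone() copy-constructs the plugin, which copies only the shared_ptr.
  // Every clone therefore refers to the same underlying weight buffer.
  std::unique_ptr<SparsePluginSketch> clone() const {
    return std::unique_ptr<SparsePluginSketch>(new SparsePluginSketch(*this));
  }

  const float* data() const { return weights_->data(); }
  long owners() const { return weights_.use_count(); }

 private:
  std::shared_ptr<const std::vector<float>> weights_;
};
```

With a plain std::vector member, each clone() would duplicate the weights; with the shared pointer, the buffer is freed only once the last plugin instance referring to it is destroyed.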

@@ -6,6 +6,6 @@ nv_library(tensorrt_engine SRCS engine.cc trt_int8_calibrator.cc DEPS ${GLOB_OPE
endif()
nv_library(tensorrt_op_teller SRCS op_teller.cc DEPS framework_proto device_context boost)
nv_test(test_tensorrt SRCS test_tensorrt.cc DEPS dynload_cuda device_context dynamic_loader)
-nv_test(test_tensorrt_engine SRCS test_engine.cc DEPS dynload_cuda tensorrt_engine)
+nv_test(test_tensorrt_engine SRCS test_engine.cc test_dynamic_engine DEPS dynload_cuda tensorrt_engine tensorrt_plugin)
Owner

Missing the .cc?

Author

Done, thanks

@@ -101,6 +101,7 @@ pass_library(matmul_scale_fuse_pass inference)
pass_library(gpu_cpu_map_matmul_to_mul_pass inference)
pass_library(mixed_precision_configure_pass inference)
pass_library(replace_dense_with_sparse_pass inference)
+pass_library(replace_dense_multihead_matmul_with_sparse_pass inference)
Owner

Either rename replace_dense_with_sparse_pass to replace_dense_fc_with_sparse_pass, or merge replace_dense_multihead_matmul_with_sparse_pass into replace_dense_with_sparse_pass.

Author

replace_dense_with_sparse_pass -> replace_dense_fc_with_sparse_pass

@minghaoBD minghaoBD requested a review from b3602sss June 1, 2022 02:29
@minghaoBD minghaoBD closed this Jun 2, 2022