Double buffering improvements #1511

Open
wants to merge 1 commit into
base: develop

Commits on May 13, 2024

  1. Double buffering improvements

    - Split the LDS reads and MFMA/WMMA into two independent loops
    - Place them in two separate stages (so that they can be executed in
    parallel)
    
    This is to make our pipeline similar to what CK is doing in:
    - https://github.com/ROCm/composable_kernel/blob/6d073d31bbc7d39d8b170d549f2af61970378150/include/ck/tensor_operation/gpu/block/blockwise_gemm_pipeline_xdlops_v4.hpp
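
    The split described above can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the code in this commit or in CK: `lds_read`, `mfma`, `fused`, and `split_pipelined` are hypothetical names, plain Python lists stand in for LDS tiles and registers, and the two stages run one after the other here, whereas on hardware the intent is for them to overlap.

    ```python
    # Hypothetical model of the pipeline change. None of these names are
    # rocMLIR or CK APIs; lists stand in for LDS tiles and register buffers.

    def lds_read(tile, k):
        """Stage 1: load one K-slice of A and B from the 'LDS' tile."""
        a_tile, b_tile = tile
        return a_tile[k], b_tile[k]

    def mfma(acc, a_frag, b_frag):
        """Stage 2: outer-product accumulate, standing in for MFMA/WMMA."""
        for i, a in enumerate(a_frag):
            for j, b in enumerate(b_frag):
                acc[i][j] += a * b

    def fused(tiles, m, n):
        """Single loop: each LDS read is immediately followed by its MFMA."""
        acc = [[0] * n for _ in range(m)]
        for tile in tiles:
            for k in range(len(tile[0])):
                a_frag, b_frag = lds_read(tile, k)
                mfma(acc, a_frag, b_frag)
        return acc

    def split_pipelined(tiles, m, n):
        """Two independent loops (read loop, MFMA loop) in separate stages.

        Reads for tile i+1 fill one register buffer while the MFMA loop
        consumes the other buffer, which holds tile i.
        """
        acc = [[0] * n for _ in range(m)]
        if not tiles:
            return acc

        def read_loop(tile):
            return [lds_read(tile, k) for k in range(len(tile[0]))]

        def mfma_loop(regs):
            for a_frag, b_frag in regs:
                mfma(acc, a_frag, b_frag)

        regs = [None, None]
        regs[0] = read_loop(tiles[0])                  # prologue
        for i in range(len(tiles) - 1):
            regs[(i + 1) % 2] = read_loop(tiles[i + 1])  # stage 1, next tile
            mfma_loop(regs[i % 2])                       # stage 2, current tile
        mfma_loop(regs[(len(tiles) - 1) % 2])          # epilogue
        return acc
    ```

    Because the read loop and the MFMA loop touch different register buffers within an iteration, neither depends on the other, which is what allows a scheduler to place them in separate stages. Both functions accumulate the same products in the same order, so their results match.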
    giuseros committed May 13, 2024
    c9930bc