Commit
[Dev][TL] Integrate TL Dequant Implementation into BitBLAS OPs (#214)
* Refactor tilelang dequantize module and add matmul_blocked_weight_only function
* remove unimplemented code.
* Implement BaseScheduler to wrap some related items.
* lint fix
* test skip
* Refactor tilelang dequantize module and add matmul_blocked_weight_only function
* test fix
* hardware tuning demo
* remove debug related items.
* implement tuner and cache fix
* lint fix
* test case fix.
* Adapt Tuning Space generation with Roller
* lint fix
* Refactor select_scheduler function for fine-grained interface

  The select_scheduler function in the dense/__init__.py module has been refactored to use a fine-grained interface. This change provides more flexibility and enables the implementation of high-performance kernels.

  Update MatmulScheduler class in matmul_tensorcore.py

  The MatmulScheduler class in the matmul_tensorcore.py module has been updated to calculate the number of threads based on the block size and warp size. This ensures optimal GPU warp configuration for NVIDIA GPUs.

  Improve test_general_matmul_tilelang_kernel.py

  The test_general_matmul_tilelang_kernel.py module has been improved to include additional test cases and assertions for correctness.
* Refactor select_scheduler function for fine-grained interface
* Refactor NotImplementedError message in BaseTLHint class
* Update submodule reference in 3rdparty/tvm
* Refactor matmul_finetune function to use topk=20 for hardware-aware finetuning
* Refactor submodule reference in 3rdparty/tvm
* lint fix
* Refactor test_general_matmul_tilelang_impl.py and test_tilelang_gemm.py
* Refactor MatmulConfig to enable weight propagation on supported devices
* Refactor test_general_matmul_tilelang_impl.py and test_general_matmul_tilelang_kernel.py to use centered random values for input tensors
* test fix
* test fix
* Refactor flash attention tests to use centered random values for input tensors
* Refactor flash attention tests to use centered random values for input tensors
* Refactor flash attention tests to skip test if flash_attn is not installed
* lint fix
* test fix
* test fix
* test fix
* Refactor quantization module imports
* lint fix
* Update yapf version in requirements-dev.txt and requirements-test.txt
* Refactor shared memory to global memory storage in MatmulFineGrainScheduler
* test fix
* format
* test fix
* Refactor tensorcore policy to use list comprehension for readability
* lint fix
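The MatmulScheduler note above says the thread count is derived from the block size and the warp size. The actual code is not shown on this page, so the following is only a minimal sketch of one common way to do that calculation. The function name `threads_per_block`, its parameters, and the per-warp tile shape are hypothetical and are not taken from BitBLAS.

```python
# Hypothetical sketch: not the BitBLAS implementation.
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def threads_per_block(block_m, block_n, warp_m, warp_n, warp_size=WARP_SIZE):
    """Return the thread count for a block tile split into per-warp tiles.

    block_m, block_n: output tile computed by one thread block (assumed names).
    warp_m, warp_n: output tile computed by one warp (assumed names).
    """
    if block_m % warp_m != 0 or block_n % warp_n != 0:
        raise ValueError("block tile must be a multiple of the warp tile")
    # Number of warps needed to cover the block tile along each dimension.
    warps_along_m = block_m // warp_m
    warps_along_n = block_n // warp_n
    # Each warp contributes warp_size threads.
    return warps_along_m * warps_along_n * warp_size


if __name__ == "__main__":
    # A 128x128 block tile covered by 64x64 warp tiles needs 2 * 2 = 4 warps.
    print(threads_per_block(128, 128, 64, 64))
```

Under these assumptions, the thread count always comes out as a whole multiple of the warp size, which is the property the commit message appears to be after.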
1 parent 314b2a1 commit a6d627c
Showing 30 changed files with 1,450 additions and 234 deletions.