Releases: mirage-project/mirage
v0.2.2
What's Changed
- [Search] Support input/output strides specification in #108
- [Docs] Update documentation for building the C++ library in #109
- [Transpiler] Parallelize transpilation to speed up superoptimization by 10x by @GuangyaoZhang in #119
- [Transpiler] Add a mechanism to skip invalid transpiled kernels in #117
- [Visualizer] Add functionality to visualize mugraphs by @NorthmanPKU in #113
- [Transpiler] Add shared memory usage as part of the cost when determining the layouts for stensors in #130
New Contributors
- @preejackie made their first contribution in #109
- @GuangyaoZhang made their first contribution in #119
- @NorthmanPKU made their first contribution in #113
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- [Docs] Add doc files by @jiazhihao in #90
- Fix SiLU by @xinhaoc in #100
- [Layout] Add initial support for user-defined input/output strides for kernel graphs by @jiazhihao in #98
- Set default strides for outputs by @wmdi in #105
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Major release with a range of changes to the Python interface, search implementation, transpiler, and documentation.
What's Changed
- [Triton CodeGen] Fix an issue when generating Triton programs from mugraphs
- [LoRA demo] Add the checkpoint file for the lora demo
- [DeviceMemoryManager] Use offsets instead of pointers to locate tensors and fingerprints in device memory
- [Graph Generator] Parallelize the generation algorithm
- Improve parallel search performance
- [Accumulator] Decouple the accumulator from the output saver in threadblock graphs
- Update the setup workflow for packaging
- Add more element_unary & element_binary operators at the kernel and threadblock levels
- [CUDA Transpiler] Support JIT transpilation and compilation
- [Search] Range-based pruning
- Fix some existing issues by @xinhaoc in #63
- [Transpiler] Support threadblock matmul using cute when the input/output stensors have more than 2 dimensions
- Include header files for JIT compilation. MIRAGE_ROOT is no longer required.
- [Python] Update the Python interface to support search
- [Search] Adjust the expansion phase of search
- [Search] Improve the display of search statistics
- Set default max_num_threadblock_graphs to 1
New Contributors
- @wmdi made their first contribution in #3
- @geohotstan made their first contribution in #14
- @jiakunw made their first contribution in #20
- @interestingLSY made their first contribution in #36
Full Changelog: https://github.com/mirage-project/mirage/commits/v0.2.0