Test merge unity #2

Closed
wants to merge 876 commits into from
876 commits
d2a9f39
[Fix] Fix Build with RCCL (#15863)
junrushao Oct 3, 2023
8954139
[Unity] Add R.quantize and R.dequantize ops (#15849)
ibsidorenko Oct 3, 2023
7d53953
[Unity] Check the PassContext within RewriteCUDAGraph transform (#15861)
Lunderberg Oct 3, 2023
9f0ac49
[CMake] Prefer a static NCCL (#15869)
junrushao Oct 4, 2023
6d2b44d
[Bugfix] Fix Disco-CUDAGraph Integration (#15870)
junrushao Oct 5, 2023
b486210
[Disco] Add -lrt to TVM runtime for NCCL (#15876)
junrushao Oct 5, 2023
2e30dbe
[Unity][Fix] Remove duplicated words from comments, NFC (#15875)
Oct 5, 2023
8a2ca34
[Unity] Use PrimValue as offset in R.tril and R.triu (#15783)
Lunderberg Oct 5, 2023
59ec81f
[Unity][MSC][pre M1.2] Reconstruct codegen (#15813)
Archermmt Oct 7, 2023
7c4c2c2
[Disco][Fix] Remove Dependency to PyTest (#15886)
junrushao Oct 7, 2023
969e31a
[Unity][NN] Enhance ReLU and GELU support (#15885)
Hzfengsy Oct 7, 2023
b9a02b1
Merge remote-tracking branch 'apache-upstream/main' into apache-upstr…
junrushao Oct 7, 2023
9f4d0fa
[Unity][Transform] Allow static Relax arguments to dynamic PrimFunc (…
Lunderberg Oct 9, 2023
dd57556
[Unity][Op] Introduce `call_inplace_packed` as a counterpart to `call…
slyubomirsky Oct 9, 2023
ec1184e
[Unity] [Bugfix] Fix KeyError:'None' in layer_norm and correctly use …
Thrsu Oct 9, 2023
b138005
[Unity] [Bugfix] Fix TypeError in TVM PyTorch frontend for LayerNorm …
Thrsu Oct 10, 2023
67d6193
[Unity] [Bugfix] Fix MaxPool TypeError in ONNX frontend (#15908)
Thrsu Oct 11, 2023
58e00f3
[Unity]Add FastMathTransform pass to Relax (#15814)
HongHongHongL Oct 11, 2023
f154026
[Unity] Improve FuseOps error messages (#15899)
Lunderberg Oct 11, 2023
465d691
[Unity] Ignore R.ExternFunc in EliminateCommonSubexpr (#15900)
Lunderberg Oct 11, 2023
bde8a87
[Unity][Relax] Support Dynamic Tensor as Index, torch frontend (#15884)
guoyaol Oct 11, 2023
95a89d2
[Unity] Fix wrong variable name in test_optimize_layout_transform (#1…
HongHongHongL Oct 11, 2023
1e789f3
[Fix][Unity] Fix TVMError when loading ONNX model with CumSum operato…
dmilosevic252 Oct 12, 2023
589d919
[Unity] Fix TVMScript Issues in Testcases (#15920)
Hzfengsy Oct 12, 2023
c62dcfa
[Unity] Propagate extra symbolic vars through LiftTransformParams (#1…
Lunderberg Oct 12, 2023
559e94a
[Unity][TVMScript] Avoid dangling reference when printing Call attrs …
Lunderberg Oct 13, 2023
e676780
[Unity][VM] Improved error message in CodeGenVM::EmitKillObject (#15822)
Lunderberg Oct 13, 2023
9725144
[Unity][Transform] Canonicalize and use CSE between pattern matches (…
Lunderberg Oct 13, 2023
9892675
[Unity][Relax]support torch.arange()+ (int) in torch frontend (#15917)
guoyaol Oct 14, 2023
c606fcf
[Unity] Paged KV Cache as LM Support (#15910)
MasterJH5574 Oct 15, 2023
10d5cae
[Unity][MSC][M1.2] Add translate && codegen for tensorflow (#15905)
Archermmt Oct 16, 2023
354c5f1
[Unity] [Bugfix] Fix bug in interpolate operator's default mode param…
Thrsu Oct 16, 2023
5f4412b
[Unity][Frontend][Onnx] Add support for Elu operator (#15937)
HongHongHongL Oct 17, 2023
ecb689e
[Unity] [Bugfix] Fix TypeError in interpolate caused by scale_factor …
Thrsu Oct 17, 2023
4a0151c
[Web] Further increase default EMCC compilation total memory size (#1…
MasterJH5574 Oct 18, 2023
8dc4bd6
[Unity][Dlight] Add software pipeline to matmul tensorization rule (#…
vinx13 Oct 18, 2023
3acb2ca
[Unity] Fix bug in onnx_frontend when multi inputs have same symbolic…
HongHongHongL Oct 18, 2023
8874f8b
[Unity][MSC][Bugfix] Trilu bugfix && special ops support (#15938)
Archermmt Oct 18, 2023
ec9e0a0
[Unity] Allow FLegalize to produce Relax operations (#15842)
Lunderberg Oct 19, 2023
1ba11f6
[Unity][BYOC] Add support for sliding window in attention op (#15951)
masahi Oct 19, 2023
ea70b02
[Unity] `nn.Mutator` for `nn.Module` level transform (#15958)
cyx-6 Oct 22, 2023
b8a1d63
[Fix][Unity] Fix TVMError when using relax to load model with Trilu o…
dmilosevic252 Oct 22, 2023
9ff2450
[Unity][SWA] Overriding windowed cache support (#15963)
davidpissarra Oct 22, 2023
5c6b6bb
[Unity][Pass] Include `FoldDataflowBlockOutput` in `CanonicalizeBindi…
slyubomirsky Oct 23, 2023
ea588c1
[Unity] Update relax_dynamo (#15962)
liquanfeng Oct 24, 2023
bcdbc3e
[UNITY] Fix the symbolic var handling (#15973)
tqchen Oct 24, 2023
4d19c8a
[Unity][Transform] Improved canonicalization of non-dataflow Var (#15…
Lunderberg Oct 25, 2023
5808cea
[Unity][BYOC] CoreML Scaffolding (#15556)
sunggg Oct 25, 2023
7ef36eb
[Unity] Support symbolic PrimValue arguments (#15980)
Lunderberg Oct 25, 2023
04c6863
[Unity][BYOC] Fix cuBLAS BYOC compatibilty with Disco + `ThreadedSess…
masahi Oct 25, 2023
ef39f37
[Unity][BYOC] Support variable-length attention by flash attention (#…
masahi Oct 25, 2023
5d62e9b
[Unity][MSC][M1.3] Add translate && codegen for tensorrt (#15950)
Archermmt Oct 26, 2023
ebbe38f
[Unity] Include LegalizeOps in the default relax.build lowering flow …
Lunderberg Oct 26, 2023
8f97a76
[Unity][Testing] Show failing module in WellFormedInstrument (#15898)
Lunderberg Oct 26, 2023
484bd44
[Unity][BYOC] Do not use cudaMemcpy for max_seqlen in var len attenti…
masahi Oct 26, 2023
6408cc4
[Disco] Explicitly set the session on DPackedFunc and DModule (#15996)
yelite Oct 29, 2023
3184a80
[MERGE] Merge main into unity 2023-10-29
tqchen Oct 29, 2023
393aaa3
Fix after merge
tqchen Oct 29, 2023
53ccf18
[Unity] Allow Pipeline Registration (#16008)
junrushao Oct 30, 2023
6936829
[Unity] Remove end-of-life handling from StaticPlanBlockMemory (#15841)
Lunderberg Oct 30, 2023
d932608
[Unity][UnitTest] Enable BindParams test for R.Prim (#15978)
Lunderberg Oct 30, 2023
90bb10e
[Unity][nn.Module] Support Parameter Packing (#16007)
junrushao Oct 30, 2023
16af021
[Unity][nn.Module] Support `nn.SourceModule` (#16006)
junrushao Oct 31, 2023
7486476
[Unity] Deterministic Ordering when Iterating IRModule::functions (#1…
junrushao Oct 31, 2023
8c7aaa6
[Unity][UnitTest] Cleanup test_vm_build.py (#15981)
Lunderberg Oct 31, 2023
49a3a51
[Unity] Ensure one VM register for each relax binding (#15855)
Lunderberg Oct 31, 2023
a9c81a7
[Unity] Replace relax_vm/memory_manager with memory/memory_manager (#…
yongwww Oct 31, 2023
3c86037
[Unity] Avoid Emitting Redandunt Bindings in TensorExpr Op (#16026)
junrushao Oct 31, 2023
0f8186f
[Fix] Windows Build (#16028)
junrushao Nov 1, 2023
853732e
[Unity] Support getting variable mapping for FunctionCopier (#16012)
Ubospica Nov 1, 2023
7833f4e
[Unity][BlockBuilder] Allow emitting nested tuple (#15993)
Lunderberg Nov 1, 2023
a801064
[Unity] Alias IntTuple <= ShapeTuple (#16035)
junrushao Nov 2, 2023
23371ca
[Unity][MSC][M1.4] Add Runner and test with relax (#15997)
Archermmt Nov 2, 2023
2329b1a
[Fix] Update mutator name rule (#16046)
LeshengJin Nov 3, 2023
0ed1e30
[Unity][Support] Sample from top-p supports offset (#16069)
MasterJH5574 Nov 4, 2023
9202f4b
[Bugfix] Compilation Error with Clang (#16071)
junrushao Nov 5, 2023
151aa74
[Unity][Dlight] Metal Performance (#15985)
spectrometerHBH Nov 5, 2023
021f31b
[Unity] Fix FuseTIR pass for gather/take cases (#16064)
Hzfengsy Nov 5, 2023
7d0e60a
Revert "[Unity][Support] Sample from top-p supports offset" (#16077)
MasterJH5574 Nov 5, 2023
664beaa
[Unity] Loading NDArrayCache by parameter names (#16078)
junrushao Nov 6, 2023
cf013c2
[Unity] Handle duplicate outputs in LazyTransformParams (#15942)
Lunderberg Nov 6, 2023
3f1347c
[Unity] Enhance Python Annotations for Relax Expr (#16075)
Hzfengsy Nov 6, 2023
3de77f8
Merge remote-tracking branch 'main' into unity
junrushao Nov 6, 2023
e7d12be
[Unity][Training] Support intermediate vars as require_grads for Grad…
Ubospica Nov 6, 2023
e506bff
[Unity] Implement FNormalize attribute for operators (#16067)
Lunderberg Nov 7, 2023
eb20534
[Unity][Bugfix] Track variable usage from input to impure functions (…
Lunderberg Nov 7, 2023
8e8799d
[Unity][Dlight] Enhance matmul tensorizer with Int8 support (#16084)
ibsidorenko Nov 7, 2023
6ce1602
[Unity] support symbolic var in RewriteDataflowReshape (#16086)
jinhongyii Nov 8, 2023
9eeb5bc
[Unity] Support Regular expression matching in globalvar dataflow pat…
jinhongyii Nov 8, 2023
9100a8e
[Unity] [LiftTransformParams] Treat symbolic var in weight shape as c…
jinhongyii Nov 8, 2023
e37165f
[Unity][MSC][M1.5-1.7] Add Runner and test with torch, tensorflow && …
Archermmt Nov 9, 2023
6b9c277
[Unity][Training] Simplify matmul patterns after gradient (#16082)
Ubospica Nov 9, 2023
e359e7a
[Disco] Add loader for presharded params. (#15957)
Lunderberg Nov 9, 2023
384f9b6
[Unity] Add `axis` field to scatter_from_worker0 (#16092)
jinhongyii Nov 9, 2023
276b4ce
[Unity][Fix] Fix `topi.rms_norm` with float32 upscale (#16099)
cyx-6 Nov 9, 2023
171ef61
[Unity] make LazyTransformParam more general (#16088)
jinhongyii Nov 9, 2023
7dd248b
[Unity][Dlight] Choose perfect spatial factor in reduction rule (#16101)
vinx13 Nov 10, 2023
bc10f76
[Unity] Add LoadParamOnWorker0 function in shard loader (#16093)
jinhongyii Nov 10, 2023
a3d9108
[Smallfix][WEB] Change memory manager import for web (#16107)
CharlieFRuan Nov 11, 2023
06a4899
[Unity][Fix] Fix `rms_norm` tests (#16109)
cyx-6 Nov 11, 2023
7892af0
[Unity][WebGPU] Allow lower max storage buffer binding size (#16108)
CharlieFRuan Nov 11, 2023
7a0c3f9
[Unity][Support] PagedKVCache support growth control (#16112)
MasterJH5574 Nov 12, 2023
835bc82
[Unity][TVMJS] Add md5sum to weight shards (#16122)
junrushao Nov 14, 2023
d5daa98
[Unity] Allow Customized Pipeline in `relax.build` (#16121)
junrushao Nov 14, 2023
26a20ee
[Unity] Improved error message in relax::Normalizer (#16114)
Lunderberg Nov 14, 2023
e7c7314
[Unity][Transform] Keep R.ExternFunc in dead-code elimination (#16118)
Lunderberg Nov 14, 2023
6f650db
[Unity][DistIR] Legalize redistribute (#16098)
jinhongyii Nov 14, 2023
c9de001
[Unity] [Transform] Skip constants in CSE pass (#16125)
quic-sanirudh Nov 14, 2023
0ddfc65
[Unity] Implement FNormalize for relax.op.call_tir (#16068)
Lunderberg Nov 14, 2023
684a8ca
[Unity][DLight] Enhance the inline consumer rule (#16124)
Hzfengsy Nov 15, 2023
bc2415b
[Unity][BYOC] Support IGEMM in cuBLASLt (#16134)
ibsidorenko Nov 15, 2023
2dbab71
Merge branch 'main' into 'unity'
MasterJH5574 Nov 16, 2023
44d80e5
[Unity][Bugfix] Reset window cache current pos when clearing (#16132)
CharlieFRuan Nov 17, 2023
8a2ffee
[Unity] Do not import SciPy by default (#16136)
junrushao Nov 17, 2023
90edf76
[Unity][LLM] Add NaN checks during sampling for better error reportin…
Hzfengsy Nov 17, 2023
5e61adc
[Unity][DistIR] Enhance PropagateSharding pass (#16094)
jinhongyii Nov 17, 2023
165b84b
Always use int64 in JSON parser (#16145)
junrushao Nov 18, 2023
4c07f6a
[Runtime] Introduce Type-Checked `TVMArgs::At<T>(i)` (#16147)
junrushao Nov 19, 2023
4e70c28
[Runtime] Allowing Packed Arguments in TVM Module VTable (#16148)
junrushao Nov 19, 2023
29450b9
[Unity][MSC] Enable add attributes while fuse ops (#16128)
Archermmt Nov 20, 2023
bafd49d
[Unity] Flash infer integration (#16146)
jinhongyii Nov 20, 2023
9a98571
[Unity] Migrate Relax Executable/VM to `TVM_MODULE_VTABLE` Convention…
junrushao Nov 21, 2023
925cb2b
[Unity][BYOC] Add cutlass finegrained decode matmul (#16144)
vinx13 Nov 21, 2023
aae1112
[Unity] Support constant args in `nn.ExternModule` (#16130)
cyx-6 Nov 22, 2023
756ce99
[Unity][3rdparty] Remove TVM in 3rdparty of FlashInfer (#16155)
MasterJH5574 Nov 22, 2023
1de8b34
[Unity][DistIR] LowerGlobalViewToLocalView (#16095)
jinhongyii Nov 26, 2023
2dcb871
[Unity][BlockBuilder] Depracate `BlockBuilder.get()` and change it to…
Ubospica Nov 28, 2023
8f24a27
[Unity][MSC][M2.1] Add Manager for compile pipeline (#16163)
Archermmt Nov 28, 2023
af803cf
[Unity][DLight] Fix `general_reduction` for GroupNorm (#16161)
Hzfengsy Nov 28, 2023
64fe5a8
[Unity][DistIR] Add DTensor struct info propagation rule for stop_lif…
jinhongyii Nov 28, 2023
c640d0a
[Unity][Web] Fix missing function NVTXScopedRange for web (#16177)
CharlieFRuan Nov 29, 2023
8a6184c
[Unity, BYOC] Add check for leaking intemediate variables for cublas …
vinx13 Nov 29, 2023
a6adaae
[Unity][DistIR] LowerDistIR (#16169)
jinhongyii Nov 29, 2023
85389ef
[Unity][BYOC] Fix Flash var_len attention with sliding window (#16185)
masahi Nov 30, 2023
6844348
[Unity][Bugfix] Handle symbolic matching with non-structural match (#…
Lunderberg Nov 30, 2023
d52a9bf
[Unity][Transform] Implement RemoveUnusedOutputs (#16117)
Lunderberg Nov 30, 2023
d6015c5
[Unity][BugFix] Fix a bug in relax gelu_tanh computation (#16188)
rickzx Nov 30, 2023
fe9d2fe
[Unity][Transform] Implement ExpandTupleArguments (#16115)
Lunderberg Dec 1, 2023
fc324d0
[Unity][Transform] Implement RemoveUnusedParameters (#16116)
Lunderberg Dec 1, 2023
74667b9
[Unity] Enable ccache for `nn.SourceModule` (#16189)
cyx-6 Dec 2, 2023
ed2772f
[Unity][MSC][M2.1] Add pruner for model pruning (#16186)
Archermmt Dec 2, 2023
9e4e17c
[Unity][WebGPU] Get params from cache by name (#16198)
CharlieFRuan Dec 3, 2023
a2f55a8
[WEBGPU] Update to latest compilationHints API (#16197)
tqchen Dec 3, 2023
8f95f61
[Unity] [Transform] Remove iteration over functions in function pass …
quic-sanirudh Dec 4, 2023
3c7067d
[Unity] Minor: Remove debug logging (#16200)
junrushao Dec 4, 2023
34fd234
[Unity] Check usage location when canonicalizing trivial bindings (#1…
Lunderberg Dec 5, 2023
4e8c975
[Unity][Bugfix] Fix `tests/python/topi/test_topi_transform.py::test_r…
sunggg Dec 6, 2023
d050402
[Unity] Update FlashInfer (#16208)
cyx-6 Dec 7, 2023
ebbad09
[Unity] Upgrade cutlass_fpA_intB_gemm (#16206)
vinx13 Dec 7, 2023
03fc4f6
[Dlight] Change max_threads on CUDA (#16203)
jinhongyii Dec 8, 2023
58e622b
[Unity][Transform] Implement Relax function inlining (#16194)
Lunderberg Dec 9, 2023
e0518da
[Unity][MSC][M2.3] Add tracker for track layer datas (#16207)
Archermmt Dec 10, 2023
35e8404
[Disco] Expose `DiscoWorker` and `ndarray_cache_support` in header (#…
LeshengJin Dec 10, 2023
f18d186
[Unity] Speed up NormalizeGlobalVar (#16219)
Hzfengsy Dec 11, 2023
b5b980e
[Unity] Support out dtype for nn.Linear and nn.MultiLinear (#16220)
CharlieFRuan Dec 11, 2023
8241385
[Unity] De-duplicate calls to TensorStructInfo constructor (#16209)
Lunderberg Dec 11, 2023
2772fb0
[Unity] Fix upstream tests that fail on unity branch (#16196)
Hzfengsy Dec 12, 2023
c6d4926
[Dlight] Fix NormalizePrimFunc with scalar block (#16156)
vincentccc Dec 12, 2023
af14fbb
[Relax] Fix to enable emit_te of topi scan/sort kernels (#16226)
spectrometerHBH Dec 12, 2023
943508a
[Unity] Fix typo in dlight fallback rule (#16230)
vinx13 Dec 12, 2023
cbcb67c
[Unity][Frontend] Add the `sum` op to frontend ops (#16225)
MasterJH5574 Dec 13, 2023
fe89ccc
[Unity][Transform] Pass for automatically extracting DataflowBlocks (…
slyubomirsky Dec 13, 2023
f7b0193
[Unity] Fix IndexDataTypeNormalizer so that it correctly handles corn…
jinhongyii Dec 14, 2023
e100a13
[Unity] Fix legalizing strided slice (#16232)
vinx13 Dec 14, 2023
6741678
Revert "[Unity] Fix IndexDataTypeNormalizer so that it correctly hand…
tqchen Dec 14, 2023
6118b77
[Unity] Improved error checking for DataflowBlock in nested SeqExpr (…
Lunderberg Dec 14, 2023
e1964ec
[Unity] Add runtime debugging method to RelaxVM (#16238)
junrushao Dec 14, 2023
cd9445d
[Unity][lm_support] window kvcache sink (#16240)
davidpissarra Dec 15, 2023
a2e19d2
[Unity] Fix IndexDataTypeNormalizer so that it correctly handles corn…
jinhongyii Dec 15, 2023
8edfee8
[Unity][MSC][M2.4] Add quantizer for quantize model (#16228)
Archermmt Dec 15, 2023
5b1fa29
[Unity][VM] Allow `pipeline=None` in `relax.build` (#16246)
junrushao Dec 15, 2023
f794db4
[Unity] Avoid to use `std::regex` (#16249)
Hzfengsy Dec 16, 2023
2d0d4e4
[Unity] Enable spot nodes in CI (#16253)
Hzfengsy Dec 16, 2023
a0e5898
[Unity][nn.Module] Refactor `ExternModule` (#16247)
junrushao Dec 16, 2023
76e239e
[Unity] Fix Cutlass Codegen for Dense (#16252)
Hzfengsy Dec 16, 2023
95f1b5c
[Unity] Hot Fix Unity CI (#16256)
Hzfengsy Dec 17, 2023
e98fdea
[Unity] Bump fpA_intB_gemm (#16244)
vinx13 Dec 14, 2023
4e66690
[Fix] add TVM_DLL to disco functions (#16258)
LeshengJin Dec 17, 2023
45eeb8c
[Unity] Fix ccache env for `nn.SourceModule` (#16257)
cyx-6 Dec 18, 2023
f328e9b
[Unity] Add missing library import (#16263)
Kartik14 Dec 20, 2023
1c35c39
[Unity] Add Relax multi-device e2e cases (#15823)
yongwww Dec 20, 2023
3de5e86
[Unity][nn.Module] Support Runtime-Calling Any PackedFunc via `op.ext…
junrushao Dec 25, 2023
5c8caa6
[Unity] Unified KV cache interface and PagedKVCache refactor (#16273)
MasterJH5574 Dec 25, 2023
2bf3a0a
[Unity][MSC][M3.1] Add distiller for distill model (#16264)
Archermmt Dec 26, 2023
889d2f6
[Unity][Frontend] NNModule `tensor_ir_op` support (#16278)
Hzfengsy Dec 26, 2023
8946efa
Update FlashInfer (#16281)
junrushao Dec 26, 2023
58daeb4
Update FlashInfer (#16292)
junrushao Dec 28, 2023
303afdb
[Unity][MSC][M3.2] Add gym for pruning and quantization, enable auto …
Archermmt Dec 31, 2023
2f7e0d5
[Unity] Ensure memory planning cross-function independence (#16318)
MasterJH5574 Dec 31, 2023
beb8326
[Unity] Update cutlass FpA IntB GeMM submodule (#16320)
cyx-6 Dec 31, 2023
8867de8
[Unity][MSC][Bugfix] Use random workspace for test (#16322)
Archermmt Dec 31, 2023
9030522
[Unity][Frontend] Introducing Object (#16316)
MasterJH5574 Dec 31, 2023
b1df4b0
[Unity][Web][Fix] Fix fetchNDArray for f32-to-bf16 (#16294)
CharlieFRuan Jan 1, 2024
faa8a0a
[Unity][nn.Module] Introduce operator `empty` (#16327)
junrushao Jan 2, 2024
ac568eb
[Unity] Fix PagedKVCache per FlashInfer update (#16317)
MasterJH5574 Jan 2, 2024
09c44e6
[Unity] Upgrade flashinfer 3rdparty submodule (#16323)
yzh119 Jan 2, 2024
163c7ac
[Unity] Cutlass kernel compatibility with cmake 3.18+ (#16302)
Lunderberg Jan 2, 2024
b3f0e55
Change metal dtype of ceil_log2 to fp32 (#16332)
jinhongyii Jan 2, 2024
4a7e4fe
[Unity] Fix nn.op.tensor_ir_op signature (#16333)
jinhongyii Jan 3, 2024
1af82ad
[Unity] Validate struct info in relax::Call constructor (#16311)
Lunderberg Jan 3, 2024
ec542da
[Unity][Transform] Extract partial-tuple-usage from FuseTIR (#16120)
Lunderberg Jan 3, 2024
6f2fe45
[Unity][UnitTest] Increase atol to resolve flaky CI failure (#16340)
Lunderberg Jan 3, 2024
0cf5f47
[Unity] Dispatch cumsum and sort (#16254)
yongwww Jan 3, 2024
7dfc863
[Unity] Alter op impl handling empty transform for output (#16331)
rasagna-quic Jan 4, 2024
d88cc42
[Unity][Transform] Implement UpdateParamStructInfo (#16305)
Lunderberg Jan 4, 2024
d509661
[Unity][Analysis] Handle PrimStructInfo in EraseToWellDefined (#16304)
Lunderberg Jan 4, 2024
49fc613
[Unity][WEBGPU] Enable wasm exception propagation (#16330)
tqchen Jan 4, 2024
c3aa71a
[Unity][Analysis] Add utility for collecting compile-time bindings (#…
Lunderberg Jan 4, 2024
8d72091
[DLight] Skip rule if target is not suitable (#16321)
Hzfengsy Jan 4, 2024
31659b6
[Unity][Dlight] Support dlight gemv rule on nested inner block (#16251)
jinhongyii Jan 5, 2024
fe5f616
[Unity][MSC][Legalize] legalize codes and mute logging (#16325)
Archermmt Jan 5, 2024
f215a41
[Unity][NN] Use Linear name for nn.op.permute_dims (#16303)
Lunderberg Jan 5, 2024
ded4be4
enhance shared memory merge.
LeiWang1999 Jan 3, 2024
6b6419c
merge from unity upstream
LeiWang1999 Jan 3, 2024
047211f
revert the change for dyanmic test
LeiWang1999 Jan 3, 2024
76ceff2
fix typo
LeiWang1999 Jan 3, 2024
e3216a6
lint fix
LeiWang1999 Jan 4, 2024
3fad010
[Unity][Contrib] Add vLLM paged attention kernel (#16350)
vinx13 Jan 5, 2024
b47280b
Merge branch 'main' into 'unity'
MasterJH5574 Jan 6, 2024
155dd73
Fix after merging 'main' into 'unity'
MasterJH5574 Jan 6, 2024
0b13b5c
[Unity] Enhance Torch-consistency in rehsape (#16360)
junrushao Jan 7, 2024
8e54a9e
[Unity][DLight] Introduce Specific Rule for RMSNorm (#16338)
Celve Jan 8, 2024
4c77f0f
[TIR] Extend DP4A tensor intrin (#16293)
vincentccc Jan 8, 2024
ef2a913
[Unity] Improved error message for matmul shape mismatch (#16308)
Lunderberg Jan 8, 2024
b22aa0f
[Unity] Improved error message in ExprMutator::ReEmitBinding (#16307)
Lunderberg Jan 8, 2024
fab6db2
[Unity][Transform] Use parameter name in BundleModelParams (#16309)
Lunderberg Jan 8, 2024
1e95b63
Merge branch 'main' of github.com:apache/tvm into unity
vinx13 Jan 9, 2024
2d53e6a
[Unity][Transform] Handle replacement at both var binding and usage (…
Lunderberg Jan 9, 2024
a796023
[Unity][Fix] Memory planning check value type of 'tir_var_upper_bound…
MasterJH5574 Jan 9, 2024
4a37cfe
[Unity][Analysis] Show objects instead of names in WellFormedChecker …
Lunderberg Jan 9, 2024
4e05eb4
[CI] Upgrade Unity ci images (#16369)
vinx13 Jan 9, 2024
298ad2c
[Unity][Transform] Update LambdaLift to use name of lifted lambda (#1…
Lunderberg Jan 9, 2024
e1d71b3
[Unity] Add dlight.gpu.Fallback in DispatchSortScan, add argsort, top…
yongwww Jan 10, 2024
45532d7
Merge branch 'main' of github.com:apache/tvm into unity
vinx13 Jan 10, 2024
474c06b
[Unity] Set CMAKE_CUDA_ARCHITECTURES default to native (#16335)
vinx13 Jan 10, 2024
c40d96b
Merge remote-tracking branch 'upstream/main' into unity
tqchen Jan 11, 2024
b8230f6
[Unity] Update dispatch test cases following the merge from main (#16…
yongwww Jan 11, 2024
81a6c51
[Unity] Fix creation of disco ProcessSession (#16375)
vinx13 Jan 12, 2024
d1b890a
[Unity][Contrib] Fix a bug due to typo in vllm `reconstruct_from_cach…
masahi Jan 12, 2024
b69d720
[Unity][MSC] Avoid depending on trivial bindings in Relax intermediat…
Lunderberg Jan 12, 2024
4c7c010
[Unity][Transform] Implement relax.transform.AdjustMatmulOrder (#16314)
Lunderberg Jan 12, 2024
7798e93
[Unity] Support TIR kernel for PagedKVCache (#16374)
MasterJH5574 Jan 12, 2024
138cb65
[Unity][BlockBuilder] Restore bb.get() (#16378)
Ubospica Jan 12, 2024
07d8e02
[Unity][nnModule] Dynamic shape support in nn Module (#16284)
CharlieFRuan Jan 12, 2024
5c87bfe
[Unity][Relax][Op] Add Conv3D Operator (#16385)
Jan 14, 2024
e9bea9d
[Relax][Frontend][ONNX]fix onnx frontend parse (#16395)
chengven027 Jan 14, 2024
98d5153
[Unity] PagedKVCache supporting on-the-fly RoPE calculation (#16396)
MasterJH5574 Jan 15, 2024
cf14edd
[Unity][Transform] Memory planning for dynamic-shape func return (#16…
MasterJH5574 Jan 15, 2024
a2a1b53
[Unity] Split DecomposeOpsForTraining into two steps (#15954)
Lunderberg Jan 16, 2024
c8f2e30
Merge branch 'main' into test-merge-unity
Hzfengsy Jan 17, 2024
3 changes: 3 additions & 0 deletions .gitmodules
@@ -28,3 +28,6 @@
[submodule "3rdparty/libflash_attn"]
path = 3rdparty/libflash_attn
url = https://github.com/tlc-pack/libflash_attn
[submodule "3rdparty/flashinfer"]
path = 3rdparty/flashinfer
url = https://github.com/flashinfer-ai/flashinfer.git
2 changes: 1 addition & 1 deletion 3rdparty/cutlass
Submodule cutlass updated 1118 files
2 changes: 1 addition & 1 deletion 3rdparty/cutlass_fpA_intB_gemm
Submodule cutlass_fpA_intB_gemm updated 72 files
+78 −0 .clang-format
+6 −0 CMakeLists.txt
+66 −0 cmake/utils/Utils.cmake
+78 −4 cutlass_extensions/include/cutlass_extensions/arch/mma.h
+23 −12 cutlass_extensions/include/cutlass_extensions/compute_occupancy.h
+14 −10 cutlass_extensions/include/cutlass_extensions/epilogue/thread/fused_activations.h
+0 −390 cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_per_row_per_col_scale.h
+65 −55 cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_tensor_op_int32.h
+67 −48 cutlass_extensions/include/cutlass_extensions/epilogue_helpers.h
+57 −42 cutlass_extensions/include/cutlass_extensions/gemm/kernel/default_fpA_intB_traits.h
+253 −176 cutlass_extensions/include/cutlass_extensions/gemm/kernel/fpA_intB_gemm.h
+497 −0 cutlass_extensions/include/cutlass_extensions/gemm/kernel/fpA_intB_gemm_with_broadcast.h
+86 −0 cutlass_extensions/include/cutlass_extensions/gemm/kernel/gemm_moe_problem_visitor.h
+58 −33 cutlass_extensions/include/cutlass_extensions/gemm/kernel/mixed_gemm_B_layout.h
+534 −0 cutlass_extensions/include/cutlass_extensions/gemm/kernel/moe_cutlass_kernel.h
+358 −0 cutlass_extensions/include/cutlass_extensions/gemm/kernel/moe_problem_visitor.h
+44 −25 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/default_dq_mma.h
+148 −197 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/default_dq_mma_multistage.h
+106 −172 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/default_dq_mma_pipelined.h
+71 −207 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/default_mma.h
+89 −263 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/default_mma_bf16.h
+56 −35 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/dq_mma_base.h
+16 −505 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/dq_mma_multistage.h
+691 −0 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/dq_mma_multistage_finegrained.h
+636 −0 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/dq_mma_multistage_percol.h
+81 −69 cutlass_extensions/include/cutlass_extensions/gemm/threadblock/dq_mma_pipelined.h
+19 −39 cutlass_extensions/include/cutlass_extensions/gemm/warp/default_mma_tensor_op.h
+57 −69 cutlass_extensions/include/cutlass_extensions/gemm/warp/mma_tensorop_compute_B_with_f16.h
+274 −98 cutlass_extensions/include/cutlass_extensions/gemm/warp/mma_tensorop_dequantizer.h
+24 −10 cutlass_extensions/include/cutlass_extensions/gemm_configs.h
+73 −55 cutlass_extensions/include/cutlass_extensions/interleaved_numeric_conversion.h
+16 −11 cutlass_extensions/include/cutlass_extensions/tile_interleaved_layout.h
+248 −0 cutlass_extensions/include/cutlass_extensions/transform/threadblock/fine_grained_scale_zero_iterator.h
+23 −13 cutlass_extensions/include/cutlass_extensions/weight_only_quant_op.h
+19 −2 cutlass_kernels/CMakeLists.txt
+79 −73 cutlass_kernels/cutlass_heuristic.cc
+12 −14 cutlass_kernels/cutlass_heuristic.h
+517 −403 cutlass_kernels/cutlass_preprocessors.cc
+5 −4 cutlass_kernels/cutlass_preprocessors.h
+0 −33 cutlass_kernels/fpA_intB_gemm.cu
+21 −16 cutlass_kernels/fpA_intB_gemm.h
+30 −92 cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm.h
+26 −0 cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_finegrained.cu
+117 −0 cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_impl.h
+26 −0 cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_per_col.cu
+543 −389 cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_template.h
+71 −0 cutlass_kernels/moe_gemm/moe_gemm_kernels.h
+10 −4 cutlass_kernels/moe_gemm/moe_gemm_kernels_fp16_fp16.cu
+27 −0 cutlass_kernels/moe_gemm/moe_gemm_kernels_fp16_uint4.cu
+10 −4 cutlass_kernels/moe_gemm/moe_gemm_kernels_fp16_uint8.cu
+511 −0 cutlass_kernels/moe_gemm/moe_gemm_kernels_template.h
+73 −0 cutlass_kernels/moe_gemm/moe_gemv_kernels.h
+36 −0 tvm_binding/CMakeLists.txt
+74 −0 tvm_binding/tvm_binding.cu
+20 −4 utils/activation_types.h
+17 −12 utils/cuda_utils.h
+30 −22 utils/logger.h
+13 −11 utils/string_utils.h
+84 −0 weightOnlyBatchedGemv/common.h
+91 −0 weightOnlyBatchedGemv/enabled.h
+440 −0 weightOnlyBatchedGemv/kernel.h
+224 −0 weightOnlyBatchedGemv/kernelLauncher.cu
+27 −0 weightOnlyBatchedGemv/kernelLauncher.h
+99 −0 weightOnlyBatchedGemv/utility.h
+98 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs1Int4b.cu
+98 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs1Int8b.cu
+97 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs2Int4b.cu
+97 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs2Int8b.cu
+98 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs3Int4b.cu
+98 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs3Int8b.cu
+97 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs4Int4b.cu
+98 −0 weightOnlyBatchedGemv/weightOnlyBatchedGemvBs4Int8b.cu
1 change: 1 addition & 0 deletions 3rdparty/flashinfer
Submodule flashinfer added at 9cd1f4
5 changes: 4 additions & 1 deletion 3rdparty/picojson/picojson.h
@@ -26,6 +26,10 @@
* POSSIBILITY OF SUCH DAMAGE.
*/
#pragma once
#ifndef PICOJSON_USE_INT64
#define PICOJSON_USE_INT64
#define __STDC_FORMAT_MACROS 1
#endif

#include <algorithm>
#include <cstddef>
@@ -76,7 +80,6 @@ extern "C" {

// experimental support for int64_t (see README.mkdn for detail)
#ifdef PICOJSON_USE_INT64
-#define __STDC_FORMAT_MACROS
#include <errno.h>
#include <inttypes.h>
#endif
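
Note on the picojson.h hunks above: defining `PICOJSON_USE_INT64` by default matters because a JSON number held only as a `double` cannot represent every 64-bit integer exactly. The following is a minimal, self-contained sketch of that effect. It does not use picojson itself, and `RoundTripThroughDouble` and `FormatInt64` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of picojson): converts an integer to double
// and back, which mimics storing a JSON integer in a double-only number type.
inline int64_t RoundTripThroughDouble(int64_t value) {
  return static_cast<int64_t>(static_cast<double>(value));
}

// Hypothetical helper: formats an int64_t using the PRId64 macro from
// <cinttypes>, the kind of macro that __STDC_FORMAT_MACROS exposes on
// older C++ toolchains.
inline std::string FormatInt64(int64_t value) {
  char buf[32];
  int written = std::snprintf(buf, sizeof(buf), "%" PRId64, value);
  // A 64-bit integer needs at most 20 characters plus the terminator.
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buf));
  (void)written;
  return std::string(buf);
}
```

For example, `RoundTripThroughDouble(9007199254740993)` (that is, 2^53 + 1) yields `9007199254740992`, while `FormatInt64` prints the original value unchanged. Older toolchains required `__STDC_FORMAT_MACROS` to be defined before `<inttypes.h>` was first included for `PRId64` to be visible in C++, which is presumably why the define moves to the top of the header.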
59 changes: 55 additions & 4 deletions CMakeLists.txt
@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 3.18)
+cmake_minimum_required(VERSION 3.24)
project(tvm C CXX)

# Utility functions
@@ -98,6 +98,7 @@ tvm_option(USE_MKL "MKL root path when use MKL blas" OFF)
tvm_option(USE_DNNL "Enable DNNL codegen" OFF)
tvm_option(USE_CUDNN "Build with cuDNN" OFF)
tvm_option(USE_CUBLAS "Build with cuBLAS" OFF)
tvm_option(USE_NVTX "Build with NVTX" OFF)
tvm_option(USE_CUTLASS "Build with CUTLASS" OFF)
tvm_option(USE_THRUST "Build with Thrust" OFF)
tvm_option(USE_CURAND "Build with cuRAND" OFF)
@@ -126,6 +127,7 @@ tvm_option(USE_CLML "Build with CLML Codegen support" OFF)
tvm_option(USE_CLML_GRAPH_EXECUTOR "Build with CLML graph runtime" OFF)
tvm_option(USE_UMA "Build with UMA support" OFF)
tvm_option(USE_VERILATOR "Build with Verilator support" OFF)
tvm_option(USE_MSC "Enable Multi-System Compiler" OFF)

# include directories
include_directories(${CMAKE_INCLUDE_PATH})
@@ -303,6 +305,18 @@ tvm_file_glob(GLOB_RECURSE COMPILER_SRCS
src/driver/*.cc
src/support/*.cc
src/script/*.cc
src/relax/ir/*.cc
src/relax/op/*.cc
src/relax/analysis/*.cc
src/relax/transform/*.cc
src/relax/backend/vm/*.cc
src/relax/backend/task_extraction.cc
src/relax/backend/pattern_registry.cc
src/relax/utils.cc
src/relax/distributed/*.cc
src/relax/distributed/transform/*.cc
src/relax/op/distributed/*.cc
src/relax/testing/*.cc
)

tvm_file_glob(GLOB CODEGEN_SRCS
@@ -351,6 +365,7 @@ tvm_file_glob(GLOB RUNTIME_SRCS
src/runtime/memory/*.cc
src/runtime/disco/*.cc
src/runtime/minrpc/*.cc
src/runtime/relax_vm/*.cc
)

if(BUILD_FOR_HEXAGON)
@@ -437,13 +452,15 @@ if(USE_CUDA AND USE_NCCL)
message(STATUS "Build with NCCL...")
find_nccl(${USE_NCCL})
tvm_file_glob(GLOB RUNTIME_NCCL_SRC src/runtime/disco/nccl/*.cc)
set_source_files_properties(src/runtime/disco/nccl/nccl.cc PROPERTIES COMPILE_DEFINITIONS "TVM_NCCL_RCCL_SWITCH=0")
list(APPEND RUNTIME_SRCS ${RUNTIME_NCCL_SRC})
endif()

if(USE_ROCM AND USE_RCCL)
message(STATUS "Build with RCCL...")
find_rccl(${USE_RCCL})
-tvm_file_glob(GLOB RUNTIME_RCCL_SRC src/runtime/disco/rccl/*.cc)
+tvm_file_glob(GLOB RUNTIME_RCCL_SRC src/runtime/disco/nccl/*.cc)
set_source_files_properties(src/runtime/disco/nccl/nccl.cc PROPERTIES COMPILE_DEFINITIONS "TVM_NCCL_RCCL_SWITCH=1")
list(APPEND RUNTIME_SRCS ${RUNTIME_RCCL_SRC})
endif()
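
Note on the NCCL and RCCL blocks above: both builds now glob the same `src/runtime/disco/nccl/*.cc` sources and differ only in the `TVM_NCCL_RCCL_SWITCH` compile definition attached to `nccl.cc`. A minimal sketch of how one translation unit can branch on such a definition follows. The contents of `nccl.cc` are not shown in this diff, so `kCollectiveBackend` and `CollectiveBackendName` are hypothetical names, and the fallback default of `0` is an assumption made only so the sketch compiles on its own.

```cpp
#include <string>

// Assumption: the build system passes -DTVM_NCCL_RCCL_SWITCH=0 (NCCL) or
// -DTVM_NCCL_RCCL_SWITCH=1 (RCCL). Default to 0 so this sketch is standalone.
#ifndef TVM_NCCL_RCCL_SWITCH
#define TVM_NCCL_RCCL_SWITCH 0
#endif

#if TVM_NCCL_RCCL_SWITCH == 0
// NCCL branch: a real source file would include the NCCL header here.
constexpr const char* kCollectiveBackend = "nccl";
#else
// RCCL branch: a real source file would include the RCCL header here.
constexpr const char* kCollectiveBackend = "rccl";
#endif

// Hypothetical accessor reporting which branch was compiled in.
inline std::string CollectiveBackendName() { return kCollectiveBackend; }
```

Attaching the definition with `set_source_files_properties` keeps the switch scoped to that one file instead of applying it to every target source.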

@@ -559,6 +576,8 @@ include(cmake/modules/contrib/TensorRT.cmake)
include(cmake/modules/contrib/VitisAI.cmake)
include(cmake/modules/contrib/Verilator.cmake)
include(cmake/modules/contrib/UMA.cmake)
include(cmake/modules/contrib/MSC.cmake)
include(cmake/modules/contrib/vllm.cmake)
include(cmake/modules/Git.cmake)
include(cmake/modules/LibInfo.cmake)
include(cmake/modules/RustExt.cmake)
@@ -873,14 +892,46 @@ if(USE_CUDA AND USE_CUTLASS)
install(TARGETS fpA_intB_gemm EXPORT ${PROJECT_NAME}Targets DESTINATION lib${LIB_SUFFIX})
target_link_libraries(tvm PRIVATE fpA_intB_gemm)
target_link_libraries(tvm_runtime PRIVATE fpA_intB_gemm)
target_link_libraries(tvm PRIVATE fpA_intB_gemm_tvm)
target_link_libraries(tvm_runtime PRIVATE fpA_intB_gemm_tvm)

install(TARGETS flash_attn EXPORT ${PROJECT_NAME}Targets DESTINATION lib${LIB_SUFFIX})
target_link_libraries(tvm PRIVATE -Wl,--no-as-needed flash_attn)
target_link_libraries(tvm_runtime PRIVATE -Wl,--no-as-needed flash_attn)
endif()

if(USE_CUDA AND USE_NVTX)
set_source_files_properties(src/runtime/nvtx.cc PROPERTIES COMPILE_DEFINITIONS "TVM_NVTX_ENABLED=1")
endif()

if(USE_CUDA AND USE_NCCL)
-target_link_libraries(tvm PRIVATE nccl)
-target_link_libraries(tvm_runtime PRIVATE nccl)
+find_library(LIBRT rt)
+target_link_libraries(tvm PRIVATE nccl ${LIBRT})
+target_link_libraries(tvm_runtime PRIVATE nccl ${LIBRT})
endif()

if(USE_ROCM AND USE_RCCL)
target_link_libraries(tvm PRIVATE rccl)
target_link_libraries(tvm_runtime PRIVATE rccl)
endif()


option(USE_FLASHINFER "Build TVM with FlashInfer" OFF)
if (USE_FLASHINFER STREQUAL "ON")
message(STATUS "Build with FlashInfer")
set(FLASHINFER_TVM_BINDING ON)
set(FLASHINFER_TVM_HOME ${PROJECT_SOURCE_DIR})
set(FLASHINFER_ENABLE_FP8 OFF)
set(FLASHINFER_PREFILL OFF)
set(FLASHINFER_DECODE OFF)
set(FLASHINFER_PAGE OFF)
add_subdirectory(3rdparty/flashinfer)
else ()
message(STATUS "Build without FlashInfer")
endif ()


if (USE_FLASHINFER STREQUAL "ON")
target_link_libraries(tvm PRIVATE flashinfer_tvm)
target_link_libraries(tvm_runtime PRIVATE flashinfer_tvm)
endif ()