[TIR][PASS] dtype rewrite for indexing variables #5092

hzfan · 2020-03-18T14:36:16Z

Changes:

Enable indexing with i64 variables, so that large tensors with more than 2^32 elements can be indexed properly.
Narrow i64 indices that trivially fit into i32 down to i32.
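
To make the motivation concrete, here is a small arithmetic sketch in plain Python (not TVM code; the helper names are illustrative only). It shows that the largest row-major flat index of a big 2-D tensor no longer fits in a signed 32-bit integer, which tops out at 2^31 - 1, while the index of a small tensor does.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def max_flat_index(shape):
    """Largest row-major flat index for a tensor of the given shape."""
    total = 1
    for extent in shape:
        total *= extent
    return total - 1


def fits_in_int32(shape):
    """True if every flat index of the tensor fits in a signed 32-bit int."""
    return max_flat_index(shape) <= INT32_MAX


small = (2, 2)
large = (65536, 65536)  # 2**32 elements
print(fits_in_int32(small))  # True
print(fits_in_int32(large))  # False
```

The first case is the kind of index that can safely be narrowed to i32; the second needs i64 throughout.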

Some background:
https://discuss.tvm.ai/t/rfc-support-for-large-tensors/5643

Take the following as an example:

A = te.placeholder((m, n), name='A')
B = te.placeholder((m, n), name='B')
C = te.compute((m, n), lambda *idx: A[idx] + B[idx])

m, n = te.var('m', dtype='int64'), te.var('n', dtype='int64') yields

produce compute {
  for (i0.int64, (int64)0, m.int64) {
    for (i1.int64, (int64)0, n.int64) {
      compute[((i0.int64*stride.int64) + (i1.int64*stride.int64))] = (A[((i0.int64*stride.int64) + (i1.int64*stride.int64))] + B[((i0.int64*stride.int64) + (i1.int64*stride.int64))])
    }
  }
}

m, n = tvm.tir.const(2, dtype="int64"), tvm.tir.const(2, dtype="int64") yields

produce compute {
  for (i0.int32, 0, 2) {
    for (i1.int32, 0, 2) {
      compute[((i0.int32*2) + i1.int32)] = (A[((i0.int32*2) + i1.int32)] + B[((i0.int32*2) + i1.int32)])
    }
  }
}

@yzhliu Could you review?

yzhliu

will do another round of review tomorrow. @tqchen could you also take a look?

src/tir/pass/rewrite_datatype.cc

tqchen · 2020-03-19T16:39:58Z

@junrushao1994 @ZihengJiang @merrymercy @Hzfengsy would be great if you can also help to take a look

src/tir/pass/rewrite_datatype.cc

src/tir/pass/unroll_loop.cc

include/tvm/tir/ir_pass.h

src/tir/pass/narrow_datatype.cc

tqchen · 2020-03-25T15:08:32Z

tests/python/unittest/test_tir_pass_narrow_datatype.py

+    bounds = te.schedule.InferBound(sch)
+    stmt = te.schedule.ScheduleOps(sch, bounds)
+    stmt = tvm.tir.ir_pass.StorageFlatten(stmt, binds, 64, False)
+    stmt = tvm.tir.ir_pass.NarrowDataType(stmt, 32)


It would be great if we can add more test cases:

Please also consider using ir_builder to build loops directly

Have a test case that narrows to i16

Have a loop variable that occurs in multiple expressions, where one expression overflows and another does not

Tests added. test_slice covers the last case you mentioned.

tests/python/unittest/test_tir_pass_narrow_datatype.py

src/tir/pass/narrow_datatype.cc

tqchen · 2020-03-25T15:14:53Z

src/tir/pass/narrow_datatype.cc

+      ConstIntBound bound = analyzer_.const_int_bound(e);
+      int64_t ubound = Downcast<IntImm, PrimExpr>(max_value(DataType::Int(target_bits_)))->value;
+      int64_t lbound = Downcast<IntImm, PrimExpr>(min_value(DataType::Int(target_bits_)))->value;
+      if (e.dtype().bits() <= target_bits_ ||


Shall we rewrite lower bits into higher ones? cc @yzhliu

Do you have an example? We previously reviewed some of the scenarios and did not see a need to do so.

To clarify, this code seems to indicate that we can rewrite lower bits into higher ones, and I think we do not need this behavior.

Yes, I think this pass aligns with what you said: it only narrows and never promotes. This code means that an expression with fewer bits fits into a wider type. For example, consider e = (i.i64 <= j.i64), a bool expression with only 1 bit. Then e fits into i32 and does not hinder narrowing i to i32.

https://github.com/apache/incubator-tvm/pull/5092/files#diff-98ae729cf00e30cff311ed80b4a25df9R129 ensures dtype promotion does not occur.
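
The check discussed in this thread can be sketched in plain Python as follows. This is only an illustration: the quoted hunk is truncated after the first half of the condition, so the bound comparison in the second half is an assumption, and `int_range` and `fits` are hypothetical names rather than TVM functions.

```python
def int_range(bits):
    """Inclusive (min, max) range of a signed integer with `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def fits(expr_bits, bound_min, bound_max, target_bits):
    """Return True if an expression does not block narrowing to target_bits.

    expr_bits: bit width of the expression's own dtype
    bound_min, bound_max: constant integer bound of the expression's value
    """
    lo, hi = int_range(target_bits)
    # Either the dtype is already narrow enough (e.g. a 1-bit bool),
    # or the value range lies inside the target type (assumed second half).
    return expr_bits <= target_bits or (lo <= bound_min and bound_max <= hi)


print(fits(1, 0, 1, 32))       # True: bool result of i.i64 <= j.i64
print(fits(64, 0, 3, 32))      # True: i64 index with a tiny range
print(fits(64, 0, 2**40, 32))  # False: needs more than 32 bits
```

Under this reading, the first branch never widens anything; it only says that an already-narrow expression places no constraint on the variables it contains.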

tqchen · 2020-03-25T15:17:13Z

src/tir/pass/narrow_datatype.cc

+      if (vmap.find(op) == vmap.end()) {
+        vmap[op] = op->dtype.with_bits(bits);
+      } else {
+        vmap[op] = op->dtype.with_bits(std::max(vmap[op].bits(), bits));


It would be great to add more comments here, e.g. that we take the maximum bits over all the Exprs in which a Var occurs.

Thanks for the example. Comments added.

tqchen · 2020-03-25T15:17:21Z

src/tir/pass/narrow_datatype.cc

+
+  void VisitExpr_(const VarNode* op) {
+    if (vextent_.find(op) != vextent_.end()) {
+      int bits = std::min(vextent_[op].bits(), bits_);


Add more comments about the algorithm

Added. Here we ensure that datatype is not promoted.
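
As a rough illustration of the two rules quoted in these threads (clamp with `std::min` so that a var is never promoted, then take `std::max` over every expression in which the var occurs), here is a minimal Python sketch. The function name and the plain dict standing in for `vmap` are hypothetical, and the meaning of the arguments is inferred from the quoted hunks rather than taken from the full source.

```python
def record_var_bits(vmap, var, original_bits, required_bits):
    """Update vmap[var] with the bit width needed by one occurrence of var."""
    # Never promote: a var keeps at most its original width.
    bits = min(original_bits, required_bits)
    # Take the maximum over every expression in which the var occurs.
    vmap[var] = max(vmap.get(var, bits), bits)
    return vmap[var]


vmap = {}
# Loop var "i" (originally i64) appears in two index expressions.
record_var_bits(vmap, "i", 64, 32)  # this expression fits in i32
record_var_bits(vmap, "i", 64, 64)  # this one overflows i32
# An i16 var is never widened, even if the target is 32 bits.
record_var_bits(vmap, "k", 16, 32)
print(vmap)  # {'i': 64, 'k': 16}
```

The var `i` stays at 64 bits because one of its uses overflows i32, which is the scenario requested earlier as a test case.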

tqchen · 2020-03-25T15:18:20Z

src/tir/pass/narrow_datatype.cc

+using arith::IRMutatorWithAnalyzer;
+using arith::ConstIntBound;
+
+class DataTypeVisitor final : public StmtExprVisitor {


Please document:

the input, and what vmap eventually stores

The general algorithm we use (e.g. we propagate the bits backwards into vmap)

Documented.

tqchen · 2020-03-25T15:19:58Z

src/tir/pass/narrow_datatype.cc

+  void VisitExpr(const PrimExpr& e) {
+    if (e.dtype().is_int()) {
+      int bits = max_bits_;
+      ConstIntBound bound = analyzer_.const_int_bound(e);


NOTE: calling const int bound here is not necessarily efficient for deeply nested expressions.

As we visit recursively, we also call const int bound on every sub-expression of e. Perhaps we want to add a const int bound with a memoization option that lets the caller pass a memo (mapping each sub-expression to its const int bound) to the analyzer.

We could add it as a TODO item or do it directly in this PR, but it would be great to have it resolved in the next few weeks. cc @yzhliu

What about a state flag like with_memo in class ConstIntBoundAnalyzer? Once it is set, we first look in the table whenever a new Expr comes in. We can have an unordered_map like unordered_map<PrimExpr*, Entry> in ConstIntBoundAnalyzer to achieve this.

@hzfan I think it is good, but why not always do memoization? @tqchen

A memo can have unintended consequences if the vars can be bound to different context-dependent info, e.g. in if (x<10) { x+1; } else x; the condition x<10 is only effective in the then branch.

I would say perhaps we could have another API that takes an unordered map and asks the analyzer to record every intermediate step into the map.
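
The concern about always-on memoization can be illustrated with a toy interval analyzer in plain Python. This is a sketch under stated assumptions: the tuple-based expression encoding and the `const_int_bound` function below are hypothetical stand-ins, not the TVM `ConstIntBoundAnalyzer` API. The caller optionally passes a map in which every intermediate result is recorded, along the lines of the last suggestion.

```python
def const_int_bound(expr, var_bounds, memo=None):
    """Interval (lo, hi) of a toy expression.

    expr is an int constant, a variable name, or ("add", lhs, rhs).
    If `memo` is given, every intermediate result is recorded in it
    and reused on later calls.
    """
    if memo is not None and expr in memo:
        return memo[expr]
    if isinstance(expr, int):
        result = (expr, expr)
    elif isinstance(expr, str):
        result = var_bounds[expr]
    else:
        _, lhs, rhs = expr
        l_lo, l_hi = const_int_bound(lhs, var_bounds, memo)
        r_lo, r_hi = const_int_bound(rhs, var_bounds, memo)
        result = (l_lo + r_lo, l_hi + r_hi)
    if memo is not None:
        memo[expr] = result
    return result


expr = ("add", "x", 1)
then_bounds = {"x": (0, 9)}     # inside `if (x < 10)`
else_bounds = {"x": (10, 100)}  # inside the else branch

memo = {}
print(const_int_bound(expr, then_bounds, memo))  # (1, 10)
print(const_int_bound(expr, else_bounds, memo))  # (1, 10): stale, memo ignores context
print(const_int_bound(expr, else_bounds))        # (11, 101)
```

Because the memo is keyed by expression only, reusing it across branches returns a bound that is no longer valid, which is why an opt-in map owned by the caller is safer than unconditional memoization.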

tqchen · 2020-03-25T15:21:36Z

Some more comments, mainly with respect to clarity of the code, test coverage and efficiency concerns. Thanks for bringing in the PR. Given that this is critical to a lot of the codebase, let us try to https://docs.tvm.ai/contribute/code_review.html#hold-the-highest-standard :) Let us work to polish it to the best state. Thanks @hzfan for the good work so far.

tqchen · 2020-03-25T15:22:18Z

@spectrometerHBH @FrozenGene @Hzfengsy please also help to take a look

Hzfengsy · 2020-03-25T15:32:40Z

src/tir/pass/unroll_loop.cc

-    if (v1 != nullptr) {
+    // integers that do not fit in int32_t are treated as symbolic,
+    // as it's impossible to unroll such large loops
+    if (v1 != nullptr && v1->value <= std::numeric_limits<int>::max()) {


I wonder if we should use int32_t here rather than int. I'm not sure, but I worry that int may represent different types (int16_t, int32_t or int64_t) on different systems and devices.

My motivation here is to prevent overflow in the next line (which uses int):

value = static_cast<int>(v1->value);

IMO it might be fine to use int here, as it's consistent with other parts of the pass. What do you think?

That's fine
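
The overflow that this guard prevents can be reproduced in Python with `ctypes`, whose integer types wrap silently much like an unchecked narrowing cast. This is only a sketch: it assumes a 32-bit C `int` (modelled here with `c_int32`), and `unroll_extent` is a hypothetical helper, not part of the pass.

```python
import ctypes

INT_MAX = 2**31 - 1  # assumes a 32-bit C int, as on common platforms


def unroll_extent(value):
    """Return the loop extent as a C int, or None to treat it as symbolic."""
    if value <= INT_MAX:
        return ctypes.c_int32(value).value
    return None


print(ctypes.c_int32(2**32 + 4).value)  # 4: unchecked cast silently wraps
print(unroll_extent(4))                 # 4
print(unroll_extent(2**32 + 4))         # None
```

Without the bound check, a loop of 2^32 + 4 iterations would look like a loop of 4 iterations to the unroller.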

yzhliu

@tqchen @Hzfengsy please check again.

yzhliu · 2020-04-01T00:27:35Z

src/tir/pass/narrow_datatype.cc

+  void VisitExpr(const PrimExpr& e) {
+    if (e.dtype().is_int()) {
+      int bits = max_bits_;
+      ConstIntBound bound = analyzer_.const_int_bound(e);


@hzfan I think it is good, but why not always do memoization? @tqchen

tqchen · 2020-04-01T01:02:56Z

@hzfan Good work. We are in the process of migrating to the new transform pass manager API. Can you also add a variant of the pass for IRModule and change the test cases to the new style? We can keep using the old API until we have migrated everything into the new pass style.

reference #5198

tqchen · 2020-04-02T16:05:18Z

Thanks @hzfan for continuing to improve the code and maintaining a high standard. Thanks @yzhliu @Hzfengsy for the helpful reviews. This PR is now merged.

yzhliu reviewed Mar 19, 2020

tqchen added the status: need review label Mar 19, 2020

tqchen self-assigned this Mar 19, 2020

hzfan force-pushed the large-tensor-pass_pr branch from 8c7c303 to 0724b66 Compare March 20, 2020 03:59

yzhliu reviewed Mar 20, 2020

hzfan force-pushed the large-tensor-pass_pr branch from 4e84153 to 178274c Compare March 20, 2020 18:46

tqchen requested changes Mar 20, 2020

hzfan force-pushed the large-tensor-pass_pr branch 4 times, most recently from c668c00 to 3ba40c2 Compare March 21, 2020 17:13

yzhliu approved these changes Mar 25, 2020

tqchen requested changes Mar 25, 2020

tqchen added the status: need update need update based on feedbacks label Mar 25, 2020

tqchen changed the title ~~[PASS] dtype rewrite for indexing variables~~ [TIR][PASS] dtype rewrite for indexing variables Mar 25, 2020

Hzfengsy reviewed Mar 25, 2020

yzhliu reviewed Apr 1, 2020

meta-project-ci added 9 commits April 1, 2020 17:30

Support large tensors

8937617

Clear

17751bd

Add tests

1709764

Change tests

4d420db

Add IntImm and Cast

b0dd985

Fix multi-lanes dtype

10ac54f

Add comments

d2c8237

Restricted to StoreNode and LoadNode

ca7a74d

Only narrow, no promotion

ad30a83

meta-project-ci added 15 commits April 1, 2020 17:30

Fix CodeGenCPU

73a526e

Fix cast

03e9093

Add reduction tests

4cdeea3

Rename

686a6fb

Clear

725f42d

Fix multiple instances of one IterVar

b3e3da5

Remove unnecessary cast

dde45e2

Resolve comments

ec815d3

Resolve comments

130373d

Resolve comments

04c4a10

Use ir_builder to test

0578b85

Add comments

9c5acee

Fix sanity

6cee7b7

Migrate to transform pass

458f0b3

ConstIntBound with memorization

e960449

hzfan force-pushed the large-tensor-pass_pr branch from 56ee7a8 to e960449 Compare April 2, 2020 13:17

Fix sanity

eb5e02a

Hzfengsy approved these changes Apr 2, 2020

tqchen approved these changes Apr 2, 2020

tqchen merged commit 4e5c584 into apache:master Apr 2, 2020

tqchen added status: accepted and removed status: need review status: need update need update based on feedbacks labels Apr 2, 2020

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020

[TIR][PASS] dtype rewrite for indexing variables (apache#5092)

9c18a53

zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020

[TIR][PASS] dtype rewrite for indexing variables (apache#5092)

0464084

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed
