[Ansor][AutoTVM v2.0] Phase 1: Access Analyzer #6103
It is one of the core parts of the system. Will take a look later tomorrow night.
  /*! \brief Store whether the operation is an output operation */
  OperationMap<bool> is_output;
  /*! \brief Store the topological order of operations */
  Array<te::Operation> ops_topo_order;
What's the relationship between this array and ComputeDAG::ops?
They are the same. I store it in AccessAnalyzer because it is used first here. In its constructor, ComputeDAG copies ops_topo_order as its ops.
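To illustrate what a topological order of operations means here, the following is a minimal sketch in plain C++ using Kahn's algorithm. It is not the code from this PR: the integer op ids, the `producers` adjacency list, and the function name `TopoOrder` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a compute DAG: producers[i] lists the ops that op i reads from.
// Returns the op ids ordered so that every producer appears before its consumers.
std::vector<int> TopoOrder(const std::vector<std::vector<int>>& producers) {
  const std::size_t n = producers.size();
  std::vector<int> pending(n, 0);              // number of unvisited producers per op
  std::vector<std::vector<int>> consumers(n);  // reverse edges: producer -> consumers
  for (std::size_t i = 0; i < n; ++i) {
    for (int p : producers[i]) {
      assert(p >= 0 && static_cast<std::size_t>(p) < n);
      consumers[p].push_back(static_cast<int>(i));
      ++pending[i];
    }
  }
  std::queue<int> ready;  // ops whose producers have all been emitted
  for (std::size_t i = 0; i < n; ++i) {
    if (pending[i] == 0) ready.push(static_cast<int>(i));
  }
  std::vector<int> order;
  while (!ready.empty()) {
    int op = ready.front();
    ready.pop();
    order.push_back(op);
    for (int c : consumers[op]) {
      if (--pending[c] == 0) ready.push(c);
    }
  }
  assert(order.size() == n);  // a shorter result would mean the input had a cycle
  return order;
}
```

Storing the order once and copying it elsewhere, as described above, avoids recomputing this traversal in each component that needs it.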
namespace auto_scheduler {

/*! \brief Static analysis result for a ComputeDAG */
class AccessAnalyzerNode : public Object {
I feel like AccessAnalyzer itself can be a much more principled and extensible component of the system, so shall we put it in a separate file instead?
Agree. Maybe have an analysis.h to expect more analyzers in the future. On the other hand, another direction might be renaming AccessAnalyzer to ComputeDAGAnalyzer, because it provides some APIs for the ops in a compute DAG, such as NeedMultiLevelTiling, IsOutput, etc.
@@ -126,6 +555,7 @@ class FlopEstimator : public ExprFunctor<double(const PrimExpr& n)> {
          fail_ = true;
          break;
        }
        cur_type_code_ = pop->output_dtype(0).code();
Could you elaborate on why we need cur_type_code_? How do we deal with the case where the computation mixes int8 and fp32?
The FLOP information is only used for printing and debugging. It is okay to just give a rough estimation.
@merrymercy I agree that it is totally okay to use rough information (in fact, it is highly non-trivial to get accurate info without backend info). My point is that cur_type_code_ comes from the dtype of the output, but it is entirely possible for a compute DAG to contain computation with different type codes (int8, fp16).
It is not common at all. The common case is either int8->int16/32 or fp16/fp16->fp32
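To make the trade-off in this thread concrete, here is a minimal sketch of a rough FLOP count keyed on the output type code. It assumes the estimator counts only arithmetic nodes whose type code matches the output's; that rule, along with `TypeCode`, `Expr`, `MakeExpr`, and `EstimateFlop`, is hypothetical and is not taken from the PR.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type codes, loosely standing in for integer vs. floating-point dtypes.
enum class TypeCode { kInt, kFloat };

// Hypothetical expression node: a leaf (no args) or an arithmetic op over its args.
struct Expr {
  TypeCode code;
  std::vector<std::shared_ptr<Expr>> args;
};

std::shared_ptr<Expr> MakeExpr(TypeCode code, std::vector<std::shared_ptr<Expr>> args = {}) {
  return std::make_shared<Expr>(Expr{code, std::move(args)});
}

// Assumed rule: an arithmetic node contributes 1 only if its type code matches
// the output's type code; leaves (loads, constants) contribute nothing.
double EstimateFlop(const Expr& e, TypeCode out_code) {
  if (e.args.empty()) return 0.0;
  double total = (e.code == out_code) ? 1.0 : 0.0;
  for (const auto& arg : e.args) total += EstimateFlop(*arg, out_code);
  return total;
}
```

Under this assumed rule, an expression that mixes integer and floating-point arithmetic is only partially counted, which is acceptable if the number is used just for printing and debugging.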
Overall LGTM. Agree with @junrushao1994 on most comments.
      node->read_from[op] = OperationMap<std::vector<std::vector<PrimExpr>>>();
    } else if (auto cop = op.as<te::ComputeOpNode>()) {
      TensorAccessExtractor extractor;
      for (const auto& exp : cop->body) {
Just curious: did we test with a ComputeOp that has multiple bodies?
No. It will be addressed later.
Finally done with this round of review... Thanks for the contribution!
  const Array<PrimExpr>& output_shape = op->output_shape(0);
  const Array<PrimExpr>& producer_shape = producer->output_shape(0);
Does it only work for a te::Operation with a single output? Do we have a fallback solution for operators with multiple outputs, like argmax?
Even with multiple outputs, the shape will be the same
@junrushao1994 @jcf94 @comaniac Most of the comments are addressed. I added more docs and made the naming convention more consistent and meaningful. Please take another look.
LGTM
LGTM. Thanks.
* add access analyzer
* add test cases
* move header files and polish comments
* fix lint
* update
* fix lint
* address comments
* fix lint
For the full upstream plan, see Ansor RFC.
This PR adds the access analyzer. The search policy will use the analysis results to make decisions such as doing multi-level tiling for an op or strictly inlining an op.
It also moves header files to include/tvm/auto_scheduler and polishes some comments.
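As a rough illustration of how a search policy might consume such analysis results, here is a small sketch. Everything in it is hypothetical: the `OpSummary` fields only echo the IsOutput and NeedMultiLevelTiling queries mentioned in the review, and the decision rule (inline strictly inlineable non-output ops first, then tile) is an assumption for the example, not the policy's actual logic.

```cpp
// Hypothetical per-op summary of analysis results.
struct OpSummary {
  bool is_output;                 // op produces a final output of the DAG
  bool needs_multi_level_tiling;  // op has enough data reuse to benefit from tiling
  bool is_strictly_inlineable;    // op can always be inlined into its consumers
};

// Hypothetical scheduling decisions a policy could take for one op.
enum class Decision { kInline, kMultiLevelTiling, kSkip };

// Assumed rule: output ops are never inlined because nothing consumes them;
// otherwise prefer inlining, then multi-level tiling, then leave the op as is.
Decision Decide(const OpSummary& s) {
  if (s.is_strictly_inlineable && !s.is_output) return Decision::kInline;
  if (s.needs_multi_level_tiling) return Decision::kMultiLevelTiling;
  return Decision::kSkip;
}
```

Keeping the analysis results separate from the decision rule in this way is one reason the reviewers suggested moving the analyzer into its own file.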