Fix doc after moving to unified IR (apache#4835)
zhiics authored and Ubuntu committed Feb 10, 2020
1 parent c1efd5a commit d88436e
Showing 2 changed files with 24 additions and 22 deletions.
docs/dev/relay_pass_infra.rst (10 changes: 6 additions & 4 deletions)
@@ -78,7 +78,7 @@ C++ Backend
 We provide a ``PassInfo`` object to contain the basic information needed by
 a pass. ``name`` is the pass name, ``opt_level`` indicates at which optimization
 level the pass will be enabled, and ``required`` represents the passes that are
-required to execute a certain pass (see `include/tvm/relay/transform.h`_ for
+required to execute a certain pass (see `include/tvm/ir/transform.h`_ for
 more details). For example, during registration of a pass (covered later),
 pass developers can specify the name of the pass, the optimization level it
 will be performed at, and/or the passes that are required.
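
To make these three fields concrete, here is a minimal, self-contained C++ sketch of such a record; the names below are illustrative stand-ins, not the actual declarations in `include/tvm/ir/transform.h`_.

.. code:: c

   #include <string>
   #include <vector>

   // Illustrative stand-in for the PassInfo fields described above.
   struct PassInfoSketch {
     std::string name;                   // pass name, e.g. "FoldConstant"
     int opt_level;                      // enabled at or above this optimization level
     std::vector<std::string> required;  // passes that must run before this one
   };

   // Example: a pass enabled at opt_level >= 2 that requires InferType to run first.
   PassInfoSketch fold_constant_info{"FoldConstant", 2, {"InferType"}};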
@@ -183,7 +183,7 @@ optimization passes, e.g., function-level passes, module-level passes, and
 sequential passes. Each subclass itself could act as a pass manager. For
 instance, they could collect the required passes and execute them or build
 a dependency graph based on the given metadata. The full definition of them
-can be found in `src/relay/pass/pass_manager.cc`_
+can be found in `src/relay/ir/transform.cc`_ and `src/ir/transform.cc`_.
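
As a rough illustration of what "collect the required passes and execute them" can mean, here is a hypothetical, self-contained scheduler; it is not TVM's implementation (see `src/ir/transform.cc`_ for the real one) and omits cycle detection for brevity.

.. code:: c

   #include <functional>
   #include <iostream>
   #include <string>
   #include <unordered_map>
   #include <unordered_set>
   #include <vector>

   struct SimplePass {
     std::string name;
     std::vector<std::string> required;  // names of prerequisite passes
     std::function<void()> run;
   };

   // Run a pass, recursively running its prerequisites first.
   void RunWithDependencies(const std::string& name,
                            std::unordered_map<std::string, SimplePass>& passes,
                            std::unordered_set<std::string>& done) {
     if (done.count(name)) return;  // already executed
     for (const auto& dep : passes.at(name).required) {
       RunWithDependencies(dep, passes, done);
     }
     passes.at(name).run();
     done.insert(name);
   }

   int main() {
     std::unordered_map<std::string, SimplePass> passes;
     passes["InferType"] = {"InferType", {}, [] { std::cout << "InferType\n"; }};
     passes["FoldConstant"] = {"FoldConstant", {"InferType"},
                               [] { std::cout << "FoldConstant\n"; }};
     std::unordered_set<std::string> done;
     RunWithDependencies("FoldConstant", passes, done);  // prints InferType first
   }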

 Module-Level Passes
 ^^^^^^^^^^^^^^^^^^^
@@ -651,9 +651,11 @@ For more pass infra related examples in Python and C++, please refer to

 .. _Relay module: https://docs.tvm.ai/langref/relay_expr.html#module-and-global-functions
 
-.. _include/tvm/relay/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/relay/transform.h
+.. _include/tvm/ir/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/ir/transform.h
 
-.. _src/relay/pass/pass_manager.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/pass_manager.cc
+.. _src/relay/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/ir/transform.cc
 
+.. _src/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/ir/transform.cc
+
 .. _src/relay/pass/fold_constant.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/fold_constant.cc

docs/dev/runtime.rst (36 changes: 18 additions & 18 deletions)
@@ -94,7 +94,7 @@ Here are the common ones:
 - PackedFunc itself
 - Module for compiled modules
 - DLTensor* for tensor object exchange
-- TVM Node to represent any object in IR
+- TVM Object to represent any object in IR
 
 The restriction makes the implementation simple without the need for serialization.
 Despite being minimal, the PackedFunc is sufficient for the use-case of deep learning deployment as
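
To give a feel for why such a small closed set of types is enough, here is a highly simplified, hypothetical sketch of a type-erased calling convention; the real ``TVMValue``/``PackedFunc`` ABI differs in many details.

.. code:: c

   #include <cstdint>
   #include <functional>
   #include <iostream>
   #include <vector>

   // Hypothetical tagged value; a stand-in for TVM's real argument encoding.
   struct AnyValue {
     enum class Tag { kInt, kFloat, kHandle };
     Tag tag;
     union {
       int64_t v_int;
       double v_float;
       void* v_handle;  // e.g. a DLTensor* or a compiled Module handle
     };
   };

   // Every function, regardless of signature, is exposed through this one type.
   using SimplePackedFunc = std::function<AnyValue(const std::vector<AnyValue>&)>;

   int main() {
     SimplePackedFunc addone = [](const std::vector<AnyValue>& args) {
       AnyValue ret;
       ret.tag = AnyValue::Tag::kInt;
       ret.v_int = args[0].v_int + 1;
       return ret;
     };
     AnyValue arg;
     arg.tag = AnyValue::Tag::kInt;
     arg.v_int = 41;
     std::cout << addone({arg}).v_int << "\n";  // prints 42
   }

Anything richer than these tags (say, an arbitrary C++ class) would need a serializer on both sides of the boundary, which is exactly what the restriction avoids.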
@@ -141,7 +141,7 @@ One fun fact about PackedFunc is that we use it for both compiler and deployment

 .. _here: https://github.com/apache/incubator-tvm/tree/master/src/api
 
-To keep the runtime minimum, we isolated the IR Node support from the deployment runtime. The resulting runtime takes around 200K - 600K depending on how many runtime driver modules (e.g., CUDA) get included.
+To keep the runtime minimal, we isolated the IR Object support from the deployment runtime. The resulting runtime takes around 200K to 600K depending on how many runtime driver modules (e.g., CUDA) get included.
 
 The overhead of calling into PackedFunc vs. a normal function is small, as it only saves a few values on the stack.
 So it is OK as long as we don't wrap small functions.
@@ -182,7 +182,7 @@ RPC server on iPhone/android/raspberry pi or even the browser. The cross compila

 This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test cases in Swift/Objective-C from scratch -- we can use RPC to execute on iPhone, copy the result back, and do verification on the host via numpy. We can also do the profiling using the same script.
 
-TVM Node and Compiler Stack
+TVM Object and Compiler Stack
 -----------------------------
 
 As we mentioned earlier, we build the compiler stack API on top of the PackedFunc runtime system.
@@ -192,17 +192,17 @@ However, we don't want to change our API from time to time. Besides that, we als
 - be able to serialize any language object and IRs
 - be able to explore, print, and manipulate the IR objects in the front-end language to do quick prototyping.
 
-We introduced a base class, called `Node`_ to solve this problem.
-All the language object in the compiler stack is a subclass of Node. Each node contains a string type_key that uniquely identifies
-the type of object. We choose string instead of int as type key so new Node class can be added in the decentralized fashion without
+We introduced a base class, called `Object`_, to solve this problem.
+All the language objects in the compiler stack are subclasses of ``Object``. Each object contains a string type_key that uniquely identifies
+the type of object. We choose string instead of int as the type key so that new ``Object`` classes can be added in a decentralized fashion without
 adding the code back to the central repo. To speed up dispatching, we allocate an integer type_index at runtime for each type_key.
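
A toy illustration of this two-level scheme (stable string keys for decentralized registration, integers for fast dispatch) might look like the following; this is hypothetical code, not TVM's actual type registry, and the type_key strings are made-up examples.

.. code:: c

   #include <cstdint>
   #include <iostream>
   #include <string>
   #include <unordered_map>

   // Hand out a small integer for each distinct type_key, on first use.
   uint32_t GetOrAllocTypeIndex(const std::string& type_key) {
     static std::unordered_map<std::string, uint32_t> table;
     auto it = table.find(type_key);
     if (it != table.end()) return it->second;
     uint32_t index = static_cast<uint32_t>(table.size());
     table[type_key] = index;
     return index;
   }

   int main() {
     std::cout << GetOrAllocTypeIndex("relay.Constant") << "\n";  // 0
     std::cout << GetOrAllocTypeIndex("relay.Call") << "\n";      // 1
     std::cout << GetOrAllocTypeIndex("relay.Constant") << "\n";  // 0 again
   }

Because the string key, not the integer, is the stable identifier, independently developed extensions can register new types in any order without coordinating through the central repo.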

-.. _Node: https://github.com/dmlc/HalideIR/blob/master/src/tvm/node/node.h#L61
+.. _Object: https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/object.h

-Since usually one Node object could be referenced in multiple places in the language, we use a shared_ptr to keep
-track of reference. We use NodeRef class to represent a reference to the Node.
-We can roughly view NodeRef class as shared_ptr to the Node container.
-We can also define subclass NodeRef to hold each subtypes of Node. Each Node class needs to define the VisitAttr function.
+Since one ``Object`` can usually be referenced in multiple places in the language, we use a shared_ptr to keep
+track of references. We use the ``ObjectRef`` class to represent a reference to an ``Object``.
+We can roughly view an ``ObjectRef`` as a shared_ptr to the ``Object`` container.
+We can also define subclasses of ``ObjectRef`` to hold each subtype of ``Object``. Each subclass of ``Object`` needs to define the VisitAttrs function.
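
Ignoring TVM's actual intrusive reference counting, the relationship can be pictured with this toy sketch (hypothetical types, for illustration only):

.. code:: c

   #include <memory>
   #include <string>
   #include <utility>

   // Toy stand-ins for the Object/ObjectRef split described above.
   struct ToyObject {
     std::string type_key;
     virtual ~ToyObject() = default;
   };

   class ToyObjectRef {
    public:
     explicit ToyObjectRef(std::shared_ptr<ToyObject> data) : data_(std::move(data)) {}
     ToyObject* operator->() const { return data_.get(); }  // shared, counted access
    private:
     std::shared_ptr<ToyObject> data_;
   };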

 .. code:: c
@@ -216,21 +216,21 @@ We can also define subclass NodeRef to hold each subtypes of Node. Each Node cla
      virtual void Visit(const char* key, std::string* value) = 0;
      virtual void Visit(const char* key, void** value) = 0;
      virtual void Visit(const char* key, Type* value) = 0;
-     virtual void Visit(const char* key, NodeRef* value) = 0;
+     virtual void Visit(const char* key, ObjectRef* value) = 0;
      // ...
    };
-   class Node {
+   class BaseAttrsNode : public Object {
     public:
-     virtual void VisitAttrs(AttrVisitor* visitor) {}
+     virtual void VisitAttrs(AttrVisitor* v) {}
      // ...
    };
 
-Each Node subclass will override this to visit its members. Here is an example implementation of TensorNode.
+Each ``Object`` subclass will override this to visit its members. Here is an example implementation of TensorNode.
 
 .. code:: c
 
-   class TensorNode : public Node {
+   class TensorNode : public Object {
     public:
     /*! \brief The shape of the tensor */
     Array<Expr> shape;
@@ -251,7 +251,7 @@ Each Node subclass will override this to visit its members. Here is an example i
     }
    };
 
-In the above examples, both ``Operation`` and ``Array<Expr>`` are NodeRef.
+In the above examples, both ``Operation`` and ``Array<Expr>`` are ObjectRefs.
 The VisitAttrs function gives us a reflection API to visit each member of the object.
 We can use this function to visit the node and serialize any language object recursively.
 It also allows us to get members of an object easily in the front-end language.
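
For instance, a visitor that simply prints every attribute hints at how serialization and front-end member access can be layered on top of VisitAttrs; the classes below are simplified, hypothetical counterparts of the ones shown above.

.. code:: c

   #include <cstdint>
   #include <iostream>
   #include <string>

   // Simplified interfaces modeled on the snippets above.
   class AttrVisitor {
    public:
     virtual void Visit(const char* key, int64_t* value) = 0;
     virtual void Visit(const char* key, std::string* value) = 0;
     virtual ~AttrVisitor() = default;
   };

   struct ToyTensorNode {
     std::string name{"data"};
     int64_t ndim{2};
     void VisitAttrs(AttrVisitor* v) {
       v->Visit("name", &name);
       v->Visit("ndim", &ndim);
     }
   };

   // One concrete visitor: dump each attribute as key=value.
   class PrintVisitor : public AttrVisitor {
    public:
     void Visit(const char* key, int64_t* value) override {
       std::cout << key << "=" << *value << "\n";
     }
     void Visit(const char* key, std::string* value) override {
       std::cout << key << "=" << *value << "\n";
     }
   };

   int main() {
     ToyTensorNode node;
     PrintVisitor printer;
     node.VisitAttrs(&printer);  // prints name=data then ndim=2
   }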
@@ -264,7 +264,7 @@ For example, in the following code, we accessed the op field of the TensorNode.
    # access the op field of TensorNode
    print(x.op.name)
 
-New Node can be added to C++ without changing the front-end runtime, making it easy to make extensions to the compiler stack.
+New ``Object`` subclasses can be added in C++ without changing the front-end runtime, making it easy to extend the compiler stack.
 Note that this is not the fastest way to expose members to the front-end language, but it might be one of the simplest
 approaches possible. We also find that it fits our purposes, as we mainly use Python for testing and prototyping and still use C++
 to do the heavy lifting.