From d7390d16291b4efdf6cce186592f22cfa73c9fc0 Mon Sep 17 00:00:00 2001
From: "Steven S. Lyubomirsky"
Date: Mon, 18 Feb 2019 13:59:20 -0800
Subject: [PATCH] [Relay][Docs] Documentation for Algebraic Data Types (#2575)

---
 docs/langref/index.rst      |  10 +-
 docs/langref/relay_adt.rst  | 515 ++++++++++++++++++++++++++++++++++++
 docs/langref/relay_expr.rst | 117 +++++++-
 docs/langref/relay_type.rst | 158 ++++++++++-
 4 files changed, 784 insertions(+), 16 deletions(-)
 create mode 100644 docs/langref/relay_adt.rst

diff --git a/docs/langref/index.rst b/docs/langref/index.rst
index f89d29d9218d..e586a052e4e4 100644
--- a/docs/langref/index.rst
+++ b/docs/langref/index.rst
@@ -8,9 +8,9 @@ Introduction to Relay
 
 Relay is a functional, differentiable programming language designed to
 be an expressive intermediate representation for machine
-learning systems. Relay supports closures, control flow, and
-recursion, allowing it to directly represent more complex models
-than can computation graph-based IRs.
+learning systems. Relay supports algebraic data types, closures,
+control flow, and recursion, allowing it to directly represent more
+complex models than computation graph-based IRs can.
 Relay also includes a form of dependent typing using *type relations* in
 order to handle shape analysis for operators with complex
 requirements on argument shapes.
@@ -19,13 +19,15 @@ Relay is extensible by design and makes it easy for machine learning
 researchers and practitioners to develop new large-scale program
 transformations and optimizations.
 
-The below pages describe the grammar, type system, and operators in Relay, respectively.
+The below pages describe the grammar, type system,
+algebraic data types, and operators in Relay, respectively.
 
 .. toctree::
    :maxdepth: 2
 
    relay_expr
    relay_type
+   relay_adt
    relay_op
 
 Hybrid Script
diff --git a/docs/langref/relay_adt.rst b/docs/langref/relay_adt.rst
new file mode 100644
index 000000000000..3ca2eab84434
--- /dev/null
+++ b/docs/langref/relay_adt.rst
@@ -0,0 +1,515 @@
+.. _adt-overview:
+
+=============================
+Algebraic Data Types in Relay
+=============================
+
+Algebraic data types (ADTs) are a staple feature of functional programming languages,
+particularly those derived from ML, because they express data structures in a
+manner that is easy to reason about when writing recursive computations.
+Because recursion is intended to be one of the primary mechanisms of control
+flow in Relay, it is important that Relay include ADTs in order to best express
+loops and other control flow structures that must be implemented using recursion.
+
+Defining and Matching on an ADT
+===============================
+
+*Note: ADTs are not presently supported in the text format. The syntax here is speculative, based on ADTs in other languages.*
+
+ADTs can be understood as a generalized version of :code:`enum` and :code:`struct` types
+from C-like languages. Like a C :code:`struct`, an ADT instance is a container for fields
+of specified types, but the type system allows for the same type to encode different possible
+groupings of fields in a systematic manner, similar to C :code:`enum` types, which are
+defined using a finite set of possible values named by the user.
+
+Specifically, an ADT is defined as a named group of constructors, each of which is
+a function that takes values of specified types as arguments and returns an instance
+of the named ADT. An ADT instance simply contains the values of the arguments
+passed to the constructor call used to produce it.
+
+An ADT value is opaque until it is *deconstructed*, allowing the arguments to the
+constructor to be accessed again and used to compute new values. Because
+a particular ADT can have multiple constructors with different signatures,
+it is usually necessary to branch on the different possible constructors,
+resulting in the *match* syntax for ADTs. Hence, ADTs are sometimes called
+"tagged unions" because an ADT instance is tagged by the name of the constructor
+used to produce it and can later be inspected based on the tag.
+
+Because each ADT has a finite set of constructors, it is straightforward to determine
+whether a function processing an ADT instance is handling all possible cases.
+In particular, the type system can ensure that types are properly assigned in all cases when
+deconstructing an ADT instance, in contrast to :code:`union` types in C.
+Hence, it is often easy to reason about ADTs.
+
+*Implementation detail: Relay ADT definitions are global and are stored in the module, similarly to global function definitions. An ADT name is, in fact, a global type variable (just as a global function name is a global variable). The module keeps a mapping of ADT names (global type variables) to the list of constructors for that ADT.*
+
+Below is a simple example of defining an ADT and using it in a function
+via a match expression:
+
+.. code-block:: python
+
+   # Defines an ADT named "Numbers"
+   data Numbers {
+     Empty : () -> Numbers
+     Single : (Tensor[(), int32]) -> Numbers
+     Pair : (Tensor[(), int32], Tensor[(), int32]) -> Numbers
+   }
+   # A Numbers value can be produced using an Empty, Single, or Pair
+   # constructor, each with a signature given above
+
+   def @sum(%n : Numbers[]) -> Tensor[(), int32] {
+     # The match expression branches on the constructor that was
+     # used to produce %n. The variables in each case are bound
+     # if the constructor matches that used for %n
+     match(%n) {
+       case Empty() { 0 }
+       case Single(%x) { %x }
+       case Pair(%x, %y) { %x + %y }
+     }
+   }
+
+   @sum(Empty())    # evaluates to 0
+   @sum(Single(3))  # evaluates to 3
+   @sum(Pair(5, 6)) # evaluates to 11
+
+Note that ADTs are identified by name,
+meaning that two ADTs with structurally identical constructors
+will nevertheless be distinct data types from the point of view of
+the typechecker.
+
+.. code-block:: python
+
+   # structurally identical constructors to Numbers
+   data Numbers2 {
+     Empty2 : () -> Numbers2
+     Single2 : (Tensor[(), int32]) -> Numbers2
+     Pair2 : (Tensor[(), int32], Tensor[(), int32]) -> Numbers2
+   }
+
+   # the below results in a type error because Numbers2
+   # is a distinct type from Numbers
+   # fn() { @sum(Empty2()) }
+
+Type-Checking ADTs and Polymorphism
+===================================
+
+This section will go into more specific detail about the typing of ADTs.
+Most of the complexity involved results from the fact that, as with functions, ADTs
+can be polymorphic and take type parameters.
+
+For example, one of the standard ADTs commonly used in functional
+programming languages is the optional type, defined here:
+
+.. code-block:: python
+
+   # a is a type parameter
+   data Optional {
+     None : () -> Optional
+     Some : (a) -> Optional
+   }
+
+Optional types are commonly used as the return type for any operation
+involving querying into a data structure (returning :code:`Some(v)`
+if a value is found and :code:`None` if it isn't).
+Taking a type parameter in the definition allows the same optional type
+to be used in a wide variety of situations, rather than having to
+define a unique ADT for each different type that could be contained in it.
+
+However, it is important to ensure that option types whose contents
+are of different types can still be distinguished by the type system,
+since it would violate type safety if a function expecting an option
+containing a :code:`Tensor[(), int32]` were instead to receive an option
+containing a :code:`Tensor[(3, 4), float32]`. As this example may
+imply, an ADT instance is thus given a type that contains the
+concrete type arguments for that instance, ensuring the information is
+kept around. The example below illustrates this:
+
+.. code-block:: python
+
+   # the signature for option indicates the type argument
+   def @inc_scalar(%opt : Optional[Tensor[(), int32]]) -> Tensor[(), int32] {
+     match(%opt) {
+       case None() { 1 }
+       case Some(%s) { %s + 1 }
+     }
+   }
+
+   def @main() {
+     let %one : Optional[Tensor[(), int32]] = Some(1);
+     let %big : Optional[Tensor[(10, 10), float32]]
+       = Some(Constant(1, (10, 10), float32));
+     let %two = @inc_scalar(%one);
+     # let %bigger = @inc_scalar(%big); # type system rejects
+     # None does not take an argument so it can always implicitly
+     # be given the correct type arguments
+     let %z = @inc_scalar(None());
+     ()
+   }
+
+The syntax for the annotated type arguments
+(e.g., :code:`Optional[Tensor[(), int32]]`) in the above examples is
+called a "type call," treating the polymorphic ADT definition as a
+type-level function (taking type parameters and returning a type, namely
+the ADT). Any ADT appearing in a type annotation or function signature
+must be annotated with type arguments (a non-polymorphic ADT must be
+in a type call with no arguments).
+
+Thus, we can say in general that if :code:`C` is a constructor for an
+ADT :code:`D`, where :code:`C` takes arguments of types :code:`T1, ..., Tn`
+and :code:`D` takes type parameters :code:`v1, ..., vn`
+(any of which may appear in :code:`T1, ..., Tn`), then :code:`C` has
+the type :code:`fun(T1, ..., Tn) -> D[v1, ..., vn]`.
+This means that constructors are typed like ordinary functions and
+thus appear inside call nodes and can be passed to or returned by
+other functions. In particular, the :code:`Some` example above has
+the signature :code:`fun(a) -> Optional[a]`, while :code:`None`
+has the signature :code:`fun() -> Optional[a]`.
+
+Recursion with ADTs
+===================
+
+ADT definitions are allowed to be recursive; that is, a definition for
+an ADT named :code:`D` can assume the existence of type :code:`D` and
+use it as an argument to constructors. Recursion allows ADTs to
+represent complex structures such as lists or trees; it is the source
+of much of ADTs' power in functional programming, since an appropriately
+designed data structure can make it easy to concisely express a
+computation with a recursive function.
+
+Many commonly used ADTs involve recursion; some of these are given
+in `Common ADT Uses`_. As an example here, we will
+examine the list ADT, ubiquitous in functional languages:
+
+.. code-block:: python
+
+   data List {
+     Nil : () -> List
+     Cons : (a, List[a]) -> List
+   }
+
+(Notice that the recursive reference to :code:`List` is wrapped
+in a type call even in the constructor.)
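+
+For instance (a sketch in the same speculative text format), a
+three-element list of scalar tensors is built by nesting constructor
+calls and terminating with :code:`Nil`:
+
+.. code-block:: python
+
+   Cons(1, Cons(2, Cons(3, Nil())))  # has type List[Tensor[(), int32]]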
+
+The above definition means that a list of values of a particular type
+can be represented by nesting :code:`Cons` constructors until the
+end of the list is reached, which can be indicated with a :code:`Nil`
+(representing an empty list).
+
+Lists represented in this manner can easily be recursively processed.
+For example, the following function sums a list of integers:
+
+.. code-block:: python
+
+   def @list_sum(%l : List[Tensor[(), int32]]) -> Tensor[(), int32] {
+     match(%l) {
+       case Nil() { 0 }
+       # add the head of the list to the sum of the tail
+       case Cons(%h, %t) { %h + @list_sum(%t) }
+     }
+   }
+
+As it happens, many recursive functions on lists like the one just given
+share structures that can be factored out into generic, easily
+usable functions that will be discussed under `Common ADT Uses`_.
+
+.. _adt-pattern:
+
+Pattern Matching in Match Expressions
+=====================================
+
+Match expressions in Relay, as in other functional languages, are capable of
+more versatile pattern matching than simply having one case for each constructor
+of the datatype of the value being deconstructed.
+
+In particular, the patterns in match cases can be built up recursively:
+
+- Constructor patterns match a particular ADT constructor. If a value matches the constructor, each argument to the constructor will be matched against a nested pattern.
+- Wildcard patterns will match any value and will not bind to a variable.
+- Variable patterns will match any value and bind it to a local variable, scoped to the match clause.
+
+In the simple case of :code:`@list_sum` above, the first match case has a :code:`Nil` constructor pattern (with no nested arguments)
+and the second has a :code:`Cons` constructor pattern that uses variable patterns for each of the arguments to :code:`Cons`.
+
+The below example uses a wildcard pattern to ignore one of the arguments to :code:`Cons`:
+
+.. code-block:: python
+
+   def @first(%l : List[a]) -> Optional[a] {
+     match(%l) {
+       case Nil() { None() }
+       case Cons(%h, _) { Some(%h) } # list tail is unused and ignored
+     }
+   }
+
+Here, a constructor pattern is nested inside another constructor pattern to avoid nested match expressions for a list option.
+A top-level wildcard pattern is also used to handle all cases that do not match the first clause:
+
+.. code-block:: python
+
+   def @second_opt(%ll : Optional[List[a]]) -> Optional[a] {
+     match(%ll) {
+       # we only need the second member of the list if there is one
+       case Some(Cons(_, Cons(%s, _))) { Some(%s) }
+       case _ { None() }
+     }
+   }
+
+   # @second_opt(Some(Cons(1, Nil()))) evaluates to None()
+   # @second_opt(Some(Cons(1, Cons(2, Nil())))) evaluates to Some(2)
+   # @second_opt(Some(Nil())) evaluates to None()
+   # @second_opt(None()) evaluates to None()
+
+Note that a match expression checks its patterns in the order the cases are listed: the first clause whose pattern
+matches the input value is the one that is evaluated. Here, a top-level variable pattern binds the whole
+input value:
+
+.. code-block:: python
+
+   def @match_order_beware(%l : List[a]) -> List[a] {
+     match(%l) {
+       case %v { %v }
+       # the above matches everything so neither of these runs
+       case Cons(%h, %t) { Cons(%h, @match_order_beware(%t)) }
+       case Nil() { Nil() }
+     }
+   }
+
+Common ADT Uses
+===============
+
+In functional programming languages, certain ADTs provide useful facilities for writing common programs.
+Parametric polymorphism and higher-order functions make these ADTs easily reusable and allow generic
+functions to manipulate them in common situations. Relay includes a "Prelude" of certain pre-defined ADTs
+and functions for them that correspond to the indispensable ADTs of other languages.
+
+The option type defined under `Type-Checking ADTs and Polymorphism`_ is one such ADT, used
+whenever it makes sense for a function to only return a value under certain circumstances. Having
+the option type allows the type system to keep track of which functions always return a value
+of a certain type versus returning an option of that type, ensuring that any options are always
+explicitly checked (contrast with returning null pointers or throwing
+exceptions as other ways of addressing that problem).
+
+Lists (defined in `Recursion with ADTs`_) can be manipulated by generic functions in a manner similar to
+list comprehensions and certain library functions in Python. Below are very common functions for iterating
+through lists, which are included in Relay's Prelude. (These have all been extensively characterized
+in the functional programming literature, and we do not attempt to reproduce that work in this document.)
+
+.. code-block:: python
+
+   # Map: for [h1, h2, ..., hn] returns [f(h1), f(h2), ..., f(hn)]
+   def @map(%f : fn(a) -> b, %l : List[a]) -> List[b] {
+     match(%l) {
+       case Nil() { Nil() }
+       case Cons(%h, %t) { Cons(%f(%h), @map(%f, %t)) }
+     }
+   }
+
+   # Left fold: for [h1, h2, ..., hn] returns f(...(f(f(z, h1), h2)...), hn)
+   def @foldl(%f : fn(b, a) -> b, %z : b, %l : List[a]) -> b {
+     match(%l) {
+       case Nil() { %z }
+       case Cons(%h, %t) { @foldl(%f, %f(%z, %h), %t) }
+     }
+   }
+
+   # Right fold: for [h1, h2, ..., hn] returns f(h1, f(h2, ...f(hn, z)...))
+   def @foldr(%f : fn(a, b) -> b, %z : b, %l : List[a]) -> b {
+     match(%l) {
+       case Nil() { %z }
+       case Cons(%h, %t) { %f(%h, @foldr(%f, %z, %t)) }
+     }
+   }
+
+Using these iteration constructs, many common operations over lists can be expressed compactly.
+For example, the following map doubles all members of a list:
+
+.. code-block:: python
+
+   # directly written
+   def @double(%l : List[Tensor[(), int32]]) -> List[Tensor[(), int32]] {
+     match(%l) {
+       case Nil() { Nil() }
+       case Cons(%h, %t) { Cons(%h * 2, @double(%t)) }
+     }
+   }
+
+   # map takes care of the recursion
+   @map(fn(%i) { %i * 2 }, %l)
+
+The following right fold concatenates two lists:
+
+.. code-block:: python
+
+   # directly written
+   def @concat(%l1 : List[a], %l2 : List[a]) -> List[a] {
+     match(%l1) {
+       case Nil() { %l2 }
+       case Cons(%h, %t) { Cons(%h, @concat(%t, %l2)) }
+     }
+   }
+
+   # foldr takes care of the recursion
+   @foldr(fn(%h, %z) { Cons(%h, %z) }, %l2, %l1)
+
+The following left fold flattens a list of lists (using concatenation):
+
+.. code-block:: python
+
+   # directly written
+   def @flatten(%ll : List[List[a]]) -> List[a] {
+     match(%ll) {
+       case Cons(%h, %t) { @concat(%h, @flatten(%t)) }
+       case Nil() { Nil() }
+     }
+   }
+
+   # foldl takes care of the recursion
+   @foldl(@concat, Nil(), %ll)
+
+Note that these iteration constructs can be implemented directly in Relay's
+source language and more can easily be defined (including for other data types, such as trees),
+rather than being constructs built into the language (e.g.,
+`"foreach" in MXNet `__).
+ADTs and their extensibility allow for a broad range of iterations and data structures to be expressed
+in Relay and supported by the type system without having to modify the language implementation.
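+
+As one more illustration of how these combinators compose, the
+:code:`@list_sum` function from `Recursion with ADTs`_ can be written as a
+single left fold (a sketch in the same speculative syntax as above):
+
+.. code-block:: python
+
+   # equivalent to the directly written @list_sum
+   @foldl(fn(%acc, %h) { %acc + %h }, 0, %l)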
+
+Implementing Neural Nets Using ADTs
+===================================
+
+In `this 2015 blog post `__, Christopher Olah notes that
+many neural networks can be easily expressed using common functional programming constructs. Relay's ADTs
+allow those examples to be implemented directly in TVM.
+
+First, let us suppose that we have a function corresponding to a trained recurrent neural net (RNN)
+cell, which takes in a past state and an input value and returns a new state and output value. In
+Relay, this would have the following signature:
+
+.. code-block:: python
+
+   @cell : fn(state_type, in_type) -> (state_type, out_type)
+
+We might consider a ReLU cell as a simple concrete example, with a trained version below:
+
+.. code-block:: python
+
+   def @linear(%x, %w, %b) { %w * %x + %b }
+
+   def @relu_cell(%w, # weights
+                  %b, # offsets
+                  %s, # state
+                  %x  # input
+                 ) {
+     let %x2 = @linear(%x, %w.0, %b.0);
+     let %s2 = @linear(%s, %w.1, %b.1);
+     # doesn't change the state
+     (%s, nn.relu(%x2 + %s2))
+   }
+
+   # this is a higher-order function because it returns a closure
+   def @trained_cell(%w, %b) {
+     fn(%s, %x) { @relu_cell(%w, %b, %s, %x) }
+   }
+
+Following Olah's example, we can encode a sequence (list) of inputs with the following left fold:
+
+.. code-block:: python
+
+   def @encode(%cell, %input : List[in_type], %init : state_type) -> state_type {
+     # not using the output
+     @foldl(fn(%state, %in) { %cell(%state, %in).0 }, %init, %input)
+   }
+
+Using an *unfold* iterator (from Haskell's standard library), the same cell could be used to make
+a generator network (which takes a single input and produces a sequence of outputs):
+
+.. code-block:: python
+
+   # included in Relay's Prelude
+   def @unfoldr(%f : fn(b) -> Optional[(a, b)], %z : b) -> List[a] {
+     match(%f(%z)) {
+       case Some(%pair) { Cons(%pair.0, @unfoldr(%f, %pair.1)) }
+       case None() { Nil() }
+     }
+   }
+
+   # we need some way of generating an input to the cell function given only a state
+   def @gen_func(%cell, %state : state_type) -> Optional[(out_type, state_type)] {
+     let %in : Optional[in_type] = @generate_input(%state);
+     match(%in) {
+       case Some(%n) {
+         let %cell_out = %cell(%state, %n);
+         Some((%cell_out.1, %cell_out.0)) # pair of output and state
+       }
+       case None() { None() }
+     }
+   }
+
+   def @generator(%cell, %init : state_type) -> List[out_type] {
+     @unfoldr(fn(%state) { @gen_func(%cell, %state) }, %init)
+   }
+
+An accumulating map (a fold that simultaneously updates an accumulator value and a list
+of outputs) can be used to write a general RNN (with an output for every input):
+
+.. code-block:: python
+
+   def @map_accumr(%f : fn(a, b) -> (a, c), %acc : a, %l : List[b]) -> (a, List[c]) {
+     match(%l) {
+       case Nil() { (%acc, Nil()) }
+       case Cons(%b, %t) {
+         let %update = %f(%acc, %b);
+         let %rest = @map_accumr(%f, %update.0, %t);
+         (%rest.0, Cons(%update.1, %rest.1))
+       }
+     }
+   }
+
+   # can also be implemented as a right fold
+   # (this version is included in Relay's Prelude)
+   def @map_accumr_fold(%f, %acc, %l) {
+     @foldr(fn(%b, %p) {
+       let %f_out = %f(%p.0, %b);
+       (%f_out.0, Cons(%f_out.1, %p.1))
+     },
+     (%acc, Nil()), %l)
+   }
+
+   def @general_rnn(%cell, %init : state_type, %input : List[in_type])
+     -> (state_type, List[out_type]) {
+     @map_accumr(%cell, %init, %input)
+   }
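+
+As a usage sketch (assuming trained weights :code:`%w`, offsets :code:`%b`,
+an initial state :code:`%init`, and an input sequence :code:`%inputs` are
+already bound), the pieces above compose as follows:
+
+.. code-block:: python
+
+   # run the trained ReLU cell over the whole sequence, collecting
+   # the final state and one output per input
+   @general_rnn(@trained_cell(%w, %b), %init, %inputs)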
+
+Olah also gives an example of a bidirectional neural network, in which two sets of
+cells (which may have different weights) process the input in both directions and produce a
+single set of outputs. The following is a Relay implementation of that example:
+
+.. code-block:: python
+
+   # creates a list of tuples from two lists
+   # included in Relay's Prelude
+   def @zip(%l : List[a], %m : List[b]) -> List[(a, b)] {
+     match(%l) {
+       case Nil() { Nil() }
+       case Cons(%a, %t1) {
+         match(%m) {
+           case Nil() { Nil() }
+           case Cons(%b, %t2) { Cons((%a, %b), @zip(%t1, %t2)) }
+         }
+       }
+     }
+   }
+
+   # analogous to map_accumr
+   # included in Relay's Prelude
+   def @map_accuml(%f, %acc, %l) {
+     @foldl(fn(%p, %b) {
+       let %f_out = %f(%p.0, %b);
+       (%f_out.0, Cons(%f_out.1, %p.1))
+     }, (%acc, Nil()), %l)
+   }
+
+   def @bidirectional_rnn
+     (%cell1, %cell2, %state1 : state1_type, %state2 : state2_type, %input : List[in_type])
+     -> List[(out1_type, out2_type)] {
+     @zip(@map_accumr(%cell1, %state1, %input).1, @map_accuml(%cell2, %state2, %input).1)
+   }
diff --git a/docs/langref/relay_expr.rst b/docs/langref/relay_expr.rst
index 6c4ec5ca2624..efa74d314198 100644
--- a/docs/langref/relay_expr.rst
+++ b/docs/langref/relay_expr.rst
@@ -21,13 +21,14 @@ constructs corresponds to a pure computation graph:
 - Tuple `Construction`_ and `Projection`_
 - `Let Bindings`_
 - `Graph Bindings`_
-- Calls to `Operators`_
+- Calls to `Operators`_ and `ADT Constructors`_
 
 Control flow expressions allow the graph topology to change
 based on the value of previously executed expressions. The control
 fragment in Relay includes the following constructs:
 
 - `If-Then-Else`_ Expressions
+- `ADT Matching`_ Expressions
 - Recursive Calls in Functions
 
 From the point of view of a computation graph, a function is a subgraph and a function call inlines the subgraph, substituting its arguments for the free variables in the subgraph with corresponding names.
@@ -282,6 +283,42 @@ handles for generating a call to various pre-registered operators.
 The :ref:`tutorial on adding operators to Relay ` shows how to add
 further operators into the language.
 
+ADT Constructors
+================
+
+Algebraic data types (ADTs) in Relay are described in detail in a
+:ref:`separate overview` and their integration into
+the type system is described :ref:`here`.
+
+In this section, we will simply note that ADT constructors are given
+a function type and should be used inside call nodes like a function
+or operator. An ADT constructor is defined by giving the name of
+the ADT it constructs (a global type variable) and the types of the
+expected arguments for the constructor.
+
+If the ADT definition includes type variables, those type variables
+may appear in the constructor. Constructors cannot include any other
+type variables.
+
+Let us suppose that :code:`D` is an ADT that takes type parameters
+:code:`a` and :code:`b`. If :code:`C1` is a constructor for :code:`D`
+and expects two arguments, one of type :code:`a` and one of type :code:`b`, then
+:code:`C1` has the following type signature:
+:code:`fun(a, b) -> D[a, b]`. (See either the ADT overview
+or the discussion of ADT typing for an explanation of the type call
+in the return type.)
+If another constructor for :code:`D`, :code:`C2`, takes no arguments,
+then it has the following type signature: :code:`fun() -> D[a, b]`;
+the type parameters will always appear in the return type.
+
+Once called, a constructor produces an ADT instance, which is a
+container that stores the values of the arguments to the constructor
+as well as the name ("tag") of the constructor. The tag will be used
+for deconstructing the instances and retrieving the values during
+`ADT Matching`_.
+
+See :py:class:`~tvm.relay.adt.Constructor` for the definition and documentation.
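+
+For illustration, here is a sketch (in the speculative text format used in
+the ADT overview) of how the hypothetical constructors :code:`C1` and
+:code:`C2` above would appear in call positions:
+
+.. code-block:: python
+
+   data D {
+     C1 : (a, b) -> D
+     C2 : () -> D
+   }
+
+   # constructors are called like functions; each resulting instance is
+   # typed by a type call recording the concrete type arguments
+   let %d1 : D[Tensor[(), int32], Tensor[(), int32]] = C1(1, 2);
+   let %d2 : D[Tensor[(), int32], Tensor[(), int32]] = C2();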
+ Call ==== @@ -343,6 +380,8 @@ the type annotation: See :py:class:`~tvm.relay.expr.Call` for its definition and documentation. +.. _module-description: + Module and Global Functions =========================== @@ -542,6 +581,82 @@ condition value evaluates to :code:`False`. See :py:class:`~tvm.relay.expr.If` for its definition and documentation. +ADT Matching +============ + +Instances of algebraic data types (ADTs), as discussed in the +:ref:`ADT overview`, are containers that store the +arguments passed to the constructor used to create them, tagged by +the constructor name. + +Match expressions in Relay allow for retrieving the values stored in +an ADT instance ("deconstructing" it) based on their constructor tag. +A match expression behaves similarly to a C-style :code:`switch` statement, +branching on the different possible constructors for the type of the +value being deconstructed. As the ADT overview details, match +expressions are capable of more general pattern-matching than simply +splitting by constructors: any ADT instance nested inside an instance +(e.g., a list of lists) can be deconstructed at the same time as +the outer instance, while the different fields of the instance can be +bound to variables. (See :ref:`this section` for a detailed +description of ADT pattern-matching.) + +A match expression is defined using the +input value (an expression) and a list of clauses, each of which +consists of a pattern and an expression. When executed, the *first* +clause whose pattern matches the structure of the queried value is +executed; the clause expression is evaluated and returned. + +For example, suppose we have an ADT for natural numbers: + +.. code-block:: python + + data Nat { + Z : () -> Nat # zero + S : (Nat) -> Nat # successor (+1) to a nat + } + +Then the following function subtracts one from a passed nat: + +.. code-block:: python + + fn(%v: Nat[]) -> Nat[] { + match(%v) { + case Z() { Z() } + case S(%n) { %n } # the variable %n is bound in the scope of this clause + } + } + +The following function subtracts two from its argument if it is at least +two and returns the argument otherwise, using a nested constructor pattern: + +.. code-block:: python + + fn(%v : Nat[]) -> Nat[] { + match(%v) { + case S(S(%n)) { %n } + # wildcard pattern: matches all cases not matched already + case _ { %v } + } + } + +As aforementioned, the ordering of match clauses is relevant. +In the below example, the first clause will always match so +those below it can never run: + +.. code-block:: python + + fn(%v : Nat[]) -> Nat[] { + match(%v) { + case _ { %v } + case S(S(%n)) { S(%n) } + case S(%n) { %n } + case Z() { S(Z()) } + } + } + +See :py:class:`~tvm.relay.adt.Match` for its definition and documentation. + TempExprs ========= diff --git a/docs/langref/relay_type.rst b/docs/langref/relay_type.rst index 8d725b7d97f5..45217b234e76 100644 --- a/docs/langref/relay_type.rst +++ b/docs/langref/relay_type.rst @@ -10,6 +10,9 @@ be fully typed while requiring just a few explicit type annotations. Static types are useful when performing compiler optimizations because they communicate properties about the data a program manipulates, such as runtime shape, data layout, and storage, without needing to run the program. +Relay's `Algebraic Data Types`_ allow for easily and flexibly composing +types in order to build data structures that can be +reasoned about inductively and used to write recursive functions. Relay's type system features a form of *dependent typing* for shapes. 
That is, its type system keeps track of the shapes of tensors in a Relay program. Treating tensor shapes as types allows Relay to perform more powerful reasoning at compile time; @@ -33,20 +36,17 @@ type relations are satisfied at call sites). Type relations offer a flexible and relatively simple way of making the power of dependent typing available in Relay without greatly increasing the complexity of its type system. -Types -===== - Below we detail the language of types in Relay and how they are assigned to Relay expressions. Type -~~~~ +==== The base type for all Relay types. All Relay types are sub-classes of this base type. See :py:class:`~tvm.relay.ty.Type` for its definition and documentation. Tensor Type -~~~~~~~~~~~ +=========== A concrete tensor type in Relay. @@ -70,7 +70,7 @@ For example, here is a simple concrete tensor type corresponding to a 10-by-10 t See :py:class:`~tvm.relay.ty.TensorType` for its definition and documentation. Tuple Type -~~~~~~~~~~ +========== A type of a tuple in Relay. @@ -94,7 +94,7 @@ See :py:class:`~tvm.relay.ty.TupleType` for its definition and documentation. .. _type-parameter: Type Parameter -~~~~~~~~~~~~~~ +============== Type parameters represent placeholder types used for polymorphism in functions. Type parameters are specified according to *kind*, corresponding to the types @@ -127,7 +127,7 @@ be substituted with :code:`(10, 10)` at the call site below: See :py:class:`~tvm.relay.ty.TypeVar` for its definition and documentation. Type Constraint -~~~~~~~~~~~~~~~ +=============== This is an abstract class representing a type constraint, to be elaborated upon in further releases. Currently, type relations are the only @@ -136,7 +136,7 @@ type constraints provided; they are discussed below. See :py:class:`~tvm.relay.ty.TypeConstraint` for its definition and documentation. Function Type -~~~~~~~~~~~~~ +============= A function type in Relay, see `tvm/relay/type.h` for more details. @@ -158,7 +158,7 @@ See :py:class:`~tvm.relay.ty.FuncType` for its definition and documentation. .. _type-relation: Type Relation -~~~~~~~~~~~~~ +============= A type relation is the most complex type system feature in Relay. It allows users to extend type inference with new rules. @@ -226,7 +226,7 @@ in C++. See :py:class:`~tvm.relay.ty.TypeRelation` for its definition and documentation. Incomplete Type -~~~~~~~~~~~~~~~ +=============== An incomplete type is a type or portion of a type that is not yet known. This is only used during type inference. Any omitted type annotation is @@ -239,3 +239,139 @@ parameters: Type parameters must be bound to a function and are replaced with co at call sites, whereas incomplete types may appear anywhere in the program and are filled in during type inference. See :py:class:`~tvm.relay.ty.IncompleteType` for its definition and documentation. + +.. _adt-typing: + +Algebraic Data Types +==================== + +*Note: ADTs are not currently supported in the text format.* + +Algebraic data types (ADTs) are described in more detail in +:ref:`their overview `; this section describes +their implementation in the type system. + +An ADT is defined by a collection of named constructors, +each of which takes arguments of certain types. +An instance of an ADT is a container that stores the values +of the constructor arguments used to produce it as well as the +name of the constructor; the values can be retrieved by +deconstructing the instance by matching based on its constructor. 
+Hence, ADTs are sometimes called "tagged unions": like a C-style
+union, the contents of an instance for a given ADT may have
+different types in certain cases, but the constructor serves as a
+tag to indicate how to interpret the contents.
+
+From the type system's perspective, it is most pertinent that
+ADTs can take type parameters (constructor arguments can be
+type parameters, though ADT instances with different type
+parameters must be treated as different types) and be
+recursive (a constructor for an ADT can take an instance of
+that ADT, thus an ADT like a tree or list can be inductively
+built up). The representation of ADTs in the type system must
+be able to accommodate these facts, as the below sections will detail.
+
+Global Type Variable
+~~~~~~~~~~~~~~~~~~~~
+
+To represent ADTs compactly and easily allow for recursive ADT definitions,
+an ADT definition is given a handle in the form of a global type variable
+that uniquely identifies it. Each ADT definition is given a fresh global
+type variable as a handle, so pointer equality can be used to distinguish
+different ADT names.
+
+For the purposes of Relay's type system, ADTs are differentiated by name;
+that means that if two ADTs have different handles, they will be
+considered different types even if all their constructors are
+structurally identical.
+
+Recursion in an ADT definition thus works just like recursion for a
+global function: the constructor can simply reference the ADT handle
+(global type variable) in its definition.
+
+See :py:class:`~tvm.relay.ty.GlobalTypeVar` for its definition and documentation.
+
+Definitions (Type Data)
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Besides a name, an ADT needs to store the constructors that are used
+to define it and any type parameters used within them. These are
+stored in the module, :ref:`analogous to global function definitions`.
+
+While type-checking uses of ADTs, the type system sometimes must
+index into the module using the ADT name to look up information
+about constructors. For example, if a constructor is being pattern-matched
+in a match expression clause, the type-checker must check the constructor's
+signature to ensure that any bound variables are being assigned the
+correct types.
+
+See :py:class:`~tvm.relay.adt.TypeData` for its definition and documentation.
+
+Type Call
+~~~~~~~~~
+
+Because an ADT definition can take type parameters, Relay's type
+system considers an ADT definition to be a *type-level function*
+(in that the definition takes type parameters and returns the
+type of an ADT instance with those type parameters). Thus, any
+instance of an ADT is typed using a type call, which explicitly
+lists the type parameters given to the ADT definition.
+
+It is important to list the type parameters for an ADT instance,
+as two ADT instances built using different constructors but the
+same type parameters are of the *same type*, while two ADT instances
+with different type parameters should not be considered the same
+type (e.g., a list of integers should not have the same type as
+a list of pairs of floating point tensors).
+
+The "function" in the type call is the ADT handle and there must
+be one argument for each type parameter in the ADT definition. (An
+ADT definition with no type parameters means that any instance will
+have no type arguments passed to the type call.)
+
+See :py:class:`~tvm.relay.ty.TypeCall` for its definition and documentation.
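+
+Concretely, type calls appear in type annotations roughly as follows in the
+speculative text format (:code:`Nat` is the non-polymorphic natural-number
+ADT used in the expression documentation):
+
+.. code-block:: python
+
+   %n : Nat[]                              # no type parameters: an empty type call
+   %l : List[Tensor[(), int32]]            # one type argument
+   %o : Optional[List[Tensor[(), int32]]]  # type calls may nest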
+
+Example: List ADT
+~~~~~~~~~~~~~~~~~
+
+This subsection uses the simple list ADT (included as a default
+ADT in Relay) to illustrate the constructs described in the previous
+sections. Its definition is as follows:
+
+.. code-block:: python
+
+   data List {
+     Nil : () -> List
+     Cons : (a, List[a]) -> List
+   }
+
+Thus, the global type variable :code:`List` is the handle for the ADT.
+The type data for the list ADT in the module notes that
+:code:`List` takes one type parameter and has two constructors,
+:code:`Nil` (with signature :code:`fn() -> List[a]`)
+and :code:`Cons` (with signature :code:`fn(a, List[a]) -> List[a]`).
+The recursive reference to :code:`List` in the :code:`Cons`
+constructor is accomplished by using the global type
+variable :code:`List` in the constructor definition.
+
+Below are two instances of lists with their types given using type calls:
+
+.. code-block:: python
+
+   Cons(1, Cons(2, Nil()))  # List[Tensor[(), int32]]
+   Cons((1, 1), Cons((2, 2), Nil()))  # List[(Tensor[(), int32], Tensor[(), int32])]
+
+Note that :code:`Nil()` can be an instance of any list because it
+does not take any arguments that use a type parameter. (Nevertheless,
+for any *particular* instance of :code:`Nil()`, the type parameter must
+be specified.)
+
+Here are two lists that are rejected by the type system because
+the type parameters do not match:
+
+.. code-block:: python
+
+   # attempting to put an integer on a list of int * int tuples
+   Cons(1, Cons((1, 1), Nil()))
+   # attempting to put a list of ints on a list of lists of int * int tuples
+   Cons(Cons(1, Cons(2, Nil())), Cons(Cons((1, 1), Cons((2, 2), Nil())), Nil()))
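+
+As a final sketch of the note about :code:`Nil()` above: on its own,
+:code:`Nil()` is accepted wherever the surrounding context (such as an
+annotation) fixes the type argument:
+
+.. code-block:: python
+
+   # the type argument for Nil() is determined by the annotation
+   let %e : List[Tensor[(), int32]] = Nil();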