From 1bc5f4603013392debd3c9b5b394b471e40a32e9 Mon Sep 17 00:00:00 2001 From: "Steven S. Lyubomirsky" Date: Mon, 18 Feb 2019 13:59:20 -0800 Subject: [PATCH] [Relay][Docs] Documentation for Algebraic Data Types (#2575) --- docs/langref/index.rst | 10 +- docs/langref/relay_adt.rst | 515 ++++++++++++++++++++++++++++++++++++ docs/langref/relay_expr.rst | 117 +++++++- docs/langref/relay_type.rst | 158 ++++++++++- 4 files changed, 784 insertions(+), 16 deletions(-) create mode 100644 docs/langref/relay_adt.rst diff --git a/docs/langref/index.rst b/docs/langref/index.rst index f89d29d9218d1..e586a052e4e4b 100644 --- a/docs/langref/index.rst +++ b/docs/langref/index.rst @@ -8,9 +8,9 @@ Introduction to Relay Relay is a functional, differentiable programming language designed to be an expressive intermediate representation for machine -learning systems. Relay supports closures, control flow, and -recursion, allowing it to directly represent more complex models -than can computation graph-based IRs. +learning systems. Relay supports algebraic data types, closures, +control flow, and recursion, allowing it to directly represent more +complex models than computation graph-based IRs can. Relay also includes a form of dependent typing using *type relations* in order to handle shape analysis for operators with complex requirements on argument shapes. @@ -19,13 +19,15 @@ Relay is extensible by design and makes it easy for machine learning researchers and practitioners to develop new large-scale program transformations and optimizations. -The below pages describe the grammar, type system, and operators in Relay, respectively. +The below pages describe the grammar, type system, +algebraic data types, and operators in Relay, respectively. .. toctree:: :maxdepth: 2 relay_expr relay_type + relay_adt relay_op Hybrid Script diff --git a/docs/langref/relay_adt.rst b/docs/langref/relay_adt.rst new file mode 100644 index 0000000000000..3ca2eab844348 --- /dev/null +++ b/docs/langref/relay_adt.rst @@ -0,0 +1,515 @@ +.. _adt-overview: + +============================= +Algebraic Data Types in Relay +============================= + +Algebraic data types (ADTs) are a staple feature of functional programming languages, +particularly those derived from ML, because they express data structures in a +manner that is easy to reason about when writing recursive computations. +Because recursion is intended to be one of the primary mechanisms of control +flow in Relay, it is important that Relay include ADTs in order to best express +loops and other control flow structures that must be implemented using recursion. + +Defining and Matching on an ADT +=============================== + +*Note: ADTs are not presently supported in the text format. The syntax here is speculative, based on ADTs in other languages.* + +ADTs can be understood as a generalized version of :code:`enum` and :code:`struct` types +from C-like languages. Like a C :code:`struct:`, an ADT instance is a container for fields +of specified types, but the type system allows for the same type to encode different possible +groupings of fields in a systematic manner, similar to C :code:`enum` types, which are +defined using a finite set of possible values named by the user. + +Specifically, an ADT is defined as a named group of constructors, each of which is +a function that takes values of specified types as arguments and returns an instance +of the named ADT. An ADT instance simply contains the values of the arguments +passed to the constructor call used to produce it. + +An ADT value is opaque until it is *deconstructed*, allowing the arguments to the +constructor to be accessed again and used to compute new values. Because +a particular ADT can have multiple constructors with different signatures, +it is usually necessary to branch on the different possible constructors, +resulting in the *match* syntax for ADTs. Hence, ADTs are sometimes called +"tagged unions" because an ADT instance is tagged by the name of the constructor +used to produce it and can later be inspected based on the tag. + +Because each ADT has a finite set of constructors, it is straightforward to determine +whether a function processing an ADT instance is handling all possible cases. +In particular, the type system can ensure that types are properly assigned in all cases when +deconstructing an ADT instance, in contrast to :code:`union` types in C. +Hence, it is often easy to reason about ADTs. + +*Implementation detail: Relay ADT definitions are global and are stored in the module, similarly to global function definitions. An ADT name is, in fact, a global type variable (just as a global function name is a global variable). The module keeps a mapping of ADT names (global type variables) to the list of constructors for that ADT.* + +Below is a simple example of defining an ADT and using it in a function +via a match expression: + +.. code-block:: python + + # Defines an ADT named "Numbers" + data Numbers { + Empty : () -> Numbers + Single : (Tensor[(), int32]) -> Numbers + Pair : (Tensor[(), int32], Tensor[(), int32]) -> Numbers + } + # A Numbers value can be produced using an Empty, Single, or Pair + # constructor, each with a signature given above + + def @sum(%n : Numbers[]) -> Tensor[(), int32] { + # The match expression branches on the constructor that was + # used to produce %n. The variables in each case are bound + # if the constructor matches that used for %n + match(%n) { + case Empty() { 0 } + case Single(x) { x } + case Pair(x, y) { x + y } + } + } + + @sum(Empty()) # evaluates to 0 + @sum(Single(3)) # evaluates to 3 + @sum(Pair(5, 6)) # evaluates to 11 + +Note that ADTs are identified by name, +meaning that two ADTs with structurally identical constructors +will nevertheless be distinct data types from the point of view of +the typechecker. + +.. code-block:: python + + # structurally identical constructors to Numbers + data Numbers2 { + Empty2 : () -> Numbers2 + Single2 : (Tensor[(), int32]) -> Numbers2 + Pair2 : (Tensor[(), int32], Tensor[(), int32]) -> Numbers2 + } + + # the below results in a type error because Numbers2 + # is a distinct type from Numbers + # fn() { @sum(Empty2()) } + +Type-Checking ADTs and Polymorphism +=================================== + +This section will go into more specific detail about the typing of ADTs. +Most of the complexity involved results from the fact that, as with functions, ADTs +can be polymorphic and take type parameters. + +For example, one of the standard ADTs commonly used in functional +programming languages is the optional type, defined here: + +.. code-block:: python + + # a is a type parameter + data Optional { + None : () -> Optional + Some : (a) -> Optional + } + +Optional types are commonly used as the return type for any operation +involving querying into a data structure (returning :code:`Some(v)` +if a value is found and :code:`None` if it isn't). +Taking a type parameter in the definition allows the same optional type +to be used in a wide variety of situations, rather than having to +define a unique ADT for each different type that could be contained in it. + +However, it is important to ensure that option types whose contents +are of different types can still be distinguished by the type system, +since it would violate type safety if a function expecting an option +containing a :code:`Tensor[(), int32]` instead receives an option +containing a :code:`Tensor[(3, 4), float32]`. As this example may +imply, an ADT instance is thus given a type that contains the +concrete type arguments for that instance, ensuring the information is +kept around. Let the below example illustrate: + +.. code-block:: python + + # the signature for option indicates the type argument + def @inc_scalar(%opt : Optional[Tensor[(), int32]]) -> Tensor[(), int32] { + match(%opt) { + case None() { 1 } + case Some(%s) { %s + 1 } + } + } + + def @main() { + let %one : Optional[Tensor[(), int32]] = Some(1); + let %big : Optional[Tensor[(10, 10), float32]] + = Some(Constant(1, (10, 10), float32)); + let %two = inc_scalar(%one); + # let %bigger = inc_scalar(%big); # type system rejects + # None does not take an argument so it can always implicitly + # be given the correct type arguments + let %z = inc_scalar(None()); + () + } + +The syntax for the annotated type arguments +(e.g., :code:`Optional[Tensor[(), int32]]`) in the above examples is +called a "type call," treating the polymorphic ADT definition as a +type-level function (taking type params and returning a type, namely +the ADT). Any ADT appearing in a type annotation or function signature +must be annotated with type arguments (a non-polymorphic ADT must be +in a type call with no arguments). + +Thus, we can say in general that if constructor :code:`C` that +takes arguments of types :code:`T1, ..., Tn` is a constructor +for an ADT :code:`D` that takes type parameters :code:`v1, ..., vn` +(where :code:`T1, ..., Tn` may contain any of the :code:`v1, ..., vn`), +then :code:`C` has +the type :code:`fun(T1, ..., Tn) -> D[v1, ..., vn]`. +This means that constructors are typed like ordinary functions and +thus appear inside call nodes and can be passed to or returned by +other functions. In particular, the :code:`Some` example above has +the signature :code:`fun(a) -> Optional[a]`, while :code:`None` +has the signature :code:`fun() -> Optional[a]`. + +Recursion with ADTs +=================== + +ADT definitions are allowed to be recursive, that is, a definition for +an ADT named :code:`D` can assume the existence of type :code:`D` and +use it as an argument to constructors. Recursion allows ADTs to +represent complex structures such as lists or trees; it is the source +of much of ADTs' power in functional programming, since an appropriately +designed data structure could make it easy to concisely express a +computation with a recursive function. + +Many commonly used ADTs involve recursion; some of these are given +in `Common ADT Uses`_. As an example here, we will +examine the list ADT, ubiquitous in functional languages: + +.. code-block:: python + + data List { + Nil : () -> List + Cons : (a, List[a]) -> List + } + +(Notice that the recursive reference to :code:`List` is wrapped +in a type call even in the constructor.) + +The above definition means that a list of values of a particular type +can be represented by nesting :code:`Cons` constructors until the +end of the list is reached, which can be indicated with a :code:`Nil` +(representing an empty list). + +Lists represented in this manner can easily be recursively processed. +For example, the following function sums a list of integers: + +.. code-block:: python + + def @list_sum(%l : List[Tensor[(), int32]]) -> Tensor[(), int32] { + match(%l) { + case Nil() { 0 } + # add the head of the list to the sum of the tail + case Cons(%h, %t) { %h + @list_sum(%t) } + } + } + +As it happens, many recursive functions on lists like the one just given +share structures that can be factored out into generic, easily +usable functions that will be discussed under `Common ADT Uses`_. + +.. _adt-pattern: + +Pattern Matching in Match Expressions +===================================== + +Match expressions in Relay, as in other functional languages, are capable of +more versatile pattern matching than simply having one case for each constructor +for the datatype of the value being deconstructed. + +In particular, the patterns in match cases can be built up recursively: + +- Constructor patterns match for a particular ADT constructor. If a value matches the constructor, each argument to the constructor will be matched against a nested pattern. +- Wildcard patterns will match any value and will not bind to a variable. +- Variable patterns will match any value and bind it to a local variable, scoped to the match clause. + +In the simple case of :code:`@list_sum` above, the first match case has a :code:`Nil` constructor pattern (with no nested arguments) +and the second has a :code:`Cons` constructor pattern that uses variable patterns for each of the arguments to :code:`Cons`. + +The below example uses a wildcard pattern to ignore one of the arguments to :code:`Cons`: + +.. code-block:: python + + def @first(%l : List[a]) -> Optional[a] { + match(%l) { + case Nil() { None() } + case Cons(%h, _) { Some(%h) } # list tail is unused and ignored + } + } + +Here, a constructor pattern is nested inside another constructor pattern to avoid nested match expressions for a list option. +A top-level wildcard pattern is also used to handle all cases that do not match the first clause: + +.. code-block:: python + + def @second_opt(%ll : Optional[List[a]]) -> Optional[a] { + match(%ll) { + # we only need the second member of the list if there is one + case Some(Cons(_, Cons(%s, _))) { Some(%s) } + case _ { None() } + } + } + + # @second_opt(Some(Cons(1, Nil()))) evaluates to None() + # @second_opt(Some(Cons(1, Cons(2, Nil())))) evaluates to Some(2) + # @second_opt(Some(Nil())) evaluates to None() + # @second_opt(None()) evaluates to None() + +Note that a match expression checks its patterns in the order the cases are listed: the first clause whose pattern +that matches the input value is the one that is evaluated. Here, a top-level variable pattern binds the whole +input value: + +.. code-block:: python + + def @match_order_beware(%l : List[a]) -> List[a] { + match(%l) { + case %v { %v } + # the above matches everything so neither of these runs + case Cons(%h, %t) { Cons(%h, @match_order_beware(%t)) } + case Nil() { Nil() } + } + } + +Common ADT Uses +=============== + +In functional programming languages, certain ADTs provide useful facilities for writing common programs. +Parametric polymorphism and higher-order functions allow these ADTs to be easily reuseable and for generic +functions to manipulate them in common situations. Relay includes a "Prelude" of certain pre-defined ADTs +and functions for them that correspond to the indispensable ADTs of other languages. + +The option type defined under `Type-Checking ADTs and Polymorphism`_ is one such ADT, used +whenever it can make sense for a function to only return a value under certain circumstances. Having +the option type allows for the type system to keep track of which functions always return a value +of a certain type versus returning an option of that type, ensuring that any options are always +explicitly checked (contrast with returning null pointers or throwing +exceptions as other ways to addressing that problem). + +Lists (defined in `Recursion with ADTs`_) can be manipulated by generic functions in a manner similar to +list comprehensions and certain library functions in Python. Below are very common functions for iterating +through lists, which are included in Relay's Prelude. (These have all been extensively characterized +in the functional programming literature, and we do not attempt to reproduce that work in this document.) + +.. code-block:: python + + # Map: for [h1, h2, ..., hn] returns [f(h1), f(h2), ..., f(hn)] + def @map(%f : fn(a) -> b, %l : List[a]) -> List[b] { + match(%l) { + case Nil() { Nil() } + case Cons(%h, %t) { Cons(%f(%h), @map(%f, %t)) } + } + } + + # Left fold: for [h1, h2, ..., hn] returns f(...(f(f(z, h1), h2)...), hn) + def @foldl(%f : fn(b, a) -> b, %z : b, %l : List[a]) -> b { + match(%l) { + case Nil() { %z } + case Cons(%h, %t) { @foldl(%f, %f(%z, %h), %t) } + } + } + + # Right fold: for [h1, h2, ..., hn] returns f(h1, f(h2, f(..., (f(hn, z)...) + def @foldr(%f : fn(a, b) -> b, %z : b, %l : List[a] -> b { + match(%l) { + case Nil() { %z } + case Cons(%h, %t) { %f(%h, @foldr(%f, %z, %t)) } + } + } + +Using these iteration constructs, many common operations over lists can be expressed compactly. +For example, the following map doubles all members of a list: + +.. code-block:: python + + # directly written + def @double(%l : List[Tensor[(), int32]]) -> List[Tensor[(), int32]] { + match(%l) { + case Nil() { Nil() } + case Cons(%h, %t) { Cons(%h * 2, @double(%t)) } + } + } + + # map takes care of the recursion + @map(fn(%i) { %i * 2 }, %l) + +The following right fold concatenates two lists: + +.. code-block:: python + + # directly written + def @concat(%l1 : List[a], %l2 : List[a]) -> List[a] { + match(%l1) { + case Nil() { %l2 } + case Cons(%h, %t) { Cons(%h, @concat(%t, %l2) } + } + } + + # foldr takes care of the recursion + @foldr(fn(%h, %z) { Cons(%h, %z) }, %l2, %l1) + +The following left fold flattens a list of lists (using concatenation): + +.. code-block:: python + + # directly written + def @flatten(%ll : List[List[a]]) -> List[a] { + match(%ll) { + case Cons(%h, %t) { @concat(%h, @flatten(%t)) } + case Nil() { Nil() } + } + + # foldl takes care of the recursion + @foldl(@concat, Nil(), %ll) + +Note that these iteration constructs can be implemented directly in Relay's +source language and more can easily be defined (and for more data types, like trees), +rather than being constructs built into the language (e.g., +`"foreach" in MXNet `__). +ADTs and their extensibility allow for a broad range of iterations and data structures to be expressed +in Relay and supported by the type system without having to modify the language implementation. + +Implementing Neural Nets Using ADTs +=================================== + +In `this 2015 blog post `__, Christopher Olah notes that +many neural networks can be easily expressed using common functional programming constructs. Relay's ADTs +allow those examples to be implemented directly in TVM. + +First let us suppose that we have a function corresponding to a trained recurrent neural net (RNN) +cell, which takes in a past state and an input value and returns a new state and output value. In +Relay, this would have the following signature: + +.. code-block:: python + + @cell : fn(state_type, in_type) -> (state_type, out_type) + +We might consider a ReLU cell as a simple concrete example, with a trained version below: + +.. code-block:: python + + def @linear(%x, %w, %b) { %w*%x + %b } + + def @relu_cell(%w, # weights + %b, # offsets + %s, # state + %x # input + ) { + let %x2 = @linear(%x, %w.0, %b.0); + let %s2 = @linear(%s, %w.1, %b.1); + # doesn't change the state + (%s, nn.relu(%x2 + %s2)) + } + + # this is a higher-order function because it returns a closure + def @trained_cell(%w, %b) { + fn(%x, %h) { @relu_cell(%w, %b, %x, %h) } + } + +Following Olah's example, we can encode a sequence (list) of inputs with the following left fold: + +.. code-block:: python + + def @encode(%cell, %input : List[in_type], %init : state_type) -> state_type { + # not using the output + @foldl(fn(%state, %in) { %cell(%state, %in).0 }, %init, %input) + } + +Using an *unfold* iterator (from Haskell's standard library), the same cell could be used to make +a generator network (which takes a single input and produces a sequence of outputs): + +.. code-block:: python + + # included in Relay's Prelude + def @unfoldr(%f : fn(b) -> Optional[(a, b)], %z : b) -> List[a] { + match(%f(%z)) { + case Some(%pair) { Cons(%pair.0, @unfoldr(%f, %pair.1)) } + case None() { Nil() } + } + } + + # we need some way of generating an input to the cell function given only a state + def @gen_func(%state : state_type) : Optional[(out_type, state_type)] { + let %in : Optional[in_type] = @generate_input(%state); + match(%in) { + case Some(%n) { + let %cell_out = @cell(%n, %state); + Some((%cell_out.1, %cell_out.0)) # pair of output and state + } + case None() { None() } + } + } + + def @generator(%cell, %init : state_type) -> List[out_type] { + @unfoldr(fn(%state) { @gen_func(%cell, %state) }, %init) + } + +An accumulating map (a fold that simultaneously updates an accumulator value and a list +of outputs) can be used to write a general RNN (with an output for every input): + +.. code-block:: python + + def @map_accumr(%f : fn(a, b) -> (a, c), %acc : a, %l : List[b]) -> (a, List[c]) { + match(%l) { + case Nil() { (%acc, Nil()) } + case Cons(%b, %t) { + let %update = %f(%acc, %b); + let %rest = @map_accumr(%f, %update.0, %t)); + (%rest.0, Cons(%update.1, %rest.1)) + } + } + } + + # can also be implemented as a right fold + # (this version is included in Relay's Prelude) + def @map_accumr_fold(%f, %acc, %l) { + @foldr(fn(%b, %p) { + let %f_out = %f(%p.0, %b); + (%f_out.0, Cons(%f_out.1, %p.1)) + }, + (%acc, Nil()), %l) + } + + def @general_rnn(%cell, %init : state_type, %input : List[in_type]) + -> (state_type, List[out_type]) { + @map_accumr(%cell, %init, %input) + } + +Olah also gives an example of a bidirectional neural network, in which two sets of +cells (which may have different weights) process the input in both directions and produce a +single set of outputs. The following is a Relay implementation of that example: + +.. code-block:: python + + # creates a list of tuples from two lists + # included in Relay's Prelude + def @zip(%l : List[a], %m : List[b]) -> List[(a, b)] { + match(%l) { + case Nil() { Nil() } + case Cons(%a, %t1) { + match(%m) { + case Nil() { Nil() } + case Cons(%b, %t2) { Cons((%a, %b), @zip(%t1, %t2)) } + } + } + } + } + + # analogous to map_accumr + # included in Relay's Prelude + def @map_accmul(%f, %acc, %l) { + @foldl(fn(%p, %b){ + let %f_out = %f(%p.0, %b); + (%f_out.0, Cons(%f_out.1, %p.1)) + }, (%acc, Nil()), %l) + } + + def @bidirectional_rnn + (%cell1, %cell2, %state1 : state1_type, %state2 : state2_type, %input : List[in_type]) + -> List[(out1_type, out2_type)] { + @zip(@map_accumr(%cell1, %state1, %input).1, @map_accuml(%cell2, %state2, %input).1) + } diff --git a/docs/langref/relay_expr.rst b/docs/langref/relay_expr.rst index 6c4ec5ca26242..efa74d3141985 100644 --- a/docs/langref/relay_expr.rst +++ b/docs/langref/relay_expr.rst @@ -21,13 +21,14 @@ constructs corresponds to a pure computation graph: - Tuple `Construction`_ and `Projection`_ - `Let Bindings`_ - `Graph Bindings`_ -- Calls to `Operators`_ +- Calls to `Operators`_ and `ADT Constructors`_ Control flow expressions allow the graph topology to change based on the value of previously executed expressions. The control fragment in Relay includes the following constructs: - `If-Then-Else`_ Expressions +- `ADT Matching`_ Expressions - Recursive Calls in Functions From the point of view of a computation graph, a function is a subgraph and a function call inlines the subgraph, substituting its arguments for the free variables in the subgraph with corresponding names. @@ -282,6 +283,42 @@ handles for generating a call to various pre-registered operators. The :ref:`tutorial on adding operators to Relay ` shows how to add further operators into the language. +ADT Constructors +================ + +Algebraic data types (ADTs) in Relay are described in detail in a +:ref:`separate overview` and their integration into +the type system is described :ref:`here`. + +In this section, we will simply note that ADT constructors are given +a function type and should be used inside call nodes like a function +or operator. An ADT constructor is defined by giving the name of +the ADT it constructs (a global type variable) and the types of the +expected arguments for the constructor. + +If the ADT definition includes type variables, those type variables +may appear in the constructor. Constructors cannot include any other +type variables. + +Let us suppose that :code:`D` is an ADT that takes type parameters +:code:`a` and :code:`b`. If :code:`C1` is a constructor for :code:`D` +and expects two arguments, one of type :code:`a` and one of type :code:`b`, then +:code:`C1` has the following type signature: +:code:`fun(a, b) -> D[a, b]`. (See either the ADT overview +or the discussion of ADT typing for an explanation of the type call +in the return type.) +If another constructor for :code:`D`, :code:`C2`, takes no arguments, +then it has the following type signature: :code:`fun() -> D[a, b]`; +the type parameters will always appear in the return type. + +Once called, a constructor produces an ADT instance, which is a +container that stores the values of the arguments to the constructor +as well as the name ("tag") of the constructor. The tag will be used +for deconstructing the instances and retrieving the values when +`ADT Matching`_. + +See :py:class:`~tvm.relay.adt.Constructor` for the definition and documentation. + Call ==== @@ -343,6 +380,8 @@ the type annotation: See :py:class:`~tvm.relay.expr.Call` for its definition and documentation. +.. _module-description: + Module and Global Functions =========================== @@ -542,6 +581,82 @@ condition value evaluates to :code:`False`. See :py:class:`~tvm.relay.expr.If` for its definition and documentation. +ADT Matching +============ + +Instances of algebraic data types (ADTs), as discussed in the +:ref:`ADT overview`, are containers that store the +arguments passed to the constructor used to create them, tagged by +the constructor name. + +Match expressions in Relay allow for retrieving the values stored in +an ADT instance ("deconstructing" it) based on their constructor tag. +A match expression behaves similarly to a C-style :code:`switch` statement, +branching on the different possible constructors for the type of the +value being deconstructed. As the ADT overview details, match +expressions are capable of more general pattern-matching than simply +splitting by constructors: any ADT instance nested inside an instance +(e.g., a list of lists) can be deconstructed at the same time as +the outer instance, while the different fields of the instance can be +bound to variables. (See :ref:`this section` for a detailed +description of ADT pattern-matching.) + +A match expression is defined using the +input value (an expression) and a list of clauses, each of which +consists of a pattern and an expression. When executed, the *first* +clause whose pattern matches the structure of the queried value is +executed; the clause expression is evaluated and returned. + +For example, suppose we have an ADT for natural numbers: + +.. code-block:: python + + data Nat { + Z : () -> Nat # zero + S : (Nat) -> Nat # successor (+1) to a nat + } + +Then the following function subtracts one from a passed nat: + +.. code-block:: python + + fn(%v: Nat[]) -> Nat[] { + match(%v) { + case Z() { Z() } + case S(%n) { %n } # the variable %n is bound in the scope of this clause + } + } + +The following function subtracts two from its argument if it is at least +two and returns the argument otherwise, using a nested constructor pattern: + +.. code-block:: python + + fn(%v : Nat[]) -> Nat[] { + match(%v) { + case S(S(%n)) { %n } + # wildcard pattern: matches all cases not matched already + case _ { %v } + } + } + +As aforementioned, the ordering of match clauses is relevant. +In the below example, the first clause will always match so +those below it can never run: + +.. code-block:: python + + fn(%v : Nat[]) -> Nat[] { + match(%v) { + case _ { %v } + case S(S(%n)) { S(%n) } + case S(%n) { %n } + case Z() { S(Z()) } + } + } + +See :py:class:`~tvm.relay.adt.Match` for its definition and documentation. + TempExprs ========= diff --git a/docs/langref/relay_type.rst b/docs/langref/relay_type.rst index 8d725b7d97f51..45217b234e768 100644 --- a/docs/langref/relay_type.rst +++ b/docs/langref/relay_type.rst @@ -10,6 +10,9 @@ be fully typed while requiring just a few explicit type annotations. Static types are useful when performing compiler optimizations because they communicate properties about the data a program manipulates, such as runtime shape, data layout, and storage, without needing to run the program. +Relay's `Algebraic Data Types`_ allow for easily and flexibly composing +types in order to build data structures that can be +reasoned about inductively and used to write recursive functions. Relay's type system features a form of *dependent typing* for shapes. That is, its type system keeps track of the shapes of tensors in a Relay program. Treating tensor shapes as types allows Relay to perform more powerful reasoning at compile time; @@ -33,20 +36,17 @@ type relations are satisfied at call sites). Type relations offer a flexible and relatively simple way of making the power of dependent typing available in Relay without greatly increasing the complexity of its type system. -Types -===== - Below we detail the language of types in Relay and how they are assigned to Relay expressions. Type -~~~~ +==== The base type for all Relay types. All Relay types are sub-classes of this base type. See :py:class:`~tvm.relay.ty.Type` for its definition and documentation. Tensor Type -~~~~~~~~~~~ +=========== A concrete tensor type in Relay. @@ -70,7 +70,7 @@ For example, here is a simple concrete tensor type corresponding to a 10-by-10 t See :py:class:`~tvm.relay.ty.TensorType` for its definition and documentation. Tuple Type -~~~~~~~~~~ +========== A type of a tuple in Relay. @@ -94,7 +94,7 @@ See :py:class:`~tvm.relay.ty.TupleType` for its definition and documentation. .. _type-parameter: Type Parameter -~~~~~~~~~~~~~~ +============== Type parameters represent placeholder types used for polymorphism in functions. Type parameters are specified according to *kind*, corresponding to the types @@ -127,7 +127,7 @@ be substituted with :code:`(10, 10)` at the call site below: See :py:class:`~tvm.relay.ty.TypeVar` for its definition and documentation. Type Constraint -~~~~~~~~~~~~~~~ +=============== This is an abstract class representing a type constraint, to be elaborated upon in further releases. Currently, type relations are the only @@ -136,7 +136,7 @@ type constraints provided; they are discussed below. See :py:class:`~tvm.relay.ty.TypeConstraint` for its definition and documentation. Function Type -~~~~~~~~~~~~~ +============= A function type in Relay, see `tvm/relay/type.h` for more details. @@ -158,7 +158,7 @@ See :py:class:`~tvm.relay.ty.FuncType` for its definition and documentation. .. _type-relation: Type Relation -~~~~~~~~~~~~~ +============= A type relation is the most complex type system feature in Relay. It allows users to extend type inference with new rules. @@ -226,7 +226,7 @@ in C++. See :py:class:`~tvm.relay.ty.TypeRelation` for its definition and documentation. Incomplete Type -~~~~~~~~~~~~~~~ +=============== An incomplete type is a type or portion of a type that is not yet known. This is only used during type inference. Any omitted type annotation is @@ -239,3 +239,139 @@ parameters: Type parameters must be bound to a function and are replaced with co at call sites, whereas incomplete types may appear anywhere in the program and are filled in during type inference. See :py:class:`~tvm.relay.ty.IncompleteType` for its definition and documentation. + +.. _adt-typing: + +Algebraic Data Types +==================== + +*Note: ADTs are not currently supported in the text format.* + +Algebraic data types (ADTs) are described in more detail in +:ref:`their overview `; this section describes +their implementation in the type system. + +An ADT is defined by a collection of named constructors, +each of which takes arguments of certain types. +An instance of an ADT is a container that stores the values +of the constructor arguments used to produce it as well as the +name of the constructor; the values can be retrieved by +deconstructing the instance by matching based on its constructor. +Hence, ADTs are sometimes called "tagged unions": like a C-style +union, the contents of an instance for a given ADT may have +different types in certain cases, but the constructor serves as a +tag to indicate how to interpret the contents. + +From the type system's perspective, it is most pertinent that +ADTs can take type parameters (constructor arguments can be +type parameters, though ADT instances with different type +parameters must be treated as different types) and be +recursive (a constructor for an ADT can take an instance of +that ADT, thus an ADT like a tree or list can be inductively +built up). The representation of ADTs in the type system must +be able to accomodate these facts, as the below sections will detail. + +Global Type Variable +~~~~~~~~~~~~~~~~~~~~ + +To represent ADTs compactly and easily allow for recursive ADT definitions, +an ADT definition is given a handle in the form of a global type variable +that uniquely identifies it. Each ADT definition is given a fresh global +type variable as a handle, so pointer equality can be used to distinguish +different ADT names. + +For the purposes of Relay's type system, ADTs are differentiated by name; +that means that if two ADTs have different handles, they will be +considered different types even if all their constructors are +structurally identical. + +Recursion in an ADT definition thus follows just like recursion for a +global function: the constructor can simply reference the ADT handle +(global type variable) in its definition. + +See :py:class:`~tvm.relay.ty.GlobalTypeVar` for its definition and documentation. + +Definitions (Type Data) +~~~~~~~~~~~~~~~~~~~~~~~ + +Besides a name, an ADT needs to store the constructors that are used +to define it and any type paramters used within them. These are +stored in the module, :ref:`analogous to global function definitions`. + +While type-checking uses of ADTs, the type system sometimes must +index into the module using the ADT name to look up information +about constructors. For example, if a constructor is being pattern-matched +in a match expression clause, the type-checker must check the constructor's +signature to ensure that any bound variables are being assigned the +correct types. + +See :py:class:`~tvm.relay.adt.TypeData` for its definition and documentation. + +Type Call +~~~~~~~~~ + +Because an ADT definition can take type parameters, Relay's type +system considers an ADT definition to be a *type-level function* +(in that the definition takes type parameters and returns the +type of an ADT instance with those type parameters). Thus, any +instance of an ADT is typed using a type call, which explicitly +lists the type parameters given to the ADT definition. + +It is important to list the type parameters for an ADT instance, +as two ADT instances built using different constructors but the +same type parameters are of the *same type* while two ADT instances +with different type parameters should not be considered the same +type (e.g., a list of integers should not have the same type as +a list of pairs of floating point tensors). + +The "function" in the type call is the ADT handle and there must +be one argument for each type parameter in the ADT definition. (An +ADT definition with no arguments means that any instance will have +no type arguments passed to the type call). + +See :py:class:`~tvm.relay.ty.TypeCall` for its definition and documentation. + +Example: List ADT +~~~~~~~~~~~~~~~~~ + +This subsection uses the simple list ADT (included as a default +ADT in Relay) to illustrate the constructs described in the previous +sections. Its definition is as follows: + +.. code-block:: python + + data List { + Nil : () -> List + Cons : (a, List[a]) -> List + } + +Thus, the global type variable :code:`List` is the handle for the ADT. +The type data for the list ADT in the module notes that +:code:`List` takes one type parameter and has two constructors, +:code:`Nil` (with signature :code:`fn() -> List[a]`) +and :code:`Cons` (with signature :code:`fn(a, List[a]) -> List[a]`). +The recursive reference to :code:`List` in the :code:`Cons` +constructor is accomplished by using the global type +variable :code:`List` in the constructor definition. + +Below two instances of lists with their types given, using type calls: + +.. code-block:: python + + Cons(1, Cons(2, Nil())) # List[Tensor[(), int32]] + Cons((1, 1), Cons((2, 2), Nil())) # List[(Tensor[(), int32], Tensor[(), int32])] + +Note that :code:`Nil()` can be an instance of any list because it +does not take any arguments that use a type parameter. (Nevertheless, +for any *particular* instance of :code:`Nil()`, the type parameter must +be specified.) + +Here are two lists that are rejected by the type system because +the type parameters do not match: + +.. code-block:: python + + # attempting to put an integer on a list of int * int tuples + Cons(1, Cons((1, 1), Nil())) + # attempting to put a list of ints on a list of lists of int * int tuples + Cons(Cons(1, Cons(2, Nil())), Cons(Cons((1, 1), Cons((2, 2), Nil())), Nil()))