diff --git a/docs/source/background/assets/logical_v_physical.png b/docs/source/background/assets/logical_v_physical.png
new file mode 100644
index 00000000..7312f8c4
Binary files /dev/null and b/docs/source/background/assets/logical_v_physical.png differ
diff --git a/docs/source/background/index.rst b/docs/source/background/index.rst
index 82ddc0b0..b27a2208 100644
--- a/docs/source/background/index.rst
+++ b/docs/source/background/index.rst
@@ -23,4 +23,5 @@ TensorWrapper Background
   key_features
   other_choices
+  logical_v_physical
   terminology
diff --git a/docs/source/background/logical_v_physical.rst b/docs/source/background/logical_v_physical.rst
new file mode 100644
index 00000000..9856cff8
--- /dev/null
+++ b/docs/source/background/logical_v_physical.rst
@@ -0,0 +1,71 @@
+########################################################
Logical Versus Physical: Understanding the Tensor Layout
########################################################

Most tensor libraries conceptually support what we call physical and logical
layouts; however, they often do not distinguish between them. The point of this
page is to introduce the logical vs. physical concept, provide some examples,
and explain why making the distinction explicit is important.

**********
Motivation
**********

Ideally, users of a tensor library should only have to specify the properties
of the tensor as they relate to the problem being modeled. Roughly speaking,
this amounts to the tensor's literal values, its symmetry, and its sparsity.
The literal implementation may need additional structure, for example tiling
and distribution. This additional structure effectively changes the properties
of the tensor (*vide infra*), and moreover it may only be needed on certain
hardware.

In an effort to distinguish the problem-specific structure from the
implementation-/hardware-specific structure, we term the former the "logical"
structure and the latter the "physical" structure. Generally speaking, the
physical structure describes how the tensor is actually laid out, whereas the
logical structure describes how the user thinks of the tensor.

********
Examples
********

.. _fig_logical_v_physical:

.. figure:: assets/logical_v_physical.png
   :align: center

   Illustration of how tiling a tensor effectively changes its rank.

The primary motivation for introducing the logical vs. physical distinction is
the tiling of a tensor. :numref:`fig_logical_v_physical` illustrates this
process for a matrix. The left side of :numref:`fig_logical_v_physical` shows
the logical view of the matrix. This is how the end user thinks of the tensor,
*i.e.*, some number of rows by some number of columns. When the user interacts
with this tensor they expect to give two indices, one for the row and one for
the column. Every interaction the end user has with the tensor should behave
as if the tensor were a matrix.

The right side of :numref:`fig_logical_v_physical` shows how TensorWrapper
"physically" lays out a tiled matrix, namely as a matrix of matrices. Row and
column offsets in the outer matrix are used to select a tile; row and column
offsets within a tile are used to select an element. As a result, when
interacting with the physical tensor the user needs to provide four indices.
Clearly, the logical and physical views are not interchangeable without
knowing the mapping between them.
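
As a purely illustrative sketch of that mapping (not part of TensorWrapper's
API), the following C++ snippet converts a logical ``(row, col)`` offset into
the corresponding physical ``(tile_row, tile_col, local_row, local_col)``
offset, assuming every tile has the same shape of ``tile_rows`` by
``tile_cols``:

.. code-block:: c++

   #include <array>
   #include <cstddef>

   // Map a logical (row, col) offset of a tiled matrix onto the physical
   // (tile_row, tile_col, local_row, local_col) offset, assuming uniform
   // tiles of shape tile_rows by tile_cols.
   std::array<std::size_t, 4> logical_to_physical(std::size_t row,
                                                  std::size_t col,
                                                  std::size_t tile_rows,
                                                  std::size_t tile_cols) {
       return {row / tile_rows,   // outer row offset (selects the tile)
               col / tile_cols,   // outer column offset (selects the tile)
               row % tile_rows,   // inner row offset (element within the tile)
               col % tile_cols};  // inner column offset (element within the tile)
   }

For example, with 4-by-4 tiles the logical element ``(5, 7)`` physically lives
at ``(1, 1, 1, 3)``.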
+
*******
Summary
*******

- Ideally, users will interact with tensors in a manner dictated by the problem
  being modeled. This "logical" interaction should also be performance
  portable.
- Ideally, TensorWrapper will automatically map the logical view to a
  performant "physical" representation.
- Generally speaking, the physical representation will not be the same as the
  logical representation.
- In practice, TensorWrapper is unlikely to automate the logical-to-physical
  mapping anytime soon, so users will likely need to consider both the logical
  and the physical representations.
- By distinguishing between logical and physical views, and by writing the
  TensorWrapper infrastructure in terms of the physical views, we can move
  toward this ideal by having the logical views dispatch to the physical views.
diff --git a/docs/source/developer/design/overview.rst b/docs/source/developer/design/overview.rst
index 56d0d417..7bd0f235 100644
--- a/docs/source/developer/design/overview.rst
+++ b/docs/source/developer/design/overview.rst
@@ -96,6 +96,12 @@ Create a tensor
   element's indices as input.
 - There are a number of notable "special" tensors like the zero and identity
   tensors which users will sometimes need to create too.
+  - Part of creating the tensor is selecting the backend. Ideally,
+    TensorWrapper would be able to pick the backend most appropriate for the
+    requested calculation; however, it is likely that for the foreseeable
+    future users will need to specify the backend, at least for the initial
+    tensors (tensors computed from the initial tensors will use the same
+    backend).

.. _atw_compose_with_tensors:

@@ -179,6 +185,10 @@ Domain-specific optimization
   symmetry often simply result in tensor sparsity. Point being, many of
   these optimizations can be mapped to more general tensor considerations.

+   - To be clear, designing interfaces explicitly for domain-specific
+     optimizations is out of scope. The underlying optimizations (usually
+     sparsity and symmetry) are in scope.
+
*******************
Architecture Design
*******************

@@ -265,6 +275,12 @@ component is responsible for dealing with:
- switching among sparsity representations
- Nesting of sparsity

+As a note, in TensorWrapper tiling implies a nested tensor. Therefore,
+tile-sparsity is actually element-sparsity, just with non-scalar elements. The
+point is that, even though we anticipate tile-sparsity being the most important
+form of sparsity, TensorWrapper needs to be designed with element-sparsity in
+mind from the get-go.
+
TensorWrapper
-------------

@@ -284,7 +300,7 @@ User-Facing External Dependencies

TensorWrapper additionally needs a description of the runtime. For this
purpose we have elected to build upon
-`ParallelZone __`.
+`ParallelZone `__.

Implementation-Facing Classes
=============================

@@ -307,7 +323,6 @@ as:

- Fundamental type of the values (*e.g.*, float, double, etc.)
- Vectorization strategy (row-major vs.
column-major) - Value location: distribution, RAM, disk, on-the-fly, GPU -- Distribution strategy Buffer ------ @@ -339,7 +354,7 @@ describing the runtime properties of the tensor: - Actual shape of the tensor (including tiling) - Actual symmetry (accounting for actual shape) - Actual sparsity of the tensor (accounting for symmetry) -- Distribution for multi-process tensors +- Distribution for distributed tensors Expression @@ -353,27 +368,26 @@ in a user-facing manner; however, the expression layer is specifically designed to appear to the user like they are working with only ``TensorWrapper`` objects, which is why we consider it an implementation detail. Responsibilities include: -- Assembling the :ref:`term_cst` from the DSL. +- Assembling the :ref:`term_cst` from the :ref:`term_dsl`. - Express transformations of a single tensor - Express binary operations - Represent branching nodes of the abstract syntax tree - OpGraph ------- Main discussion: :ref:`tw_designing_the_opgraph`. The ``Expression`` component contains a user-friendly mechanism for composing -tensors using TensorWrapper's DSL. The result is a CSL. In practice, CSLs -contain extraneous information (and in C++ are typically represented by +tensors using TensorWrapper's DSL. The result is a :ref:`term_cst`. In practice, +CSTs contain extraneous information (and in C++ are typically represented by heavily nested template instantiations which are not fun to look at). The ``OpGraph`` component is designed to be an easier-to-manipulate, more-programmer-friendly representation of the tensor algebra the user requested than the ``Expression`` component. It is the ``OpGraph`` which is used to drive executing the backends. Responsibilities include: -- Converting the CST to an AST +- Converting the CST to an :ref:`term_ast` - Runtime optimizations of the AST Implementation-Facing External Dependencies