[MLIR/Frontend] C++ compiler driver improvements, ability to compile …

…textual IR (#216)

**Context:** Lowering from MLIR to binary went through several subprocess calls, each of which created a new process (either quantum-opt or mlir-hlo-opt). Between subprocess calls, the IR had to be dumped to a textual representation and then parsed back into an in-memory representation by the next subprocess. This is inefficient, and compilation time grew in proportion to the size of the program. The design also increased binary size: quantum-opt and mlir-hlo-opt each statically linked the same LLVM and MLIR libraries, so the package was larger than needed.

**Description of the Change:**

- There is now a C++ compiler driver that avoids dumping and parsing the intermediate representation between stages, which reduces compilation time and package size.
- The C++ driver compiles an MLIR module down to an object binary file.
- General refactoring, separating out the extensions from the driver into separate pybind modules.
- Addition of the ability to use `@qjit` on a string containing textual IR (MLIR at any level and LLVM IR) and get it to run from Python.
- Enzyme module is updated to be compiled statically.

**Benefits:**

- Improved compilation time.
- Reduced package size.

[sc-41430] [sc-41704]

---------

Co-authored-by: Ali Asadi
Co-authored-by: Sergei Mironov
Co-authored-by: David Ittah
Co-authored-by: Erick
Co-authored-by: erick-xanadu
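To make the round-tripping cost described in the Context concrete, here is a toy sketch. It is not Catalyst code: the "IR" is a plain dictionary, `json` stands in for the MLIR printer and parser, and the stage names `lower_hlo` and `lower_quantum` are hypothetical. It only shows that a subprocess-style pipeline re-parses the program once per stage, while an in-memory driver parses it once and produces the same result.

```python
import json

# Toy "IR": a dict holding a list of op names (a stand-in for an MLIR module).
def lower_hlo(module):
    # Hypothetical stage: rewrite "mhlo." ops into "arith." ops.
    return {"ops": [op.replace("mhlo.", "arith.") for op in module["ops"]]}

def lower_quantum(module):
    # Hypothetical stage: rewrite "quantum." ops into "llvm." ops.
    return {"ops": [op.replace("quantum.", "llvm.") for op in module["ops"]]}

STAGES = [lower_hlo, lower_quantum]

def compile_round_trip(text):
    """Each stage parses text and dumps text again, as separate tools would."""
    parses = 0
    for stage in STAGES:
        module = json.loads(text)         # parse the textual form
        parses += 1
        text = json.dumps(stage(module))  # dump it for the next tool
    return text, parses

def compile_in_memory(text):
    """Parse once, then hand the in-memory module from stage to stage."""
    module = json.loads(text)
    for stage in STAGES:
        module = stage(module)
    return json.dumps(module), 1

source = json.dumps({"ops": ["mhlo.add", "quantum.custom"]})
round_trip_out, round_trip_parses = compile_round_trip(source)
in_memory_out, in_memory_parses = compile_in_memory(source)
```

In this toy, the number of parses in the round-trip version grows with the number of stages, and each parse costs time proportional to program size, which is the effect the commit message describes.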
PennyLaneAI · Sep 20, 2023 · 64be9d2
1 parent 2c38488
commit 64be9d2
Showing 38 changed files with 1,780 additions and 956 deletions.
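The documentation change to doc/dev/debugging.rst in this commit recommends copying the default pipelines and editing the copy before passing it to `@qjit`. A minimal sketch of that editing step, using plain Python lists, could look as follows. `EXAMPLE_PIPELINES` and `without_pass` are hypothetical names introduced here for illustration; the list reproduces only the entries visible in the diff, not the full `catalyst.compiler.DEFAULT_PIPELINES`.

```python
# Hypothetical stand-in for the default pipelines; only the passes shown in
# the documentation diff are listed, the elided ones are left out.
EXAMPLE_PIPELINES = [
    (
        "HLOLoweringPass",
        [
            "canonicalize",
            "func.func(chlo-legalize-to-hlo)",
            "stablehlo-legalize-to-hlo",
            "func.func(mhlo-legalize-control-flow)",
        ],
    ),
    (
        "QuantumCompilationPass",
        [
            "lower-gradients",
            "adjoint-lowering",
            "convert-arraylist-to-memref",
        ],
    ),
]

def without_pass(pipelines, pipeline_name, pass_name):
    """Return a new pipeline list with `pass_name` dropped from one pipeline.

    The input list is left untouched, so the defaults stay reusable.
    """
    return [
        (name, [p for p in passes if not (name == pipeline_name and p == pass_name)])
        for name, passes in pipelines
    ]

my_pipelines = without_pass(
    EXAMPLE_PIPELINES, "QuantumCompilationPass", "adjoint-lowering"
)
```

The resulting list has the same `(name, passes)` shape as the input, so it could be handed to the decorator as `@qjit(pipelines=my_pipelines)` in the way the updated documentation shows.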
diff --git a/.dep-versions b/.dep-versions
@@ -2,4 +2,4 @@
 jax=0.4.14
 mhlo=00be4a6ce2c4d464e07d10eae51918a86f8df7b4
 llvm=4706251a3186c34da0ee8fd894f7e6b095da8fdc
-enzyme=86197cb2d776d72e2063695be21b729f6cffeb9b
+enzyme=8d22ed1b8c424a061ed9d6d0baf0cc0d2d6842e2
diff --git a/.github/workflows/check-catalyst.yaml b/.github/workflows/check-catalyst.yaml
@@ -131,9 +131,17 @@ jobs:
         key: ${{ runner.os }}-llvm-${{ needs.constants.outputs.llvm_version }}-default-build-opt
         fail-on-cache-miss: True
 
+    - name: Cache MHLO Source
+      id: cache-mhlo-source
+      uses: actions/cache@v3
+      with:
+        path: mlir/mlir-hlo
+        key: ${{ runner.os }}-mhlo-${{ needs.constants.outputs.mhlo_version }}-default-source
+        enableCrossOsArchive: True
+
     - name: Clone MHLO Submodule
       if: |
-        steps.cache-mhlo.outputs.cache-hit != 'true' &&
+        steps.cache-mhlo.outputs.cache-hit != 'true' ||
         steps.cache-mhlo-source.outputs.cache-hit != 'true'
       uses: actions/checkout@v3
       with:
@@ -213,7 +221,7 @@ jobs:
 
   quantum:
     name: Quantum Dialects Build
-    needs: [constants, llvm]
+    needs: [constants, mhlo, llvm]
     runs-on: ubuntu-latest
 
     steps:
@@ -234,6 +242,15 @@ jobs:
         enableCrossOsArchive: True
         fail-on-cache-miss: True
 
+    - name: Get Cached MHLO Source
+      id: cache-mhlo-source
+      uses: actions/cache@v3
+      with:
+        path: mlir/mlir-hlo
+        key: ${{ runner.os }}-mhlo-${{ needs.constants.outputs.mhlo_version }}-default-source
+        enableCrossOsArchive: True
+        fail-on-cache-miss: True
+
     - name: Get Cached LLVM Build
       id: cache-llvm-build
       uses: actions/cache@v3
@@ -242,6 +259,14 @@ jobs:
         key: ${{ runner.os }}-llvm-${{ needs.constants.outputs.llvm_version }}-default-build-opt
         fail-on-cache-miss: True
 
+    - name: Get Cached MHLO Build
+      id: cache-mhlo-build
+      uses: actions/cache@v3
+      with:
+        path: mhlo-build
+        key: ${{ runner.os }}-mhlo-${{ needs.constants.outputs.mhlo_version }}-default-build
+        fail-on-cache-miss: True
+
     - name: Cache CCache
       id: cache-ccache
       uses: actions/cache@v3
@@ -253,11 +278,21 @@ jobs:
         key: ${{ runner.os }}-ccache-${{ github.run_id }}
         restore-keys: ${{ runner.os }}-ccache-
 
+    - name: Clone Enzyme Submodule
+      if: |
+        steps.cache-enzyme.outputs.cache-hit != 'true'
+      uses: actions/checkout@v3
+      with:
+        repository: EnzymeAD/Enzyme
+        ref: ${{ needs.constants.outputs.enzyme_version }}
+        path: mlir/Enzyme
+
     - name: Build MLIR Dialects
       run: |
         CCACHE_DIR="$(pwd)/.ccache" \
         LLVM_BUILD_DIR="$(pwd)/llvm-build" \
         MHLO_BUILD_DIR="$(pwd)/mhlo-build" \
+        ENZYME_SRC_DIR="$(pwd)/Enzyme" \
         DIALECTS_BUILD_DIR="$(pwd)/quantum-build" \
         make dialects
 
@@ -273,7 +308,7 @@ jobs:
 
   frontend-tests:
     name: Frontend Tests
-    needs: [constants, runtime, mhlo, quantum, enzyme]
+    needs: [constants, runtime, mhlo, quantum]
     runs-on: ubuntu-latest
 
     steps:
@@ -331,7 +366,6 @@ jobs:
         echo "PYTHONPATH=$PYTHONPATH:$(pwd)/quantum-build/python_packages/quantum" >> $GITHUB_ENV
         echo "RUNTIME_LIB_DIR=$(pwd)/runtime-build/lib" >> $GITHUB_ENV
         echo "MLIR_LIB_DIR=$(pwd)/llvm-build/lib" >> $GITHUB_ENV
-        echo "ENZYME_LIB_DIR=$(pwd)/enzyme-build/Enzyme" >> $GITHUB_ENV
         chmod +x quantum-build/bin/quantum-opt  # artifact upload does not preserve permissions
 
     - name: Run Python Lit Tests
@@ -358,7 +392,7 @@ jobs:
 
   frontend-tests-lightning-kokkos:
     name: Frontend Tests (backend="lightning.kokkos")
-    needs: [constants, runtime, mhlo, quantum, enzyme]
+    needs: [constants, runtime, mhlo, quantum]
     runs-on: ubuntu-latest
 
     steps:
@@ -410,14 +444,12 @@ jobs:
 
     - name: Add Frontend Dependencies to PATH
       run: |
-        echo "$(pwd)/enzyme-build/Enzyme" >> $GITHUB_PATH
         echo "$(pwd)/llvm-build/bin" >> $GITHUB_PATH
         echo "$(pwd)/mhlo-build/bin" >> $GITHUB_PATH
         echo "$(pwd)/quantum-build/bin" >> $GITHUB_PATH
         echo "PYTHONPATH=$PYTHONPATH:$(pwd)/quantum-build/python_packages/quantum" >> $GITHUB_ENV
         echo "RUNTIME_LIB_DIR=$(pwd)/runtime-build/lib" >> $GITHUB_ENV
         echo "MLIR_LIB_DIR=$(pwd)/llvm-build/lib" >> $GITHUB_ENV
-        echo "ENZYME_LIB_DIR=$(pwd)/enzyme-build/Enzyme" >> $GITHUB_ENV
         chmod +x quantum-build/bin/quantum-opt  # artifact upload does not preserve permissions
 
     - name: Install lightning.kokkos used in Python tests
@@ -430,7 +462,7 @@ jobs:
 
   frontend-tests-openqasm-device:
     name: Frontend Tests (backend="openqasm3")
-    needs: [constants, mhlo, quantum, enzyme, llvm]
+    needs: [constants, mhlo, quantum, llvm]
     runs-on: ubuntu-latest
 
     steps:
@@ -494,7 +526,6 @@ jobs:
         echo "PYTHONPATH=$PYTHONPATH:$(pwd)/quantum-build/python_packages/quantum" >> $GITHUB_ENV
         echo "RUNTIME_LIB_DIR=$(pwd)/runtime-build/lib" >> $GITHUB_ENV
         echo "MLIR_LIB_DIR=$(pwd)/llvm-build/lib" >> $GITHUB_ENV
-        echo "ENZYME_LIB_DIR=$(pwd)/enzyme-build/Enzyme" >> $GITHUB_ENV
         chmod +x quantum-build/bin/quantum-opt  # artifact upload does not preserve permissions
 
     - name: Run Python Pytest Tests

diff --git a/.gitmodules b/.gitmodules
@@ -8,7 +8,7 @@
 	url = https://github.com/llvm/llvm-project.git
 	shallow = true
 	ignore = dirty
-[submodule "enzyme"]
+[submodule "Enzyme"]
 	path = mlir/Enzyme
 	url = https://github.com/EnzymeAD/Enzyme.git
 	shallow = true

diff --git a/doc/changelog.md b/doc/changelog.md
@@ -7,6 +7,16 @@
 * Update the Lightning backend device to work with the PL-Lightning monorepo.
   [(#259)](https://github.com/PennyLaneAI/catalyst/pull/259)
 
+* Move to an alternate compiler driver in C++. This improves compile-time performance by
+  avoiding *round-tripping*, which is when the entire program being compiled is dumped to
+  a textual form and re-parsed by another tool.
+
+  This is also a requirement for providing custom metadata at the LLVM level, which is
+  necessary for better integration with tools like Enzyme. Finally, this makes it more natural
+  to improve error messages originating from C++ when compared to the prior subprocess-based
+  approach.
+  [(#216)](https://github.com/PennyLaneAI/catalyst/pull/216)
+
 * Build both `"lightning.qubit"` and `"lightning.kokkos"` against the PL-Lightning monorepo.
   [(#277)](https://github.com/PennyLaneAI/catalyst/pull/277)
 
@@ -22,7 +32,10 @@
 
 This release contains contributions from (in alphabetical order):
 
-Ali Asadi
+Ali Asadi,
+Erick Ochoa Lopez,
+Jacob Mai Peng,
+Sergei Mironov.
 
 # Release 0.3.0
 

diff --git a/doc/conf.py b/doc/conf.py
@@ -82,13 +82,15 @@ def __getattr__(cls, name):
 
 
 MOCK_MODULES = [
+    "mlir_quantum",
     "mlir_quantum.runtime",
     "mlir_quantum.dialects",
     "mlir_quantum.dialects.arith",
     "mlir_quantum.dialects.tensor",
     "mlir_quantum.dialects.scf",
     "mlir_quantum.dialects.quantum",
     "mlir_quantum.dialects.gradient",
+    "mlir_quantum.compiler_driver",
     "pybind11",
 ]
 

diff --git a/doc/dev/debugging.rst b/doc/dev/debugging.rst
@@ -118,98 +118,79 @@ Will print out something close to the following:
 Pass Pipelines
 ==============
 
-The compilation steps which take MLIR as an input and lower it to binary are broken into pass pipelines.
-A ``PassPipeline`` is a class that specifies which binary and which flags are used for compilation.
-Users can implement their own ``PassPipeline`` by inheriting from this class and implementing the relevant methods/attributes.
-Catalyst's compilation strategy can then be adjusted by overriding the default pass pipeline.
-For example, let's imagine that a user is interested in testing different optimization levels when compiling LLVM IR to binary using ``llc``.
-The user would then create a ``PassPipeline`` that replaces the ``LLVMIRToObjectFile`` class.
-First let's take a look at the ``LLVMIRToObjectFile``.
+The compilation steps which take MLIR as an input and lower it to binary are broken into MLIR pass
+pipelines.  The ``pipelines`` argument of the ``qjit`` function may be used to alter the steps used
+for compilation. The default set of pipelines is defined via the ``catalyst.compiler.DEFAULT_PIPELINES``
+list. Its structure is shown below.
 
 .. code-block:: python
 
-    class LLVMIRToObjectFile(PassPipeline):
-        """LLVMIR To Object File."""
-    
-        _executable = get_executable_path("llvm", "llc")
-        _default_flags = [
-            "--filetype=obj",
-            "--relocation-model=pic",
+    DEFAULT_PIPELINES = [
+        (
+            "HLOLoweringPass",
+            [
+                "canonicalize",
+                "func.func(chlo-legalize-to-hlo)",
+                "stablehlo-legalize-to-hlo",
+                "func.func(mhlo-legalize-control-flow)",
+                ...
+            ],
+        ),
+        (
+            "QuantumCompilationPass",
+            [
+                "lower-gradients",
+                "adjoint-lowering",
+                "convert-arraylist-to-memref",
+            ],
+        ),
+        ...
         ]
-    
-        @staticmethod
-        def get_output_filename(infile):
-            path = pathlib.Path(infile)
-            if not path.exists():
-                raise FileNotFoundError("Cannot find {infile}.")
-            return str(path.with_suffix(".o"))
-
-
-The ``LLVMDialectTOLLVMIR`` and all classes derived from ``PassPipeline`` must define an ``_executable`` and ``_default_flags`` fields.
-The ``_executable`` field is string that corresponds to the command that will be used to execute in a subprocess.
-The ``_default_flags`` are the flags that will be used when running the executable.
-The method ``get_output_filename`` computes the name of the output file given an input file.
-It is expected that the output of a ``PassPipeline`` will be fed as an input to the following ``PassPipeline``.
 
-From here, we can see that in order for the user to test different optimization levels, all that is needed is create a class that extends either ``PassPipeline`` or ``LLVMDialectToLLVMIR`` and appends the ``-O3`` flag to the ``_default_flags`` field. For example, either of the following classes would work:
 
+The compilation passes that are executed can be customized. This is useful if you are debugging
+Catalyst itself or want to enable or disable passes within a specific pipeline. It is recommended
+to copy the default pipelines, edit them to suit your goals, and then pass them to the ``@qjit``
+decorator. For example, to disable inlining:
 
 .. code-block:: python
 
-    class MyLLCOpt(PassPipeline):
-        """LLVMIR To Object File."""
-    
-        _executable = get_executable_path("llvm", "llc")
-        _default_flags = [
-            "--filetype=obj",
-            "--relocation-model=pic",
-            "-O3",
-        ]
-    
-        @staticmethod
-        def get_output_filename(infile):
-            path = pathlib.Path(infile)
-            if not path.exists():
-                raise FileNotFoundError("Cannot find {infile}.")
-            return str(path.with_suffix(".o"))
-
-or
-
-.. code-block:: python
-
-    class MyLLCOpt(LLVMIRToObjectFile):
-        """LLVMIR To Object File."""
-    
-        _default_flags = [
-            "--filetype=obj",
-            "--relocation-model=pic",
-            "-O3",
+    my_pipelines = [
+        ...
+        (
+            "MyBufferizationPass",
+            [
+                "one-shot-bufferize{dialect-filter=memref}",
+                # "inline",
+                "gradient-bufferize",
+                ...
+            ],
+        ),
+        ...
         ]
-    
-In order to actually use this ``PassPipeline``, the user must override the default ``PassPipeline``.
-To do so, use the ``pipelines`` keyword parameter in ``@qjit`` decorator.
-The value assigned to ``pipelines`` must be a list of ``PassPipeline`` that will lower MLIR to binary.
-In this particular case, we are substituting the ``LLVMIRToObjectFile`` pass pipeline with ``MyLLCOpt`` in the default pass pipeline.
-The following will work:
 
+    @qjit(pipelines=my_pipelines)
+    @qml.qnode(dev)
+    def circuit():
+        ...
 
-.. code-block:: python
 
-    custom_pipeline = [MHLOPass, QuantumCompilationPass, BufferizationPass, MLIRToLLVMDialect, LLVMDialectToLLVMIR, MyLLCOpt, CompilerDriver]
-    
-    @qjit(pipelines=custom_pipeline)
-    def foo():
-        """A method to be JIT compiled using a custom pipeline"""
-        ...
+Here, each item represents a pipeline. Each pipeline has a name and a list of MLIR passes
+to perform. Most of the standard passes are described in the
+`MLIR passes documentation <https://mlir.llvm.org/docs/Passes/>`_. Quantum MLIR passes are
+implemented in Catalyst and can be found in the sources.
 
-Users that are interested in ``PassPipeline`` classes are encouraged to look at the ``compiler.py`` file to look at different ``PassPipeline`` child classes.
+All pipelines are executed in sequence. The output MLIR of each pipeline is stored in
+memory and becomes available via the ``get_output_of`` method of the ``QJIT`` object.
 
 Printing the IR generated by Pass Pipelines
-==========================================
+===========================================
 
-We won't get into too much detail here, but sometimes it is useful to look at the output of a specific ``PassPipeline``.
+We won't get into too much detail here, but sometimes it is useful to look at the output of a
+specific pass pipeline.
 To do so, simply use the ``get_output_of`` method available in ``QJIT``.
-For example, if one wishes to inspect the output of the ``BufferizationPass``, simply run the following command.
+For example, if one wishes to inspect the output of the ``BufferizationPass`` pipeline, simply run
+the following command.
 
 .. code-block:: python
 
@@ -278,15 +259,21 @@ compiler used by TensorFlow.
 
 .. code-block:: python
 
-    print(circuit.mlir)    
+    print(circuit.mlir)
 
 Lowering out of the MHLO dialect leaves us with the classical computation represented by generic
 dialects such as ``arith``, ``math``, or ``linalg``. This allows us to later generate machine code
 via standard LLVM-MLIR tooling.
 
 .. code-block:: python
 
-    circuit.get_output_of("MHLOPass")
+    circuit.get_output_of("HLOLoweringPass")
+
+The quantum compilation pipeline expands high-level quantum instructions like adjoint, and applies quantum differentiation methods and optimization techniques.
+
+.. code-block:: python
+
+    circuit.get_output_of("QuantumCompilationPass")
 
 An important step in getting to machine code from a high-level representation is allocating memory
 for all the tensor/array objects in the program.