[BYOC][ACL] Improve installation tutorial (apache#6170)
* [BYOC][ACL] Improve installation tutorial

Improves installation script so that ACL can be built natively and improves tutorial to give clearer information on how ACL can be installed using two different methods.

Change-Id: I6cec98b4b0a7dc2b151b36583d3d28f2b85f8702

* Address comments

Change-Id: I88db6d9d539a8f06e2dfe1b9a0a3ac7a4b46cece
lhutton1 authored and Trevor Morris committed Aug 26, 2020
1 parent f914d56 commit 0289ac1
Showing 3 changed files with 65 additions and 12 deletions.
2 changes: 2 additions & 0 deletions cmake/modules/contrib/ArmComputeLib.cmake
@@ -36,6 +36,8 @@ if(USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME)

file(GLOB ACL_CONTRIB_SRC src/runtime/contrib/arm_compute_lib/*)

# Cmake needs to find arm_compute, include and support directories
# in the path specified by ACL_PATH.
set(ACL_INCLUDE_DIRS ${ACL_PATH}/include ${ACL_PATH})
include_directories(${ACL_INCLUDE_DIRS})

11 changes: 9 additions & 2 deletions docker/install/ubuntu_install_arm_compute_lib.sh
@@ -24,7 +24,7 @@ repo_url="https://github.com/ARM-software/ComputeLibrary.git"
repo_dir="acl"
install_path="/opt/$repo_dir"
architecture_type=$(uname -i)
target_arch="arm64-v8a" # arm64-v8a/armv7a
target_arch="arm64-v8a" # arm64-v8a / arm64-v8.2-a / armv7a
build_type="native"

tmpdir=$(mktemp -d)
@@ -41,9 +41,16 @@ apt-get install -y --no-install-recommends \
git \
scons \
bsdmainutils \
build-essential \
build-essential

# Install cross-compiler when not building natively.
# Depending on the architecture selected to compile for,
# you may need to install an alternative cross-compiler.
if [ "$architecture_type" != "aarch64" ]; then
apt-get install -y --no-install-recommends \
g++-aarch64-linux-gnu \
gcc-aarch64-linux-gnu
fi

cd "$tmpdir"

64 changes: 54 additions & 10 deletions docs/deploy/arm_compute_lib.rst
@@ -15,8 +15,9 @@
specific language governing permissions and limitations
under the License.
Relay Arm|reg| Compute Library Integration
==========================================
Relay Arm :sup:`®` Compute Library Integration
==============================================
**Author**: `Luke Hutton <https://github.com/lhutton1>`_

Introduction
------------
@@ -26,6 +27,35 @@ and GPU's. Currently the integration offloads operators to ACL to use hand-craft
routines in the library. By offloading select operators from a relay graph to ACL we can achieve
a performance boost on such devices.

Installing Arm Compute Library
------------------------------

Before installing Arm Compute Library, determine which architecture to build for. One way to do
this is to run `lscpu` and note the "Model name" of the CPU. You can then look up that model
(for example, in the vendor's documentation) to find the architecture it implements.
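
As a rough illustration of the `lscpu` step, the sketch below pulls the "Architecture" and "Model name" fields out of `lscpu` output. The helper names (`parse_lscpu`, `cpu_summary`, `read_lscpu`) are hypothetical and not part of TVM or ACL. Mapping the reported model to a `target_arch` value still has to be done by hand.

```python
import subprocess


def parse_lscpu(text):
    """Parse lscpu-style output ("Key:   value" per line) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that carry no "key: value" pair
            fields[key.strip()] = value.strip()
    return fields


def cpu_summary(text):
    """Return (architecture, model name); either may be None if absent."""
    fields = parse_lscpu(text)
    return fields.get("Architecture"), fields.get("Model name")


def read_lscpu():
    """Run lscpu on the current machine (assumes the utility is installed)."""
    result = subprocess.run(["lscpu"], capture_output=True, text=True, check=True)
    return result.stdout
```

On the target device, `cpu_summary(read_lscpu())` gives the two values to look up.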

We recommend two different ways to build and install ACL:

* Use the script located at `docker/install/ubuntu_install_arm_compute_lib.sh`. You can use this
  script to build ACL from source natively or to cross-compile the library on an x86 machine.
  You may need to change the architecture of the device you wish to compile for by altering the
  `target_arch` variable. Binaries will be built from source and installed to the location denoted by
  `install_path`.
* Alternatively, you can download and use pre-built binaries from:
  https://github.com/ARM-software/ComputeLibrary/releases. When using this package, you will need to
  select the binaries for the architecture you require and make sure they are visible to cmake. This
  can be done like so:

.. code:: bash

    cd <acl-prebuilt-package>/lib
    mv ./linux-<architecture-to-build-for>-neon/* .

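
The same move can be scripted, which may be handy when preparing several devices. This is a minimal sketch, not part of the TVM tooling: `flatten_prebuilt_libs` is a hypothetical helper, and the variant directory name you pass in must match whatever the downloaded package actually contains.

```python
import shutil
from pathlib import Path


def flatten_prebuilt_libs(package_dir, variant):
    """Move every entry of <package_dir>/lib/<variant>/ up into <package_dir>/lib/.

    Returns the names of the moved entries, sorted alphabetically.
    """
    lib_dir = Path(package_dir) / "lib"
    variant_dir = lib_dir / variant
    if not variant_dir.is_dir():
        raise FileNotFoundError(f"no such variant directory: {variant_dir}")
    moved = []
    for entry in sorted(variant_dir.iterdir()):
        destination = lib_dir / entry.name
        shutil.move(str(entry), str(destination))
        moved.append(destination.name)
    return moved
```

For example, `flatten_prebuilt_libs("<acl-prebuilt-package>", "linux-<architecture-to-build-for>-neon")` mirrors the two shell commands above.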
In both cases you will need to set USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME to the path where the ACL package
is located. CMake will look in /path-to-acl/ along with /path-to-acl/lib and /path-to-acl/build for the
required binaries. See the section below for more information on how to use these configuration options.
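
Before running cmake, it can help to confirm that the ACL path has the expected layout. The sketch below checks for the `arm_compute`, `include` and `support` directories named in the `ArmComputeLib.cmake` comment, and looks for a library file in the three locations listed above. The helper `check_acl_path` is hypothetical, and the default library file name `libarm_compute.so` is an assumption. Adjust it to match your ACL build.

```python
from pathlib import Path

# Directories that cmake needs to find under the ACL path.
REQUIRED_DIRS = ("arm_compute", "include", "support")
# Locations searched for binaries: the path itself, then lib/ and build/.
LIBRARY_SUBDIRS = ("", "lib", "build")


def check_acl_path(acl_path, library_name="libarm_compute.so"):
    """Return a list of problems found under acl_path (empty if none)."""
    root = Path(acl_path)
    problems = [
        f"missing directory: {name}"
        for name in REQUIRED_DIRS
        if not (root / name).is_dir()
    ]
    # library_name is an assumed file name, not taken from the tutorial.
    if not any((root / sub / library_name).is_file() for sub in LIBRARY_SUBDIRS):
        problems.append(f"{library_name} not found in {root}, {root}/lib or {root}/build")
    return problems
```

An empty list from `check_acl_path("/path/to/acl")` suggests the path is a reasonable value for USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME. It does not guarantee that the build will succeed.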

Building with ACL support
-------------------------

@@ -42,6 +72,20 @@ to compile an ACL module on an x86 machine and then run the module on a remote A
need to use USE_ARM_COMPUTE_LIB=ON on the x86 machine and USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME=ON on the remote
AArch64 device.

By default both options are set to OFF. Setting USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME=ON makes cmake
search for the ACL binaries in its default locations
(see https://cmake.org/cmake/help/v3.4/command/find_library.html) and, in addition, in
/path-to-tvm-project/acl/. You will likely need to point cmake at your own ACL installation. To do so,
specify a path in place of ON.

These flags should be set in your config.cmake file. For example:

.. code:: cmake

    set(USE_ARM_COMPUTE_LIB ON)
    set(USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME /path/to/acl)

Usage
-----

@@ -74,7 +118,7 @@ max_pool2d operator).
Annotate and partition the graph for ACL.

..code:: python
.. code:: python

    from tvm.relay.op.contrib.arm_compute_lib import partition_for_arm_compute_lib
    module = partition_for_arm_compute_lib(module)

@@ -100,7 +144,7 @@ Export the module.
Run inference. This must be done on an Arm device. If compiling on an x86 device and running on AArch64,
consider using the RPC mechanism. Tutorials for using the RPC mechanism:
https://tvm.apache.org/docs/tutorials/cross_compilation_and_rpc.html#sphx-glr-tutorials-cross-compilation-and-rpc-py
https://tvm.apache.org/docs/tutorials/get_started/cross_compilation_and_rpc.html

.. code:: python
@@ -155,12 +199,12 @@ what needs to be changed and where, it will not however dive into the complexiti
individual operator. This is left to the developer.

There are a series of files we need to make changes to:

* `python/relay/op/contrib/arm_compute_lib.py` In this file we define the operators we wish to offload using the
`op.register` decorator. This will mean the annotation pass recognizes this operator as ACL
offloadable.
`op.register` decorator. This will mean the annotation pass recognizes this operator as ACL offloadable.
* `src/relay/backend/contrib/arm_compute_lib/codegen.cc` Implement `Create[OpName]JSONNode` method. This is where we
declare how the operator should be represented by JSON. This will be used to create the ACL module.
* `src/runtime/contrib/arm_compute_lib/acl_kernel.h` Implement `Create[OpName]Layer` method. This is where we
define how the JSON representation can be used to create an ACL function. We simply define how to
translate from the JSON representation to ACL API.
declare how the operator should be represented by JSON. This will be used to create the ACL module.
* `src/runtime/contrib/arm_compute_lib/acl_runtime.cc` Implement `Create[OpName]Layer` method. This is where we
define how the JSON representation can be used to create an ACL function. We simply define how to
translate from the JSON representation to ACL API.
* `tests/python/contrib/test_arm_compute_lib` Add unit tests for the given operator.
