Deprecate C codegen #493

Merged
merged 2 commits into from
May 5, 2023
Binary file modified docs/_static/deployment.png
4,449 changes: 454 additions & 3,995 deletions docs/_static/deployment.svg
1 change: 1 addition & 0 deletions docs/conf.py
@@ -235,6 +235,7 @@ def is_readthedocs_build():
"scipy": ("https://docs.scipy.org/doc/scipy/", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
"sklearn": ("https://scikit-learn.org/stable", None),
"tl2cgen": ("https://tl2cgen.readthedocs.io/en/latest/", None),
}
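The new ``tl2cgen`` entry is what lets references such as ``:doc:`tl2cgen:index``` in ``docs/index.rst`` point at the TL2cgen documentation. The sketch below is only a rough illustration of that idea: it splits a target into a project key and a document name and joins the name to the base URL from the mapping. The helper ``resolve_doc_target`` is hypothetical, and the ``.html`` URL layout is an assumption; real Sphinx resolves these references through each project's ``objects.inv`` inventory rather than by string joining.

```python
# Simplified illustration of cross-project :doc: resolution.
# Assumption: pages live at <base_url>/<docname>.html. Real Sphinx looks the
# target up in the remote project's objects.inv inventory instead.

intersphinx_mapping = {
    "sklearn": ("https://scikit-learn.org/stable", None),
    "tl2cgen": ("https://tl2cgen.readthedocs.io/en/latest/", None),
}


def resolve_doc_target(target: str, mapping: dict) -> str:
    """Return a guessed HTML URL for a target such as 'tl2cgen:index'.

    Hypothetical helper, not part of Sphinx or Treelite.
    """
    project, sep, docname = target.partition(":")
    if not sep or project not in mapping:
        raise KeyError(f"no intersphinx entry for target {target!r}")
    base_url, _inventory = mapping[project]
    # Normalize the trailing slash so both URL styles in the mapping work.
    return base_url.rstrip("/") + "/" + docname + ".html"


print(resolve_doc_target("tl2cgen:index", intersphinx_mapping))
print(resolve_doc_target("tl2cgen:treelite-migration", intersphinx_mapping))
```

A target whose prefix is missing from the mapping raises ``KeyError`` in this sketch, which roughly mirrors why the ``tl2cgen`` entry has to be added before the new references can resolve.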


161 changes: 65 additions & 96 deletions docs/index.rst
@@ -1,9 +1,11 @@
#####################################################
=====================================================
Treelite : model compiler for decision tree ensembles
#####################################################
=====================================================

**Treelite** is a model compiler for decision tree ensembles, aimed at
efficient deployment.
**Treelite** is a universal model exchange and serialization format for
decision tree forests. Treelite aims to be a small library that enables
other C++ applications to exchange and store decision trees on the disk
as well as the network.

.. raw:: html

@@ -14,115 +16,84 @@ efficient deployment.
data-size="large" data-show-count="true"
aria-label="Watch dmlc/treelite on GitHub">Watch</a>

*************
.. warning:: Tree compiler was migrated to TL2cgen

If you are looking for a compiler to translate tree models into C code,
use :doc:`TL2cgen <tl2cgen:index>`.
To migrate existing code using Treelite 3.x, consult the page
:doc:`tl2cgen:treelite-migration`.


Why Treelite?
*************


Use machine learning package of your choice
===========================================
Treelite accommodates a wide range of decision tree ensemble models. In
particular, it handles both
`random forests <https://en.wikipedia.org/wiki/Random_forest>`_ and
`gradient boosted trees <https://en.wikipedia.org/wiki/Gradient_boosting>`_.

Treelite can read models produced by
`XGBoost <https://github.com/dmlc/xgboost/>`_,
`LightGBM <https://github.com/Microsoft/LightGBM>`_, and
`scikit-learn <https://github.com/scikit-learn/scikit-learn>`_. In cases where
you are using another package to train your model, you may use the
:doc:`flexible builder class <tutorials/builder>`.

Deploy with minimal dependencies
================================
It is a great hassle to install machine learning packages (e.g. XGBoost,
LightGBM, scikit-learn, etc.) on every machine your tree model will run. This is
the case no longer: Treelite will export your model as a stand-alone
prediction library so that predictions will be made without any machine
learning package installed.
=============

Universal, lightweight specification for all tree models
========================================================
Are you designing an optimized prediction runtime software for tree models?
--------------------------------------------------------
Are you designing a C++ application that needs to read and write tree models,
e.g. a prediction server?
Do not be overwhelmed by the variety of tree models in the wild. Treelite
lets you convert many kinds of tree models into a **single, lightweight
exchange format**. You can serialize (save) any tree model into a byte
sequence or a file. Plus, Treelite is designed to be used as a component in
prediction runtimes. Currently, Treelite is used by
`Amazon SageMaker Neo <https://github.com/neo-ai/neo-ai-dlr>`_ and
`RAPIDS cuML <https://github.com/rapidsai/cuml>`_.

***********
Quick start
***********
Install Treelite from PyPI:

.. code-block:: console
lets you convert many kinds of tree models into a **common specification**.
By using Treelite as a library, your application now only needs to deal
with one model specification instead of many. Treelite currently
supports:

* `XGBoost <https://github.com/dmlc/xgboost/>`_
* `LightGBM <https://github.com/Microsoft/LightGBM>`_
* `scikit-learn <https://github.com/scikit-learn/scikit-learn>`_
* :doc:`flexible builder class <tutorials/builder>` for users of other
tree libraries

In addition, tree libraries can directly output trained trees using the
Treelite specification. For example, the random forest algorithm in
`RAPIDS cuML <https://github.com/rapidsai/cuml>`_ stores the random forest
object using Treelite.

python3 -m pip install --user treelite treelite_runtime
.. raw:: html

Import your tree ensemble model into Treelite:
<p>
<a href="_static/deployment.png">
<img src="_static/deployment.svg"
onerror="this.src='_static/deployment.png'; this.onerror=null;"
width="100%"><br>
(Click to enlarge)
</a>
</p>

.. code-block:: python
A small library that's easy to embed in another C++ application
---------------------------------------------------------------
Treelite has an up-to-date CMake build script. If your C++
application uses CMake, it is easy to embed Treelite.
Treelite is currently used by the following applications:

import treelite
model = treelite.Model.load('my_model.model', model_format='xgboost')
* :doc:`tl2cgen:index`
* Forest Inference Library (FIL) in `RAPIDS cuML <https://github.com/rapidsai/cuml>`_
* `Triton Inference Server FIL Backend <https://github.com/triton-inference-server/fil_backend>`_,
an optimized prediction runtime for CPUs and GPUs.

Deploy a source archive:
Quick start
===========
Install Treelite:

.. code-block:: python
.. code-block:: console

# Produce a zipped source directory, containing all model information
# Run `make` on the target machine
model.export_srcpkg(platform='unix', toolchain='gcc',
pkgpath='./mymodel.zip', libname='mymodel.so',
verbose=True)
# From PyPI
pip install treelite
# From Conda
conda install -c conda-forge treelite

Deploy a shared library:
Import your tree ensemble model into Treelite:

.. code-block:: python

# Like export_srcpkg, but generates a shared library immediately
# Use this only when the host and target machines are compatible
model.export_lib(toolchain='gcc', libpath='./mymodel.so', verbose=True)
import treelite
model = treelite.Model.load("my_model.model", model_format="xgboost")

Make predictions on the target machine:
Compute predictions using :doc:`treelite-gtil-api`:

.. code-block:: python

import treelite_runtime
predictor = treelite_runtime.Predictor('./mymodel.so', verbose=True)
dmat = treelite_runtime.DMatrix(X)
out_pred = predictor.predict(dmat)

Read :doc:`tutorials/first` for a more detailed example. See
:doc:`tutorials/deploy` for additional instructions on deployment.

.. note:: A note on API compatibility

Since Treelite is in early development, its API may change substantially
in the future.

******************
How Treelite works
******************

.. raw:: html

<p>
<a href="_static/deployment.png">
<img src="_static/deployment.svg"
onerror="this.src='_static/deployment.png'; this.onerror=null;"
width="100%"><br>
(Click to enlarge)
</a>
</p>

The workflow involves two distinct machines: **the host machine** that generates
prediction subroutine from a given tree model, and **the target machine** that
runs the subroutine. The two machines exchange a single C file that contains
all relevant information about the tree model. Only the host machine needs to
have Treelite installed; the target machine requires only a working C compiler.
X = ... # numpy array
treelite.gtil.predict(model, data=X)

********
Contents
@@ -135,10 +106,8 @@ Contents
install
tutorials/index
treelite-api
treelite-runtime-api
treelite-gtil-api
treelite-c-api
Treelite runtime Rust API <http://dovahcrow.github.io/treerite/treerite/>
knobs/index
notes-on-serialization
treelite-doxygen
105 changes: 33 additions & 72 deletions docs/install.rst
@@ -1,47 +1,52 @@
============
Installation
============

You may choose one of three methods to install Treelite on your system:

* :ref:`install-pip`
* :ref:`install-conda`
* :ref:`install-source`

.. _install-pip:
.. contents::
:local:
:depth: 1

Download binary releases from PyPI (Recommended)
------------------------------------------------
================================================
This is probably the most convenient method. Simply type

.. code-block:: console

python3 -m pip install --user treelite treelite_runtime
pip install treelite

to install the Treelite package. The command will locate the binary release that is compatible with
your current platform. Check the installation by running

.. code-block:: python

import treelite
import treelite_runtime

in an interactive Python session. This method is available for only Windows, Mac OS X, and Linux.
in an interactive Python session. This method is available for only Windows, MacOS, and Linux.
For other operating systems, see the next section.
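With this change the install command no longer pulls in ``treelite_runtime``, so a quick check of which of the two packages is present can help when migrating an existing environment. The snippet below is a minimal sketch using only the Python standard library; the helper name ``installed`` is hypothetical and is not part of Treelite.

```python
import importlib.util


def installed(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# treelite is expected after `pip install treelite`; treelite_runtime may
# still be present in environments set up with an older Treelite release.
for name in ("treelite", "treelite_runtime"):
    print(f"{name}: {'found' if installed(name) else 'not found'}")
```

Because ``find_spec`` only locates the module, this check does not load the native library; running ``import treelite`` in an interactive session remains the more thorough test.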

.. note:: Installing OpenMP runtime on Mac OSX
.. note:: Windows users need to install Visual C++ Redistributable

Treelite requires DLLs from `Visual C++ Redistributable
<https://www.microsoft.com/en-us/download/details.aspx?id=48145>`_
in order to function, so make sure to install it. Exception: If
you have Visual Studio installed, you already have access to
necessary libraries and thus don't need to install Visual C++
Redistributable.

.. note:: Installing OpenMP runtime on MacOS

Treelite requires the presence of OpenMP runtime. To install OpenMP runtime on a Mac OSX system,
Treelite requires the presence of OpenMP runtime. To install OpenMP runtime on a MacOS system,
run the following command:

.. code-block:: bash

brew install libomp


.. _install-conda:

Download binary releases from Conda
------------------------------------------------
===================================
Treelite is also available on Conda.

.. code-block:: console
@@ -54,7 +59,7 @@ available platforms.
.. _install-source:

Compile Treelite from the source
--------------------------------
================================
Installation consists of two steps:

1. Build the shared libraries from C++ code (See the note below for the list.)
@@ -70,7 +75,7 @@ Installation consists of two steps:
Operating System Main library Runtime library
================== ===================== =============================
Windows ``treelite.dll`` ``treelite_runtime.dll``
Mac OS X ``libtreelite.dylib`` ``libtreelite_runtime.dylib``
MacOS ``libtreelite.dylib`` ``libtreelite_runtime.dylib``
Linux / other UNIX ``libtreelite.so`` ``libtreelite_runtime.so``
================== ===================== =============================

@@ -83,8 +88,8 @@ To get started, clone Treelite repo from GitHub.

The next step is to build the shared libraries.

1-1. Compiling shared libraries on Linux and Mac OS X
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1-1. Compiling shared libraries on Linux and MacOS
--------------------------------------------------
Here, we use CMake to generate a Makefile:

.. code-block:: bash
@@ -102,7 +107,7 @@ libraries.

The compiled libraries will be under the ``build/`` directory.

.. note:: Compiling Treelite with multithreading on Mac OS X
.. note:: Compiling Treelite with multithreading on MacOS

Treelite requires the presence of OpenMP runtime. To install OpenMP runtime on a Mac OSX system,
run the following command:
@@ -112,73 +117,29 @@ The compiled libraries will be under the ``build/`` directory.
brew install libomp

1-2. Compiling shared libraries on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
------------------------------------------
We can use CMake to generate a Visual Studio project. The following snippet assumes that Visual
Studio 2017 is installed. Adjust the version depending on the copy that's installed on your system.
Studio 2022 is installed. Adjust the version depending on the copy that's installed on your system.

.. code-block:: dosbatch

mkdir build
cd build
cmake .. -G"Visual Studio 15 2017 Win64"
cmake .. -G"Visual Studio 17 2022" -A x64

.. note:: Visual Studio 2017 or newer is required
.. note:: Visual Studio 2019 or newer is required

Ensure that you have Visual Studio version 2017 or newer.
Treelite uses the C++17 standard. Ensure that you have Visual Studio version 2019 or newer.

Once CMake finished running, open the generated solution file (``treelite.sln``) in Visual Studio.
From the top menu, select **Build > Build Solution**.

2. Installing Python package
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Python package is located at the ``python`` subdirectory. There are several
ways to install the package:

**1. Install system-wide, which requires root permission**
----------------------------
The Python package is located at the ``python`` subdirectory. Run Pip to install the Python
package. The Python package will re-use the native library built in Step 1.

.. code-block:: bash

# Install treelite
cd python
sudo python3 setup.py install
# Install treelite_runtime
cd ../runtime/python
sudo python3 setup.py install

You will need Python `setuptools <https://pypi.python.org/pypi/setuptools>`_
module for this to work. It is often part of the core Python installation.
Should it be necessary, the package can be installed using ``pip``:

.. code-block:: bash

pip install -U pip setuptools

**2. Install for only current user**

This is useful if you do not have the administrative rights.

.. code-block:: bash

# Install treelite
cd python
python3 setup.py install --user
# Install treelite_runtime
cd ../runtime/python
python3 setup.py install --user

.. note:: Recompiling Treelite

Every time the C++ portion of Treelite gets re-compiled, the Python
package must be re-installed for the new library to take effect.

**3. Set the environment variable PYTHONPATH to locate Treelite package**

Only set the environment variable ``PYTHONPATH`` to tell Python where to find
the Treelite package. This is useful for developers, as any changes made
to C++ code will be immediately visible to Python side without re-running ``setup.py``.

.. code-block:: bash

export PYTHONPATH=/path/to/treelite/python:/path/to/treelite/runtime/python
python3 # enter interactive session

pip install . # will re-use libtreelite.so