Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs for qlib.rl #1322

Merged
merged 45 commits into from
Nov 10, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
ddbbcc6
Add docs for qlib.rl
lwwang1995 Oct 19, 2022
9dd860c
Update docs for qlib.rl
lwwang1995 Oct 19, 2022
c7b68b1
Add homepage introduct to RL framework
you-n-g Oct 20, 2022
d13262f
Update index Link
you-n-g Oct 20, 2022
8cacc96
Fix Icon
you-n-g Oct 20, 2022
c3577e1
typo
you-n-g Oct 20, 2022
b139e7d
Merge remote-tracking branch 'origin/main' into HEAD
you-n-g Oct 20, 2022
a0d3621
Update catelog
you-n-g Oct 20, 2022
1a62af9
Update docs for qlib.rl
lwwang1995 Oct 20, 2022
c9b2198
Update docs for qlib.rl
lwwang1995 Oct 21, 2022
5a75c9d
Update figure
lwwang1995 Oct 21, 2022
77fbb16
Update docs for qlib.rl
lwwang1995 Oct 21, 2022
5d2f21f
Update setup.py
you-n-g Oct 22, 2022
160f951
Merge remote-tracking branch 'origin/main' into HEAD
you-n-g Oct 22, 2022
4b705a9
FIx setup.py
you-n-g Oct 22, 2022
145414b
Update docs and fix some typos
lwwang1995 Oct 24, 2022
f215418
Fix the reference to RL docs
lwwang1995 Oct 24, 2022
b688db7
Update framework.svg
you-n-g Oct 24, 2022
5035af1
Update framework.svg
you-n-g Oct 24, 2022
3b182e1
Update framework.svg
you-n-g Oct 24, 2022
834c4f4
Update docs for qlibrl.
lwwang1995 Oct 27, 2022
0b17397
Update docs for qlibrl.
lwwang1995 Oct 27, 2022
7bfc937
Update docs for Qlibrl.
lwwang1995 Oct 27, 2022
21b765d
Update docs for qlibrl.
lwwang1995 Oct 27, 2022
1703492
Update docs for qlibrl.
lwwang1995 Oct 27, 2022
4d73676
Update docs for qlibrl.
lwwang1995 Oct 27, 2022
f7713e2
Add new framework
you-n-g Oct 27, 2022
a59f844
Update jpg
you-n-g Oct 27, 2022
47667a7
Update framework.svg
you-n-g Oct 28, 2022
129c1a8
Update framework.svg
you-n-g Oct 28, 2022
db543fc
Update Qlib framework and description
you-n-g Oct 28, 2022
34e2bc4
Update grammar
you-n-g Oct 28, 2022
8d7df20
Update README.md
you-n-g Oct 28, 2022
b3eec1c
Update README.md
you-n-g Oct 28, 2022
946177d
Update docs/component/rl.rst
lwwang1995 Oct 28, 2022
04a9b8f
Update docs/component/rl.rst
lwwang1995 Oct 28, 2022
e248066
Update docs for qlib.rl
lwwang1995 Nov 2, 2022
6020b86
Change theme for docs.
lwwang1995 Nov 3, 2022
5db5ea9
Update docs for qlib.rl
lwwang1995 Nov 4, 2022
7b84f49
Update docs for qlib.rl
lwwang1995 Nov 7, 2022
c47a460
Update docs for qlib.rl
lwwang1995 Nov 7, 2022
cf3642d
Update docs for qlib.rl.
lwwang1995 Nov 7, 2022
d723685
Update docs for qlib.rl
lwwang1995 Nov 8, 2022
6484cfa
Update docs for qlib.rl
lwwang1995 Nov 8, 2022
0db199c
Update docs for qlib.rl
lwwang1995 Nov 8, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 20 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
Recent released features
| Feature | Status |
| -- | ------ |
| RL Learning Framework | :hammer: :chart_with_upwards_trend: Released on Oct 20, 2022. [#1322](https://github.com/microsoft/qlib/pull/1322), [#1316](https://github.com/microsoft/qlib/pull/1316),[#1299](https://github.com/microsoft/qlib/pull/1299),[#1263](https://github.com/microsoft/qlib/pull/1263), [#1244](https://github.com/microsoft/qlib/pull/1244), [#1169](https://github.com/microsoft/qlib/pull/1169), [#1125](https://github.com/microsoft/qlib/pull/1125), [#1076](https://github.com/microsoft/qlib/pull/1076)|
| HIST and IGMTF models | :chart_with_upwards_trend: [Released](https://github.com/microsoft/qlib/pull/1040) on Apr 10, 2022 |
| Qlib [notebook tutorial](https://github.com/microsoft/qlib/tree/main/examples/tutorial) | 📖 [Released](https://github.com/microsoft/qlib/pull/1037) on Apr 7, 2022 |
| Ibovespa index data | :rice: [Released](https://github.com/microsoft/qlib/pull/990) on Apr 6, 2022 |
Expand Down Expand Up @@ -67,6 +68,7 @@ For more details, please refer to our paper ["Qlib: An AI-oriented Quantitative
<li type="circle"><a href="#auto-quant-research-workflow">Auto Quant Research Workflow</a></li>
<li type="circle"><a href="#building-customized-quant-research-workflow-by-code">Building Customized Quant Research Workflow by Code</a></li></ul>
<li><a href="#quant-dataset-zoo"><strong>Quant Dataset Zoo</strong></a></li>
<li><a href="#learning-framework">Learning Framework</a></li>
<li><a href="#more-about-qlib">More About Qlib</a></li>
<li><a href="#offline-mode-and-online-mode">Offline Mode and Online Mode</a>
<ul>
Expand Down Expand Up @@ -105,21 +107,16 @@ Your feedbacks about the features are very important.
# Framework of Qlib

<div style="align: center">
<img src="docs/_static/img/framework.svg" />
<img src="docs/_static/img/framework-abstract.jpg" />
</div>

At the module level, Qlib is a platform that consists of the above components. The components are designed as loose-coupled modules, and each component could be used stand-alone.
The high-level framework of Qlib can be found above(users can find the [detailed framework](https://qlib.readthedocs.io/en/latest/introduction/introduction.html#framework) of Qlib's design when getting into nitty gritty).
The components are designed as loose-coupled modules, and each component could be used stand-alone.

| Name | Description |
| ------ | ----- |
| `Infrastructure` layer | `Infrastructure` layer provides underlying support for Quant research. `DataServer` provides a high-performance infrastructure for users to manage and retrieve raw data. `Trainer` provides a flexible interface to control the training process of models, which enable algorithms to control the training process. |
| `Workflow` layer | `Workflow` layer covers the whole workflow of quantitative investment. `Information Extractor` extracts data for models. `Forecast Model` focuses on producing all kinds of forecast signals (e.g. _alpha_, risk) for other modules. With these signals `Decision Generator` will generate the target trading decisions(i.e. portfolio, orders) to be executed by `Execution Env` (i.e. the trading market). There may be multiple levels of `Trading Agent` and `Execution Env` (e.g. an _order executor trading agent and intraday order execution environment_ could behave like an interday trading environment and nested in _daily portfolio management trading agent and interday trading environment_ ) |
| `Interface` layer | `Interface` layer tries to present a user-friendly interface for the underlying system. `Analyser` module will provide users detailed analysis reports of forecasting signals, portfolios and execution results |

* The modules with hand-drawn style are under development and will be released in the future.
* The modules with dashed borders are highly user-customizable and extendible.

(p.s. framework image is created with https://draw.io/)
Qlib provides a strong infrastructure to support Quant research. [Data](https://qlib.readthedocs.io/en/latest/component/data.html) is always an important part.
A strong learning framework is designed to support diverse learning paradigms (e.g. [reinforcement learning](https://qlib.readthedocs.io/en/latest/component/rl.html), [supervised learning](https://qlib.readthedocs.io/en/latest/component/workflow.html#model-section)) and patterns at different levels(e.g. [market dynamic modeling](https://qlib.readthedocs.io/en/latest/component/meta.html)).
By modeling the market, [trading strategies](https://qlib.readthedocs.io/en/latest/component/strategy.html) will generate trade decisions that will be executed. Multiple trading strategies and executors in different levels or granularities can be [nested to be optimized and run together](https://qlib.readthedocs.io/en/latest/component/highfreq.html).
At last, a comprehensive [analysis](https://qlib.readthedocs.io/en/latest/component/report.html) will be provided and the model can be [served online](https://qlib.readthedocs.io/en/latest/component/online.html) in a low cost.


# Quick Start
Expand Down Expand Up @@ -404,6 +401,17 @@ Dataset plays a very important role in Quant. Here is a list of the datasets bui
[Here](https://qlib.readthedocs.io/en/latest/advanced/alpha.html) is a tutorial to build dataset with `Qlib`.
Your PR to build new Quant dataset is highly welcomed.


# Learning Framework
Qlib is high customizable and a lot of its components are learnable.
The learnable components are instances of `Forecast Model` and `Trading Agent`. They are learned based on the `Learning Framework` layer and then applied to multiple scenarios in `Workflow` layer.
The learning framework leverages the `Workflow` layer as well(e.g. sharing `Information Extractor`, creating environments based on `Execution Env`).

Based on learning paradigms, they can be categorized into reinforcement learning and supervised learning.
- For supervised learning, the detailed docs can be found [here](https://qlib.readthedocs.io/en/latest/component/model.html).
- For reinforcement learning, the detailed docs can be found [here](https://qlib.readthedocs.io/en/latest/component/rl.html). Qlib's RL learning framework leverages `Execution Env` in `Workflow` layer to create environments. It's worth noting that `NestedExecutor` is supported as well. This empowers users to optimize different level of strategies/models/agents together (e.g. optimizing an order execution strategy for a specific portfolio management strategy).


# More About Qlib
If you want to have a quick glance at the most frequently used components of qlib, you can try notebooks [here](examples/tutorial/).

Expand Down
Binary file added docs/_static/img/QlibRL_framework.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/_static/img/RL_framework.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/_static/img/framework-abstract.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion docs/_static/img/framework.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
20 changes: 11 additions & 9 deletions docs/component/highfreq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,31 +8,33 @@ Design of Nested Decision Execution Framework for High-Frequency Trading
Introduction
============

Daily trading (e.g. portfolio management) and intraday trading (e.g. orders execution) are two hot topics in Quant investment and usually studied separately.
Daily trading (e.g. portfolio management) and intraday trading (e.g. orders execution) are two hot topics in Quant investment and are usually studied separately.

To get the join trading performance of daily and intraday trading, they must interact with each other and run backtest jointly.
In order to support the joint backtest strategies in multiple levels, a corresponding framework is required. None of the publicly available high-frequency trading frameworks considers multi-level joint trading, which make the backtesting aforementioned inaccurate.
In order to support the joint backtest strategies at multiple levels, a corresponding framework is required. None of the publicly available high-frequency trading frameworks considers multi-level joint trading, which makes the backtesting aforementioned inaccurate.

Besides backtesting, the optimization of strategies from different levels is not standalone and can be affected by each other.
For example, the best portfolio management strategy may change with the performance of order executions(e.g. a portfolio with higher turnover may becomes a better choice when we improve the order execution strategies).
To achieve the overall good performance , it is necessary to consider the interaction of strategies in different level.
For example, the best portfolio management strategy may change with the performance of order executions(e.g. a portfolio with higher turnover may become a better choice when we improve the order execution strategies).
To achieve overall good performance, it is necessary to consider the interaction of strategies at a different levels.

Therefore, building a new framework for trading in multiple levels becomes necessary to solve the various problems mentioned above, for which we designed a nested decision execution framework that consider the interaction of strategies.
Therefore, building a new framework for trading on multiple levels becomes necessary to solve the various problems mentioned above, for which we designed a nested decision execution framework that considers the interaction of strategies.

.. image:: ../_static/img/framework.svg

The design of the framework is shown in the yellow part in the middle of the figure above. Each level consists of ``Trading Agent`` and ``Execution Env``. ``Trading Agent`` has its own data processing module (``Information Extractor``), forecasting module (``Forecast Model``) and decision generator (``Decision Generator``). The trading algorithm generates the decisions by the ``Decision Generator`` based on the forecast signals output by the ``Forecast Module``, and the decisions generated by the trading algorithm are passed to the ``Execution Env``, which returns the execution results.

The frequency of trading algorithm, decision content and execution environment can be customized by users (e.g. intraday trading, daily-frequency trading, weekly-frequency trading), and the execution environment can be nested with finer-grained trading algorithm and execution environment inside (i.e. sub-workflow in the figure, e.g. daily-frequency orders can be turned into finer-grained decisions by splitting orders within the day). The flexibility of nested decision execution framework makes it easy for users to explore the effects of combining different levels of trading strategies and break down the optimization barriers between different levels of trading algorithm.
The frequency of the trading algorithm, decision content and execution environment can be customized by users (e.g. intraday trading, daily-frequency trading, weekly-frequency trading), and the execution environment can be nested with finer-grained trading algorithm and execution environment inside (i.e. sub-workflow in the figure, e.g. daily-frequency orders can be turned into finer-grained decisions by splitting orders within the day). The flexibility of the nested decision execution framework makes it easy for users to explore the effects of combining different levels of trading strategies and break down the optimization barriers between different levels of the trading algorithm.

The optimization for the nested decision execution framework can be implemented with the support of `QlibRL <https://qlib.readthedocs.io/en/latest/component/rl.html>`_. To know more about how to use the QlibRL, go to API Reference: `RL API <../reference/api.html#rl>`_.

Example
=======

An example of nested decision execution framework for high-frequency can be found `here <https://github.com/microsoft/qlib/blob/main/examples/nested_decision_execution/workflow.py>`_.
An example of a nested decision execution framework for high-frequency can be found `here <https://github.com/microsoft/qlib/blob/main/examples/nested_decision_execution/workflow.py>`_.


Besides, the above examples, here are some other related work about high-frequency trading in Qlib.
Besides, the above examples, here are some other related works about high-frequency trading in Qlib.

- `Prediction with high-frequency data <https://github.com/microsoft/qlib/tree/main/examples/highfreq#benchmarks-performance-predicting-the-price-trend-in-high-frequency-data>`_
- `Examples <https://github.com/microsoft/qlib/blob/main/examples/orderbook_data/>`_ to extract features form high-frequency data without fixed frequency.
- `Examples <https://github.com/microsoft/qlib/blob/main/examples/orderbook_data/>`_ to extract features from high-frequency data without fixed frequency.
- `A paper <https://github.com/microsoft/qlib/tree/high-freq-execution#high-frequency-execution>`_ for high-frequency trading.
45 changes: 45 additions & 0 deletions docs/component/rl/framework.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
The Framework of QlibRL
=======================

QlibRL contains a full set of components that cover the entire lifecycle of an RL pipeline, including building the simulator of the market, shaping states & actions, training policies (strategies), and backtesting strategies in the simulated environment.

QlibRL is basically implemented with the support of Tianshou and Gym frameworks. The high-level structure of QlibRL is demonstrated below:

.. image:: ../../_static/img/QlibRL_framework.png
:width: 600
:align: center

Here, we briefly introduce each component in the figure.

EnvWrapper
------------
EnvWrapper is the complete capsulation of the simulated environment. It receives actions from outside (policy/strategy/agent), simulates the changes in the market, and then replies rewards and updated states, thus forming an interaction loop.

In QlibRL, EnvWrapper is a subclass of gym.Env, so it implements all necessary interfaces of gym.Env. Any classes or pipelines that accept gym.Env should also accept EnvWrapper. Developers do not need to implement their own EnvWrapper to build their own environment. Instead, they only need to implement 4 components of the EnvWrapper:

- `Simulator`
The simulator is the core component responsible for the environment simulation. Developers could implement all the logic that is directly related to the environment simulation in the Simulator in any way they like. In QlibRL, there are already two implementations of Simulator for single asset trading: 1) ``SingleAssetOrderExecution``, which is built based on Qlib's backtest toolkits and hence considers a lot of practical trading details but is slow. 2) ``SimpleSingleAssetOrderExecution``, which is built based on a simplified trading simulator, which ignores a lot of details (e.g. trading limitations, rounding) but is quite fast.
- `State interpreter`
The state interpreter is responsible for "interpret" states in the original format (format provided by the simulator) into states in a format that the policy could understand. For example, transform unstructured raw features into numerical tensors.
- `Action interpreter`
The action interpreter is similar to the state interpreter. But instead of states, it interprets actions generated by the policy, from the format provided by the policy to the format that is acceptable to the simulator.
- `Reward function`
The reward function returns a numerical reward to the policy after each time the policy takes an action.

EnvWrapper will organically organize these components. Such decomposition allows for better flexibility in development. For example, if the developers want to train multiple types of policies in the same environment, they only need to design one simulator and design different state interpreters/action interpreters/reward functions for different types of policies.

QlibRL has well-defined base classes for all these 4 components. All the developers need to do is define their own components by inheriting the base classes and then implementing all interfaces required by the base classes. The API for the above base components can be found `here <../../reference/api.html#module-qlib.rl>`_.

Policy
------------
QlibRL directly uses Tianshou's policy. Developers could use policies provided by Tianshou off the shelf, or implement their own policies by inheriting Tianshou's policies.

Training Vessel & Trainer
-------------------------
As stated by their names, training vessels and trainers are helper classes used in training. A training vessel is a ship that contains a simulator/interpreters/reward function/policy, and it controls algorithm-related parts of training. Correspondingly, the trainer is responsible for controlling the runtime parts of training.

As you may have noticed, a training vessel itself holds all the required components to build an EnvWrapper rather than holding an instance of EnvWrapper directly. This allows the training vessel to create duplicates of EnvWrapper dynamically when necessary (for example, under parallel training).

With a training vessel, the trainer could finally launch the training pipeline by simple, Scikit-learn-like interfaces (i.e., ``trainer.fit()``).

The API for Trainer and TrainingVessel and can be found `here <../../reference/api.html#module-qlib.rl.trainer>`_.
Loading