This project searches for the optimal fee mechanism for an exchange. RL agents act as traders under a given fee policy, and we observe how their behavior changes as the fee mechanism changes. The fee mechanism affects both the total trade volume and the total fee collected. This project is maintained as part of the Binance Fellowship.
The project overview video : https://youtu.be/kBjv4KmkEHU
Our project environment is based on https://github.com/Yvictor/TradingGym/
Project Explanation : https://medium.com/decon-simulation/dynamic-fee-mechanism-simulation-with-reinforcement-learning-97c847aa5c
Project Explanation[KR]: https://medium.com/@jeffrey_7616/dynamic-fee-mechanism-simulation-with-reinforcement-learning-6d15951dec05
- agent: Stores the trading agents and specifies how to train and use them.
- data: Stores the historical data used to train the agents.
- env: Stores the environments in which the different fee mechanisms are applied.
- Train RL agents using Trading Gym.
- Transfer the agents to different environments, each applying a different fee mechanism. The agents are trained for 500 more episodes to adapt to each environment. Agents are also differentiated by varying the risk_aversion ratio, so that some agents prefer risk while others avoid it.
- Observe how the agents behave in each environment, paying particular attention to the total_volume and total_fee of each environment. From these observations, derive insights into which characteristics of a fee mechanism make the difference.
Provide an environment where limit orders are available, so that lagged order matching becomes possible and the trading environment is more realistic.
- No fee
- fee = 0.003 (0.3%)
- fee = 0.005 (0.5%)
- Bollinger band bound Environment
- RSI bound Environment
- MACD bound Environment
- Stochastic slow bound Environment
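As a rough illustration of how the fixed-fee environments above differ, the sketch below applies a proportional fee to a list of trades and reports the two quantities the project tracks, total_volume and total_fee. The function name `summarize_trades`, the `(price, quantity)` trade layout, the dictionary labels, and the assumption that the fee is charged on the notional value of each trade are illustrative and not taken from the repository. The indicator-bound environments are not modeled here.

```python
from typing import Iterable, Tuple

# Fixed fee rates listed above; the dictionary keys are hypothetical labels.
FEE_RATES = {"no_fee": 0.0, "fee_0.003": 0.003, "fee_0.005": 0.005}


def summarize_trades(
    trades: Iterable[Tuple[float, float]], fee_rate: float
) -> Tuple[float, float]:
    """Return (total_volume, total_fee) for a sequence of (price, quantity) trades.

    Assumption: the fee is proportional to the notional value
    (price * |quantity|) of each trade.
    """
    total_volume = 0.0
    total_fee = 0.0
    for price, quantity in trades:
        notional = price * abs(quantity)  # buys and sells both count as volume
        total_volume += notional
        total_fee += notional * fee_rate
    return total_volume, total_fee


if __name__ == "__main__":
    # Made-up trades: buy 2 units at 100, then sell 1 unit at 105.
    sample_trades = [(100.0, 2.0), (105.0, -1.0)]
    for name, rate in FEE_RATES.items():
        volume, fee = summarize_trades(sample_trades, rate)
        print(f"{name}: total_volume={volume:.2f}, total_fee={fee:.4f}")
```

This only shows the bookkeeping. In the experiments the trades themselves change with the fee mechanism, because the agents adapt their behavior to each environment.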
https://arxiv.org/abs/1707.06347
https://arxiv.org/abs/1710.02298
http://nlp.seas.harvard.edu/2018/04/03/attention.html
pip install -r requirements.txt
- Train original agent
cd agent/PPO
python ppo_start.py
cd agent/Attention
python attention_start.py
cd agent/DQN
python dqn_start.py
To train multiple agents:
cd agent/DQN
bash run.sh
- Transfer Learning
cd agent/DQN
python transfer_learning.py --environment=[environment]
To run transfer learning for multiple agents:
cd agent/DQN
bash transfer.sh
- Observation
cd agent/DQN
Open the Observation notebook and run all cells.
Using integrated_gradient, we can interpret how the agents read the data. The X axis represents the actions and the Y axis represents the features of the data. The graph shows how each feature affects the trading agent's action decision. The weight distribution over features differs depending on the training algorithm.
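For readers unfamiliar with the technique, the following is a minimal NumPy sketch of integrated gradients in general, not the repository's `integrated_gradient` code. It assumes a hypothetical `grad_fn` that returns the gradient of one action's score with respect to the input features, uses an all-zeros baseline by default, and approximates the path integral with a midpoint Riemann sum. The toy quadratic score function stands in for a trained agent.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline=None, steps=50):
    """Approximate integrated-gradients attributions for input x.

    grad_fn  : callable returning d(score)/d(input) at a given input
               (hypothetical; a real agent would use autodiff here).
    baseline : reference input; defaults to all zeros (an assumption).
    steps    : number of points in the midpoint Riemann sum.
    """
    x = np.asarray(x, dtype=float)
    if baseline is None:
        baseline = np.zeros_like(x)
    baseline = np.asarray(baseline, dtype=float)

    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Gradients evaluated along the straight path from baseline to x.
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    avg_grad = grads.mean(axis=0)
    # Attribution per feature: input difference times average gradient.
    return (x - baseline) * avg_grad


if __name__ == "__main__":
    # Toy score f(x) = sum(w * x**2) with analytic gradient 2 * w * x.
    w = np.array([0.5, -1.0, 2.0])
    score = lambda v: float(np.sum(w * v ** 2))
    grad_score = lambda v: 2.0 * w * v

    features = np.array([1.0, 2.0, 3.0])
    attributions = integrated_gradients(grad_score, features)
    print("attributions:", attributions)
    # Completeness check: attributions sum to score(x) - score(baseline).
    print("sum:", attributions.sum(), "score difference:", score(features) - score(np.zeros(3)))
```

Computing one attribution vector per action and stacking the results gives a feature-by-action matrix of the kind described above.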
The figure above shows how the trading volume of Agent1 varies with the fee environment. The same agent, facing the same OHLCV data, makes different decisions under different fee mechanisms.