hammer-wang/RL_literature

RL Literature

Survey

  • Yu, Yang. "Towards Sample Efficient Reinforcement Learning." IJCAI. 2018. [pdf]
  • Meta-Learning in Neural Networks: A Survey, arxiv, 2020. [pdf]
  • Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, arxiv, 2020. [pdf]

Books

  • Reinforcement Learning: Theory and Algorithms, 2020. [pdf]

Popular Base Learner

  • TRPO
  • PPO
  • DDPG
  • DQN

Model-based RL (including model-based optimization)

  • A Game Theoretic Framework for Model Based Reinforcement Learning, arxiv, 2020. [pdf]
  • Benchmarking Model-Based Reinforcement Learning
  • Calibrated model-based deep reinforcement learning
  • Clavera, Ignasi, et al. "Model-based reinforcement learning via meta-policy optimization." arXiv preprint arXiv:1809.05214 (2018).
  • Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization. AAAI, 2020. [pdf]
  • End-to-end differentiable physics for learning and control
  • Exploring model-based planning with policy networks
  • Intelligent Trainer for Dyna-Style Model-Based Deep Reinforcement Learning
  • Deep reinforcement learning in a handful of trials using probabilistic dynamics models. arXiv preprint. arXiv:1805.12114, 2018.
  • Learning latent dynamics for planning from pixels
  • Meta-Model-Based Meta-Policy Optimization
  • Model Based Reinforcement Learning for Atari
  • Model-based active exploration
  • Model-based Adversarial Meta-Reinforcement Learning, arxiv, 2020. [pdf]
  • Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs, arxiv, 2020. [pdf]
  • Model-based Trust-Region Policy Optimization
  • MOReL: Model-Based Offline Reinforcement Learning, arxiv, 2020. [pdf]
  • Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
  • On Optimism in Model-Based Reinforcement Learning
  • On the Expressivity of Neural Networks for Deep Reinforcement Learning, ICML, 2020. [pdf]
  • Overcoming Model Bias for Robust Offline Deep Reinforcement Learning, arxiv, 2020. [pdf]
  • Policy optimization with model-based explorations, AAAI, 2019. [pdf]
  • Ready Policy One: World Building Through Active Learning
  • Recurrent World Models Facilitate Policy Evolution
  • Sample complexity of reinforcement learning using linearly combined model ensembles
  • Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
  • Scalable Bayesian Optimization Using Deep Neural Networks
  • Search on the replay buffer: Bridging planning and reinforcement learning
  • Self-Supervised Exploration via Disagreement
  • SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning, arxiv, 2020. [pdf]
  • When to trust your model: Model-based policy optimization

Exploration

  • If MaxEnt RL is the Answer, What is the Question?
  • Provably Efficient Q-Learning with Low Switching Cost

Meta RL

  • A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning, arxiv, 2019. [pdf]
  • Learning to adapt in dynamic, real-world environments through meta-reinforcement learning
  • RL2: Fast Reinforcement Learning via Slow Reinforcement Learning, ICLR, 2017. [pdf]

Causal RL

  • Provably Efficient Causal Reinforcement Learning with Confounded Observational Data, arxiv, 2020. [pdf]

Distributional RL

  • Bellemare, M. G., Dabney, W., and Munos, R. A distributional perspective on reinforcement learning. ICML, 2017. [pdf]
  • Dabney, W., Rowland, M., Bellemare, M. G., and Munos, R. Distributional reinforcement learning with quantile regression. AAAI, 2018. [pdf]

Offline RL

Researchers: Sergey Levine, Nando de Freitas, Nan Jiang, Emma Brunskill

  • Accelerating Online Reinforcement Learning with Offline Datasets, arxiv, 2020. [pdf]
  • Accelerating Reinforcement Learning with Learned Skill Priors, arxiv, 2020. [pdf]
  • Batch Exploration with Examples for Scalable Robotic Reinforcement Learning, arxiv, 2020. [pdf]
  • Behavior regularized offline reinforcement learning. arXiv preprint arXiv:1911.11361, 2019. [pdf]
  • Critic Regularized Regression, arxiv, 2020. [pdf]
  • D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020. [pdf]
  • Defining Admissible Rewards for High-Confidence Policy Evaluation in Batch Reinforcement Learning, ACM CHIL, 2020. [pdf]
  • Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization, arxiv, 2020. [pdf]
  • EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL, 2020. [pdf]
  • Hyperparameter selection for offline reinforcement learning, arxiv, 2020. [pdf]
  • Information-Theoretic Considerations in Batch Reinforcement Learning, ICML, 2019. [pdf]
  • Learning Deep Features in Instrumental Variable Regression, ICLR, 2021. [pdf]
  • MOPO: Model-based Offline Policy Optimization, arxiv, 2020. [pdf]
  • Model-Based Offline Planning, arxiv, 2020. [pdf]
  • Near Optimal Provable Uniform Convergence in Off-Policy Evaluation for Reinforcement Learning [pdf]
  • NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning, arxiv, 2021. [pdf]
  • Offline Meta Reinforcement Learning, arxiv, 2020. [pdf]
  • Offline Meta-Reinforcement Learning with Advantage Weighting, arxiv, 2020. [pdf]
  • Overcoming Model Bias for Robust Offline Deep Reinforcement Learning, arxiv, 2020. [pdf]
  • Overfitting and Optimization in Offline Policy Learning, arxiv, 2020. [pdf]
  • Q Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison, arxiv, 2020. [pdf]
  • RL Unplugged: Benchmarks for Offline Reinforcement Learning, NeurIPS, 2020. [pdf]
  • Semi-supervised reward learning for offline reinforcement learning, arxiv, 2020. [pdf]
  • Stabilizing off-policy Q-learning via bootstrapping error reduction (BEAR). In Advances in Neural Information Processing Systems, pages 11761–11771, 2019. [pdf]
  • The Importance of Pessimism in Fixed-Dataset Policy Optimization, arxiv, 2020. [pdf]
  • An Optimistic Perspective on Offline Reinforcement Learning, ICML, 2020. [pdf]
  • Conservative Q-Learning for Offline Reinforcement Learning, arxiv, 2020. [pdf]
  • MOReL: Model-Based Offline Reinforcement Learning, NeurIPS, 2020. [pdf]
  • Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, arxiv, 2020. [pdf]

Off-policy Evaluation

  • Minimax Weight and Q-Function Learning for Off-Policy Evaluation, arxiv, 2019. [pdf]
  • CoinDICE: Off-Policy Confidence Interval Estimation, arxiv, 2020. [pdf]

Multi-objective RL

  • An Optimal Policy for Patient Laboratory Tests in Intensive Care Units, arxiv, 2019. [pdf]

Policy Gradient

  • Phasic Policy Gradient, arxiv, 2020. [pdf]

Applications

  • Eastman, Peter, et al. "Solving the RNA design problem with reinforcement learning." PLoS computational biology 14.6 (2018): e1006176.
  • Adaptive Droplet Routing in Digital Microfluidic Biochips Using Deep Reinforcement Learning, ICML, 2020. [pdf]
  • Angermueller, Christof, et al. "Model-based reinforcement learning for biological sequence design." International Conference on Learning Representations. 2020. [pdf]
  • Automated Optical Multi-layer Design via Deep Reinforcement Learning, arxiv, 2020. [pdf]
  • Micro/Nano Motor Navigation and Localization via Deep Reinforcement Learning, Advanced Theory and Simulations, 2020. [pdf]
  • Mills, Kyle, Pooya Ronagh, and Isaac Tamblyn. "Finding the ground state of spin Hamiltonians with reinforcement learning." Nature Machine Intelligence (2020): 1-9.
  • Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning, arxiv, 2020. [pdf]
  • Optimal policy learning for COVID-19 prevention using reinforcement learning, Journal of Information Science, 2020. [pdf]
  • Learning to Drive in a Day, 2018. [pdf]
  • Data Valuation with Reinforcement Learning, ICML, 2020. [pdf]
  • Learning When-to-Treat Policies, arxiv, 2020. [pdf]

Navigation

  • SoundSpaces: Audio-Visual Navigation in 3D Environments, ECCV, 2020. [pdf]

Others (these are not RL papers but share some conceptual similarity)

Model-based Optimization

  • Model Inversion Networks for Model-Based Optimization
  • Incomplete Conditional Density Estimation for Fast Materials Discovery
  • Autofocused oracles for model-based design, arxiv, 2020. [pdf]

Multi-objective Optimization

  • Diversity-Guided Multi-Objective Bayesian Optimization With Batch Evaluations, NeurIPS, 2020. [pdf]
  • Predictive Entropy Search for Multi-objective Bayesian Optimization, ICML, 2020. [pdf]

Conferences

ICML 2021 Reading List

Offline RL

  • Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
  • Offline Contextual Bandits with Overparameterized Models
  • Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
  • Offline Reinforcement Learning with Fisher Divergence Critic Regularization
  • Offline Meta-Reinforcement Learning with Advantage Weighting
  • Multi-layered Network Exploration via Random Walks: From Offline Optimization to Online Learning
  • Offline Reinforcement Learning with Pseudometric Learning
  • Representation Matters: Offline Pretraining for Sequential Decision Making
  • Is Pessimism Provably Efficient for Offline RL?
  • OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
  • Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
  • EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
  • Conservative Objective Models for Effective Offline Model-Based Optimization
  • Instabilities of Offline RL with Pre-Trained Neural Representation

Meta Learning

  • A Distribution-dependent Analysis of Meta Learning
  • MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
  • Improving Generalization in Meta-learning via Task Augmentation
  • PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees
  • Data Augmentation for Meta-Learning
  • How Important is the Train-Validation Split in Meta-Learning?
  • Provable Meta-Learning of Linear Representations
  • Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
  • A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning
  • Meta-Learning Bidirectional Update Rules
  • Function Contrastive Learning of Transferable Meta-Representations
  • Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning
  • Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
  • Meta-Thompson Sampling
  • Memory Efficient Online Meta Learning
  • Meta-learning Hyperparameter Performance Prediction with Neural Processes

Model-based RL

  • Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning
  • Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
  • A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
  • Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
  • Continuous-time Model-based Reinforcement Learning
  • Model-Based Reinforcement Learning via Latent-Space Collocation
  • PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
  • Temporal Predictive Coding For Model-Based Planning In Latent Space

Frameworks

  1. Spinning Up [link]
  2. OpenAI Baselines [link]
  3. Stable Baselines [link]
  4. Ray RLlib [link]

Curated Paper List

  1. Awesome Meta Learning [link]

Blogs

Relevant Courses

  1. Stanford CS234 Reinforcement Learning: http://web.stanford.edu/class/cs234/index.html
  2. Stanford CS330 Deep Multi-task and meta-learning: https://cs330.stanford.edu/
  3. UIUC CS598 Reinforcement learning theory: https://nanjiang.cs.illinois.edu/cs598/
  4. Berkeley CS285 Deep Reinforcement Learning: http://rail.eecs.berkeley.edu/deeprlcourse
  5. NYU Deep Learning: https://atcold.github.io/pytorch-Deep-Learning/

About

Awesome reinforcement learning papers.
