Paper collection of LLM + tool use for robotics and embodied AI

Introduction

Papers

  1. CLIPort: What and Where Pathways for Robotic Manipulation. CoRL 2021

    Mohit Shridhar, Lucas Manuelli, Dieter Fox [pdf], 2021.9

  2. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents. ICML 2022

    Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch [pdf], 2022.1

  3. Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language. ICLR 2023

    Andy Zeng, Maria Attarian, Brian Ichter, Krzysztof Choromanski, Adrian Wong, Stefan Welker, Federico Tombari, Aveek Purohit, Michael Ryoo, Vikas Sindhwani, Johnny Lee, Vincent Vanhoucke, Pete Florence [pdf], 2022.4

  4. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances. CoRL 2022

    Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng [pdf], 2022.4

  5. Inner Monologue: Embodied Reasoning through Planning with Language Models. CoRL 2022

    Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter [pdf], 2022.7

  6. JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents. SoCal NLP 2022

    Kaizhi Zheng, Kaiwen Zhou, Jing Gu, Yue Fan, Jialu Wang, Zonglin Di, Xuehai He, Xin Eric Wang [pdf], 2022.8

  7. ProgPrompt: Generating Situated Robot Task Plans using Large Language Models. ICRA 2023

    Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg [pdf], 2022.9

  8. Code as Policies: Language Model Programs for Embodied Control. ICRA 2023

    Jacky Liang, Wenlong Huang, Fei Xia, Peng Xu, Karol Hausman, Brian Ichter, Pete Florence, Andy Zeng [pdf], 2022.9

  9. VIMA: General Robot Manipulation with Multimodal Prompts. ICML 2023

    Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan [pdf], 2022.10

  10. LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models. arXiv

    Chan Hee Song, Jiaman Wu, Clayton Washington, Brian M. Sadler, Wei-Lun Chao, Yu Su [pdf], 2022.12

  11. RT-1: Robotics Transformer for Real-World Control at Scale. arXiv

    Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich [pdf], 2022.12

  12. Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling. ICML 2023

    Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, Roy Fox [pdf], 2023.1

  13. Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents. arXiv

    Zihao Wang, Shaofei Cai, Anji Liu, Xiaojian Ma, Yitao Liang [pdf], 2023.2

  14. Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control. arXiv

    Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter [pdf], 2023.3

  15. PaLM-E: An Embodied Multimodal Language Model. ICML 2023

    Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence [pdf], 2023.3

  16. Text2Motion: From Natural Language Instructions to Feasible Plans. ICRA 2023 PT4R Workshop

    Kevin Lin, Christopher Agia, Toki Migimatsu, Marco Pavone, Jeannette Bohg [pdf], 2023.3

  17. Programmatically Grounded, Compositionally Generalizable Robotic Manipulation. ICLR 2023

    Renhao Wang, Jiayuan Mao, Joy Hsu, Hang Zhao, Jiajun Wu, Yang Gao [pdf], 2023.4

  18. TidyBot: Personalized Robot Assistance with Large Language Models. IROS 2023

    Jimmy Wu, Rika Antonova, Adam Kan, Marion Lepert, Andy Zeng, Shuran Song, Jeannette Bohg, Szymon Rusinkiewicz, Thomas Funkhouser [pdf], 2023.5

  19. EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought. arXiv

    Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo [pdf], 2023.5

  20. SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning. arXiv

    Yue Wu, Shrimai Prabhumoye, So Yeon Min, Yonatan Bisk, Ruslan Salakhutdinov, Amos Azaria, Tom Mitchell, Yuanzhi Li [pdf], 2023.5

  21. Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv

    Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, Anima Anandkumar [pdf], 2023.5

  22. Mindstorms in Natural Language-Based Societies of Mind. arXiv

    Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber [pdf], 2023.5

  23. Embodied Executable Policy Learning with Language-based Scene Summarization. arXiv

    Jielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao [pdf], 2023.6

  24. Generating Language Corrections for Teaching Physical Control Tasks. ICML 2023

    Megha Srivastava, Noah Goodman, Dorsa Sadigh [pdf], 2023.6

  25. SayTap: Language to Quadrupedal Locomotion. arXiv

    Yujin Tang, Wenhao Yu, Jie Tan, Heiga Zen, Aleksandra Faust, Tatsuya Harada [pdf], 2023.6

  26. Language to Rewards for Robotic Skill Synthesis. arXiv

    Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia [pdf], 2023.6

  27. REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction. arXiv

    Zeyi Liu, Arpit Bahety, Shuran Song [pdf], 2023.6

  28. ChatGPT for Robotics: Design Principles and Model Abilities. arXiv

    Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor [pdf], 2023.6

  29. Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind. arXiv

    Swarnadeep Saha, Peter Hase, Mohit Bansal [pdf], 2023.6

  30. RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation. arXiv

    Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Tom Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess [pdf], 2023.6

  31. Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners. arXiv

    Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar [pdf], 2023.7

  32. Building Cooperative Embodied Agents Modularly with Large Language Models. arXiv

    Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan [pdf], 2023.7

  33. VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models. arXiv

    Wenlong Huang, Chen Wang, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Li Fei-Fei [pdf], 2023.7

  34. Demonstrating Large Language Models on Robots. RSS 2023 Demo Track

    Google DeepMind [pdf], 2023.7

  35. GenSim: Generative Models for Supersizing Robotic Simulation Tasks. GitHub

    Lirui Wang [pdf], 2023.7

  36. Large Language Models as General Pattern Machines. arXiv

    Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng [pdf], 2023.7

  37. SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning. arXiv

    Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, Niko Suenderhauf [pdf], 2023.7

  38. RoCo: Dialectic Multi-Robot Collaboration with Large Language Models. arXiv

    Zhao Mandi, Shreeya Jain, Shuran Song [pdf], 2023.7

  39. RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. arXiv

    Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich [pdf], 2023.7

  40. Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition. arXiv

    Huy Ha, Pete Florence, Shuran Song [pdf], 2023.7

  41. Learning to Model the World with Language. arXiv

    Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan [pdf], 2023.8

  42. Physically Grounded Vision-Language Models for Robotic Manipulation. arXiv

    Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh [pdf], 2023.9

  43. Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning. arXiv

    Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu [pdf], 2023.9