A Survey of Foundation Models for Robot Cognition and Control
- Instruct2Act: "Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model", arXiv, May 2023. [Paper] [Code]
- TidyBot: "TidyBot: Personalized Robot Assistance with Large Language Models", arXiv, May 2023. [Paper] [Website]
- PaLM-E: "PaLM-E: An Embodied Multimodal Language Model", arXiv, Mar 2023. [Paper] [Website]
- Grounded Decoding: "Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control", arXiv, Mar 2023. [Paper] [Website]
- RT-1: "RT-1: Robotics Transformer for Real-World Control at Scale", arXiv, Dec 2022. [Paper] [Website]
- Housekeep: "Housekeep: Tidying Virtual Households using Commonsense Reasoning", arXiv, May 2022. [Paper] [Code] [Website]
- Code-As-Policies: "Code as Policies: Language Model Programs for Embodied Control", arXiv, Sep 2022. [Paper] [Code] [Website]
- Say-Can: "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances", arXiv, Apr 2022. [Paper] [Code] [Website]
- Socratic: "Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language", arXiv, Apr 2022. [Paper] [Code] [Website]
- "Large Language Models as Zero-Shot Human Models for Human-Robot Interaction", arXiv, Mar 2023. [Paper]
- VLaMP: "Pretrained Language Models as Visual Planners for Human Assistance", arXiv, Apr 2023. [Paper]
- COWP: "Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds", arXiv, May 2023. [Paper] [Website]
- ChatGPT Robot Collaboration: "Improved Trust in Human-Robot Collaboration with ChatGPT", arXiv, Apr 2023. [Paper]
- ProgPrompt: "Generating Situated Robot Task Plans using Large Language Models", arXiv, Sep 2022. [Paper] [Website]
- Inner Monologue: "Inner Monologue: Embodied Reasoning through Planning with Language Models", arXiv, Jul 2022. [Paper] [Website]
- VirtualHome: "VirtualHome: Simulating Household Activities via Programs", arXiv, Jun 2018. [Paper] [Code] [Website]
- VoxPoser: "VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models", arXiv, Jul 2023. [Paper] [Website]
- DIAL: "Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models", arXiv, Nov 2022. [Paper] [Website]
- R3M: "R3M: A Universal Visual Representation for Robot Manipulation", arXiv, Mar 2022. [Paper] [Code] [Website]
- VIMA: "VIMA: General Robot Manipulation with Multimodal Prompts", arXiv, Oct 2022. [Paper] [Code] [Website]
- Perceiver-Actor: "Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation", CoRL, Sep 2022. [Paper] [Code] [Website]
- MOO: "Open-World Object Manipulation using Pre-Trained Vision-Language Models", arXiv, Mar 2023. [Paper] [Website]
- CLIPort: "CLIPort: What and Where Pathways for Robotic Manipulation", CoRL, Sep 2021. [Paper] [Website]
- VLMaps: "Visual Language Maps for Robot Navigation", arXiv, Mar 2023. [Paper] [Code] [Website]
- GNM: "GNM: A General Navigation Model to Drive Any Robot", arXiv, Oct 2022. [Paper] [Code] [Website]
- CLIP-Nav: "CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation", arXiv, Nov 2022. [Paper]
- ADAPT: "ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts", CVPR, May 2022. [Paper]
- CoW: "CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration", arXiv, Mar 2022. [Paper] [Website]
- Recurrent VLN-BERT: "A Recurrent Vision-and-Language BERT for Navigation", CVPR, Jun 2021. [Paper] [Code]
- VLN-BERT: "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web", ECCV, Apr 2020. [Paper] [Code]
- LM-Nav: "LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action", arXiv, Jul 2022. [Paper] [Code] [Website]