Open vocabulary mobile manipulation #73

Merged
merged 48 commits into from
Aug 6, 2024

hello-atharva
Collaborator

Description

Adds a basic open-vocabulary mobile manipulation (OVMM) app. It uses GPT-3.5 to generate high-level plans and executes their steps sequentially. It uses YOLO for real-time segmentation.
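
As a rough illustration of the plan-then-execute flow described above, the sketch below parses a plan (one step per line, as an LLM planner might return it) and runs the steps in order, stopping at the first failure. The primitive names (`go_to`, `pick`, `place`), the plan format, and the `Step`, `parse_plan`, and `execute_plan` helpers are all assumptions made for illustration; they are not taken from the code in this PR, and the LLM call is replaced by a hard-coded string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One plan step: a primitive name and its single argument (hypothetical)."""
    primitive: str
    argument: str


def parse_plan(text: str) -> List[Step]:
    """Parse lines such as 'pick(cup)' into Step objects (assumed plan format)."""
    steps = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, rest = line.partition("(")
        steps.append(Step(name.strip(), rest.rstrip(")").strip()))
    return steps


def execute_plan(steps: List[Step],
                 primitives: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Run steps one after another; stop at the first unknown or failed step."""
    log = []
    for step in steps:
        handler = primitives.get(step.primitive)
        if handler is None:
            log.append(f"unknown primitive: {step.primitive}")
            break
        ok = handler(step.argument)
        log.append(f"{step.primitive}({step.argument}): {'ok' if ok else 'failed'}")
        if not ok:
            break
    return log


# Stand-in for the text an LLM planner might return (not real model output).
PLAN_TEXT = """
go_to(table)
pick(cup)
go_to(sink)
place(sink)
"""

# Stub primitives that always succeed; a real robot would act here.
PRIMITIVES = {
    "go_to": lambda target: True,
    "pick": lambda obj: True,
    "place": lambda location: True,
}

for entry in execute_plan(parse_plan(PLAN_TEXT), PRIMITIVES):
    print(entry)
```

Stopping at the first failed step is one possible design choice for sequential execution; the PR description does not say how the app handles failures.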

To test on your system:

python3 -m stretch.app.ovmm --robot_ip $ROBOT_IP

The robot will then begin exploring. Once exploration finishes, you can enter a long-horizon task as text input on the console.
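
The explore-then-prompt flow above could be structured roughly as follows. This is only a sketch: the `--robot_ip` flag comes from the command above, while `build_parser`, `run`, and the injected `explore`, `read_task`, and `handle_task` callables are hypothetical names, and the actual app may parse its arguments differently.

```python
import argparse
from typing import Callable, List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser exposing only the flag shown in the PR description."""
    parser = argparse.ArgumentParser(description="Hypothetical OVMM launcher sketch")
    parser.add_argument("--robot_ip", required=True, help="IP address of the robot")
    return parser


def run(argv: List[str],
        explore: Callable[[str], None],
        read_task: Callable[[], Optional[str]],
        handle_task: Callable[[str], None]) -> int:
    """Explore first, then handle console tasks until an empty line is entered."""
    args = build_parser().parse_args(argv)
    explore(args.robot_ip)  # exploration runs to completion before any prompt
    handled = 0
    while True:
        task = read_task()
        if task is None or not task.strip():
            break
        handle_task(task.strip())
        handled += 1
    return handled
```

In an interactive session, `read_task` could be something like `lambda: input("Task: ")`; passing the callables in keeps the flow testable without a robot.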

Checklist

  • I have performed a self-review of my code
  • If it is a core feature, I have added thorough tests
  • I have added documentation for the changes
  • I have updated the README file if necessary
  • I have run on hardware if necessary

import os

import yaml
from google.cloud import texttospeech
Collaborator

Has this dependency been added to setup?
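
To make the concern concrete: if `google.cloud.texttospeech` is not declared as an install dependency, the import above fails in a fresh environment. A minimal sketch of a guard follows, assuming the text-to-speech feature is optional (an assumption; the PR does not say so). `module_available` and `HAS_TTS` are hypothetical names, not part of this PR.

```python
import importlib


def module_available(name: str) -> bool:
    """Return True if the named module can be imported in this environment."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Hypothetical flag the app could check before enabling speech output.
HAS_TTS = module_available("google.cloud.texttospeech")
print("text-to-speech available:", HAS_TTS)
```

If the feature is required rather than optional, declaring the package in the project's install requirements is the more direct fix.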

Collaborator

hello-cpaxton left a comment

LGTM

hello-cpaxton merged commit 72c6994 into main Aug 6, 2024
1 check passed
hello-cpaxton deleted the hello-atharva/ovmm branch August 6, 2024 22:25
peiqi-liu pushed a commit to peiqi-liu/stretch_ai that referenced this pull request Sep 25, 2024
* Create LLM code, prompts, and file structure

* update login

* updating llms and adding types

* add gemma and llama assistants

* adding simple scripts

* updates to prompt setup

* updates to llm - testing gemma

* updates to base

* gemma client

* updates

* outputs

* LLM testing configs

* testing with Gemma

* updates

* add llama client test code

* updates

* updates to setup

* update llama setup

* updates to model config

* llama works

* some llama fixes

* update

* update things

* updates

* updates

* text cleanup for llama

* no overrides

* update

* updates

* updates

* update voice chat app with command line options

* update voice chat

* Add prompt class for navigation and object manipulation

* Add primitives

* Increase tokens

* Add command generation

* Improve OVMM prompt

* Fix prompt

* Add YOLO for perception

* Add ovmm capabilities

* Restore query app

---------

Co-authored-by: Chris Paxton <[email protected]>
2 participants