This is the repo for SemReBot2, a ROS2 package that combines natural language processing, AI planning and robotics to enable semantic reasoning in robots. Specifically, it transforms natural language speech into executable robot actions, making it possible to tell a robot what to do in plain speech. SemReBot2 uses Hugging Face's transformers, OpenAI's Whisper for automatic speech recognition, Mistral AI's Mistral 7B Instruct in 4-bit precision as the language model, and PlanSys2 (https://github.com/PlanSys2/ros2_planning_system) for AI planning and execution of PDDL plans with behaviour trees.
Video demo: https://www.youtube.com/watch?v=13fVo1_BrCg
- Dedicated GPU with min. 9 GB memory
- ~30 GB storage
- CUDA version 11.8+
- Docker
- Python 3.8+
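Some of the requirements above can be checked before installing anything. The following is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of SemReBot2. It does not verify GPU memory or the CUDA version, and finding `nvidia-smi` on the PATH is only a hint that an NVIDIA driver is installed.

```python
import shutil
import sys
from pathlib import Path

MIN_PYTHON = (3, 8)   # Python 3.8+ from the requirements list
MIN_FREE_GB = 30      # ~30 GB storage from the requirements list


def check_prerequisites(path=None):
    """Return a dict mapping each check name to True or False.

    `path` is the directory whose free disk space is inspected;
    it defaults to the home directory (an assumption of this sketch).
    """
    target = Path(path) if path is not None else Path.home()
    free_gb = shutil.disk_usage(target).free / 1e9
    return {
        "python": sys.version_info >= MIN_PYTHON,
        "storage": free_gb >= MIN_FREE_GB,
        "docker": shutil.which("docker") is not None,
        # Presence of the tool only; GPU memory and CUDA version are not checked.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }
```

Calling `check_prerequisites()` and printing the result shows at a glance which items still need attention.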
- Install PyTorch for Python 3.8+
  CUDA 11.8:
      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  or CUDA 12.1:
      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
- Install transformers, bitsandbytes and huggingface_hub
      pip install transformers bitsandbytes huggingface_hub
- Clone the repo to your home directory
git clone [email protected]:stinkyElias/SemReBot2.git
- Create a user on Hugging Face: https://huggingface.co/.
- Generate an access token by following this tutorial: https://huggingface.co/docs/hub/en/security-tokens
- Agree to Mistral AI's terms to access Mistral 7B Instruct v0.2: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
- Authenticate to the hub (scroll down to "Authentication" or check the image below): https://huggingface.co/docs/huggingface_hub/quick-start
- Download Whisper large with flash attention and Mistral 7B Instruct in 4-bit precision by running the download_models.py script. Provide your HF token as the first argument. The default location is ~/semrebot2_models, but you can choose another by adding a path as a second command line argument. The models take up ~18 GB of storage.
  Default location:
      python3 download_models.py "hugging_face_access_token"
  User-specified location:
      python3 download_models.py "hugging_face_access_token" "relative/path/to/location"
- Pull the Docker image from Docker Hub
      docker pull stinkyelias/rolling:whisper
- To start the container, run the run.sh bash script in terminal 1
  Terminal 1:
      cd SemReBot2
      ./run.sh
- Build the ROS2 packages inside the container. Remember to source the workspace afterwards.
  Terminal 1:
      ./env.sh
      source install/setup.bash
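The model download step above takes a token plus an optional location. The sketch below shows one way such a command line could be handled. It is not the contents of download_models.py: the function names `parse_args` and `download_mistral` and the `mistral` subdirectory are hypothetical, and resolving a relative path against the current working directory is an assumption. `snapshot_download` fetches the unquantized model files, so in this sketch any 4-bit quantization would have to happen later, when the model is loaded with bitsandbytes.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse a required token and an optional download location."""
    parser = argparse.ArgumentParser(description="Download models (sketch)")
    parser.add_argument("token", help="Hugging Face access token")
    parser.add_argument(
        "location",
        nargs="?",
        default="~/semrebot2_models",  # default location named in the README
        help="directory to store the models in",
    )
    args = parser.parse_args(argv)
    # Assumption: relative paths are resolved against the working directory.
    args.location = Path(args.location).expanduser().resolve()
    return args


def download_mistral(token, location):
    """Hypothetical helper: fetch the gated Mistral repo with the token."""
    # Imported lazily so argument parsing works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="mistralai/Mistral-7B-Instruct-v0.2",
        local_dir=location / "mistral",  # subdirectory name is an assumption
        token=token,
    )
```

For example, `parse_args(["my_token"]).location` points at `semrebot2_models` in the home directory, while a second argument overrides it.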
This setup of SemReBot2 requires four terminal windows. After step 11 (building the ROS2 packages), start three new terminals. To attach them to the running container, first retrieve the container name.
Terminal 2:
docker container list
>>> some_container_name
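Retrieving the container name can also be scripted. This is a minimal sketch, not part of SemReBot2: the helper names are hypothetical, and it relies on Docker's `--format "{{.Names}}"` option to print one container name per line.

```python
import subprocess
from typing import List


def parse_container_names(output: str) -> List[str]:
    """Split the command output into container names, skipping blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def running_container_names() -> List[str]:
    """Ask Docker for the names of all running containers."""
    result = subprocess.run(
        ["docker", "container", "list", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the docker command fails
    )
    return parse_container_names(result.stdout)
```

If only the SemReBot2 container is running, `running_container_names()[0]` is the name to pass when attaching the other terminals.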
- Attach three terminals to the running container
Terminal 2:
./terminal some_container_name
Terminal 3:
./terminal some_container_name
Terminal 4:
./terminal some_container_name
If Terminal 1 has finished building the ROS2 packages, remember to source each terminal!
source install/setup.bash
- With all four terminals running, bring up SemReBot2, the task controller node and the Nav2 sim node
  Terminal 1:
      ros2 launch semrebot2_bringup bringup.launch.py
  Terminal 2:
      ros2 run semrebot2_task_controller task_controller_node
  Terminal 3:
      ros2 run semrebot2_task_controller nav2_sim_node
- To use one of the four audio samples, publish the number of the audio file you wish to test to the /speech topic. Audio file 3 contains logical inconsistencies that cannot be solved!
  Terminal 4:
      ros2 topic pub --once /speech std_msgs/msg/Int8 "{data: x}"
  where 1 <= x <= 4