Merge remote-tracking branch 'origin/main' into docs/hri
GilMM27 committed Nov 26, 2024
2 parents 387a16b + 159294c commit 07194f0
Showing 81 changed files with 681 additions and 88 deletions.
3 changes: 3 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,3 @@
{
"editor.formatOnSave": false,
}
4 changes: 4 additions & 0 deletions docs/.pages
@@ -2,4 +2,8 @@ nav:
- index.md
- Overview
- Areas
- Team.md
- "2024"
- "2023"
- "2022 - Jun 2023"
- ...
57 changes: 40 additions & 17 deletions docs/Areas/HRI.md
@@ -2,112 +2,133 @@

Human-Robot Interaction (HRI) refers to the study and design of interactions between humans and robots. It encompasses the development of technologies, interfaces, and systems that enable effective communication, collaboration, and cooperation between humans and robots. HRI aims to create intuitive and natural ways for humans to interact with robots, allowing for seamless integration of robots into various domains such as healthcare, manufacturing, entertainment, and personal assistance.

A general overview of the HRI Pipeline is shown as follows:
![Pipeline](/assets/home/HRI/HRI-Pipeline.jpeg)
## HRI Overview

<p align="center">
<img src= "/assets/home/HRI/Pipeline.jpeg" alt="Pipeline overview" width="80%" height="80%">
</p>

Currently, in RoBorregos, the development of HRI consists of two pipelines: speech and NLP processing. The speech pipeline has several modules to achieve robustness while remaining computationally efficient (i.e., resource-intensive models only infer on voiced audio). The first module is the Keyword Spotting node, implemented with Porcupine. After the keyword is detected, audio is recorded until voice is no longer detected, using Silero VAD. The captured audio is then transcribed with Whisper.

Finally, the NLP pipeline parses the interpreted text into commands using an LLM. As a strategy to deal with malformed LLM outputs, any command without an exact match is mapped to the closest valid action for the robot's context and capabilities, found via cosine similarity over embeddings of the actions (see the sketch below).
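
A minimal sketch of this fallback matching, assuming a sentence-transformer as the embedding backend and a hypothetical action list (the real actions come from the robot's context and capabilities):

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend

# Hypothetical action set; the real list is derived from the robot's capabilities.
VALID_ACTIONS = ["go to the kitchen", "grab the bottle", "introduce yourself"]

model = SentenceTransformer("all-MiniLM-L6-v2")

def closest_action(llm_output: str) -> str:
    """Map an LLM-produced command to the most similar valid action."""
    if llm_output in VALID_ACTIONS:  # exact match needs no embedding
        return llm_output
    vectors = model.encode([llm_output] + VALID_ACTIONS)
    query, actions = vectors[0], vectors[1:]
    sims = actions @ query / (np.linalg.norm(actions, axis=1) * np.linalg.norm(query))
    return VALID_ACTIONS[int(np.argmax(sims))]
```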

---

- [ROS Nodes Documentation](#ros-nodes-documentation)
- [Running the System](#comprehensive-instructions-for-running-the-system)



# ROS Nodes Documentation

This document describes each ROS node, its purpose and usage, and provides comprehensive instructions for running the system.

## ROS Nodes in `ws/src/frida_language_processing/scripts`

### command_interpreter_v2.py

- **Purpose**: This node interprets commands from the user.
- **Usage**:
- Subscribes to: `/speech/raw_command`
- Publishes to: `/task_manager/commands`
- Description: This node uses OpenAI's GPT-4 model to interpret commands received from the speech pipeline and sends the resulting actions to the Task Manager (see the sketch below).
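
A minimal sketch of the subscribe, interpret, publish flow, assuming `std_msgs/String` messages and the current OpenAI Python client (the node's real prompt and message types may differ):

```python
#!/usr/bin/env python3
import rospy
from std_msgs.msg import String
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
pub = rospy.Publisher("/task_manager/commands", String, queue_size=10)

def on_command(msg: String) -> None:
    # Ask the LLM to turn the transcribed utterance into robot actions.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Parse the user's request into robot commands."},
            {"role": "user", "content": msg.data},
        ],
    )
    pub.publish(String(data=response.choices[0].message.content))

if __name__ == "__main__":
    rospy.init_node("command_interpreter")
    rospy.Subscriber("/speech/raw_command", String, on_command)
    rospy.spin()
```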

### conversation.py

- **Purpose**: This node handles conversational interactions with the user.
- **Usage**:
- Subscribes to: `/speech/raw_command`
- Publishes to: `/speech/speak`
- Description: This node stores context from the environment, previous prompts, and user interactions to provide accurate, conversational responses, as sketched below.
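
The context-keeping pattern can be sketched as a message history that grows with every turn (reusing the `client` from the sketch above; the system prompt and names are illustrative):

```python
history = [{"role": "system", "content": "You are a helpful service robot."}]

def respond(user_text: str) -> str:
    """Append the user turn, query the LLM with the full history, remember the reply."""
    history.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model="gpt-4", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer  # the node would publish this to /speech/speak
```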

### guest_analyzer.py

- **Purpose**: This node analyzes guest information.
- **Usage**:
- Subscribes to: `/speech/raw_command`
- Publishes to: `/speech/speak`
- Description: This node extracts information from the guest for the Receptionist task of Stage 1.

### extract_data.py

- **Purpose**: This node extracts data for processing.
- **Usage**:
- Provides service: `/extract_data`
- Description: This node extracts structured data, including a summary, keywords, and important points, from text (a sketch follows).
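
A sketch of the service server, assuming a hypothetical `ExtractData.srv` (string `text` in; string `summary` and `keywords` out); the placeholder logic below stands in for the node's LLM calls:

```python
#!/usr/bin/env python3
import rospy
# Hypothetical service definition: string text -> string summary, string keywords
from frida_language_processing.srv import ExtractData, ExtractDataResponse

def handle_extract(req):
    # Placeholder logic; the real node delegates extraction to an LLM.
    sentences = req.text.split(".")
    summary = sentences[0].strip() + "." if sentences[0].strip() else ""
    keywords = ", ".join(sorted({w.lower() for w in req.text.split() if len(w) > 6}))
    return ExtractDataResponse(summary=summary, keywords=keywords)

if __name__ == "__main__":
    rospy.init_node("extract_data")
    rospy.Service("/extract_data", ExtractData, handle_extract)
    rospy.spin()
```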

### item_categorization.py

- **Purpose**: This node classifies items (e.g., text labels produced from images) and groups them by category.
- **Usage**:
- Provides service: `items_category`
- Description: This node returns the category of an item or list of items for the Storing Groceries task; a client-side usage sketch follows.
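
Client-side usage can be sketched with a service proxy (the srv type and field names here are assumptions):

```python
import rospy
from frida_language_processing.srv import ItemsCategory  # hypothetical srv type

rospy.wait_for_service("items_category")
categorize = rospy.ServiceProxy("items_category", ItemsCategory)
response = categorize(items="apple, milk, bread")
print(response.category)  # e.g. "food"
```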

## ROS Nodes in `ws/src/speech/scripts`

### AudioCapturer.py

- **Purpose**: This node captures audio from the microphone.
- **Usage**:
- Publishes to: `rawAudioChunk`
- Description: This node captures audio from the microphone and publishes it as raw audio chunks (see the sketch below).
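
A minimal capture loop, assuming PyAudio, 16 kHz mono 16-bit audio, and `audio_common_msgs/AudioData` as the chunk message (the real node's parameters may differ):

```python
#!/usr/bin/env python3
import pyaudio
import rospy
from audio_common_msgs.msg import AudioData  # assumed message type

CHUNK = 512    # frames per buffer (assumed)
RATE = 16000   # sample rate expected by the downstream models

rospy.init_node("audio_capturer")
pub = rospy.Publisher("rawAudioChunk", AudioData, queue_size=20)

stream = pyaudio.PyAudio().open(
    format=pyaudio.paInt16, channels=1, rate=RATE,
    input=True, frames_per_buffer=CHUNK,
)
while not rospy.is_shutdown():
    data = stream.read(CHUNK, exception_on_overflow=False)
    pub.publish(AudioData(data=data))
```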

### Say.py

- **Purpose**: This node handles text-to-speech functionality.
- **Usage**:
- Subscribes to: `/speech/speak_now`
- Provides service: `/speech/speak`
- Description: This node converts text to speech using either an online or offline TTS engine, as sketched below.
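
A sketch of the online/offline switch, assuming gTTS as the online engine and pyttsx3 offline (the node's actual engines and playback command may differ):

```python
import os
import pyttsx3
from gtts import gTTS

def speak(text: str, online: bool) -> None:
    if online:
        gTTS(text).save("/tmp/say.mp3")   # network TTS
        os.system("mpg123 /tmp/say.mp3")  # playback command is an assumption
    else:
        engine = pyttsx3.init()           # offline fallback
        engine.say(text)
        engine.runAndWait()
```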

### ReSpeaker.py

- **Purpose**: This node interfaces with the ReSpeaker hardware.
- **Usage**:
- Publishes to: `DOA`
- Subscribes to: `ReSpeaker/light`
- Description: This node interfaces with the ReSpeaker hardware to get the direction of arrival (DOA) of sound and control the LED ring.

### KWS.py

- **Purpose**: This node handles keyword spotting.
- **Usage**:
- Subscribes to: `rawAudioChunk`
- Publishes to: `keyword_detected`
- Description: This node uses Porcupine to detect keywords in the audio stream (see the sketch below).
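
A sketch of the Porcupine loop, assuming a custom wake-word `.ppn` file, a placeholder Picovoice access key, and incoming chunks sized to `porcupine.frame_length` (512 samples at 16 kHz):

```python
#!/usr/bin/env python3
import struct
import pvporcupine
import rospy
from audio_common_msgs.msg import AudioData  # assumed message type
from std_msgs.msg import Bool                # assumed type of the detection flag

# Access key and keyword file are placeholders; a custom word needs a trained .ppn.
porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_KEY",
    keyword_paths=["frida_wakeword.ppn"],
)

rospy.init_node("kws")
pub = rospy.Publisher("keyword_detected", Bool, queue_size=1)

def on_audio(msg: AudioData) -> None:
    # Porcupine expects frames of porcupine.frame_length 16-bit samples.
    pcm = struct.unpack_from("h" * porcupine.frame_length, bytes(msg.data))
    if porcupine.process(pcm) >= 0:  # returns the keyword index, -1 if none
        pub.publish(Bool(data=True))

rospy.Subscriber("rawAudioChunk", AudioData, on_audio)
rospy.spin()
```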

### UsefulAudio.py

- **Purpose**: This node processes useful audio segments.
- **Usage**:
- Subscribes to: `rawAudioChunk`, `saying`, `keyword_detected`
- Publishes to: `UsefulAudio`
- Description: This node processes audio segments to determine whether they contain speech and publishes the useful audio; the VAD check is sketched below.
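
The voice-activity check at the core of this node can be sketched with Silero VAD from `torch.hub` (the chunk size and threshold are assumptions):

```python
import numpy as np
import torch

# Silero VAD returns a speech probability per chunk (512 samples at 16 kHz).
vad_model, _ = torch.hub.load("snakers4/silero-vad", "silero_vad")

def is_voiced(chunk_int16: np.ndarray, threshold: float = 0.5) -> bool:
    """True if the chunk likely contains speech; the threshold is an assumption."""
    audio = torch.from_numpy(chunk_int16.astype(np.float32) / 32768.0)
    return vad_model(audio, 16000).item() > threshold
```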

### Hear.py

- **Purpose**: This node handles speech-to-text functionality.
- **Usage**:
- Subscribes to: `UsefulAudio`
- Publishes to: `UsefulAudioAzure`, `UsefulAudioWhisper`
- Description: This node converts speech to text, routing audio to either an online or offline STT engine (see the sketch below).
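
The routing decision can be sketched as a simple connectivity check; the check itself is an assumption, and `azure_pub`/`whisper_pub` stand for publishers on the two output topics:

```python
import socket

def network_available(host: str = "8.8.8.8", port: int = 53, timeout: float = 1.0) -> bool:
    """Crude online check: can we open a TCP connection to a public DNS server?"""
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False

def route(audio_msg) -> None:
    # Prefer the online engine (Azure) when the network is up; fall back to Whisper.
    if network_available():
        azure_pub.publish(audio_msg)
    else:
        whisper_pub.publish(audio_msg)
```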

### Whisper.py

- **Purpose**: This node processes audio using the Whisper model.
- **Usage**:
- Subscribes to: `UsefulAudioWhisper`
- Publishes to: `/speech/raw_command`
- Provides service: `/speech/service/raw_command`
- Description: This node uses the Whisper model to transcribe audio to text, as sketched below.
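
A sketch of the transcription step, assuming the `openai-whisper` package and audio already written to a WAV file (the model size is an assumption):

```python
import whisper  # pip install openai-whisper

model = whisper.load_model("base")  # loaded once at node startup

def transcribe(wav_path: str) -> str:
    # fp16=False avoids a warning on CPU-only machines.
    result = model.transcribe(wav_path, fp16=False)
    return result["text"].strip()
```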

---

# Comprehensive Instructions for Running the System

1. **Setup Docker**:

- Follow the instructions in `docker/README.md` to set up Docker and create the necessary containers.

2. **Build the Workspace**:

- Inside the Docker container, navigate to the workspace directory and build the workspace:
```bash
cd /workspace/ws
@@ -116,6 +137,7 @@ This document provides detailed documentation on the ROS nodes, their purpose, a
```

3. **Run the ROS Nodes**:

- To run the speech nodes:
```bash
roslaunch speech speech.launch
@@ -126,6 +148,7 @@ This document provides detailed documentation on the ROS nodes, their purpose, a
```

4. **Test the System**:

- To test the text-to-speech functionality:
```bash
rostopic pub /speech/speak_now std_msgs/String "Hi, my name is Frida!"
@@ -139,4 +162,4 @@ This document provides detailed documentation on the ROS nodes, their purpose, a
- Use `rostopic echo` to listen to the topics and verify the messages being published.
- Check the logs for any errors or warnings.

By following these instructions, you should be able to set up and run the ROS nodes for the Human-Robot Interaction system.
12 changes: 12 additions & 0 deletions docs/Areas/index.md
@@ -0,0 +1,12 @@
# Areas

The following sections describe the main software and hardware modules developed by the team to achieve the robot's functionalities.

### Sections

- [Navigation](/Areas/Navigation): Describes the software modules that allow the robot to navigate autonomously in an environment.
- [Manipulation](/Areas/Manipulation): Describes the software modules that allow the robot to manipulate objects.
- [Integration and Networks](/Areas/Integration%20and%20Networks): Describes the software modules that allow the robot to interact with other devices and networks.
- [HRI](/Areas/HRI): Describes the software modules that allow the robot to interact with humans.
- [Electronics and Control](/Areas/Electronics%20and%20Control): Describes the hardware modules that allow the robot to control its movements and interact with the environment.
- [Computer Vision](/Areas/Computer%20Vision): Describes the software modules that allow the robot to perceive the environment through cameras and other sensors.
17 changes: 14 additions & 3 deletions docs/Overview/Media.md
@@ -1,4 +1,15 @@
# Media [WIP]
# Media

(Newspaper, youtube videos, demos with date and description, robot + member pictures)
(TDPs, other submissions)
Throughout the years, the RoboCup@Home team at RoBorregos has had the opportunity to participate in various events and competitions. Here are some highlights of our journey.

### RoboCup@Home 2024 Qualification video

<iframe width="560" height="315" src="https://www.youtube.com/embed/g9W2A_bHTvg" title="RoboCup@Home Video" frameborder="0" allowfullscreen></iframe>

### Compilation of Demos, Competitions, and Events

<iframe width="560" height="315" src="https://www.youtube.com/embed/M9okZ_EFCcE" title="RoboCup@Home Evolution" frameborder="0" allowfullscreen></iframe>

### Conexión Tec 2021

<iframe width="560" height="315" src="https://www.youtube.com/embed/sBfiKc-LmK8" title="RoboCup@Home: Service robot for the home" frameborder="0" allowfullscreen></iframe>
2 changes: 2 additions & 0 deletions docs/Overview/index.md
@@ -0,0 +1,2 @@
# Overview [WIP]

