Human‐Robot Interaction
To use the robot's speaker, run the say node, implemented in ws/src/speech/Say.py.
To access the Jetson Xavier, connect to the robot's router, currently named RoBorregosHome with password RoBorregosHome2024. Verify your connection and set your IP to match the 192.168.31.XX pattern. The Jetson Xavier has the static IP 192.168.31.23; to access it via SSH, type:
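The SSH command itself is not shown on this page; a typical invocation would look like the following (the username is hypothetical, substitute the Jetson's actual account):

```shell
# Username "nvidia" is an assumption; replace with the real Jetson account.
ssh nvidia@192.168.31.23
```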
Right now, the HRI repo is located in the home-hri directory. Navigate there and access the Docker container:
docker start home-hri
make hri.shell
Inside the container, verify the correct output device by running:
python3 -m sounddevice
Review the output and identify the device named samplerate, ALSA (in my case, device 31):
...
31 samplerate, ALSA (0 in, 128 out)
...
Set the OUTPUT_DEVICE_INDEX environment variable to this device number:
export OUTPUT_DEVICE_INDEX=31
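The speech node presumably picks this variable up from the environment; a minimal sketch of that pattern (the function name is my own, not from Say.py):

```python
import os

def get_output_device_index(default=None):
    """Read the speaker device index set above; fall back to `default`
    (None lets the audio library choose) when the variable is unset."""
    value = os.environ.get("OUTPUT_DEVICE_INDEX")
    return int(value) if value is not None else default
```

With `export OUTPUT_DEVICE_INDEX=31` in effect, the function returns the integer 31.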
After the previous setup steps, inside the Docker container on the Jetson Xavier, execute:
source ws/devel/setup.bash
roslaunch speech speech.launch
Then, in another terminal, run:
source ws/devel/setup.bash
rostopic pub /robot_text std_msgs/String "Hi, my name is Frida! I'm here to help you with your domestic tasks"
Whisper is a speech recognition package to be executed on a personal device with GPU capabilities. The setup and development have been simplified with Docker, but because it uses the speaker and microphone, some additional steps and customization are required for it to work.
Go to the root of HRI at /home/hri. Then, using the Makefile, run:
make hri.build.cuda
It will take some time to build. After the build finishes, run:
bash docker/scripts/speech.bash # Setup pulseaudio
sudo usermod -aG audio $USER # Make sure current user has access to audio resources.
sudo chmod 777 /dev/snd/* # Allow access to audio devices.
Before creating the container, fill the .env file based on the .env.example file, which contains instructions for selecting devices and filling in variables. In my case, it was easier to create the container first to use the required libraries, then remove it and fill in the .env:
make hri.create.cuda
make hri.shell
Check input and output devices
Go to ws/src/speech/scripts and run TestSpeaker.py. A list of devices will be displayed; select the one with Analog in its description, or the one with the most detail, which in my case was device 3.
Device 2: HD-Audio Generic: HDMI 2 (hw:0,8), 0 input channels, 8 output channels
Device 3: HD-Audio Generic: ALC294 Analog (hw:1,0), 2 input channels, 2 output channels
Device 4: hdmi, 0 input channels, 8 output channels
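Picking the Analog entry can also be done programmatically. A small sketch that parses TestSpeaker.py-style lines (the exact output format of the real script may differ slightly; the sample strings are the ones shown above):

```python
import re

def pick_analog_device(lines):
    """Return the index of the first 'Analog' device that has output
    channels, or None if no such device is listed."""
    pattern = r"Device (\d+): (.+?), (\d+) input channels?, (\d+) output channels?"
    for line in lines:
        m = re.match(pattern, line)
        if m and "Analog" in m.group(2) and int(m.group(4)) > 0:
            return int(m.group(1))
    return None

devices = [
    "Device 2: HD-Audio Generic: HDMI 2 (hw:0,8), 0 input channels, 8 output channels",
    "Device 3: HD-Audio Generic: ALC294 Analog (hw:1,0), 2 input channels, 2 output channels",
    "Device 4: hdmi, 0 input channels, 8 output channels",
]
print(pick_analog_device(devices))  # 3
```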
Then execute python3 -m sounddevice and look for the default device; in my case it was 6.
4 hdmi, ALSA (0 in, 8 out)
5 pulse, ALSA (32 in, 32 out)
* 6 default, ALSA (32 in, 32 out)
Add the device information to the .env file:
# Select Index for microphone and speaker
OUTPUT_DEVICE_INDEX=6
INPUT_DEVICE_INDEX=3
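A .env file of this shape is simple key=value lines. For illustration only, a minimal reader (in practice the container most likely receives these values through Docker's env-file mechanism rather than parsing the file itself):

```python
def read_env(path):
    """Parse a simple KEY=VALUE .env file, skipping blanks and comments."""
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config
```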
Then exit the container and remove it, so it can be created again with the new environment variables:
make hri.down
make hri.remove
make hri.create.cuda
make hri.up
make hri.shell
Inside the newly created Docker container, source the ROS workspace and run the speech launch file. It should load Whisper and start recognizing speech:
source ws/devel/setup.bash
roslaunch speech speech.launch