Sensory WakeWord Engine plug-in for Raspberry Pi

chrisjrob/alexa-rpi

Sensory TrulyHandsfree WakeWord Engine

About the Project

This project provides plug-in keyword spotting for the Alexa AVS sample app Raspberry Pi project, using Sensory's TrulyHandsfree technology. It includes speaker-independent recognizers for the phrase "Alexa".

License

The TrulyHandsfree library is provided for non-commercial development use only. See LICENSE.txt for details.

The libsnsr.a library is time-limited: code linked against it will stop working when the library expires. The library included in this repository will, at all times, have an expiration date that is at least 120 days in the future. To continue development after a library has expired, pull the latest update from the repository and re-link.
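The FAQ at the end of this document lists the refresh steps in full. As a rough sketch only, the file-copy part of those steps could be scripted as follows. The function name and its two arguments are hypothetical; it assumes `repo_dir` is a checkout of this repository and `project_dir` is the directory that contains `ext/lib/` and `ext/resources/`.

```shell
# install_sensory_files REPO_DIR PROJECT_DIR
# Hypothetical helper, not part of this repository: copies the refreshed
# library and models into an existing project tree, replacing expired files.
install_sensory_files() {
    repo_dir=$1
    project_dir=$2
    cp "$repo_dir/lib/libsnsr.a" "$project_dir/ext/lib/" || return 1
    cp "$repo_dir"/models/*.snsr "$project_dir/ext/resources/" || return 1
}

# Possible usage (paths are placeholders):
#   ./bin/license.sh                          # interactive: accept the EULA
#   install_sensory_files . /path/to/project
#   ...then rebuild the wakeWordAgent executable
```

The helper only copies files. Running `./bin/license.sh` and rebuilding remain manual steps because the first is interactive and the second depends on your build setup.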

Please contact Sensory Sales if you wish to use this code in a product. We have solutions available for a large number of architectures, including low-power DSP ports suitable for continuous listening on battery-powered devices. Sensory also offers additional technologies, such as enrolled speaker-specific triggers and speaker verification.

Getting Started

This project is a plug-in for the Alexa AVS sample app project. Please follow the build and configuration instructions for that project.

Performance

The models/ subdirectory contains recognition models of different sizes. The larger models provide better performance (chiefly fewer false accepts) but require more CPU resources. The table below provides details.

| Model | Size (MiB) | FR (%) | FA (per day) | MIPS | Pi 2 CPU % | Pi 3 CPU % |
|---|---|---|---|---|---|---|
| spot-alexa-rpi-20500.snsr | 0.3 | 10.5 ± 2.1 | 6.6 ± 1 | 34 | 5.1 | 2.8 |
| spot-alexa-rpi-21000.snsr | 1.1 | 10.2 ± 2.1 | 4.5 ± 1 | 90 | 17.7 | 8.1 |
| spot-alexa-rpi-31000.snsr | 2.0 | 10.6 ± 2.1 | 3.7 ± 1 | 168 | 33.2 | 14.0 |

Key:

  • Size: The file size of the keyword spotter model in mebibytes. This is also an estimate of the runtime RAM requirement.
  • FR: False Reject percentage. This is the percentage of wakeword utterances that the recognizer fails to detect.
  • FA: False Accept frequency. The number of times per day the system mistakenly reports a spot of the wakeword.
  • MIPS: Processing load in millions of instructions per second.
  • Pi 2 CPU %: Raspberry Pi 2 Model B CPU resource usage expressed as a percentage of real time. 50% is half of one CPU core; such a recognizer processes audio at twice the rate at which it arrives from the microphone.
  • Pi 3 CPU %: Raspberry Pi 3 Model B CPU resource usage, expressed the same way.
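
To make the real-time interpretation of the CPU columns concrete, here is a small sketch that converts a CPU percentage into a real-time factor. The helper name `rtf` is hypothetical and not part of this project, and the calculation assumes the load sits on a single core.

```shell
# rtf CPU_PERCENT
# Hypothetical helper: converts a single-core CPU percentage into a
# real-time factor (how many times faster than real time audio is processed).
rtf() {
    LC_ALL=C awk -v cpu="$1" 'BEGIN { printf "%.1f\n", 100 / cpu }'
}

rtf 50     # prints 2.0  (the 50% example from the key)
rtf 5.1    # prints 19.6 (smallest model on a Pi 2, figure from the table)
```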

Test methodology:

  • Measured on data independent from those used for development.
  • More than 1400 utterances were evaluated, each of these in 24 different noise conditions (six different noise types, four SNR levels).
  • Includes over-the-air tests of physical devices in an anechoic chamber.
  • Results reported in the table above were measured at 12 dBA SNR.
  • In typical use the SNR is around 15 dBA. A better signal-to-noise ratio results in better performance; the reported results are therefore conservative.
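
One way to act on the size and accuracy trade-off in the table is to pick the model with the fewest false accepts that still fits a CPU budget. The sketch below is illustrative only: `pick_model` is a hypothetical helper, the figures are copied from the FA and Pi 3 CPU % columns of the table, and it assumes a Raspberry Pi 3 at the reported 12 dBA SNR.

```shell
# pick_model BUDGET_PERCENT
# Hypothetical helper: prints the model with the lowest FA/day whose
# Pi 3 CPU % is within the budget; returns non-zero if none fits.
pick_model() {
    LC_ALL=C awk -v budget="$1" '
        # columns: model name, FA per day, Pi 3 CPU %
        ($3 + 0) <= (budget + 0) && (best == "" || ($2 + 0) < fa) {
            best = $1; fa = $2 + 0
        }
        END { if (best != "") print best; else exit 1 }
    ' <<'EOF'
spot-alexa-rpi-20500.snsr 6.6 2.8
spot-alexa-rpi-21000.snsr 4.5 8.1
spot-alexa-rpi-31000.snsr 3.7 14.0
EOF
}

pick_model 10    # prints spot-alexa-rpi-21000.snsr
```

False reject rates are not considered here because the three models report nearly identical FR values within the stated error margins.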

Frequently Asked Questions

  1. How do I report a problem with this plug-in?

    • Open a GitHub issue and include details on how to trigger the unexpected behavior.
  2. The library license key has expired. How do I extend it?

    • Run ./bin/license.sh. Accept the presented EULA to pull a new license key from the GitHub repository and apply it to the binaries.
    • Copy ./lib/libsnsr.a into your project's ext/lib/ directory, replacing the expired library.
    • Copy ./models/*.snsr into your project's ext/resources/ directory, replacing the expired models.
    • Rebuild your wakeWordAgent executable.
  3. What can I do to address audio recording problems?

    • The project uses ALSA for audio recording. It will open a capture session from the default audio device, and record 16-bit signed integers at 16 kHz.
    • There is an example /etc/asound.conf included in the config/ subdirectory in this repository. This file configures ALSA to use the USB microphone as the default input source, and the analog audio jack as the default output. We recommend that you use this configuration as a starting point for your audio routing.
    • Verify that your audio configuration is suitable by running this command to make a ten-second audio recording: arecord -d 10 -f S16_LE -r 16000 test.wav
    • Listen to the test recording and verify that it is clear: aplay test.wav
    • If the recording volume is low, experiment with adjusting the recording levels using alsamixer. Run sudo alsactl store to make these settings permanent.
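
For the audio check in item 3, it can also help to confirm that the test recording really has the expected format. The following sketch reads the sample rate and bit depth straight from the WAV header. `wav_format` is a hypothetical helper, and it assumes a canonical 44-byte PCM header and a little-endian host such as the Raspberry Pi.

```shell
# wav_format FILE
# Hypothetical helper: prints "<sample_rate> <bits_per_sample>" read from
# a canonical PCM WAV header (sample rate at byte 24, bit depth at byte 34).
wav_format() {
    rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d '[:space:]')
    bits=$(od -An -tu2 -j34 -N2 "$1" | tr -d '[:space:]')
    echo "$rate $bits"
}

# Possible usage after making the test recording:
#   arecord -d 10 -f S16_LE -r 16000 test.wav
#   wav_format test.wav    # a suitable recording reports "16000 16"
```

If the reported values differ from 16000 and 16, the capture device or the ALSA configuration is probably not delivering the format the project expects.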

Copyright © 2016 Sensory, Inc. http://sensory.com/
