
On-device voice activity detection (VAD) powered by deep learning. Fork provides SPM support.


Storyboard-fm/cobra

 
 


Cobra


Made in Vancouver, Canada by Picovoice


Cobra is a highly-accurate and lightweight voice activity detection (VAD) engine.


Demos

Python Demos

Install the demo package:

sudo pip3 install pvcobrademo

With a working microphone connected to your device, run the following in the terminal:

cobra_demo_mic --access_key ${AccessKey}

Replace ${AccessKey} with your AccessKey obtained from Picovoice Console. Cobra will start processing the audio input from the microphone in real time and print to the terminal whenever it detects voice activity.

For more information about the Python demos go to demo/python.

C Demos

Build the demo:

cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build --target cobra_demo_mic

To list the available audio input devices:

./demo/c/build/cobra_demo_mic -s

To run the demo:

./demo/c/build/cobra_demo_mic -l ${LIBRARY_PATH} -a ${ACCESS_KEY} -d ${AUDIO_DEVICE_INDEX}

Replace ${LIBRARY_PATH} with the path to the appropriate library available under lib, ${ACCESS_KEY} with the AccessKey obtained from Picovoice Console, and ${AUDIO_DEVICE_INDEX} with the index of your microphone device.

For more information about C demos go to demo/c.

Android Demos

Using Android Studio, open demo/android/Activity as an Android project. Replace the value of String ACCESS_KEY = "..." inside MainActivity.java with your AccessKey generated by Picovoice Console, and then run the application.

For more information about Android demos go to demo/android.

iOS Demos

Run the following from this directory to install the Cobra-iOS CocoaPod:

pod install

Replace the value of let ACCESS_KEY = "..." inside ViewModel.swift with your AccessKey obtained from Picovoice Console.

Then, using Xcode, open the generated CobraDemo.xcworkspace and run the application. Press the start button and start talking. The background will change colour while you're talking.

For more information about iOS demos go to demo/ios.

Web Demos

From demo/web run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:5000 in your browser to try the demo.

NodeJS Demos

Install the demo package:

yarn global add @picovoice/cobra-node-demo

With a working microphone connected to your device, run the following in the terminal:

cobra-mic-demo --access_key ${ACCESS_KEY}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console. Cobra will start processing the audio input from the microphone in real time and print to the terminal whenever it detects voice activity.

For more information about NodeJS demos go to demo/nodejs.

Rust Demos

From demo/rust/micdemo build and run the demo:

cargo run --release -- --access_key ${ACCESS_KEY}

For more information about Rust demos go to demo/rust.

SDKs

Python

Install the Python SDK:

pip3 install pvcobra

The SDK exposes a factory method to create instances of the engine:

import pvcobra

handle = pvcobra.create(access_key='${AccessKey}')

where ${AccessKey} is an AccessKey obtained from Picovoice Console. Once initialized, the valid sample rate is available as handle.sample_rate, and the required frame length (number of audio samples in an input array) as handle.frame_length. The object can be used to monitor incoming audio as follows:

def get_next_audio_frame():
    pass

while True:
    voice_probability = handle.process(get_next_audio_frame())
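
The loop above leaves get_next_audio_frame unimplemented. As a minimal sketch, assuming the audio is already in memory as a sequence of 16-bit samples, the buffer can be cut into frames of handle.frame_length samples and each frame's probability compared against a threshold. The names iter_frames, voiced_flags and fake_process, and the 0.5 threshold, are illustrative and not part of pvcobra; fake_process stands in for handle.process so that the sketch runs without an AccessKey.

```python
from typing import Callable, Iterator, List, Sequence


def iter_frames(pcm: Sequence[int], frame_length: int) -> Iterator[Sequence[int]]:
    """Yield consecutive, non-overlapping frames; a trailing partial frame is dropped."""
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    for start in range(0, len(pcm) - frame_length + 1, frame_length):
        yield pcm[start:start + frame_length]


def voiced_flags(
    process: Callable[[Sequence[int]], float],
    pcm: Sequence[int],
    frame_length: int,
    threshold: float = 0.5,  # assumed value, tune for your application
) -> List[bool]:
    """Return True for each frame whose probability is at or above the threshold."""
    return [process(frame) >= threshold for frame in iter_frames(pcm, frame_length)]


def fake_process(frame: Sequence[int]) -> float:
    """Hypothetical stand-in for handle.process: loud frames count as voiced."""
    return 1.0 if max(abs(sample) for sample in frame) > 1000 else 0.0


if __name__ == "__main__":
    # One silent frame, one loud frame, then a partial frame that is dropped.
    samples = [0] * 512 + [5000] * 512 + [0] * 100
    print(voiced_flags(fake_process, samples, frame_length=512))
```

With the real engine, handle.process and handle.frame_length would take the place of fake_process and the hard-coded 512.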

Finally, when done be sure to explicitly release the resources using handle.delete().
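
One way to make sure the release always happens is a try/finally block around the processing loop. The sketch below assumes nothing beyond the process and delete methods mentioned above; FakeHandle and run are hypothetical names used only so the example is runnable without the engine.

```python
class FakeHandle:
    """Hypothetical stand-in exposing the process/delete methods used above."""

    def __init__(self) -> None:
        self.deleted = False

    def process(self, frame):
        if not frame:
            raise ValueError("empty frame")
        return 0.0

    def delete(self) -> None:
        self.deleted = True


def run(handle, frames):
    """Process frames and always release the handle, even if processing raises."""
    probabilities = []
    try:
        for frame in frames:
            probabilities.append(handle.process(frame))
    finally:
        handle.delete()
    return probabilities


if __name__ == "__main__":
    handle = FakeHandle()
    print(run(handle, [[0, 0, 0, 0]]), handle.deleted)
```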

C

The include/pv_cobra.h header file contains the relevant information. Build an instance of the object:

    pv_cobra_t *handle = NULL;
    pv_status_t status = pv_cobra_init(${ACCESS_KEY}, &handle);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }

Replace ${ACCESS_KEY} with the AccessKey obtained from Picovoice Console. The handle can now be used to monitor an incoming audio stream. Cobra accepts single-channel, 16-bit linearly-encoded PCM audio. The sample rate can be retrieved using pv_sample_rate(). Cobra accepts input audio in consecutive chunks (frames); the length of each frame can be retrieved using pv_cobra_frame_length().

extern const int16_t *get_next_audio_frame(void);

while (true) {
    const int16_t *pcm = get_next_audio_frame();
    float is_voiced = 0.f;
    const pv_status_t status = pv_cobra_process(handle, pcm, &is_voiced);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }
}

Finally, when done be sure to release the acquired resources:

pv_cobra_delete(handle);
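
The input requirements described in this section (single channel, 16-bit linear PCM, a fixed sample rate, fixed-length frames) can be checked before any audio reaches the engine. The following is a sketch in Python using only the standard library; read_frames and make_wav are hypothetical helpers, and the 16000 and 4 used in the demo are placeholder values rather than the ones reported by pv_sample_rate() and pv_cobra_frame_length().

```python
import array
import io
import sys
import wave
from typing import List


def read_frames(wav_file, sample_rate: int, frame_length: int) -> List[array.array]:
    """Read a WAV file and return its full frames of int16 samples.

    Raises ValueError unless the audio is mono, 16-bit, and at sample_rate.
    A trailing partial frame is dropped.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single channel")
        if wav.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit")
        if wav.getframerate() != sample_rate:
            raise ValueError("unexpected sample rate")
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    last_start = len(samples) - frame_length
    return [samples[i:i + frame_length] for i in range(0, last_start + 1, frame_length)]


def make_wav(samples, sample_rate: int) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV file (demo helper)."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(data.tobytes())
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    frames = read_frames(make_wav(range(10), 16000), sample_rate=16000, frame_length=4)
    print([list(frame) for frame in frames])
```

Rejecting mismatched audio up front is a design choice in this sketch; resampling or downmixing would be an alternative, but neither is covered by this README.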

Android

Create an instance of the engine:

import ai.picovoice.cobra.Cobra;
import ai.picovoice.cobra.CobraException;

String accessKey = "${ACCESS_KEY}"; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
Cobra handle = null;
try {
    handle = new Cobra(accessKey);
} catch (CobraException e) {
    // handle error
}

Once initialized, the valid sample rate can be obtained using handle.getSampleRate(), and the required frame length (number of audio samples in an input array) using handle.getFrameLength(). The object can be used to monitor incoming audio as follows:

short[] getNextAudioFrame() {
    // .. get audioFrame
    return audioFrame;
}

while (true) {
    try {
        final float voiceProbability = handle.process(getNextAudioFrame());
    } catch (CobraException e) {
        // handle error
    }
}

Finally, when done be sure to explicitly release the resources using handle.delete().

iOS

To import the Cobra iOS binding into your project, add the following line to your Podfile and run pod install:

pod 'Cobra-iOS'

Create an instance of the engine:

import Cobra

let accessKey: String = "${ACCESS_KEY}" // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
var handle: Cobra!
do {
    handle = try Cobra(accessKey: accessKey)
} catch {
    // handle error
}

func getNextAudioFrame() -> [Int16] {
    // .. get audioFrame
    return audioFrame
}

while true {
    do {
        let voiceProbability = try handle.process(getNextAudioFrame())
    } catch { }
}

Finally, when done be sure to explicitly release the resources using handle.delete().

Web

Install the web SDK using yarn:

yarn add @picovoice/cobra-web

or using npm:

npm install --save @picovoice/cobra-web

Create an instance of the engine using CobraWorker and run the VAD on an audio input stream:

import { CobraWorker } from "@picovoice/cobra-web";

function voiceProbabilityCallback(voiceProbability: number) {
  // ... use voice probability figure
}

function getAudioData(): Int16Array {
  // ... function to get audio data
  return new Int16Array();
}

const cobra = await CobraWorker.create(
  "${ACCESS_KEY}",
  voiceProbabilityCallback
);

for (; ;) {
  cobra.process(getAudioData());
  // break on some condition
}

Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.

When done, release the resources allocated to Cobra using cobra.release().

NodeJS

Install the NodeJS SDK:

yarn add @picovoice/cobra-node

Create instances of the Cobra class:

const { Cobra } = require("@picovoice/cobra-node");

const accessKey = "${ACCESS_KEY}"; // Obtained from the Picovoice Console (https://console.picovoice.ai/)
const cobra = new Cobra(accessKey);

When instantiated, cobra can process audio via its .process method.

function getNextAudioFrame() {
  // ...
  return audioFrame;
}

while (true) {
  const audioFrame = getNextAudioFrame();
  const voiceProbability = cobra.process(audioFrame);
  console.log(voiceProbability);
}

When done, be sure to release resources using release():

cobra.release();

Rust

Create an instance of the engine and detect voice activity:

use cobra::Cobra;

let cobra = Cobra::new("${ACCESS_KEY}").expect("Unable to create Cobra");

fn next_audio_frame() -> Vec<i16> {
    // get audio frame
}

loop {
    if let Ok(voice_probability) = cobra.process(&next_audio_frame()) {
        // ...
    }
}
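
Each SDK above returns one voice probability per frame. As a rough sketch of how those values might be turned into time ranges, the Python function below merges consecutive frames at or above a threshold into (start, end) pairs in seconds, using frame_length / sample_rate as the duration of one frame. The function name voiced_segments and the default threshold of 0.5 are assumptions for illustration, not part of any Cobra SDK.

```python
from typing import Iterable, List, Tuple


def voiced_segments(
    probabilities: Iterable[float],
    frame_length: int,
    sample_rate: int,
    threshold: float = 0.5,  # assumed value, tune for your application
) -> List[Tuple[float, float]]:
    """Merge consecutive voiced frames into (start_seconds, end_seconds) pairs."""
    frame_seconds = frame_length / sample_rate
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame in the currently open segment
    count = 0     # number of frames seen so far
    for index, probability in enumerate(probabilities):
        count = index + 1
        if probability >= threshold and start is None:
            start = index
        elif probability < threshold and start is not None:
            segments.append((start * frame_seconds, index * frame_seconds))
            start = None
    if start is not None:
        # The audio ended while still voiced, so close the open segment.
        segments.append((start * frame_seconds, count * frame_seconds))
    return segments


if __name__ == "__main__":
    # Placeholder frame length and sample rate; use the values reported by the engine.
    print(voiced_segments([0.1, 0.9, 0.8, 0.2, 0.7], frame_length=8000, sample_rate=16000))
```

A plain threshold reacts to every single frame; smoothing the probabilities or requiring several consecutive frames before opening or closing a segment would be a natural refinement.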

Releases

v3.0.0 - October 26th, 2023

  • Improvements to error reporting
  • Upgrades to authorization and authentication system
  • Various bug fixes and improvements
  • Minimum supported Node.js version bumped to 16

v1.2.0 - January 27th, 2023

  • Updated Cobra engine for improved accuracy and performance
  • iOS minimum requirement moved to iOS 11.0
  • Minor bug fixes

v1.1.0 - January 21st, 2022

  • Improved types for web binding
  • Various bug fixes and improvements

v1.0.0 - October 8th, 2021

  • Initial release
