
A Node module for offline speech to text, using pocketsphinx and ffmpeg. Originally a component of the Videogrep project.


pietrop/pocketsphinx-stt

 
 


pocketsphinx-stt

Speech-to-text module, initially part of the Videogrep Mac OS X Electron app by Sam Lavigne (@sam_lavigne).

It was then refactored as part of autoEdit, and subsequently as part of the Digital Paper Edit app.

Pocketsphinx is configured with an American English dictionary.

Setup

git clone https://github.com/OpenNewsLabs/pocketsphinx-stt
cd pocketsphinx-stt
npm install

Usage

The module is available on npm as pocketsphinx-stt:

npm install pocketsphinx-stt

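There are two options. The first, transcribe, expects the file to already be an audio file in a format that pocketsphinx can read.

const { transcribe } = require('pocketsphinx-stt');
// Placeholder path: replace with an audio file pocketsphinx can read
const audioFilePath = './example.wav';

transcribe(audioFilePath)
    .then((res) => {
        console.log('transcribe', res);
    })
    .catch((err) => {
        console.error('transcribe failed', err);
    });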

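The second, convertAndTranscribe, first uses ffmpeg to convert the audio or video file to the right format for pocketsphinx, then transcribes it.

const { convertAndTranscribe } = require('pocketsphinx-stt');
// Placeholder path: replace with your own audio or video file
const videoFilePath = './example.mp4';

convertAndTranscribe(videoFilePath)
    .then((res) => {
        console.log('transcribe', res);
    })
    .catch((err) => {
        console.error('convertAndTranscribe failed', err);
    });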

convertAndTranscribe can take an optional parameter specifying where to save the converted audio file. If it is not provided, the audio file is saved in the same folder as the original media, with the same name but an audio extension.

For more, try the example usage: node src/example-usage.js
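The default save location and the conversion step can be sketched as follows. This is only an illustration under stated assumptions, not the module's actual code: the helper names `defaultAudioPath` and `ffmpegArgs` are hypothetical, the `.wav` extension is a guess (the text above only says "audio extension"), and the ffmpeg settings (16 kHz, mono, 16-bit PCM) are the ones commonly used with the default pocketsphinx US English model rather than settings confirmed for this module.

```javascript
const path = require('path');

// Hypothetical helper: same folder and base name as the source media,
// with an audio extension. '.wav' is an assumption, not confirmed by the module.
function defaultAudioPath(mediaFilePath, audioExtension = '.wav') {
  const { dir, name } = path.parse(mediaFilePath);
  return path.join(dir, name + audioExtension);
}

// Hypothetical helper: ffmpeg arguments for a 16 kHz, mono, 16-bit PCM WAV file.
// The flags are standard ffmpeg options; whether the module uses exactly these is assumed.
function ffmpegArgs(inputPath, outputPath) {
  return [
    '-y',                   // overwrite the output file if it exists
    '-i', inputPath,        // input audio or video file
    '-ar', '16000',         // sample rate: 16 kHz
    '-ac', '1',             // one audio channel (mono)
    '-acodec', 'pcm_s16le', // 16-bit little-endian PCM
    outputPath
  ];
}

const mediaPath = path.join('media', 'interview.mp4');
const audioPath = defaultAudioPath(mediaPath);
console.log(audioPath);
console.log(['ffmpeg', ...ffmpegArgs(mediaPath, audioPath)].join(' '));
```

The argument list could be passed to `child_process.spawn('ffmpeg', ffmpegArgs(...))` if you want to run the conversion yourself before calling transcribe.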

Example output

The output uses the JSON transcript format from the Digital Paper Edit project.

{ words:
   [ { text: 'why', start: 0.28, end: 1.23, accuracy: 0.018412, id: 0 },
     { text: 'not', start: 1.32, end: 1.85, accuracy: 0.851958, id: 1 },
     { text: 'she\'s', start: 2.4, end: 2.7, accuracy: 0.067643, id: 2 },
    ...
    ],
  paragraphs:
   [ { id: 0, start: 0.28, end: 3.93, speaker: 'U_UKN' },
     { id: 1, start: 4.69, end: 5.81, speaker: 'U_UKN' },
     { id: 2, start: 6.55, end: 7.37, speaker: 'U_UKN' },
  ...
   ]
}
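Since words and paragraphs are stored as separate lists linked only by their timings, a consumer usually has to join them. A minimal sketch of one way to do that, assuming a word belongs to a paragraph when its start and end fall within the paragraph's time range. The helper name `paragraphsWithText` is hypothetical and the sample data is the truncated example above, with only the first paragraph kept.

```javascript
// Sample shaped like the example output, truncated to three words and one paragraph.
const transcript = {
  words: [
    { text: 'why', start: 0.28, end: 1.23, accuracy: 0.018412, id: 0 },
    { text: 'not', start: 1.32, end: 1.85, accuracy: 0.851958, id: 1 },
    { text: "she's", start: 2.4, end: 2.7, accuracy: 0.067643, id: 2 }
  ],
  paragraphs: [{ id: 0, start: 0.28, end: 3.93, speaker: 'U_UKN' }]
};

// Hypothetical helper: attach to each paragraph the text of the words
// whose timings fall inside the paragraph's start and end.
function paragraphsWithText({ words, paragraphs }) {
  return paragraphs.map((paragraph) => {
    const text = words
      .filter((word) => word.start >= paragraph.start && word.end <= paragraph.end)
      .map((word) => word.text)
      .join(' ');
    return { ...paragraph, text };
  });
}

console.log(paragraphsWithText(transcript));
```

The same filter could also use the accuracy field, for example to flag words below a chosen threshold for manual review.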

System Architecture

TBC

Development env

Build

npm run build

Packages the module via Babel into the dist folder.

Tests

NA

Deployment

On npm

npm run publish:public

Runs the build and then publishes the dist folder to npm, with a copy of the README and package.json.
