This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Commit: Add streaming microphone sample for Speech (#87)

CallistoCF authored and JustinBeckwith committed Sep 10, 2018
1 parent f2c42b5 commit 5760658

Showing 3 changed files with 258 additions and 108 deletions.
207 changes: 99 additions & 108 deletions README.md
@@ -2,149 +2,140 @@
[//]: # "To regenerate it, use `npm run generate-scaffolding`."
<img src="https://avatars2.githubusercontent.com/u/2810941?v=3&s=96" alt="Google Cloud Platform logo" title="Google Cloud Platform" align="right" height="96" width="96"/>

# Google Cloud Speech API: Node.js Samples

[![Open in Cloud Shell][shell_img]][shell_link]

The [Cloud Speech API](https://cloud.google.com/speech/docs) enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Cloud Speech API service.

## Table of Contents

* [Before you begin](#before-you-begin)
* [Samples](#samples)
  * [Speech Recognition](#speech-recognition)
  * [Speech Recognition v1p1beta1](#speech-recognition-v1p1beta1)

## Before you begin

Before running the samples, make sure you've followed the steps in the
[Before you begin section](../README.md#before-you-begin) of the client
library's README.

## Samples

### Speech Recognition

View the [source code][recognize_0_code].

[![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/nodejs-speech&page=editor&open_in_editor=samples/recognize.js,samples/README.md)

__Usage:__ `node recognize.js --help`
```
recognize.js <command>

Commands:
  recognize.js sync <filename>           Detects speech in a local audio file.
  recognize.js sync-gcs <gcsUri>         Detects speech in an audio file located in a Google Cloud Storage bucket.
  recognize.js sync-words <filename>     Detects speech in a local audio file with word time offset.
  recognize.js async <filename>          Creates a job to detect speech in a local audio file, and waits for the job
                                         to complete.
  recognize.js async-gcs <gcsUri>        Creates a job to detect speech in an audio file located in a Google Cloud
                                         Storage bucket, and waits for the job to complete.
  recognize.js async-gcs-words <gcsUri>  Creates a job to detect speech with word time offset in an audio file located
                                         in a Google Cloud Storage bucket, and waits for the job to complete.
  recognize.js stream <filename>         Detects speech in a local audio file by streaming it to the Speech API.
  recognize.js listen                    Detects speech in a microphone input stream. This command requires that you
                                         have SoX installed and available in your $PATH. See
                                         https://www.npmjs.com/package/node-record-lpcm16#dependencies

Options:
  --version              Show version number                                    [boolean]
  --encoding, -e                                           [string] [default: "LINEAR16"]
  --sampleRateHertz, -r                                        [number] [default: 16000]
  --languageCode, -l                                         [string] [default: "en-US"]
  --help                 Show help                                              [boolean]

Examples:
  node recognize.js sync ./resources/audio.raw -e LINEAR16 -r 16000
  node recognize.js async-gcs gs://gcs-test-data/vr.flac -e FLAC -r 16000
  node recognize.js stream ./resources/audio.raw -e LINEAR16 -r 16000
  node recognize.js listen

For more information, see https://cloud.google.com/speech/docs
```
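Before calling the API, the `sync` command assembles a request from the options above. The following is a hypothetical sketch (not the sample's literal code) of that request object: field names follow the Speech API's `RecognitionConfig`, the values mirror the defaults shown in the usage listing, and the base64 content is a stand-in string rather than real audio.

```javascript
// Build a recognize request the way `node recognize.js sync` would, using the
// documented defaults. The audio content is a fake placeholder for this sketch.
const audioBytes = Buffer.from('fake-pcm-bytes').toString('base64');

const request = {
  audio: {content: audioBytes},
  config: {
    encoding: 'LINEAR16', // matches the -e default above
    sampleRateHertz: 16000, // matches the -r default above
    languageCode: 'en-US', // matches the -l default above
  },
};

// The content field round-trips through base64 unchanged.
console.log(Buffer.from(request.audio.content, 'base64').toString()); // prints "fake-pcm-bytes"
```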

[recognize_0_docs]: https://cloud.google.com/speech/docs
[recognize_0_code]: recognize.js

### Speech Recognition v1p1beta1

View the [source code][recognize.v1p1beta1_1_code].

[![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/nodejs-speech&page=editor&open_in_editor=samples/recognize.v1p1beta1.js,samples/README.md)

__Usage:__ `node recognize.v1p1beta1.js --help`

```
recognize.v1p1beta1.js <command>

Commands:
  recognize.v1p1beta1.js sync-model <filename> <model>    Detects speech in a local audio file using provided model.
  recognize.v1p1beta1.js sync-model-gcs <gcsUri> <model>  Detects speech in an audio file located in a Google Cloud
                                                          Storage bucket using provided model.

Options:
  --version              Show version number                                    [boolean]
  --encoding, -e                                           [string] [default: "LINEAR16"]
  --sampleRateHertz, -r                                        [number] [default: 16000]
  --languageCode, -l                                         [string] [default: "en-US"]
  --help                 Show help                                              [boolean]

Examples:
  node recognize.v1p1beta1.js sync-model ./resources/Google_Gnome.wav video -e LINEAR16 -r 16000
  node recognize.v1p1beta1.js sync-model-gcs gs://gcs-test-data/Google_Gnome.wav phone_call -e FLAC -r 16000

For more information, see https://cloud.google.com/speech/docs
```
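The `sync-model` commands extend the recognition config with the v1p1beta1 `model` field; `'video'` and `'phone_call'` are the model names used in the examples above. The sketch below uses a hypothetical helper, `buildModelConfig`, which is not part of the sample, to show what that config looks like.

```javascript
// Hypothetical helper illustrating a v1p1beta1 recognition config with the
// `model` field set. Defaults mirror the CLI defaults in the usage listing.
function buildModelConfig(
  model,
  encoding = 'LINEAR16',
  sampleRateHertz = 16000,
  languageCode = 'en-US'
) {
  return {encoding, sampleRateHertz, languageCode, model};
}

const videoConfig = buildModelConfig('video');
console.log(videoConfig.model); // prints "video"
```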

[recognize.v1p1beta1_1_docs]: https://cloud.google.com/speech/docs
[recognize.v1p1beta1_1_code]: recognize.v1p1beta1.js

### betaFeatures v1p1beta1

View the [source code][betaFeatures_code].

[![Open in Cloud Shell][shell_img]](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/nodejs-speech&page=editor&open_in_editor=samples/betaFeatures.js,samples/README.md)

__Usage:__ `node betaFeatures.js --help`

```
betaFeatures.js <command>

Commands:
  betaFeatures.js sync-model <filename> <model>    Detects speech in a local audio file using provided model.
  betaFeatures.js sync-model-gcs <gcsUri> <model>  Detects speech in an audio file located in a Google Cloud
                                                   Storage bucket using provided model.

Options:
  --version              Show version number                                    [boolean]
  --encoding, -e                                           [string] [default: "LINEAR16"]
  --sampleRateHertz, -r                                        [number] [default: 16000]
  --languageCode, -l                                         [string] [default: "en-US"]
  --help                 Show help                                              [boolean]

Examples:
  node betaFeatures.js sync-model ./resources/Google_Gnome.wav video -e LINEAR16 -r 16000
  node betaFeatures.js sync-model-gcs gs://gcs-test-data/Google_Gnome.wav phone_call -e FLAC -r 16000

For more information, see https://cloud.google.com/speech/docs
```

[betaFeatures_docs]: https://cloud.google.com/speech/docs
[betaFeatures_code]: betaFeatures.js

[shell_img]: https://gstatic.com/cloudssh/images/open-btn.png
[shell_link]: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/googleapis/nodejs-speech&page=editor&open_in_editor=samples/README.md
129 changes: 129 additions & 0 deletions samples/MicrophoneStream.js
@@ -0,0 +1,129 @@
/**
* Copyright 2017, Google, Inc.
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/**
 * This application demonstrates how to perform basic recognize operations with
 * the Google Cloud Speech API.
*
* For more information, see the README.md under /speech and the documentation
* at https://cloud.google.com/speech/docs.
*/

'use strict';

/**
 * Note: Correct microphone settings are required: check the links below and
 * make sure the following conditions are met:
 * 1. SoX must be installed and available in your $PATH; it can be found at
 *    http://sox.sourceforge.net/
 * 2. Your microphone must be working.
 * 3. The encoding, sampleRateHertz, and number of channels must match the
 *    header of the audio you are recording.
 * 4. Install node-record-lpcm16: https://www.npmjs.com/package/node-record-lpcm16
 * More info: https://cloud.google.com/speech-to-text/docs/streaming-recognize
 */

// const encoding = 'LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'en-US';

function microphoneStream(encoding, sampleRateHertz, languageCode) {
  // [START micStreamRecognize]

  // node-record-lpcm16
  const record = require('node-record-lpcm16');

  // Imports the Google Cloud client library
  const speech = require('@google-cloud/speech');

  const config = {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
  };

  const request = {
    config,
    interimResults: false, // set to true to receive interim (partial) results
  };

  // Creates a client
  const client = new speech.SpeechClient();

  // Create a recognize stream
  const recognizeStream = client
    .streamingRecognize(request)
    .on('error', console.error)
    .on('data', data =>
      process.stdout.write(
        data.results[0] && data.results[0].alternatives[0]
          ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
          : `\n\nReached transcription time limit, press Ctrl+C\n`
      )
    );

  // Start recording and send the microphone input to the Speech API
  record
    .start({
      sampleRateHertz: sampleRateHertz,
      threshold: 0, // silence threshold
      recordProgram: 'rec', // try also "arecord" or "sox"
      silence: '5.0', // seconds of silence before ending the recording
    })
    .on('error', console.error)
    .pipe(recognizeStream);

  console.log('Listening, press Ctrl+C to stop.');
  // [END micStreamRecognize]
}

require(`yargs`)
  .demand(1)
  .command(
    `micStreamRecognize`,
    `Streams audio input from the microphone and transcribes it to text`,
    {},
    opts =>
      microphoneStream(opts.encoding, opts.sampleRateHertz, opts.languageCode)
  )
  .options({
    encoding: {
      alias: 'e',
      default: 'LINEAR16',
      global: true,
      requiresArg: true,
      type: 'string',
    },
    sampleRateHertz: {
      alias: 'r',
      default: 16000,
      global: true,
      requiresArg: true,
      type: 'number',
    },
    languageCode: {
      alias: 'l',
      default: 'en-US',
      global: true,
      requiresArg: true,
      type: 'string',
    },
  })
  .example(`node $0 micStreamRecognize`)
  .wrap(120)
  .recommendCommands()
  .epilogue(`For more information, see https://cloud.google.com/speech/docs`)
  .help()
  .strict().argv;
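The `'data'` handler in MicrophoneStream.js boils down to a single ternary: print a transcript line when results are present, otherwise print the time-limit notice. Extracting it into a pure function (a refactoring for illustration only, not how the sample is written) makes the two branches easy to exercise without a microphone or API credentials:

```javascript
// Pure version of the sample's 'data' handler logic: the same ternary over
// data.results[0] and its first alternative.
function formatStreamingData(data) {
  return data.results[0] && data.results[0].alternatives[0]
    ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
    : `\n\nReached transcription time limit, press Ctrl+C\n`;
}

// Branch 1: a result with an alternative yields a transcript line.
console.log(
  formatStreamingData({results: [{alternatives: [{transcript: 'hello world'}]}]})
); // prints "Transcription: hello world"
```

When the stream's time limit is reached, the API sends a response with no results, so `data.results[0]` is `undefined` and the function falls through to the Ctrl+C notice.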