Update deepgram endpointing #145

seanmuirhead · 2024-02-01T23:12:20Z

Notes:

Added speech_final to SpeechEvent. It defaults to False because SpeechEvent looks like it's being used elsewhere.
Return speech_final for each SpeechEvent.
Add utterance_end_ms for more control over VAD.
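
For illustration, the first two notes could look roughly like the following. This is a minimal sketch, not the plugin's actual code: the `text` and `is_final` fields are assumed for the example, and only `speech_final` and its `False` default come from the notes above.

```python
from dataclasses import dataclass


@dataclass
class SpeechEvent:
    # Hypothetical fields for illustration; the real class may differ.
    text: str
    is_final: bool
    # New field. Defaulting to False means existing callers that
    # construct SpeechEvent without it keep working unchanged.
    speech_final: bool = False


# A caller written before the change still works and gets the default.
legacy = SpeechEvent(text="hello", is_final=True)

# A caller that knows about the new field can set it explicitly.
updated = SpeechEvent(text="hello there", is_final=True, speech_final=True)
```

Giving the new field a default keeps the change backward compatible for any code that already builds `SpeechEvent` instances.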

Relevant Docs:
https://developers.deepgram.com/docs/understanding-end-of-speech-detection
https://developers.deepgram.com/docs/understand-endpointing-interim-results

CLAassistant · 2024-02-01T23:12:25Z

All committers have signed the CLA.

theomonnom

Awesome, thanks for contributing!

Deepgram controls VAD through its endpointing parameter; this fix allows min_silence_duration to be configured in the agents layer.

Vibrat · 2024-02-05T01:46:35Z

Hi @seanmuirhead,

I'm coming from a similar PR, #148.

Our changes to the codebase overlap. I think the end result of both endpointing and interim results is to have speech_final and is_final as the final output. It would be nice if we could rebase my change onto your PR so that I can close my ticket and we can continue the discussion here.

seanmuirhead · 2024-02-05T17:16:50Z

Hi @Vibrat,

Yes, I was thinking the same thing! Will do that by EOD PST.

davidzhao · 2024-02-05T17:38:54Z

what do you think about exposing end_of_speech as the field instead of speech_final? It seems a bit more generic for signaling the end of input, and it also avoids having both speech_final and is_final.
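
A rough sketch of what that mapping might look like, assuming a Deepgram streaming result that carries `is_final`, `speech_final`, and `channel.alternatives[].transcript`. The `SpeechEvent` shape and the `event_from_deepgram` helper are hypothetical names used only for this example, not the plugin's real code.

```python
import json
from dataclasses import dataclass


@dataclass
class SpeechEvent:
    # Hypothetical event shape for illustration.
    text: str
    is_final: bool
    end_of_speech: bool = False


def event_from_deepgram(raw: str) -> SpeechEvent:
    """Translate one Deepgram result message into a generic SpeechEvent."""
    data = json.loads(raw)
    alternatives = data.get("channel", {}).get("alternatives", [])
    text = alternatives[0].get("transcript", "") if alternatives else ""
    return SpeechEvent(
        text=text,
        is_final=bool(data.get("is_final", False)),
        # The provider-specific speech_final flag is exposed under the
        # generic end_of_speech name.
        end_of_speech=bool(data.get("speech_final", False)),
    )


sample = json.dumps({
    "is_final": True,
    "speech_final": True,
    "channel": {"alternatives": [{"transcript": "turn on the lights"}]},
})
event = event_from_deepgram(sample)
```

Keeping the provider-specific name inside the plugin means other STT plugins can populate the same `end_of_speech` field from whatever signal their backend offers.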

seanmuirhead · 2024-02-06T00:49:16Z

@davidzhao Ya I agree, much cleaner that way. Will include in rebase

seanmuirhead · 2024-02-06T04:44:27Z

@davidzhao @Vibrat done! Removed utterance_end_ms according to Deepgram docs that were updated ~6hrs ago. Two notes for comment:

Does @Vibrat's min_silence_duration sound good for the attribute name?
Deepgram has endpointing cast as a string, which is why we've done it here, but for the record an int works just as well.
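
To make the second note concrete, here is a small sketch of how `min_silence_duration` might be turned into the `endpointing` query parameter. The `build_query` helper, the `interim_results` flag, and the assumption that the value is in milliseconds are illustrative only and not taken from the plugin.

```python
from urllib.parse import urlencode


def build_query(min_silence_duration: int, interim_results: bool = True) -> str:
    """Build a query string for the streaming endpoint (hypothetical helper).

    min_silence_duration is assumed to be in milliseconds.
    """
    params = {
        "interim_results": "true" if interim_results else "false",
        # Cast to a string to mirror how the parameter is documented.
        "endpointing": str(min_silence_duration),
    }
    return urlencode(params)


query = build_query(min_silence_duration=500)

# urlencode stringifies values, so an int and a str encode identically.
same = urlencode({"endpointing": 500}) == urlencode({"endpointing": "500"})
```

Since the value ends up in a URL query string either way, the string cast is a stylistic choice rather than a functional requirement.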

davidzhao

this looks good! we'll get it merged and update the other plugins to behave the same way.

commit 5fab872
Author: Neil Dwyer <[email protected]>
Date: Tue Feb 6 17:18:48 2024 -0800

    Kitt to handle empty deepgram text + elevenlabs bug fix (livekit#155)

    hotfix

commit 7661824
Author: Neil Dwyer <[email protected]>
Date: Tue Feb 6 16:25:35 2024 -0800

    Bump deepgram version (livekit#154)

commit ddc168e
Author: Neil Dwyer <[email protected]>
Date: Tue Feb 6 16:10:52 2024 -0800

    Fixes to Elevenlabs (livekit#152)

commit 5555526
Author: David Zhao <[email protected]>
Date: Tue Feb 6 09:12:38 2024 -0800

    Update KITT and other plugins to use end_of_speech field (livekit#153)

    * Update KITT and other plugins to use end_of_speech field

    Tested with KITT. It significantly improves the end of speech behavior so that we are giving it a 1s wait before starting to process user input.

    * ruff on 3.10
    * use ruff action
    * fixed ruff

commit a98a7c1
Author: Paul Lockett <[email protected]>
Date: Tue Feb 6 00:02:51 2024 -0800

    update tts init to export StreamAdapter (livekit#149)

commit 604d7e3
Author: Sean Muirhead <[email protected]>
Date: Mon Feb 5 21:35:55 2024 -0800

    Update deepgram endpointing (livekit#145)

    * deepgram: Add min_silence_duration to deepgram client.

    deepgram controls vad by endpointing parameter, this fix allows to configure min_silence_duration in agents layer

    * add utterance_end_ms and speech_final to Deepgram plugin
    * add utterance_end_ms and speech_final to Deepgram plugin
    * expose speech_final as end_of_speech

    ---------

    Co-authored-by: Lam Nguyen <[email protected]>

theomonnom approved these changes Feb 1, 2024

deepgram: Add min_silence_duration to deepgram client.

bd32d09

deepgram controls vad by endpointing parameter, this fix allows to configure min_silence_duration in agents layer

davidzhao mentioned this pull request Feb 4, 2024

deepgram: Add min_silence_duration to deepgram client. #148

Closed

seanmuirhead added 3 commits February 5, 2024 20:28

add utterance_end_ms and speech_final to Deepgram plugin

33006d9

add utterance_end_ms and speech_final to Deepgram plugin

f7c0fd9

expose speech_final as end_of_speech

7f1c15b

davidzhao approved these changes Feb 6, 2024

davidzhao merged commit 604d7e3 into livekit:main Feb 6, 2024
1 check passed

parshvadaftari pushed a commit to parshvadaftari/agents that referenced this pull request Jun 23, 2024

Docs to explain how outbound calls work (livekit#145)

5315642

Adapted from this discussion: https://discord.com/channels/1079125925163180093/1110708807883034646/1110708812979118183
