WIP exploration using Twilio Media Streams and Generative AI

craigsdennis/genai-phone-call

Generative AI Phone Calling

Generative AI is producing a bunch of fun new models for us devs to poke at. Did you know you can use these over the phone?

Twilio gives you a superpower called Media Streams: a WebSocket connection to both sides of a phone call. You can get audio streamed to you, process it, and send audio back.
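As a rough sketch of that receive, process, and send-back loop, the snippet below handles the JSON messages Twilio sends over the WebSocket. The event names (`start`, `media`, `stop`) and the base64 `media.payload` field follow Twilio's Media Streams message format; the helper names `handleMessage` and `buildMediaMessage`, the `state` object, and the `onAudio` callback are hypothetical and not taken from this repo's `server.js`.

```javascript
// Hypothetical helper: wrap base64-encoded audio in an outbound "media"
// message. Twilio expects the streamSid of the stream being answered.
function buildMediaMessage(streamSid, base64Audio) {
  return JSON.stringify({
    event: "media",
    streamSid,
    media: { payload: base64Audio },
  });
}

// Hypothetical helper: process one raw WebSocket message from Twilio.
// `state` remembers the streamSid between messages; `onAudio` receives the
// decoded audio bytes (8 kHz mu-law by default) for a speech-to-text step.
function handleMessage(raw, state, onAudio) {
  const msg = JSON.parse(raw);
  switch (msg.event) {
    case "start":
      // The "start" message carries the stream metadata.
      state.streamSid = msg.start.streamSid;
      break;
    case "media":
      // Audio arrives as base64 text; decode it into a Buffer.
      onAudio(Buffer.from(msg.media.payload, "base64"));
      break;
    case "stop":
      state.streamSid = null;
      break;
    default:
      // "connected", "mark", and other events are ignored in this sketch.
      break;
  }
  return state;
}
```

In a real server, `handleMessage` would be called from the WebSocket `message` handler, and the string returned by `buildMediaMessage` would be passed to the socket's `send` method once the text-to-speech step has produced audio.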

This repo is a WIP demo that explores two models: Deepgram for Speech to Text and the incredibly fun ElevenLabs for Text to Speech.

Installation

Sign up for Deepgram and ElevenLabs

Use something like ngrok to open a tunnel that exposes local port 3000

ngrok http 3000

Copy .env.example to .env and fill in your API keys

Set SERVER to your tunneled URL

Install the necessary packages

npm install

Start the web server

node server.js

Wire up your Twilio number using the console or CLI

twilio phone-numbers:update +18889876 --voice-url=https://your-server.ngrok.io/incoming

TwiML has a Stream noun, nested inside a Connect or Start verb, that connects the call's audio stream to your WebSocket server.
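To make the Stream instruction concrete, here is a minimal sketch of the TwiML an `/incoming` webhook could return. The `<Response>`, `<Connect>`, and `<Stream url="...">` elements are standard TwiML; the function name `buildStreamTwiml` and the `/connection` WebSocket path are assumptions for illustration, not necessarily what this repo uses. It also assumes the URL contains no characters that need XML escaping.

```javascript
// Hypothetical helper: build TwiML that connects the call to a WebSocket.
// `serverUrl` is the tunneled URL stored in SERVER, such as
// https://your-server.ngrok.io. The "/connection" path is an assumption.
function buildStreamTwiml(serverUrl, path = "/connection") {
  // Media Streams needs a secure WebSocket URL, so swap https:// for wss://
  // and drop any trailing slash before appending the path.
  const wsUrl = serverUrl.replace(/^http/, "ws").replace(/\/+$/, "") + path;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${wsUrl}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

console.log(buildStreamTwiml("https://your-server.ngrok.io"));
```

`<Connect>` is used here rather than `<Start>` because it sets up a bidirectional stream, which is what sending generated audio back to the caller requires.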
