How to stream Text to Speech? #487
According to the documentation here for Text to Speech:
https://platform.openai.com/docs/guides/text-to-speech?lang=node
there is the possibility of streaming audio without waiting for the full file to buffer. But the example is a Python one. Is there any possibility of streaming the incoming audio using Node.js?

Comments
This needs to be added. I've tried all sorts of methods, but I don't think it's supported natively in the Node SDK at the moment. The call does return a streamable object, but there are no chunks found while iterating through it.
Yes, this works today – I'm sorry that the example code doesn't reflect that. You can simply access `response.body`:

```ts
async function main() {
  const response = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: 'the quick brown fox jumped over the lazy dogs',
  });
  const stream = response.body;
}
```

I'll try to update the example soon, and won't close this issue until I do. Feel free to share use-cases you'd like to see in the example here, with sample code.
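For instance, the stream can then be consumed incrementally. A minimal sketch (with current typings the body needs a cast to a Node stream, as later comments note):

```ts
import OpenAI from 'openai';

const openai = new OpenAI();

async function main() {
  const response = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: 'the quick brown fox jumped over the lazy dogs',
  });

  // With current typings the body needs a cast to a Node stream.
  const stream = response.body as unknown as NodeJS.ReadableStream;

  // Chunks arrive as the audio is generated, before the full file exists.
  stream.on('data', (chunk: Buffer) => console.log(`received ${chunk.length} bytes`));
  stream.on('end', () => console.log('done'));
}

main();
```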
This is my example; it uses the Next.js framework. I hope it will be helpful:

```ts
export async function text2Speech({
  res,
  onSuccess,
  onError,
  model = defaultAudioSpeechModels[0].model,
  voice = Text2SpeechVoiceEnum.alloy,
  input,
  speed = 1
}: {
  res: NextApiResponse;
  onSuccess: (e: { model: string; buffer: Buffer }) => void;
  onError: (e: any) => void;
  model?: string;
  voice?: `${Text2SpeechVoiceEnum}`;
  input: string;
  speed?: number;
}) {
  const ai = getAIApi();
  const response = await ai.audio.speech.create({
    model,
    voice,
    input,
    response_format: 'mp3',
    speed
  });

  // The SDK's types don't expose a Node stream directly, so cast the body.
  const readableStream = response.body as unknown as NodeJS.ReadableStream;

  // Stream the audio to the client as it arrives...
  readableStream.pipe(res);

  // ...while also accumulating the full file for the onSuccess callback.
  let bufferStore = Buffer.from([]);
  readableStream.on('data', (chunk) => {
    bufferStore = Buffer.concat([bufferStore, chunk]);
  });
  readableStream.on('end', () => {
    onSuccess({ model, buffer: bufferStore });
  });
  readableStream.on('error', (e) => {
    onError(e);
  });
}
```
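Used from an API route, a hypothetical handler might look like this (a sketch only: the import path, query parameter, and logging are illustrative assumptions, not part of the code above):

```ts
import type { NextApiRequest, NextApiResponse } from 'next';
// Hypothetical import path for the helper above.
import { text2Speech } from '@/service/text2Speech';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  res.setHeader('Content-Type', 'audio/mpeg');
  await text2Speech({
    res,
    input: String(req.query.text ?? ''),
    onSuccess: ({ model, buffer }) => {
      console.log(`TTS finished: ${model}, ${buffer.length} bytes`);
    },
    onError: (e) => {
      console.error(e);
      if (!res.headersSent) res.status(500).end();
    }
  });
}
```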
Note that the TypeScript types aren't correct when reading the response as a stream in Node. You have to cast, e.g. `response.body as unknown as NodeJS.ReadableStream`, as in the example above.
To fix those type errors, add a cast like the one above for now. We're working to improve this.
Hello everyone, can you please help me implement it in Node? I can't make it work...

```js
import path from "path";

const openai = new OpenAI({
  // ...
});

const response = openai.audio.speech.create({
  // ...
});

response.stream_to_file(path.resolve("./speech.mp3"));
```
We don't provide a `stream_to_file` helper in the Node SDK, but you can write your own:

```ts
import OpenAI from 'openai';
import fs from 'fs';
import path from 'path';

// gets API Key from environment variable OPENAI_API_KEY
const openai = new OpenAI();

const speechFile = path.resolve(__dirname, './speech.mp3');

async function streamToFile(stream: NodeJS.ReadableStream, path: fs.PathLike) {
  return new Promise((resolve, reject) => {
    const writeStream = fs.createWriteStream(path).on('error', reject).on('finish', resolve);
    stream.pipe(writeStream).on('error', (error) => {
      writeStream.close();
      reject(error);
    });
  });
}

async function main() {
  const mp3 = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: 'the quick brown chicken jumped over the lazy dogs',
  });
  await streamToFile(mp3.body, speechFile);
}

main();
```
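A note on the design above: resolving the Promise on the write stream's 'finish' event (rather than the read stream's 'end') ensures the file is fully flushed to disk before `main()` returns, and closing the write stream when the source errors avoids leaving a dangling file handle.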
@c121914yu |
@rattrayalex, I would like to ask about streaming the audio response in a Client Component in Next.js. Despite searching for the past day, I have been unable to find a solution.
https://github.com/labring/FastGPT/blob/main/projects/app/src/web/common/utils/voice.ts
I haven't brought a computer with me recently, so I can't copy the code easily. You can refer to my code for client streaming through fetch and the MediaSource API. However, I have found that this API has some compatibility issues on Apple products.
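A minimal sketch of that pattern (not the FastGPT code; `/api/voice` here is a hypothetical server route like the ones discussed below that streams back MP3 audio):

```ts
// Play a streamed TTS response in the browser with fetch + MediaSource.
async function playStreamedSpeech(text: string) {
  const mediaSource = new MediaSource();
  const audio = new Audio(URL.createObjectURL(mediaSource));

  mediaSource.addEventListener('sourceopen', async () => {
    const sourceBuffer = mediaSource.addSourceBuffer('audio/mpeg');
    const response = await fetch(`/api/voice?text=${encodeURIComponent(text)}`);
    const reader = response.body!.getReader();

    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      sourceBuffer.appendBuffer(value);
      // Wait until the buffer finishes updating before appending more.
      await new Promise((resolve) =>
        sourceBuffer.addEventListener('updateend', resolve, { once: true })
      );
    }
    mediaSource.endOfStream();
  });

  // Playback starts as soon as enough audio is buffered.
  await audio.play();
}
```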
You're gonna need a polyfill for that.
Can someone please help me. How do I use this or something similar to have an API route handler (endpoint) and call it from the frontend component in Next.js? I am basically trying to rebuild the TTS functionality that is in ChatGPT.
Hey Aleksa, stumbled upon this because I'm building it myself. If you still need help... I re-wrote the above example as a simple API route (pages router, /api/voice.js):
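A minimal sketch of what such a route can look like, following the streaming pattern from the earlier comments (the model, voice, and query handling here are illustrative assumptions):

```js
// pages/api/voice.js — streams the MP3 bytes straight through to the client.
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export default async function handler(req, res) {
  const response = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: String(req.query.text ?? ''),
  });

  res.setHeader('Content-Type', 'audio/mpeg');
  // Same pipe approach as the examples above (in TS you'd cast response.body).
  response.body.pipe(res);
}
```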
To play it locally from your client side, it's simple:
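Something along these lines (a sketch, using the `/api/voice` route above):

```js
// Point an Audio object straight at the route; the browser begins
// playback while the response is still streaming in.
const audio = new Audio('/api/voice?text=' + encodeURIComponent('Hello world'));
audio.play();
```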
Hi LeakedDave and Aleksa, I implemented the simple API route example with Node.js/Express and hosted it in several environments (Google Firebase, Google App Engine). While the streaming works, I observed a strange thing: the audio starts playing only after 6s to 8s. I tried many things on the servers (increasing memory, moving to a closer region) but no luck. Any idea?
Honestly I'm not sure. If possible, I would suggest just hosting a Next.js API for this; I haven't tested it with vanilla Express at all. It sounds like your hosting doesn't support streaming, since a 6-7 second wait would be the full audio, I think.
@LeakedDave oh, you are right: Firebase Functions and Google App Engine don't support streaming. Thanks for putting me on the right path. I tried it and it works; the stream starts in 3s, or 5s on a cold start. A much better user experience. Here is the code if someone needs it:
```js
/* global fetch */
/* global awslambda */
import { pipeline } from "node:stream/promises";

// Assumes the standard awslambda.streamifyResponse wrapper for
// Lambda response streaming; the request options elided in the
// original snippet are marked with ellipses.
export const handler = awslambda.streamifyResponse(async (event, responseStream) => {
  console.log("Query params" + event["queryStringParameters"]["text"]);
  // console.log("event json: " + JSON.stringify(event));

  const textToTTS = event["queryStringParameters"]["text"];
  if (!textToTTS) {
    // ... (handle the missing-text case)
  }

  const rs = await fetch('https://api.openai.com/v1/audio/speech', {
    // ... (method, auth header, and JSON body with model/voice/input)
  });

  await pipeline(rs.body, responseStream);
});
```
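One deployment note (an assumption about the setup above, not something stated in the comment): `awslambda.streamifyResponse` only streams when the function is invoked through a Lambda Function URL configured with invoke mode RESPONSE_STREAM; behind API Gateway's buffered proxy integration the response is delivered all at once.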