Hi!
Yep, that's possible. You can use the `on_audio_chunk` callback to get the generated audio chunks directly. Demo (it still writes the audio to files just to verify everything works, but you can skip that part if you don't need it):

```python
import wave
import numpy as np
import torch

# Accumulators for all chunks
all_bytes = bytearray()   # Raw byte data
all_numpy_chunks = []     # List of NumPy arrays
all_tensor_chunks = []    # List of PyTorch tensors


def process_chunk(chunk_bytes):
    """
    Processes each chunk and accumulates the data for writing later.
    """
    # Convert bytes to NumPy array
    audio_data_numpy = np.frombuffer(chunk_bytes, dtype=np.int16)

    # Convert NumPy array to PyTorch tensor
    audio_pytorch_tensor = torch.from_numpy(audio_data_numpy)

    # Accumulate data for later writing
    all_bytes.extend(chunk_bytes)                   # Add raw bytes
    all_numpy_chunks.append(audio_data_numpy)       # Add NumPy array
    all_tensor_chunks.append(audio_pytorch_tensor)  # Add PyTorch tensor


if __name__ == "__main__":
    from RealtimeTTS import TextToAudioStream, CoquiEngine

    # Initialize the engine and stream
    engine = CoquiEngine()
    stream = TextToAudioStream(engine, muted=True)

    # Feed the text and play the audio (muted), invoking process_chunk per chunk
    stream.feed("Hello World")
    stream.play(on_audio_chunk=process_chunk, muted=True)

    # Retrieve audio parameters
    format, channels, sample_rate = engine.get_stream_info()

    # Write the accumulated raw byte data to a wave file
    with wave.open("audio_data_from_bytes.wav", 'wb') as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # int16 => 2 bytes
        wf.setframerate(sample_rate)
        wf.writeframes(all_bytes)

    # Write the accumulated NumPy data to a wave file
    combined_numpy = np.concatenate(all_numpy_chunks)  # Combine all NumPy arrays
    with wave.open("audio_data_from_numpy.wav", 'wb') as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(combined_numpy.tobytes())

    # Write the accumulated PyTorch tensor data to a wave file
    combined_tensor = torch.cat(all_tensor_chunks)  # Combine all tensors
    with wave.open("audio_data_from_tensor.wav", 'wb') as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(combined_tensor.numpy().tobytes())
    engine.shutdown()
```

The key part is this line:

```python
stream.play(on_audio_chunk=process_chunk, muted=True)
```

This tells the stream to call `process_chunk` with each audio chunk as soon as it is generated; `muted=True` keeps the audio from also being played through the speakers.

If you need audio details (like sample rate, format, etc.), you can call:

```python
format, channels, sample_rate = engine.get_stream_info()
```
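If you don't need the files at all, the whole pattern boils down to collecting the raw bytes in memory. Here's a minimal sketch, assuming the engine emits 16-bit PCM as in the demo above; the float32 normalization at the end is just one common next step for downstream processing, not part of the RealtimeTTS API:

```python
import numpy as np
from RealtimeTTS import TextToAudioStream, CoquiEngine

raw = bytearray()

engine = CoquiEngine()
stream = TextToAudioStream(engine, muted=True)
stream.feed("Hello World")

# Collect every generated chunk in memory instead of writing it anywhere
stream.play(on_audio_chunk=lambda chunk: raw.extend(chunk), muted=True)
engine.shutdown()

# int16 PCM -> float32 in [-1.0, 1.0] (assumes 16-bit samples, per the demo)
audio = np.frombuffer(bytes(raw), dtype=np.int16).astype(np.float32) / 32768.0
print(f"Captured {audio.shape[0]} samples")
```

From there you can hand `audio` to whatever downstream code you have, without any disk I/O.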