[Feature request] Support for Quantized ONNX Model Conversion for Stream Inference #4043
Labels: feature request, wontfix
🚀 Feature Description
Is there support in Coqui TTS for converting models to a quantized ONNX format for streaming inference? Quantization would reduce model size and inference latency, which matters for real-time applications.
Solution
Implement a workflow or tool within Coqui TTS that converts a TTS model to a quantized ONNX format in a single step.
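As a rough sketch of what such a native workflow could look like (everything below is hypothetical: `export_onnx` on the API object, the `quantize` flag, and `weight_type` are proposed names, not an existing Coqui TTS API):

```python
# Hypothetical API sketch -- these helpers do not exist in Coqui TTS today.
from TTS.api import TTS

# Load a pretrained model (TTS.api.TTS is the existing high-level entry point).
tts = TTS(model_name="tts_models/en/ljspeech/vits")

# Proposed one-call export: write the ONNX graph plus a quantized variant.
tts.export_onnx(
    output_path="vits.onnx",
    quantize=True,          # hypothetical flag: also emit an int8 model
    weight_type="int8",     # hypothetical: target weight precision
)
```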
Alternative Solutions
Currently, external tools such as ONNX Runtime or TensorRT can quantize a model after it has been exported to ONNX, but having this feature natively would streamline the process.
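For reference, this is roughly what the external route looks like today with ONNX Runtime's dynamic quantization (assuming a float32 `vits.onnx` has already been exported, e.g. via `torch.onnx.export`):

```python
# Post-hoc dynamic quantization with ONNX Runtime.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="vits.onnx",        # previously exported float32 model
    model_output="vits.int8.onnx",  # where to write the quantized model
    weight_type=QuantType.QInt8,    # quantize weights to signed 8-bit ints
)
```

The quantized model can then be served for streaming through a regular `onnxruntime.InferenceSession`. Dynamic quantization only quantizes weights, so no calibration data is needed, at the cost of some accuracy versus static quantization.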
Additional context
Any existing documentation or insights on this topic would be appreciated. Thank you!