supertone-inc / supertonic

Supertonic — Lightning Fast, On-Device, Accurate TTS

Supertonic is a lightning-fast, on-device text-to-speech (TTS) system designed for local inference with minimal overhead. Powered by ONNX Runtime, it runs entirely on your device: no cloud, no API calls, no privacy concerns.

📰 Update News

2026.04.29 - 🎉 Supertonic 3 released with 31-language support, improved reading accuracy, fewer repeat/skip failures, and v2-compatible public ONNX assets. Demo | Models

2026.01.22 - Voice Builder is now live! Turn your voice into a deployable, edge-native TTS with permanent ownership.

2026.01.06 - 🎉 Supertonic 2 released with 5-language support. The v2 code path is preserved on the release/supertonic-2 branch.

2025.12.10 - Added the supertonic PyPI package. Install it with pip install supertonic. For details, see the supertonic-py documentation.

2025.12.10 - Added 6 new voice styles (M3, M4, M5, F3, F4, F5). See Voices for details.

2025.12.08 - ONNX models optimized with OnnxSlim are now available on Hugging Face Models.

2025.11.24 - Added Flutter SDK support with macOS compatibility.


Quick Start

Install the Python SDK and generate speech immediately. On the first run, Supertonic downloads the model assets from Hugging Face automatically.

pip install supertonic

from supertonic import TTS

# First run downloads the model from Hugging Face automatically.
tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")
text = "A gentle breeze moved through the open window while everyone listened to the story."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")
tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
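
Since the Quick Start call returns the audio duration, a natural follow-up is to measure how fast synthesis runs relative to the audio it produces (the real-time factor, where values below 1.0 mean faster than real time). The sketch below is a minimal, hedged illustration: `real_time_factor` and `fake_synthesize` are hypothetical helpers rather than part of the supertonic package, and the sketch assumes the synthesis callable returns a `(wav, duration_in_seconds)` pair as the Quick Start suggests.

```python
import time
from typing import Callable, Tuple


def real_time_factor(
    synthesize: Callable[[str], Tuple[object, float]], text: str
) -> Tuple[float, float, float]:
    """Time one synthesis call and return (elapsed_s, audio_s, rtf)."""
    start = time.perf_counter()
    _wav, audio_seconds = synthesize(text)
    elapsed = time.perf_counter() - start
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    # RTF = wall-clock time spent / seconds of audio produced.
    return elapsed, float(audio_seconds), elapsed / float(audio_seconds)


def fake_synthesize(text: str) -> Tuple[list, float]:
    # Hypothetical stand-in for a real engine: pretends that every
    # character yields 0.05 s of silent audio.
    return [0.0] * len(text), 0.05 * len(text)


elapsed, audio_s, rtf = real_time_factor(fake_synthesize, "hello world")
print(f"{audio_s:.2f}s of audio in {elapsed:.4f}s (RTF {rtf:.4f})")
```

To time the real engine, one could pass something like `lambda t: tts.synthesize(t, voice_style=style, lang="en")` in place of `fake_synthesize`, and exclude the first call from the measurement because it may include the model download.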

Getting Started

First, clone the repository:

git clone https://github.com/supertone-inc/supertonic.git
cd supertonic

Prerequisites

Before running the examples, download the ONNX models and preset voices, and place them in the assets directory:

Note: The Hugging Face repository uses Git LFS. Make sure Git LFS is installed and initialized before cloning or pulling large model files.

# macOS:
brew install git-lfs && git lfs install
# Generic: see https://git-lfs.com for installers
git lfs install
git clone https://huggingface.co/Supertone/supertonic-3 assets
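
If the clone runs without Git LFS initialized, the large files arrive as small text stubs instead of real model weights. The following sketch shows one way to detect that. It relies on the fact that a Git LFS pointer file begins with the line `version https://git-lfs.github.com/spec/v1`; the helper name `find_lfs_pointers` is hypothetical, and the `assets` path is simply the directory used in the clone command above.

```python
from pathlib import Path
from typing import List

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(root: Path) -> List[Path]:
    """Return files under root that are still Git LFS pointer stubs."""
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers


if __name__ == "__main__":
    assets = Path("assets")
    if not assets.is_dir():
        print("assets/ not found; clone the model repository first")
    else:
        stubs = find_lfs_pointers(assets)
        if stubs:
            print("These files are LFS pointers, not real data:")
            for stub in stubs:
                print(f"  {stub}")
        else:
            print("No LFS pointer stubs found under assets/")
```

If stubs are reported, running `git lfs install` followed by `git lfs pull` inside the assets directory should fetch the actual files.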

Technical Details

  • Runtime: ONNX Runtime for cross-platform inference.
  • Browser Support: onnxruntime-web for client-side inference.
  • Batch Processing: Supports batch inference for improved throughput.
  • Audio Output: Outputs 16-bit WAV files.
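
To make the 16-bit WAV output format concrete, here is a minimal sketch that uses only Python's standard `wave` module to write float samples as 16-bit PCM and read the header back. It does not show Supertonic's own writer: `write_pcm16_wav` and `describe_wav` are hypothetical helpers, and the 16000 Hz sample rate is an arbitrary choice for the demo, not a documented property of the models.

```python
import math
import os
import struct
import tempfile
import wave
from typing import Dict, List


def write_pcm16_wav(path: str, samples: List[float], sample_rate: int) -> None:
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


def describe_wav(path: str) -> Dict[str, float]:
    """Read basic format information from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "bits_per_sample": wav_file.getsampwidth() * 8,
            "sample_rate": rate,
            "duration_s": wav_file.getnframes() / rate,
        }


# Demo: one second of a 440 Hz tone at an assumed 16000 Hz sample rate.
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
demo_path = os.path.join(tempfile.gettempdir(), "tone_demo.wav")
write_pcm16_wav(demo_path, tone, rate)
info = describe_wav(demo_path)
print(info)
```

Calling `describe_wav("output.wav")` on a file produced by the Quick Start is a quick way to confirm its bit depth and duration.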

Why Supertonic?

  • Blazingly Fast: Optimized for low-latency, on-device speech generation across desktop, browser, and edge deployments.
  • Lightweight: Compact ONNX assets designed for efficient local execution.
  • On-Device Capable: Complete privacy and zero network dependency once the model assets are on the device.