Build an AI Audio Translator in Python on Telnyx Inference
Many AI apps now combine voice, language models, and generated audio. I built a small Python example that shows the full loop: take an audio file, transcribe it, translate the transcript with an LLM, and generate speech from the translation.
Repo: https://github.com/team-telnyx/telnyx-code-examples/tree/main/ai-content-translator-python
What it does
The app exposes a Flask API for translating spoken content. You send it an audio file and a target language, and it returns the original transcript, the translated text, and generated audio of the translation. So instead of translating only text, the example shows a practical speech-to-speech style workflow.
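To make the three outputs concrete, here is a minimal sketch of how a response like that could be modeled and serialized. The class name, field names, and the choice to base64-encode the audio are my assumptions for illustration; the repo's README defines the actual response format.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class TranslationResult:
    """The three outputs described above. Field names are illustrative only."""

    transcript: str        # the original speech as text
    translated_text: str   # the transcript in the target language
    audio: bytes           # synthesized speech for the translated text

    def to_json(self) -> str:
        # JSON cannot carry raw bytes, so this sketch base64-encodes the audio.
        payload = {
            "transcript": self.transcript,
            "translated_text": self.translated_text,
            "audio_base64": base64.b64encode(self.audio).decode("ascii"),
        }
        # ensure_ascii=False keeps non-Latin translations readable in the output.
        return json.dumps(payload, ensure_ascii=False)
```

A real service might instead return a URL to the audio file or stream it separately; inlining it as base64 just keeps the sketch to a single response.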
Why this pattern is useful
This kind of flow is useful for apps that need multilingual voice experiences, such as customer support tools, education apps, internal enablement content, voice agents, media localization, accessibility workflows, and product tutorials in multiple languages. The important part is that each step stays understandable. Speech-to-text, translation, and text-to-speech are separate pieces, so you can debug or replace one part without rewriting the whole app.
How the example works
The app uses Telnyx APIs for the voice and AI parts of the workflow. At a high level:
1. Upload source audio
2. Transcribe the audio
3. Send the transcript to an LLM for translation
4. Generate speech from the translated text
5. Return text plus audio output
That gives you a clean starting point for building your own multilingual AI workflow.
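The steps above can be sketched as a small pipeline in which each stage is passed in as a plain function. This is not the repo's code: `translate_audio`, `PipelineOutput`, and the three stage signatures are hypothetical names I chose to show the shape of the flow, and the actual Telnyx calls are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; each one would wrap a real API call.
Transcriber = Callable[[bytes], str]      # audio bytes -> transcript
Translator = Callable[[str, str], str]    # (text, target language) -> translated text
Synthesizer = Callable[[str], bytes]      # text -> audio bytes


@dataclass
class PipelineOutput:
    transcript: str
    translated_text: str
    audio: bytes


def translate_audio(
    audio: bytes,
    target_language: str,
    transcribe: Transcriber,
    translate: Translator,
    synthesize: Synthesizer,
) -> PipelineOutput:
    """Run speech-to-text, translation, and text-to-speech in order."""
    if not audio:
        raise ValueError("audio is empty")
    transcript = transcribe(audio)                        # step 2
    translated = translate(transcript, target_language)   # step 3
    speech = synthesize(translated)                       # step 4
    return PipelineOutput(transcript, translated, speech)  # step 5
```

Because the stages are injected, you can exercise the flow with stand-ins before wiring up any API, for example `translate_audio(b"...", "es", lambda a: "hello", lambda t, lang: "hola", lambda t: t.encode())`, and later swap one stage without touching the other two.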
Try it
Clone the repo:
git clone https://github.com/team-telnyx/telnyx-code-examples.git
cd telnyx-code-examples/ai-content-translator-python
Install dependencies and set up your environment:
pip install -r requirements.txt
cp .env.example .env
python app.py
Then call the translation endpoint with an audio file and target language. Check the README for the exact request shape: https://github.com/team-telnyx/telnyx-code-examples/tree/main/ai-content-translator-python
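As a rough idea of what such a call could look like from Python, the sketch below builds a multipart upload using only the standard library. The URL, route, and form field names are placeholders I made up, not values taken from the repo, so replace them with whatever the README specifies.

```python
import uuid
from urllib import request

# Assumptions for illustration only; the README defines the real values.
ASSUMED_URL = "http://localhost:5000/translate"  # hypothetical host, port, and route
ASSUMED_FILE_FIELD = "audio"                      # hypothetical form field name
ASSUMED_LANGUAGE_FIELD = "target_language"        # hypothetical form field name


def build_translate_request(
    audio: bytes,
    filename: str,
    target_language: str,
    url: str = ASSUMED_URL,
) -> request.Request:
    """Build (but do not send) a multipart/form-data POST with audio and a language."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{ASSUMED_LANGUAGE_FIELD}"\r\n\r\n'
        f"{target_language}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{ASSUMED_FILE_FIELD}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return request.Request(
        url,
        data=head + audio + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

With the app running, you would read the audio file into bytes, build the request, and send it with `request.urlopen(req)`. If the README shows a JSON body or different field names, only the constants and the body construction need to change.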
Why I like this example
It is a useful pattern for anyone building AI apps where the interface is not just text. Text-only LLM demos are helpful, but a lot of real user experiences involve audio: people speaking, systems responding, and content moving across languages. This example keeps the workflow small enough to understand while still showing how speech-to-text, LLM translation, and text-to-speech fit together in one app. The Telnyx code examples repo is also structured to be agent-readable, so coding agents can inspect the examples, understand the API patterns, and help you extend them into fuller applications.
Resources:
- Code example
- Telnyx Developer Docs