Fluid, natural voice translation with Gemini 3.5 Live Translate

Gemini 3.5 Live Translate is our latest audio model, delivering near real-time speech-to-speech translation in over 70 languages.

Twenty years ago, translation at Google began as one of our pioneering machine learning experiments to turn the science of language into the magic of human connection. That experiment has come a long way: more than a trillion words are now translated for billions of users across our products every month.

Today, we’re taking our next step with the release of Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation. The model automatically detects 70+ languages and generates smooth, natural-sounding translated speech that preserves the speakers’ intonation, pacing and pitch.

Unlike turn-by-turn systems that wait for the speaker to finish before responding, 3.5 Live Translate generates speech continuously, balancing the trade-off between waiting for more context to improve quality and translating immediately to stay in sync with the speaker. It delivers fluid audio without awkward pauses and stays just a few seconds behind the speaker throughout the session.
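To make the trade-off concrete, here is a toy sketch of a fixed-lookahead ("wait-k") schedule, a simple policy from the simultaneous translation literature. It is only an illustration of waiting versus translating immediately, not a description of how 3.5 Live Translate works internally. The function name and the idea of counting speech in abstract "units" are assumptions made for this example.

```python
from typing import List


def wait_k_schedule(num_source: int, num_target: int, k: int) -> List[int]:
    """Toy wait-k policy (illustrative only, hypothetical helper).

    Returns, for each translated unit, how many source units have been
    heard at the moment that unit is emitted. The translator stays k
    units behind the speaker until the speaker finishes.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    schedule = []
    for emitted in range(num_target):
        # Before emitting the next unit, listen until k units ahead of
        # what has been emitted so far, or until the speaker is done.
        heard = min(emitted + k, num_source)
        schedule.append(heard)
    return schedule


# A small lookahead keeps the translation close to the speaker.
low_latency = wait_k_schedule(num_source=5, num_target=5, k=2)

# A lookahead as long as the utterance behaves like a turn-by-turn
# system: nothing is emitted until the speaker has finished.
turn_based = wait_k_schedule(num_source=5, num_target=5, k=5)

print(low_latency)
print(turn_based)
```

A larger k gives the translator more context per emitted unit at the cost of a longer delay, which is the balance the paragraph above describes.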

Gemini 3.5 Live Translate is rolling out starting today across Google products:

  • For developers in public preview via the Gemini Live API and Google AI Studio
  • For enterprises in private preview starting this month in Google Meet
  • For everyone via Google Translate on Android and iOS

Build with 3.5 Live Translate

Gemini 3.5 Live Translate processes speech as it’s streamed, enabling a more seamless connection across languages. The model handles multilingual input without any manual language configuration, and its noise robustness helps applications cope with loud, unpredictable environments. You can use these capabilities to provide live interpretation for multilingual calls, meetings, lessons, broadcasts and more.
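Because the model consumes speech as a stream, a client typically sends audio in small chunks rather than as one finished recording. The sketch below shows only that client-side chunking step and makes no API calls. The 16 kHz, 16-bit mono PCM format, the 100 ms chunk size, and the helper name `chunk_pcm` are assumptions chosen for illustration; check the Gemini Live API documentation for the formats it actually accepts.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM samples
CHUNK_MS = 100           # assumption: send 100 ms of audio at a time


def chunk_pcm(audio: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Split raw PCM audio into fixed-duration chunks for streaming.

    The final chunk may be shorter than the others if the audio length
    is not an exact multiple of the chunk size.
    """
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(audio), chunk_bytes):
        yield audio[start:start + chunk_bytes]


# One second of silence stands in for microphone input.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = list(chunk_pcm(one_second))
print(len(chunks), len(chunks[0]))
```

In a real application each chunk would be handed to the streaming session as soon as it is captured, so translation can begin before the speaker has finished.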

Developer platforms such as Agora, Fishjam, LiveKit, Pipecat, and Vision Agents build on the Gemini Live API so that developers can build and deploy voice translation apps with ease. These integrations handle the complex real-time media streaming infrastructure, so developers can focus on the user experience.

Our partners at Grab are testing the model to enable multilingual communication in near real-time between drivers and travelers at pickups. These users make over 10 million voice calls per month through Grab.

Experience 3.5 Live Translate in your video meetings

Speech translation in Google Meet will soon use 3.5 Live Translate, improving the experience by:

  • Offering 70+ languages, up from the previous limit of just five.
  • Enabling conversations across more than 2,000 language combinations in one meeting, where previously translation was only available to and from English.
  • Updating the interface to provide instant access to speech translation.
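As a quick sanity check on the language-combination figure, the arithmetic can be worked out directly. This sketch assumes exactly 70 languages (the text says "70+", so the real counts may be higher) and assumes a "combination" means a pair of distinct languages; neither assumption is confirmed by the announcement.

```python
from math import comb, perm

LANGUAGES = 70  # assumption: treat "70+" as exactly 70

# Unordered pairs: French/Japanese counted once.
unordered_pairs = comb(LANGUAGES, 2)

# Ordered pairs: French-to-Japanese and Japanese-to-French counted separately.
ordered_pairs = perm(LANGUAGES, 2)

print(unordered_pairs)  # 2415
print(ordered_pairs)    # 4830
```

Under these assumptions, the unordered count of 2,415 is consistent with the "2,000+" figure quoted above.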

Get 3.5 Live Translate in the Google Translate app on Android or iOS

The model is also rolling out globally in the Google Translate app on both Android and iOS. When using the Live translate feature, simply connect any pair of headphones to experience more seamless translation that mirrors the speaker’s tone across 70+ languages.

For Android users, we’re also starting to roll out a new ‘listening mode’ with 3.5 Live Translate that lets you hear translations directly through your phone’s earpiece. Simply hold your phone to your ear just like a regular call, and the translated audio streams straight to you.

Watermarked with SynthID

All audio generated by our models is watermarked with SynthID. This imperceptible watermark is woven directly into the audio output so that AI-generated content remains detectable, which helps prevent misinformation.