altic-dev / FluidVoice
FluidVoice is an open-source voice-to-text dictation app for macOS with on-device AI enhancement. Install with Homebrew: brew install --cask fluidvoice. Manual download: latest release.
Important: This project is free and open source under GPLv3. If FluidVoice is useful to you, please star the repository — it helps visibility and keeps development going.
Support FluidVoice: If FluidVoice makes your day a little easier, you can support its continued development on GitHub Sponsors.
What’s New in 1.6.0
- Insanely fast Parakeet: Rebuilt Parakeet implementation with pretty much zero delay between speaking and seeing words on screen.
- Fluid Intelligence: Fully local AI model for on-device dictation enhancement. No cloud, no API keys, no data leaving your Mac.
- Better Theming: Adaptive light/dark theme with a compact toolbar switcher.
- Refreshed Onboarding: Language-first voice engine setup, real dictation tryout, and AI enhancement setup in one clean pass.
Warning: Based on early feedback, Fluid Intelligence may cause you to unsubscribe from other dictation apps and save money. You’ve been warned.
Fluid Intelligence
FluidVoice is fully open source under GPLv3. Fluid Intelligence is a separate, privately maintained local AI runtime that powers advanced on-device dictation enhancement (smart formatting, context-aware capitalization, and post-processing), all running locally on your Mac.
The app works great on its own with any supported speech model and optional cloud AI providers. Fluid Intelligence adds a fully local, private AI layer for users who want on-device enhancement without sending data anywhere.
We’re keeping Fluid Intelligence private for now so we can sustainably offer the core dictation experience for free. This may change in the future.
Features
- Fluid Intelligence: On-device AI enhancement for smart formatting, context-aware capitalization, and post-processing, all running locally on your Mac with zero data leaving your machine.
- Command Mode: Control your Mac by voice: launch apps, run shortcuts, trigger system actions, and automate workflows without touching the keyboard.
- Write Mode: Write or rewrite text directly in any text field across any app. Select text and rewrite it, or dictate new content inline.
- Live Preview: Real-time transcription overlay with notch support, so you see words appear as you speak.
- Multiple Speech Models: Nemotron Speech 3.5, Parakeet Flash, Parakeet TDT v3 & v2, Cohere Transcribe, Apple Speech, and Whisper. Pick the model that fits your language and latency needs.
- AI Enhancement: Optional post-processing via OpenAI, Groq, custom providers, or local Fluid Intelligence for cleaner, more accurate transcripts.
- Audio History: Optional local recording history with budget controls and ZIP export, so you can review past dictations without cloud storage.
- Today-Usage Stats: Daily usage tracking at a glance with a stats header card and toolbar pill.
- Adaptive Theming: Light/dark theme that follows your system, with a compact toolbar switcher.
- Global Hotkey: Instant voice capture from anywhere, no app switching needed.
- Smart Typing: Direct insertion into any app via accessibility APIs for reliable, app-independent text entry.
- Menu Bar Integration: Quick access, status, and settings from the menu bar.
- Auto-Updates: Seamless updates with an optional beta channel for early previews.
- Per-App Configuration: Assign different prompt sets to different apps, so your dictation adapts to whatever you’re working in. Fully optional.
- Notch-Aware Overlay: Transcription overlay that fits cleanly around the MacBook notch, or use a standard overlay if your Mac doesn’t have one.
- Local-First: Your voice and text never leave your machine unless you opt in to a cloud AI provider.
- Fastest Parakeet on Mac: One of the fastest native implementations of Parakeet on macOS, with near-instant transcription and minimal latency.
- Configurable Overlay: Pick an overlay size for the live preview, from a compact pill to a large panel, or keep it minimal. Everything is optional.
- Everything is Optional: AI enhancement, Fluid Intelligence, audio history, analytics, and beta builds are all opt-in. The core dictation experience works out of the box with zero configuration beyond permissions and a hotkey.
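To make the Per-App Configuration idea above more concrete, here is a minimal sketch of how a prompt set might be resolved for the frontmost app, falling back to a default when no override exists. This is an illustration only and not FluidVoice's actual implementation: the `PromptSet` and `PerAppConfig` names, the `resolve` method, and the prompt texts are all hypothetical, and the bundle identifiers are just example keys.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class PromptSet:
    """Hypothetical container for the prompts applied to a dictation."""
    name: str
    instructions: str


@dataclass
class PerAppConfig:
    """Hypothetical per-app settings: a default plus optional overrides."""
    default: PromptSet
    # Keys are app bundle identifiers (example values, not a FluidVoice schema).
    overrides: Dict[str, PromptSet] = field(default_factory=dict)

    def resolve(self, bundle_id: Optional[str]) -> PromptSet:
        """Return the override for this app, or the default if none is set."""
        if bundle_id is None:
            return self.default
        return self.overrides.get(bundle_id, self.default)


# Example: a formal prompt set for a mail app, a general one everywhere else.
config = PerAppConfig(
    default=PromptSet("general", "Fix punctuation and capitalization."),
    overrides={
        "com.apple.mail": PromptSet("email", "Use a polite, formal tone."),
    },
)

print(config.resolve("com.apple.mail").name)
print(config.resolve("com.example.editor").name)
```

Because the lookup always falls back to the default, leaving the overrides empty keeps the behavior identical for every app, which matches the "fully optional" framing of the feature.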
Supported Models
(Table omitted for brevity; refer to the original source for technical specifications.)