I rewrote mp3gain in Rust — 'compatible' turned out to be three different things

If you maintain a podcast, music server, or any audio pipeline that needs consistent volume across files, there’s a non-trivial chance you have one of these somewhere:

    RUN apt-get install -y mp3gain

Or in a beets config:

    replaygain:
        backend: command
        command: mp3gain

Or buried in a cron job from 2014.

mp3gain was written in C by Glen Sawyer in 2003. Upstream development stopped around 2009. Distributors (Debian, Ubuntu, Homebrew) keep it alive with security patches, but no new features have shipped in 15+ years. Its AAC counterpart, aacgain, died around the same time and doesn’t even build cleanly on modern 64-bit systems.

People keep using both because the popular alternatives solve a related problem, not the same one. loudgain and rsgain write ReplayGain tags, and ffmpeg’s loudnorm filter normalizes loudness only by re-encoding the audio. A tag-only tool doesn’t help when your players ignore tags entirely: DJ hardware, smart speakers, most car audio, podcast publishing pipelines that bake volume into the file. For those, you need the bitstream itself rewritten: losslessly, reversibly, fast.

Rather than CVE-patch a 22-year-old C codebase one more time, I spent the last year writing mp3rgain, a Rust implementation that reads and writes the same files mp3gain does. Halfway through, I realized the word “compatible” was hiding three completely different things.

Layer 1 — byte-identical output

The strictest compatibility claim is that the output file is bit-for-bit identical:

    cp original.mp3 a.mp3 && cp original.mp3 b.mp3
    mp3gain -g 2 a.mp3
    mp3rgain -g 2 b.mp3
    sha256sum a.mp3 b.mp3
    # → same hash

To get there, the Rust implementation has to match every detail of the C version’s bitstream rewrite: synchronization word detection, MPEG version dispatch, side-information size calculation (which differs by MPEG version × channel mode), and bit-level reads/writes that span byte boundaries. More than once, I wanted to “clean up” something the C code did awkwardly. Every time, I had to remind myself: the moment I lose byte-identical output, I lose the right to call this a drop-in replacement.
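Two of those details are easy to sketch in isolation. The block below is a minimal illustration, not mp3rgain’s actual code: the names `MpegVersion`, `side_info_size`, `read_u8_at`, and `write_u8_at` are hypothetical. It shows the Layer III side-information size table (32/17 bytes for MPEG1 stereo/mono, 17/9 bytes for MPEG2 and MPEG2.5) and an 8-bit read/write at an arbitrary bit offset, which is what rewriting an 8-bit field like global_gain needs when it is not byte-aligned.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum MpegVersion {
    V1,
    V2,
    V25,
}

/// Layer III side-information size in bytes (hypothetical helper name).
fn side_info_size(version: MpegVersion, mono: bool) -> usize {
    match (version, mono) {
        (MpegVersion::V1, false) => 32,
        (MpegVersion::V1, true) => 17,
        (MpegVersion::V2 | MpegVersion::V25, false) => 17,
        (MpegVersion::V2 | MpegVersion::V25, true) => 9,
    }
}

/// Read 8 bits starting at `bit_pos` (MSB-first). The field may straddle two bytes.
/// Panics if the field runs past the end of `buf`.
fn read_u8_at(buf: &[u8], bit_pos: usize) -> u8 {
    let byte = bit_pos / 8;
    let shift = bit_pos % 8;
    if shift == 0 {
        buf[byte]
    } else {
        (buf[byte] << shift) | (buf[byte + 1] >> (8 - shift))
    }
}

/// Write 8 bits starting at `bit_pos`, leaving all neighbouring bits untouched.
fn write_u8_at(buf: &mut [u8], bit_pos: usize, value: u8) {
    let byte = bit_pos / 8;
    let shift = bit_pos % 8;
    if shift == 0 {
        buf[byte] = value;
    } else {
        let keep_hi = 0xFFu8 << (8 - shift); // leading bits of the first byte to preserve
        let keep_lo = 0xFFu8 >> shift; // trailing bits of the second byte to preserve
        buf[byte] = (buf[byte] & keep_hi) | (value >> shift);
        buf[byte + 1] = (buf[byte + 1] & keep_lo) | (value << (8 - shift));
    }
}

fn main() {
    // Two arbitrary bytes standing in for part of a frame.
    let mut frame = [0b1010_1010u8, 0b1111_0000];
    let old = read_u8_at(&frame, 4);
    write_u8_at(&mut frame, 4, old.saturating_add(2));
    println!("side info: {} bytes", side_info_size(MpegVersion::V1, false));
    println!("field {} -> {}", old, read_u8_at(&frame, 4));
}
```

The write path masks rather than overwrites, because the bits on either side of the field belong to other side-info fields and must come out unchanged for the output to stay byte-identical.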

There’s a CI script (scripts/compatibility-test.sh) that diffs SHA-256 hashes between both tools across MPEG1/2/2.5, mono/stereo/joint stereo, CBR/VBR, and a range of gain values. If even one case mismatches, the PR doesn’t merge.
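A hash tells you that two outputs differ, not where. When a case fails, a byte-level comparison is a useful follow-up. The sketch below is not part of the project’s CI script; `first_mismatch` and `compare_files` are hypothetical helpers that report the offset of the first differing byte between two output files.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Offset of the first differing byte, or None if the slices are identical.
/// If one slice is a strict prefix of the other, the shorter length is reported.
fn first_mismatch(a: &[u8], b: &[u8]) -> Option<usize> {
    if let Some(i) = a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        return Some(i);
    }
    if a.len() != b.len() {
        Some(a.len().min(b.len()))
    } else {
        None
    }
}

/// Compare two files byte for byte (reads both fully into memory).
fn compare_files(a: &Path, b: &Path) -> io::Result<Option<usize>> {
    Ok(first_mismatch(&fs::read(a)?, &fs::read(b)?))
}

fn main() -> io::Result<()> {
    // Stand-in files; in practice these would be the outputs of the two tools.
    let dir = std::env::temp_dir();
    let (a, b) = (dir.join("compat_a.bin"), dir.join("compat_b.bin"));
    fs::write(&a, [0xFFu8, 0xFB, 0x90, 0x00])?;
    fs::write(&b, [0xFFu8, 0xFB, 0x92, 0x00])?;
    match compare_files(&a, &b)? {
        None => println!("identical"),
        Some(off) => println!("first difference at byte {}", off),
    }
    Ok(())
}
```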

Layer 2 — tag interoperability

mp3gain stores undo information in APEv2 tags:

    mp3gain_undo: -3,-2,N
    mp3gain_minmax: 100,148

If I run mp3gain -g 2, then later mp3rgain -u, the undo has to work, and vice versa. This is a different layer from byte-identical output: it’s about the metadata block, not the audio frame data. mp3rgain reads and writes the same APEv2 fields with the same string format.

There’s one intentional break: after -u, mp3gain leaves an empty APEv2 tag block in place (probably because rewriting it would shift downstream frame offsets). mp3rgain removes the tag completely. The audio data is identical either way and the bidirectional undo property still holds, so I judged this as still “compatible enough.”
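To make the string format concrete, here is a minimal parsing sketch for the two tag values shown above. It is an illustration under assumptions, not mp3rgain’s parser: the names `UndoInfo`, `parse_undo`, and `parse_minmax` are hypothetical, the first two undo fields are read as left and right channel adjustments, and the third field is assumed to be `N` or `W` (a wrap flag). Signed, zero-padded values such as `+002` are accepted as well.

```rust
#[derive(Debug, PartialEq)]
struct UndoInfo {
    left: i32,
    right: i32,
    wrap: bool, // assumption: third field is "W" when wrapping was used, "N" otherwise
}

/// Parse an undo value such as "-3,-2,N". Returns None on any malformed input.
fn parse_undo(value: &str) -> Option<UndoInfo> {
    let mut parts = value.trim().split(',');
    let left = parts.next()?.trim().parse::<i32>().ok()?;
    let right = parts.next()?.trim().parse::<i32>().ok()?;
    let wrap = match parts.next()?.trim() {
        "N" => false,
        "W" => true,
        _ => return None,
    };
    if parts.next().is_some() {
        return None; // reject trailing fields rather than guess at their meaning
    }
    Some(UndoInfo { left, right, wrap })
}

/// Parse a min/max value such as "100,148" into (min, max) global_gain.
fn parse_minmax(value: &str) -> Option<(u8, u8)> {
    let (min, max) = value.trim().split_once(',')?;
    Some((min.trim().parse().ok()?, max.trim().parse().ok()?))
}

fn main() {
    println!("{:?}", parse_undo("-3,-2,N"));
    println!("{:?}", parse_minmax("100,148"));
}
```

Returning `None` on anything unexpected is a deliberate choice for a sketch like this: an undo that silently misreads the tag would corrupt the file it is meant to restore.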

Layer 3 — text protocol

mp3gain -o (no argument) prints a tab-separated table (columns are aligned with spaces here for readability):

    File      MP3 gain  dB gain  Max Amplitude  Max global_gain  Min global_gain
    song.mp3  0         0.0      17234          148              100

beets parses this with regex. So do an unknown number of personal scripts that have run unmodified for a decade. Change the column order, the header text, or the separator, and you break all of them silently.

mp3rgain emits the exact same header, a single println! line at main.rs:1275:

    println!("File\tMP3 gain\tdB gain\tMax Amplitude\tMax global_gain\tMin global_gain");

New structured output lives behind -o json, opt-in, never the default.
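To see why the header and separator are load-bearing, consider what a strict consumer of this output looks like. The sketch below is a hypothetical parser (the `Row` struct and the `parse_row` and `parse_table` functions are illustrative names, not code from beets or mp3rgain). It rejects the whole table if the header differs by a single character, which is the failure mode a changed header would trigger in downstream scripts.

```rust
const HEADER: &str = "File\tMP3 gain\tdB gain\tMax Amplitude\tMax global_gain\tMin global_gain";

#[derive(Debug, PartialEq)]
struct Row {
    file: String,
    mp3_gain: i32,
    db_gain: f64,
    max_amplitude: f64,
    max_global_gain: u8,
    min_global_gain: u8,
}

/// Parse one data row; exactly six tab-separated columns are required.
fn parse_row(line: &str) -> Option<Row> {
    let cols: Vec<&str> = line.split('\t').collect();
    if cols.len() != 6 {
        return None;
    }
    Some(Row {
        file: cols[0].to_string(),
        mp3_gain: cols[1].parse().ok()?,
        db_gain: cols[2].parse().ok()?,
        max_amplitude: cols[3].parse().ok()?,
        max_global_gain: cols[4].parse().ok()?,
        min_global_gain: cols[5].parse().ok()?,
    })
}

/// Parse the whole table; the first line must match HEADER exactly.
fn parse_table(text: &str) -> Option<Vec<Row>> {
    let mut lines = text.lines();
    if lines.next()? != HEADER {
        return None;
    }
    lines.filter(|l| !l.is_empty()).map(parse_row).collect()
}

fn main() {
    let text = format!("{}\nsong.mp3\t0\t0.0\t17234\t148\t100\n", HEADER);
    println!("{:?}", parse_table(&text));
}
```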

What I deliberately didn’t keep

Compatibility isn’t free, and not every quirk is worth preserving:

  • AAC support: mp3gain has none. mp3rgain rewrites AAC global_gain in place (the same idea aacgain used) and stores undo info in MP4 freeform metadata atoms because APEv2 doesn’t fit MP4 containers.
  • -o json and --dry-run: new flags for automated pipelines. Preview safely, then apply, something the original CLI didn’t really support.
  • ID3v2 RVA2 / TXXX ReplayGain tags (-s i): opt-in. foobar2000, mpd, and other ReplayGain-aware players read these; APEv2 tags are invisible to them.

Migrating: what it actually looks like

For most pipelines, migration is one substitution.

Shell scripts:

    sed -i 's/\bmp3gain\b/mp3rgain/g' your_pipeline.sh

Dockerfile: replace the apt-installed binary with a 2 MB static image:

    - RUN apt-get install -y mp3gain && rm -rf /var/lib/apt/lists/*
    - ENTRYPOINT ["mp3gain"]
    + FROM ghcr.io/m-igashi/mp3rgain:latest

That’s it. The image is FROM scratch with a musl-static binary: no shell, no glibc, no apt cache to clean.

beets: change one line in ~/.config/beets/config.yaml:

    - command: mp3gain
    + command: mp3rgain

The full migration guide is at docs/migrating-from-mp3gain.md, with sed patterns, CI snippets, and the apt/dnf/pacman/brew/winget/cargo install matrix.

Why bother

Three reasons that mattered to me:

  1. Memory safety. mp3gain’s history includes a stream of CVEs, mostly heap overflows in the side-info parser. Patching those in 2025 means tracking down a long-quiet maintainer’s intent. A Rust rewrite removes that whole class of bug from the picture.
  2. AAC. Most personal libraries on Apple platforms are AAC, and there’s been no working tool to volume-normalize them losslessly since aacgain stopped building. DJ hardware, car audio, and smart speakers all ignore ReplayGain tags, so tag-only tools don’t help.
  3. Distribution. Static binary in a 2 MB image, plus packages on Homebrew / Winget / AUR / PPA / Docker / Cargo. No “build from source on this niche distro” required.

M-Igashi / mp3rgain

Lossless MP3/AAC volume adjustment: a modern mp3gain / aacgain replacement written in Rust.

mp3rgain adjusts MP3 and AAC volume without re-encoding by modifying the global_gain field in each frame. This preserves audio quality.