Announcing Zstandard in Rust
2026-06-01 | Author: Folkert de Vries
Over the past year, we’ve been quietly working on our third compression project. After zlib and bzip2, we’re now taking on zstd with libzstd-rs-sys, and we are proud to announce its first release. Zstd is a compression format designed with modern CPUs in mind. It is both much faster than gzip and able to compress better. It is already widely used, and we expect that it will continue to gradually replace gzip for web traffic.
Why though?
Using zstd in Rust is already supported via the zstd crate, so why bother with a whole new implementation?
Portability
One practical advantage of Rust is that it is straightforward to write portable software. Currently the zstd crate compiles the C code from source, which requires a C toolchain for the target and requires that the C code support the target at all. Setting up a C toolchain on Windows or for WebAssembly can be a challenge, and isn’t needed with a pure-Rust implementation. For Rust programmers, using dependencies written in Rust is a superior experience.
Drop-in replacement
Additionally, as with zlib and bzip2, we support compiling libzstd-rs-sys into a drop-in compatible C library. Hence we are, or aim to be, an alternative to the C reference implementation.
Strengthening the ecosystem
The C reference implementation is maintained by Meta, and contributing to it requires signing a contributor agreement with them. We believe that an independent, performant and compatible implementation strengthens the open source ecosystem.
Current state
The reference implementation was initially translated using c2rust, and we have since completed the cleanup work for decompression and the dictionary builder. We test our Rust code (compiled into a C static library) with the reference implementation’s test suite. We additionally use fuzz testing and Miri, so we’re confident in the correctness of our implementation. The pre-release is available here: github.com/trifectatechfoundation/libzstd-rs-sys/releases/tag/v0.0.1-prerelease.2. This work has also had ecosystem benefits: we’ve found several limitations of Miri (that are now resolved) and made contributions to Clippy. A more complete write-up of our recent contributions can be found here.
The cost of memory safety
By default, the decompression performance of our implementation is a few percent slower than the C reference implementation. We benchmark each merge into main in our benchmark suite. We believe we can justify this regression, though, because with the unsafe-performance-experimental feature flag enabled we match C performance. This feature flag disables four bounds checks where data from the input is used to index into a data structure. For most users, a ~3% performance regression is likely an acceptable price to pay for increased memory safety. If you really do need that last bit of performance, you can enable the flag at your own risk. Its behavior in these four locations matches C, which also does not check the bounds and appears to run just fine in many production systems.
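The trade-off can be illustrated with a small sketch. This is not code from libzstd-rs-sys: the function name `lookup` and the table contents are hypothetical, and only the feature name is taken from the text. It assumes that such a check is an ordinary slice access that is swapped for an unchecked one when the feature is enabled.

```rust
/// Look up a table entry using an index that was derived from untrusted input.
///
/// Hypothetical illustration only: `lookup` is not a function of the real library.
pub fn lookup(table: &[u16], index_from_input: usize) -> Option<u16> {
    if cfg!(feature = "unsafe-performance-experimental") {
        // Opt-in path: skip the bounds check, as the C code does.
        // If the input is corrupt and the index is out of range,
        // this read is undefined behavior.
        Some(unsafe { *table.get_unchecked(index_from_input) })
    } else {
        // Default path: checked access. A corrupt input produces `None`
        // in this sketch instead of an out-of-bounds read. How the real
        // library reports such a failure is not shown here.
        table.get(index_from_input).copied()
    }
}
```

With the feature disabled (the default), `lookup(&[10, 20, 30], 7)` returns `None`. With it enabled, the same call would read out of bounds, which is the risk the flag name advertises.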
Future
We are looking for funding for the compression portion of this library. Because of code sharing between compression and decompression, we have looked at the compression code a bit, but most of the cleanup work still needs to be done. We did set up benchmarks to ensure compression performance does not unexpectedly regress, and as mentioned we already use the reference implementation’s test suite to check that we produce the correct result. The remaining work is listed in Milestone 4: Encoder implementation. If you’d like to support our work, please contact us; see trifectatech.org/support.
Ecosystem integration
We have our own fork of the zstd crate that uses libzstd-rs-sys instead of the C library. We’d like to upstream this at some point. For the most commonly used APIs this is straightforward. For the experimental features we run into some mismatches where zstd-safe uses an enum but we must use a struct for FFI safety.
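A minimal sketch of why an enum can be a problem at an FFI boundary: a C caller may pass any integer, but a Rust enum holding a value that is not one of its declared discriminants is undefined behavior. A transparent wrapper struct with named constants accepts every bit pattern and can be validated afterwards. The names `ModeEnum` and `ModeRaw` are hypothetical and do not correspond to types in zstd-safe or libzstd-rs-sys.

```rust
use std::convert::TryFrom;

/// Enum style: convenient in safe Rust, but only the listed values are valid.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModeEnum {
    Fast = 0,
    Best = 1,
}

/// Struct style: same size and layout as a `u32`, and any value is valid,
/// so it is safe to receive from C even when the value is unexpected.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ModeRaw(pub u32);

impl ModeRaw {
    pub const FAST: Self = Self(0);
    pub const BEST: Self = Self(1);
}

/// Checked conversion from the FFI-safe struct to the enum.
impl TryFrom<ModeRaw> for ModeEnum {
    type Error = u32;

    fn try_from(raw: ModeRaw) -> Result<Self, Self::Error> {
        match raw {
            ModeRaw::FAST => Ok(ModeEnum::Fast),
            ModeRaw::BEST => Ok(ModeEnum::Best),
            // Unknown values are returned to the caller instead of being
            // reinterpreted as an enum variant.
            ModeRaw(other) => Err(other),
        }
    }
}
```

A wrapper crate that exposes the enum could convert at the boundary with `ModeEnum::try_from(raw)`, which is one way the two representations might be reconciled.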
Thanks to our sponsors
The work on the decompression side has been funded by Chainguard, Astral and NLnet Foundation, and we’re grateful for their support! Sovereign Tech Agency invested in the dictionary builder; thank you!
About
Trifecta Tech Foundation is a non-profit and a Public Benefit Organisation (501(c)(3) equivalent) that creates open-source building blocks for critical infrastructure software. Our initiatives on data compression, time synchronization, and privilege boundary impact the digital security of millions of people. If you’d like to support our work, please contact us; see trifectatech.org/support.
Astral builds high-performance developer tools for the Python ecosystem: Ruff, an extremely fast Python linter, and uv, an extremely fast Python package manager, both written in Rust. Astral’s mission is to make the Python ecosystem more productive. Learn more at astral.sh.
NLnet Foundation is a recognised philanthropic non-profit foundation. The foundation stimulates network research and development in the domain of Internet technology. The articles of association for the NLnet Foundation state: “to promote the exchange of electronic information and all that is related or beneficial to that purpose”. The preferred instrument of NLnet is awarding microgrants to small, independent projects supporting independent researchers and developers. Read more on nlnet.nl.
Chainguard builds trusted open source software for a secure-by-default stack. Read more on chainguard.dev.
Further Information
- Data compression initiative on Trifecta Tech Foundation
- libzstd-rs GitHub Repository
- For inquiries, please contact Erik Jonkers, contact@trifectatech.org