iddqd, or the hardest kind of unsafe Rust
2 Jun 2026 | Rain Paharia | Engineer
I’m the main author of iddqd, a Rust library for maps (named after the Doom cheat code) where keys are borrowed from values. At Oxide we use it extensively in Omicron, our control plane: the software that sits at the heart of every Oxide rack, provisions resources like compute and storage for our customers, and ensures the rack stays up and running over time. iddqd maintains in-memory indexes of the kinds of large records that show up everywhere in a system like that, such as disks or sled inventories. As a result, it must be correct: if it misbehaves, our control plane can malfunction in ways that are unpredictable and hard to diagnose.
iddqd consists of a fair amount of unsafe code underneath. There’s been some recent concern over the amount of unsafe code in Rust rewrites, so I thought I’d write about some of the unsafe code in iddqd and how we try to tame it.
What problem does iddqd solve?
With Rust’s standard library maps, keys are stored separately from values. Let’s say you want to store a map of records keyed by an email address. With std::collections::BTreeMap, you might write something like:
use std::collections::BTreeMap;

// Email is typically a newtype which validates that the email address is well-formed.
// In this example, we alias it to String for simplicity.
type Email = String;

struct User {
    name: String,
    age: u8,
}

let mut users = BTreeMap::<Email, User>::new();
users.insert(
"alice@example.com".to_string(),
User { name: "Alice".to_string(), age: 30 },
);
users.insert(
"bob@example.com".to_string(),
User { name: "Bob".to_string(), age: 35 },
);
This approach has what I consider to be a pretty major downside: the key (email) is not stored in the same struct as the value (the rest of the record). How would you handle this? One way is to pass around both the email and the user, for example with get_key_value:
fn process_user(email: &Email, user: &User) { /* ... */ }
let (email, user) = users.get_key_value("alice@example.com").unwrap();
process_user(email, user);
As an extension, you could define a struct which combines both at fetch time:
struct UserRecord<'a> { email: &'a Email, user: &'a User, }
let (email, user) = users.get_key_value("alice@example.com").unwrap();
let record = UserRecord { email, user };
In practice, this gets quite unwieldy when you have lots of different types of records that need this kind of treatment.
Alternatively, you could duplicate the email across the key and the value:
struct User { email: Email, name: String, age: u8, }
let mut users = BTreeMap::<Email, User>::new();
let email = "alice@example.com".to_string();
users.insert(
email.clone(),
User { email, name: "Alice".to_string(), age: 30 },
);
But that has the risk that the emails stored in the key and the value fall out of sync.
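To make that risk concrete, here is a minimal sketch using only the standard library. The helper name `build_out_of_sync_map` is made up for illustration; the point is only that safe code can edit the copy of the email inside the value through `get_mut`, and the map has no way to notice.

```rust
use std::collections::BTreeMap;

type Email = String;

struct User {
    email: Email,
    name: String,
    age: u8,
}

/// Hypothetical helper: builds a map in which the email is duplicated across
/// the key and the value, then edits only the copy inside the value.
fn build_out_of_sync_map() -> BTreeMap<Email, User> {
    let mut users = BTreeMap::<Email, User>::new();
    let email = "alice@example.com".to_string();
    users.insert(
        email.clone(),
        User { email, name: "Alice".to_string(), age: 30 },
    );

    // Nothing stops safe code from changing the value's copy of the email.
    // The key stored in the map keeps the old address.
    if let Some(user) = users.get_mut("alice@example.com") {
        user.email = "alice@example.org".to_string();
    }
    users
}

fn main() {
    let users = build_out_of_sync_map();
    let (key, user) = users.get_key_value("alice@example.com").unwrap();

    // The key and the record now disagree about Alice's email address.
    assert_ne!(key, &user.email);
    // A lookup by the address stored in the record finds nothing.
    assert!(users.get(&user.email).is_none());

    println!(
        "key = {key}, record = {} ({}, age {})",
        user.email, user.name, user.age
    );
}
```

Every caller that mutates a record has to remember to keep the two copies in agreement, and the type system offers no help with that.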
iddqd provides a better alternative. With IdOrdMap, you can write something like:
struct User { email: Email, name: String, age: u8, }
// You implement a trait which tells iddqd how to extract the key from the
// record.
impl IdOrdItem for User {
// The key type can borrow from the value!
type Key<'a> = &'a Email;
fn key(&self) -> Self::Key<'_> { &self.email }
// This macro expands into a small amount of boilerplate to work around
// a Rust borrow checker limitation.
id_upcast!();
}
Then, you can insert records:
let mut users = IdOrdMap::<User>::new();
users.insert_unique(User { email: "alice@example.com".to_string(), name: "Alice".to_string(), age: 30, }).unwrap();
users.insert_unique(User { email: "bob@example.com".to_string(), name: "Bob".to_string(), age: 35, }).unwrap();
// And look them up by email:
assert_eq!(&users.get("alice@example.com").unwrap().name, "Alice");
assert_eq!(users.get("bob@example.com").unwrap().age, 35);
At Oxide, this has proven to be an invaluable pattern: many of the records in our control plane are quite large (think database lookups), and iddqd is a great fit for them. It also comes with several other features that directly address pain points we’ve dealt with at Oxide. A few worth calling out:
- First-class support for complex keys that borrow from more than one field, without having to resort to workarounds like dynamic dispatch.
- Maps with two or three keys per item, each independently indexing the same record, without the usual pattern of maintaining multiple maps by hand.
- Serde implementations that serialize as sequences rather than maps, so that non-string keys can be serialized in JSON[1]. Importantly, these implementations reject duplicate keys. (For backwards compatibility, serialization as maps is also supported.)
Like many of our other crates, iddqd is built for Oxide’s needs but is generally useful to the Rust community. You’re welcome to use it in your own projects as well.
On unsafe Rust
Before I move on, I want to talk about what it means for unsafe to exist in a memory-safe language. The big concern is undefined behavior (UB): a program behaves in an unpredictable way because core assumptions made by the language or compiler have been violated. Rust calls an abstraction sound if no safe code can use it to cause UB, and unsound otherwise.
The vast majority of Rust code is safe, which (assuming that any unsafe used by the safe code is sound) means that no UB can occur. However, due to the fundamental undecidability of static analysis (see Rice’s theorem), it is impossible for the Rust compiler—or any kind of algorithm that terminates—to accept all programs without UB and reject all programs with it. Therefore, when writing such an algorithm, its authors have to make a decision: do they reject some programs without UB, accept some programs with UB, or both? The Rust compiler does the first: within the context of safe Rust, it rejects all programs with UB but also some without UB. (This is the correct choice!)
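As a small illustration of a program without UB that safe Rust still rejects (this example is mine and is not taken from iddqd), consider taking mutable references to two different elements of a slice. The two elements never overlap, but the borrow checker tracks the borrow at the level of the whole slice and refuses. The hypothetical `swap_first_two` below shows the rejected form in a comment, followed by a form that compiles:

```rust
/// Hypothetical example: swaps the first two elements of a slice.
fn swap_first_two(values: &mut [u32]) {
    if values.len() < 2 {
        return;
    }

    // Rejected by the borrow checker ("cannot borrow as mutable more than
    // once at a time"), even though index 0 and index 1 are distinct:
    //
    //     let a = &mut values[0];
    //     let b = &mut values[1];
    //     std::mem::swap(a, b);

    // Accepted: split_at_mut hands back two non-overlapping mutable slices.
    let (head, tail) = values.split_at_mut(1);
    std::mem::swap(&mut head[0], &mut tail[0]);
}

fn main() {
    let mut values = [1, 2, 3];
    swap_first_two(&mut values);
    assert_eq!(values, [2, 1, 3]);

    // Slices with fewer than two elements are left untouched.
    let mut single = [7];
    swap_first_two(&mut single);
    assert_eq!(single, [7]);
}
```

The accepted version works because `split_at_mut` is a safe function whose standard library implementation uses unsafe code internally to vouch for the fact that the two halves are disjoint.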
What if your program is in the no-man’s-land where it gets rejected even though you know it doesn’t have UB? To express those kinds of programs, the Rust compiler provides an escape hatch: the unsafe keyword. By writing it,