The S in interoperability
This is a blog post about standards, their proliferation, and the issues that may arise. My first involvement with standards was just as a reader, trying to better understand complicated code or unexpected behavior in a protocol. After a while, I also got involved and helped clarify certain things to ensure that implementations align on the same behavior in edge cases.
Eventually, I found myself co-editing a specification: Subresource Integrity (SRI), which was published as a W3C Recommendation in 2015. The core idea behind SRI is that you include third-party JavaScript together with a SHA-2 digest of the expected file. If the downloaded file does not match the expected digest, the browser will not execute the script. This allows using a fast CDN for JavaScript without giving it full control over the scripts on your page, which reduces the security risk.
The standard format for these digests has the form sha(size)-(base64 encoding of the digest). While computing the hash digest is rather straightforward, base64 comes in two encoding alphabets: first, a-zA-Z0-9/+, and second, the URL-safe variant, which uses a-zA-Z0-9_-. The specification examples all used the former. Approximately ten years after publication, in 2025, we still found a bug.
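To make the two alphabets concrete, here is a minimal Python sketch that builds a digest string in the format described above. The helper name sri_value and the sample script bytes are assumptions made for illustration, and SHA-384 is picked as one possible size; none of this is taken from the specification text.

```python
import base64
import hashlib


def sri_value(content: bytes, urlsafe: bool = False) -> str:
    """Return 'sha384-<base64 digest>' for the given bytes (hypothetical helper)."""
    digest = hashlib.sha384(content).digest()
    # The only difference between the two variants is the alphabet:
    # '+' and '/' in standard base64 become '-' and '_' in base64url.
    encode = base64.urlsafe_b64encode if urlsafe else base64.b64encode
    return "sha384-" + encode(digest).decode("ascii")


script = b"alert('Hello, world.');"  # sample input, chosen for illustration
standard = sri_value(script)
url_safe = sri_value(script, urlsafe=True)

print(standard)
print(url_safe)
```

Both strings describe the same digest. Whether they actually differ depends on whether the encoded digest happens to contain one of the two characters that the alphabets disagree on, which is part of why the difference can go unnoticed for a long time.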
While investigating a compatibility report about Firefox not properly supporting a website, we found that the core issue was actually with a different browser. The other browser liberally accepted both types of encoding, which resulted in websites expecting base64 and base64url to be supported interchangeably. The page did not work in Firefox because Firefox did not accept all hashes the website wanted the browser to check, revealing a minor security issue.
The real fix would have been for the standard to clarify that the base64url variant is incorrect and for the other browser engine to change its behavior. But due to (somewhat unrelated) issues around the proliferation of standards, web compatibility, and the unfortunate market dominance of certain browsers, we went down the other road. To support existing web content, we changed the standard to acknowledge that both types of encoding are considered valid representations.
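As a rough illustration of what "accepting both encodings" can mean in practice, the following Python sketch checks content against a single digest string and maps the URL-safe characters back to the standard alphabet before decoding. The function name matches_integrity is hypothetical, and the sketch is a simplification under stated assumptions: it handles exactly one hash per string and does not attempt to reproduce the parsing algorithm of the specification or of any browser.

```python
import base64
import hashlib
import hmac

# Assumption for this sketch: only these three algorithm labels are accepted.
SUPPORTED = ("sha256", "sha384", "sha512")


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Check content against one 'sha(size)-(digest)' string (hypothetical helper)."""
    # Split at the first '-' only; a base64url digest may contain more of them.
    algorithm, _, encoded = integrity.partition("-")
    if algorithm not in SUPPORTED:
        return False
    # Map the base64url alphabet onto the standard one, so both are accepted.
    normalized = encoded.replace("-", "+").replace("_", "/")
    try:
        expected = base64.b64decode(normalized, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    actual = hashlib.new(algorithm, content).digest()
    return hmac.compare_digest(actual, expected)


body = b"console.log(1);"
digest = hashlib.sha256(body).digest()
print(matches_integrity(body, "sha256-" + base64.b64encode(digest).decode()))
print(matches_integrity(body, "sha256-" + base64.urlsafe_b64encode(digest).decode()))
```

A strict verifier would skip the normalization step and reject any digest containing "-" or "_". That one line is the entire difference between the two behaviors described above.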
This example shows that it can take multiple years for subtle differences to appear. Interoperable specifications can establish a shared understanding along a “happy path”, but not necessarily in adversarial settings. In addition, standards need continuous maintenance and active stakeholders who ensure that implementations remain interoperable and secure over time.
From specification to standard: A specification is at first just a write-up, an idea of how something could be better: how it should behave, how it works, and what the data structures, the algorithms, and their interactions look like. Anyone can come up with a grammar, a parser, and a resulting data structure. To become a standard, this specification needs a shared agreement that is also widely and consistently implemented.
This works best with iterative co-design of the spec and the implementations, along with intense discussion of corner cases. Some may go further and use shared test suites. This will lead to interoperability (interop), but it still requires constant maintenance and observation of the ecosystem beyond individual implementations.
While interop is asymptotic and requires a shared agreement over time, security demands understanding: a broader reach that requires the inspection of limitations and subtle boundaries. This deeper level of understanding is often missing when implementers consider a syntax “simple enough” without reading the spec. The base64 SRI case is just one example, but there are more. Many people have written their own parsers for text-based languages. You may have seen code that parses HTML with regular expressions. Other examples of supposedly “easily” parsed languages are XML, JSON, and YAML. But these implementations often make different assumptions, leading to subtle incompatibilities or even security flaws.
Parser Differentials: To be more practical, let’s look at an issue with JSON that demonstrates the impact of handling input that is ostensibly simple. Consider this JSON string and the resulting data structure: { "test": 0, "test": 1 }. When it is parsed into an object obj, what do you think obj.test will return?
Most JSON parsers are so liberal that they will happily consume two dictionary keys with the same name “test”. One implementation may simply assign obj.test twice: first with 0, and then overwrite it with 1. Another one might check for existing keys and silently reject the second “test” key, keeping the first one. The lack of rigor in the original description of JSON as a “subset of JavaScript” was acknowledged and raised as problematic in the JSON RFC (which came much later, in 2017). But to this day, many implementations allow input with duplicate dictionary keys and show divergent behavior.
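The divergence is easy to reproduce with a single parser. The sketch below uses Python's standard json module, which keeps the last value for a repeated key, and then uses its object_pairs_hook parameter to emulate two other policies. The hook functions first_wins and reject_duplicates are hypothetical names written for this illustration; they stand in for the behavior other implementations might choose and are not taken from any particular parser.

```python
import json

DOC = '{ "test": 0, "test": 1 }'

# Python's json module silently keeps the last value for a repeated key.
last_wins = json.loads(DOC)


def first_wins(pairs):
    """Keep the first value seen for each key (hypothetical policy)."""
    result = {}
    for key, value in pairs:
        result.setdefault(key, value)
    return result


def reject_duplicates(pairs):
    """Refuse input that repeats a key (hypothetical strict policy)."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


print(last_wins["test"])                                       # 1
print(json.loads(DOC, object_pairs_hook=first_wins)["test"])   # 0

try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print("rejected:", error)
```

If one component of a system validates the document under the first-wins policy and another acts on it under the last-wins policy, the two components disagree about what was validated. That disagreement is the parser differential.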
While the examples with SRI and JSON are relatively harmless, real parser differential bugs have led to code execution, authentication bypasses, and more. What do we learn from this? Perfect interoperability is not created through a specification; it needs constant maintenance. Ambiguity can only be removed through long-term commitment and regular feedback from implementations and users.
The same is true for security: the SRI bug persisted for ten years, and nobody noticed that implementations disagreed and that corner cases had been overlooked. They only aligned because of a real, user-facing issue. But these examples are not a warning sign; they are scar tissue that shows how the internet is made. Standards can only mature through vigilant maintenance: the bug reports, the spec issues being filed, the shared test cases, sometimes even the random forum complaints. All of these help to remove ambiguity and allow internet standards to mature. In the end, standards are not secure because they are written down. They are secure because people continue to question.