Canadian election databases use "canary traps"—and they work

In a world awash in high-tech security tools like passkeys, quantum-safe algorithms, and public-key cryptography, it can be refreshing to get back to the simple things… like a good old-fashioned canary trap. The canary trap is a simple tool often used to identify leakers or double agents. To make one, you simply share a document, image, or database but make tiny changes that are unique to each recipient. That way, if those changes show up verbatim in any leak of the information, you know immediately which recipient was behind the leak.

You don’t often see canary traps in the news, though they have long been a staple of spy fiction (and practice), so an account out of Canada last week caught my eye. The Canadian province of Alberta has been the site of recent drama around its electoral list, a database that contains information such as names, addresses, and voting districts for millions of citizens. Political parties can legally get access to the electoral list, though they operate under significant restrictions on how they can use the data. They cannot, for instance, share the list with a third party.

Despite this, The Centurion Project, described by the CBC as a “separatist group,” used the list to power an online database of voters. Elections Alberta, which maintains the list, went to court last week and obtained an order to shut down the Centurion site. But how had Centurion obtained the data? Elections Alberta quickly investigated and announced that the list used by Centurion was a copy of one legitimately released to the Republican Party of Alberta.

Election officials were confident in their claim because, whenever they release a copy of the electoral list, they salt it with additional but bogus entries. The fake entries inserted in the Republican Party version of the list showed up in Centurion’s online tool, too. Exactly how the data had passed from the Republican Party to Centurion remains unclear, but the canary trap enabled Elections Alberta to lean quickly on both groups. Each publicly pledged to respect the law, and Centurion took down its tool.
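
The salting approach can be sketched in a few lines of Python. This is only an illustration of the general technique, not Elections Alberta's actual method, which has not been described beyond the use of bogus entries. The secret key, the record fields, the party names, and the function names (`canary_entries`, `salted_copy`, `trace_leak`) are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held only by the list's publisher (illustrative value).
SECRET_KEY = b"demo-only-secret"


def canary_entries(recipient, count=3):
    """Derive fake records that are unique to one recipient and reproducible."""
    entries = []
    for i in range(count):
        digest = hmac.new(
            SECRET_KEY, f"{recipient}:{i}".encode(), hashlib.sha256
        ).hexdigest()
        # A real trap would use plausible-looking names; these are obvious
        # on purpose so the bogus rows are easy to spot in the demo.
        entries.append(
            {
                "name": f"Canary {digest[:12]}",
                "address": f"{int(digest[12:16], 16) % 9000 + 100} Example St",
            }
        )
    return entries


def salted_copy(real_list, recipient):
    """Return the real list plus this recipient's bogus entries."""
    return list(real_list) + canary_entries(recipient)


def trace_leak(leaked, recipients):
    """Return every recipient whose bogus entries appear in the leaked data."""
    leaked_keys = {(row["name"], row["address"]) for row in leaked}
    return [
        r
        for r in recipients
        if any((e["name"], e["address"]) in leaked_keys for e in canary_entries(r))
    ]


real_list = [
    {"name": "Alex Example", "address": "1 Main St"},
    {"name": "Sam Sample", "address": "2 Main St"},
]
recipients = ["Party A", "Party B", "Party C"]
leaked = salted_copy(real_list, "Party B")
print(trace_leak(leaked, recipients))
```

Because the fake entries are derived from a keyed hash, the publisher does not need to store each recipient's canaries. They can be regenerated on demand and compared against whatever turns up in the wild.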

Canaries—from Clancy to AI

The canary trap has been used by companies ranging from Tesla to Apple; it was even used to stop the leak of Star Trek film scripts. Though the concept comes from the world of espionage, the actual term appears to stem from Tom Clancy’s 1980s thriller, Patriot Games. The idea is to find out which “canary” did the “singing.”

Ars Technica’s own Clancy aficionado, Lee Hutchinson, tracked down his copy of the book and found the passage in which protagonist Jack Ryan explains the term after an admiral asks him, “What the devil is this Canary Trap?” Ryan says: “Well, you know about all the problems CIA has with leaks. When I was finishing off the first draft of the report, I came up with an idea to make each one unique… Each summary paragraph has six different versions, and the mixture of those paragraphs is unique to each numbered copy of the paper. There are over a thousand possible permutations, but only ninety-six numbered copies of the actual document.”

“The reason the summary paragraphs are so—well, lurid, I guess—is to entice a reporter to quote them verbatim in the public media. If he quotes something from two or three of those paragraphs, we know which copy he saw and, therefore, who leaked it. They’ve got an even more refined version of the trap working now. You can do it by computer. You use a thesaurus program to shuffle through synonyms, and you can make every copy of the document totally unique.”
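
Ryan's arithmetic can be checked with a rough sketch. The passage gives six versions per summary paragraph and ninety-six numbered copies but does not say how many summary paragraphs there are. The code below assumes four, which would give 6^4 = 1,296 mixtures and so fit "over a thousand." The checksum trick used to pick mixtures is an illustrative choice, not something from the novel: it guarantees that any three quoted paragraphs pin down a single copy.

```python
from itertools import product

VARIANTS = 6      # versions of each summary paragraph (from the passage)
PARAGRAPHS = 4    # assumption: the passage does not give this number
COPIES = 96       # numbered copies (from the passage)

total_mixtures = VARIANTS ** PARAGRAPHS  # 6**4 = 1296


def build_assignments():
    """Give each numbered copy its own mixture of paragraph versions.

    The last paragraph's version is a checksum of the others, so any two
    copies differ in at least two paragraphs. Quoting any three paragraphs
    is then enough to identify exactly one copy.
    """
    mixtures = [
        prefix + (sum(prefix) % VARIANTS,)
        for prefix in product(range(VARIANTS), repeat=PARAGRAPHS - 1)
    ]
    chosen = mixtures[::2][:COPIES]  # spread the picks across the 216 options
    return {copy_no: mix for copy_no, mix in enumerate(chosen, start=1)}


ASSIGNMENTS = build_assignments()


def candidates(observed):
    """Return copy numbers consistent with the versions seen in a leak.

    `observed` maps a paragraph index to the version index that was quoted.
    """
    return [
        copy_no
        for copy_no, mix in ASSIGNMENTS.items()
        if all(mix[p] == v for p, v in observed.items())
    ]


mix = ASSIGNMENTS[42]
print(total_mixtures)
print(candidates({0: mix[0], 1: mix[1], 3: mix[3]}))
```

Under these assumptions, a reporter who quotes three of the four paragraphs exposes exactly one numbered copy, while a single quoted paragraph only narrows the field.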

Thesaurus program? We can do better than that, Jack Ryan; this is the age of artificial intelligence! Indeed, as far back as 2021, Dartmouth professor V.S. Subrahmanian built an AI tool called (really) WE-FORGE that “automatically creates false documents to protect intellectual property such as drug design and military technology.” The goal was to create “documents that are sufficiently similar to the original to be plausible, but sufficiently different to be incorrect,” said Subrahmanian at the time. However it’s done, this old system continues to work pretty well. Just ask Elections Alberta.
