NYT slams Microsoft for building copyright-infringing supercomputer for OpenAI

In a heavily redacted court filing Thursday, The New York Times proposed to amend its copyright complaint against OpenAI and Microsoft to clarify a claim and allege that Microsoft actively encouraged OpenAI to steal NYT works by building a bespoke supercomputing system ranked among the most powerful in the world.

NYT’s motion comes after the Supreme Court sided with Cox Communications in a case where Sony tried and failed to establish that Cox, as an Internet service provider, was contributing to music piracy. That ruling set a new standard for contributory infringement: Moving forward, plaintiffs will have to prove that parties intentionally acted to induce illegal conduct. Recognizing that the legal precedent has changed, the NYT now wants to amend its complaint to align its contributory infringement claim against Microsoft with that new standard.

“Today, we asked the court for permission to file an amended complaint that further strengthens our case, clarifying our claim of contributory infringement against Microsoft based on new law and new evidence uncovered during discovery,” Graham James, an NYT spokesperson, said in a statement provided to Ars. In addition to clarifying one claim, NYT also agreed to voluntarily dismiss two claims of contributory copyright infringement and trademark dilution against all defendants.

A Microsoft spokesperson told Ars that the company views the amended complaint as “a last-ditch effort by the plaintiff to save its claim from unfavorable precedent set in other recent rulings.” But in its motion, the NYT argued that neither Microsoft nor OpenAI would be prejudiced by allowing the amended complaint. It’s proper to allow plaintiffs to revise arguments when legal standards change, the NYT argued, and the case schedule would not be set back because “The Times does not seek any additional discovery in support of its amended claims.”

“As we have long alleged, Microsoft actively encouraged OpenAI to steal our copyrighted works,” James said. “Beyond amending that claim and streamlining the case to its most potent arguments, our core claims remain the same from the day we filed this lawsuit—that Microsoft and OpenAI stole millions of The Times’s copyrighted works to compete with our products and illegally enrich themselves.”

NYT targets Microsoft supercomputer

In 2023, the NYT became the first major publisher to sue OpenAI. The prominent newspaper alleged that ChatGPT was illegally trained on its articles, infringed on its copyrights by outputting articles verbatim, and caused market harms by positioning ChatGPT as a substitute for a NYT subscription, as well as reputational harms by falsely attributing claims to NYT reporting. Additionally, ChatGPT outputs summarizing Wirecutter reviews robbed writers of commissions from lost clicks on affiliate links, the NYT alleged.

In the initial complaint, the NYT discussed Microsoft’s supercomputing systems as if they were providing generic cloud computing services. The updated complaint seeks to specify that the supercomputer was tailor-made to help OpenAI infringe and allege that it was built for the explicit purpose of training AI on copyrighted works without permission. And as the NYT alleged, its articles were more heavily weighted by this system, as both firms hoped to train models on the highest-quality journalism possible, so that level of writing could be confidently mimicked in outputs.

By building this “unusually complex” machine, Microsoft not only helped select the works that were infringed but also provided a means to seize copyrighted works without permission, the NYT alleged. “Microsoft specifically designed it for the purpose of using essentially the whole Internet—curated to disproportionately feature Times Works—to train the most capable LLM in history,” the NYT alleged.

And now Microsoft is allegedly profiting unfairly. “Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone,” the NYT alleged.

Model outputs show market harms, NYT alleged

For the NYT, outputs shared during discovery—including a huge chunk of users’ ChatGPT sessions—remain some of the strongest evidence that OpenAI and Microsoft built tools that allegedly replaced the NYT by producing near-verbatim excerpts of its copyrighted works. In some cases, users told ChatGPT they were trying to skirt paywalls and were able to see significant chunks of articles by requesting to see the “next paragraph.” In other cases, “models simply spit out several paragraphs” without such finagling.

Equally problematic for the NYT are hallucinations in which Microsoft and OpenAI models falsely cite the NYT for content that it never published. The complaint listed examples like Bing Chat citing fake quotes from Steve Forbes’ daughter Moira Forbes and ChatGPT fabricating an NYT article that never existed.
