Cloudflare’s new policy pushes AI companies to pay for publishers’ content
Cloudflare has just given the AI industry a new deadline to separate the web crawlers used for traditional search, like those behind Google Search, from the crawlers used for AI agents and training.
Starting on September 15, 2026, Cloudflare’s default settings will block “mixed-use” crawlers from any pages that host ads, the company announced on Wednesday. That means crawlers that blend search, agent use, and training will be blocked from crawling these pages by default, unless the site owner changes the settings.
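The default described above amounts to a small decision rule. Here is a minimal sketch of that rule in Python, not Cloudflare’s implementation: the `Purpose` labels, the `Crawler` type, and the `owner_allows` override are all hypothetical names chosen for illustration, and “mixed-use” is read here as search combined with at least one AI purpose.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet, Optional


class Purpose(Enum):
    SEARCH = auto()
    AGENT = auto()
    TRAINING = auto()


@dataclass(frozen=True)
class Crawler:
    # Hypothetical record of what a crawler declares it is used for.
    name: str
    purposes: FrozenSet[Purpose]

    @property
    def is_mixed_use(self) -> bool:
        # Assumption: "mixed-use" means search blended with agent use and/or training.
        return Purpose.SEARCH in self.purposes and len(self.purposes) > 1


def blocked_by_default(
    crawler: Crawler, page_has_ads: bool, owner_allows: Optional[bool] = None
) -> bool:
    """Return True if the crawler would be blocked under the described default."""
    # An explicit site-owner setting always wins over the default.
    if owner_allows is not None:
        return not owner_allows
    # Default: only mixed-use crawlers are blocked, and only on pages with ads.
    return page_has_ads and crawler.is_mixed_use
```

Under this reading, a search-only crawler is never blocked by the default, and a mixed-use crawler is blocked only on ad-supported pages whose owner has not opted back in.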
These changes to the defaults will apply to new Cloudflare customers, new sites set up by existing customers, and all existing free customers, the company says. The move could affect how AI model providers access web content for training and for powering their agentic services.
Cloudflare points out that most website owners want their content to be discoverable via search, and often through AI services as well, but they also want protection against having their intellectual property given away for free.
Cloudflare specifically calls out the “world’s largest search engine” (clearly a Google reference!) as having access to about “2x more information” than other AI companies, because the search giant makes it difficult for customers to remain discoverable without their content being used for AI.
Google has pushed back against this generalization in the past, noting that it provides a robots.txt control called Google-Extended that lets site owners opt out of having their content used for training and for AI products and services like Gemini Apps and the Vertex AI API. Using it doesn’t affect a site’s inclusion in Google Search. However, the tech giant’s flagship crawler, Googlebot, crawls for Search, which includes AI features like AI Overviews and AI Mode.
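To make the distinction concrete, here is a small sketch using Python’s standard `urllib.robotparser`. The robots.txt content and the `example.com` URL are illustrative assumptions, and the parser only simulates how a compliant reader interprets the rules; it does not reproduce Google’s own handling.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Check whether the given user-agent token may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    print("Google-Extended:", is_allowed("Google-Extended", url))
    print("Googlebot:", is_allowed("Googlebot", url))
```

The opt-out rule matches only the Google-Extended token, so Googlebot stays allowed. That is why a site using this control remains in Search, and also why the control does not cover features that Googlebot crawls for.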
“Now that the majority of traffic on the Internet is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” said Cloudflare co-founder and CEO Matthew Prince in his announcement of the news, referring to the recent milestone in which bot traffic surpassed human traffic online for the first time. That shift had not been expected to occur until next year.
“Cloudflare’s new tools and partnerships give website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent. We hope that our proposed default changes encourage mixed-use crawlers to separate out search from agent use and training,” Prince said.
While Cloudflare offers a number of products to help users launch their own AI systems, the company has also released a range of tools to give publishers more control over their content in the AI era. In recent years, Cloudflare has launched tools to combat AI bots, including a marketplace, dubbed Pay Per Crawl, that lets websites charge AI bots for scraping.
Pay Per Crawl is now evolving into “Pay Per Use,” the company said, which will allow publishers to charge AI companies when their content creates value, not just when it’s fetched. The change could also help conserve bandwidth for publishers and compute resources for AI model providers, as Cloudflare’s data suggests that over 50% of crawl traffic from AI crawlers is spent re-fetching unchanged pages.
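The cost of re-fetching unchanged pages can be illustrated with standard HTTP conditional requests, where a client sends a cached `ETag` back and the server answers `304 Not Modified` with an empty body if nothing changed. The sketch below is a self-contained simulation with no network access. It is not a description of how Pay Per Use works, which the announcement does not detail, and the helper names are hypothetical.

```python
import hashlib
from typing import List, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Derive a validator from the content; real servers may compute ETags differently.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_respond(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, bytes, str]:
    """Simulated origin: 304 with an empty body when the validator still matches."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b"", etag
    return 200, body, etag


def crawl(versions: List[bytes], conditional: bool) -> int:
    """Fetch each successive page version once; return total body bytes transferred."""
    cached_etag: Optional[str] = None
    transferred = 0
    for body in versions:
        validator = cached_etag if conditional else None
        status, payload, etag = origin_respond(body, validator)
        transferred += len(payload)
        if status == 200:
            cached_etag = etag
    return transferred
```

With a 1,000-byte page that never changes across four visits, the unconditional crawler transfers the body four times, while the conditional one transfers it once and receives empty `304` responses afterward.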
To put this into action, Cloudflare is initially working with two partners, Ceramic.ai and You.com. When publishers opt in, they’re paid when their content appears in Ceramic’s AI search results or when You.com accesses a piece of their premium content. Other AI companies can customize this model to fit how they work, Cloudflare says.