Coders are refusing to work without AI — and that could come back to bite them
In 2026, you cannot pry AI coding tools out of developers’ viselike grip, researchers have discovered. But while AI is undoubtedly helping coders produce code faster, it may not be producing better code, other researchers warn. And that could cause problems for those coders down the road.
Specifically, in February 2026, respected AI research lab METR published a surprising revelation: most developers will no longer work without AI, even on a limited number of tasks. METR had hoped to update some groundbreaking research on AI coding productivity that it published a few months earlier, in 2025. In that study, researchers measured how long open source developers took to complete tasks by hand versus with AI. While the developers reported that AI was making them more productive, they were shocked to learn it actually slowed them down. Sure, it generated code faster, but they then spent extra time finding and fixing errors, steering the AI, and waiting for it to complete tasks.
When METR set out to repeat the experiment to measure advances in AI and coder proficiency, it couldn’t. Devs weren’t willing to participate “because they do not wish to work without AI,” even just for the study, the researchers confessed. Instead, METR published a survey in May that let technical employees self-report their AI productivity gains. Not surprisingly, they perceived that AI made them twice as valuable to their organizations. But recent headlines about the wild expense of so-called tokenmaxxing, coupled with a smattering of recent research, make such self-perceptions dubious.
Tokenmaxxing, or treating the number of tokens a person consumes as a proxy for productivity with AI, has been the trend of 2026 so far. And it may already be over. Amazon shut down its internal token-tracking leaderboard, called Kirorank, after employees gamed it by using AI agents excessively and running up costs, the Financial Times reported this week. The employees proved that AI use does not automatically translate to increased productivity.
Uber blew through its 2026 AI budget within the first four months of the year, The Information reported. COO Andrew Macdonald recently said on a podcast that such spending hadn’t led to a measurable increase in projects or productivity. AI-generated code also doesn’t necessarily reduce ongoing code maintenance needs, and may even increase them, programmer and author James Shore elegantly argued in a blog post that went viral on Hacker News. “You write code twice as quick now? Better hope you’ve halved your maintenance costs,” he wrote. “Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.”
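Shore's arithmetic can be illustrated with a toy model. The sketch below is not taken from his post: it assumes a team with a fixed monthly capacity, where every shipped feature adds a constant monthly upkeep cost and whatever hours remain go to new features. The function name and all figures (160 hours of capacity, 20 or 10 hours per feature, 1 or 0.5 hours of upkeep) are hypothetical and chosen only to show the shape of the tradeoff.

```python
from typing import Optional


def months_until_maintenance_dominates(
    hours_to_write_feature: float,
    upkeep_hours_per_feature_per_month: float,
    capacity_hours_per_month: float = 160.0,  # hypothetical team capacity
    threshold: float = 0.5,                   # share of capacity spent on upkeep
    max_months: int = 1200,
) -> Optional[int]:
    """Return the first month in which upkeep reaches `threshold` of capacity.

    Toy model (an assumption, not a measured result): upkeep is paid first,
    and all remaining hours are spent writing new features.
    """
    features = 0.0
    for month in range(1, max_months + 1):
        upkeep = features * upkeep_hours_per_feature_per_month
        if upkeep >= threshold * capacity_hours_per_month:
            return month
        # Hours left after upkeep are converted into new features.
        features += (capacity_hours_per_month - upkeep) / hours_to_write_feature
    return None


# Baseline: 20 hours per feature, 1 hour of upkeep per feature per month.
baseline = months_until_maintenance_dominates(20.0, 1.0)
# Writing twice as fast with unchanged upkeep per feature.
faster = months_until_maintenance_dominates(10.0, 1.0)
# Writing twice as fast AND halving upkeep per feature.
faster_and_cheaper = months_until_maintenance_dominates(10.0, 0.5)

print(baseline, faster, faster_and_cheaper)
```

Under these assumptions, doubling writing speed alone brings forward the month in which upkeep eats half the team's capacity, while doubling speed and halving per-feature upkeep puts the team back on the baseline schedule. Real maintenance costs are not constant per feature, so this shows the direction of the effect, not its size.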
There’s other evidence that AI can increase code maintenance woes. A viral tweet from Aiswarya Sankar, founder and CEO of reliability engineering agent startup Entelligence AI, proclaims that companies are spending 44% of their tokens fixing bugs that their own AI generated. Meanwhile, code-reviewing tool company Code Rabbit says it analyzed open source pull requests and found that AI-written code produced 1.7x more problems than human-written code. Those are, admittedly, self-serving stats from companies trying to sell AI code reviewing tools. Yet independent researchers have found such issues, too. Researchers from the respected Singapore Management University (SMU) published a report in April warning that “AI-generated code can introduce long-term maintenance costs into real software projects.”
Given that programmers love their AI assistants, what’s the solution? Well, those who want to sell you AI coding agents say devs can just use those agents to do the bone-wearying work of fixing code as fast as AI spits it out. That’s what Scott Wu, founder and CEO of Cognition, the maker of AI coding agent Devin, suggests. But even he admits that, while Devin can work independently, he’d currently rate its skill between that of a junior and a mid-level programmer, depending on the task. This is not a hand-it-off-and-forget-it solution.
The SMU researchers suggest a more human approach. Programmers should know what tasks AI does and doesn’t do well as deeply as they know their favorite coding languages. They need strong quality assurance systems designed for AI, and they are stuck with carefully reviewing the AI’s work as if it came from a junior dev. Meanwhile, the researchers say (and Wu agrees), humans should still be doing the big-picture work, like software architecture and security design.