The user is visibly frustrated
In this article, I try to understand why coding agents can be infuriating to use. I think the problem is their conversational UX: they behave enough like helpful colleagues to trigger our social instincts, but they don’t learn, adapt, or take responsibility the way people do, which makes their repeated mistakes feel much more frustrating than they should.
Despite the usual allegations against Italians, I’m generally a composed person. Tame, even, especially at work. Yet, lately I often find myself mildly displeased, furiously hammering on my laptop “WHAT THE FUCK DID YOU DO???”. The recipient of these tirades is, you might have guessed, a coding agent.
It’s completely pointless, I know. Coding agents are just probabilistic machines generating patches. Sometimes they’re good, sometimes they’re bad. Pick the ones you like, discard the others. No big deal, right? Well, not quite. For some reason, bad results often feel exasperating. But why am I getting mad at an algorithm? Am I the only one affected? Are coding agents surfacing a sadistic streak I didn’t know I had?
I think there’s another explanation: the conversational UX is bound to frustrate you. Coding agents pretend to be people. Of course, if you ask them directly they tell you they’re just “AI assistants with no feelings or subjective experience”, but that’s not how they behave. They talk like real people. They use a relaxed and friendly tone. They often praise you, and when they “push back” they’re gentle and attentive.
Even though, rationally, you know you’re just reading blobs of probable text, these tools lull you into feeling that you’re interacting with a person, a helpful coworker who’s a pleasure to work with. Until it’s not. As in every relationship, the cracks begin to show when things start to go wrong.
The first time you catch a mistake, you shrug. You point it out and the agent apologizes. Five minutes later, however, it makes the same mistake again. You correct it a second time, noting its recidivism, so now it also updates its memory and promises you “it will never happen again”. But it does, over and over, because these tools follow the most probable path, and in some cases no amount of HARD RULES can push them off it.
If the agent were a human colleague, you’d have good reason to feel a bit miffed. But it’s an algorithm; losing your patience is absurd. And yet, since it behaves like a colleague, the illusion ends up tripping the same emotional wires. With a colleague, the desire not to be a horrible human being restrains you, but with an agent you feel free to lash out. It’s not cathartic, however; you just feel the frustration and realize that whatever you do or say will have absolutely no effect.
I’ve been using Claude Code for the past few months, and lately I’ve noticed that, when corrected, it often reflects on where it went wrong and what it should have done instead. Maybe this is an attempt to improve how you perceive the tool. I can’t say it works for me, though. I don’t really get anything useful out of these postmortems (e.g., clues about how to rephrase my instructions), and they just end up reading as annoying filler.
Maybe I would prefer a more radical solution: drop the human pretense entirely. Make the agent sound clinical, robotic. Dispel the idea that I’m interacting with a person, and make me feel like I’m just approving or rejecting random outcomes.
Of course, “trying to behave like a human would” is the mechanism that gives LLMs their intelligence, so it makes sense that conversational interfaces emerged as the default way to interact with them. And in many ways, they work very well. Practically speaking, I probably just need to condition myself not to get caught in the illusion of speaking with a human. Though I’m not really thrilled about a future where I need to guard against the tools I use for my job.