How Far Did They Go? The Persuasive Tactics of Covert LLM Agents in a Discontinued Field Experiment

Abstract: This study analyzes a publicly released dataset from a discontinued field experiment on Reddit’s r/ChangeMyView. The intervention, conducted by unidentified external researchers and halted following ethical backlash, involved undisclosed AI-generated accounts engaging users in live debate. After public disclosure, Reddit authorized moderators to release an archive of the AI-generated comments, creating a rare opportunity to examine how large language models operated in an identity-rich deliberative forum without disclosure.

We conduct a structured content analysis of this corpus, evaluating identity performance, authority signaling, alignment strategies, and activation of cognitive heuristics. Identity targeting or adoption appears in over two-thirds of comments, alignment moves and authority claims in nearly all of them, and cognitive-bias triggers (particularly confirmation bias, representativeness, and availability) in the large majority. These patterns co-occur systematically, composing a rhetorical architecture calibrated for persuasive efficiency rather than authentic deliberative participation.

Compared against human-authored CMV counter-arguments, the agents inverted the typical distribution on every dimension: denser authority use, more adversarial alignment, and heavier reliance on external citation over experiential grounding. In such environments, the distinction between authentic and synthetic epistemic standing grows increasingly opaque, an asymmetry that disclosure mandates alone cannot address. The results point toward auditing frameworks capable of assessing how AI systems structure credibility, not merely whether they are present.
