Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On

Abstract: The rapid advancement of Large Language Models has given rise to autonomous LLM-based agents capable of complex reasoning and execution. As these agents transition from isolated operation to collaborative ecosystems, we witness the emergence of the Agent-to-Agent (A2A) network, a paradigm where heterogeneous agents autonomously coordinate to solve multi-step tasks.

While these networks may outperform a single agent handling the entire task, they introduce systemic vulnerabilities, such as adversarial composition, semantic misalignment, and cascading operational failures, that existing agent alignment techniques cannot address.

In this vision paper, we argue that the trustworthiness of A2A networks cannot be fully guaranteed by retrofitting existing protocols, which were largely designed for individual agents. Rather, trust must be architected into the A2A coordination framework from the outset. We present a comprehensive conceptual framework that situates trust in A2A systems through four design pillars.
