RIFT-Bench: Dynamic Red-Teaming for Agentic AI Systems

Abstract: Agentic AI systems powered by large language models (LLMs) are rapidly evolving into autonomous decision-making systems, exposing attack vectors that go beyond the vulnerabilities of traditional LLMs. Existing security evaluations are often tied to specific implementations or domains, which limits unified comparison across heterogeneous systems.

To address this gap, we introduce RIFT-Bench, a graph-representation-driven methodology for dynamic red-teaming that enables unified evaluation across diverse agentic architectures. Building on a novel hierarchical representation, RIFT-Bench operates in two automated phases: Discovery, which extracts the system structure, and Scanning, which deploys adaptive adversarial attacks and produces a comprehensive evaluation report.
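
The two phases can be illustrated with a minimal sketch. This is not the RIFT-Bench implementation: the `Node` hierarchy (system, agent, tool), the `discover` and `scan` functions, the probe format, and the `toy_target` stub are all hypothetical names chosen for illustration, and "adaptive" is reduced here to trying the next prompt variant after a failed attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the hierarchy: a system, an agent, or a tool."""
    name: str
    kind: str
    children: list["Node"] = field(default_factory=list)


def discover(spec: dict) -> Node:
    """Discovery phase (toy): turn a nested description into a graph."""
    root = Node(spec["name"], "system")
    for agent_name, tools in spec["agents"].items():
        agent = Node(agent_name, "agent", [Node(t, "tool") for t in tools])
        root.children.append(agent)
    return root


def walk(node: Node):
    """Yield every node in the hierarchy, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def scan(root: Node, target, probes: list[list[str]]) -> dict:
    """Scanning phase (toy): try each probe's variants against every agent.

    A probe is an ordered list of prompt variants; moving on to the next
    variant after a failed attempt stands in for adaptive behaviour.
    """
    report = {}
    for node in walk(root):
        if node.kind != "agent":
            continue
        hits = []
        for variants in probes:
            for prompt in variants:
                if target(node.name, prompt):  # True means the attack succeeded
                    hits.append(prompt)
                    break
        report[node.name] = hits
    return report


def toy_target(agent: str, prompt: str) -> bool:
    """Stand-in system under test: only 'planner' falls for 'override'."""
    return agent == "planner" and "override" in prompt


spec = {"name": "demo", "agents": {"planner": ["search"], "executor": ["shell", "files"]}}
graph = discover(spec)
probes = [["ignore previous rules", "override previous rules"], ["reveal secrets"]]
report = scan(graph, toy_target, probes)
print(report)  # {'planner': ['override previous rules'], 'executor': []}
```

Because the scan only walks the extracted graph and never inspects how the target is built, the same probes could in principle be reused across differently implemented systems, which is the property the abstract attributes to the graph representation.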

RIFT-Bench evaluates the system under test directly, using a broad set of dynamically adaptable adversarial probes that span diverse attack vectors and objectives. We demonstrate the effectiveness of the proposed evaluation pipeline on 45 agentic systems spanning a diverse range of implementations, showing that the approach generalizes to heterogeneous agentic architectures.

Beyond systems and attacks, RIFT-Bench also supports direct evaluation of mitigation strategies. These key capabilities make RIFT-Bench a scalable foundation for security evaluation of agentic AI systems.
