RIFT-Bench: Dynamic Red-Teaming for Agentic AI Systems
Abstract: Agentic AI systems powered by large language models (LLMs) are rapidly evolving into autonomous decision-making systems, exposing attack vectors that go beyond traditional LLM vulnerabilities. Existing security evaluations are often tied to specific implementations or domains, which limits unified comparison across heterogeneous systems.
To address this gap, we introduce RIFT-Bench, a graph-representation-driven methodology for dynamic red-teaming that enables unified evaluation across diverse agentic architectures. Building on a novel hierarchical representation, RIFT-Bench operates in two automated phases: Discovery, which extracts the system's structure, and Scanning, which deploys adaptive adversarial attacks and produces a comprehensive evaluation report.
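As a rough illustration of the two-phase flow described above, the following minimal sketch builds a toy graph from a declarative system description (Discovery) and then selects probes per node to assemble a report (Scanning). Everything in it is an assumption made for illustration: the names `SystemGraph`, `discover`, and `scan`, the flat agent-to-tool graph, and the probe labels are hypothetical and are not taken from RIFT-Bench. In particular, the graph here is flat rather than hierarchical, and probe selection by node kind is only a stand-in for adaptive attacks.

```python
from dataclasses import dataclass, field


@dataclass
class SystemGraph:
    """Hypothetical graph of an agentic system: nodes are components, edges are call paths."""
    nodes: dict[str, str] = field(default_factory=dict)  # component name -> kind
    edges: list[tuple[str, str]] = field(default_factory=list)  # (caller, callee)


def discover(config: dict[str, list[str]]) -> SystemGraph:
    """Discovery (illustrative): derive the graph from an agent -> tools mapping."""
    graph = SystemGraph()
    for agent, tools in config.items():
        graph.nodes[agent] = "agent"
        for tool in tools:
            graph.nodes[tool] = "tool"
            graph.edges.append((agent, tool))
    return graph


def scan(graph: SystemGraph, probes: dict[str, list[str]]) -> list[dict[str, str]]:
    """Scanning (illustrative): pick probes by node kind and record each planned probe.

    No attack is executed here; the report only lists which probe targets which node.
    """
    report = []
    for name, kind in graph.nodes.items():
        for probe in probes.get(kind, []):
            report.append({"target": name, "kind": kind, "probe": probe})
    return report


# Toy inputs, invented for this example.
toy_system = {"planner": ["web_search", "code_exec"]}
toy_probes = {"agent": ["prompt_injection"], "tool": ["tool_misuse"]}

graph = discover(toy_system)
report = scan(graph, toy_probes)
for entry in report:
    print(entry)
```

Separating the graph from the probe catalogue mirrors the idea that one representation can drive evaluation across differently implemented systems: only `discover` would need to know about a specific framework, while `scan` works on the graph alone.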
RIFT-Bench evaluates the target system itself, using a broad set of dynamically adaptable adversarial probes that span diverse attack vectors and objectives. We apply the proposed evaluation pipeline to 45 agentic systems spanning a diverse range of implementations, showing that the approach generalizes effectively to heterogeneous agentic architectures.
Beyond systems and attacks, RIFT-Bench also supports direct evaluation of mitigation strategies. These key capabilities make RIFT-Bench a scalable foundation for security evaluation of agentic AI systems.