Benchmarking Web Agent Safety under E-commerce Deceptive Interfaces

As autonomous web agents are increasingly deployed to perform real-world tasks, ensuring their safety has become a critical concern. In this work, we study web agent behavior under realistic deceptive interfaces in the e-commerce domain.

We introduce WebDecept, a lightweight and configurable plugin framework that enables controlled injection of deceptive interface patterns into existing web environments. Using WebDecept, we instantiate seven deceptive patterns commonly observed on the open web, including targeted advertisements, domain redirection, and shopping manipulation.

By injecting these patterns into the frontend during task execution, we perform a controlled evaluation of multiple multimodal web agents. Our results show that current web agents are highly susceptible to multiple classes of deceptive interfaces, and that prompt-based constraints are often insufficient to mitigate these failures.

We further analyze how the design choices of deceptive patterns influence the success of such manipulations. These findings highlight safety challenges that should be addressed as web agents are scaled toward real-world deployment.