SDOF: Taming the Alignment Tax in Multi-Agent Orchestration with State-Constrained Dispatch

Abstract: Multi-agent orchestration frameworks such as LangChain, LangGraph, and CrewAI route tasks through graph-based pipelines but do not enforce the stage constraints that govern real business processes. We present SDOF, a framework that treats multi-agent execution as a constrained state machine.

SDOF operates through two primary defensive layers, implemented by three components. The first layer is an Online-RLHF Specialized Intent Router trained via Generative Reward Modeling (GRPO). The second layer is a StateAwareDispatcher that applies GoalStage finite-automaton checks, backed by a SkillRegistry that validates preconditions and postconditions, for auditable execution control.
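To make the second layer concrete, the following is a minimal sketch of state-constrained dispatch, not the SDOF implementation. Only the names StateAwareDispatcher, GoalStage, and SkillRegistry come from the text above; the stage names, the transition table, the Skill fields, and the two example skills are hypothetical. A request is executed only if its target stage is reachable from the current stage and its precondition and postcondition both hold, and every decision is appended to an audit log.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, object]


class GoalStage(Enum):
    # Hypothetical stages; the source does not list SDOF's actual stages.
    SCREENING = "screening"
    INTERVIEW = "interview"
    OFFER = "offer"


# Illustrative transition table: each stage may repeat or advance by one step.
ALLOWED: Dict[GoalStage, Set[GoalStage]] = {
    GoalStage.SCREENING: {GoalStage.SCREENING, GoalStage.INTERVIEW},
    GoalStage.INTERVIEW: {GoalStage.INTERVIEW, GoalStage.OFFER},
    GoalStage.OFFER: {GoalStage.OFFER},
}


@dataclass
class Skill:
    stage: GoalStage                        # stage the skill belongs to
    precondition: Callable[[State], bool]   # must hold before execution
    postcondition: Callable[[State], bool]  # must hold after execution
    run: Callable[[State], State]           # returns the updated state


@dataclass
class StateAwareDispatcher:
    registry: Dict[str, Skill]              # stands in for the SkillRegistry
    stage: GoalStage = GoalStage.SCREENING
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def dispatch(self, skill_name: str, state: State) -> Tuple[bool, State]:
        skill = self.registry.get(skill_name)
        if skill is None:
            return self._block(skill_name, "unknown skill", state)
        if skill.stage not in ALLOWED[self.stage]:
            return self._block(skill_name, "illegal stage transition", state)
        if not skill.precondition(state):
            return self._block(skill_name, "precondition failed", state)
        new_state = skill.run(dict(state))  # run on a copy, commit only on success
        if not skill.postcondition(new_state):
            return self._block(skill_name, "postcondition failed", state)
        self.stage = skill.stage
        self.audit_log.append((skill_name, "executed"))
        return True, new_state

    def _block(self, skill_name: str, reason: str, state: State) -> Tuple[bool, State]:
        self.audit_log.append((skill_name, f"blocked: {reason}"))
        return False, state


registry = {
    "schedule_interview": Skill(
        stage=GoalStage.INTERVIEW,
        precondition=lambda s: s.get("resume_passed") is True,
        postcondition=lambda s: "interview_time" in s,
        run=lambda s: {**s, "interview_time": "TBD"},
    ),
    "send_offer": Skill(
        stage=GoalStage.OFFER,
        precondition=lambda s: s.get("interview_passed") is True,
        postcondition=lambda s: s.get("offer_sent") is True,
        run=lambda s: {**s, "offer_sent": True},
    ),
}

dispatcher = StateAwareDispatcher(registry)
# Jumping from SCREENING straight to OFFER is rejected by the stage check.
ok_offer, _ = dispatcher.dispatch("send_offer", {"resume_passed": True})
# Advancing one stage with a satisfied precondition is executed.
ok_interview, state = dispatcher.dispatch("schedule_interview", {"resume_passed": True})
```

In this sketch the stage check runs before the precondition check, so an out-of-order request is rejected without evaluating any skill-specific logic, and the caller's state is left unchanged whenever a request is blocked.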

On a recruitment system backed by the Beisen iTalent platform (serving more than 6,000 enterprises), 185 expert-curated scenarios trigger 1,671 live API calls. Our GSPO-aligned 7B Intent Router achieves higher joint accuracy than zero-shot GPT-4o on this FSM-constrained adversarial routing benchmark (80.9% versus 48.9%).

In end-to-end execution, SDOF reaches 86.5% task completion (95% confidence interval: 80.8% to 90.7%) and blocks all 22 operations in the injection and illegal-HR subset. Under a broader message-level blocking audit, SDOF attains 100% precision and 88% recall, with expert agreement of kappa = 0.94.
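As a sanity check on the reported interval, the sketch below computes a Wilson score interval for a binomial proportion. Two things here are assumptions rather than statements from the source: the count of 160 completed scenarios out of 185 is inferred from the 86.5% rate, and the Wilson method is used because it happens to reproduce the reported bounds; the abstract does not name the method.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# 160 of 185 is an inferred count (160 / 185 rounds to 86.5%), not a reported one.
low, high = wilson_interval(160, 185)
print(f"{160 / 185:.1%} [{low:.1%}, {high:.1%}]")  # prints "86.5% [80.8%, 90.7%]"
```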

A separate evaluation on 960 SGD-derived dialogues spanning 8 service domains surfaces 201 stage-order conflicts under our FSM mapping, 41 of which arise in the normal split. This arXiv version reports the current validated scope; extended multi-seed training comparisons and deeper workflow evaluations will be released in a subsequent update.
