Safe and Generalizable Hierarchical Multi-Agent RL via Constraint Manifold Control

Multi-agent systems are widely used in safety-critical applications that require coordinated behavior under strict safety constraints.

Existing approaches face a fundamental trade-off: learning-based methods achieve strong empirical performance but lack theoretical safety guarantees, while control-theoretic methods enforce safety but often produce overly conservative and inefficient behavior.

We propose a hierarchical multi-agent reinforcement learning framework that, under mild assumptions, enforces hard safety constraints at the low level via a constraint manifold, while enabling effective coordination through high-level policy learning.

Our approach provides theoretical safety guarantees in the multi-agent setting and yields stationary learning dynamics, thereby enabling stable and efficient training.

Empirically, our method achieves competitive performance while maintaining near-perfect safety rates, and it generalizes effectively to varying numbers of agents and obstacles.