A Structural Threshold in Decision Capacity Governs Collapse in Self-Play Reinforcement Learning

Abstract: We show that a threshold in decision capacity determines whether self-play reinforcement learning agents collapse under asymmetric rule perturbations. Across poker variants, matrix games, a dice game, and multiple learning algorithms, eliminating all positive-reach contingent decisions causes rapid convergence to a deterministic exploitation attractor, a fixed point at near-maximal loss.

Preserving even a single positive-reach contingent decision point prevents this collapse. A frozen baseline and fixed-opponent control confirm that the mechanism is co-adaptation under constraint, not the perturbation itself. The phenomenon is timing-invariant, fully reversible upon action restoration, and intensifies under function approximation.

These results establish a sharp threshold at zero reach-weighted contingent action capacity; in the tested domains, severity scales continuously with that capacity.
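To make the threshold concrete, the following minimal Python sketch computes one possible reading of reach-weighted contingent action capacity. The abstract does not give a formula, so the definition used here (the sum over decision points of reach probability times the number of alternative actions) is an assumption, as are the names `DecisionPoint`, `reach_weighted_capacity`, and `is_at_threshold` and all numeric values.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DecisionPoint:
    """A hypothetical decision point for the constrained player."""
    name: str
    reach: float       # probability that play reaches this point
    num_actions: int   # legal actions still available after the perturbation


def reach_weighted_capacity(points: Iterable[DecisionPoint]) -> float:
    """Assumed definition: sum of reach * (num_actions - 1).

    A point with a single legal action, or with zero reach,
    contributes nothing to the total.
    """
    return sum(p.reach * max(p.num_actions - 1, 0) for p in points)


def is_at_threshold(points: Iterable[DecisionPoint], tol: float = 1e-12) -> bool:
    """True when no positive-reach contingent decision remains."""
    return reach_weighted_capacity(points) <= tol


# Illustrative values only; they are not taken from the paper's experiments.
unperturbed = [
    DecisionPoint("bet_or_check", reach=1.0, num_actions=2),
    DecisionPoint("call_or_fold", reach=0.5, num_actions=2),
]
fully_constrained = [
    DecisionPoint("bet_or_check", reach=1.0, num_actions=1),  # choice removed
    DecisionPoint("call_or_fold", reach=0.0, num_actions=2),  # never reached
]
one_preserved = [
    DecisionPoint("bet_or_check", reach=1.0, num_actions=1),
    DecisionPoint("call_or_fold", reach=0.25, num_actions=2),
]

for label, points in [
    ("unperturbed", unperturbed),
    ("fully_constrained", fully_constrained),
    ("one_preserved", one_preserved),
]:
    print(label, reach_weighted_capacity(points), is_at_threshold(points))
```

Under this assumed definition, only the fully constrained configuration sits at zero capacity, which is the regime the abstract associates with collapse. The configuration that keeps a single reachable choice stays above the threshold, with a small positive capacity.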
