Universal Quantum Transformer

Abstract: Classical continuous-space neural networks fundamentally struggle to lock into exact mathematical symmetries, such as those of modular arithmetic and non-commutative algebra. To approximate these discrete logical rules, they typically rely on massive parameter scaling, yet they remain stochastically unstable even after grokking, the delayed generalization that appears long after the training data has been fit.

Here, we introduce the Universal Quantum Transformer (UQT), a quantum-native computing architecture that uses the physical properties of multi-qubit systems as a universal inductive bias for exact mathematical and algebraic reasoning. Rather than translating classical neural mechanisms into quantum circuits, our framework relies entirely on parameterized geometric phase embedding and $SU(2)$ wave interference.
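
To make the idea of phase embedding and interference concrete, the following is a minimal single-qubit sketch, not the UQT circuit itself. It assumes a residue $k \in \mathbb{Z}_{11}$ is encoded as the rotation angle $2\pi k/11$ of an $SU(2)$ rotation about the Z axis; the names `rz`, `overlap_probability`, and `decode_sum` are hypothetical and chosen only for illustration. Because consecutive rotations add their angles, the phase is periodic modulo 11, and interference at the final Hadamard peaks exactly at the correct residue.

```python
import cmath
import math

N = 11  # modulus; the angle encoding 2*pi*k/N is an illustrative assumption


def rz(theta):
    """SU(2) rotation about Z: diag(e^{-i*theta/2}, e^{+i*theta/2})."""
    return [[cmath.exp(-1j * theta / 2), 0.0],
            [0.0, cmath.exp(1j * theta / 2)]]


def matvec(m, v):
    """Apply a 2x2 matrix to a 2-component state vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


S = 1.0 / math.sqrt(2.0)
HADAMARD = [[S, S], [S, -S]]


def overlap_probability(a, b, c):
    """Probability of measuring |0> after encoding a, b and un-encoding c.

    Analytically this equals cos^2(pi * (a + b - c) / N), which is 1
    exactly when c == (a + b) mod N.
    """
    state = matvec(HADAMARD, [1.0, 0.0])      # prepare |+>
    for k in (a, b, -c):                      # phases accumulate additively
        state = matvec(rz(2.0 * math.pi * k / N), state)
    state = matvec(HADAMARD, state)           # interfere the two branches
    return abs(state[0]) ** 2


def decode_sum(a, b):
    """Return the residue whose un-encoding gives maximal constructive interference."""
    return max(range(N), key=lambda c: overlap_probability(a, b, c))
```

This is an exact state-vector calculation; on hardware the probabilities would have to be estimated from repeated measurements, and the source does not say how the UQT performs its readout.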

We demonstrate that the quantum attention circuit, operating on a compact 5-qubit substrate, learns two distinct formal classes exactly: cyclic modular arithmetic ($\mathbb{Z}_{11}$) and non-Abelian algebra (the $S_4$ permutation group). While classical attention-based networks exhibit stochastic instability at convergence, the UQT achieves mathematically exact, deterministic generalization. We refer to this phenomenon as crystallization: a step beyond grokking.
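
The two task families can be written out explicitly as complete operation tables. The sketch below is an assumption about how such datasets might be enumerated, not the authors' data pipeline; the helper names `compose`, `cayley_table`, and `is_commutative` are hypothetical. It also checks the structural difference between the two classes: addition in $\mathbb{Z}_{11}$ commutes, while composition in $S_4$ does not.

```python
from itertools import permutations


def add_mod11(a, b):
    """Group operation of Z_11: addition modulo 11."""
    return (a + b) % 11


def compose(p, q):
    """Group operation of S_4: the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def cayley_table(elements, op):
    """All (a, b, a*b) triples, i.e. the full supervised dataset for the task."""
    return [(a, b, op(a, b)) for a in elements for b in elements]


def is_commutative(elements, op):
    """True if a*b == b*a for every pair of elements."""
    return all(op(a, b) == op(b, a) for a in elements for b in elements)


Z11 = list(range(11))                    # 11 residues
S4 = list(permutations(range(4)))        # 24 permutations of four symbols

Z11_TABLE = cayley_table(Z11, add_mod11)  # 121 examples
S4_TABLE = cayley_table(S4, compose)      # 576 examples
```

A train/test split would be drawn from these tables; the source does not specify the split, so none is assumed here.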

Crucially, this framework yields large computational and memory advantages in two ways: it theoretically bypasses the quadratic bottleneck of classical self-attention, and it logarithmically compresses the required representation dimension, eliminating the over-parameterization inherent to classical networks. Finally, we deploy this architecture on noisy intermediate-scale quantum (NISQ) hardware, demonstrating its viability on current IBM Quantum computers. These results establish parameterized quantum topology as a universally superior physical substrate for exact artificial intelligence.
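
As a rough illustration of the two scaling claims, using standard counting rather than anything stated in the source: for a sequence of $n$ tokens with embedding dimension $d$, classical self-attention forms an $n \times n$ score matrix $QK^{\top}$, while a register of $q$ qubits carries $2^q$ amplitudes. The symbols $n$, $d$, and $q$ are introduced here for illustration only.

$$
\text{time}(QK^{\top}) = O(n^2 d), \qquad \text{memory}(QK^{\top}) = O(n^2), \qquad q = \lceil \log_2 d \rceil .
$$

For example, $q = 5$ qubits span $2^5 = 32$ basis states, enough to index the 24 elements of $S_4$ or the 11 residues of $\mathbb{Z}_{11}$. This counting leaves out state preparation and measurement costs, which the passage above does not quantify.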
