FRAME: Learning the Adaptation Domain with a Mixture of Fractional-Fourier Experts

Parameter-efficient fine-tuning (PEFT) reparameterizes weight updates in a fixed basis: low-rank adapters operate in the spatial domain, while a recent line of spectral methods operates in a fixed Fourier domain. We argue that the choice of domain is itself a design degree of freedom that should be learned, and that no single basis is optimal across tasks, layers, or tokens.

We introduce Fractional-Fourier Mixture of Experts, a mixture-of-experts adapter in which every expert carries a learnable fractional-Fourier order that continuously interpolates between the spatial domain (recovering vanilla LoRA) and the Fourier domain (recovering a spectral adapter). Routing tokens through experts that occupy different points on this spatial-spectral continuum lets the model place each low-rank update in the domain where it is most compact. Because fractional-Fourier operators of different orders are mutually incoherent, the experts are also naturally decorrelated, which reduces interference and improves multi-task composition.
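
To make the interpolation between the two domains concrete, the following is a minimal NumPy sketch of one simple discrete fractional Fourier transform. It is an assumption chosen for illustration, not the transform used in this work: it uses the "weighted" definition, which writes the order-a operator as a linear combination of the first four powers of the unitary DFT. The function names `frft_coeffs` and `weighted_frft` are hypothetical. Order 0 returns the input unchanged, order 1 returns the unitary DFT, and orders add under composition.

```python
import numpy as np

def frft_coeffs(a):
    """Weights c_j(a) such that F^a = sum_j c_j(a) * F^j for j = 0..3.

    The unitary DFT F satisfies F^4 = I, so its eigenvalues are
    exp(-i*pi*k/2) for k = 0..3. Raising each eigenvalue to the power a
    and projecting back onto I, F, F^2, F^3 gives these weights.
    """
    j = np.arange(4)[:, None]
    k = np.arange(4)[None, :]
    return np.exp(1j * np.pi * k * (j - a) / 2).mean(axis=1)

def weighted_frft(x, a):
    """Apply the order-a weighted fractional DFT along the last axis."""
    x = np.asarray(x, dtype=complex)
    c = frft_coeffs(a)
    f1 = np.fft.fft(x, norm="ortho", axis=-1)    # F x
    f2 = np.roll(x[..., ::-1], 1, axis=-1)       # F^2 x: index n -> (-n) mod d
    f3 = np.fft.ifft(x, norm="ortho", axis=-1)   # F^3 x = F^{-1} x
    return c[0] * x + c[1] * f1 + c[2] * f2 + c[3] * f3

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
half = weighted_frft(x, 0.5)          # a point between the two domains
full = weighted_frft(half, 0.5)       # two half steps compose to order 1
print(np.allclose(full, np.fft.fft(x, norm="ortho")))
print(np.isclose(np.linalg.norm(half), np.linalg.norm(x)))
```

Because the operator is unitary for every real order, moving a signal along the continuum preserves its norm; only how its energy is spread across coordinates changes.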

The order is a single scalar per expert, trained with a separate optimizer, and the transform is computed with an $\mathcal{O}(d\log d)$ chirp-FFT surrogate, so Fractional-Fourier Mixture of Experts adds negligible cost over standard MoE-LoRA.
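
As a rough illustration of why one scalar per expert is cheap, the sketch below wires a per-expert order into a toy MoE-LoRA forward pass. Everything structural here is an assumption made for the example rather than the formulation of this work: the sandwich form (transform, low-rank map, inverse transform), taking the real part of the result, the softmax top-k router, and the names `frft`, `expert_update`, and `moe_update`. The transform is a weighted fractional DFT built from FFTs, standing in for the chirp-FFT surrogate.

```python
import numpy as np

def frft(x, a):
    """Weighted fractional DFT of order a along the last axis.
    a = 0 is the identity; a = 1 is the unitary DFT. O(d log d) via FFTs."""
    x = np.asarray(x, dtype=complex)
    j = np.arange(4)[:, None]
    k = np.arange(4)[None, :]
    c = np.exp(1j * np.pi * k * (j - a) / 2).mean(axis=1)
    return (c[0] * x
            + c[1] * np.fft.fft(x, norm="ortho", axis=-1)
            + c[2] * np.roll(x[..., ::-1], 1, axis=-1)
            + c[3] * np.fft.ifft(x, norm="ortho", axis=-1))

def expert_update(x, A, B, a):
    """Hypothetical expert: low-rank update applied in the order-a domain.
    x: (tokens, d), A: (r, d), B: (d, r), a: scalar order."""
    z = frft(x, a)              # move tokens into the order-a domain
    z = z @ A.T @ B.T           # rank-r update in that domain
    return frft(z, -a).real     # map back; keeping the real part is an assumption

def moe_update(x, experts, W_gate, top_k=2):
    """Softmax over the top-k router logits, then a gated sum of expert updates.
    For clarity every expert is evaluated; a real MoE would skip unrouted ones."""
    logits = x @ W_gate.T                                  # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]
    mask = np.zeros_like(logits, dtype=bool)
    np.put_along_axis(mask, top, True, axis=-1)
    masked = np.where(mask, logits, -np.inf)
    gates = np.exp(masked - masked.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    out = np.zeros_like(x)
    for e, (A, B, a) in enumerate(experts):
        out += gates[:, [e]] * expert_update(x, A, B, a)
    return out

rng = np.random.default_rng(0)
d, r, n_experts = 8, 2, 3
x = rng.standard_normal((5, d))
experts = [(rng.standard_normal((r, d)), rng.standard_normal((d, r)), a)
           for a in (0.0, 0.5, 1.0)]                       # orders are illustrative
W_gate = rng.standard_normal((n_experts, d))
print(moe_update(x, experts, W_gate).shape)
```

With order 0 the expert reduces to the plain LoRA product `x @ A.T @ B.T`. In this sketch each expert holds `2 * d * r` low-rank parameters plus a single order scalar, so for illustrative sizes such as `d = 4096` and `r = 16` the order adds 1 parameter on top of 131072.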

Across commonsense, mathematical, code, and knowledge benchmarks on LLaMA-3.1-8B and Qwen2.5-7B, Fractional-Fourier Mixture of Experts improves over strong MoE-LoRA and spectral baselines, including FlyLoRA, FourierMoE, and HMoRA, while keeping the active-parameter budget small. Analysis shows that the learned orders specialize by task and layer in interpretable ways.
