FlowLM: Few-Step Language Modeling via Diffusion-to-Flow Adaptation
We present FlowLM, a flow matching language model obtained from pre-trained diffusion language models via efficient fine-tuning.
By re-aligning the curved sampling trajectories of diffusion models into straight-line flows, FlowLM enables high-quality few-step generation that rivals or even outperforms 2,000-step diffusion sampling, and it does so after very few training epochs.
Remarkably, fine-tuned FlowLM reaches performance saturation in only half as many training epochs as training from scratch. Both approaches greatly outperform the original diffusion model, which validates our method.
Furthermore, we validate a more effective training objective for flow matching: predicting clean data, which consistently guides the sampling process towards the true data distribution.
Empirical results demonstrate that our approach is highly effective for high-quality, few-step text generation.
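To make the straight-line flow and the clean-data objective concrete, the following is a minimal NumPy sketch, not the FlowLM implementation. It assumes a linear interpolation path between clean data at t = 0 and Gaussian noise at t = 1, a mean-squared-error loss on the predicted clean data, and plain Euler integration. The names `interpolate`, `x0_prediction_loss`, `sample_few_step`, and `predict_x0` are hypothetical, and `predict_x0` stands in for any trained network.

```python
import numpy as np


def interpolate(x0, noise, t):
    """Point on the straight line between clean data (t=0) and noise (t=1).

    Assumption: a linear path x_t = (1 - t) * x0 + t * noise.
    """
    return (1.0 - t) * x0 + t * noise


def x0_prediction_loss(predict_x0, x0, noise, t):
    """Mean squared error between the predicted and the true clean data."""
    x_t = interpolate(x0, noise, t)
    return float(np.mean((predict_x0(x_t, t) - x0) ** 2))


def sample_few_step(predict_x0, noise, num_steps):
    """Euler integration from t=1 (noise) to t=0 (data) using x0 predictions."""
    x = noise.copy()
    times = np.linspace(1.0, 0.0, num_steps + 1)
    for t, s in zip(times[:-1], times[1:]):
        x0_hat = predict_x0(x, t)
        # On a straight path the velocity is (noise - x0) = (x_t - x0) / t.
        velocity = (x - x0_hat) / t
        # Step from time t to the earlier time s (s < t).
        x = x + (s - t) * velocity
    return x
```

Under these assumptions, a predictor that always returns the true clean data recovers it in a single Euler step, which is the intuition behind few-step sampling on straight trajectories. With a learned, imperfect predictor, additional steps let later predictions correct earlier ones.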