From Context Shift to Stylistic Collapse: Why Training Objectives Matter More Than Scale
In modern LLMs, linguistic features are not merely stylistic artifacts; they act as probes of how probability mass is allocated under alignment training objectives. Language models trained with contemporary pipelines show a severe reshaping of these features, producing an extreme redistribution of language use.
Whereas previous stylometric analyses examined linguistic differences between AI-generated and human text, we focus on the reshaping introduced by the LLM training pipeline itself. We analyze 17 models (410M to 100B+ parameters) with 24 linguistically motivated probes and document that instruction-tuned systems systematically collapse language entropy along discourse and structural dimensions (mean amplification: 1,949-16,853%; peaks: 5,181-209,675%) while selectively suppressing complex punctuation to 3.2-23.2% of baseline frequencies.
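The amplification and suppression figures above are ratios of a probe's frequency in model output to its frequency in a baseline corpus. The following is a minimal sketch of that arithmetic, not the authors' measurement code. It assumes that a probe can be written as a regular expression, that frequency is normalized per 1,000 whitespace-delimited tokens, and that percentages are reported as 100 × model rate / baseline rate; the names `rate_per_1k_tokens`, `percent_of_baseline`, and `SEMICOLON_PROBE` are hypothetical.

```python
import re

# Hypothetical probe: semicolons as one example of "complex punctuation".
SEMICOLON_PROBE = r";"


def rate_per_1k_tokens(texts, pattern):
    """Occurrences of `pattern` per 1,000 whitespace-delimited tokens."""
    hits = sum(len(re.findall(pattern, text)) for text in texts)
    tokens = sum(len(text.split()) for text in texts)
    return 1000.0 * hits / tokens if tokens else 0.0


def percent_of_baseline(model_rate, baseline_rate):
    """Model rate as a percentage of the baseline rate (assumed convention)."""
    if baseline_rate == 0:
        raise ValueError("baseline rate is zero; the ratio is undefined")
    return 100.0 * model_rate / baseline_rate


# Toy corpora, invented purely for illustration.
baseline_texts = ["a b; c d; e f"]
model_texts = ["a b c d e f; g h i j k l"]

baseline_rate = rate_per_1k_tokens(baseline_texts, SEMICOLON_PROBE)
model_rate = rate_per_1k_tokens(model_texts, SEMICOLON_PROBE)
print(round(percent_of_baseline(model_rate, baseline_rate), 1))
```

Under this convention, a value below 100 indicates suppression relative to the baseline and a value above 100 indicates amplification.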
RLHF does not worsen these effects: divergence patterns are statistically indistinguishable (p > 0.25) across matched base and instruction-tuned model pairs. A weak intervention (lambda=1.0) exacerbates collapse by 240%, whereas strong control (lambda=5.0) yields a 40.5% improvement and outperforms frontier models by 96.7-98.2% despite a 200-1000x scale disadvantage.
In addition, lambda=5.0 delivers 15% higher distinct-4, 27% higher vocabulary diversity, and 78% lower repetition than moderate regularization, indicating that alignment requires sufficient control strength rather than mere distributional smoothing. Our findings underscore that modern LLMs reallocate stylistic probability mass regardless of RLHF and scale.
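For readers unfamiliar with the diversity metrics named above, here is a minimal sketch of how distinct-n is commonly computed: the number of unique n-grams divided by the total number of n-grams. The abstract does not define "vocabulary diversity" or "repetition", so the type-token ratio below is only an assumed stand-in for the former, and the function names are hypothetical.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(tokens, n):
    """Unique n-grams divided by total n-grams (0.0 if the text is too short)."""
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0


def type_token_ratio(tokens):
    """Unique tokens divided by total tokens; an assumed proxy for vocabulary diversity."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Toy examples, invented for illustration: a looping text versus a varied one.
repetitive = "the cat sat the cat sat the cat sat".split()
varied = "the cat sat on a warm mat near the door".split()

print(distinct_n(repetitive, 4), distinct_n(varied, 4))
print(type_token_ratio(repetitive), type_token_ratio(varied))
```

A text that loops over the same phrase scores lower on both measures than a text of similar length with varied wording, which is the sense in which higher distinct-4 signals less collapse.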
More broadly, our work reveals a structural limitation of current alignment pipelines: preference optimization reshapes language distributions in ways that are invisible to standard quality metrics yet detectable through distributional probes, with implications for AI detection, training data contamination, and long-term linguistic evolution.