Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology
Abstract: We investigate how domain adaptation reshapes explanatory behavior in language models, using historical cosmology as a controlled setting.
In Phase 1, we train a small language model from scratch on a pre-Copernican corpus from which explicit heliocentric references were removed, and evaluate whether Earth-motion or heliocentric continuations nevertheless emerge.
In Phase 2, we fine-tune a larger pretrained model with QLoRA on the same corpus to study how adaptation modifies explanatory framing and cosmological stance.
Model outputs are evaluated using an LLM-as-judge framework that labels both cosmological stance (geocentric, heliocentric, or ambiguous) and explanatory frame (premodern versus modern).
In the constrained setting of Phase 1, the smaller models occasionally generate local Earth-motion continuations, but these remain globally unstable and insufficient to support coherent cosmological reasoning.
In Phase 2, fine-tuning induces a large and statistically significant shift toward premodern explanatory framing, while the cosmological stance distributions conditional on each frame remain comparatively stable.
As a result, increases in geocentric outputs arise primarily from redistribution over explanatory regimes rather than from direct modification of stance.
These results suggest that domain adaptation may primarily reshape the linguistic frameworks from which continuations are generated, with changes in stance emerging secondarily from those shifts.
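The distinction between redistribution over explanatory frames and direct modification of stance can be made concrete with the law of total probability, P(geocentric) = sum over frames f of P(f) * P(geocentric | f). The sketch below is one possible way to split a change in the geocentric rate into a frame-redistribution term and a within-frame stance term. It is not taken from the paper: the function names are hypothetical, and every number is made up purely for illustration and does not reproduce any reported result.

```python
# Illustrative decomposition of a change in P(geocentric) between a base
# model and a fine-tuned model. All probabilities below are invented for
# demonstration only; they are NOT results from the study.

def marginal(frame_probs, cond_geo):
    """P(geocentric) = sum_f P(f) * P(geocentric | f)."""
    return sum(frame_probs[f] * cond_geo[f] for f in frame_probs)


def decompose_shift(base_frames, tuned_frames, base_cond, tuned_cond):
    """Split the change in P(geocentric) into two additive terms.

    frame_effect:  change due to moving probability mass between frames,
                   holding the base model's conditional stance fixed.
    stance_effect: change due to shifts in P(geocentric | frame),
                   weighted by the fine-tuned frame distribution.
    The two terms sum exactly to the total change.
    """
    frame_effect = sum(
        (tuned_frames[f] - base_frames[f]) * base_cond[f] for f in base_frames
    )
    stance_effect = sum(
        tuned_frames[f] * (tuned_cond[f] - base_cond[f]) for f in base_frames
    )
    return frame_effect, stance_effect


# Hypothetical frame distributions P(frame) before and after fine-tuning.
base_frames = {"premodern": 0.2, "modern": 0.8}
tuned_frames = {"premodern": 0.7, "modern": 0.3}

# Hypothetical conditional stance P(geocentric | frame), nearly unchanged.
base_cond = {"premodern": 0.60, "modern": 0.10}
tuned_cond = {"premodern": 0.62, "modern": 0.10}

frame_effect, stance_effect = decompose_shift(
    base_frames, tuned_frames, base_cond, tuned_cond
)
total_change = marginal(tuned_frames, tuned_cond) - marginal(base_frames, base_cond)

print(f"total change in P(geocentric): {total_change:.3f}")
print(f"  from frame redistribution:   {frame_effect:.3f}")
print(f"  from within-frame stance:    {stance_effect:.3f}")
```

With these invented inputs, almost all of the increase comes from the frame term, which is the pattern the abstract describes qualitatively. Other decompositions are possible (for example, weighting the stance term by the base frame distribution), and the choice affects how an interaction between the two effects is attributed.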