Computational conceptual history of scientific concepts: From early digital methods to LLMs

Abstract: This article situates large language models (LLMs) within the longer history of computational approaches to concept analysis in the history, philosophy, and sociology of science (HPSS). We examine what LLMs add to existing methods and how they inherit longstanding problems, and we review recent case studies that employ them.

In the first part, we reconstruct computational conceptual history before LLMs by bringing together three strands of work: early digital methods in HPSS, distributional approaches from digital history and related research, and lexical semantic change detection. We provide an overview of the main challenges and opportunities, focusing on corpus construction, operationalization and modelling choices, and evaluation and interpretation.

In the second part, we turn to the era of LLMs. After a short introduction to LLMs, we review LLM-based work on lexical semantic change detection and relevant case studies in HPSS. We then revisit the earlier methodological questions, showing how issues of corpus construction, model choice and training data, operationalization trade-offs, and evaluation and interpretation play out in LLM-based workflows.

Paper Details:

Authors: Michael Zichert, Arno Simons
arXiv ID: 2606.04118
Date: 2 Jun 2026
Subject: Computation and Language (cs.CL)
