Nothing from Something: Can a Language Model Discover 0?

Abstract: AI systems based on artificial neural networks are being developed with the aspiration of pushing the boundary of human mathematical knowledge. A key question for these systems is how far they can reach beyond their training data. Mathematical discovery requires a strong form of out-of-distribution generalization: the ability to hypothesize genuinely new, and potentially logically more powerful, mathematical structures. It has been hypothesized that language abilities support such generalizations in human cognition.

In this work, we use simple arithmetic as a case study for examining how modern AI models could expand their mathematical horizons, evaluating whether these models can independently discover the concept of “zero”. We show that (1) GPT-2-sized language models are unable to perform this generalization at test time, regardless of language pretraining, but (2) models can improve substantially after training on tens or hundreds of examples of zero. Additionally, we find that language pretraining reduces the number of required examples by approximately 50%, suggesting that language abilities can scaffold mathematical discovery in neural models.
