Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications

Abstract: Large Language Models (LLMs) have become the predominant paradigm in natural language processing (NLP), advancing both research and industry. As model sizes and pretraining corpora grow, so do concerns about Pretraining Data Exposure (PDE), driven by the scale and opacity of training datasets.

PDE is the problem of determining whether specific data appeared in an LLM's pretraining corpus. It is critical for ensuring evaluation integrity and protecting privacy, and it sits at the intersection of two key areas: data contamination and membership inference.

Though conceptually related, these areas have often been studied in isolation. This paper offers the first unified survey of both under the PDE framework. We formalize PDE across exposure levels, review attack and defense methods, synthesize empirical findings, and highlight open challenges and future research directions.

Paper Details:

Authors: Ziyi Tong, Feifei Sun, Le Minh Nguyen
Submission Date: 21 May 2026
Primary Category: Computation and Language (cs.CL)
DOI: 10.48550/arXiv.2605.26133
