PRISMat: Policy-Driven, Permutation-Invariant Autoregressive Material Generation

Abstract: Rapid identification of candidate materials with target properties has become a key task in materials science. Machine learning has emerged as an alternative to physics-based simulation, offering a faster and cheaper way to filter materials based on their stability and other target properties, reducing the number of candidates that reach the costly synthesis stage.

Recently, Large Language Models (LLMs) have been applied to this problem, but they are parameter-heavy and computationally expensive during both training and inference, making them unsuitable for high-throughput tasks. This inefficiency stems from both the heavy over-parameterization of language models and the difficulty of framing material generation as a sequence learning problem.

In this paper, we present PRISMat, a cost-effective, permutation-invariant model that addresses these limitations. We show that PRISMat outperforms LLMs at generating crystal slabs conditioned on critical surface properties of materials, while requiring less inference time.

In targeted material discovery, we achieve mean absolute errors of 0.188 eV/Å$^2$ and 2.79 eV on the cleavage energy and work function tasks, respectively, a 4$\times$ reduction in error relative to the next-best model.
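
For reference, the mean absolute error quoted above is assumed to follow the standard definition. The symbols below are not taken from the abstract: $N$ is the number of generated slabs, $y_i$ is the target property value used for conditioning, and $\hat{y}_i$ is the corresponding value evaluated for the $i$-th generated slab. The evaluation protocol itself is not described here.

$$
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|
$$

Under this reading, a 4$\times$ reduction means $\mathrm{MAE}_{\text{next-best}} / \mathrm{MAE}_{\text{PRISMat}} \approx 4$ on the same task.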
