Minimalist Genetic Programming

Minimalist Genetic Programming

Abstract: Genetic programming (GP) is based on two important insights. First, that any learning task can fundamentally be posed as a program induction problem, where the goal is to construct a symbolic hierarchical model that is expressed as a syntax tree. Second, to pose this task as a search problem, and use evolution to locate the desired model.

摘要: 遗传编程(GP)基于两个重要的见解。首先,任何学习任务从根本上都可以被视为一个程序归纳问题,其目标是构建一个以语法树形式表达的符号化分层模型。其次,将此任务视为一个搜索问题,并利用进化算法来定位所需的模型。

Since it was proposed, GP has produced notable results in a wide range of tasks and problem domains. This work presents an alternative view by modifying the second core insight of GP, posing the problem as a syntactic derivation task instead.

自提出以来,GP 在广泛的任务和问题领域中取得了显著成果。本研究提出了一种替代视角,通过修改 GP 的第二个核心见解,将该问题重新定义为一项句法推导任务。

In particular, this paper presents Minimalist Genetic Programming (MGP), an algorithm that like GP is biologically inspired, but instead of evolution it takes inspiration from the Minimalist Program to human language, in which syntax is understood as an optimal solution to the problem of linking two other mental systems.

具体而言,本文提出了极简遗传编程(Minimalist Genetic Programming, MGP)。该算法与 GP 一样具有生物学启发,但它并非借鉴进化论,而是从人类语言的“极简主义纲领”(Minimalist Program)中汲取灵感——在该纲领中,句法被理解为连接其他两个心理系统的最优解决方案。

In minimalism, the core computational process is a binary set formation operator called $MERGE$, that can be used to incrementally construct complex syntactic structures using a simple Markovian process. MGP is able to discover the core building blocks of the symbolic expressions, and to incrementally combined them using $MERGE$.

在极简主义中,核心计算过程是一个称为 $MERGE$ 的二元集合形成算子,它可以使用简单的马尔可夫过程逐步构建复杂的句法结构。MGP 能够发现符号表达式的核心构建块,并利用 $MERGE$ 将它们逐步组合起来。

The proposed system is benchmarked on symbolic regression tasks that are known to be difficult to solve with standard GP systems because of the propensity for bloat. Results show that when a proper lexicon of atomic syntactic objects are chosen, MGP is able to consistently produce the exact ground truth model on a set of symbolic regression where standard GP struggles to do the same.

该系统在符号回归任务上进行了基准测试,这些任务因容易产生“膨胀”(bloat)问题而难以用标准 GP 系统解决。结果表明,当选择了合适的原子句法对象词汇表时,MGP 能够在标准 GP 难以处理的一系列符号回归任务中,持续生成精确的基准真值模型。

The insights provided by minimalism are shown to be relevant to the problem of program induction, and should be explored further based on the potential exhibited by MGP in this work.

研究表明,极简主义提供的见解与程序归纳问题高度相关,且基于 MGP 在本研究中展现出的潜力,这些见解值得进一步探索。