In Search of the Ingredients of Open-Endedness: Replicating Picbreeder with Large Vision-Language Models

Abstract: We are in the midst of large-scale industrial and academic efforts to automate the processes of scientific, technological and creative production through AI-driven assistants. Historically, a fundamental property of these processes in their human form has been their open-endedness: their capacity for generating a seemingly endless supply of novel and meaningful forms.

Do artificial agents have any capacity for such fruitful unguided discovery? To answer this question, we turn to Picbreeder, the canonical exemplar of human-driven open-ended search, in which users collaboratively generated a diverse library of images through interactive evolution of small neural networks.

We replicate Picbreeder, replacing human users with frontier Vision-Language Models (VLMs). We observe clear qualitative differences between the output of our system and the historical human baseline, and attempt to characterize them using metrics of phylogenetic complexity and visual and semantic salience and novelty.

In an effort to identify some of the causal factors contributing to these differences, we study the addition of exploratory noise to the agents’ selection process, of behavioral diversity between agents, and of narrative momentum in the form of memory of past actions. We make our code available at this https URL.
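
The abstract does not say how exploratory noise enters the selection step. As a purely illustrative sketch, one common form is epsilon-greedy selection: with probability `epsilon` the agent's pick is replaced by a uniformly random candidate. The names `select_with_noise`, `agent_choice`, and `epsilon` are hypothetical and are not taken from the released code.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def select_with_noise(
    candidates: Sequence[T],
    agent_choice: Callable[[Sequence[T]], int],
    epsilon: float,
    rng: random.Random,
) -> int:
    """Return the index of the candidate chosen as the next parent.

    Assumption (not from the paper): noise is epsilon-greedy. With
    probability `epsilon` the index is drawn uniformly at random;
    otherwise the agent's own choice is used unchanged.
    """
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        # Exploratory step: ignore the agent and pick any candidate.
        return rng.randrange(len(candidates))
    # Exploitation step: defer to the agent (e.g. a VLM call in practice).
    return agent_choice(candidates)
```

In this sketch `agent_choice` stands in for whatever call asks the model to pick an image; setting `epsilon` to 0 recovers the agent's unperturbed behavior, while 1 gives uniformly random selection.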
