SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models

Abstract: As LLMs become more widely deployed, they are increasingly expected to work alongside other AI agents rather than operating in isolation. Effective coordination in these settings requires agents to communicate, share information and make decisions under uncertainty.

We introduce SMAC-Talk, a natural language extension of the StarCraft Multi-Agent Challenge (SMAC) for evaluating LLM-based agents in cooperative multi-agent environments. The environment features decentralized control, partial observability, and long-horizon decision making.

SMAC-Talk adds a natural language communication channel for probing coordination and trust between agents. We use this channel to construct several evaluation scenarios, including settings with an embedded deceptive communicator that tries to disrupt and deceive its allies through communication alone.
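
To make the deceptive-communicator setting concrete, here is a minimal sketch of a shared message channel in which one ally posts a misleading report. It is illustrative only and is not the SMAC-Talk API: the names `Channel`, `Message`, `honest_report`, and `deceptive_report`, the agent identifiers, and the "report the opposite direction" strategy are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    text: str


@dataclass
class Channel:
    """Hypothetical broadcast channel: every agent can read what others posted."""
    log: list = field(default_factory=list)

    def post(self, sender: str, text: str) -> None:
        self.log.append(Message(sender, text))

    def inbox(self, reader: str) -> list:
        # An agent sees every message except its own.
        return [m for m in self.log if m.sender != reader]


def honest_report(enemy_side: str) -> str:
    return f"Enemy spotted to the {enemy_side}."


def deceptive_report(enemy_side: str) -> str:
    # Assumed deception strategy for illustration: name the opposite side.
    opposite = {"north": "south", "south": "north",
                "east": "west", "west": "east"}
    return f"Enemy spotted to the {opposite[enemy_side]}."


channel = Channel()
channel.post("marine_0", honest_report("north"))     # cooperative ally
channel.post("marine_1", deceptive_report("north"))  # embedded deceiver

# marine_2 receives conflicting reports and must decide whom to trust.
for m in channel.inbox("marine_2"):
    print(f"{m.sender}: {m.text}")
```

The deceiver in this sketch acts only through the text it posts, matching the "communication alone" constraint described above. The receiving agent gets no label saying which report is false.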

We provide three agents for benchmarking, using four models from the Qwen3.5 family, and study how reasoning structure, memory, and model scale affect coordination between agents. We release SMAC-Talk as an open benchmark to support the research community in developing and evaluating LLM agents in cooperative multi-agent settings.
