Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning

Abstract: Building and maintaining state to learn policies and value functions is critical for deploying reinforcement learning (RL) agents in the real world. Recurrent neural networks (RNNs) have become a key point of interest for the state-building problem, and several large-scale reinforcement learning agents incorporate recurrent networks.

While RNNs have become a mainstay in many RL applications, many key design choices and implementation details responsible for performance improvements are often not reported. In this work, we discuss one axis on which RNN architectures can be (and have been) modified for use in RL.

Specifically, we look at how action information can be incorporated into the state update function of a recurrent cell. We discuss several choices for using action information and empirically evaluate the resulting architectures on a set of illustrative domains. Finally, we outline future work on developing recurrent cells and the challenges specific to the RL setting.
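
To make the idea of feeding action information into a recurrent state update concrete, the sketch below shows two possible encodings in plain Python. The abstract does not say which encodings the paper evaluates, so everything here is an assumption for illustration: the class names `AdditiveActionCell` and `MultiplicativeActionCell`, the Elman-style tanh update, the omission of bias terms, and the random weight initialization are all hypothetical. The first cell concatenates a one-hot action to the observation; the second keeps a separate set of weights per action.

```python
import math
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; the init scheme is an arbitrary assumption."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class AdditiveActionCell:
    """Hypothetical cell: h' = tanh(W_in [obs; onehot(a)] + W_h h)."""

    def __init__(self, obs_dim, num_actions, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.num_actions = num_actions
        self.w_in = random_matrix(hidden_dim, obs_dim + num_actions, rng)
        self.w_h = random_matrix(hidden_dim, hidden_dim, rng)

    def step(self, hidden, obs, action):
        # The action enters as extra input features next to the observation.
        x = list(obs) + one_hot(action, self.num_actions)
        pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_h, hidden))]
        return [math.tanh(p) for p in pre]


class MultiplicativeActionCell:
    """Hypothetical cell: h' = tanh(W_in[a] obs + W_h[a] h), one weight set per action."""

    def __init__(self, obs_dim, num_actions, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.w_in = [random_matrix(hidden_dim, obs_dim, rng) for _ in range(num_actions)]
        self.w_h = [random_matrix(hidden_dim, hidden_dim, rng) for _ in range(num_actions)]

    def step(self, hidden, obs, action):
        # The action selects which weight matrices drive the state update.
        pre = [
            a + b
            for a, b in zip(matvec(self.w_in[action], obs), matvec(self.w_h[action], hidden))
        ]
        return [math.tanh(p) for p in pre]


def rollout(cell, hidden_dim, observations, actions):
    """Fold a sequence of (observation, action) pairs into a final hidden state."""
    hidden = [0.0] * hidden_dim
    for obs, action in zip(observations, actions):
        hidden = cell.step(hidden, obs, action)
    return hidden
```

As a usage example, `rollout(AdditiveActionCell(2, 3, 4), 4, [[0.5, -0.2], [0.1, 0.3]], [0, 1])` returns a four-element hidden state. In this sketch the per-action variant stores `num_actions` times as many weights as the concatenation variant, which is one trade-off such a design choice would have to weigh.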

具体而言，我们研究了如何将动作信息整合到循环单元的状态更新函数中。我们讨论了使用动作信息的几种选择，并在若干典型领域中对由此产生的架构进行了实证评估。最后，我们探讨了开发循环单元的未来工作，并讨论了强化学习场景所特有的挑战。

Paper Details:

Authors: Matthew Schlegel, Volodymyr Tkachuk, Adam White, Martha White
Submission Date: 4 May 2026
Subject: Machine Learning (cs.LG)
DOI: 10.48550/arXiv.2605.16318
