Why I built the HuggingFace for RL agents — and why RL needs one

If you’ve ever tried MineRL or OpenAI Five, you know the feeling. The environment is fascinating. The problem is hard in all the right ways. And then you check the compute requirements and close the tab.

RL has a compute problem. The most interesting environments are locked behind serious hardware, so most people never get to play with the fun stuff. And even if someone builds a great custom environment, there’s no easy way for others to actually use it, train on it, or compete on it. That’s the gap I wanted to fix.

Introducing Agenlus: a browser-based RL training platform where you can train agents, share them via HuggingFace, and battle others on a global leaderboard. No install. No GPU bill. Just open your browser.

The goal isn’t just accessibility. It’s compounding knowledge: the same way HuggingFace made the NLP and CV ecosystems compound on each other, Agenlus is built so RL agents and environments compound on each other.

I built this solo and launched it this week. I’d love feedback from the RL community on which environments you’d actually want to train on.

🔗 agenlus.com 🚀 Launching on Product Hunt