galilai-group / stable-worldmodel
stable-worldmodel is a platform for reproducible world model research and evaluation.
Installation · Quick Start · Environments · Solvers & Baselines · Documentation · Paper · Citation
stable-worldmodel provides a single, unified interface for the three stages of world model research (collecting data, training, and evaluating with model-predictive control) across a large suite of standardized environments. It ships with reference implementations of common baselines and planning solvers so research code can stay focused on the contribution that matters: the model and the objective.
Installation
From PyPI:
pip install stable-worldmodel # base only
pip install 'stable-worldmodel[all]' # + training, environments, and data formats
LeRobot dataset support is a separate opt-in extra (requires Python 3.12+):
pip install 'stable-worldmodel[lerobot]'
From source (development):
git clone https://github.com/galilai-group/stable-worldmodel
cd stable-worldmodel
uv venv --python=3.10 && source .venv/bin/activate
uv sync --extra all --group dev
Datasets and checkpoints are stored under $STABLEWM_HOME (defaults to ~/.stable_worldmodel/). Override the variable to point at your preferred storage location. The library is in active development. APIs may change between minor versions.
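As a rough illustration of how such an override behaves, the sketch below resolves a storage root from the environment and falls back to the documented default. The helper name `resolve_home` is hypothetical and not part of the library; only the variable name and the default path come from the text above.

```python
import os
from pathlib import Path


# Hypothetical helper, not part of stable-worldmodel: it mirrors the documented
# behaviour of $STABLEWM_HOME with its ~/.stable_worldmodel/ default.
def resolve_home(env=None) -> Path:
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the documented default.
    raw = env.get("STABLEWM_HOME") or "~/.stable_worldmodel"
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_home())
```

In practice the variable would typically be exported in the shell before running any script. Whether the library reads it at import time or lazily is not stated here, so setting it before the first import is the safer assumption.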
Quick Start
import stable_worldmodel as swm
from stable_worldmodel.policy import WorldModelPolicy, PlanConfig
from stable_worldmodel.solver import CEMSolver
# 1. Collect a dataset
world = swm.World("swm/PushT-v1", num_envs=8)
world.set_policy(your_expert_policy)
world.collect("data/pusht_demo.lance", episodes=100, seed=0)
# 2. Load it and train your world model (format is autodetected)
dataset = swm.data.load_dataset("data/pusht_demo.lance", num_steps=16)
world_model = ... # your model
# 3. Evaluate with model-predictive control
solver = CEMSolver(model=world_model, num_samples=300)
policy = WorldModelPolicy(solver=solver, config=PlanConfig(horizon=10))
world.set_policy(policy)
results = world.evaluate(episodes=50)
print(f"Success Rate: {results['success_rate']:.1f}%")
Reference implementations are provided in scripts/train/: lewm.py implements LeWM, and prejepa.py reproduces DINO-WM.

Figure: GPU utilization for LeWM trained on the Push-T LanceDB dataset on an H200 GPU.
Data Formats
Recording, loading, and conversion all go through a small format registry. Pick the backend that matches your trade-off, or register your own.
| Format | On-disk layout | Best for |
|---|---|---|
| lance | LanceDB table (episode-contiguous flat rows) | default — append-friendly, fast indexed reads |
| hdf5 | single .h5 file (one dataset per column) | portable single-file artifact |
| folder | .npz columns + one JPEG per step | inspection, partial-key streaming |
| video | .npz columns + one MP4 per episode (decord) | long episodes, compact image storage |
| lerobot | lerobot://<repo_id> (read-only adapter) | training/eval directly on LeRobot Hub datasets |
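To make the idea of a format registry concrete, here is a minimal sketch of the general pattern: loaders are registered under a name, and the format is guessed from the path suffix when none is given. This is not the library's implementation. The names `register_format`, `detect_format`, and `load`, the suffix-based detection rule, and the `csv` format are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical registry tables: format name -> loader, path suffix -> format name.
_LOADERS: Dict[str, Callable[[str], dict]] = {}
_SUFFIXES: Dict[str, str] = {}


def register_format(name: str, suffix: Optional[str] = None):
    """Decorator that records a loader under `name` (and optionally a suffix)."""
    def decorator(loader):
        _LOADERS[name] = loader
        if suffix is not None:
            _SUFFIXES[suffix] = name
        return loader
    return decorator


def detect_format(path: str) -> str:
    """Guess the format from the path suffix; a stand-in for real autodetection."""
    suffix = Path(path).suffix
    if suffix in _SUFFIXES:
        return _SUFFIXES[suffix]
    raise ValueError(f"cannot detect format for {path!r}")


def load(path: str, format: Optional[str] = None) -> dict:
    """Dispatch to the registered loader, autodetecting when no format is given."""
    name = format if format is not None else detect_format(path)
    return _LOADERS[name](path)


@register_format("lance", suffix=".lance")
def load_lance(path: str) -> dict:
    return {"format": "lance", "path": path}  # placeholder payload


@register_format("hdf5", suffix=".h5")
def load_hdf5(path: str) -> dict:
    return {"format": "hdf5", "path": path}  # placeholder payload


@register_format("csv", suffix=".csv")  # a user-defined format, purely illustrative
def load_csv(path: str) -> dict:
    return {"format": "csv", "path": path}


if __name__ == "__main__":
    print(load("data/pusht.lance"))
    print(load("data/run.h5"))
```

The decorator keeps registration next to the loader definition, which is one common way to let users add a backend without touching the dispatch code.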
world.collect("data/pusht.lance", episodes=100) # default: lance
world.collect("data/pusht_video", episodes=100, format="video") # mp4 episodes
ds = swm.data.load_dataset("data/pusht.lance", num_steps=16) # autodetect
swm.data.convert("data/pusht.lance", "data/pusht_video", dest_format="video", fps=30) # one-shot migration
Every writer accepts a mode kwarg: 'append' (the default), 'overwrite', or 'error'. Because 'append' is the default, re-running world.collect extends the existing dataset rather than failing.
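The three modes can be pictured with a toy writer over an in-memory list. This is only a sketch of the semantics the mode names suggest: the function `write_episodes` is hypothetical, and the assumption that 'error' raises when data already exists is inferred from the name rather than confirmed by the text.

```python
from typing import List


def write_episodes(store: List[int], new: List[int], mode: str = "append") -> List[int]:
    """Toy writer: `store` stands in for an on-disk dataset of episode ids."""
    if mode not in ("append", "overwrite", "error"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "error" and store:
        # Assumed behaviour: refuse to touch a dataset that already has data.
        raise FileExistsError("dataset already exists")
    if mode == "overwrite":
        store.clear()  # discard what was there before writing
    store.extend(new)  # 'append' (and a fresh 'error' write) simply add rows
    return store


if __name__ == "__main__":
    data: List[int] = []
    write_episodes(data, [0, 1])               # first collection run
    write_episodes(data, [2, 3])               # re-run extends the dataset
    print(data)
    write_episodes(data, [9], mode="overwrite")
    print(data)
```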
Throughput & storage benchmarks
Numbers below were produced by scripts/benchmark/compare_h5_lance.py and can be reproduced with it. Benchmarks use the PushT dataset from the LeWorldModel paper.
(Table omitted for brevity, showing performance metrics for HDF5, LanceDB, and Video formats across local/S3 storage)
Environments
Environments are pulled from the DeepMind Control Suite, Gymnasium classic control, OGBench, Craftax, the Arcade Learning Environment (100+ Atari games), and classical world model benchmarks (Two-Room, PushT).
Most environments ship with a set of factors of variation: independently controllable visual and physical parameters (lighting, textures, dynamics, morphology). These make it straightforward to evaluate zero-shot generalization under distribution shift without any additional setup. Adding a new environment only requires conforming to the Gymnasium interface.
Solvers and Baselines
- Solvers: Cross-Entropy Method (CEM), Improved CEM (iCEM), Model Predictive Path Integral (MPPI), Predictive Sampling, Gradient Descent (SGD, Adam), Projected Gradient Descent (PGD), Augmented Lagrangian Constrained Optimization.
- Baselines: DINO-WM, JEPA, PLDM, LeWM, GCBC (Behaviour Cloning), GCIVL (RL), GCIQL (RL).
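To give a feel for what a sampling-based solver does, here is a textbook-style sketch of the Cross-Entropy Method planning over a toy one-dimensional model. It is not the library's `CEMSolver`. The functions `rollout_cost` and `cem_plan`, the toy dynamics, and all default hyperparameters are assumptions chosen for illustration; `horizon` and `num_samples` merely echo the parameter names used in the Quick Start.

```python
import random
import statistics
from typing import List


def rollout_cost(start: float, actions: List[float], goal: float) -> float:
    """Toy model: the state moves by the action each step; cost is squared distance to goal."""
    state, cost = start, 0.0
    for a in actions:
        state += a
        cost += (state - goal) ** 2
    return cost


def cem_plan(start: float, goal: float, horizon: int = 5, num_samples: int = 200,
             num_elites: int = 20, iters: int = 10, seed: int = 0) -> List[float]:
    """Return the mean action sequence after `iters` rounds of CEM refinement."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        # 1. Sample candidate action sequences from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(num_samples)]
        # 2. Score each candidate with the model and keep the lowest-cost elites.
        samples.sort(key=lambda acts: rollout_cost(start, acts, goal))
        elites = samples[:num_elites]
        # 3. Refit the Gaussian to the elites, one time step at a time.
        for t in range(horizon):
            column = [e[t] for e in elites]
            mean[t] = statistics.fmean(column)
            std[t] = statistics.pstdev(column) + 1e-6  # keep the std strictly positive
    return mean


if __name__ == "__main__":
    plan = cem_plan(start=0.0, goal=1.0)
    print([round(a, 2) for a in plan])
```

In a model-predictive control loop, only the first action of the returned plan would typically be executed before replanning from the next observation.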
Command-Line Interface
After installation, the swm command is available for inspecting/converting datasets, environments, and checkpoints without writing code:
swm datasets # list cached datasets
swm ...