galilai-group / stable-worldmodel

stable-worldmodel is a platform for reproducible world model research and evaluation.

Installation · Quick Start · Environments · Solvers & Baselines · Documentation · Paper · Citation

stable-worldmodel provides a single, unified interface for the three stages of world model research (collecting data, training, and evaluating with model-predictive control) across a large suite of standardized environments. It ships with reference implementations of common baselines and planning solvers, so research code can stay focused on the contribution that matters: the model and the objective.

Installation

From PyPI:

pip install stable-worldmodel # base only
pip install 'stable-worldmodel[all]' # + training, environments, and data formats

LeRobot dataset support is a separate opt-in extra (requires Python 3.12+):

pip install 'stable-worldmodel[lerobot]'

From source (development):

git clone https://github.com/galilai-group/stable-worldmodel
cd stable-worldmodel
uv venv --python=3.10 && source .venv/bin/activate
uv sync --extra all --group dev

Datasets and checkpoints are stored under $STABLEWM_HOME (defaults to ~/.stable_worldmodel/). Override the variable to point at your preferred storage location.

The library is in active development. APIs may change between minor versions.
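
As a rough illustration of the $STABLEWM_HOME lookup described above, the sketch below resolves the storage root from the environment variable and falls back to ~/.stable_worldmodel/. The helper name resolve_storage_root is hypothetical and not part of the stable-worldmodel API; the library's own resolution logic may differ.

```python
import os
from pathlib import Path


def resolve_storage_root(env=None):
    """Return the directory where datasets and checkpoints would live.

    Hypothetical helper (not the library's API): honours STABLEWM_HOME when
    it is set and non-empty, otherwise falls back to ~/.stable_worldmodel.
    """
    env = os.environ if env is None else env
    override = env.get("STABLEWM_HOME")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".stable_worldmodel"


# Passing a mapping keeps the check independent of the real environment.
custom_root = resolve_storage_root({"STABLEWM_HOME": "/mnt/scratch/swm"})
default_root = resolve_storage_root({})
```

For a real run, export the variable in your shell before starting Python so that every process sees the same location.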

Quick Start

import stable_worldmodel as swm
from stable_worldmodel.policy import WorldModelPolicy, PlanConfig
from stable_worldmodel.solver import CEMSolver

# 1. Collect a dataset
world = swm.World("swm/PushT-v1", num_envs=8)
world.set_policy(your_expert_policy)
world.collect("data/pusht_demo.lance", episodes=100, seed=0)

# 2. Load it and train your world model (format is autodetected)
dataset = swm.data.load_dataset("data/pusht_demo.lance", num_steps=16)
world_model = ... # your model

# 3. Evaluate with model-predictive control
solver = CEMSolver(model=world_model, num_samples=300)
policy = WorldModelPolicy(solver=solver, config=PlanConfig(horizon=10))
world.set_policy(policy)
results = world.evaluate(episodes=50)
print(f"Success Rate: {results['success_rate']:.1f}%")
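
To make step 3 more concrete, here is a dependency-free sketch of the receding-horizon loop that model-predictive control relies on: plan a sequence of actions, execute only the first one, then replan from the new state. Everything in it is an assumption made for illustration. The 1-D dynamics function stands in for a learned world model, and the planner is plain random shooting rather than any of the library's solvers.

```python
import random


def dynamics(state, action):
    """Toy stand-in for a learned world model: a 1-D position update."""
    return state + action


def plan(state, goal, horizon, num_samples, rng):
    """Random shooting: keep the cheapest of num_samples random action sequences."""
    best_cost, best_seq = float("inf"), None
    for _ in range(num_samples):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        predicted, cost = state, 0.0
        for action in seq:
            predicted = dynamics(predicted, action)
            cost += abs(predicted - goal)  # running cost over the whole horizon
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_mpc(start, goal, steps=15, horizon=10, num_samples=300, seed=0):
    """Receding-horizon control: replan at every step, act on the first action only."""
    rng = random.Random(seed)
    state = start
    for _ in range(steps):
        seq = plan(state, goal, horizon, num_samples, rng)
        state = dynamics(state, seq[0])
    return state


final_state = run_mpc(start=0.0, goal=4.0)
```

Replanning at every step is what lets the controller absorb prediction errors: a poor plan only ever costs one action before it is replaced.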

Reference implementations are provided in scripts/train/: lewm.py implements LeWM, and prejepa.py reproduces DINO-WM.

Figure: GPU utilization for LeWM trained on the Push-T LanceDB dataset on an H200 GPU.

Data Formats

Recording, loading, and conversion all go through a small format registry. Pick the backend that matches your trade-off, or register your own.

| Format | On-disk layout | Best for |
| --- | --- | --- |
| lance | LanceDB table (episode-contiguous flat rows) | default; append-friendly, fast indexed reads |
| hdf5 | single .h5 file (one dataset per column) | portable single-file artifact |
| folder | .npz columns + one JPEG per step | inspection, partial-key streaming |
| video | .npz columns + one MP4 per episode (decord) | long episodes, compact image storage |
| lerobot | lerobot://<repo_id> (read-only adapter) | training/eval directly on LeRobot Hub datasets |

world.collect("data/pusht.lance", episodes=100) # default: lance
world.collect("data/pusht_video", episodes=100, format="video") # mp4 episodes
ds = swm.data.load_dataset("data/pusht.lance", num_steps=16) # autodetect
swm.data.convert("data/pusht.lance", "data/pusht_video", dest_format="video", fps=30) # one-shot migration
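
The registry idea behind these calls can be sketched in a few lines. This is a generic illustration of the pattern, not the library's implementation: register_format, load, the suffix-based detection, and the placeholder loaders are all assumptions, and stable-worldmodel may detect formats differently.

```python
from pathlib import Path

# name -> (detect, loader); every name in this sketch is illustrative only.
_FORMATS = {}


def register_format(name, detect):
    """Decorator that registers a loader plus a predicate used for autodetection."""
    def decorator(loader):
        _FORMATS[name] = (detect, loader)
        return loader
    return decorator


@register_format("lance", detect=lambda p: p.suffix == ".lance")
def _load_lance(path):
    return {"format": "lance", "path": str(path)}  # placeholder for a real reader


@register_format("hdf5", detect=lambda p: p.suffix in {".h5", ".hdf5"})
def _load_hdf5(path):
    return {"format": "hdf5", "path": str(path)}  # placeholder for a real reader


def load(path, format=None):
    """Dispatch to a registered loader, guessing the format when none is given."""
    path = Path(path)
    if format is None:
        format = next((n for n, (detect, _) in _FORMATS.items() if detect(path)), None)
    if format not in _FORMATS:
        raise ValueError(f"no registered format for {path}")
    return _FORMATS[format][1](path)


# Registering a custom backend is one more decorated function.
@register_format("csv", detect=lambda p: p.suffix == ".csv")
def _load_csv(path):
    return {"format": "csv", "path": str(path)}


detected = load("data/pusht.lance")["format"]
custom = load("data/log.csv")["format"]
```

Keeping detection next to each loader means a new backend needs no changes to the dispatch code.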

Every writer accepts a mode keyword argument: 'append' (the default), 'overwrite', or 'error'. Because the default is 'append', re-running world.collect extends the existing dataset rather than failing.
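
The three modes can be pictured with a toy in-memory writer. The function write_episodes and its dictionary-backed store are hypothetical and exist only to show the semantics described above; the real writers operate on files and tables.

```python
def write_episodes(store, name, episodes, mode="append"):
    """Toy in-memory writer; `store` maps dataset names to lists of episodes."""
    if mode not in ("append", "overwrite", "error"):
        raise ValueError(f"unknown mode: {mode!r}")
    exists = name in store
    if exists and mode == "error":
        raise FileExistsError(f"dataset {name!r} already exists")
    if not exists or mode == "overwrite":
        store[name] = []          # start from an empty dataset
    store[name].extend(episodes)  # 'append' falls through to here
    return len(store[name])


store = {}
first = write_episodes(store, "pusht", ["ep0", "ep1"])               # creates the dataset
second = write_episodes(store, "pusht", ["ep2"])                      # default mode appends
replaced = write_episodes(store, "pusht", ["ep3"], mode="overwrite")  # discards old episodes
```

Use 'error' when an accidental second run should stop immediately instead of growing or replacing the dataset.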

Throughput & storage benchmarks

The numbers below were produced by scripts/benchmark/compare_h5_lance.py and can be reproduced with it. Benchmarks use the PushT dataset from the LeWorldModel paper.

(Table omitted for brevity, showing performance metrics for HDF5, LanceDB, and Video formats across local/S3 storage)

Environments

Environments are pulled from the DeepMind Control Suite, Gymnasium classic control, OGBench, Craftax, the Arcade Learning Environment (100+ Atari games), and classical world model benchmarks (Two-Room, PushT).

Most environments ship with a set of factors of variation: independently controllable visual and physical parameters (lighting, textures, dynamics, morphology). These make it straightforward to evaluate zero-shot generalization under distribution shift without any additional setup. Adding a new environment only requires conforming to the Gymnasium interface.
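
For orientation, the sketch below mirrors the shape of the Gymnasium reset/step contract without importing the package, so it runs anywhere. It is an assumption-laden toy. A real environment would subclass gymnasium.Env and declare observation_space and action_space. The step_size option is a hypothetical factor of variation, and how stable-worldmodel actually exposes such factors is not shown here.

```python
class LineWorld:
    """1-D toy environment following the Gymnasium reset/step return shapes."""

    def __init__(self, goal=5, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps

    def reset(self, *, seed=None, options=None):
        # seed is accepted for interface compatibility; this toy is deterministic.
        # Hypothetical factor of variation: how far one action moves the agent.
        self.step_size = (options or {}).get("step_size", 1)
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # action 1 moves right, anything else moves left
        self.position += self.step_size if action == 1 else -self.step_size
        self.steps += 1
        terminated = self.position >= self.goal                        # goal reached
        truncated = not terminated and self.steps >= self.max_steps    # time limit hit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


env = LineWorld(goal=3)
obs, info = env.reset(seed=0)
terminated = truncated = False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(1)  # always move right
```

Resetting with options={"step_size": 3} changes the dynamics while leaving the task untouched, which is the kind of controlled shift a factor of variation is meant to provide.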

Solvers and Baselines

  • Solvers: Cross-Entropy Method (CEM), Improved CEM (iCEM), Model Predictive Path Integral (MPPI), Predictive Sampling, Gradient Descent (SGD, Adam), Projected Gradient Descent (PGD), Augmented Lagrangian Constrained Optimization.

  • Baselines: DINO-WM, JEPA, PLDM, LeWM, GCBC (Behaviour Cloning), GCIVL (RL), GCIQL (RL).
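
For readers unfamiliar with the first solver in the list, here is a minimal NumPy sketch of the Cross-Entropy Method on a toy problem. It is not the library's CEMSolver. The point-mass rollout_cost stands in for a learned world model, and cem_plan, num_elites, and iterations are names chosen for this illustration. The horizon and num_samples defaults simply echo the values used in the Quick Start.

```python
import numpy as np


def rollout_cost(actions, start, goal):
    """Toy stand-in for a world model: a 2-D point displaced by each action.

    actions has shape (num_samples, horizon, 2); one cost is returned per sample.
    """
    final = start + actions.sum(axis=1)            # integrate the action sequence
    return np.linalg.norm(final - goal, axis=-1)   # distance to the goal


def cem_plan(start, goal, horizon=10, num_samples=300, num_elites=30,
             iterations=5, seed=0):
    """Iteratively refit a Gaussian over action sequences to the best samples."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, 2))
    std = np.ones((horizon, 2))
    for _ in range(iterations):
        # Sample candidate action sequences from the current Gaussian.
        samples = mean + std * rng.standard_normal((num_samples, horizon, 2))
        costs = rollout_cost(samples, start, goal)
        # Keep the lowest-cost candidates and refit mean and std to them.
        elites = samples[np.argsort(costs)[:num_elites]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6  # small floor avoids a degenerate Gaussian
    return mean


start, goal = np.zeros(2), np.array([3.0, -2.0])
plan = cem_plan(start, goal)
baseline_cost = rollout_cost(np.zeros((1, 10, 2)), start, goal)[0]  # do nothing
planned_cost = rollout_cost(plan[None], start, goal)[0]             # follow the plan
```

The other sampling-based solvers in the list differ mainly in how samples are drawn and weighted; this sketch covers only the plain elite-refitting variant.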

Command-Line Interface

After installation, the swm command is available for inspecting and converting datasets, environments, and checkpoints without writing code:

swm datasets # list cached datasets
swm ...