galilai-group / stable-worldmodel

stable-worldmodel is a platform for reproducible world model research and evaluation.

Installation · Quick Start · Environments · Solvers & Baselines · Documentation · Paper · Citation

stable-worldmodel provides a single, unified interface for the three stages of world model research (collecting data, training, and evaluating with model-predictive control) across a large suite of standardized environments. It ships with reference implementations of common baselines and planning solvers so research code can stay focused on the contribution that matters: the model and the objective.

Installation

From PyPI:

pip install stable-worldmodel # base only
pip install 'stable-worldmodel[all]' # + training, environments, and data formats

LeRobot dataset support is a separate opt-in extra (requires Python 3.12+):

pip install 'stable-worldmodel[lerobot]'

From source (development):

git clone https://github.com/galilai-group/stable-worldmodel
cd stable-worldmodel
uv venv --python=3.10 && source .venv/bin/activate
uv sync --extra all --group dev

Datasets and checkpoints are stored under $STABLEWM_HOME (defaults to ~/.stable_worldmodel/). Set the variable to point at your preferred storage location.

Note: the library is in active development, and APIs may change between minor versions.
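As a rough illustration of the lookup described above, the following sketch resolves the storage directory from the environment and falls back to the documented default. The function name resolve_cache_dir and the exact fallback logic are assumptions made for this example; the library's own resolution code may differ.

```python
import os
from pathlib import Path

# Default location taken from the text above.
DEFAULT_HOME = "~/.stable_worldmodel"


def resolve_cache_dir(environ=None) -> Path:
    """Return the directory named by STABLEWM_HOME, or the documented default.

    Hypothetical helper for illustration only, not part of stable-worldmodel.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("STABLEWM_HOME") or DEFAULT_HOME
    return Path(raw).expanduser()


# With the variable set, the override wins; otherwise the default is used.
overridden = resolve_cache_dir({"STABLEWM_HOME": "/mnt/scratch/swm"})
fallback = resolve_cache_dir({})
print(overridden)
print(fallback)
```

In a shell session the same override would typically be done by exporting STABLEWM_HOME before launching a script.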

Quick Start

import stable_worldmodel as swm
from stable_worldmodel.policy import WorldModelPolicy, PlanConfig
from stable_worldmodel.solver import CEMSolver

# 1. Collect a dataset
world = swm.World("swm/PushT-v1", num_envs=8)
world.set_policy(your_expert_policy)
world.collect("data/pusht_demo.lance", episodes=100, seed=0)

# 2. Load it and train your world model (format is autodetected)
dataset = swm.data.load_dataset("data/pusht_demo.lance", num_steps=16)
world_model = ... # your model

# 3. Evaluate with model-predictive control
solver = CEMSolver(model=world_model, num_samples=300)
policy = WorldModelPolicy(solver=solver, config=PlanConfig(horizon=10))
world.set_policy(policy)
results = world.evaluate(episodes=50)
print(f"Success Rate: {results['success_rate']:.1f}%")

Reference implementations are provided in scripts/train/: lewm.py implements LeWM, and prejepa.py reproduces DINO-WM.

[Figure: GPU utilization for LeWM trained on the Push-T LanceDB dataset on an H200 GPU.]

Data Formats

Recording, loading, and conversion all go through a small format registry. Pick the backend that matches your trade-off, or register your own.

Format	On-disk layout	Best for
lance	LanceDB table (episode-contiguous flat rows)	default — append-friendly, fast indexed reads
hdf5	single .h5 file (one dataset per column)	portable single-file artifact
folder	.npz columns + one JPEG per step	inspection, partial-key streaming
video	.npz columns + one MP4 per episode (decord)	long episodes, compact image storage
lerobot	lerobot://<repo_id> (read-only adapter)	training/eval directly on LeRobot Hub datasets

world.collect("data/pusht.lance", episodes=100) # default: lance
world.collect("data/pusht_video", episodes=100, format="video") # mp4 episodes
ds = swm.data.load_dataset("data/pusht.lance", num_steps=16) # autodetect
swm.data.convert("data/pusht.lance", "data/pusht_video", dest_format="video", fps=30) # one-shot migration
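
To make the idea of a format registry with autodetection concrete, here is a minimal, self-contained sketch. Everything in it (register_format, detect_format, load, and the suffix rules) is hypothetical and written for illustration; it is not the stable-worldmodel API, and the real registry may detect formats differently.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical registry: maps a format name to a loader callable.
_LOADERS: Dict[str, Callable[[str], dict]] = {}


def register_format(name: str):
    """Decorator that records a loader function under `name`."""
    def decorator(fn):
        _LOADERS[name] = fn
        return fn
    return decorator


@register_format("lance")
def _load_lance(path: str) -> dict:
    # Stand-in loader: a real one would open the table at `path`.
    return {"format": "lance", "path": path}


@register_format("hdf5")
def _load_hdf5(path: str) -> dict:
    # Stand-in loader: a real one would open the .h5 file at `path`.
    return {"format": "hdf5", "path": path}


def detect_format(path: str) -> str:
    """Guess the format from the file suffix (illustrative rules only)."""
    suffix = Path(path).suffix
    if suffix == ".lance":
        return "lance"
    if suffix in {".h5", ".hdf5"}:
        return "hdf5"
    raise ValueError(f"cannot detect format for {path!r}")


def load(path: str, format: Optional[str] = None) -> dict:
    """Dispatch to the registered loader, autodetecting when no format is given."""
    name = format or detect_format(path)
    return _LOADERS[name](path)


print(load("data/pusht.lance")["format"])
```

Registering a custom backend in this sketch amounts to decorating one more loader function, which is the kind of extension point the registry design allows.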

Every writer accepts a mode keyword argument: 'append' (the default), 'overwrite', or 'error'. Because the default is 'append', re-running world.collect extends the existing dataset rather than failing.
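
The three modes can be pictured with a toy writer that stores one episode name per line in a text file. This is only a sketch of the semantics described above: write_episodes is a hypothetical function, and the real writers operate on Lance, HDF5, folder, or video backends rather than text files.

```python
import tempfile
from pathlib import Path


def write_episodes(path: Path, episodes: list, mode: str = "append") -> int:
    """Toy writer: returns the number of episodes stored after the call."""
    if mode not in {"append", "overwrite", "error"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if path.exists() and mode == "error":
        # 'error' refuses to touch an existing dataset.
        raise FileExistsError(f"{path} already exists")
    existing = []
    if path.exists() and mode == "append":
        # 'append' keeps what is already on disk.
        existing = path.read_text().splitlines()
    rows = existing + list(episodes)
    path.write_text("".join(row + "\n" for row in rows))
    return len(rows)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demo.txt"
    first = write_episodes(target, ["ep0", "ep1"])              # creates the file
    second = write_episodes(target, ["ep2"])                    # extends it
    third = write_episodes(target, ["ep3"], mode="overwrite")   # replaces it
    print(first, second, third)
```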

Throughput & storage benchmarks

Numbers below were produced by scripts/benchmark/compare_h5_lance.py and can be reproduced with it. Benchmarks use the PushT dataset from the LeWorldModel paper.

(Table omitted for brevity, showing performance metrics for HDF5, LanceDB, and Video formats across local/S3 storage)

Environments

Environments are pulled from the DeepMind Control Suite, Gymnasium classic control, OGBench, Craftax, the Arcade Learning Environment (100+ Atari games), and classical world model benchmarks (Two-Room, PushT).

Most environments ship with a set of factors of variation: independently controllable visual and physical parameters (lighting, textures, dynamics, morphology). These make it straightforward to evaluate zero-shot generalization under distribution shift without any additional setup. Adding a new environment only requires conforming to the Gymnasium interface.

Solvers and Baselines

Solvers: Cross-Entropy Method (CEM), Improved CEM (iCEM), Model Predictive Path Integral (MPPI), Predictive Sampling, Gradient Descent (SGD, Adam), Projected Gradient Descent (PGD), Augmented Lagrangian Constrained Optimization.
Baselines: DINO-WM, JEPA, PLDM, LeWM, GCBC (Behaviour Cloning), GCIVL (RL), GCIQL (RL).
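
For readers unfamiliar with sampling-based planning, the sketch below shows the core loop of the Cross-Entropy Method on a toy one-dimensional problem. It is a generic textbook-style illustration written with the standard library only, not the library's CEMSolver; the dynamics, cost, function names, and hyperparameters are all assumptions chosen for the example.

```python
import random
import statistics


def rollout_cost(x0, actions, goal):
    """Cost of an action sequence under toy dynamics x_{t+1} = x_t + a_t."""
    x, cost = x0, 0.0
    for a in actions:
        x = x + a
        # Penalize distance to the goal plus a small action magnitude term.
        cost += (x - goal) ** 2 + 0.01 * a ** 2
    return cost


def cem_plan(x0, goal, horizon=5, num_samples=200, num_elites=20,
             iterations=10, seed=0):
    """Return the mean action sequence after `iterations` CEM updates."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iterations):
        # Sample candidate action sequences from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(num_samples)]
        # Keep the lowest-cost candidates (the elites).
        samples.sort(key=lambda seq: rollout_cost(x0, seq, goal))
        elites = samples[:num_elites]
        # Refit the per-step mean and standard deviation to the elites.
        for t in range(horizon):
            column = [seq[t] for seq in elites]
            mean[t] = statistics.fmean(column)
            std[t] = statistics.pstdev(column) + 1e-6
    return mean


plan = cem_plan(x0=0.0, goal=1.0)
final_state = 0.0 + sum(plan)
print("planned actions:", [round(a, 3) for a in plan])
print("final state:", round(final_state, 3))
```

In a model-predictive control loop, only the first action of the returned plan would be executed before replanning from the next observation; in the setting described here, a learned world model would play the role of the toy dynamics.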

Command-Line Interface

After installation, the swm command is available for inspecting/converting datasets, environments, and checkpoints without writing code:

swm datasets # list cached datasets
swm ...