Cli-Modelarium 0.1.4: now 10 LLM providers, adding Qwen and GLM

Quick release note: Cli-Modelarium 0.1.4 just shipped. The headline is two new providers, which brings the total to ten.

You can now compare Alibaba’s Qwen models (via DashScope) and Z.AI’s GLM models side by side with the rest of the lineup: OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, Groq, and OpenRouter, plus your local models. That brings the count to 10 cloud providers.

If you have wanted to benchmark open-weight models against frontier ones on your own prompts, it is now a single command:

pip install --upgrade cli-modelarium
cli-modelarium "Write a haiku about garbage collection in programming" \
--models qwen3.7-max,glm-5.2,gpt-5.4,claude-opus-4-8 \
--runs 10 --max-cost 0.50

You get a side-by-side table with cost and latency per model. With --runs greater than 1, it repeats the trials and runs the statistical tests automatically, so you can tell a real difference from noise instead of eyeballing a single output. The --max-cost flag is a hard cap, so a multi-model run does not surprise you on your API bill.
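To make the "real difference versus noise" idea concrete, here is a minimal sketch of one common approach: a percentile bootstrap confidence interval for the difference in mean latency between two models. It is an illustration only, not Cli-Modelarium's actual implementation; the function name `bootstrap_mean_diff_ci` and the latency numbers are hypothetical.

```python
import random
import statistics


def bootstrap_mean_diff_ci(a, b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b), treating a and b
    as independent samples (hypothetical helper, not part of the tool)."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each model's runs with replacement, then compare means.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resampled_a) - statistics.fmean(resampled_b))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    point = statistics.fmean(a) - statistics.fmean(b)
    return point, lo, hi


# Made-up latencies in seconds from 10 runs of two models.
latency_a = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0, 2.2]
latency_b = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.1, 1.0, 1.0]

point, lo, hi = bootstrap_mean_diff_ci(latency_a, latency_b)
print(f"mean difference {point:.2f}s, 95% CI [{lo:.2f}, {hi:.2f}]")
```

If the interval excludes zero, the gap is unlikely to be run-to-run noise. With only a handful of runs the interval will be wide, which is the reason to raise --runs before drawing conclusions.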

Also in this release:

Refreshed all pricing to current provider rates.
Added Qwen and GLM to the model groups (all-flagship, all-budget, all-fast, all-cheap), plus GLM to all-reasoning, so you can pull them in by group.
Added Python 3.14 support.
Updated a few model IDs to track provider renames.

New here? Cli-Modelarium is a command-line tool for comparing LLM outputs side by side, with real statistics (bootstrap confidence intervals, paired significance tests, McNemar’s test), CI-ready assertions, hallucination detection, LLM-as-judge scoring, and cost tracking. One pip install, no infrastructure, Apache 2.0.
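For readers unfamiliar with McNemar’s test: it compares two models on the same set of prompts using only the prompts where they disagree. The sketch below is a generic exact (binomial) version written for illustration, assuming each model yields a pass/fail result per prompt. The function name `mcnemar_exact` and the sample data are hypothetical, and the tool's own implementation may differ.

```python
from math import comb


def mcnemar_exact(a_pass, b_pass):
    """Exact two-sided McNemar test on paired pass/fail outcomes.

    Returns (only_a, only_b, p_value), where only_a counts prompts that
    model A passed and model B failed, and only_b counts the reverse.
    """
    if len(a_pass) != len(b_pass):
        raise ValueError("outcome lists must be paired (same length)")
    only_a = sum(1 for x, y in zip(a_pass, b_pass) if x and not y)
    only_b = sum(1 for x, y in zip(a_pass, b_pass) if y and not x)
    n = only_a + only_b
    if n == 0:
        # The models never disagree, so there is no evidence of a difference.
        return only_a, only_b, 1.0
    k = min(only_a, only_b)
    # Under the null hypothesis each disagreement is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * tail)


# Made-up results on 20 prompts: A wins 8 disagreements, B wins 1, 11 ties.
a_results = [True] * 8 + [False] * 1 + [True] * 11
b_results = [False] * 8 + [True] * 1 + [True] * 11

print(mcnemar_exact(a_results, b_results))  # prints (8, 1, 0.0390625)
```

Prompts where both models pass or both fail do not affect the p-value, which is why a paired test can detect a difference that two separate pass rates would hide.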

GitHub: https://github.com/lavellehatcherjr/cli-modelarium
PyPI: https://pypi.org/project/cli-modelarium/

Would love to hear how the new providers work for your use case.