Cli-Modelarium 0.1.4: 10 LLM providers now, with Qwen and GLM
Quick release note: Cli-Modelarium 0.1.4 just shipped, and the headline is two new providers, which brings the total to ten.
You can now compare Alibaba’s Qwen models (via DashScope) and Z.AI’s GLM models side by side with the rest of the lineup: OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, Groq, and OpenRouter, plus your local models. That brings the total to 10 cloud providers.
If you have wanted to benchmark open-weight models against frontier ones on your own prompts, it now takes a single command after upgrading:
pip install --upgrade cli-modelarium
cli-modelarium "Write a haiku about garbage collection in programming" \
    --models qwen3.7-max,glm-5.2,gpt-5.4,claude-opus-4-8 \
    --runs 10 --max-cost 0.50
You get a side-by-side table with cost and latency per model. With --runs greater than 1, it repeats the trials and runs the statistical tests automatically, so you can tell a real difference from noise instead of eyeballing a single output. The --max-cost flag is a hard cap, so a multi-model run does not leave a surprise on your API bill.
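To make "a real difference from noise" concrete, here is a minimal sketch of one way repeated runs can be compared: a percentile bootstrap confidence interval on the paired difference in latency between two models. This is an illustration only, not Cli-Modelarium's actual code; the function name `paired_bootstrap_ci`, its parameters, and the latency numbers are all made up for the example.

```python
import random
import statistics


def paired_bootstrap_ci(a, b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of the paired differences a[i] - b[i].

    Hypothetical helper for illustration; not part of Cli-Modelarium.
    Returns (observed_mean_difference, ci_low, ci_high).
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and the same length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Resample the paired differences with replacement.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(diffs), lo, hi


# Made-up latencies in seconds for 10 runs of the same prompt on two models.
model_a = [1.92, 2.10, 1.85, 2.30, 2.05, 1.98, 2.21, 1.90, 2.12, 2.02]
model_b = [1.40, 1.55, 1.38, 1.62, 1.49, 1.51, 1.58, 1.37, 1.60, 1.45]

mean_diff, ci_low, ci_high = paired_bootstrap_ci(model_a, model_b)
print(f"mean difference: {mean_diff:.3f}s, 95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

If the interval excludes zero, the latency gap is unlikely to be run-to-run noise. With a single run there is nothing to resample, which is why a comparison like this only makes sense when --runs is greater than 1.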
Also in this release:
- Refreshed all pricing to current provider rates.
- Added Qwen and GLM to the model groups (all-flagship, all-budget, all-fast, all-cheap), plus GLM to all-reasoning, so you can pull them in by group.
- Added Python 3.14 support.
- Updated a few model IDs to track provider renames.
New here? Cli-Modelarium is a command-line tool for comparing LLM outputs side by side, with real statistics (bootstrap confidence intervals, paired significance tests, McNemar’s test), CI-ready assertions, hallucination detection, LLM-as-judge scoring, and cost tracking. One pip install, no infrastructure, Apache 2.0.
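For readers who have not met McNemar’s test: it compares two models on paired pass/fail outcomes for the same prompts, and only the prompts where the models disagree carry information. Below is a minimal sketch of the exact (binomial) version using only the standard library. It is an illustration under stated assumptions, not Cli-Modelarium's implementation; the function name `mcnemar_exact` and the counts are made up.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value (hypothetical helper, for illustration).

    b: prompts where model A passed and model B failed
    c: prompts where model A failed and model B passed
    Prompts where both passed or both failed do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up example: the models disagree on 10 prompts, 9 of them in A's favor.
p_value = mcnemar_exact(9, 1)
print(f"p = {p_value:.4f}")
```

A small p-value suggests the difference in pass rates is unlikely to be chance; an even split of disagreements gives a p-value of 1.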
GitHub: https://github.com/lavellehatcherjr/cli-modelarium
PyPI: https://pypi.org/project/cli-modelarium/
I would love to hear how the new providers work for your use case.