JuliusBrussee / caveman

caveman

why use many token when few do trick

Before/After • Install • What You Get • Benchmarks • Full install guide

A Claude Code skill/plugin (also Codex, Gemini, Cursor, Windsurf, Cline, Copilot, 30+ more) that makes agent talk like caveman: cuts ~75% of output tokens, keeps full technical accuracy. Brain still big. Mouth small.

Before / After

🗣️ Normal Claude (69 tokens) “The reason your React component is re-rendering is likely because you’re creating a new object reference on each render cycle. When you pass an inline object as a prop, React’s shallow comparison sees it as a different object every time, which triggers a re-render. I’d recommend using useMemo to memoize the object.”

Caveman Claude (19 tokens) “New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo.”

🗣️ Normal Claude “Sure! I’d be happy to help you with that. The issue you’re experiencing is most likely caused by your authentication middleware not properly validating the token expiry. Let me take a look and suggest a fix.”

Caveman Claude “Bug in auth middleware. Token expiry check use < not <=. Fix:”

Same fix. 75% less word. Brain still big.

┌─────────────────────────────────────┐
│ TOKENS SAVED      ████████ 75%      │
│ TECHNICAL ACCURACY ████████ 100%    │
│ SPEED INCREASE     ████████ ~3x     │
│ VIBES              ████████ OOG     │
└─────────────────────────────────────┘

Pick your level of grunt: lite (drop filler), full (default caveman), ultra (telegraphic), or wenyan (classical Chinese, even shorter). One command switch. Cost go down forever.

Speak your tongue. Caveman keep your language. You write Portuguese, caveman grunt Portuguese. Spanish, French, same. Compress the style, not the language. Code, command, error string stay exact.

“Novo ref de objeto cada render. Prop inline = novo ref = re-render. Envolva com useMemo.”

Like this trick? Now get whole agent: caveman-code

This skill shrink what agent say. caveman-code shrink everything: full terminal coding agent, caveman top to bottom. ~2× fewer tokens than Codex on identical tasks. 20+ providers · plan mode · autopilot goal loop · MIT.

npm install -g @juliusbrussee/caveman-code

▶ Try caveman-code now → why use many token when whole agent save

Install

One line. Find every agent. Install for each.

# macOS / Linux / WSL / Git Bash
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash

# Windows (PowerShell 5.1+)
irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.ps1 | iex

~30 seconds. Needs Node ≥18. Skip agent you no have. Safe to re-run.

Trigger: type /caveman or say “talk like caveman”. Stop with “normal mode”. One agent only, manual command, or any of 30+ other agents → INSTALL.md.

Install break? Open agent, say “Read CLAUDE.md and INSTALL.md, install caveman for me.” Agent fix own brain.

What You Get

/caveman [lite|full|ultra|wenyan]: Main skill. Compress every reply. Levels stick until session end.
/caveman-commit: Conventional Commit messages, ≤50 char subject. Why over what.
/caveman-review: One-line PR comments: L42: 🔴 bug: user null. Add guard.
/caveman-stats: Real session token usage + lifetime savings + USD. Tweetable line via --share.
/caveman-compress <file>: Rewrite memory file (e.g. CLAUDE.md) into caveman-speak. Cuts ~46% input tokens every session. Code/URLs/paths byte-preserved.
caveman-shrink: MCP middleware. Wraps any MCP server, compresses tool descriptions. On npm.
cavecrew-*: Caveman subagents (investigator/builder/reviewer). ~60% fewer tokens than vanilla, main context lasts longer.
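
The /caveman-commit rule (Conventional Commit format, subject ≤50 characters) can be checked mechanically. The sketch below is an illustration only, not the skill's own code: `check_subject`, `SUBJECT_RE`, and `MAX_SUBJECT_LEN` are hypothetical names, and the regex is a simplified reading of the Conventional Commits `type(scope): description` shape.

```python
import re

# Simplified Conventional Commits subject (assumption, not the full spec):
# lowercase type, optional "(scope)", optional "!" for breaking changes,
# then ": " and a non-empty description.
SUBJECT_RE = re.compile(r"^[a-z]+(\([^()\s]+\))?!?: \S.*$")

# Limit stated for /caveman-commit in the list above.
MAX_SUBJECT_LEN = 50


def check_subject(subject: str) -> list[str]:
    """Return a list of problems with a commit subject; empty list means OK."""
    problems = []
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append(
            f"subject is {len(subject)} chars, limit is {MAX_SUBJECT_LEN}"
        )
    if SUBJECT_RE.match(subject) is None:
        problems.append("subject does not match 'type(scope): description'")
    return problems


if __name__ == "__main__":
    for candidate in (
        "fix(auth): reject token at exact expiry",
        "Updated stuff",
    ):
        print(candidate, "->", check_subject(candidate) or "ok")
```

A check like this could run as a commit-msg hook to catch a subject that drifts past the limit. Wiring that up is left out here.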

Statusline badge: Claude Code shows [CAVEMAN] ⛏ 12.4k (lifetime tokens saved). Updates every /caveman-stats run. Set CAVEMAN_STATUSLINE_SAVINGS=0 to silence.
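
A rough sketch of how a badge like this could be rendered. Everything here is an assumption for illustration: `format_tokens` and `render_badge` are hypothetical names, the rounding rule is a guess, and "silence" is read as hiding only the savings number. The only details taken from the text above are the `[CAVEMAN] ⛏ 12.4k` shape and the `CAVEMAN_STATUSLINE_SAVINGS` variable.

```python
import os


def format_tokens(n: int) -> str:
    """Abbreviate a token count, e.g. 12400 -> '12.4k' (assumed rounding rule)."""
    if n < 1_000:
        return str(n)
    # Switch to "M" slightly below one million so that rounding to one
    # decimal never produces "1000.0k".
    if n < 999_950:
        return f"{n / 1_000:.1f}k"
    return f"{n / 1_000_000:.1f}M"


def render_badge(lifetime_saved: int, env=None) -> str:
    """Build the statusline text; env defaults to the process environment."""
    env = os.environ if env is None else env
    # Assumption: "0" hides the savings figure but keeps the mode badge.
    if env.get("CAVEMAN_STATUSLINE_SAVINGS") == "0":
        return "[CAVEMAN]"
    return f"[CAVEMAN] ⛏ {format_tokens(lifetime_saved)}"


if __name__ == "__main__":
    print(render_badge(12_400, env={}))
    print(render_badge(12_400, env={"CAVEMAN_STATUSLINE_SAVINGS": "0"}))
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.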

Auto-activate every session: Claude Code, Codex, Gemini (built-in). Cursor / Windsurf / Cline / Copilot get always-on rule files via --with-init. Other agents trigger with /caveman per session. Full feature matrix in INSTALL.md.

Benchmarks

Real token counts from the Claude API. Average 65% output reduction across 10 prompts (range 22-87%).

Task	Normal (tokens)	Caveman (tokens)	Saved
Explain React re-render bug	1180	159	87%
Fix auth middleware token expiry	704	121	83%
Set up PostgreSQL connection pool	2347	380	84%
Explain git rebase vs merge	702	292	58%
Refactor callback to async/await	387	301	22%
Architecture: microservices vs monolith	446	310	30%
Review PR for security issues	678	398	41%
Docker multi-stage build	1042	290	72%
Debug PostgreSQL race condition	1200	232	81%
Implement React error boundary	3454	456	87%
Average	1214	294	65%
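
The arithmetic behind the table can be re-derived from the token counts alone. The sketch below is a reading of the table, not the script in benchmarks/. It computes savings two ways: the mean of the per-task percentages, which is how the 65% in the Average row comes out, and the pooled figure over all tokens, which weights long answers more heavily. Function and variable names are illustrative.

```python
# Token counts copied from the benchmark table: (task, normal, caveman).
ROWS = [
    ("Explain React re-render bug", 1180, 159),
    ("Fix auth middleware token expiry", 704, 121),
    ("Set up PostgreSQL connection pool", 2347, 380),
    ("Explain git rebase vs merge", 702, 292),
    ("Refactor callback to async/await", 387, 301),
    ("Architecture: microservices vs monolith", 446, 310),
    ("Review PR for security issues", 678, 398),
    ("Docker multi-stage build", 1042, 290),
    ("Debug PostgreSQL race condition", 1200, 232),
    ("Implement React error boundary", 3454, 456),
]


def saved_fraction(normal: int, caveman: int) -> float:
    """Fraction of output tokens removed relative to the normal reply."""
    return 1 - caveman / normal


# Mean of per-task savings: every task counts equally.
per_task = [saved_fraction(n, c) for _, n, c in ROWS]
mean_of_ratios = sum(per_task) / len(per_task)

# Pooled savings: total tokens removed over total normal tokens.
total_normal = sum(n for _, n, _ in ROWS)
total_caveman = sum(c for _, _, c in ROWS)
pooled = saved_fraction(total_normal, total_caveman)

print(f"mean of per-task savings: {mean_of_ratios:.1%}")  # 64.5%
print(f"pooled savings:           {pooled:.1%}")           # 75.8%
```

The two numbers differ because the biggest savings land on the longest normal replies. The per-task mean is the more conservative of the two.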

Raw data and reproduction script: benchmarks/. Three-arm eval harness (baseline / terse / skill) lives in evals/. There caveman is compared against an “Answer concisely” prompt, not against verbose default, so the delta is honest.

caveman-compress receipts (real memory files):

File	Original (tokens)	Compressed (tokens)	Saved
claude-md-preferences.md	706	285	59.6%
project-notes.md	1145	535	53.3%
claude-md-project.md	1122	636	43.3%
todo-list.md	627	388	38.1%
mixed-with-code.md	888	560	36.9%
Average	898	481	46%

Important

Caveman only affects output tokens. Thinking/reasoning tokens untouched. Caveman no make brain smaller. Caveman make mouth smaller. Biggest win is readability and speed, cost savings a bonus.

A March 2026 paper “Brevity Constraints Reverse Performance Hierarchies in Language Models” found that constraining large models to brief responses improved accuracy by 26 points on certain benchmarks. Verbose not always better. Sometimes less word = more correct.

How It Work

Install drop skill file in agent. Skill tell agent: drop filler, keep substance, use fragments. For Claude Code, hook also write tiny flag file each session. Agent see flag, talk caveman from message one. No need say /caveman.

Stats command read Claude Code session log, count tokens saved, write number to statusline. Caveman-compress sub-skill rewrite memory files (CLAUDE.md, project notes) so each session start with smaller context. Save tokens forever, not just one reply.