The claude -p playbook for June 15 — rebuilding your AI workflows inside interactive sessions

On June 15, Claude’s claude -p (headless mode) and the Agent SDK stop drawing from your subscription and move to a separate metered credit. If you’ve built a pile of claude -p scripts, the news probably landed with a small jolt. I’ve written plenty of them myself, and “wait, all of that is metered now?” was my first reaction too.

But step back and it’s not really one company’s pricing decision. The whole industry is converging on the same shape: GitHub Copilot moved to AI Credits on June 1 (completions stay free; Chat, CLI, and agents consume credits), OpenAI Codex pairs seat pricing with credits + API usage, and Claude, on June 15, splits headless and the SDK onto a separate credit. In every case, interaction stays flat-rate and automation gets metered.

This post is about how to read that, and what to actually do with your claude -p scripts. It comes from a few months of running a “multi-agent inside interactive sessions” setup day to day. One thing up front: this is not a billing-evasion hack. The conclusion isn’t “go back to doing everything by hand,” and it isn’t “keep everything headless.” It’s a redesign that sits between the two.

That said, if you opened this because you want to know “so can I keep doing my claude -p stuff inside the flat-rate plan?”, some of the recipes below do read that way. Terms and pricing lines can shift, so check the current conditions and use your own judgment. My argument is “because it’s a better setup,” not “because it’s cheaper,” but either door is fine.

Why everyone converged on the same shape

If you want the broader picture, “The flat-fee era is over” walks through the cost mechanics across providers, and “The Tokenpocalypse” frames what survives the meter. The gist: old-style completion produced short output, so flat-rate worked. Agentic tools are different. Behind one user request sits a flood of tokens: read the repo, search files, run tests, patch. The cost gap between a light user and a heavy user became extreme (flat-fee misses the real token cost of a heavy user by up to 10x), and “unlimited flat-rate for everyone” stopped being mathematically sustainable.

The same shift that happened when cloud went from “server rent” to “metered usage” is now happening to LLM tokens. Overlay how each vendor drew its line and something interesting shows up: usage where a human is at the screen (interaction) stays flat-rate; usage where the human steps away (headless, SDK, CI) goes metered. The reason is simple: interaction is throughput-capped. A human reads, thinks, types. One session’s consumption tops out at human speed. Headless can be called in a loop, without bound.

The pricing is the answer to “which kind of usage can a flat rate actually support,” and at the same time it’s a statement about which usage the platform will structurally favor. So the thing June 15 quietly tells you: the economically durable surface is inside the interactive session.

Was your claude -p really an “unattended” job?

Looking back, a fair amount of what I ran through claude -p didn’t strictly need to be unattended. I wanted a second agent’s opinion while I was working. I wanted a review from a different model. I wanted a refactor running in parallel on another model. In every case I was right there. But there was no channel between sessions, so I had two options: be the copy-paste courier myself, or write claude -p into a script to bridge them.

This is a common thing in how a stack matures: when a part is missing, the neighboring part carries its role. While agent-to-agent messaging didn’t exist, headless calls and glue scripts carried that weight. Headless wasn’t the wrong tool; it was the only channel. Now that the missing part is filling in, you can redraw the division of labor.

The response to June 15 isn’t only “budget for headless credits” (some jobs genuinely need it; more below). It’s also moving the carried-over work back to where it belongs, and leaving claude -p only the jobs that truly have to be headless. If agents can talk to each other directly inside interactive sessions, you need neither the courier nor the bridge script.
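
Before redrawing that division of labor, it can help to see where headless calls actually live. The following is a rough first-pass sketch: list_headless_calls is a hypothetical helper name, and the pattern only catches calls where -p or --print directly follows the word claude, so treat the result as a starting inventory rather than a complete one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines in a directory tree that invoke headless mode.
# Assumption: -p / --print appears immediately after "claude" on the same line.
set -euo pipefail

# list_headless_calls DIR
# Prints file:line:text for each match; prints nothing if there are none.
list_headless_calls() {
  local dir="${1:-.}"
  # grep exits 1 when nothing matches; "|| true" keeps set -e from aborting.
  grep -rnE 'claude[[:space:]]+(-p|--print)([[:space:]]|$)' "$dir" || true
}

# Usage (example path is an assumption):
#   list_headless_calls ./scripts
```

For each hit, the question from this section applies: was a human at the keyboard whenever this ran? If so, it is a candidate for moving into an interactive session.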

The channel I built for that is agmsg, a messaging layer that runs on nothing but bash + SQLite, letting Claude Code / Codex / Gemini CLI / Copilot CLI sessions form a team and message each other. No daemon, no network, not MCP. And the key part: send and receive both run inside your normal interactive sessions, through a hook. No claude -p, no SDK.

https://github.com/fujibee/agmsg
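
To make “bash + SQLite, no daemon” concrete, here is a minimal sketch of how a file-backed mailbox could work. It illustrates the general idea only and is not agmsg’s actual schema, commands, or hook wiring: the table layout and the function names (init_db, send_msg, recv_msgs) are made up for this example, and it assumes the sqlite3 command-line tool is installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: NOT agmsg's real schema or commands.
# Assumes the sqlite3 CLI is available. All names below are hypothetical.
set -euo pipefail

# The mailbox is a single SQLite file; default to a throwaway temp location.
DB="${MSG_DB:-$(mktemp -d)/mailbox.db}"

# Double any single quotes so a value can sit inside a SQL string literal.
sql_quote() {
  local q="'"
  printf '%s' "${1//$q/$q$q}"
}

init_db() {
  sqlite3 "$DB" "CREATE TABLE IF NOT EXISTS messages (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    team      TEXT NOT NULL,
    sender    TEXT NOT NULL,
    recipient TEXT NOT NULL,
    body      TEXT NOT NULL,
    read_at   TEXT
  );"
}

# send_msg TEAM FROM TO BODY
send_msg() {
  sqlite3 "$DB" "INSERT INTO messages (team, sender, recipient, body)
    VALUES ('$(sql_quote "$1")', '$(sql_quote "$2")',
            '$(sql_quote "$3")', '$(sql_quote "$4")');"
}

# recv_msgs TEAM AGENT
# Prints unread messages for AGENT (sender and body), then marks them read.
recv_msgs() {
  local team agent
  team=$(sql_quote "$1")
  agent=$(sql_quote "$2")
  sqlite3 "$DB" "BEGIN IMMEDIATE;
    SELECT sender, body FROM messages
      WHERE team='$team' AND recipient='$agent' AND read_at IS NULL
      ORDER BY id;
    UPDATE messages SET read_at = datetime('now')
      WHERE team='$team' AND recipient='$agent' AND read_at IS NULL;
    COMMIT;"
}

init_db
send_msg dev worker alice "please review the staged diff"
recv_msgs dev alice
```

Because the mailbox is just a file, any session that can run a shell command can send or poll it, with no server process in between. How a real tool delivers messages into a live session (the hook part) is outside what this sketch covers.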

Here’s how to rebuild the common claude -p patterns, one by one.

Recipe 1: a script that “asks an AI” → a resident buddy session

Before: code review or a quick consult, fired off to headless from inside a script. The script is non-interactive, but the AI work inside it is fundamentally a conversation.

#!/usr/bin/env bash
# review-diff.sh — runs in the dev loop
set -euo pipefail
diff=$(git diff --staged)

# headless call, metered after June 15 (-p is the short form of --print)
review=$(claude -p \
  "Review this diff for race conditions and SQL injection. Return JSON \
  {\"verdict\":\"ok|block\",\"notes\":\"...\"}." \
  <<<"$diff")

echo "$review" | jq -e '.verdict == "ok"' >/dev/null || {
  echo "$review" | jq -r '.notes' >&2
  exit 1
}

After: open a second Claude Code in another terminal and keep it in the team (real-time monitor mode). The reviewer is a regular interactive session, covered by your subscription.

One-time setup (terminal 2, the reviewer). Just this, inside the session:

/agmsg
# → it asks for team + agent name: answer team: dev / agent: alice
# → pick monitor (real-time) when asked how to receive
# from now on, anything addressed to alice streams into this window live

The script side (terminal 1, where claude -p used to be):

#!/usr/bin/env bash
# review-diff.sh — agmsg version
set -euo pipefail
TEAM=dev
FROM=worker # this script's identity
TO=alice    # the resident reviewer session
# hand the review to the live session; body is plain text —
# pass a reference