I Built a Self-Improving Health Platform: Five AI Agents That Learn Every Week


Most AI products are static. You fine-tune a model, ship it, and it stays exactly as smart as the day you launched. Your users get the same quality on day 1 as on day 365. Mine doesn’t work that way. Every Wednesday at 3am, five AI agents wake up, talk to each other, and make the next week’s reports smarter, without me touching a single line of code. This is the architecture that makes it possible.

The problem with static AI products


I run Longevity AI, a health platform that generates personalized 6-month lifestyle plans from a 28-question intake. It cross-references 10+ organ systems, checks 400+ EFSA-regulated claims, and outputs a clinically framed report in under 15 minutes. The core AI is Claude Sonnet. It’s powerful. But it only knows what I’ve taught it.

The problem: health science moves fast. A new PubMed paper on magnesium and sleep quality drops. My platform doesn’t know. A patient with a rare medication combination comes in. The report might miss the interaction. A legally risky claim sneaks past review. Nobody catches it until a user complains.

I had two options:

1. Hire a team of researchers and QA engineers to manually update the system
2. Build agents that do it automatically

I chose option 2. Here’s exactly how it works.

The multi-agent architecture


Five agents run on a fixed schedule. One orchestrates them all.

Wednesday 03:00: Synthetic Patients Agent
Wednesday 04:00: Auto-KB Agent
Tuesday 03:30: Developer Tools Radar
Monday 07:00: Weekly Digest Agent
Always active: Agent Orchestrator

They don’t share a runtime. They communicate through the database and a lightweight event system. No complex framework, just reportAgentEvent() and a rules table.

Agent 1: Synthetic Patients


The core of the self-improvement loop. Every Wednesday at 3am, 10 synthetic patient profiles are selected from a static template library (5 conditions x 2 psychological archetypes). These are fake patients with real-looking intake responses: ferritin levels, medication lists, trauma history, stress scores. Each synthetic patient goes through the exact same production pipeline as a real user. Full Sonnet report generation. No shortcuts.
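As a rough sketch of how a 5 x 2 template library yields the 10 weekly profiles, the cohort can be built as a plain cross product. The condition and archetype labels and the `buildWeeklyCohort` helper are placeholders invented for illustration; the article gives only the counts.

```typescript
// Hypothetical labels: the article gives the counts (5 conditions, 2 archetypes)
// but not the names, so these are placeholders.
const CONDITIONS = ["condition-1", "condition-2", "condition-3", "condition-4", "condition-5"];
const ARCHETYPES = ["archetype-a", "archetype-b"];

interface SyntheticProfile {
  id: string;
  condition: string;
  archetype: string;
}

// Pair every condition with every archetype (a plain cross product).
function buildWeeklyCohort(conditions: string[], archetypes: string[]): SyntheticProfile[] {
  const cohort: SyntheticProfile[] = [];
  for (const condition of conditions) {
    for (const archetype of archetypes) {
      cohort.push({ id: condition + "/" + archetype, condition, archetype });
    }
  }
  return cohort;
}

// 5 conditions x 2 archetypes gives 10 profiles per weekly run.
const cohort = buildWeeklyCohort(CONDITIONS, ARCHETYPES);
```

Each profile in `cohort` would then be submitted through the same report pipeline a real intake uses.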

Then a second agent, Claude Haiku, scores each report on 4 dimensions:

Dimension	What it checks	Gap threshold
Protocol depth	Are expected correlations for this condition named?	below 6/10
Personalization	Are this patient’s specific details in the report?	below 6/10
Supplement specificity	Active biological forms named (e.g., magnesium bisglycinate)?	below 6/10
Legal safety	No forbidden medical claims, no stop-medication advice?	below 7/10


Scores below threshold become knowledge gap proposals. A legal safety score below 5 also triggers an immediate compliance scan, run synchronously before anything else continues. Cost: ~€0.55/week.
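The threshold logic in the table can be sketched as follows. The thresholds (6, 6, 6, 7) and the escalation cutoff (5) come from the text; the type names, field names, and the `triageReport` function are assumptions made for illustration, not the platform's actual code.

```typescript
type Dimension = "protocolDepth" | "personalization" | "supplementSpecificity" | "legalSafety";

// Gap thresholds from the table: a score strictly below the value is a gap.
const GAP_THRESHOLDS: Record<Dimension, number> = {
  protocolDepth: 6,
  personalization: 6,
  supplementSpecificity: 6,
  legalSafety: 7,
};

// Legal safety below this value escalates to an immediate compliance scan.
const LEGAL_ESCALATION_THRESHOLD = 5;

interface GapProposal {
  condition: string;
  dimension: Dimension;
  score: number;
}

interface Triage {
  gaps: GapProposal[];
  needsComplianceScan: boolean;
}

// Hypothetical helper: turns one scored report into gap proposals and a legal flag.
function triageReport(condition: string, scores: Record<Dimension, number>): Triage {
  const gaps: GapProposal[] = [];
  const dimensions = Object.keys(GAP_THRESHOLDS) as Dimension[];
  for (const dimension of dimensions) {
    if (scores[dimension] < GAP_THRESHOLDS[dimension]) {
      gaps.push({ condition, dimension, score: scores[dimension] });
    }
  }
  return { gaps, needsComplianceScan: scores.legalSafety < LEGAL_ESCALATION_THRESHOLD };
}

// Example: personalization (5) and legal safety (4) fall below their thresholds,
// and legal safety is also below the escalation cutoff.
const triage = triageReport("condition-1", {
  protocolDepth: 8,
  personalization: 5,
  supplementSpecificity: 6,
  legalSafety: 4,
});
```

Keeping the gap threshold (7) and the escalation cutoff (5) as separate constants means a legal score of 5 or 6 still produces a gap proposal without interrupting the run.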

Agent 2: Auto-KB


The knowledge base that writes itself. The gap proposals from Agent 1 contain condition types and dimensions. Agent 2 converts these into PubMed queries, fetches abstracts via the free NCBI API, and sends each abstract to Haiku with one instruction: “Extract 3-5 factual claims from this abstract that are directly relevant to [condition]. Return structured triples: subject, predicate, object.” The triples land in a knowledge_triples table. The report generator reads from this table at runtime. No retraining. No fine-tuning. Just better context for the next generation.

By Wednesday afternoon, the knowledge base has been updated. By Thursday morning, real patients get smarter reports.
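A minimal, network-free sketch of the Auto-KB steps: building a PubMed search URL, building the extraction prompt, and validating the triples before they are stored. The URL is the public NCBI E-utilities ESearch endpoint. Everything else is an assumption for illustration: the function names, how a gap proposal becomes a search term, and the idea that the model returns its triples as a JSON array.

```typescript
const ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

// Builds an ESearch URL for PubMed. How a gap proposal maps to a search term
// is not described in the article; the caller supplies the term here.
function buildPubMedSearchUrl(searchTerm: string, maxResults: number): string {
  return (
    ESEARCH_URL +
    "?db=pubmed&retmode=json&retmax=" + maxResults +
    "&term=" + encodeURIComponent(searchTerm)
  );
}

// Uses the instruction quoted in the article, followed by the abstract text.
function buildExtractionPrompt(condition: string, abstractText: string): string {
  return (
    "Extract 3-5 factual claims from this abstract that are directly relevant to " +
    condition +
    ". Return structured triples: subject, predicate, object.\n\nAbstract:\n" +
    abstractText
  );
}

interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Assumes the model replies with a JSON array of {subject, predicate, object}.
// Malformed replies and malformed items are dropped rather than stored.
function parseTriples(raw: string): Triple[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  const triples: Triple[] = [];
  for (const item of parsed) {
    if (item === null || typeof item !== "object") continue;
    const { subject, predicate, object } = item as Record<string, unknown>;
    if (typeof subject === "string" && typeof predicate === "string" && typeof object === "string") {
      triples.push({ subject, predicate, object });
    }
  }
  return triples;
}

const searchUrl = buildPubMedSearchUrl("magnesium AND sleep quality", 5);
```

Returning an empty array on malformed output keeps bad rows out of knowledge_triples, and a run that extracts nothing at all is the case the orchestrator's R2 rule warns about.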

Agent 3: Developer Tools Radar


Because staying current is also a product decision. Every Tuesday at 3:30am, the radar scans GitHub Trending and dev.to for tools that match a static relevance filter. Haiku summarizes each match in 1-2 sentences. The summaries land in the admin UI. On Monday morning I get a digest of what the dev world built that week that’s relevant to my stack. Cost: €0.04/month.

Agent 4: The Orchestrator


The rule engine that connects everything. Each agent calls reportAgentEvent(type, result) when it finishes. The orchestrator applies rules:

R1: Legal flag → immediate compliance scan
R2: KB pipeline returned 0 facts → warning

Rule R1 is the critical one. If a synthetic patient triggers a legal safety score below 5, the compliance agent doesn’t wait until next week. It runs immediately.
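One way to picture the event hand-off and the two rules is the sketch below. Only the name `reportAgentEvent(type, result)` and the rules R1 and R2 come from the article. The event type strings, the result fields, the action names, and the in-memory `eventLog` array (standing in for the database table) are all assumptions.

```typescript
// Hypothetical result shape: each agent fills in only the fields it knows about.
interface AgentResult {
  minLegalSafety?: number; // lowest legal safety score in the synthetic run
  factsAdded?: number;     // triples written by the KB pipeline
}

interface AgentEvent {
  type: string;
  result: AgentResult;
}

interface Rule {
  id: string;
  action: string;
  matches: (event: AgentEvent) => boolean;
}

// Rules table. R1 and R2 mirror the article; the action names are placeholders.
const RULES: Rule[] = [
  {
    id: "R1",
    action: "run_compliance_scan",
    matches: (e) =>
      e.type === "synthetic_patients" &&
      e.result.minLegalSafety !== undefined &&
      e.result.minLegalSafety < 5,
  },
  {
    id: "R2",
    action: "warn",
    matches: (e) => e.type === "auto_kb" && e.result.factsAdded === 0,
  },
];

// In-memory stand-in for the events table the agents share.
const eventLog: AgentEvent[] = [];

// Records the event, then returns the actions of every rule it matches.
function reportAgentEvent(type: string, result: AgentResult): string[] {
  const event: AgentEvent = { type, result };
  eventLog.push(event);
  return RULES.filter((rule) => rule.matches(event)).map((rule) => rule.action);
}

const legalActions = reportAgentEvent("synthetic_patients", { minLegalSafety: 4 });
const emptyKbActions = reportAgentEvent("auto_kb", { factsAdded: 0 });
```

Because the rules are data rather than control flow, adding a third rule means appending one entry to the table, and the agents themselves stay unchanged.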

Agent 5: Weekly Digest


The operator dashboard I never have to build. Every Monday at 7am, an HTML email lands in my inbox with:

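The number of new facts that entered the knowledge base this week
Synthetic loop results: gaps found and legal flags, if any
PubMed papers processed automatically
Developer tools on the radar
System cost for the week

I know exactly what the system learned, fixed, and flagged, without logging into a dashboard or running a query.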