I Built a Self-Improving Health Platform: Five AI Agents That Learn Every Week
Most AI products are static. You fine-tune a model, ship it, and it stays exactly as smart as the day you launched. Your users get the same quality on day 365 as on day 1. Mine doesn’t work that way. Every Wednesday at 3am, five AI agents wake up, talk to each other, and make the next week’s reports smarter, without me touching a single line of code. This is the architecture that makes it possible.
The problem with static AI products
I run Longevity AI, a health platform that generates personalized 6-month lifestyle plans from a 28-question intake. It cross-references 10+ organ systems, checks 400+ EFSA-regulated claims, and outputs a clinically framed report in under 15 minutes. The core AI is Claude Sonnet. It’s powerful. But it only knows what I’ve taught it.
The problem: health science moves fast. A new PubMed paper on magnesium and sleep quality drops. My platform doesn’t know. A patient with a rare medication combination comes in. The report might miss the interaction. A legal claim sneaks past review. Nobody catches it until a user complains.
I had two options:
- Hire a team of researchers and QA engineers to manually update the system
- Build agents that do it automatically

I chose option 2. Here’s exactly how it works.
The multi-agent architecture
Five agents run on a fixed schedule. One orchestrates them all.
- Wednesday 03:00: Synthetic Patients Agent
- Wednesday 04:00: Auto-KB Agent
- Tuesday 03:30: Developer Tools Radar
- Monday 07:00: Weekly Digest Agent
- Always active: Agent Orchestrator
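The timetable above can be written down as plain data. The following is a minimal sketch, not the platform's actual scheduler: the agent keys and the `agents_due` helper are hypothetical names, and it assumes a job runner calls the helper once per minute.

```python
from datetime import datetime

# Weekday numbers follow datetime.weekday(): Monday=0 ... Sunday=6.
# Agent keys are illustrative names, not the platform's real identifiers.
SCHEDULE = {
    "synthetic_patients": (2, 3, 0),   # Wednesday 03:00
    "auto_kb":            (2, 4, 0),   # Wednesday 04:00
    "dev_tools_radar":    (1, 3, 30),  # Tuesday 03:30
    "weekly_digest":      (0, 7, 0),   # Monday 07:00
}


def agents_due(now: datetime) -> list[str]:
    """Return the agents whose weekly slot matches `now` to the minute."""
    slot = (now.weekday(), now.hour, now.minute)
    return [name for name, when in SCHEDULE.items() if when == slot]
```

The orchestrator is left out of the table on purpose: it is always active and reacts to events rather than to the clock.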
They don’t share a runtime. They communicate through the database and a lightweight event system. No complex framework, just reportAgentEvent() and a rules table.
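To make "communicate through the database" concrete, here is a rough sketch of what an event write and read could look like. It uses SQLite for the sake of a self-contained example; the `agent_events` table, its columns, and the Python-style `report_agent_event` signature are assumptions, not the platform's real schema or API.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical events table: one row per finished agent run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "agent_type TEXT NOT NULL, "
        "result TEXT NOT NULL, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )


def report_agent_event(conn: sqlite3.Connection, agent_type: str, result: dict) -> int:
    """Persist an agent's result as JSON and return the new row id."""
    cur = conn.execute(
        "INSERT INTO agent_events (agent_type, result) VALUES (?, ?)",
        (agent_type, json.dumps(result)),
    )
    conn.commit()
    return cur.lastrowid


def events_for(conn: sqlite3.Connection, agent_type: str) -> list[dict]:
    """Read back every stored result for one agent, oldest first."""
    rows = conn.execute(
        "SELECT result FROM agent_events WHERE agent_type = ? ORDER BY id",
        (agent_type,),
    ).fetchall()
    return [json.loads(row[0]) for row in rows]
```

Because the only shared state is a table, each agent can run as its own process on its own schedule and still see what the others reported.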
Agent 1: Synthetic Patients
The core of the self-improvement loop. Every Wednesday at 3am, 10 synthetic patient profiles are selected from a static template library (5 conditions × 2 psychological archetypes). These are fake patients with real-looking intake responses: ferritin levels, medication lists, trauma history, stress scores. Each synthetic patient goes through the exact same production pipeline as a real user. Full Sonnet report generation. No shortcuts.
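The 5 × 2 template grid is easy to picture as a cross product. This is only an illustration: the condition and archetype names below are invented placeholders, and the real library holds full intake responses rather than two labels.

```python
from itertools import product

# Placeholder labels; the actual template library is not shown in this post.
CONDITIONS = ["condition_a", "condition_b", "condition_c", "condition_d", "condition_e"]
ARCHETYPES = ["archetype_1", "archetype_2"]


def build_weekly_cohort() -> list[dict]:
    """Pair every condition with every archetype: 5 x 2 = 10 profiles."""
    return [
        {
            "patient_id": f"synthetic-{i:02d}",
            "condition": condition,
            "archetype": archetype,
            "synthetic": True,  # keeps test runs separable from real users
        }
        for i, (condition, archetype) in enumerate(product(CONDITIONS, ARCHETYPES), start=1)
    ]
```

Flagging each profile as synthetic is a design choice worth keeping in any real version, so that test reports never mix with real patient data downstream.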
Then a second model, Claude Haiku, scores each report on 4 dimensions:
| Dimension | What it checks | Gap threshold |
|---|---|---|
| Protocol depth | Are expected correlations for this condition named? | below 6/10 |
| Personalization | Are this patient’s specific details in the report? | below 6/10 |
| Supplement specificity | Active biological forms named (e.g., magnesium bisglycinate)? | below 6/10 |
| Legal safety | No forbidden medical claims, no stop-medication advice? | below 7/10 |
Scores below threshold become knowledge gap proposals. A legal safety score below 5 triggers an immediate compliance scan that runs synchronously, before anything else continues. Cost: ~€0.55/week.
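The two-level threshold logic (gap at 6 or 7, compliance scan at 5) can be sketched in a few lines. The thresholds come from the table above; the dictionary keys, the `evaluate_scores` name, and the shape of a gap proposal are assumptions made for illustration.

```python
# A score strictly below the threshold counts as a knowledge gap.
GAP_THRESHOLDS = {
    "protocol_depth": 6,
    "personalization": 6,
    "supplement_specificity": 6,
    "legal_safety": 7,
}
# Legal safety below this value also forces an immediate compliance scan.
COMPLIANCE_SCAN_BELOW = 5


def evaluate_scores(condition: str, scores: dict) -> tuple[list[dict], bool]:
    """Return (gap proposals, whether a compliance scan must run now)."""
    gaps = [
        {"condition": condition, "dimension": dim, "score": scores[dim]}
        for dim, threshold in GAP_THRESHOLDS.items()
        if scores[dim] < threshold
    ]
    needs_scan = scores["legal_safety"] < COMPLIANCE_SCAN_BELOW
    return gaps, needs_scan
```

Note that a legal safety score of 5 or 6 still produces a gap proposal but does not trigger the scan; only scores of 4 and lower do both.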
Agent 2: Auto-KB
The knowledge base that writes itself. The gap proposals from Agent 1 contain condition types and dimensions. Agent 2 converts these into PubMed queries, fetches abstracts via the free NCBI API, and sends each abstract to Haiku with one instruction:

"Extract 3-5 factual claims from this abstract that are directly relevant to [condition]. Return structured triples: subject, predicate, object."

The triples land in a knowledge_triples table. The report generator reads from this table at runtime. No retraining. No fine-tuning. Just better context for the next report generation.
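A rough sketch of the non-LLM parts of this pipeline follows: building a PubMed search URL, validating the model's reply, and storing triples. The NCBI E-utilities `esearch.fcgi` endpoint and its `db`, `term`, `retmode`, and `retmax` parameters are real; everything else is assumed for illustration, including how a gap maps to search terms, the expectation that Haiku replies with a JSON array, and the `knowledge_triples` column layout. No network call is made here.

```python
import json
import sqlite3
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(condition: str, dimension_terms: list[str], retmax: int = 5) -> str:
    """Build a PubMed search URL for one gap proposal (query shape is assumed)."""
    term = f"{condition} AND ({' OR '.join(dimension_terms)})"
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def parse_triples(model_output: str) -> list[tuple[str, str, str]]:
    """Keep only well-formed (subject, predicate, object) items from a JSON array."""
    keys = ("subject", "predicate", "object")
    return [
        tuple(item[k].strip() for k in keys)
        for item in json.loads(model_output)
        if isinstance(item, dict)
        and all(isinstance(item.get(k), str) and item[k].strip() for k in keys)
    ]


def store_triples(conn: sqlite3.Connection, condition: str, pmid: str, triples: list) -> int:
    """Insert triples with their source PMID; returns the number stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS knowledge_triples "
        "(condition TEXT, pmid TEXT, subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany(
        "INSERT INTO knowledge_triples VALUES (?, ?, ?, ?, ?)",
        [(condition, pmid, *t) for t in triples],
    )
    conn.commit()
    return len(triples)
```

Validating the model output before insertion matters here because the report generator reads this table directly at runtime; a malformed row would otherwise flow straight into a patient report. Keeping the PMID alongside each triple also makes it possible to trace any claim back to its abstract.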
By Wednesday afternoon, the knowledge base has been updated. By Thursday morning, real patients get smarter reports.
Agent 3: Developer Tools Radar
Because staying current is also a product decision. Every Tuesday at 3:30am, the radar scans GitHub Trending and dev.to for tools that match a static relevance filter. Haiku summarizes each match in 1-2 sentences. The summaries land in the admin UI. On Monday morning I get a digest of what the dev world built this week that’s relevant to my stack. Cost: €0.04/month.
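A "static relevance filter" can be as simple as a keyword set checked before anything is sent to Haiku. The sketch below assumes exactly that; the keywords and the item shape (`title`, `description`) are hypothetical, and the real filter may work differently.

```python
import re

# Hypothetical keyword list; the real filter's contents are not given in this post.
RELEVANCE_KEYWORDS = {"llm", "agent", "rag", "postgres"}


def is_relevant(item: dict) -> bool:
    """True if any whole word in the title or description is a keyword."""
    text = f"{item['title']} {item.get('description', '')}".lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    return bool(words & RELEVANCE_KEYWORDS)


def filter_items(items: list[dict]) -> list[dict]:
    """Drop everything that does not pass the static filter."""
    return [item for item in items if is_relevant(item)]
```

Filtering before summarizing is one plausible reason the cost stays so low: only the few items that pass ever reach the model. Whole-word matching is deliberately strict, so "agents" would not match "agent" unless the plural is added to the list.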
Agent 4: The Orchestrator
The rule engine that connects everything. Each agent calls reportAgentEvent(type, result) when it finishes. The orchestrator applies rules:
- R1: Legal flag → immediate compliance scan
- R2: KB pipeline returned 0 facts → warning
Rule R1 is the critical one. If a synthetic patient triggers a legal safety score below 5, the compliance agent doesn’t wait until next week. It runs immediately.
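Rules R1 and R2 can be modeled as rows of data plus a small matcher. This is a sketch under assumptions: the `Rule` fields, the `apply_rules` name, the action strings, and the result keys (`min_legal_safety`, `facts_added`) are all invented for illustration, and in the real system the rules live in a database table rather than a Python list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    agent_type: str                     # which agent's events the rule watches
    condition: Callable[[dict], bool]   # predicate over the event result
    action: str                         # name of the action to dispatch


RULES = [
    # R1: legal flag -> immediate compliance scan
    Rule("R1", "synthetic_patients",
         lambda r: r.get("min_legal_safety", 10) < 5, "run_compliance_scan"),
    # R2: KB pipeline returned 0 facts -> warning
    Rule("R2", "auto_kb",
         lambda r: r.get("facts_added", 0) == 0, "warn"),
]


def apply_rules(agent_type: str, result: dict) -> list[tuple[str, str]]:
    """Return (rule_id, action) for every rule the event triggers, in rule order."""
    return [
        (rule.rule_id, rule.action)
        for rule in RULES
        if rule.agent_type == agent_type and rule.condition(result)
    ]
```

For example, `apply_rules("synthetic_patients", {"min_legal_safety": 4})` yields the R1 action, which the caller would then execute synchronously before letting any other agent continue.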
Agent 5: Weekly Digest
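The operator dashboard I never have to build. Every Monday at 7am, an HTML email lands in my inbox with:
- The number of new facts added to the knowledge base this week
- Synthetic loop results: gaps found, legal flags (if any)
- PubMed papers processed automatically
- Developer tools on the radar
- System cost for the week

I know exactly what the system learned, fixed, and flagged, without logging into a dashboard or running a query.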
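Rendering the digest body takes little more than string formatting. The sketch below assumes the week's numbers have already been aggregated into a dictionary; the key names and the `render_digest` function are hypothetical, and sending the email is out of scope.

```python
from html import escape


def render_digest(stats: dict) -> str:
    """Turn one week's aggregated numbers into a small HTML fragment."""
    rows = [
        ("New KB facts", stats["new_facts"]),
        ("Gaps found", stats["gaps_found"]),
        ("Legal flags", stats["legal_flags"]),
        ("PubMed papers processed", stats["papers_processed"]),
        ("Dev tools on radar", stats["radar_tools"]),
        ("System cost (EUR)", f"{stats['cost_eur']:.2f}"),
    ]
    # Escape every value so stray markup in a tool name cannot break the email.
    items = "".join(
        f"<li>{escape(label)}: {escape(str(value))}</li>" for label, value in rows
    )
    return f"<h1>Weekly digest</h1><ul>{items}</ul>"
```

The returned string could be used as the HTML part of an email built with Python's standard `email.message.EmailMessage`.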