I Built a Self-Improving Health Platform: Five AI Agents That Learn Every Week

Most AI products are static. You fine-tune a model, ship it, and it stays exactly as smart as the day you launched. Your users get the same quality on day 365 as on day 1. Mine doesn’t work that way. Every Wednesday at 3am, a learning loop kicks off: five AI agents run, talk to each other, and make the next week’s reports smarter, without me touching a single line of code. This is the architecture that makes it possible.

The problem with static AI products

I run Longevity AI, a health platform that generates personalized 6-month lifestyle plans from a 28-question intake. It cross-references 10+ organ systems, checks 400+ claims regulated by EFSA (the European Food Safety Authority), and outputs a clinically framed report in under 15 minutes. The core AI is Claude Sonnet. It’s powerful. But it only knows what I’ve taught it.

The problem: health science moves fast. A new PubMed paper on magnesium and sleep quality drops, and my platform doesn’t know. A patient with a rare medication combination comes in, and the report might miss the interaction. A non-compliant claim sneaks past legal review, and nobody catches it until a user complains.

I had two options:

  1. Hire a team of researchers and QA engineers to manually update the system
  2. Build agents that do it automatically

I chose option 2. Here’s exactly how it works.

The multi-agent architecture

Five agents make up the system. Four run on a fixed schedule. The fifth, the orchestrator, is always active and coordinates the rest.

  • Wednesday 03:00: Synthetic Patients Agent
  • Wednesday 04:00: Auto-KB Agent
  • Tuesday 03:30: Developer Tools Radar
  • Monday 07:00: Weekly Digest Agent
  • Always active: Agent Orchestrator
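The schedule above maps naturally onto cron expressions. The sketch below is illustrative only: the agent identifiers, the `SCHEDULES` table, and the `agentsDueAt` helper are hypothetical names, and the article does not say which scheduler is actually used. Weekdays follow the standard cron convention (0 = Sunday, 1 = Monday, 2 = Tuesday, 3 = Wednesday).

```typescript
// Hypothetical schedule table derived from the times listed in the article.
// Cron field order: minute hour day-of-month month day-of-week.
interface AgentSchedule {
  agent: string;
  cron: string | null; // null means event-driven, not timer-driven
}

const SCHEDULES: AgentSchedule[] = [
  { agent: "synthetic-patients", cron: "0 3 * * 3" },  // Wednesday 03:00
  { agent: "auto-kb",            cron: "0 4 * * 3" },  // Wednesday 04:00
  { agent: "dev-tools-radar",    cron: "30 3 * * 2" }, // Tuesday 03:30
  { agent: "weekly-digest",      cron: "0 7 * * 1" },  // Monday 07:00
  { agent: "orchestrator",       cron: null },         // always active
];

// Returns the agents due at a given weekday/hour/minute.
// Only understands the plain "m h * * d" form used in the table above.
function agentsDueAt(weekday: number, hour: number, minute: number): string[] {
  return SCHEDULES.filter((s) => {
    if (s.cron === null) return false;
    const parts = s.cron.split(" ");
    return (
      Number(parts[0]) === minute &&
      Number(parts[1]) === hour &&
      Number(parts[4]) === weekday
    );
  }).map((s) => s.agent);
}
```

Calling `agentsDueAt(3, 3, 0)` returns only the synthetic patients agent, which is the entry point of the weekly loop.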

They don’t share a runtime. They communicate through the database and a lightweight event system. No complex framework: just reportAgentEvent() and a rules table.

Agent 1: Synthetic Patients

The core of the self-improvement loop. Every Wednesday at 3am, 10 synthetic patient profiles are selected from a static template library (5 conditions × 2 psychological archetypes). These are fake patients with real-looking intake responses: ferritin levels, medication lists, trauma history, stress scores. Each synthetic patient goes through the exact same production pipeline as a real user. Full Sonnet report generation. No shortcuts.
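The "5 conditions × 2 archetypes" batch is a plain cross product. Here is a minimal sketch of that step, assuming each template is keyed by a condition and an archetype. The labels are placeholders, since the article does not name the actual conditions or archetypes, and `buildWeeklyBatch` is a hypothetical helper.

```typescript
interface SyntheticProfile {
  id: string;
  condition: string;
  archetype: string;
}

// Placeholder labels: the real condition and archetype names are not given.
const CONDITIONS = ["condition-a", "condition-b", "condition-c", "condition-d", "condition-e"];
const ARCHETYPES = ["archetype-1", "archetype-2"];

// Pairs every condition with every archetype, giving one profile per pair.
function buildWeeklyBatch(conditions: string[], archetypes: string[]): SyntheticProfile[] {
  const batch: SyntheticProfile[] = [];
  conditions.forEach((condition) => {
    archetypes.forEach((archetype) => {
      batch.push({
        id: condition + "__" + archetype,
        condition: condition,
        archetype: archetype,
      });
    });
  });
  return batch;
}
```

With the lists above, `buildWeeklyBatch(CONDITIONS, ARCHETYPES)` yields 10 profiles, each of which would then be fed through the normal report pipeline.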

Then a second model, Claude Haiku, scores each report from 0 to 10 on 4 dimensions:

| Dimension | What it checks | Gap threshold |
|---|---|---|
| Protocol depth | Are expected correlations for this condition named? | below 6/10 |
| Personalization | Are this patient’s specific details in the report? | below 6/10 |
| Supplement specificity | Active biological forms named (e.g., magnesium bisglycinate)? | below 6/10 |
| Legal safety | No forbidden medical claims, no stop-medication advice? | below 7/10 |

Scores below threshold become knowledge gap proposals. A legal safety score below 5 triggers an immediate compliance scan that runs synchronously, before anything else continues. Cost: ~€0.55/week.
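The threshold logic can be sketched in a few lines. This is an illustration under assumptions, not the production code: the dimension keys, the `Evaluation` shape, and `evaluateScores` are hypothetical names, while the numbers (6, 6, 6, 7, and the urgent cutoff of 5) come from the article.

```typescript
type Dimension =
  | "protocolDepth"
  | "personalization"
  | "supplementSpecificity"
  | "legalSafety";

// One 0-10 score per dimension, as returned by the scoring model.
type Scores = Record<Dimension, number>;

// A score strictly below its threshold counts as a knowledge gap.
const GAP_THRESHOLD: Record<Dimension, number> = {
  protocolDepth: 6,
  personalization: 6,
  supplementSpecificity: 6,
  legalSafety: 7,
};

// Legal safety below this value triggers the immediate compliance scan.
const URGENT_LEGAL_CUTOFF = 5;

interface Evaluation {
  gaps: Dimension[];
  urgentComplianceScan: boolean;
}

function evaluateScores(scores: Scores): Evaluation {
  const dimensions: Dimension[] = [
    "protocolDepth",
    "personalization",
    "supplementSpecificity",
    "legalSafety",
  ];
  const gaps = dimensions.filter((d) => scores[d] < GAP_THRESHOLD[d]);
  return {
    gaps: gaps,
    urgentComplianceScan: scores.legalSafety < URGENT_LEGAL_CUTOFF,
  };
}
```

Note the two separate legal levels: a legal safety score of 6 is a gap (below 7) but not urgent, while a 4 is both a gap and a trigger for the synchronous scan.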

Agent 2: Auto-KB

The knowledge base that writes itself. The gap proposals from Agent 1 contain condition types and dimensions. Agent 2 converts these into PubMed queries, fetches abstracts via the free NCBI API, and sends each abstract to Haiku with one instruction:

"Extract 3-5 factual claims from this abstract that are directly relevant to [condition]. Return structured triples: subject, predicate, object."

The triples land in a knowledge_triples table. The report generator reads from this table at runtime. No retraining. No fine-tuning. Just better context for the next report generation.

By Wednesday afternoon, the knowledge base has been updated. By Thursday morning, real patients get smarter reports.
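Two pieces of this pipeline are easy to sketch: building the PubMed search URL and validating what the model returns before it reaches the knowledge_triples table. The endpoint is NCBI's public E-utilities `esearch` service. Everything else is an assumption for illustration: the helper names, the idea of combining a condition with a free-text focus term, and the expectation that Haiku returns a JSON array of objects.

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

const ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

// Builds an esearch URL for one gap proposal.
// How a scoring dimension maps to `focus` is an assumption, not from the article.
function buildPubMedSearchUrl(condition: string, focus: string, maxResults: number): string {
  const term = condition + " AND " + focus;
  return (
    ESEARCH_URL +
    "?db=pubmed&retmode=json&retmax=" + maxResults +
    "&term=" + encodeURIComponent(term)
  );
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Parses the model's reply and keeps only well-formed triples.
// Anything malformed is dropped rather than written to the table.
function parseTriples(modelOutput: string): Triple[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch (_err) {
    return []; // not valid JSON: treat as zero facts
  }
  if (!Array.isArray(parsed)) return [];
  const triples: Triple[] = [];
  for (let i = 0; i < parsed.length; i++) {
    const item: unknown = parsed[i];
    if (typeof item !== "object" || item === null) continue;
    const record = item as Record<string, unknown>;
    const s = record["subject"];
    const p = record["predicate"];
    const o = record["object"];
    if (isNonEmptyString(s) && isNonEmptyString(p) && isNonEmptyString(o)) {
      triples.push({ subject: s, predicate: p, object: o });
    }
  }
  return triples;
}
```

Validating before insert matters here because the report generator reads the table at runtime, so a malformed row would go straight into the next patient's context.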

Agent 3: Developer Tools Radar

Because staying current is also a product decision. Every Tuesday at 3:30am, the radar scans GitHub Trending and dev.to for tools that match a static relevance filter. Haiku summarizes each match in 1-2 sentences. The summaries land in the admin UI. The following Monday morning, I get a digest of what the dev world built that week that’s relevant to my stack. Cost: €0.04/month.
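A static relevance filter can be as simple as a keyword match over each candidate's name and description. The sketch below assumes that form. The keyword list is a placeholder (the article does not list the real filter terms), and `ToolCandidate` and `filterRelevant` are hypothetical names.

```typescript
interface ToolCandidate {
  name: string;
  description: string;
  source: "github-trending" | "dev.to";
}

// Placeholder keywords: the actual filter terms are not given in the article.
const RELEVANCE_KEYWORDS = ["llm", "agent", "knowledge base"];

// True when any keyword appears in the name or description (case-insensitive).
function isRelevant(tool: ToolCandidate, keywords: string[]): boolean {
  const haystack = (tool.name + " " + tool.description).toLowerCase();
  return keywords.some((k) => haystack.indexOf(k.toLowerCase()) !== -1);
}

// Keeps only the candidates worth sending to the summarizer.
function filterRelevant(tools: ToolCandidate[], keywords: string[]): ToolCandidate[] {
  return tools.filter((t) => isRelevant(t, keywords));
}
```

Filtering before summarizing is what keeps the cost low: only matches are sent to the model.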

Agent 4: The Orchestrator

The rule engine that connects everything. Each agent calls reportAgentEvent(type, result) when it finishes. The orchestrator applies rules:

  • R1: Legal flag → immediate compliance scan
  • R2: KB pipeline returned 0 facts → warning

Rule R1 is the critical one. If a synthetic patient triggers a legal safety score below 5, the compliance agent doesn’t wait until next week. It runs immediately.
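Here is a minimal in-memory sketch of how rules R1 and R2 could be evaluated when an agent reports in. It is not the production implementation: the article describes a database-backed rules table, while this version hard-codes the rules in an array. The event type strings, the `AgentResult` fields, and the action names are all hypothetical. Only the `reportAgentEvent(type, result)` call shape and the two rules come from the article.

```typescript
// Hypothetical result payload; only the fields the two rules need.
interface AgentResult {
  minLegalSafety?: number; // lowest legal safety score in the weekly batch
  factsAdded?: number;     // facts written by the KB pipeline
}

interface AgentEvent {
  type: string;
  result: AgentResult;
}

interface Rule {
  id: string;
  matches: (event: AgentEvent) => boolean;
  action: string;
}

const RULES: Rule[] = [
  {
    id: "R1", // legal flag -> immediate compliance scan
    matches: (e) =>
      e.type === "synthetic_patients.done" &&
      e.result.minLegalSafety !== undefined &&
      e.result.minLegalSafety < 5,
    action: "run_compliance_scan",
  },
  {
    id: "R2", // KB pipeline returned 0 facts -> warning
    matches: (e) => e.type === "auto_kb.done" && e.result.factsAdded === 0,
    action: "warn",
  },
];

// Returns the actions triggered by this event, in rule order.
// A real orchestrator would execute them; here they are only reported.
function reportAgentEvent(type: string, result: AgentResult): string[] {
  const event: AgentEvent = { type: type, result: result };
  return RULES.filter((r) => r.matches(event)).map((r) => r.id + ":" + r.action);
}
```

Keeping rules as data rather than branching code is what lets a new rule be added as a table row instead of a deploy.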

Agent 5: Weekly Digest

The operator dashboard I never have to build. Every Monday at 7am, an HTML email lands in my inbox with:

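  • The number of new facts that entered the knowledge base this week
  • Synthetic loop results: gaps found, legal flags (if any)
  • PubMed papers processed automatically
  • Developer tools on the radar
  • System cost this week

I know exactly what the system learned, fixed, and flagged, without logging into a dashboard or running a query.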
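The digest items listed above reduce to one stats object rendered as an HTML list. The sketch below assumes that shape. `WeeklyStats`, its field names, and `renderDigest` are hypothetical, and actually sending the email is left out.

```typescript
interface WeeklyStats {
  newFacts: number;
  gapsFound: number;
  legalFlags: number;
  papersProcessed: number;
  radarTools: string[];
  costEur: number;
}

// Escapes text that originates outside the system (e.g. tool names).
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Renders the weekly numbers as a simple HTML list for the email body.
function renderDigest(stats: WeeklyStats): string {
  const lines = [
    "New facts in the knowledge base: " + stats.newFacts,
    "Synthetic loop: " + stats.gapsFound + " gaps, " + stats.legalFlags + " legal flags",
    "PubMed papers processed: " + stats.papersProcessed,
    "Developer tools on the radar: " + (stats.radarTools.join(", ") || "none"),
    "System cost this week: EUR " + stats.costEur.toFixed(2),
  ];
  return "<ul>" + lines.map((l) => "<li>" + escapeHtml(l) + "</li>").join("") + "</ul>";
}
```

Escaping matters mostly for the radar entries, since tool names and summaries come from external sites.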