We Tried 6 Memory Providers for Hermes Agent — Here's What We Learned
Giving an AI agent persistent memory sounds simple. Store facts. Recall them later. How hard can it be? Three weeks and six providers later, I have opinions. This is the story of what broke, what we discarded, and the one thing that finally worked, and why.
The Setup
I run Hermes Agent on a headless VPS with 4GB RAM. Nothing exotic. The goal was straightforward: the agent should remember things across sessions (my preferences, environment details, lessons learned) without me repeating myself every conversation. Hermes ships with several bundled memory providers and supports third-party ones via plugins. Should be plug-and-play, right?
Phase 1: The Ones That Failed Silently
AgentMemory
The first provider we had. Node.js runtime, Docker container for the iii-engine, 860 memories at peak. It seemed fine. Then we switched to a different provider to try it out. AgentMemory’s ingestion died instantly — but nothing told us. Tools responded normally. No errors in logs. Just… nothing was being stored anymore. Root cause: Hermes supports exactly one active memory provider. The switch disabled AgentMemory’s sync_turn() without a warning. The deadliest failure mode: total silence.
YantrikDB
Tried as a replacement. Same silent failure. MCP tools responded “OK” but ingestion was completely dead. We never stored a single memory. Uninstalled alongside AgentMemory in the same cleanup session.
Lesson #1: A memory provider that fails silently is worse than no provider at all. False confidence corrupts everything.
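One way to make this kind of failure loud is a heartbeat that watches whether the store is still growing while turns keep arriving. The sketch below is illustrative only: `IngestionHeartbeat`, `record_turn`, and the `count_memories` callback are hypothetical names, not part of Hermes or any provider mentioned here. It assumes you have some way to count stored memories.

```python
from typing import Callable


class IngestionStalled(RuntimeError):
    """Raised when turns keep arriving but the memory store stops growing."""


class IngestionHeartbeat:
    """Hypothetical watchdog; not part of Hermes or any memory provider."""

    def __init__(self, count_memories: Callable[[], int], max_idle_turns: int = 5):
        self._count_memories = count_memories
        self._max_idle_turns = max_idle_turns
        self._last_count = count_memories()
        self._idle_turns = 0

    def record_turn(self) -> None:
        """Call once per turn, after the provider has had its chance to store."""
        current = self._count_memories()
        if current > self._last_count:
            # The store grew, so ingestion is alive; reset the idle counter.
            self._last_count = current
            self._idle_turns = 0
            return
        self._idle_turns += 1
        if self._idle_turns >= self._max_idle_turns:
            raise IngestionStalled(
                f"no new memories after {self._idle_turns} turns "
                f"(store still holds {current})"
            )
```

Not every turn produces a memory, so the threshold trades false alarms against detection delay. The point is only that a stalled store should raise something, somewhere.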
Phase 2: The One That Wouldn’t Die (Or Live)
Hindsight
This one looked promising on paper. Bundled with Hermes. 91.4% on the LongMemEval benchmark. Knowledge graphs, reflect synthesis: the “power pick.”
Reality:
- Installed the wrong package first (hindsight-all vs hindsight-client)
- API key caching bugs — daemon held stale env vars across restarts
- Embedded PostgreSQL (pg0) tried to download itself and hung for 177 seconds
- After full uninstall (pip uninstall, config cleaned, directories deleted, plugin disabled), daemons kept respawning every 2 minutes. The gateway cached plugin state at startup and wouldn’t let go. Breaking the cycle required stopping the gateway, hunting processes with pkill -9, and restarting. A hard kill. For a memory plugin.
Lesson #2: If uninstallation requires killing processes by force, the architecture is wrong. A memory provider’s lifecycle should not require a process manager.
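Before reaching for pkill -9, it helps to confirm what is actually still running. Here is a minimal sketch, assuming a POSIX `ps` is available. `find_processes` and `leftover_processes` are hypothetical helper names, and the string you search for depends on how the daemon shows up in your process list.

```python
import subprocess


def find_processes(ps_output: str, needle: str) -> list[tuple[int, str]]:
    """Return (pid, command) pairs whose command line contains `needle`."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        if needle in command:
            matches.append((pid, command))
    return matches


def leftover_processes(needle: str) -> list[tuple[int, str]]:
    """List running processes matching `needle` (assumes a POSIX `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_processes(result.stdout, needle)
```

Running `leftover_processes("hindsight")` after an uninstall, and again a few minutes later, would show whether something is respawning the daemon. An empty list both times is what a clean uninstall should look like.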
Phase 3: The Evaluation
At this point we had criteria. Real criteria, earned through pain:
- Cannot silently fail — if ingestion stops, I need to know
- Simple uninstall — no daemon ghosts
- Local-first — no cloud dependency, no API key expiry taking down memory
- Hermes-specific author instructions — the #1 predictor of whether integration actually works
- No double token burn: I’m not paying for inference twice
We surveyed what was available:
| Provider | Verdict | Killer Flaw |
|---|---|---|
| Holographic (bundled) | Too simple | sync_turn() is a no-op — no auto-ingestion |
| Supermemory (bundled) | Cloud-only | All cloud. Best benchmarks, but contradicts local-first |
| Mem0 | Double token burn | LLM-Embedded: the agent calls an LLM, Mem0 calls its OWN LLM for fact extraction. Pay twice. |
| MemPalace | Wrong platform | 96.6% LongMemEval, but built for Claude Code — not Hermes |
Phase 4: The One That Worked
Mnemosyne
By AxDSan. Posted directly to r/hermesagent by its author. The README literally says: “The Zero-Dependency, Sub-Millisecond AI Memory System for Hermes Agents.”
What makes it different:
- In-process Python + SQLite. No separate service. No Docker. No daemon. If the gateway process runs, memory works. There is nothing to fall out of sync with.
- Sub-millisecond reads. 0.076ms. 500x faster than the previous-generation providers. You don’t feel it.
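To make “in-process Python + SQLite” concrete, here is a minimal sketch of the general pattern using only the standard library. It is not Mnemosyne’s schema or code. The table layout and the names `open_store`, `remember`, `recall`, and `mean_read_ms` are assumptions made for illustration, and the timing helper measures your machine, not the 0.076ms figure above.

```python
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a memory store that lives inside the calling process."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, content TEXT NOT NULL, importance REAL NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, content: str, importance: float = 0.5) -> int:
    """Insert one memory and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO memories (content, importance) VALUES (?, ?)",
            (content, importance),
        )
    return cur.lastrowid


def recall(conn: sqlite3.Connection, memory_id: int) -> Optional[str]:
    """Fetch one memory by primary key, or None if it does not exist."""
    row = conn.execute(
        "SELECT content FROM memories WHERE id = ?", (memory_id,)
    ).fetchone()
    return row[0] if row else None


def mean_read_ms(conn: sqlite3.Connection, memory_id: int, runs: int = 1000) -> float:
    """Average wall-clock time of a primary-key read, in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        recall(conn, memory_id)
    return (time.perf_counter() - start) * 1000 / runs
```

There is no socket, container, or second process here: a read is a function call into a library already loaded in the gateway’s address space, which is why there is nothing to fall out of sync with.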
Three code paths, all verified working:
- Explicit remember: the agent calls remember() when asked
- Auto-ingestion: sync_turn captures every conversation turn automatically
- Context injection: high-importance memories surface in each turn’s system prompt
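The three paths can be pictured with a toy provider. This is a hypothetical sketch, not Mnemosyne’s implementation or the Hermes plugin interface. The method signatures, the importance values, and the `build_context` name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    importance: float


@dataclass
class SketchProvider:
    """Hypothetical provider showing three paths; not Mnemosyne's real code."""

    inject_threshold: float = 0.8
    memories: list[Memory] = field(default_factory=list)

    def remember(self, content: str, importance: float = 0.9) -> None:
        """Path 1: explicit store, called when the user asks the agent to remember."""
        self.memories.append(Memory(content, importance))

    def sync_turn(self, user_msg: str, assistant_msg: str) -> None:
        """Path 2: automatic capture of every turn, at a lower default importance."""
        text = f"user: {user_msg}\nassistant: {assistant_msg}"
        self.memories.append(Memory(text, 0.3))

    def build_context(self, limit: int = 5) -> str:
        """Path 3: text to place in the system prompt for the next turn."""
        top = sorted(
            (m for m in self.memories if m.importance >= self.inject_threshold),
            key=lambda m: m.importance,
            reverse=True,
        )[:limit]
        return "\n".join(f"- {m.content}" for m in top)
```

Splitting the paths this way makes each one independently testable: you can check that an explicit remember() lands, that sync_turn is not a no-op, and that only high-importance items reach the prompt.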
Installation took three short commands:
pip install mnemosyne-memory[embeddings]
python -m mnemosyne.install
hermes memory setup # interactive picker → select "mnemosyne"
No [all] — that pulls ctransformers and downloads 1–4GB of GGUF models. On a 4GB machine, that’s OOM territory. The [embeddings] extra adds fastembed (133MB ONNX model) for semantic search, and LLM consolidation routes through your existing API key.
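On a small machine it is worth confirming which optional packages actually landed in the environment. Here is a small sketch using the standard library’s `importlib.util.find_spec`, which locates a top-level module without importing it. The module names are assumed to match the package names mentioned above, and `installed` is a hypothetical helper.

```python
from importlib.util import find_spec

# Module names assumed from the package names mentioned in the text.
HEAVY_MODULES = ("ctransformers",)
EXPECTED_MODULES = ("fastembed",)


def installed(modules: tuple[str, ...]) -> dict[str, bool]:
    """Report which top-level modules are importable, without importing them."""
    return {name: find_spec(name) is not None for name in modules}
```

Calling `installed(HEAVY_MODULES + EXPECTED_MODULES)` after the install gives a quick yes/no per module. On a 4GB box you would hope to see the heavy one absent.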
After three weeks of operation:
- 362 working memories
- 29 episodic summaries (auto-consolidation working)
- 27/27 test suite passing
- Zero silent failures. Zero daemon hunts. Zero forced kills.
The Pattern
Every failed provider shared one architectural decision: an external runtime with its own lifecycle. AgentMemory’s Node.js Docker. Hindsight’s pg0 Postgres + daemon. When the runtime and the gateway fell out of sync, the result was silent failure, ghost processes, and respawn loops. Mnemosyne’s in-process Python + SQLite avoids this entirely. It’s the simplest thing that could possibly work, and that turns out to be the hardest thing to get right, because every other provider ships complexity as a feature.
What I’d Tell Someone Starting Today
- Local-first, single-process. If memory needs a separate service, it will fail in ways you won’t notice.
- Verify ingestion before trusting it. After installing any memory provider, store a test fact, restart, and ask for it back.
- The author matters. Does the provider’s README mention your agent platform by name? If not, you’re doing integration work the author didn’t do.
- [all] is a trap. Read the install extras. On constrained hardware, the “everything” option downloads models you don’t need.
- Clean uninstall is a feature. If removing a provider takes more than deleting a directory, the architecture is fragile.
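The “store a test fact, restart, and ask for it back” check can be rehearsed against a plain SQLite file. The sketch below stands in for a real provider: closing and reopening the connection plays the role of the restart, and `store_fact`, `fact_survived`, `canary_check`, and `demo` are hypothetical names. With an actual provider you would run the same three steps through the agent itself.

```python
import os
import sqlite3
import tempfile
import uuid


def store_fact(db_path: str, fact: str) -> None:
    """Write one fact and close the connection, as a process exit would."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit the table creation and the insert together
            conn.execute("CREATE TABLE IF NOT EXISTS facts (content TEXT NOT NULL)")
            conn.execute("INSERT INTO facts (content) VALUES (?)", (fact,))
    finally:
        conn.close()


def fact_survived(db_path: str, fact: str) -> bool:
    """Open a fresh connection (the 'restart') and look for the fact."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT 1 FROM facts WHERE content = ? LIMIT 1", (fact,)
        ).fetchone()
        return row is not None
    finally:
        conn.close()


def canary_check(db_path: str) -> bool:
    """Store a unique marker, then confirm a new connection can read it back."""
    marker = f"canary-{uuid.uuid4()}"
    store_fact(db_path, marker)
    return fact_survived(db_path, marker)


def demo() -> bool:
    """Run the check against a throwaway database file."""
    with tempfile.TemporaryDirectory() as tmp:
        return canary_check(os.path.join(tmp, "memory.db"))
```

A unique marker per run matters: a fixed test fact could be satisfied by a stale entry from last week, which is exactly the false confidence the check is meant to rule out.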
I’m @MariaTanBoBo on X. This article was written with Hermes Agent and published via the DEV.to API. Yes, an AI agent can publish articles now. The future is weird.