We Tried 6 Memory Providers for Hermes Agent — Here's What We Learned

Giving an AI agent persistent memory sounds simple. Store facts. Recall them later. How hard can it be? Three weeks and six providers later, I have opinions. This is the story of what broke, what we discarded, and the one thing that finally worked, and why.

The Setup

I run Hermes Agent on a headless VPS with 4GB RAM. Nothing exotic. The goal was straightforward: the agent should remember things across sessions (my preferences, environment details, lessons learned) without me repeating myself every conversation. Hermes ships with several bundled memory providers and supports third-party ones via plugins. Should be plug-and-play, right?

Phase 1: The Ones That Failed Silently

AgentMemory: The first provider we had. It ran on a Node.js runtime with a Docker container for the iii-engine and held 860 memories at peak. It seemed fine. Then we switched to a different provider to try it out. AgentMemory’s ingestion died instantly, but nothing told us. Tools responded normally. No errors in logs. Just… nothing was being stored anymore. Root cause: Hermes supports exactly one active memory provider. The switch disabled AgentMemory’s sync_turn() without a warning. The deadliest failure mode: total silence.

YantrikDB: Tried as a replacement. Same silent failure. MCP tools responded “OK” but ingestion was completely dead. We never stored a single memory. Uninstalled alongside AgentMemory in the same cleanup session.

Lesson #1: A memory provider that fails silently is worse than no provider at all. False confidence corrupts everything.
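
One way to turn that kind of silence into a loud failure is to compare the stored count before and after each sync. The sketch below is only an illustration: `InMemoryProvider`, `count()`, and `checked_sync()` are hypothetical names invented here, not part of Hermes or of any provider mentioned in this article. Only `sync_turn()` is a name taken from the text above.

```python
class IngestionStalled(RuntimeError):
    """Raised when a turn was synced but nothing new was stored."""


class InMemoryProvider:
    """Hypothetical stand-in for a real memory provider."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.rows = []

    def sync_turn(self, user_text, agent_text):
        # A disabled provider accepts the call and quietly drops the data,
        # which mirrors the silent failure described above.
        if self.enabled:
            self.rows.append((user_text, agent_text))

    def count(self):
        return len(self.rows)


def checked_sync(provider, user_text, agent_text):
    """Sync one turn and fail loudly if the stored count did not grow."""
    before = provider.count()
    provider.sync_turn(user_text, agent_text)
    after = provider.count()
    if after <= before:
        raise IngestionStalled(
            f"sync_turn returned normally but count stayed at {after}"
        )
    return after
```

A real provider that deduplicates or batches writes would need a looser check (for example, "count grew at least once in the last N turns"), so treat the strict comparison as an assumption of this sketch.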

Phase 2: The One That Wouldn’t Die (Or Live)

Hindsight: This one looked promising on paper. Bundled with Hermes. 91.4% on the LongMemEval benchmark. Knowledge graphs, reflect synthesis: the “power pick.”

Reality:

  • Installed the wrong package first (hindsight-all vs hindsight-client)
  • API key caching bugs: the daemon held stale env vars across restarts
  • Embedded PostgreSQL (pg0) tried to download itself and hung for 177 seconds
  • After a full uninstall (pip uninstall, config cleaned, directories deleted, plugin disabled), daemons kept respawning every 2 minutes. The gateway cached plugin state at startup and wouldn’t let go. Breaking the cycle required stopping the gateway, hunting processes with pkill -9, and restarting. A hard kill. For a memory plugin.

Lesson #2: If uninstallation requires killing processes by force, the architecture is wrong. A memory provider’s lifecycle should not require a process manager.
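
Before reaching for a forced kill, it helps to confirm which processes actually survived the uninstall. Here is a minimal sketch, assuming a POSIX system where `ps -eo pid,args` is available. The function names are made up for illustration, and the search string you pass in (for example "hindsight") is a guess at what the leftover command line contains, since the article does not record the daemon's real process name.

```python
import subprocess


def find_leftovers(ps_lines, needle):
    """Return (pid, command) pairs whose command line mentions `needle`."""
    matches = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skips the header row and blank lines
        pid, command = int(parts[0]), parts[1]
        if needle.lower() in command.lower():
            matches.append((pid, command))
    return matches


def running_processes():
    """Snapshot of `ps -eo pid,args` as a list of text lines."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return out.stdout.splitlines()
```

Calling `find_leftovers(running_processes(), "hindsight")` twice, a few minutes apart, would show whether something is respawning: the PIDs change while the command line stays the same.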

Phase 3: The Evaluation

At this point we had criteria. Real criteria, earned through pain:

  • Cannot silently fail: if ingestion stops, I need to know
  • Simple uninstall: no daemon ghosts
  • Local-first: no cloud dependency, no API key expiry taking down memory
  • Hermes-specific author instructions: the #1 predictor of whether integration actually works
  • No double token burn: I’m not paying for inference twice

We surveyed what was available:

| Provider | Verdict | Killer Flaw |
|---|---|---|
| Holographic (bundled) | Too simple | sync_turn() is a no-op, so there is no auto-ingestion |
| Supermemory (bundled) | Cloud-only | All cloud. Best benchmarks, but contradicts local-first |
| Mem0 | Double token burn | LLM-embedded: the agent calls an LLM, then Mem0 calls its own LLM for fact extraction. Pay twice. |
| MemPalace | Wrong platform | 96.6% LongMemEval, but built for Claude Code, not Hermes |
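
The survey can also be written down as data, which makes it obvious which criterion knocked each candidate out and how much was never checked. This is a rough sketch: the criterion identifiers are my own shorthand for the five criteria listed earlier, and only the flaws the survey actually names are filled in. Holographic is left out because its flaw (no auto-ingestion) does not map onto any of the five criteria.

```python
CRITERIA = (
    "fails_loudly",
    "simple_uninstall",
    "local_first",
    "hermes_specific_docs",
    "single_token_spend",
)


def failed_criteria(answers):
    """Return the criteria explicitly answered False."""
    return [name for name in CRITERIA if answers.get(name) is False]


def unknown_criteria(answers):
    """Return the criteria that were never evaluated."""
    return [name for name in CRITERIA if name not in answers]


# Only the flaws named in the survey are recorded; the rest stay unknown.
survey = {
    "Supermemory": {"local_first": False},
    "Mem0": {"single_token_spend": False},
    "MemPalace": {"hermes_specific_docs": False},
}

for provider, answers in survey.items():
    print(provider, "failed:", failed_criteria(answers))
```

Keeping "unknown" separate from "passed" matters here: a single disqualifying flaw was enough to stop evaluating, so the other criteria were never tested for these providers.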

Phase 4: The One That Worked

Mnemosyne: By AxDSan, posted directly to r/hermesagent by its author. The README literally says: “The Zero-Dependency, Sub-Millisecond AI Memory System for Hermes Agents.”

What makes it different:

  • In-process Python + SQLite. No separate service. No Docker. No daemon. If the gateway process runs, memory works. There is nothing to fall out of sync with.
  • Sub-millisecond reads: 0.076 ms, 500x faster than the previous-generation providers. You don’t feel it.
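
To make "in-process Python + SQLite" concrete, here is a minimal sketch of a memory store that lives entirely inside the calling process, plus a helper for timing reads on your own hardware. It uses only the standard library. The table layout and the function names are assumptions made for illustration, not Mnemosyne's actual schema or API.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a single-table memory store inside this process."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "key TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    return conn


def remember(conn, key, content):
    # INSERT OR REPLACE overwrites an existing row with the same key.
    conn.execute(
        "INSERT OR REPLACE INTO memories (key, content) VALUES (?, ?)",
        (key, content),
    )
    conn.commit()


def recall(conn, key):
    row = conn.execute(
        "SELECT content FROM memories WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def mean_read_ms(conn, key, repeats=1000):
    """Average wall-clock time of one keyed read, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        recall(conn, key)
    return (time.perf_counter() - start) * 1000 / repeats
```

There is no socket, container, or daemon between `remember()` and `recall()`, which is the property the first bullet describes. The number `mean_read_ms()` reports depends on the machine and on what the real provider does per read, so it should not be expected to match the figure quoted above.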

Three code paths, all verified working:

  • Explicit remember: the agent calls remember() when asked
  • Auto-ingestion: sync_turn() captures every conversation turn automatically
  • Context injection: high-importance memories surface in each turn’s system prompt
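
The three paths can be sketched in a few lines. `SketchProvider` is a hypothetical class written for this article, not Mnemosyne's code; the importance scale, the threshold, and `inject_context()` are assumptions. Only `remember()` and `sync_turn()` are names that appear in the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    importance: float  # 0.0 (trivia) to 1.0 (critical); scale is assumed


@dataclass
class SketchProvider:
    """Hypothetical provider showing the three paths."""

    memories: list = field(default_factory=list)

    def remember(self, content, importance=0.9):
        # Path 1: explicit storage, triggered when the user asks for it.
        self.memories.append(Memory(content, importance))

    def sync_turn(self, user_text, agent_text, importance=0.3):
        # Path 2: automatic ingestion of every conversation turn.
        self.memories.append(
            Memory(f"user: {user_text} | agent: {agent_text}", importance)
        )

    def inject_context(self, base_prompt, threshold=0.7, limit=5):
        # Path 3: surface the most important memories in the system prompt.
        top = sorted(
            (m for m in self.memories if m.importance >= threshold),
            key=lambda m: m.importance,
            reverse=True,
        )[:limit]
        if not top:
            return base_prompt
        lines = "\n".join(f"- {m.content}" for m in top)
        return f"{base_prompt}\n\nRelevant memories:\n{lines}"
```

Verifying each path separately is worthwhile because they can fail independently: explicit storage can work while auto-ingestion is a no-op, as the Holographic entry in the survey shows.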

Installation took three short commands:

    pip install "mnemosyne-memory[embeddings]"
    python -m mnemosyne.install
    hermes memory setup   # interactive picker → select "mnemosyne"

Skip the [all] extra: it pulls in ctransformers and downloads 1–4GB of GGUF models. On a 4GB machine, that’s OOM territory. The [embeddings] extra adds fastembed (133MB ONNX model) for semantic search, and LLM consolidation routes through your existing API key.
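
A quick way to confirm which extras actually landed in the environment is to ask Python whether the modules are importable. The sketch below uses only `importlib` from the standard library. It assumes the import names match the package names (`fastembed`, `ctransformers`), and the helper names are invented for this example.

```python
from importlib.util import find_spec


def installed(module_names):
    """Map each top-level module name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in module_names}


def check_extras(wanted=("fastembed",), unwanted=("ctransformers",)):
    """Return a list of human-readable problems; empty means all good."""
    state = installed(list(wanted) + list(unwanted))
    problems = [f"missing: {name}" for name in wanted if not state[name]]
    problems += [f"unexpected: {name}" for name in unwanted if state[name]]
    return problems


if __name__ == "__main__":
    for problem in check_extras():
        print(problem)
```

Run it inside the same virtual environment the gateway uses. An "unexpected: ctransformers" line would suggest the heavier extra was installed at some point.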

After three weeks of operation:

  • 362 working memories
  • 29 episodic summaries (auto-consolidation working)
  • 27/27 test suite passing
  • Zero silent failures. Zero daemon hunts. Zero forced kills.

The Pattern

Every failed provider shared one architectural decision: an external runtime with its own lifecycle. AgentMemory’s Node.js Docker. Hindsight’s pg0 Postgres + daemon. When the runtime and the gateway fell out of sync, the result was silent failure, ghost processes, or respawn loops. Mnemosyne’s in-process Python + SQLite avoids this entirely. It’s the simplest thing that could possibly work, and that turns out to be the hardest thing to get right, because every other provider ships complexity as a feature.

What I’d Tell Someone Starting Today

  • Local-first, single-process. If memory needs a separate service, it will fail in ways you won’t notice.
  • Verify ingestion before trusting it. After installing any memory provider, store a test fact, restart, and ask for it back.
  • The author matters. Does the provider’s README mention your agent platform by name? If not, you’re doing integration work the author didn’t do.
  • [all] is a trap. Read the install extras. On constrained hardware, the “everything” option downloads models you don’t need.
  • Clean uninstall is a feature. If removing a provider takes more than deleting a directory, the architecture is fragile.
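
The "store a test fact, restart, and ask for it back" check can be automated. The sketch below simulates it against a plain SQLite file: closing the connection stands in for stopping the agent, and opening a new one stands in for the restart. The table name and helper names are assumptions. Against a real provider you would store the canary through the agent and restart the actual gateway instead.

```python
import os
import sqlite3
import tempfile
import uuid


def write_canary(db_path):
    """Store a unique test fact and return it."""
    canary = f"canary-{uuid.uuid4().hex}"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS memories (content TEXT NOT NULL)")
    conn.execute("INSERT INTO memories (content) VALUES (?)", (canary,))
    conn.commit()
    conn.close()  # closing stands in for stopping the agent
    return canary


def canary_survived(db_path, canary):
    """Open a fresh connection, as a restarted agent would, and look it up."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT 1 FROM memories WHERE content = ?", (canary,)
        ).fetchone()
    finally:
        conn.close()
    return row is not None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "memory.db")
        fact = write_canary(path)
        print("survived restart:", canary_survived(path, fact))
```

A random canary value matters: a fixed test string could already be in the store from an earlier run and would make a dead ingestion path look healthy.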

I’m @MariaTanBoBo on X. This article was written with Hermes Agent and published via the DEV.to API. Yes, an AI agent can publish articles now. The future is weird.