Stop Making Your AI Coding Agent Grep Your Whole Repo — Try codebase-memory-mcp
If you use an AI coding agent (Claude Code, Codex CLI, Gemini CLI, Cursor, Zed, Aider, whatever), you’ve probably watched it burn through tens of thousands of tokens just trying to figure out who calls a function or where a route is defined. It greps, it reads files, it greps again, it reads more files. Eventually it answers your question, but it took a small forest of tokens and several tool calls to do it.
codebase-memory-mcp is an open-source MCP server built to fix exactly that. It indexes your codebase into a persistent knowledge graph (functions, classes, call chains, HTTP routes, cross-service links) and lets your agent ask structural questions directly instead of reading its way through the filesystem. Here’s what’s actually in it for you.
The pitch in one paragraph
You install a single static binary. You tell your agent “index this project.” A few seconds to a few minutes later (yes, even the Linux kernel, at 28 million lines of code, takes about 3 minutes), your agent can ask things like “what calls ProcessOrder?” or “show me the architecture of this service” and get an answer in under a millisecond, instead of grepping and reading dozens of files. No Docker, no API keys, no separate database to run. It’s just a binary that talks MCP.
Why this matters: the token math
This is the number that should get your attention. According to the project’s own benchmarks, five typical structural queries cost roughly 3,400 tokens through codebase-memory-mcp, versus about 412,000 tokens doing the equivalent file-by-file grep-and-read exploration. That’s a reduction of more than 99 percent, or roughly 121 times fewer tokens. Less context burned on exploration means more budget left for the agent to actually reason about your problem, and a noticeably faster, cheaper session.
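If you want to check the arithmetic yourself, here is a quick sketch. It uses only the two token counts quoted from the project’s benchmarks; the variable names are mine.

```python
# Token counts quoted above from the project's own benchmarks.
graph_tokens = 3_400      # five structural queries via codebase-memory-mcp
grep_tokens = 412_000     # equivalent grep-and-read exploration

# Fraction of tokens saved, and how many times smaller the graph route is.
reduction = 1 - graph_tokens / grep_tokens
ratio = grep_tokens / graph_tokens

print(f"reduction: {reduction:.1%}")
print(f"ratio: {ratio:.0f}x")
```

The benchmark covers five queries on the project’s own test setup, so treat the ratio as an order-of-magnitude guide rather than a guarantee for your repo.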
What it actually does
At its core, codebase-memory-mcp is a structural analysis backend, not another LLM wrapper. It parses your code with tree-sitter, builds a graph of how everything connects, and exposes that graph through 14 MCP tools. Your agent is still the brain; it just stops having to rediscover your codebase from scratch every session.
A typical interaction looks like this:
- You: “what calls ProcessOrder?”
- Agent calls: trace_call_path(function_name="ProcessOrder", direction="inbound")
- codebase-memory-mcp: executes the graph query, returns structured results
- Agent: presents the call chain in plain English
No extra LLM, no extra API key, no extra cost layer: the agent you’re already paying for does the translation from natural language to graph query.
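To make the “Agent calls” step concrete, here is a rough sketch of what that request might look like on the wire. MCP is built on JSON-RPC 2.0 and tools are invoked with the tools/call method; the tool name and the two arguments come from the example above. The helper function build_tool_call and the request id are my own illustration, and the response shape is left out because the article doesn’t describe it.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message (illustrative helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Tool name and argument names are taken from the interaction above.
request = build_tool_call(
    1,
    "trace_call_path",
    {"function_name": "ProcessOrder", "direction": "inbound"},
)

print(json.dumps(request, indent=2))
```

In practice your agent’s MCP client builds and sends this for you; you never write it by hand.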
The features that actually matter day to day
- It’s fast, even on huge repos. The indexing pipeline is RAM-first (LZ4 compression, in-memory SQLite, a single dump at the end). Benchmarked on an Apple M3 Pro, the Linux kernel indexes in about 3 minutes; a mid-sized project like Django takes around 6 seconds. Memory is released back to the OS once indexing finishes.
- It speaks 155 languages. Tree-sitter grammars for all of them are vendored directly into the binary, so there’s nothing extra to install. Go, C, C++, and the TypeScript/JavaScript/JSX/TSX family additionally get LSP-style hybrid type resolution: proper parameter binding, return-type inference, generic substitution, and JSDoc inference for plain JS files.
- It plugs into 11 coding agents automatically. Running the installer auto-detects Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro, and configures MCP entries, instruction files, and hooks for whichever ones you have installed.
- It understands more than just application code. Dockerfiles, Kubernetes manifests, and Kustomize overlays get indexed as graph nodes too, with cross-references between them. That’s useful if your agent needs to reason about infrastructure as well as application logic.
- It finds dead code, traces blast radius, and detects near-duplicates. Beyond simple search, there’s Louvain community detection for discovering functional modules, git-diff impact mapping that classifies risk for uncommitted changes, dead code detection (excluding entry points), and MinHash-based near-clone detection.
- It can visualize the graph. An optional UI binary variant ships a built-in 3D graph visualization you can open in a browser at localhost:9749, running alongside the MCP server as a background thread.
- Your team doesn’t all have to reindex separately. A .codebase-memory/graph.db.zst artifact, a zstd-compressed snapshot of the graph, can be committed alongside your source. When a teammate clones the repo and runs the indexer for the first time, it imports that snapshot and only does an incremental diff locally, instead of a full reindex. It’s entirely optional and gitignore-friendly if you’d rather everyone start fresh.
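If MinHash is new to you, here is a minimal sketch of the general technique behind the near-clone detection mentioned in the list above. This is not the project’s implementation: the whitespace tokenizer, the 3-token shingles, the SHA-1 based hash family, and the 128-value signature are all assumptions picked to keep the example short.

```python
import hashlib


def shingles(code, k=3):
    """Split code on whitespace and return the set of k-token windows."""
    tokens = code.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def _seeded_hash(seed, item):
    """One member of a family of hash functions, selected by seed."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=128):
    """For each hash function, keep the smallest hash value over all items."""
    return [min(_seeded_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching positions, which approximates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


original = "def total(items): return sum(i.price * i.qty for i in items)"
clone = "def total(rows): return sum(i.price * i.qty for i in rows)"
unrelated = "class Logger: def write(self, msg): print(msg)"

sig_original = minhash_signature(shingles(original))
sig_clone = minhash_signature(shingles(clone))
sig_unrelated = minhash_signature(shingles(unrelated))

print("clone:", estimate_similarity(sig_original, sig_clone))
print("unrelated:", estimate_similarity(sig_original, sig_unrelated))
```

The appeal is that two fixed-size signatures can be compared cheaply, so you never have to diff every function against every other function in full.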
Getting started
The one-line install on macOS or Linux:
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash
Append -s -- --ui to the end of that command (so it ends in | bash -s -- --ui) if you want the graph visualization included. On Windows, download install.ps1 and run it from PowerShell. There’s also a manual route if you’d rather inspect everything first: download the platform-specific archive from the latest release, extract it, and run the bundled install script.
Once installed, restart your coding agent and just say: “Index this project”
That’s the entire setup. If you want indexing to happen automatically whenever you open a new project, turn on auto-index:
codebase-memory-mcp config set auto_index true
After the initial index, a background watcher keeps the graph in sync with your git changes, so you’re not manually re-indexing every time you commit.
What you get to ask it
Once a project is indexed, here’s a sample of what’s available through the 14 MCP tools:
- search_graph: structured search by label, name pattern, file pattern, or degree filters
- trace_call_path: BFS traversal showing what calls a function and what it calls, up to depth 5
- get_architecture: a one-call overview of languages, packages, entry points, routes, hotspots, and clusters
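To give a feel for what a depth-limited BFS over a call graph does, here is a small sketch of the inbound direction. It is an illustration of the idea only, not the tool’s code: the called_by dictionary, the function names other than ProcessOrder, and the trace_callers helper are all made up for the example.

```python
from collections import deque


def trace_callers(called_by, target, max_depth=5):
    """Breadth-first walk over inbound call edges.

    Returns a dict mapping each caller to its distance (in hops) from target,
    stopping once max_depth hops have been followed.
    """
    depths = {}
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for caller in called_by.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                depths[caller] = depth + 1
                queue.append((caller, depth + 1))
    return depths


# Hypothetical call graph: each key is a function, each value lists its callers.
called_by = {
    "ProcessOrder": ["HandleCheckout", "RetryWorker"],
    "HandleCheckout": ["CheckoutRoute"],
    "RetryWorker": ["main"],
    "CheckoutRoute": ["main"],
}

result = trace_callers(called_by, "ProcessOrder")
print(result)
```

Because the walk runs over a graph that is already built, the depth limit keeps results bounded even when a function sits under a very deep call stack.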