The Answer Is an Edge, Not a Sentence — Building a Topology-Native GraphRAG Intelligence Platform with TigerGraph
The Answer Is an Edge, Not a Sentence — Building a Topology-Native GraphRAG Intelligence Platform with TigerGraph
答案是“边”而非“句子”——基于 TigerGraph 构建拓扑原生 GraphRAG 智能平台
How we built Shadow Network Intelligence — a GraphRAG-powered fraud investigation platform that proved why topology-aware retrieval outperforms traditional RAG for financial crime investigations.
我们构建了“影子网络情报”(Shadow Network Intelligence)——一个由 GraphRAG 驱动的欺诈调查平台,该平台证明了在金融犯罪调查中,拓扑感知检索为何优于传统的 RAG(检索增强生成)。
Introduction
引言
Most retrieval systems are built around documents. Financial crime investigations are not. Fraud networks, laundering chains, shell company ecosystems, mule accounts, and intermediary ownership structures do not exist as clean paragraphs inside a single document. They exist across: transactions, shared devices, addresses, shell corporations, account transfers, ownership chains, hidden intermediaries, and multi-hop relationships. Traditional retrieval systems can retrieve text. But financial investigations are fundamentally about reconstructing relationships. That realization became the foundation for our TigerGraph GraphRAG Inference Hackathon project: Shadow Network Intelligence. A topology-native intelligence platform built to prove one thing: Traditional retrieval preserves documents. GraphRAG preserves relationships. Or more simply: The answer is an edge, not a sentence.
大多数检索系统是围绕文档构建的,但金融犯罪调查并非如此。欺诈网络、洗钱链条、空壳公司生态系统、洗钱账户、中介所有权结构等,并不会以整洁的段落形式存在于单一文档中。它们存在于交易、共享设备、地址、空壳公司、账户转账、所有权链、隐藏中介以及多跳关系之中。传统的检索系统可以检索文本,但金融调查从根本上说是关于重建关系。这一认识成为了我们 TigerGraph GraphRAG 推理黑客松项目“影子网络情报”的基石。这是一个拓扑原生的智能平台,旨在证明一件事:传统检索保留的是文档,而 GraphRAG 保留的是关系。简单来说:答案是“边”(Edge),而不是“句子”。
The Problem with Traditional Retrieval
传统检索的问题
Most modern AI retrieval systems rely on semantic similarity. That works well when: the answer exists inside a chunk, semantic similarity is enough, relationships are shallow, and the retrieval target is local. But financial crime investigations behave very differently. The answer often emerges only after reconstructing: multi-hop ownership chains, indirect transaction flows, hidden intermediary entities, shell-company cascades, device-sharing patterns, ring structures, and laundering topology. In these situations, semantic similarity alone breaks down. A chunk may contain a clue, but the relationship continuity between clues disappears. That is the core limitation of VectorRAG.
大多数现代 AI 检索系统依赖于语义相似度。当答案存在于某个数据块中、语义相似度足够、关系较浅且检索目标是局部时,这种方法效果很好。但金融犯罪调查的情况大不相同。答案往往只有在重建了多跳所有权链、间接交易流、隐藏中介实体、空壳公司级联、设备共享模式、环形结构和洗钱拓扑后才会显现。在这些情况下,仅靠语义相似度是行不通的。一个数据块可能包含一条线索,但线索之间的关系连续性却消失了。这就是 VectorRAG 的核心局限性。
Our Hypothesis
我们的假设
We hypothesized that: PureLLM systems would hallucinate or miss hidden structural relationships; VectorRAG systems would retrieve partial clues but fail to reconstruct topology; GraphRAG systems would recover hidden investigative structure through graph traversal. To test this properly, we needed: adversarial datasets, relationship-dense ecosystems, hidden rings, multi-hop structures, and topology-aware benchmarks—not simple Q&A datasets.
我们假设:纯 LLM 系统会产生幻觉或遗漏隐藏的结构关系;VectorRAG 系统能检索到部分线索,但无法重建拓扑;而 GraphRAG 系统可以通过图遍历恢复隐藏的调查结构。为了正确测试这一点,我们需要:对抗性数据集、关系密集型生态系统、隐藏的环、多跳结构以及拓扑感知基准测试——而不是简单的问答数据集。
Building the Dataset
构建数据集
We built a synthetic financial crime ecosystem specifically designed to stress retrieval systems. The generated graph included: 6,000 people, 5,000 companies, 10,000 accounts, 150,000+ transactions, shared devices, shared addresses, ownership structures, hidden fraud rings, and intermediary laundering chains. The important part was not scale alone. It was: Structural density. We intentionally designed adversarial investigation scenarios where: topology mattered, intermediary entities mattered, chunk retrieval failed structurally, and graph traversal became necessary.
我们构建了一个专门用于测试检索系统压力的合成金融犯罪生态系统。生成的图包括:6,000 人、5,000 家公司、10,000 个账户、150,000 多笔交易,以及共享设备、共享地址、所有权结构、隐藏的欺诈团伙和中介洗钱链。重要的不仅是规模,而是结构密度。我们特意设计了对抗性调查场景,在这些场景中:拓扑结构至关重要,中介实体至关重要,数据块检索在结构上会失败,而图遍历则变得必不可少。
Why We Chose TigerGraph
为什么选择 TigerGraph
This project required: high-performance traversal, multi-hop exploration, topology-native reasoning, relationship continuity, and structural neighborhood expansion. TigerGraph became the backbone of the entire intelligence system. We used TigerGraph to: reconstruct hidden ownership chains, detect fraud rings, traverse laundering paths, surface intermediary entities, and expand graph neighborhoods. The graph became: not just storage, but the reasoning substrate itself.
该项目需要:高性能遍历、多跳探索、拓扑原生推理、关系连续性和结构化邻域扩展。TigerGraph 成为了整个智能系统的骨干。我们利用 TigerGraph 来:重建隐藏的所有权链、检测欺诈团伙、遍历洗钱路径、浮现中介实体并扩展图邻域。图不再仅仅是存储,而是推理的基质本身。
The Results
结果
The benchmark results became the strongest validation of the project thesis. GraphRAG successfully reconstructed hidden rings, intermediary chains, ownership topology, laundering paths, and multi-hop relationships, while the other systems failed structurally. And importantly: this was not a tuning failure. Vector retrieval fundamentally lacks topology. A chunk cannot retrieve an edge that no longer exists.
基准测试结果成为了该项目论点最有力的验证。GraphRAG 成功重建了隐藏的环、中介链、所有权拓扑、洗钱路径和多跳关系,而其他系统在结构上均告失败。重要的是:这并非调优失败,而是向量检索从根本上缺乏拓扑能力。一个数据块无法检索到一条已经不存在的“边”。
The Most Important Realization
最重要的认识
During development, one insight became impossible to ignore: The answer was never hidden in a document. It was hidden in the relationships. That single realization completely shaped the rest of the platform.
在开发过程中,一个洞察变得无法忽视:答案从未隐藏在文档中,而是隐藏在关系中。这一认识彻底塑造了该平台的后续发展。
Building the Cognitive Layer
构建认知层
We wanted the system to do more than retrieve graph neighborhoods. We wanted: grounded investigative reasoning. So we added a cognitive reasoning layer capable of: structural claim synthesis, contradiction detection, confidence scoring, and evidence grounding.
我们希望系统不仅仅是检索图邻域,我们想要的是:扎实的调查推理。因此,我们增加了一个认知推理层,能够实现:结构化声明综合、矛盾检测、置信度评分和证据溯源。