Scalable Uncertainty Reasoning in Knowledge Graphs

Abstract: Knowledge graphs are pivotal for semantic data integration, yet the real-world data they model is often inherently uncertain. Within knowledge graphs, uncertainty manifests at three distinct levels: imprecise attribute values, probabilistic triple existence, and incomplete schema knowledge.

However, current Semantic Web standards lack native support for reasoning over such uncertainty, and naïve extensions are often computationally intractable. In this thesis, I aim to develop a modular framework that addresses each level with a tailored technique: (1) probabilistic literals and a corresponding query algebra for continuous attributes; (2) a compilation-based framework that transforms SPARQL provenance into tractable probabilistic circuits for uncertain triples; and (3) topology-aware geometric embeddings for statistical schema reasoning.

The central hypothesis is that specialized reasoning mechanisms, namely algebraic, logical, and geometric approaches, can reconcile semantic precision with computational tractability.

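As a minimal illustration of the second level (uncertain triples), the sketch below computes the probability of a query answer from its provenance. It is not the compilation framework proposed in the thesis: it uses plain Shannon expansion, the recursion that underlies decision-diagram style compilation, and it assumes that triples are independent and that provenance is given in disjunctive normal form. The names `answer_probability`, `provenance`, and `triple_prob`, along with the triple identifiers and probabilities, are hypothetical.

```python
from typing import Dict, FrozenSet, List

# One clause = the set of triple ids that jointly derive the answer.
Clause = FrozenSet[str]


def answer_probability(dnf: List[Clause], prob: Dict[str, float]) -> float:
    """Probability that at least one clause holds (independent triples assumed)."""
    if not dnf:
        return 0.0  # no derivation left: the answer cannot hold
    if any(len(clause) == 0 for clause in dnf):
        return 1.0  # some derivation is fully satisfied
    var = next(iter(dnf[0]))  # branch on any remaining triple
    # Triple present: it no longer constrains the clauses that use it.
    present = [clause - {var} for clause in dnf]
    # Triple absent: every clause that needs it is dropped.
    absent = [clause for clause in dnf if var not in clause]
    return (prob[var] * answer_probability(present, prob)
            + (1.0 - prob[var]) * answer_probability(absent, prob))


# Hypothetical answer with two derivations that share triple t1:
# (t1 AND t2) OR (t1 AND t3)
provenance = [frozenset({"t1", "t2"}), frozenset({"t1", "t3"})]
triple_prob = {"t1": 0.9, "t2": 0.5, "t3": 0.4}

print(round(answer_probability(provenance, triple_prob), 6))  # 0.63
```

Summing the two derivations independently would give 0.45 + 0.36 = 0.81, which overcounts the worlds in which both hold; the expansion accounts for the shared triple t1 and yields 0.9 × (1 − 0.5 × 0.6) = 0.63. The recursion is exponential in the worst case, which is the kind of intractability that compiling provenance into a tractable circuit is meant to avoid.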