opendatalab / MinerU

🚀 Access MinerU now → ✅ Zero-install web version ✅ Full-featured desktop client ✅ Instant API access. Skip the deployment headaches and get every product format in one click. Developers, dive in! 👋 Join us on Discord and WeChat.

MinerU: a high-accuracy document parsing engine for LLM · RAG · Agent workflows

Converts PDF · DOCX · PPTX · XLSX · images · web pages into structured Markdown / JSON · VLM + OCR dual engine · 109 languages

MCP Server · native LangChain / Dify / FastGPT integration · support for 10+ domestic AI chips

🔍 Core Parsing Capabilities

Native support for DOCX, PPTX, and XLSX parsing

Formulas → LaTeX · Tables → HTML, with accurate layout reconstruction

Supports scanned documents, handwriting, multi-column layouts, and cross-page table merging

Output follows human reading order, with headers and footers removed automatically

VLM + OCR dual engine, with OCR recognition for 109 languages

🔌 Integrations and Use Cases

AI coding tools: MCP Server for Cursor · Claude Desktop · Windsurf

RAG frameworks: LangChain · LlamaIndex · RAGFlow · RAG-Anything · Flowise · Dify · FastGPT

Development: Python / Go / TypeScript SDK · CLI · REST API · Docker

No-code: mineru.net online · Gradio WebUI · desktop client

🖥️ Deployment (Private · Fully Offline)

Inference backends and what each is best for:

pipeline: Fast and stable, no hallucination, runs on CPU or GPU

vlm-engine: High accuracy, supports the vLLM / LMDeploy / mlx ecosystem

hybrid-engine: High accuracy, native text extraction, low hallucination

Domestic AI chips: Ascend · Cambricon · Enflame · MetaX · Moore Threads · Kunlunxin · Iluvatar · Hygon · Biren · T-Head

Changelog

2026/06/18 3.4 Released

This release focuses on OCR capability upgrades for the pipeline backend, optimization of the OCR processing pipeline, and a better model download experience. The main updates are:

OCR model upgrade and processing acceleration: The OCR model for the pipeline backend has been upgraded to PP-OCRv6, improving OCR accuracy by about 11% on OmniDocBench v1.6. The Japanese, Traditional Chinese, English, and Latin options have been removed from OCR language selection; these scenarios are now routed to the ch OCR model, which simplifies model configuration and language selection. The OCR inference and processing pipeline has also been optimized, roughly doubling OCR processing speed (an increase of about 100%) and significantly improving parsing efficiency for batch jobs and OCR-intensive documents.

Model download logic optimization: Automatic model source selection has been added, so a first-time installation can choose a better model source based on the current network environment. Before downloading a model, MinerU now checks the local model cache first. Cache hits are reused directly, which reduces repeated downloads and unnecessary remote requests. For details on model source configuration, automatic source selection, and local model usage, see the Model Source Documentation.
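
The cache-first behavior described above can be pictured with a small sketch. This is not MinerU's implementation: `resolve_model`, the `download` callback, and the flat one-file-per-model cache layout are all hypothetical names and assumptions chosen only to illustrate "check the local cache, download only on a miss".

```python
from pathlib import Path
from typing import Callable


def resolve_model(
    name: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return a local path for `name`, downloading only on a cache miss.

    Hypothetical helper: the real cache layout and download API may differ.
    """
    target = cache_dir / name
    if target.exists():
        # Cache hit: reuse the local file and skip the remote request.
        return target
    # Cache miss: make sure the cache directory exists, then fetch once.
    cache_dir.mkdir(parents=True, exist_ok=True)
    download(name, target)
    return target
```

Calling `resolve_model` twice with the same name invokes `download` only the first time, provided the callback actually writes the file to `target`.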

With the 3.4 release, MinerU further improves the parsing accuracy and processing efficiency of the pipeline backend in OCR scenarios. It also optimizes model downloads, cache reuse, and local configuration write-back, making first-time installation, model updates, and multi-environment deployment more stable and automated.
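
For callers that still pass one of the removed OCR language options, the routing to the ch model might look like the sketch below. The release notes name the languages but not their identifiers, so the codes in `ROUTED_TO_CH` (PaddleOCR-style abbreviations) and the function `resolve_ocr_lang` are assumptions for illustration, not MinerU's actual API.

```python
# Assumed identifiers for the four removed options (Japanese, Traditional
# Chinese, English, Latin). Only "ch" is named in the release notes.
ROUTED_TO_CH = {"japan", "chinese_cht", "en", "latin"}


def resolve_ocr_lang(requested: str) -> str:
    """Map a requested OCR language to the model that would handle it."""
    key = requested.strip().lower()
    # Removed options fall through to the ch model; others pass unchanged.
    return "ch" if key in ROUTED_TO_CH else key
```

A shim like this would let existing configurations keep working after the options disappear from the selection list; whether MinerU accepts the old values at all is not stated in the notes.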

2026/06/11 3.3 Released

This release focuses on Hybrid parsing performance optimization and VLM model capability upgrades. The main updates are:

New effort parsing-strength parameter for the Hybrid backend: Two parsing-strength levels, medium and high, let users balance parsing speed, parsing accuracy, and feature requirements. On OmniDocBench v1.6, medium lowers overall accuracy by only 0.13 points compared with high, while parsing 35% to 220% faster depending on device and scenario.

Performance improvements: Linux (text PDF ~80% faster, OCR ~35% faster); Windows (text PDF ~90% faster, OCR ~45% faster); macOS (text PDF ~220% faster, OCR ~50% faster). The Hybrid backend now defaults to effort=medium, which significantly improves overall parsing efficiency while keeping accuracy high. The medium level does not support image analysis; for maximum parsing accuracy or image analysis, switch to effort=high.

VLM model upgraded to MinerU2.5-Pro-2605-1.2B: Fixes multiple model issues found in the 2604 version, further improving parsing stability on complex documents. Adds native multilingual OCR support, reducing the need for extra language-parameter configuration and improving out-of-the-box usability for multilingual documents.
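
The trade-off between the two effort levels can be written down as a small decision helper, together with the arithmetic behind the "N% faster" figures. Both functions are hypothetical illustrations: `choose_effort` only encodes the rule stated above (medium by default, high for image analysis or maximum accuracy), and `time_fraction` assumes that "N% faster" means throughput multiplied by 1 + N/100, which the release notes do not spell out.

```python
def choose_effort(needs_image_analysis: bool = False,
                  max_accuracy: bool = False) -> str:
    """Pick an effort level for the Hybrid backend (illustrative only)."""
    # medium does not support image analysis, so either requirement
    # forces the high level.
    if needs_image_analysis or max_accuracy:
        return "high"
    return "medium"


def time_fraction(speedup_percent: float) -> float:
    """Fraction of the original wall time left after an 'N% faster' gain.

    Assumes N% faster means throughput scaled by (1 + N/100).
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)
```

Under that assumption, the ~220% figure for text PDFs on macOS corresponds to about 31% of the previous parsing time, and the ~35% figure for OCR on Linux to about 74%.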

2026/04/18 3.1.0 Released

This release focuses on licensing openness, parsing accuracy, and full-format native support.

License upgrade: MinerU has officially moved from AGPLv3 to the MinerU Open Source License, a custom license based on Apache 2.0. This change significantly reduces adoption friction for both community users and commercial deployments.

VLM main model upgrade: Upgraded to MinerU2.5-Pro-2604-1.2B, bringing overall parsing accuracy to a state-of-the-art level. Supports image/chart parsing, truncated paragraph merging, cross-page table merging, and image recognition inside tables.

Full-format native parsing support: Native parsing support has been extended to PPTX and XLSX. MinerU now fully supports parsing of images, PDF, DOCX, PPTX, and XLSX.