Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Updated Jun 16, 2026 - Python
All-in-One Development Tool based on PaddlePaddle
A high-quality PDF-to-Markdown tool based on large language model visual recognition.
📄🔍 Parse, extract, and analyze documents with ease 📄🔍
Ray-powered accelerator for MinerU, turning PDF → Markdown into a scalable, cluster-ready data infrastructure.
A high-quality PDF-to-Markdown tool based on large language model visual recognition (desktop version).
MCP server for Meta's nougat-ocr. Instruct your agent to convert academic papers to Markdown files with high mathematical accuracy
A local academic paper knowledge base powered by Marker PDF engine. PDF upload/parsing, Markdown reading, AI chat, translation, summarization — all in a native desktop window.
Omni Doc Converter Agent: an all-purpose AI agent for converting PDFs, images, DOCX, PPTX, and spreadsheets.
Convert PDF documents to Markdown using local VLM inference services with this single Go binary.