pdf2markdown

Here are 11 public repositories matching this topic...

PaddlePaddle / PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

ocr pdf-parser kie document-translation rag chineseocr ai4science pp-ocr document-parsing pp-structure pdf-extractor-rag pdf2markdown paddleocr-vl

Updated Jun 16, 2026
Python

PaddlePaddle / PaddleX

All-in-One Development Tool based on PaddlePaddle

ocr time-series deployment speech-recognition classification segmentation object-detection ai-pipelines layout-detection formula-recognition pp-chatocr pdf2markdown

Updated Jun 12, 2026
Python

MarkPDFdown / markpdfdown

A high-quality PDF to Markdown tool based on large language model visual recognition.

markdown pdf pdf-converter llm pdf2md pdf2markdown pdf-markdown

Updated Jan 25, 2026
Python

AdemBoukhris457 / Doctra

📄🔍 Parse, extract, and analyze documents with ease 📄🔍

python ocr ai gemini openai extract-data document-analysis image-restoration vlm pdf-parser pdf2markdown documentparsing

Updated Nov 29, 2025
Jupyter Notebook

OpenDCAI / Flash-MinerU

Ray-powered accelerator for MinerU, turning PDF → Markdown into a scalable, cluster-ready data infrastructure.

pdf parallel-computing distributed-computing ray multi-gpu pdf-parsing document-ai llm-inference mineru pdf2markdown

Updated Apr 20, 2026
Python

MarkPDFdown / markpdfdown-desktop

A high-quality PDF to Markdown tool based on large language model visual recognition (desktop version).

markdown pdf pdf-converter docx2md pdf2md pdf2markdown office2md image2md pptx2md xlsx2md

Updated May 17, 2026
TypeScript

svretina / nougat-mcp

MCP server for Meta's nougat-ocr. Instruct your agent to convert academic papers to Markdown files with high mathematical accuracy

markdown ocr pdf-converter academic-paper pdf2markdown

Updated Feb 19, 2026
Python

DeconBear / kbase

A local academic paper knowledge base powered by Marker PDF engine. PDF upload/parsing, Markdown reading, AI chat, translation, summarization — all in a native desktop window.

pdf2markdown

Updated Jun 15, 2026
Fluent

xusenlin / document-mcp

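Integrates markitdown, LibreOffice, pandoc, and headless-shell into an MCP server and a CLI, so that AI can easily convert any document (md, docx, pdf, html, ppt, xlsx...). Runs in a container, leaving the host environment untouched.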

mcp md2pdf doc2md pdf2html doc2pdf pdf2markdown

Updated May 23, 2026
Go

aiagenta2z / omni-doc-assistant-agent

Omni Doc Converter Agent: an all-purpose document conversion assistant AI agent for PDF, images, DOCX, PPTX, and spreadsheets.

resize-images pdfs resize-image pdf2image pdf2markdown

Updated Dec 31, 2025

cheto5144 / pdf2md

Convert PDF documents to Markdown using local VLM inference services with this single Go binary.

pdf benchmark pipeline skills nextjs conversion pdf-document pspdfkit pdf-parser rag nutrient llm gpt-vision conventer pdf2md pdf2markdown pdf-markdown skill-md

Updated Jun 16, 2026