Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Get your documents ready for gen AI
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
borb is a library for reading, creating, and manipulating PDF files in Python.
Open source Python library for converting PDF to DOCX.
A library for converting HTML into PDFs using ReportLab
A high-quality PDF to Markdown tool based on large language model visual recognition.
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
📚 Process PDFs, Word documents and more with spaCy
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
Simple yet powerful automation stuff.
Document (Novel, Thesis, Subtitle) Translation Tool (Supports pdf/word/excel/json/epub/srt...)
pdfCropMargins -- a program to crop the margins of PDF files
Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk
Interact with the Deep Search platform for new knowledge explorations and discoveries
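Batch-converts Office files (Word, Excel, PPT) to PDF files; well documented, and the author is satisfied with it in personal use.
Tianshu (天枢) - Enterprise-grade, all-in-one AI data preprocessing platform | PDF/Office to Markdown | AI assistant integration via the MCP protocol | Vue3 + FastAPI full-stack solution | Document parsing | Multimodal information extraction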
Convert PDF files to nicely structured Markdown and EPUB format with intelligent layout detection using AI.
This is an open-source "Image to PDF Converter" built for Windows. It converts image files to PDF (Portable Document Format) files: select your image files, or the folder containing them, and convert them all into a PDF file quickly.