
Using PaddleOCR-VL-1.5 with llama-server for book OCR


I've been running PaddleOCR-VL-1.5 via llama.cpp's server for OCR on book pages. It handles complex layouts, tables, and mixed text/figure pages surprisingly well.

Setup:

- Model: PaddleOCR-VL-1.5-GGUF + mmproj.gguf
- Backend: llama-server (Vulkan on Windows)
- Pipeline: layout detection → region OCR → Markdown with HTML tables

The pipeline can process an entire folder of page photos end-to-end, so you can digitise a book with a single command.

Repo:

Has anyone else experimented with vi
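For anyone wanting to try the same setup: once llama-server is running with the model and its multimodal projector loaded, each page photo can be sent to its OpenAI-compatible chat endpoint as a base64 data URI. The sketch below is mine, not from the original post; the port, prompt text, and GGUF filenames are assumptions, and the post's actual layout-detection step is not shown here.

```python
import base64
import json
import urllib.request

# Assumes llama-server was started beforehand, along the lines of:
#   llama-server -m PaddleOCR-VL-1.5-Q8_0.gguf --mmproj mmproj.gguf
# (quant suffix and port are assumptions, not from the original post)
SERVER_URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_ocr_payload(image_bytes: bytes,
                      prompt: str = "OCR this page to Markdown; use HTML for tables.") -> dict:
    """Build an OpenAI-style chat payload with one inline page image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        "temperature": 0.0,  # deterministic decoding suits OCR
    }


def ocr_page(path: str) -> str:
    """Send one page photo to llama-server and return the model's text."""
    with open(path, "rb") as f:
        payload = build_ocr_payload(f.read())
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Looping `ocr_page` over a sorted folder of page photos and concatenating the results gives the single-command book digitisation the post describes.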

Original article: Reddit
