From Print Magazine to Structured Content
Open Reader transforms PDF magazines into structured, CMS-ready articles using AI-powered extraction with human review. Upload a magazine, get back every article with full text, images, metadata, and precise page coordinates.
Platform Capabilities
A complete pipeline from PDF upload to structured article export — powered by Google Document AI and Gemini.
Magazine Digitization
Transform print magazines into structured digital content. Every article, image, and layout element is extracted and organized automatically.
Dual AI Discovery
Table of Contents parsing and visual page scanning work together to find every article, including ads, editorials, and items not listed in the TOC.
Precise Layout Mapping
Google Document AI OCR provides pixel-accurate bounding boxes. Every text block and image is mapped to its exact position on the page.
Split-View Review
Side-by-side view: original page on the left, extracted content on the right. Click any block to highlight its region on the page.
Dual-Layer Editing
AI extractions are preserved as a read-only layer. Manual corrections are stored separately, so you can reset to the AI defaults at any time.
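To make the dual-layer idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not Open Reader's actual data model: the class name `LayeredBlock`, its fields, and the sample text are all hypothetical.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Any, Dict, Mapping


@dataclass
class LayeredBlock:
    """Hypothetical content block: a read-only AI layer plus a separate edit layer."""

    ai: Mapping[str, Any]
    overrides: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Freeze a copy of the AI layer so later edits cannot mutate it.
        self.ai = MappingProxyType(dict(self.ai))

    def edit(self, key: str, value: Any) -> None:
        """Store a manual correction without touching the AI layer."""
        self.overrides[key] = value

    def effective(self) -> Dict[str, Any]:
        """Merged view in which manual corrections win over AI values."""
        return {**self.ai, **self.overrides}

    def reset(self) -> None:
        """Discard manual corrections and fall back to the AI defaults."""
        self.overrides.clear()


# Made-up sample data: the AI layer contains a typo that a reviewer corrects.
block = LayeredBlock(ai={"type": "title", "text": "Teh Future of Work"})
block.edit("text", "The Future of Work")
corrected = block.effective()

block.reset()
restored = block.effective()
```

Because the correction lives only in `overrides`, resetting is just clearing that dictionary; the AI output never has to be recomputed or restored from a backup.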
CMS-Ready Export
Export approved articles as structured JSON with full metadata, content blocks, and image assets — ready for any CMS or publishing platform.
How It Works
Three AI-powered pipeline stages, then a final human review step — together they transform a raw PDF into reviewed, export-ready articles.
Stage 1 — PDF Extraction & OCR
Render page images at multiple DPI. Extract native images. Google Document AI provides OCR text with precise bounding boxes for every text block.
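When the same page is rendered at several resolutions, a bounding box measured on one render has to be rescaled before it can be drawn on another. The sketch below shows only that arithmetic. It relies on the PDF convention that one point is 1/72 inch; the function names and the DPI values (150 and 300) are assumptions chosen for illustration, not Open Reader's settings.

```python
from typing import NamedTuple, Tuple

POINTS_PER_INCH = 72  # PDF default user-space unit is 1/72 inch


class Box(NamedTuple):
    """Axis-aligned bounding box in pixels: left, top, right, bottom."""

    x0: float
    y0: float
    x1: float
    y1: float


def page_size_px(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return (round(width_pt * scale), round(height_pt * scale))


def scale_box(box: Box, from_dpi: int, to_dpi: int) -> Box:
    """Rescale a box measured on one render so it fits another render."""
    factor = to_dpi / from_dpi
    return Box(*(round(v * factor, 2) for v in box))


# An A4 page is 595 x 842 points.
preview_size = page_size_px(595, 842, dpi=150)
full_size = page_size_px(595, 842, dpi=300)

# A box found on the 150 DPI preview, mapped onto the 300 DPI render.
box_on_full = scale_box(Box(100, 200, 400, 260), from_dpi=150, to_dpi=300)
```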
Stage 2 — AI Content Analysis
Dual discovery: TOC parsing and a visual scan find all articles. Gemini then extracts the full content blocks (titles, paragraphs, images, quotes) for each article.
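One way to combine the two discovery passes is to merge their candidate lists and record which pass found each item. The Python sketch below assumes a hypothetical `Candidate` record and a simple merge key (start page plus normalized title). The real matching logic is not described here, so treat this as an illustration with made-up sample data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical article candidate produced by one discovery pass."""

    title: str
    start_page: int


def _key(c: Candidate) -> Tuple[int, str]:
    # Same start page and same title (ignoring case and spacing) means same article.
    return (c.start_page, " ".join(c.title.lower().split()))


def merge_discoveries(toc: List[Candidate], visual: List[Candidate]) -> List[Dict]:
    """Union of both passes, deduplicated, with the finding passes recorded."""
    merged: Dict[Tuple[int, str], Dict] = {}
    for source, candidates in (("toc", toc), ("visual", visual)):
        for c in candidates:
            entry = merged.setdefault(
                _key(c),
                {"title": c.title, "start_page": c.start_page, "sources": []},
            )
            entry["sources"].append(source)
    return sorted(merged.values(), key=lambda e: (e["start_page"], e["title"]))


toc_hits = [Candidate("Editor's Letter", 3), Candidate("The Future of Work", 12)]
visual_hits = [
    Candidate("the future  of work", 12),          # same article, found again
    Candidate("Advertisement: Acme Payroll", 7),   # not listed in the TOC
]
articles = merge_discoveries(toc_hits, visual_hits)
```

Items found only by the visual pass, such as the advertisement above, survive the merge with `sources == ["visual"]`, which is how content missing from the TOC can still end up in the article list.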
Stage 3 — Layout Mapping
AI-extracted text is matched to OCR bounding boxes by similarity. Images are linked to native assets. Preview crops are generated for visual review.
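Matching by similarity can be sketched as follows. The similarity metric Open Reader uses is not stated, so this example substitutes `difflib.SequenceMatcher` from the Python standard library. The block layout (`text` plus `bbox`), the 0.6 threshold, and the sample strings are assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1]; 1.0 means the normalized strings are identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_to_ocr(
    ai_text: str, ocr_blocks: List[Dict], threshold: float = 0.6
) -> Optional[Dict]:
    """Return the OCR block most similar to ai_text, or None if below threshold."""
    best, best_score = None, 0.0
    for block in ocr_blocks:
        score = similarity(ai_text, block["text"])
        if score > best_score:
            best, best_score = block, score
    return best if best_score >= threshold else None


# Made-up OCR output; the first block contains a typical OCR error ("0" for "o").
ocr_blocks = [
    {"text": "The Future of W0rk", "bbox": (72, 90, 540, 130)},
    {"text": "Remote teams are reshaping how companies hire.", "bbox": (72, 150, 300, 400)},
]

title_match = match_to_ocr("The Future of Work", ocr_blocks)
no_match = match_to_ocr("Completely unrelated caption", ocr_blocks)
```

Fuzzy matching matters here because the AI-extracted text and the OCR text rarely agree character for character; a threshold keeps weak matches from being assigned a position they do not belong to.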
Human Review & Export
Split-view editor for human review. Approve, edit, or reject articles. Export structured JSON with metadata, content blocks, and images.
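As a rough picture of the export step, the Python sketch below filters reviewed articles down to the approved ones and serializes them as JSON. The field names (`status`, `metadata`, `blocks`, `images`) and the sample articles are hypothetical; the actual export schema is not documented here.

```python
import json
from typing import Dict, List


def export_approved(articles: List[Dict]) -> str:
    """Serialize only approved articles as a JSON array (hypothetical schema)."""
    payload = [
        {
            "metadata": {"title": a["title"], "pages": a["pages"]},
            "blocks": a["blocks"],
            "images": a.get("images", []),
        }
        for a in articles
        if a.get("status") == "approved"
    ]
    return json.dumps(payload, indent=2, ensure_ascii=False)


# Made-up review results: one approved article and one rejected one.
reviewed = [
    {
        "title": "The Future of Work",
        "pages": [12, 13],
        "status": "approved",
        "blocks": [{"type": "title", "text": "The Future of Work"}],
        "images": ["p12_img1.png"],
    },
    {
        "title": "Duplicate detection",
        "pages": [14],
        "status": "rejected",
        "blocks": [],
    },
]

exported = export_approved(reviewed)
```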
Built For
Open Reader is designed for organizations that need to digitize and structure print publications at scale.
Trade & Industry Magazines
HR magazines, business journals, and industry publications with complex multi-column layouts, TOCs, and mixed content types.
Corporate Communications
Internal newsletters, annual reports, and stakeholder magazines that need to be archived and made searchable in digital form.
Publishing Archives
Digitize decades of print back-issues. Extract articles with metadata, making historical content discoverable and reusable.
Ready to Digitize?
Sign in to browse processed magazines, upload new issues, and start extracting articles.