From Print Magazine to Structured Content

Open Reader transforms PDF magazines into structured, CMS-ready articles using AI-powered extraction with human review. Upload a magazine, get back every article with full text, images, metadata, and precise page coordinates.

3 AI Stages + Human Review
Cost per Magazine: ~$0.17
OCR Resolution: 150 DPI
Processing Time: ~10-15 min

Platform Capabilities

A complete pipeline from PDF upload to structured article export — powered by Google Document AI and Gemini.

Magazine Digitization

Transform print magazines into structured digital content. Every article, image, and layout element is extracted and organized automatically.

Dual AI Discovery

Table of Contents parsing and visual page scanning work together to find every item in the issue, including ads, editorials, and articles not listed in the TOC.

Precise Layout Mapping

Google Document AI OCR provides pixel-accurate bounding boxes. Every text block and image is mapped to its exact position on the page.

Split-View Review

Side-by-side view: original page on the left, extracted content on the right. Click any block to highlight its region on the page.

Dual-Layer Editing

AI extractions are preserved as a read-only layer. Manual corrections are stored separately, so you can reset to the AI defaults at any time.
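
One way to picture the two layers, as a minimal sketch: the AI output sits behind a read-only view, manual edits live in a separate dictionary, and the displayed value is the edit when one exists. The class and field names (`LayeredArticle`, `ai_layer`, `edits`) and the sample values are illustrative assumptions, not Open Reader's actual data model.

```python
from types import MappingProxyType


class LayeredArticle:
    """Hypothetical two-layer record: read-only AI output plus separate manual edits."""

    def __init__(self, ai_fields):
        # Copy, then wrap in a read-only view so the AI layer cannot be modified.
        self.ai_layer = MappingProxyType(dict(ai_fields))
        self.edits = {}

    def set_field(self, name, value):
        # Manual corrections never touch the AI layer.
        self.edits[name] = value

    def get_field(self, name):
        # An edit, when present, overrides the AI value.
        return self.edits.get(name, self.ai_layer.get(name))

    def reset(self, name=None):
        # Drop one edit, or all edits, to fall back to the AI defaults.
        if name is None:
            self.edits.clear()
        else:
            self.edits.pop(name, None)


# Made-up sample data: the AI layer contains an OCR-style typo ("2O24").
article = LayeredArticle({"title": "Hiring Trends 2O24", "author": "J. Smith"})
article.set_field("title", "Hiring Trends 2024")
corrected_title = article.get_field("title")
article.reset("title")
restored_title = article.get_field("title")
```

Because edits are stored apart from the AI output, a reset is just a deletion from `edits`; nothing has to be re-extracted.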

CMS-Ready Export

Export approved articles as structured JSON with full metadata, content blocks, and image assets — ready for any CMS or publishing platform.

How It Works

Three AI-powered pipeline stages, then a final human review step — together they transform a raw PDF into reviewed, export-ready articles.

Stage 1 — PDF Extraction & OCR

Render page images at multiple DPI. Extract native images. Google Document AI provides OCR text with precise bounding boxes for every text block.
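
As a worked example of what a render resolution implies: PDF page sizes are expressed in points (72 per inch), so pixel dimensions follow from `points * dpi / 72`. The sketch below assumes a US Letter page (612 x 792 points); the function name and the page size are illustrative assumptions, not something Open Reader specifies.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def page_pixels(width_pt, height_pt, dpi):
    """Return the (width, height) in pixels of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 8.5 x 11 inches, i.e. 612 x 792 points.
letter_at_150 = page_pixels(612, 792, 150)  # (1275, 1650)
```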

Stage 2 — AI Content Analysis

Dual discovery: TOC parsing + visual scan find all articles. Then Gemini extracts full content blocks — titles, paragraphs, images, quotes — for each article.
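
A minimal sketch of how two discovery passes could be combined, assuming each pass yields candidates with a title and a start page. The merge rule (same start page means same item, with the TOC title preferred) and all names are assumptions for illustration; this page does not describe how Open Reader reconciles the two passes.

```python
def merge_candidates(toc_entries, scan_entries):
    """Union two candidate lists keyed by start page; TOC entries win on conflict."""
    merged = {}
    for entry in scan_entries:
        merged[entry["start_page"]] = {**entry, "source": "scan"}
    for entry in toc_entries:
        found_by_scan = entry["start_page"] in merged
        merged[entry["start_page"]] = {
            **entry,
            "source": "toc+scan" if found_by_scan else "toc",
        }
    # Return items in page order.
    return [merged[page] for page in sorted(merged)]


# Made-up candidates: the ad on page 7 is only visible to the visual scan.
toc = [{"title": "Interview", "start_page": 3}, {"title": "Cover Story", "start_page": 10}]
scan = [{"title": "Cover story", "start_page": 10}, {"title": "Full-page ad", "start_page": 7}]
items = merge_candidates(toc, scan)
```

Recording a `source` for each item keeps it visible during review which pass found it.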

Stage 3 — Layout Mapping

AI-extracted text is matched to OCR bounding boxes by similarity. Images are linked to native assets. Preview crops are generated for visual review.
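
Matching by similarity can be sketched with Python's standard `difflib`. This is an illustration under assumptions, not Open Reader's matching code: the similarity measure (`SequenceMatcher.ratio`), the 0.6 threshold, the block structure, and the normalized `bbox` values are all made up for the example.

```python
from difflib import SequenceMatcher


def best_block(text, ocr_blocks, threshold=0.6):
    """Return the OCR block most similar to `text`, or None if below threshold."""
    best, best_score = None, 0.0
    for block in ocr_blocks:
        score = SequenceMatcher(None, text.lower(), block["text"].lower()).ratio()
        if score > best_score:
            best, best_score = block, score
    return best if best_score >= threshold else None


# Hypothetical OCR output; bbox is (x0, y0, x1, y1) in normalized page coordinates.
ocr_blocks = [
    {"text": "Remote work is here to stay", "bbox": (0.10, 0.20, 0.45, 0.24)},
    {"text": "Subscribe today and save", "bbox": (0.55, 0.80, 0.90, 0.84)},
]
# The AI-extracted sentence differs slightly (trailing period) but still matches.
match = best_block("Remote work is here to stay.", ocr_blocks)
```

A threshold keeps text with no plausible counterpart on the page unmapped rather than attaching it to the wrong region.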

Human Review & Export

Split-view editor for human review. Approve, edit, or reject articles. Export structured JSON with metadata, content blocks, and images.
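
The export schema is not documented on this page. As a purely illustrative sketch, an approved article might be serialized as follows; every field name (`status`, `metadata`, `blocks`, `images`) and all sample values are assumptions.

```python
import json


def export_article(article):
    """Serialize an approved article to JSON; raise if it has not been approved."""
    if article["status"] != "approved":
        raise ValueError("only approved articles can be exported")
    payload = {
        "metadata": article["metadata"],
        "blocks": article["blocks"],
        "images": article["images"],
    }
    return json.dumps(payload, indent=2, ensure_ascii=False)


# Hypothetical reviewed article.
article = {
    "status": "approved",
    "metadata": {"title": "Sample Article", "pages": [10, 11]},
    "blocks": [{"type": "paragraph", "text": "Sample text.", "page": 10}],
    "images": [{"file": "page10_img1.png", "page": 10}],
}
exported = export_article(article)
```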

Built For

Open Reader is designed for organizations that need to digitize and structure print publications at scale.

Trade & Industry Magazines

HR magazines, business journals, and industry publications with complex multi-column layouts, TOCs, and mixed content types.

Corporate Communications

Internal newsletters, annual reports, and stakeholder magazines that need to be archived and searchable in digital format.

Publishing Archives

Digitize decades of print back-issues. Extract articles with metadata, making historical content discoverable and reusable.

Ready to Digitize?

Sign in to browse processed magazines, upload new issues, and start extracting articles.