Vision & DAT Concept

Beyond OCR: The Document Automation Transformer

Most people come to Ocriva expecting an OCR tool — something that reads text from images and PDFs. That is certainly part of what we do. But it describes only the first inch of a much longer journey.

Ocriva is a Document Automation Transformer (DAT).

A DAT is not a tool. It is a platform that manages the full lifecycle of a document — from the moment it enters your organization to the moment its data lands where it needs to be, in the right format, integrated into the right system, without any manual intervention.

This document explains what that means, why it matters, and how the DAT model changes the way organizations think about document processing.

NOTE

DAT is not just OCR — it's a complete automation pipeline that handles the entire document lifecycle from ingestion to integration with your business systems.

NOTE

The DAT concept is not unique to any single vendor — it is a category of software. Ocriva's implementation covers all five stages (Ingest, Extract, Transform, Automate, Integrate) in a single platform, whereas many organizations currently stitch together separate tools for each stage.

What Is a Document Automation Transformer?

The word "transformer" here carries a specific meaning. It describes a system that does not merely read documents — it transforms them.

A DAT takes unstructured, messy, human-readable documents and converts them into machine-ready, structured, actionable data that flows automatically into the systems that need it.

The full definition has five parts:

Ingest — accept documents from any source
Extract — understand content with AI, not just character recognition
Transform — convert to any format the downstream system expects
Automate — process at any volume, on schedule, in real time
Integrate — push results directly to your systems via API, webhook, or notification

When all five stages work together, document processing is no longer a task. It becomes a pipeline — reliable, measurable, and scalable.
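
As a rough illustration of how the five stages chain together, the sketch below models each stage as a plain Python function and runs them in order. Every name here (`ingest`, `extract`, `run_pipeline`, and the sample field values) is hypothetical and only stands in for real platform behavior; it is not Ocriva's API.

```python
import json
from typing import Callable

# Each stage takes a dict describing the document so far and returns an enriched copy.
Stage = Callable[[dict], dict]

def ingest(doc: dict) -> dict:
    # Accept the document and record where it came from.
    return {**doc, "source": doc.get("source", "upload")}

def extract(doc: dict) -> dict:
    # Placeholder for AI extraction; a real system would call a vision model here.
    return {**doc, "fields": {"vendor": "ACME", "total": "120.00"}}

def transform(doc: dict) -> dict:
    # Convert extracted fields into the format the downstream system expects (JSON here).
    return {**doc, "output": json.dumps(doc["fields"], sort_keys=True)}

def automate(doc: dict) -> dict:
    # Mark the document as processed so a queue or scheduler could track it.
    return {**doc, "status": "completed"}

def integrate(doc: dict) -> dict:
    # Placeholder for delivery; a real system would POST to a webhook or API.
    return {**doc, "delivered_to": "erp"}

PIPELINE: list[Stage] = [ingest, extract, transform, automate, integrate]

def run_pipeline(doc: dict) -> dict:
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

result = run_pipeline({"filename": "invoice-001.pdf"})
print(result["status"], result["output"])
```

Keeping each stage a function with the same signature means a stage can be swapped or tested on its own, which mirrors the idea that the stages are separable even when one platform provides all of them.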

OCR vs. DAT: The Critical Difference

Traditional OCR (Optical Character Recognition) was designed to answer one question: "What text is on this page?"

That is a useful capability. But it is not automation. Consider what happens after OCR reads a document:

A human receives the raw text output
A human decides which fields are relevant
A human copies those fields into another system
A human checks for errors
A human routes the result to the next step

OCR removed the typing. It did not remove the thinking, the routing, or the integration work. Everything after the read step was still manual.

DAT removes the entire manual chain.

Capability	Traditional OCR	Document Automation Transformer
Read text from image	Yes	Yes
Understand document context	No	Yes — AI Vision Models
Handle varying layouts	No — requires fixed templates	Yes — AI infers structure
Multi-language in one document	Limited	Yes — Thai + English natively
Output in structured format	No — raw text only	Yes — JSON, CSV, XML, PDF, DOCX
Route results to other systems	No	Yes — webhooks, API, notifications
Process in bulk	Limited	Yes — batch up to 50 files
Schedule processing jobs	No	Yes — cron-based queue
Track accuracy over time	No	Yes — analytics and history
Support multiple AI providers	No	Yes — 6 providers, switchable per template

The shift from OCR to DAT is not an incremental improvement. It is a different category of software.

The Five Stages of Document Automation

Every document that enters Ocriva passes through five stages. Understanding these stages is essential for designing a complete document automation workflow.

Stage 1: Ingest

A document automation system is only useful if it can receive documents reliably, from wherever they originate.

Ocriva accepts documents through multiple channels:

Single file upload — drag and drop from the web interface
Batch upload — up to 50 files submitted at once
REST API — programmatic submission from any application
Webhook triggers — event-driven ingestion from upstream systems
LINE integration — photo capture and submission from mobile devices

Supported formats: PDF, JPG, PNG, WebP, BMP

This flexibility means Ocriva fits into your existing workflow without requiring your team to change how they handle documents. Employees can photograph receipts on their phones. Finance systems can push invoices via API. Batch jobs can drop files into a processing queue overnight.
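
A client that submits files programmatically will likely need to respect the 50-file batch limit and the supported format list. The sketch below shows one way to pre-sort local files before upload. The function name `plan_batches` is hypothetical, no upload call is made, and the extension set contains only the formats listed above (whether `.jpeg` is accepted as a spelling of JPG is not stated, so adjust the set as needed).

```python
from pathlib import Path

# Formats and batch limit taken from the lists above.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".webp", ".bmp"}
MAX_BATCH_SIZE = 50

def plan_batches(paths: list[str]) -> tuple[list[list[str]], list[str]]:
    """Split paths into batches of at most MAX_BATCH_SIZE; return (batches, rejected)."""
    accepted: list[str] = []
    rejected: list[str] = []
    for path in paths:
        # Compare extensions case-insensitively so "SCAN.PDF" is accepted too.
        if Path(path).suffix.lower() in SUPPORTED_EXTENSIONS:
            accepted.append(path)
        else:
            rejected.append(path)
    batches = [
        accepted[i:i + MAX_BATCH_SIZE]
        for i in range(0, len(accepted), MAX_BATCH_SIZE)
    ]
    return batches, rejected

files = [f"invoice-{i}.pdf" for i in range(120)] + ["notes.txt"]
batches, rejected = plan_batches(files)
print([len(b) for b in batches], rejected)  # [50, 50, 20] ['notes.txt']
```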

Stage 2: Extract

Extraction is where Ocriva separates itself from traditional OCR most clearly.

Instead of pattern-matching characters against a grid, Ocriva sends documents to AI Vision Models: vision-capable large language models that interpret what a document means, not just which characters appear on the page.

This means:

A purchase order from Supplier A and Supplier B can look completely different on the page, and Ocriva will extract the same fields from both — because it understands what "total amount due" means, regardless of where it appears
Handwritten notes within a typed document are readable
Thai and English in the same document are handled without separate configurations
Tables, nested fields, and lists are understood structurally, not just as flat text

Ocriva connects to six AI providers: OpenAI, Google Gemini, Anthropic, DeepSeek, Qwen, and Kimi. Different templates can use different providers, allowing you to optimize for accuracy, speed, or cost depending on the document type.

Extraction is guided by Templates — reusable definitions that tell the AI what fields to extract, in what format, and with what instructions. Once a template is created for invoice extraction, it processes every invoice the same way, consistently.
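
To make the idea of a template concrete, here is a minimal sketch of what such a definition could look like as data. The `Template` and `FieldSpec` classes, their attributes, and the rendered prompt wording are all assumptions for illustration; the actual template schema is defined by the platform and may differ. Only the provider names come from the list above.

```python
from dataclasses import dataclass

# Provider names as listed above; the validation itself is an illustrative assumption.
PROVIDERS = {"OpenAI", "Google Gemini", "Anthropic", "DeepSeek", "Qwen", "Kimi"}

@dataclass(frozen=True)
class FieldSpec:
    name: str
    type: str              # e.g. "string", "number", "date"
    instruction: str = ""  # optional extra guidance for the model

@dataclass(frozen=True)
class Template:
    name: str
    provider: str
    fields: tuple[FieldSpec, ...]

    def __post_init__(self) -> None:
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider}")

    def prompt(self) -> str:
        """Render the field list as plain-text instructions for a model."""
        lines = [f"Extract the following fields from the {self.name} document:"]
        for spec in self.fields:
            hint = f" ({spec.instruction})" if spec.instruction else ""
            lines.append(f"- {spec.name}: {spec.type}{hint}")
        return "\n".join(lines)

invoice = Template(
    name="invoice",
    provider="OpenAI",
    fields=(
        FieldSpec("invoice_number", "string"),
        FieldSpec("total_amount_due", "number", "use the grand total, including tax"),
    ),
)
print(invoice.prompt())
```

Because the template is immutable data, the same definition produces the same instructions for every document, which is the consistency property described above.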

TIP

Different document types have different accuracy and cost tradeoffs across AI providers. Start with a cheaper, faster model (e.g., Gemini Flash) during template development, then benchmark against a higher-accuracy model (e.g., GPT-4o) before deciding which to use in production.
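
One simple way to run that benchmark is to hand-label a small sample of documents and score each model's output field by field. The sketch below assumes extraction results are available as flat dictionaries of strings; the helper names, the placeholder model labels, and the sample values are illustrative and are not measured results.

```python
def field_accuracy(expected: dict[str, str], actual: dict[str, str]) -> float:
    """Fraction of expected fields whose extracted value matches after trimming whitespace."""
    if not expected:
        return 1.0
    hits = sum(
        1 for key, value in expected.items()
        if actual.get(key, "").strip() == value.strip()
    )
    return hits / len(expected)

def compare_models(
    expected: list[dict[str, str]],
    outputs: dict[str, list[dict[str, str]]],
) -> dict[str, float]:
    """Mean field accuracy per model over a hand-labeled sample set."""
    return {
        model: sum(field_accuracy(e, a) for e, a in zip(expected, results)) / len(expected)
        for model, results in outputs.items()
    }

# Hand-labeled ground truth for two sample documents (made-up values).
labeled = [
    {"total": "120.00", "vendor": "ACME"},
    {"total": "75.50", "vendor": "Globex"},
]
outputs = {
    "fast-model": [
        {"total": "120.00", "vendor": "ACME"},
        {"total": "75.50", "vendor": "Globx"},  # one misread field
    ],
    "accurate-model": [
        {"total": "120.00", "vendor": "ACME"},
        {"total": "75.50", "vendor": "Globex"},
    ],
}
scores = compare_models(labeled, outputs)
print(scores)  # {'fast-model': 0.75, 'accurate-model': 1.0}
```

Exact string matching is deliberately strict. For amounts or dates, normalizing values before comparison would likely give a fairer score.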

Stage 3: Transform

Raw extracted data rarely matches the format a downstream system expects. The Transform stage converts extracted data into whatever format your systems consume.

Ocriva supports seven output formats:

Format	Best For
JSON	API integrations, structured databases
CSV	Spreadsheets, data pipelines, analytics
PDF	Human-readable reports, archiving
DOCX	Editable documents, review workflows
XML	Legacy enterprise systems
HTML	Web display, email embedding
Text	Simple pipelines, logging

For batch processing, results can be combined into a single export file — one CSV containing all extracted fields from 50 invoices, for example.
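
The combined export can be pictured as follows. This is a minimal sketch of merging per-document results into one CSV using Python's standard `csv` module, assuming each result is a flat dictionary of strings; `combine_to_csv` and the sample rows are hypothetical and do not describe how the platform builds its export.

```python
import csv
import io

def combine_to_csv(results: list[dict[str, str]]) -> str:
    """Merge per-document field dicts into one CSV, one row per document."""
    # Collect column names in first-seen order so documents with extra fields still fit.
    columns: list[str] = []
    for row in results:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    # restval fills columns that a given document did not produce.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()

results = [
    {"file": "inv-1.pdf", "vendor": "ACME", "total": "120.00"},
    {"file": "inv-2.pdf", "vendor": "Globex", "total": "75.50", "po_number": "PO-7"},
]
print(combine_to_csv(results))
```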

The Transform stage ensures that Ocriva does not create a new data silo. It produces data in the shape your existing tools already understand.

Stage 4: Automate

Processing one document manually is trivial. Processing 10,000 documents per month — consistently, reliably, without errors — is an automation problem.

The Automate stage handles volume, scheduling, and reliability:

Batch processing — submit 50 documents and monitor real-time progress as each one completes
Processing queue — documents are queued and processed in order, with retry logic for failures
Cron scheduling — configure jobs to run at specific intervals (nightly batch processing, for example)
Real-time status tracking — every document has a status: pending, in_progress, completed, or failed
WebSocket updates — the UI updates live as processing progresses, no refresh needed
Credit management — usage is tracked and billed automatically, with low-balance alerts

At this stage, document processing becomes infrastructure — something that runs in the background without human supervision.
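
The queue-and-retry behavior can be sketched in a few lines. The example below is an in-memory illustration only: the status values match the four listed above, but `process_queue`, the `max_attempts` default of 3, and the policy of re-queueing a failed document at the back of the queue are assumptions, not a description of Ocriva's internals. It also assumes document IDs are unique.

```python
from collections import deque
from enum import Enum
from typing import Callable

class Status(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

def process_queue(
    doc_ids: list[str],
    worker: Callable[[str], None],
    max_attempts: int = 3,
) -> dict[str, Status]:
    """Process documents in FIFO order, re-queueing failures until max_attempts is reached."""
    status = {doc_id: Status.PENDING for doc_id in doc_ids}
    attempts = {doc_id: 0 for doc_id in doc_ids}
    queue = deque(doc_ids)
    while queue:
        doc_id = queue.popleft()
        status[doc_id] = Status.IN_PROGRESS
        attempts[doc_id] += 1
        try:
            worker(doc_id)
            status[doc_id] = Status.COMPLETED
        except Exception:
            if attempts[doc_id] < max_attempts:
                status[doc_id] = Status.PENDING
                queue.append(doc_id)  # retry later, behind documents already waiting
            else:
                status[doc_id] = Status.FAILED
    return status

calls: dict[str, int] = {}

def flaky_worker(doc_id: str) -> None:
    # Simulated worker: "b" fails once and then succeeds, "c" always fails.
    calls[doc_id] = calls.get(doc_id, 0) + 1
    if doc_id == "b" and calls[doc_id] < 2:
        raise RuntimeError("transient error")
    if doc_id == "c":
        raise RuntimeError("permanent error")

final = process_queue(["a", "b", "c"], flaky_worker)
print({doc_id: s.value for doc_id, s in final.items()})
```

A production queue would typically add a delay between retries and persist state outside the process; both are omitted here to keep the control flow visible.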

Stage 5: Integrate

Extracted, transformed data must reach its destination. The Integrate stage is what makes DAT different from a document viewer.

Ocriva pushes results to your systems through multiple channels:

Webhooks — HTTP POST to your endpoint when a document is processed, a batch completes, or an error occurs
REST API — pull results programmatically at any time
WebSocket — real-time push to web clients
LINE notifications — alerts to LINE users when processing is complete
Email notifications — configurable alerts for completions and errors

This means a processed invoice does not sit in Ocriva waiting for someone to log in and download it. It flows automatically to your accounting system, triggers an approval workflow, or populates a database row — all without human involvement.
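
On the receiving side, a webhook consumer mostly needs to parse the request body and branch on the kind of event. The sketch below shows that dispatch step in isolation. The payload shape, the event names (`document.processed`, `batch.completed`, `document.failed`), and the field names are hypothetical placeholders for the three cases described above; consult the webhook reference for the real schema.

```python
import json

def handle_webhook(body: bytes) -> str:
    """Parse a webhook body and return a description of the action taken."""
    payload = json.loads(body)
    event = payload.get("event")
    if event == "document.processed":
        # A real handler might write the fields to an accounting system here.
        return f"stored {len(payload['fields'])} fields for {payload['document_id']}"
    if event == "batch.completed":
        return f"batch {payload['batch_id']} finished"
    if event == "document.failed":
        return f"alert: {payload['document_id']} failed: {payload['error']}"
    # Unknown events are acknowledged but not acted on.
    return f"ignored unknown event {event!r}"

sample = json.dumps({
    "event": "document.processed",
    "document_id": "doc_123",
    "fields": {"vendor": "ACME", "total": "120.00"},
}).encode()
print(handle_webhook(sample))  # stored 2 fields for doc_123
```

Ignoring unknown event types rather than raising keeps the receiver tolerant of events added later.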

The DAT Pipeline: An Overview

Documents In → [Ingest] → [Extract] → [Transform] → [Automate] → [Integrate] → Data Out
   PDF          API        AI          JSON          Queue        Webhook      ERP
   Image        Batch      Multi-AI    CSV           Cron         API          CRM
   Scan         LINE       Templates   PDF           Retry        Notify       Database

Each stage adds value. Each stage removes a manual step. Together, they create a continuous pipeline from document receipt to system integration.

TIP

If you are new to the platform, start with the Getting Started guide. It walks you through creating your first organization, project, template, and document upload in under 15 minutes.

Why This Matters for Businesses

Most organizations that process documents manually have adapted to the friction. They have staff whose job is to type, check, re-type, and route. They have accepted that documents take a day or two to process. They have built approval workflows around the assumption of human bottlenecks.

DAT removes those bottlenecks. The business impact is measurable:

Speed: Documents processed in seconds instead of hours
Accuracy: AI extraction typically produces fewer errors than manual transcription, and a given template applies the same rules to every document
Cost: Processing costs scale with credits consumed rather than with headcount, so higher volume does not require more staff
Scalability: Processing 10 documents and 10,000 documents uses the same infrastructure
Auditability: Every extraction is logged with its source document, result, model used, and timestamp

Ocriva's Position

Ocriva is not competing with simple OCR tools. Single-purpose tools that extract text and stop are useful in specific contexts.

Ocriva is for organizations that have moved past the question "can we read this document?" and are asking "can we automate everything that happens to this document after we read it?"

That is the DAT question. And Ocriva is built to answer it.

TIP

If you are evaluating whether DAT fits your use case, a good indicator is whether your team currently downloads extracted data, opens another application, and manually copies values into it. If that step exists, a DAT can eliminate it.
