Vision & DAT Concept
Beyond OCR: The Document Automation Transformer
Most people come to Ocriva expecting an OCR tool — something that reads text from images and PDFs. That is certainly part of what we do. But it describes only the first inch of a much longer journey.
Ocriva is a Document Automation Transformer (DAT).
A DAT is not a tool. It is a platform that manages the full lifecycle of a document — from the moment it enters your organization to the moment its data lands where it needs to be, in the right format, integrated into the right system, without any manual intervention.
This document explains what that means, why it matters, and how the DAT model changes the way organizations think about document processing.
NOTE
DAT is not just OCR — it's a complete automation pipeline that handles the entire document lifecycle from ingestion to integration with your business systems.
NOTE
The DAT concept is not unique to any single vendor — it is a category of software. Ocriva's implementation covers all five stages (Ingest, Extract, Transform, Automate, Integrate) in a single platform, whereas many organizations currently stitch together separate tools for each stage.
What Is a Document Automation Transformer?
The word "transformer" here carries a specific meaning. It describes a system that does not merely read documents — it transforms them.
A DAT takes unstructured, messy, human-readable documents and converts them into machine-ready, structured, actionable data that flows automatically into the systems that need it.
The full definition has five parts:
- Ingest — accept documents from any source
- Extract — understand content with AI, not just character recognition
- Transform — convert to any format the downstream system expects
- Automate — process at any volume, on schedule, in real time
- Integrate — push results directly to your systems via API, webhook, or notification
When all five stages work together, document processing is no longer a task. It becomes a pipeline — reliable, measurable, and scalable.
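The idea of five stages acting as one pipeline can be sketched as plain function composition. This is only an illustration: the stage functions below are hypothetical placeholders that record that they ran, and none of these names come from Ocriva's API.

```python
from typing import Callable

Payload = dict
Stage = Callable[[Payload], Payload]

# The five stage names from the definition above.
STAGE_NAMES = ["ingest", "extract", "transform", "automate", "integrate"]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(payload: Payload) -> Payload:
        return {**payload, "trace": payload.get("trace", []) + [name]}
    return stage


def run_pipeline(payload: Payload, stages: list[Stage]) -> Payload:
    """Pass the payload through each stage in order."""
    for stage in stages:
        payload = stage(payload)
    return payload


result = run_pipeline({"file": "invoice.pdf"}, [make_stage(n) for n in STAGE_NAMES])
print(result["trace"])
```

Because each stage takes and returns the same kind of value, stages can be tested in isolation and the whole chain can be measured end to end.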
OCR vs. DAT: The Critical Difference
Traditional OCR (Optical Character Recognition) was designed to answer one question: "What text is on this page?"
That is a useful capability. But it is not automation. Consider what happens after OCR reads a document:
- A human receives the raw text output
- A human decides which fields are relevant
- A human copies those fields into another system
- A human checks for errors
- A human routes the result to the next step
OCR removed the typing. It did not remove the thinking, the routing, or the integration work. Everything after the read step was still manual.
DAT removes the entire manual chain.
| Capability | Traditional OCR | Document Automation Transformer |
|---|---|---|
| Read text from image | Yes | Yes |
| Understand document context | No | Yes — AI Vision Models |
| Handle varying layouts | No — requires fixed templates | Yes — AI infers structure |
| Multi-language in one document | Limited | Yes — Thai + English natively |
| Output in structured format | No — raw text only | Yes — JSON, CSV, XML, PDF, DOCX |
| Route results to other systems | No | Yes — webhooks, API, notifications |
| Process in bulk | Limited | Yes — batch up to 50 files |
| Schedule processing jobs | No | Yes — cron-based queue |
| Track accuracy over time | No | Yes — analytics and history |
| Support multiple AI providers | No | Yes — 6 providers, switchable per template |
The shift from OCR to DAT is not an incremental improvement. It is a different category of software.
The Five Stages of Document Automation
Every document that enters Ocriva passes through five stages. Understanding these stages is essential for designing a complete document automation workflow.
Stage 1: Ingest
A document automation system is only useful if it can receive documents reliably, from wherever they originate.
Ocriva accepts documents through multiple channels:
- Single file upload — drag and drop from the web interface
- Batch upload — up to 50 files submitted at once
- REST API — programmatic submission from any application
- Webhook triggers — event-driven ingestion from upstream systems
- LINE integration — photo capture and submission from mobile devices
Supported formats: PDF, JPG, PNG, WebP, BMP
This flexibility means Ocriva fits into your existing workflow without requiring your team to change how they handle documents. Employees can photograph receipts on their phones. Finance systems can push invoices via API. Batch jobs can drop files into a processing queue overnight.
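A client can check a batch against the limits listed above before submitting it. The following is a minimal sketch, assuming files are identified by name and that the format is inferred from the extension; `validate_batch` is a hypothetical helper, not part of Ocriva's API, and accepting `.jpeg` as JPG is an assumption.

```python
from pathlib import Path

# Formats listed above; ".jpeg" is an assumption (treated as JPG).
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".webp", ".bmp"}
MAX_BATCH_SIZE = 50  # batch upload limit stated above


def validate_batch(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (accepted, rejected) by extension.

    Raises ValueError if the batch exceeds the 50-file limit.
    """
    if len(filenames) > MAX_BATCH_SIZE:
        raise ValueError(
            f"batch has {len(filenames)} files; the limit is {MAX_BATCH_SIZE}"
        )
    accepted, rejected = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected


accepted, rejected = validate_batch(["invoice.pdf", "receipt.PNG", "scan.tiff"])
print(accepted, rejected)
```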
Stage 2: Extract
Extraction is where Ocriva separates itself from traditional OCR most clearly.
Instead of pattern-matching characters against a grid, Ocriva sends documents to AI Vision Models — large language models trained to understand the meaning of a document, not just its typography.
This means:
- A purchase order from Supplier A and Supplier B can look completely different on the page, and Ocriva will extract the same fields from both — because it understands what "total amount due" means, regardless of where it appears
- Handwritten notes within a typed document are readable
- Thai and English in the same document are handled without separate configurations
- Tables, nested fields, and lists are understood structurally, not just as flat text
Ocriva connects to six AI providers: OpenAI, Google Gemini, Anthropic, DeepSeek, Qwen, and Kimi. Different templates can use different providers, allowing you to optimize for accuracy, speed, or cost depending on the document type.
Extraction is guided by Templates — reusable definitions that tell the AI what fields to extract, in what format, and with what instructions. Once a template is created for invoice extraction, it processes every invoice the same way, consistently.
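To make the idea of a template concrete, here is a rough sketch of what such a definition could look like as data. The class names, attributes, and prompt wording are all hypothetical; Ocriva's actual template schema is not described in this document.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    name: str              # field to extract, e.g. "total_amount"
    type: str              # expected format, e.g. "string" or "number"
    instruction: str = ""  # optional guidance for the AI


@dataclass
class Template:
    name: str
    provider: str  # which AI provider this template uses
    fields: list[FieldSpec] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the field list as instructions (hypothetical wording)."""
        lines = [f"Extract these fields from the {self.name} document:"]
        for spec in self.fields:
            line = f"- {spec.name} ({spec.type})"
            if spec.instruction:
                line += f": {spec.instruction}"
            lines.append(line)
        return "\n".join(lines)


invoice = Template(
    name="invoice",
    provider="openai",
    fields=[
        FieldSpec("invoice_number", "string"),
        FieldSpec("total_amount", "number", "grand total including tax"),
    ],
)
print(invoice.to_prompt())
```

Keeping the field list in one reusable object is what lets every invoice be processed against the same definition.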
TIP
Different document types have different accuracy and cost tradeoffs across AI providers. Start with a cheaper, faster model (e.g., Gemini Flash) during template development, then benchmark against a higher-accuracy model (e.g., GPT-4o) before deciding which to use in production.
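One simple way to run such a benchmark is to compare each model's extracted fields against hand-verified values for a small sample set. The sketch below assumes exact-match scoring per field and uses made-up sample data; the model labels are placeholders, and nothing here calls a real provider.

```python
def field_accuracy(expected: dict, actual: dict) -> float:
    """Fraction of expected fields whose extracted value matches exactly."""
    if not expected:
        return 1.0
    matches = sum(1 for key, value in expected.items() if actual.get(key) == value)
    return matches / len(expected)


def benchmark(
    ground_truth: list[dict], outputs_by_model: dict[str, list[dict]]
) -> dict[str, float]:
    """Mean per-document field accuracy for each model."""
    scores = {}
    for model, outputs in outputs_by_model.items():
        per_doc = [field_accuracy(e, a) for e, a in zip(ground_truth, outputs)]
        scores[model] = sum(per_doc) / len(per_doc)
    return scores


# Hand-verified values for two sample invoices (illustrative data only).
truth = [
    {"invoice_number": "INV-001", "total": "1200.50"},
    {"invoice_number": "INV-002", "total": "80.00"},
]
outputs = {
    "fast-model": [
        {"invoice_number": "INV-001", "total": "1200.50"},
        {"invoice_number": "INV-002", "total": "30.00"},  # misread total
    ],
    "accurate-model": [
        {"invoice_number": "INV-001", "total": "1200.50"},
        {"invoice_number": "INV-002", "total": "80.00"},
    ],
}
scores = benchmark(truth, outputs)
print(scores)
```

Exact matching is strict; for free-text fields a normalized or fuzzy comparison may be more appropriate.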
Stage 3: Transform
Raw extracted data rarely matches the format a downstream system expects. The Transform stage converts extracted data into whatever format your systems consume.
Ocriva supports seven output formats:
| Format | Best For |
|---|---|
| JSON | API integrations, structured databases |
| CSV | Spreadsheets, data pipelines, analytics |
| PDF | Human-readable reports, archiving |
| DOCX | Editable documents, review workflows |
| XML | Legacy enterprise systems |
| HTML | Web display, email embedding |
| Text | Simple pipelines, logging |
For batch processing, results can be combined into a single export file — one CSV containing all extracted fields from 50 invoices, for example.
The Transform stage ensures that Ocriva does not create a new data silo. It produces data in the shape your existing tools already understand.
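As an illustration of what the Transform step involves, the sketch below renders the same extracted records as JSON and as one combined CSV using only the Python standard library. The record fields are invented for the example, and this is not Ocriva's implementation.

```python
import csv
import io
import json


def to_json(records: list[dict]) -> str:
    """Serialize records as JSON, keeping non-ASCII text (e.g. Thai) readable."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records: list[dict]) -> str:
    """Combine records into one CSV; columns are the union of all keys."""
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


records = [
    {"invoice_no": "INV-001", "total": 1200.5},
    {"invoice_no": "INV-002", "total": 80, "currency": "THB"},
]
print(to_csv(records))
```

Taking the union of keys means documents with slightly different extracted fields can still share one export file.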
Stage 4: Automate
Processing one document manually is trivial. Processing 10,000 documents per month — consistently, reliably, without errors — is an automation problem.
The Automate stage handles volume, scheduling, and reliability:
- Batch processing — submit 50 documents and monitor real-time progress as each one completes
- Processing queue — documents are queued and processed in order, with retry logic for failures
- Cron scheduling — configure jobs to run at specific intervals (nightly batch processing, for example)
- Real-time status tracking — every document has a status: pending, in_progress, completed, or failed
- WebSocket updates — the UI updates live as processing progresses, no refresh needed
- Credit management — usage is tracked and billed automatically, with low-balance alerts
At this stage, document processing becomes infrastructure — something that runs in the background without human supervision.
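The queue, retry, and status behaviour described above can be sketched in a few lines. The four status values come from the list above; everything else (the function name, three attempts per document, re-queueing failures at the back) is an assumption made for illustration.

```python
from collections import deque
from enum import Enum
from typing import Callable


class Status(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


def process_queue(
    doc_ids: list[str],
    process: Callable[[str], None],
    max_attempts: int = 3,  # assumed retry budget
) -> dict[str, Status]:
    """Process documents in order, retrying failures up to max_attempts."""
    statuses = {doc_id: Status.PENDING for doc_id in doc_ids}
    attempts = {doc_id: 0 for doc_id in doc_ids}
    queue = deque(doc_ids)
    while queue:
        doc_id = queue.popleft()
        statuses[doc_id] = Status.IN_PROGRESS
        attempts[doc_id] += 1
        try:
            process(doc_id)
            statuses[doc_id] = Status.COMPLETED
        except Exception:
            if attempts[doc_id] < max_attempts:
                statuses[doc_id] = Status.PENDING
                queue.append(doc_id)  # retry after the remaining documents
            else:
                statuses[doc_id] = Status.FAILED
    return statuses
```

A caller passes any function that raises on failure; documents that keep failing end in the `failed` state instead of blocking the rest of the queue.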
Stage 5: Integrate
Extracted, transformed data must reach its destination. The Integrate stage is what makes DAT different from a document viewer.
Ocriva pushes results to your systems through multiple channels:
- Webhooks — HTTP POST to your endpoint when a document is processed, a batch completes, or an error occurs
- REST API — pull results programmatically at any time
- WebSocket — real-time push to web clients
- LINE notifications — alerts to LINE users when processing is complete
- Email notifications — configurable alerts for completions and errors
This means a processed invoice does not sit in Ocriva waiting for someone to log in and download it. It flows automatically to your accounting system, triggers an approval workflow, or populates a database row — all without human involvement.
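On the receiving side, a webhook endpoint typically parses the POST body and dispatches on the event type. The sketch below covers the three triggers mentioned above, but the event names and payload keys (`event`, `document_id`, `batch_id`, `message`) are hypothetical; consult the actual webhook reference for the real payload shape.

```python
import json


def on_document_processed(event: dict) -> str:
    return f"save result for document {event['document_id']}"


def on_batch_completed(event: dict) -> str:
    return f"export batch {event['batch_id']}"


def on_error(event: dict) -> str:
    return f"alert: {event['message']}"


# Hypothetical event names, one per trigger described above.
HANDLERS = {
    "document.processed": on_document_processed,
    "batch.completed": on_batch_completed,
    "document.failed": on_error,
}


def handle_webhook(raw_body: bytes) -> str:
    """Parse a webhook POST body and route it to the matching handler."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("event"))
    if handler is None:
        return "ignored"  # unknown event types are skipped, not treated as errors
    return handler(event)


print(handle_webhook(b'{"event": "document.processed", "document_id": "doc_1"}'))
```

In a real deployment each handler would write to the accounting system, start an approval workflow, or insert a database row instead of returning a string.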
The DAT Pipeline: An Overview
Documents In → [Ingest] → [Extract] → [Transform] → [Automate] → Data Out
| Documents In | Ingest | Extract | Transform | Automate | Data Out |
|---|---|---|---|---|---|
| PDF | API | AI | JSON | Webhook | ERP |
| Image | Batch | Multi-AI | CSV | API | CRM |
| Scan | LINE | Templates | PDF | Notify | Database |
Each stage adds value. Each stage removes a manual step. Together, they create a continuous pipeline from document receipt to system integration.
TIP
If you are new to the platform, start with the Getting Started guide. It walks you through creating your first organization, project, template, and document upload in under 15 minutes.
Why This Matters for Businesses
Most organizations that process documents manually have adapted to the friction. They have staff whose job is to type, check, re-type, and route. They have accepted that documents take a day or two to process. They have built approval workflows around the assumption of human bottlenecks.
DAT removes those bottlenecks. The business impact is measurable:
- Speed: Documents processed in seconds instead of hours
- Accuracy: AI extraction error rates are lower than human transcription error rates, and the errors that do occur are consistent
- Cost: Processing costs track credits consumed rather than headcount, so higher volume does not require more staff
- Scalability: Processing 10 documents and 10,000 documents uses the same infrastructure
- Auditability: Every extraction is logged with its source document, result, model used, and timestamp
Ocriva's Position
Ocriva is not competing with simple OCR tools. Single-purpose tools that extract text and stop are useful in specific contexts.
Ocriva is for organizations that have moved past the question "can we read this document?" and are asking "can we automate everything that happens to this document after we read it?"
That is the DAT question. And Ocriva is built to answer it.
TIP
If you are evaluating whether DAT fits your use case, a good indicator is whether your team currently downloads extracted data, opens another application, and manually copies values into it. If that step exists, a DAT can eliminate it.
