Use Cases
Document Automation in Practice
The difference between a document processing tool and a Document Automation Transformer (DAT) becomes clearest in real-world scenarios. A tool extracts text. A transformer eliminates a workflow.
The use cases below illustrate what full-pipeline automation looks like — from document receipt through to system integration. Each scenario describes the problem being solved, the documents involved, the data extracted, and the automation flow that replaces the manual process.
Use Case 1: Accounts Payable Automation
Scenario
A mid-size company receives 200–300 supplier invoices per month. Each invoice arrives as a PDF via email, in varying formats from different suppliers. The accounts payable team manually processes each one: extracting vendor details, amounts, and due dates, entering data into the accounting system, routing for approval, and scheduling payment.
At peak periods (month-end, year-end), the backlog can exceed the team's daily capacity, causing late payment fees and strained supplier relationships.
Document Types
- PDF invoices from suppliers
- Email attachments (forwarded to Ocriva via API)
- Scanned paper invoices
Extracted Fields
- Vendor name and tax ID
- Invoice number and date
- Payment due date and payment terms
- Line items (description, quantity, unit price, total)
- Subtotal, VAT amount, grand total
- Bank account details (for payment scheduling)
Automation Flow
Invoice Received (email/upload)
↓
Ocriva: Extract via Invoice Template (GPT-4o)
↓
Webhook → Accounting System API (auto-create payable record)
↓
Webhook → Approval Workflow (auto-route if amount < threshold)
↓
Webhook → Payment Scheduling (set due date reminder)
↓
Notification → AP Team (batch completion summary)
Impact
| Metric | Before | After |
|---|---|---|
| Processing time per invoice | 12–15 minutes | Under 10 seconds |
| Error rate | ~2.5% (human transcription) | <0.5% (AI extraction) |
| Month-end backlog | 3–4 days | Same-day processing |
| Late payment incidents | 4–6 per quarter | Near zero |
Configuration Notes
- Use a single Ocriva template covering all supplier formats
- Configure webhook to POST to accounting system on the document.processed event
- Set auto-approve rule in accounting system for invoices under a threshold amount
- Enable batch export for monthly reconciliation reports
TIP
Use structured JSON output mode with a detailed schema for invoice extraction; it produces the most consistent and machine-readable results. A well-defined JSON schema with fields like invoice_number, total_amount, and line_items makes it straightforward to map data directly into your accounting system via webhook without any transformation layer.
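As a rough illustration of what the receiving side of that webhook might do, the sketch below turns an extraction result into a payable record and picks a routing queue. Only invoice_number, total_amount, and line_items come from the text above. The other key names, the record layout, the queue names, and the 10,000 threshold are assumptions for illustration, not Ocriva's payload format or any accounting system's API.

```python
from decimal import Decimal

# Hypothetical threshold; the real value comes from your approval policy.
AUTO_APPROVE_LIMIT = Decimal("10000.00")


def to_payable_record(extraction: dict) -> dict:
    """Map an extracted invoice (assumed field names) to a payable record."""
    total = Decimal(str(extraction["total_amount"]))
    line_total = sum(
        (Decimal(str(item["total"])) for item in extraction["line_items"]),
        Decimal("0"),
    )
    return {
        "reference": extraction["invoice_number"],
        "vendor": extraction.get("vendor_name", ""),
        "due_date": extraction.get("due_date"),
        "amount": total,
        # Flag the invoice when line items do not add up to the subtotal.
        "needs_review": "subtotal" in extraction
        and line_total != Decimal(str(extraction["subtotal"])),
    }


def route(record: dict) -> str:
    """Return the (hypothetical) queue a payable record should go to."""
    if record["needs_review"]:
        return "manual_review"
    if record["amount"] < AUTO_APPROVE_LIMIT:
        return "auto_approve"
    return "approval_workflow"


sample = {
    "invoice_number": "INV-001",
    "vendor_name": "Example Supplier Co.",
    "due_date": "2025-01-31",
    "line_items": [
        {"description": "Paper", "quantity": 10, "unit_price": "50.00", "total": "500.00"},
        {"description": "Toner", "quantity": 2, "unit_price": "250.00", "total": "500.00"},
    ],
    "subtotal": "1000.00",
    "total_amount": "1070.00",
}
print(route(to_payable_record(sample)))
```

Amounts are parsed with Decimal rather than float so that comparisons against the threshold and the subtotal are not affected by binary rounding.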
Use Case 2: Expense Report Processing
Scenario
A company with 150 employees requires staff to submit expense receipts for reimbursement. Staff photograph receipts on their phones and either email them or submit them through an expense app. Finance processes these manually, verifying amounts, dates, and merchants, and entering them into the expense management system.
Volume peaks at the end of each month and before quarterly closes.
Document Types
- Photo receipts (JPG/PNG) — restaurant, taxi, hotel, office supplies
- POS thermal receipts (photographed)
- E-receipts (PDF)
- Fuel receipts and parking tickets
Extracted Fields
- Merchant name and category
- Transaction date and time
- Items purchased (if itemized)
- Subtotal, tax, total amount
- Payment method (cash, card)
- Receipt number or reference
Automation Flow
Employee photographs receipt via LINE
↓
LINE → Ocriva: Receipt Template (Gemini Flash for speed)
↓
Structured JSON result (merchant, amount, date, category)
↓
Webhook → Expense Management System (auto-create draft claim)
↓
Batch CSV Export → Finance Team (monthly review)
↓
Analytics → Expense Category Dashboard
Impact
- Finance team review time reduced by 70% — review only edge cases, not every receipt
- Employee submission takes seconds (photograph via LINE) instead of manual form-filling
- Consistent categorization across all employees (AI applies consistent rules)
- Monthly close completes 2 days earlier
Configuration Notes
- Enable LINE integration for mobile receipt capture
- Use Gemini Flash for high-volume, simple receipt extraction
- Configure per-employee project access for privacy
- Set up category normalization in template instructions (e.g., "Classify all food expenses as 'Meals & Entertainment'")
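Category normalization is configured in the template instructions, but a downstream check can act as a safety net before a draft claim is created. The sketch below is one way to do that. The mapping table, the "Uncategorized" fallback, and the receipt keys are assumptions for illustration, loosely following the merchant, amount, date, and category fields named in the flow above.

```python
# Hypothetical mapping from raw extracted labels to finance categories.
CATEGORY_MAP = {
    "restaurant": "Meals & Entertainment",
    "food": "Meals & Entertainment",
    "cafe": "Meals & Entertainment",
    "taxi": "Travel",
    "hotel": "Travel",
    "fuel": "Travel",
    "parking": "Travel",
    "office supplies": "Office Supplies",
}


def normalize_category(raw: str) -> str:
    """Return the finance category for a raw label, or 'Uncategorized'."""
    return CATEGORY_MAP.get(raw.strip().lower(), "Uncategorized")


def to_draft_claim(receipt: dict) -> dict:
    """Build a draft expense claim from an extracted receipt (assumed keys)."""
    category = normalize_category(receipt.get("category", ""))
    return {
        "merchant": receipt["merchant"],
        "amount": receipt["amount"],
        "date": receipt["date"],
        "category": category,
        # Unknown categories are the edge cases finance should look at.
        "needs_review": category == "Uncategorized",
    }
```

Flagging only unmapped categories matches the goal of having finance review edge cases rather than every receipt.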
Use Case 3: Contract Intelligence
Scenario
A legal and procurement team manages hundreds of active vendor contracts. Contract review at signing requires extracting key terms, obligations, and renewal dates. Ongoing contract management requires alerts when contracts are approaching expiration. Currently, contract data lives in PDFs filed in a document management system — accessible but not queryable.
The team wants to build a searchable contract database with automated renewal alerts.
Document Types
- Vendor service contracts (multi-page PDFs)
- Non-disclosure agreements
- Employment contracts
- Software license agreements
Extracted Fields
- Contracting parties (names, registered addresses, tax IDs)
- Contract type and subject matter
- Effective date and expiration date
- Auto-renewal clause (yes/no, notice period)
- Contract value and payment terms
- Key obligations (summarized)
- Termination conditions
- Governing law and jurisdiction
Automation Flow
Contract PDF uploaded to Ocriva (on signing)
↓
Ocriva: Extract via Contract Template (Claude Sonnet — nuanced long-form)
↓
Structured JSON with all key contract fields
↓
Webhook → Contract Database (create searchable record)
↓
Webhook → Calendar System (set renewal reminder at notice period)
↓
Webhook → Approval Workflow (contracts above value threshold)
↓
Monthly: Batch extraction report → Legal Team
Impact
- Contract review time per document: from 45 minutes (manual) to 2 minutes (review AI output)
- Zero missed renewals (automated calendar alerts)
- Full contract database searchable by party, value, expiration, type
- New contracts processable same-day instead of queuing for legal team availability
Configuration Notes
- Use Claude Sonnet for contract extraction — longer context window, nuanced understanding of legal language
- Include template instructions for summarizing obligation clauses in plain language
- Attach a reference document with your organization's standard contract clause definitions
- Configure webhook to filter auto-renewal alerts by notice period (e.g., alert 90 days before expiry)
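The notice-period arithmetic behind the renewal reminder can be sketched as follows. The contract record keys (id, expiration_date, notice_days) are hypothetical, and the 90-day default simply mirrors the example in the note above.

```python
from datetime import date, timedelta


def renewal_alert_date(expiration: date, notice_days: int = 90) -> date:
    """Date on which the renewal reminder should fire."""
    return expiration - timedelta(days=notice_days)


def contracts_to_alert(contracts: list[dict], today: date) -> list[str]:
    """Return IDs of contracts inside their notice window but not yet expired."""
    due = []
    for contract in contracts:
        expires = contract["expiration_date"]
        alert_on = renewal_alert_date(expires, contract.get("notice_days", 90))
        if alert_on <= today <= expires:
            due.append(contract["id"])
    return due
```

A contract expiring on 31 December 2025 with a 90-day notice period would start alerting on 2 October 2025.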
Use Case 4: Healthcare Document Processing
Scenario
A private hospital network processes patient documents across multiple departments: admissions, outpatient clinics, pharmacy, and billing. Documents arrive in paper and digital form from multiple sources, including referring physicians, insurance companies, and government health agencies.
Each document type requires different fields for different downstream systems: patient records, billing, insurance claims, and pharmacy management.
Document Types
- Medical certificates (doctor-issued)
- Outpatient prescriptions
- Laboratory test reports
- Hospital admission forms
- Insurance claim forms
- Referral letters from other hospitals
Extracted Fields (per document type)
Medical Certificate:
- Patient name, date of birth, ID number
- Diagnosis (ICD code and description)
- Physician name and license number
- Date of visit, date of certificate
- Recommended rest period
Prescription:
- Patient information
- Medications (name, dosage, frequency, duration)
- Prescribing physician
- Prescription date and validity
Lab Report:
- Patient and sample information
- Test name and code
- Results with reference ranges
- Interpretation (normal/abnormal)
- Laboratory and technician details
Automation Flow
Document received (scan or photo upload)
↓
Ocriva: Route to appropriate template by document type
↓
AI Extraction (Claude Sonnet for medical accuracy)
↓
Webhook → Hospital Information System (patient record update)
↓
Webhook → Billing System (for billable items)
↓
Webhook → Pharmacy System (for prescriptions)
↓
Audit trail: every extraction logged with timestamp and model
Impact
- Admission processing time reduced from 25 minutes to under 5 minutes
- Prescription transcription errors eliminated (manual transcription had ~1.5% error rate)
- Insurance claim preparation accelerated by 60%
- Complete audit trail for regulatory compliance
Configuration Notes
- Use Claude Sonnet or GPT-4o for medical documents where accuracy is non-negotiable
- Configure separate templates per document type within the same project
- Enable extraction history for all documents to support regulatory audit requirements
- Add medical terminology reference documents to templates for improved ICD code recognition
IMPORTANT
Medical documents contain sensitive patient data. Ensure your organization's data handling policies, consent procedures, and applicable health data regulations (e.g., PDPA in Thailand, HIPAA in the US) are in place before routing patient information through any automated pipeline.
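The routing step and the audit trail from the flow above could look roughly like this in an integration layer. The template names, target system labels, and record keys are all hypothetical; they are not Ocriva identifiers and only illustrate the idea of one template per document type and one log entry per extraction.

```python
from datetime import datetime, timezone

# Hypothetical template names and downstream targets per document type.
ROUTES = {
    "medical_certificate": {"template": "medical-certificate", "targets": ["his"]},
    "prescription": {"template": "prescription", "targets": ["his", "pharmacy"]},
    "lab_report": {"template": "lab-report", "targets": ["his"]},
    "insurance_claim": {"template": "insurance-claim", "targets": ["his", "billing"]},
}


def plan_processing(document_type: str) -> dict:
    """Look up the template and downstream systems for a document type."""
    try:
        return ROUTES[document_type]
    except KeyError:
        raise ValueError(f"No template configured for {document_type!r}") from None


def audit_entry(document_id: str, document_type: str, model: str) -> dict:
    """Build one audit-trail record with a UTC timestamp and the model used."""
    return {
        "document_id": document_id,
        "document_type": document_type,
        "model": model,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

Raising on an unknown document type, rather than falling back to a default template, keeps unrecognized documents out of patient records until someone has looked at them.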
Use Case 5: Logistics & Shipping
Scenario
A regional logistics company handles inbound and outbound shipments daily. Documents include package labels, waybills, customs declarations, and delivery confirmations. These documents are photographed by warehouse staff and need to be reconciled with the shipment management system.
Currently, warehouse staff photograph documents and email them to an operations team, who manually enter data into the shipment system — creating a 30–60 minute lag between document capture and system update.
Document Types
- Package labels and barcode labels
- Airway bills (AWB)
- Bill of lading (B/L)
- Customs declaration forms
- Delivery receipts (signed by recipient)
- Commercial invoices for imports
Extracted Fields
- Tracking/waybill number
- Sender and recipient details (name, address, phone)
- Origin and destination
- Package dimensions and weight
- Declared value and contents description
- Customs tariff codes
- Handling instructions (fragile, temperature-controlled, etc.)
Automation Flow
Warehouse staff photographs document (LINE mobile capture)
↓
LINE → Ocriva: Logistics Template (Gemini Flash)
↓
Extracted tracking data and addresses
↓
Webhook → Shipment Tracking System (real-time update)
↓
Webhook → Customer Notification System (SMS/email)
↓
Batch CSV → Daily shipment manifest
↓
Analytics → Throughput and volume dashboard
Impact
- Real-time shipment updates instead of 30–60 minute lag
- Warehouse staff capture and submit in under 30 seconds per document
- Customer notification latency reduced from hours to minutes
- Customs documentation errors reduced (common with manual transcription of alphanumeric codes)
Configuration Notes
- LINE integration is ideal for mobile-first warehouse workflows
- Gemini Flash provides the speed needed for high-throughput operations
- Configure template to normalize address formats (important for system import compatibility)
- Use batch export for end-of-day manifest generation
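If the end-of-day manifest needs a specific column layout for the shipment system, it can also be assembled from the extracted records. The sketch below is a minimal version using Python's csv module. The column names are assumptions, not Ocriva's export schema.

```python
import csv
import io

# Hypothetical manifest columns; adjust to what the shipment system imports.
MANIFEST_COLUMNS = ["tracking_number", "origin", "destination", "weight_kg"]


def build_manifest(shipments: list[dict]) -> str:
    """Render extracted shipment records as CSV text, sorted by tracking number."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=MANIFEST_COLUMNS,
        extrasaction="ignore",  # drop extracted fields the manifest does not need
        lineterminator="\n",
    )
    writer.writeheader()
    for shipment in sorted(shipments, key=lambda s: s["tracking_number"]):
        writer.writerow(shipment)
    return buffer.getvalue()
```

Fields missing from a record are written as empty cells, which makes gaps easy to spot during review.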
Use Case 6: Education & HR Administration
Scenario
A university admissions office and a large corporation's HR department face similar problems: processing high volumes of structured documents from applicants or employees, each with slightly different formats from different institutions.
The university processes 2,000+ applications per admissions cycle. The HR department processes onboarding documents for 500+ new hires annually, plus ongoing document collection (annual performance reviews, training certifications, promotions).
Document Types (Education)
- Academic transcripts (from various universities and high schools)
- Certificates of completion and diplomas
- Standardized test score reports
- Recommendation letters
- Personal statements (for key information extraction)
Document Types (HR)
- Employment application forms
- ID cards and work permits
- Educational certificates
- Previous employment certificates
- Professional certifications and licenses
- Medical check reports
Extracted Fields (Academic Transcript)
- Student name and ID
- Institution name and accreditation
- Program name and major
- GPA and grading scale
- Courses taken with grades and credit hours
- Graduation status and date
Extracted Fields (HR — New Hire Onboarding)
- Full name (Thai and English)
- National ID and work permit number
- Date of birth and nationality
- Highest education and institution
- Previous employer and job title
- Emergency contact information
Automation Flow
Applicant uploads documents (batch, via application portal API)
↓
Ocriva: Extract via document-type-specific templates
↓
Structured JSON per document
↓
Webhook → Student Information System / HRIS (create applicant/employee record)
↓
Batch CSV → Admissions Committee / HR Review dashboard
↓
Analytics → Application volume by program / department
↓
Alert → Missing document notification (if required field is empty)
Impact (University)
- Admissions office processes 2,000 applications in 2 days instead of 3 weeks
- Consistent data quality across applications from hundreds of different institutions
- Early identification of incomplete applications (missing documents flagged automatically)
Impact (HR)
- New hire onboarding documentation complete on day one instead of day three
- HR staff focus on candidate evaluation, not data entry
- Single source of truth for employee document data from day one
Configuration Notes
- Create separate templates per document type within a single "Onboarding" project
- Use template instructions to standardize GPA scales across institutions (e.g., "Convert all GPA to 4.0 scale")
- Configure "missing field" webhook alerts for required fields that return empty
- Batch upload supports the volume spikes typical of admissions and hiring cycles
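The missing-field check and the GPA conversion mentioned above can be sketched in a few lines. The required-field lists and key names are hypothetical. The linear rescaling is a simplification chosen for illustration, since institutions often publish their own conversion tables.

```python
# Hypothetical required fields per document type.
REQUIRED_FIELDS = {
    "academic_transcript": ["student_name", "institution_name", "gpa", "grading_scale"],
    "new_hire": ["full_name", "national_id", "date_of_birth"],
}


def missing_fields(document_type: str, extracted: dict) -> list[str]:
    """Return required fields that are absent, None, or blank."""
    missing = []
    for field in REQUIRED_FIELDS.get(document_type, []):
        value = extracted.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def gpa_to_four_point(gpa: float, scale_max: float) -> float:
    """Linearly rescale a GPA to a 4.0 scale (a simplifying assumption)."""
    if scale_max <= 0 or not 0 <= gpa <= scale_max:
        raise ValueError("GPA must lie between 0 and the scale maximum")
    return round(gpa / scale_max * 4.0, 2)
```

A non-empty result from missing_fields is the condition that would trigger the missing document notification at the end of the flow.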
Use Case 7: Automated Document Processing via Google Drive
Scenario
A company with multiple departments uses Google Drive as a central document repository — accounting stores invoices, procurement stores POs, legal stores contracts. Everything lives in a Shared Drive organized by department.
Currently, staff must download documents from Drive and manually key data into their systems. There is a 1–2 day lag between a document being uploaded and the data reaching downstream systems. As document volumes increase during month-end close, the backlog makes the process even slower.
The team wants a system that pulls documents from Drive, processes them automatically, and pushes results back to a designated output folder in Drive — without changing the way each department already works.
Document Types
- Supplier invoices (PDFs uploaded directly to Drive)
- Purchase orders (POs) and delivery notes
- Contracts and agreements (multi-page PDFs)
- Tax invoices and receipts
- Scanned documents from field teams (photographed and uploaded to Drive via mobile)
Extracted Fields
Depends on the document type and configured template. Example for invoices:
- Supplier name and tax ID
- Invoice number and date
- Line items (description, quantity, unit price)
- Subtotal, VAT 7%, grand total
- Payment due date
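Because the invoice fields include a subtotal, a 7% VAT amount, and a grand total, the extracted numbers can be cross-checked arithmetically before they reach a downstream system. The sketch below shows one such check. The function name and the 0.01 rounding tolerance are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_RATE = Decimal("0.07")   # the 7% VAT listed in the extracted fields
TOLERANCE = Decimal("0.01")  # assumed rounding allowance


def check_totals(subtotal: str, vat: str, grand_total: str) -> list[str]:
    """Return a list of problems; an empty list means the totals agree."""
    sub, tax, total = Decimal(subtotal), Decimal(vat), Decimal(grand_total)
    expected_vat = (sub * VAT_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    problems = []
    if abs(tax - expected_vat) > TOLERANCE:
        problems.append(f"VAT {tax} differs from expected {expected_vat}")
    if abs(total - (sub + tax)) > TOLERANCE:
        problems.append(f"Grand total {total} is not subtotal + VAT ({sub + tax})")
    return problems
```

For example, a subtotal of 1000.00 with VAT of 70.00 and a grand total of 1070.00 passes both checks, while a grand total of 1170.00 would be reported.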
Automation Flow
Team uploads documents to Google Drive (Shared Folder)
↓
Ocriva: Pull files from Drive Input Folder per Template config
↓
AI extracts data via Template (GPT-4o)
↓
Structured JSON/CSV output
↓
Export → Google Drive Output Folder (results immediately available)
↓
Webhook → Downstream system (ERP / Accounting / CRM)
↓
Analytics → Dashboard summarizing processed documents
Impact
| Metric | Before | After |
|---|---|---|
| Time from upload to data in system | 1–2 days (waiting for manual entry) | Under 5 minutes |
| Team workflow | Required process change | Unchanged — teams keep using Drive as before |
| Data entry errors | High (manual key-in) | Low (AI + validation) |
| Manual steps | Download → read → type → verify | Upload to Drive only |
Configuration Notes
- Connect Google Drive to your Organization via OAuth2 (one-time setup)
- Define an Input Folder (documents awaiting processing) and an Output Folder (results) per Template
- Supports organizational Shared Drives — each department has its own folder, permissions managed through standard Drive access controls
- Compatible with webhooks — on completion, results are both exported back to Drive and posted to downstream systems simultaneously
- Ideal for organizations already on Google Workspace — no retraining required
TIP
Keep separate input and output folders for each Template in your Drive folder structure, for example Invoices/Input for invoices awaiting processing and Invoices/Output for results. This gives teams immediate access to results directly in Drive without needing to log in to Ocriva.
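The Input/Output naming convention can be expressed as a small path mapping, which is handy when a script needs to find the result that belongs to an uploaded file. This is only a sketch of the convention. The function name and the .json default are assumptions, and Google Drive itself identifies folders by ID rather than by path.

```python
from pathlib import PurePosixPath


def output_path_for(input_path: str, extension: str = ".json") -> str:
    """Map e.g. Invoices/Input/inv-001.pdf to Invoices/Output/inv-001.json."""
    path = PurePosixPath(input_path)
    if path.parent.name != "Input":
        raise ValueError(f"{input_path!r} is not inside an Input folder")
    result_name = path.with_suffix(extension).name
    return str(path.parent.parent / "Output" / result_name)
```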
Choosing the Right Configuration
| Use Case | Recommended AI | Output Format | Key Integration |
|---|---|---|---|
| Accounts Payable | GPT-4o | JSON | Accounting system webhook |
| Expense Reports | Gemini Flash | CSV | Expense system webhook |
| Contract Intelligence | Claude Sonnet | JSON | Database + Calendar webhook |
| Healthcare | Claude Sonnet / GPT-4o | JSON | HIS/EMR webhook |
| Logistics | Gemini Flash | JSON | Shipment tracking webhook |
| Education / HR | GPT-4o-mini | JSON + CSV | HRIS/SIS webhook |
| Google Drive Automation | GPT-4o | JSON + CSV | Google Drive (Pull/Push) + Webhook |
The right configuration depends on document complexity, volume, accuracy requirements, and downstream system capabilities. All of these use cases can be implemented and running in Ocriva within a single working day.
