Result Formats & Output

Configure output formats including JSON, CSV, PDF, DOCX, and image generation.

Published: 3/31/2026

Result Formats

The result format determines how the AI returns extracted data. Choosing a format is the first step when creating a template, because it controls which configuration options are available and how the AI processes your document.

Choosing a Format

| Format | Best For | Extraction Mode | Field Definition |
|--------|----------|-----------------|------------------|
| JSON | API integrations, structured data, databases | Structured | JSON Schema (19 presets available) |
| CSV | Spreadsheets, Excel, data analysis | Structured | Column names |
| Text | Plain text summaries, simple extraction | Free Text | Not required |
| PDF | Reports, printable documents | Free Text | Not required |
| DOCX | Word documents, editing, sharing | Free Text | Not required |
| XML | Legacy systems, data exchange | Free Text | Not required |
| HTML | Web display, email content | Free Text | Not required |
| Image | AI image generation from documents | Structured | Image options |

NOTE

When you select Text, PDF, DOCX, XML, or HTML as the result format, the system automatically switches to Free Text extraction mode, so no field schema is needed. The AI responds with narrative output based solely on your instructions. This is expected behavior, not an error.
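
The mode switch described above can be expressed as a small lookup. This is only a sketch for client-side checks: the lowercase identifiers for xml and html, the mode labels, and the helper name are assumptions for illustration, not part of the Ocriva API.

```python
# Format -> extraction mode, copied from the table above.
# The lowercase keys and the "structured"/"free_text" labels are illustrative.
EXTRACTION_MODE = {
    "json": "structured",
    "csv": "structured",
    "image": "structured",
    "text": "free_text",
    "pdf": "free_text",
    "docx": "free_text",
    "xml": "free_text",
    "html": "free_text",
}


def needs_field_definition(result_format: str) -> bool:
    """Return True when the format runs in Structured mode (hypothetical helper)."""
    return EXTRACTION_MODE[result_format.lower()] == "structured"


print(needs_field_definition("JSON"))  # True
print(needs_field_definition("pdf"))   # False
```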

JSON

The default format. The extracted data is returned as a JSON object matching your schema. Use this when integrating with APIs, storing in databases, or building custom processing pipelines.

Example output:

{
  "invoice_number": "INV-2024-0042",
  "invoice_date": "2024-11-15",
  "total_amount": 10700,
  "currency": "THB",
  "vendor": {
    "name": "บริษัท เทค โซลูชัน จำกัด",
    "tax_id": "0105567012345"
  }
}
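
A payload like the one above can be read with any JSON parser. The following is a minimal Python sketch that assumes the result is already available as a string; how you retrieve it from Ocriva is outside the scope of this example.

```python
import json

# Sample payload copied from the example output above.
raw = """{
  "invoice_number": "INV-2024-0042",
  "invoice_date": "2024-11-15",
  "total_amount": 10700,
  "currency": "THB",
  "vendor": {
    "name": "บริษัท เทค โซลูชัน จำกัด",
    "tax_id": "0105567012345"
  }
}"""

result = json.loads(raw)

# Nested objects become nested dicts.
vendor_tax_id = result["vendor"]["tax_id"]
total = result["total_amount"]

print(f"{result['invoice_number']}: {total} {result['currency']}")
```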

CSV

Tabular format suitable for Excel or Google Sheets. See the CSV Configuration section below for column ordering and the two CSV modes.

XML

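The extracted data is returned wrapped in XML tags, which suits enterprise and legacy systems. XML uses Free Text extraction mode, so the element layout follows your instructions rather than a field schema.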

Example output:

<?xml version="1.0" encoding="UTF-8"?>
<extraction>
  <invoice_number>INV-2024-0042</invoice_number>
  <invoice_date>2024-11-15</invoice_date>
  <vendor>
    <name>บริษัท เทค โซลูชัน จำกัด</name>
    <tax_id>0105567012345</tax_id>
  </vendor>
  <total_amount currency="THB">10700</total_amount>
  <line_items>
    <item>
      <description>Cloud Server Monthly</description>
      <quantity>1</quantity>
      <unit_price>10700</unit_price>
    </item>
  </line_items>
</extraction>
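
An XML result like the one above can be consumed with a standard parser. The sketch below uses Python's built-in xml.etree.ElementTree and assumes the element names shown in the example; because XML runs in Free Text mode, the actual element names depend on your template instructions.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the example output above.
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<extraction>
  <invoice_number>INV-2024-0042</invoice_number>
  <invoice_date>2024-11-15</invoice_date>
  <vendor>
    <name>บริษัท เทค โซลูชัน จำกัด</name>
    <tax_id>0105567012345</tax_id>
  </vendor>
  <total_amount currency="THB">10700</total_amount>
  <line_items>
    <item>
      <description>Cloud Server Monthly</description>
      <quantity>1</quantity>
      <unit_price>10700</unit_price>
    </item>
  </line_items>
</extraction>"""

# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(xml_text.encode("utf-8"))

total_el = root.find("total_amount")
total = int(total_el.text)
currency = total_el.get("currency")  # attribute on <total_amount>

items = [
    {
        "description": item.findtext("description"),
        "quantity": int(item.findtext("quantity")),
        "unit_price": int(item.findtext("unit_price")),
    }
    for item in root.iterfind("line_items/item")
]

print(root.findtext("invoice_number"), total, currency, items)
```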

DOCX and PDF

The AI-extracted data is formatted into a Word or PDF document. Useful for generating human-readable reports that can be printed, shared, or archived without any additional processing.

NOTE

DOCX and PDF outputs are binary files. The AI generates a formatted document with headings, tables, and structured sections based on the extracted data. You can download the result file directly from Processing History.

Example structure of a generated PDF/DOCX:

  • Title: Document type and reference number
  • Header section: Key metadata (date, vendor, customer)
  • Data table: Extracted line items in a formatted table
  • Summary: Totals, tax calculations, and notes

Text

Raw text output from the AI. Suitable when you need a plain description, summary, or transcription of the document content. This format uses Free Text extraction mode.

Example output:

Invoice INV-2024-0042 dated November 15, 2024.
 
Issued by บริษัท เทค โซลูชัน จำกัด (Tax ID: 0105567012345) to
the purchasing department.
 
Items:
- Cloud Server Monthly: 1 unit × ฿10,700.00 = ฿10,700.00
 
Subtotal: ฿10,700.00
VAT (7%): ฿749.00
Total: ฿11,449.00
Payment terms: Net 30 days.

HTML

The AI returns an HTML-formatted response. Useful when you want to display results directly in a web interface, embed in emails, or render rich text output.

Example output:

<div class="invoice-extraction">
  <h2>Invoice INV-2024-0042</h2>
  <table>
    <tr><th>Date</th><td>2024-11-15</td></tr>
    <tr><th>Vendor</th><td>บริษัท เทค โซลูชัน จำกัด</td></tr>
    <tr><th>Tax ID</th><td>0105567012345</td></tr>
  </table>
  <h3>Line Items</h3>
  <table>
    <tr><th>Description</th><th>Qty</th><th>Price</th><th>Amount</th></tr>
    <tr><td>Cloud Server Monthly</td><td>1</td><td>฿10,700</td><td>฿10,700</td></tr>
  </table>
  <p><strong>Total: ฿11,449.00</strong> (incl. 7% VAT)</p>
</div>

TIP

HTML output is ideal for embedding extraction results directly into email notifications or internal dashboards — no additional rendering step is needed on the receiving end.
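
As an illustration of the email use case, the sketch below wraps an HTML result in a message using Python's standard email package. The addresses, subject line, and helper name are placeholders chosen for this example, and sending the message is left out.

```python
from email.message import EmailMessage


def build_notification(html_fragment: str, recipient: str) -> EmailMessage:
    """Wrap an HTML extraction result in an email (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = "Extraction result"
    msg["From"] = "noreply@example.com"  # placeholder sender
    msg["To"] = recipient
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("The extraction result is attached as HTML.")
    # The HTML result is embedded as-is; no extra rendering step.
    msg.add_alternative(
        f"<html><body>{html_fragment}</body></html>", subtype="html"
    )
    return msg


fragment = '<div class="invoice-extraction"><h2>Invoice INV-2024-0042</h2></div>'
msg = build_notification(fragment, "ops@example.com")
print(msg.get_content_type())
```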

Image

The AI generates an image based on your document content or instructions. See the Image Generation section below for the available options.


CSV Configuration

CSV output has two modes that give you control over the structure of the exported spreadsheet.

Simple Mode (Columns Only)

Enable csvColumnsOnly to generate a CSV where each row contains only the values listed in csvColumnOrder, without a full schema expansion. This is ideal for flat data where you want a simple column-by-column layout.

Example configuration:

{
  "resultFormat": "csv",
  "csvColumnsOnly": true,
  "csvColumnOrder": [
    "invoice_number",
    "invoice_date",
    "vendor_name",
    "total_amount",
    "currency"
  ]
}

Output CSV:

invoice_number,invoice_date,vendor_name,total_amount,currency
INV-2024-0042,2024-11-15,บริษัท เทค โซลูชัน จำกัด,10700,THB
INV-2024-0043,2024-11-18,ร้านค้าตัวอย่าง,5350,THB
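
A flat CSV like this loads directly into standard tooling. As a minimal sketch, the following reads the sample output with Python's csv module and sums the total_amount column; the inline string stands in for a downloaded result file.

```python
import csv
import io

# Sample output copied from above; in practice this would be the result file.
csv_text = (
    "invoice_number,invoice_date,vendor_name,total_amount,currency\n"
    "INV-2024-0042,2024-11-15,บริษัท เทค โซลูชัน จำกัด,10700,THB\n"
    "INV-2024-0043,2024-11-18,ร้านค้าตัวอย่าง,5350,THB\n"
)

# DictReader uses the header row as keys, in the order set by csvColumnOrder.
rows = list(csv.DictReader(io.StringIO(csv_text)))
grand_total = sum(int(row["total_amount"]) for row in rows)

print(len(rows), grand_total)  # 2 16050
```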

Full Schema Mode

When csvColumnsOnly is false (the default), the full JSON Schema is used to generate the CSV structure. Nested objects and arrays are flattened using dot notation.

Example — a schema with nested vendor object produces these columns:

invoice_number,invoice_date,vendor.name,vendor.tax_id,total_amount
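
To make the dot notation concrete, here is a small Python sketch of how nested objects can be flattened into column names. It only illustrates the naming rule and is not Ocriva's implementation; arrays are left untouched because this page does not specify how their columns are named.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dot-separated keys (illustrative only)."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


record = {
    "invoice_number": "INV-2024-0042",
    "invoice_date": "2024-11-15",
    "vendor": {"name": "บริษัท เทค โซลูชัน จำกัด", "tax_id": "0105567012345"},
    "total_amount": 10700,
}

columns = list(flatten(record))
print(",".join(columns))
# invoice_number,invoice_date,vendor.name,vendor.tax_id,total_amount
```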

Column Ordering

Use csvColumnOrder to specify the exact order of columns in the output:

{
  "csvColumnOrder": ["invoice_number", "invoice_date", "total_amount", "vendor.name"]
}

Columns not listed in csvColumnOrder will appear after the ordered columns. This lets you pin the most important columns to the left.
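
The ordering rule can be sketched in a few lines of Python. Two details here are assumptions, since this page does not state them: unlisted columns keep their schema order, and names in csvColumnOrder that do not exist in the schema are ignored.

```python
def order_columns(available: list[str], column_order: list[str]) -> list[str]:
    """Pin listed columns first, then append the rest (illustrative only)."""
    # Assumption: names that are not in the schema are skipped.
    pinned = [col for col in column_order if col in available]
    # Assumption: remaining columns keep their original schema order.
    rest = [col for col in available if col not in pinned]
    return pinned + rest


available = ["invoice_number", "invoice_date", "vendor.name", "vendor.tax_id", "total_amount"]
csv_column_order = ["invoice_number", "invoice_date", "total_amount", "vendor.name"]

print(order_columns(available, csv_column_order))
# ['invoice_number', 'invoice_date', 'total_amount', 'vendor.name', 'vendor.tax_id']
```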

CSV Example: Receipt Scanner

Schema:

{
  "type": "object",
  "properties": {
    "receipt_date": { "type": "string" },
    "store_name": { "type": "string" },
    "item_name": { "type": "string" },
    "quantity": { "type": "number" },
    "unit_price": { "type": "number" },
    "total": { "type": "number" },
    "payment_method": { "type": "string" }
  }
}

CSV config:

{
  "resultFormat": "csv",
  "csvColumnsOnly": true,
  "csvColumnOrder": ["receipt_date", "store_name", "item_name", "quantity", "unit_price", "total", "payment_method"]
}
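
In this example the column order simply mirrors the schema's property order, so the CSV config could be derived from the schema instead of typed by hand. A minimal Python sketch, assuming you manage templates as local JSON before entering them in Ocriva:

```python
import json

# Schema copied from the receipt scanner example above.
schema = {
    "type": "object",
    "properties": {
        "receipt_date": {"type": "string"},
        "store_name": {"type": "string"},
        "item_name": {"type": "string"},
        "quantity": {"type": "number"},
        "unit_price": {"type": "number"},
        "total": {"type": "number"},
        "payment_method": {"type": "string"},
    },
}

# Python dicts preserve insertion order, so the columns follow the schema.
config = {
    "resultFormat": "csv",
    "csvColumnsOnly": True,
    "csvColumnOrder": list(schema["properties"]),
}

print(json.dumps(config, indent=2))
```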

TIP

Use csvColumnsOnly: true with csvColumnOrder for the simplest, most predictable CSV output. This gives you a clean, flat spreadsheet with only the columns you care about — ideal for importing into Excel or Google Sheets without any post-processing.


Image Generation

When the result format is set to image, Ocriva uses an AI image generation pipeline instead of text extraction. The AI reads your document content or follows your instructions and generates a corresponding image.

This is useful for:

  • Converting text descriptions into visual assets
  • Generating product images from specification sheets
  • Creating illustrations from written content
  • Producing visual summaries of reports

Image Options (imageUserOptions)

Configure image generation behavior with these options:

Aspect Ratio

Controls the dimensions of the generated image:

| Value | Dimensions | Best For |
|-------|------------|----------|
| 1:1 | Square | Social media, product thumbnails |
| 16:9 | Widescreen | Presentations, banners, hero images |
| 9:16 | Portrait | Mobile screens, stories |
| 4:3 | Standard | Documents, reports |
| 3:4 | Portrait standard | Print materials |

Art Style

Sets the visual style of the generated image:

| Value | Description |
|-------|-------------|
| realistic | Photorealistic rendering |
| illustration | Clean illustrated style |
| cartoon | Cartoon/animated look |
| sketch | Hand-drawn sketch style |
| watercolor | Watercolor painting style |
| 3d_render | 3D rendered appearance |
| flat_design | Flat modern design style |
| minimalist | Minimal, clean composition |

Color and Lighting

| Value | Description |
|-------|-------------|
| vibrant | Bold, saturated colors |
| muted | Soft, desaturated palette |
| dark_moody | Dark tones, dramatic lighting |
| bright_airy | Light, airy feel |
| monochrome | Black and white |
| warm_tones | Warm orange/red hues |
| cool_tones | Cool blue/green hues |

Negative Prompt

Use negativePrompt to list what the AI should avoid in the generated image. This helps prevent unwanted elements:

{
  "imageUserOptions": {
    "aspectRatio": "16:9",
    "artStyle": "illustration",
    "colorLighting": "bright_airy",
    "negativePrompt": "text, watermarks, blurry, low quality, distorted faces"
  }
}
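
Because image generation is credit-intensive, it can help to check an options object locally before submitting it. The Python sketch below validates against the values listed in the tables above; the function name and error format are hypothetical, and Ocriva may accept values or keys not listed on this page.

```python
# Allowed values copied from the tables above.
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}
ART_STYLES = {
    "realistic", "illustration", "cartoon", "sketch",
    "watercolor", "3d_render", "flat_design", "minimalist",
}
COLOR_LIGHTING = {
    "vibrant", "muted", "dark_moody", "bright_airy",
    "monochrome", "warm_tones", "cool_tones",
}


def validate_image_options(options: dict) -> list[str]:
    """Return a list of problems found in imageUserOptions (hypothetical helper)."""
    errors = []
    checks = {
        "aspectRatio": ASPECT_RATIOS,
        "artStyle": ART_STYLES,
        "colorLighting": COLOR_LIGHTING,
    }
    for key, allowed in checks.items():
        value = options.get(key)
        # Missing keys are not treated as errors in this sketch.
        if value is not None and value not in allowed:
            errors.append(f"{key}: unsupported value {value!r}")
    if not isinstance(options.get("negativePrompt", ""), str):
        errors.append("negativePrompt: expected a string")
    return errors


options = {
    "aspectRatio": "16:9",
    "artStyle": "illustration",
    "colorLighting": "bright_airy",
    "negativePrompt": "text, watermarks, blurry, low quality, distorted faces",
}
print(validate_image_options(options))  # []
```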

WARNING

Image generation consumes significantly more credits than text extraction. Each image generation request uses the AI image pipeline in addition to document reading. Test image templates on a small sample before processing large volumes.