Architecture and Integration

Technical Overview

This page describes the technical architecture of Ocriva's Document Automation Transformer (DAT) platform: how it is structured, how documents move through it, and how to integrate with it from external systems.

Platform Architecture

Multi-Tenant Hierarchy

Ocriva uses a multi-tenant hierarchy in which each Organization is a fully isolated tenant. No data is shared between Organizations.

Organization (Tenant)
├── Project A
│   ├── Template: Invoice Extractor (JSON output)
│   ├── Template: Receipt Scanner (CSV output)
│   ├── Webhooks: → accounting-api.example.com
│   └── API Tokens: for the accounting system integration
├── Project B
│   ├── Template: Contract Analyzer (Text output)
│   ├── Template: NDA Extractor (JSON output)
│   └── API Tokens: for the legal document system
├── Project C
│   ├── Template: Employee Onboarding Packet (JSON output)
│   └── Webhooks: → hris.example.com
└── Billing & Credits
    ├── Stripe subscription
    ├── Credit balance
    └── Usage history

Organization: the top-level tenant. Billing, credits, and team management all live here. One Organization per company is the common pattern, but multiple Organizations per account are supported (useful for agencies that manage several clients).

Project: a logical grouping within an Organization. Use Projects to separate use cases, departments, or client accounts. API Tokens and Webhooks are configured per Project.

Template: an extraction configuration within a Project. Each Template specifies which AI model to use, which fields to extract, and which output format to produce. Having multiple Templates per Project lets a single Project handle different document types.
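The hierarchy can be pictured as nested data. The sketch below is illustrative only: the field names (`id`, `projects`, `templates`, `outputFormat`), the sample IDs, and the helper `findTemplate` are assumptions made for this example, not Ocriva's actual schema or API.

```javascript
// Hypothetical in-memory model of Organization -> Project -> Template.
// All field names and IDs are illustrative assumptions.
const organization = {
  id: 'org_xyz',
  projects: [
    {
      id: 'proj_a',
      templates: [
        { id: 'tmpl_invoice', name: 'Invoice Extractor', outputFormat: 'json' },
        { id: 'tmpl_receipt', name: 'Receipt Scanner', outputFormat: 'csv' },
      ],
    },
    {
      id: 'proj_b',
      templates: [
        { id: 'tmpl_contract', name: 'Contract Analyzer', outputFormat: 'text' },
      ],
    },
  ],
};

// Look up a template inside one project only, mirroring per-project scoping:
// a template that lives in another project is simply not found.
function findTemplate(org, projectId, templateId) {
  const project = org.projects.find((p) => p.id === projectId);
  if (!project) return null;
  return project.templates.find((t) => t.id === templateId) ?? null;
}
```

Resolving a Template through its Project, rather than from a flat list, keeps the per-Project boundary explicit in every lookup.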

Application Services

The Ocriva platform runs as a set of services:

Service	Technology	Responsibility
API Server	NestJS (Node.js)	REST API, business logic, AI orchestration
Web Frontend	Next.js 14 (React)	User interface, document upload UI
WebSocket Server	NestJS + Socket.IO	Real-time event push to clients
CMS	Next.js 14	Documentation and content management

Processing Pipeline

Every document that enters Ocriva follows this pipeline:

Upload → Queue (pending) → AI Processing (in_progress) → Result (completed/failed)
   ↓                              ↓                              ↓
Webhook:                    Credit deducted               Webhook:
document.uploaded           (at start)                    document.processed
   ↓                                                           ↓
WebSocket:                                            WebSocket:
real-time UI update                                   result pushed to UI

Step by Step

Upload: The client submits a document through the Web UI, the REST API, or the LINE integration. The document is stored in Supabase Storage or Google Cloud Storage, and a processing record is created with status pending.
Queue: The processing record enters the queue. The document.uploaded webhook event is sent (if configured), and a WebSocket event notifies connected clients.
AI Processing: A queue worker picks up the document. The status changes to in_progress and credits are deducted. The document is sent to the configured AI provider together with the Template's extraction instructions.
Result: The AI returns the extracted data. The processing record is updated to completed with the result payload. The document.processed webhook event is sent, and a WebSocket event pushes the result to connected clients.
Failure handling: If the AI request fails, the record is marked failed with error details, and the document.processed webhook (with status failed) is sent. Transient failures trigger an automatic retry.
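The status values in these steps form a small state machine. The following is a minimal sketch of it, assuming only the four statuses named on this page. The function names `advance` and `isTerminal` are hypothetical, and automatic retries are deliberately not modeled because the page does not say how a retried record changes status.

```javascript
// Status transitions as described in the steps above.
// Retries are not modeled: the source does not specify their status changes.
const TRANSITIONS = {
  pending: ['in_progress'],
  in_progress: ['completed', 'failed'],
  completed: [],
  failed: [],
};

// True once a record can no longer change status in this model.
function isTerminal(status) {
  return (TRANSITIONS[status] ?? []).length === 0;
}

// Return a copy of the record with the new status, or throw if the
// transition is not allowed from the record's current status.
function advance(record, nextStatus) {
  const allowed = TRANSITIONS[record.status] ?? [];
  if (!allowed.includes(nextStatus)) {
    throw new Error(`Invalid transition: ${record.status} -> ${nextStatus}`);
  }
  return { ...record, status: nextStatus };
}
```

A client that mirrors these statuses locally can use a check like `isTerminal` to decide when to stop listening or polling for a given record.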

NOTE

The system processes the queue every 5 seconds. Documents are not processed immediately after upload; they are queued first. Track status through WebSocket or Processing History.

Batch Processing

A batch submission follows the same pipeline for each document, with batch-level coordination:

Submit batch (N documents)
        ↓
Create N individual processing records
        ↓
Documents are processed in parallel (up to the concurrency limit)
        ↓
WebSocket: per-document status updates stream live
        ↓
When all N documents have finished (completed or failed): batch.completed webhook is sent
        ↓
Batch export becomes available (combined JSON/CSV)
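To make "processed in parallel up to a concurrency limit" concrete, here is a generic sketch of that technique. It is not Ocriva's worker implementation: `processBatch`, the `worker` callback, and the `limit` parameter are hypothetical names, and the summary fields simply borrow the `totalCount` / `completedCount` / `failedCount` names from the batch.completed payload.

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight at once.
// A failing item is recorded as failed and does not stop the batch.
async function processBatch(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane repeatedly claims the next item until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'completed', result: await worker(items[index]) };
      } catch (err) {
        const message = err && err.message ? err.message : String(err);
        results[index] = { status: 'failed', error: message };
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));

  return {
    totalCount: results.length,
    completedCount: results.filter((r) => r.status === 'completed').length,
    failedCount: results.filter((r) => r.status === 'failed').length,
    results,
  };
}
```

The summary is produced only after every lane has drained, which matches the idea that the batch-level event fires once all N documents have finished, whether they succeeded or failed.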

Integration Patterns

Pattern 1: REST API (Pull-Based)

The simplest integration. Your system calls the Ocriva API to submit documents and retrieve results.

Submit a document:

POST /upload
Authorization: Bearer <api-token>
Content-Type: multipart/form-data
 
file: [binary]
templateId: tmpl_abc123
projectId: proj_xyz

Check processing status:

GET /processing-history/{processingId}
Authorization: Bearer <api-token>

Response:

{
  "id": "proc_abc123",
  "status": "completed",
  "result": {
    "vendor": "บริษัท เอซีเม จำกัด",
    "invoice_number": "INV-2026-042",
    "total_amount": 15000.00,
    "due_date": "2026-04-30"
  },
  "createdAt": "2026-03-31T09:00:00Z",
  "completedAt": "2026-03-31T09:00:08Z"
}

Use this pattern when: your system initiates the request and can poll for results, or when you need a synchronous-style integration (processing itself is still queued, so the caller waits by polling).
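A polling client for this pattern might look like the sketch below. It assumes Node.js 18+ (for the global `fetch`). The helper names `waitForResult` and `makeFetchStatus`, the `baseUrl` parameter, and the default interval and attempt count are all assumptions chosen for illustration. Only the GET /processing-history/{processingId} path, the Bearer header, and the status values come from this page.

```javascript
// Poll until the record reaches a terminal status or attempts run out.
// `fetchStatus` is a caller-supplied async function that returns the parsed
// JSON for GET /processing-history/{processingId}.
async function waitForResult(fetchStatus, processingId, options = {}) {
  // Defaults are assumptions; 5000 ms loosely matches the 5-second queue cycle.
  const { intervalMs = 5000, maxAttempts = 24 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const record = await fetchStatus(processingId);
    if (record.status === 'completed' || record.status === 'failed') {
      return record;
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Timed out waiting for ${processingId}`);
}

// One possible fetchStatus built on the global fetch (Node.js 18+).
// `baseUrl` is a placeholder; substitute your actual API host.
function makeFetchStatus(baseUrl, apiToken) {
  return async (processingId) => {
    const res = await fetch(`${baseUrl}/processing-history/${processingId}`, {
      headers: { Authorization: `Bearer ${apiToken}` },
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  };
}
```

Passing `fetchStatus` in as a parameter keeps the polling loop independent of the HTTP client, so it can be exercised with a stub before pointing it at a real endpoint.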

Pattern 2: Webhook-Driven (Event-Based Automation)

The recommended pattern for production automation. Ocriva pushes results to your system as soon as processing completes, with no polling required.

Configure a webhook in the project settings:

{
  "url": "https://your-system.example.com/ocriva-webhook",
  "events": ["document.processed", "batch.completed"],
  "secret": "your-webhook-signing-secret"
}

Webhook Payload (document.processed):

{
  "event": "document.processed",
  "timestamp": "2026-03-31T10:15:00Z",
  "organizationId": "org_xyz",
  "projectId": "proj_abc",
  "processingId": "proc_123",
  "templateId": "tmpl_456",
  "status": "completed",
  "result": {
    "fields": {
      "vendor": "บริษัท เอซีเม จำกัด",
      "invoice_number": "INV-2026-042",
      "total_amount": 15000.00
    },
    "format": "json"
  }
}

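Verify the webhook signature (Node.js example):

const crypto = require('crypto');

// `payload` must be the raw request body (string or Buffer),
// not JSON that has been parsed and re-serialized.
function verifyWebhook(payload, signature, secret) {
  const expected = crypto
    .createHmac('sha256', secret)
    .update(payload)
    .digest('hex');
  const received = Buffer.from(String(signature));
  const wanted = Buffer.from(`sha256=${expected}`);
  // timingSafeEqual throws if the lengths differ, so compare lengths first.
  return received.length === wanted.length &&
    crypto.timingSafeEqual(received, wanted);
}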

Use this pattern when: you need real-time delivery and your system can receive HTTP POST requests.
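A receiving endpoint could be sketched as follows, using only Node.js built-ins. The function names (`sign`, `handleWebhook`, `createWebhookServer`) and the choice of 401/400 response codes are assumptions for this example. The X-Ocriva-Signature header name and the `sha256=<hex>` signature format follow what this page shows, but confirm both against a real delivery before relying on them.

```javascript
const crypto = require('crypto');
const http = require('http');

// Compute the signature string in the "sha256=<hex>" form used on this page.
function sign(rawBody, secret) {
  const hex = crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  return `sha256=${hex}`;
}

// Decide how to answer one delivery. Returns a status code and, when the
// signature is valid and the body parses, the decoded event.
function handleWebhook(rawBody, headers, secret) {
  // Node.js lowercases incoming header names.
  const received = Buffer.from(String(headers['x-ocriva-signature'] ?? ''));
  const expected = Buffer.from(sign(rawBody, secret));
  const valid = received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
  if (!valid) return { statusCode: 401 };
  try {
    return { statusCode: 200, event: JSON.parse(rawBody) };
  } catch {
    return { statusCode: 400 };
  }
}

// Wire the handler into a plain HTTP server. The raw body is collected
// before parsing so the HMAC is computed over the exact bytes received.
function createWebhookServer(secret, onEvent) {
  return http.createServer((req, res) => {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const rawBody = Buffer.concat(chunks).toString('utf8');
      const outcome = handleWebhook(rawBody, req.headers, secret);
      if (outcome.statusCode === 200) onEvent(outcome.event);
      res.statusCode = outcome.statusCode;
      res.end();
    });
  });
}
```

Verifying before parsing means an unsigned or tampered request is rejected without its body ever being interpreted.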

Pattern 3: Batch Processing (High Volume)

For scenarios where you need to process a large volume of documents and collect combined results.

Submit a batch:

POST /upload/batch
Authorization: Bearer <api-token>
Content-Type: multipart/form-data
 
files[]: [binary] (up to 50 files)
templateId: tmpl_abc123
projectId: proj_xyz

Receive the batch completion webhook:

{
  "event": "batch.completed",
  "batchId": "batch_xyz",
  "totalCount": 50,
  "completedCount": 48,
  "failedCount": 2,
  "exportUrl": "https://storage.ocriva.com/exports/batch_xyz.csv"
}

Use this pattern when: you have high-volume processing needs, can tolerate some latency, and want combined output.
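On the receiving side, a batch.completed payload can be reduced to a few decisions: did everything succeed, and where is the export. The sketch below assumes the payload fields shown above. The function name `summarizeBatch` and the derived fields (`successRate`, `needsReview`, `allAccounted`) are hypothetical.

```javascript
// Turn a batch.completed payload into a small summary for downstream logic.
function summarizeBatch(payload) {
  if (payload.event !== 'batch.completed') {
    throw new Error(`Unexpected event: ${payload.event}`);
  }
  const { batchId, totalCount, completedCount, failedCount, exportUrl } = payload;
  return {
    batchId,
    exportUrl,
    // Fraction of documents that completed successfully (0 for an empty batch).
    successRate: totalCount === 0 ? 0 : completedCount / totalCount,
    // Flag batches with failures so someone can inspect the failed records.
    needsReview: failedCount > 0,
    // Sanity check: every document should be either completed or failed.
    allAccounted: completedCount + failedCount === totalCount,
  };
}
```

With the sample payload above (48 completed, 2 failed out of 50), this would flag the batch for review while still exposing the export URL for the documents that succeeded.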

Pattern 4: LINE Integration (Mobile Capture)

For field workflows where documents are photographed on mobile devices.

LINE User → sends photo → LINE Official Account
                                ↓
                    Ocriva LINE Bot receives the image
                                ↓
                    Routes to the configured project/Template
                                ↓
                    Processes via AI extraction
                                ↓
                    Sends the result back to the LINE conversation (optional)
                                ↓
                    Fires a webhook to downstream systems

Use this pattern when: documents are photographed in the field (logistics, inspections, deliveries) and staff use LINE.

Security Model

Authentication

JWT (session-based):

Used by the web application
Tokens are issued at login and stored in httpOnly cookies
Not accessible from JavaScript (limits token theft via XSS)
Short-lived, with refresh token rotation

API Tokens:

Used for service-to-service integrations
Created per project in the web interface
Long-lived but revocable
Sent in the Authorization: Bearer <token> header
Scoped to a single project; cannot access other projects

IMPORTANT

Store API tokens in environment variables only. Never hard-code them or commit them to a repository: a leaked token could expose your organization's data.
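A minimal sketch of reading the token from the environment follows. The variable name `OCRIVA_API_TOKEN` and the helper names are assumptions; use whatever name your deployment defines.

```javascript
// Read the API token from the environment and fail fast if it is missing.
// OCRIVA_API_TOKEN is an assumed variable name, not one defined by Ocriva.
function loadApiToken(env = process.env) {
  const token = env.OCRIVA_API_TOKEN;
  if (!token || token.trim() === '') {
    throw new Error('OCRIVA_API_TOKEN is not set');
  }
  return token.trim();
}

// Build the Authorization header described in the API Tokens list above.
function authHeaders(env = process.env) {
  return { Authorization: `Bearer ${loadApiToken(env)}` };
}
```

Failing at startup when the variable is missing is usually preferable to discovering the problem on the first rejected request.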

Data Isolation

All database queries are scoped by organizationId
Project-level API tokens cannot query resources of other projects within the same Organization
File storage uses bucket paths scoped by Organization
No cross-Organization data leakage is possible at the query level
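One common way to enforce this kind of scoping is to build every query filter through a single helper that injects the tenant keys. The sketch below illustrates that general technique, not Ocriva's internal code: `scopedFilter` and the shape of the `auth` object are hypothetical.

```javascript
// Build a MongoDB-style filter that always carries the caller's tenant scope.
// `auth` is a hypothetical object derived from the authenticated session or token.
function scopedFilter(auth, filter = {}) {
  if (!auth || !auth.organizationId) {
    throw new Error('Missing organizationId');
  }
  const scope = { organizationId: auth.organizationId };
  // A project-scoped API token narrows the scope to one project.
  if (auth.projectId) scope.projectId = auth.projectId;
  // The scope is spread last so caller-supplied values cannot override it.
  return { ...filter, ...scope };
}
```

Because the scope keys win over anything in the caller's filter, a request that tries to pass another organization's ID still resolves to the authenticated tenant.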

Webhook Security

Webhook payloads are signed with HMAC-SHA256
The signature is included in the X-Ocriva-Signature header
Verify the signature on receipt to confirm the payload's authenticity
The shared secret is configured in the project's webhook settings

Input Validation

All API input is validated with class-validator (NestJS Pipes)
File uploads: file type and size are checked before storage
SQL/NoSQL injection is prevented through Mongoose parameterized queries
Rate limiting on public endpoints
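A client can run a similar type-and-size check before uploading to avoid a wasted round trip. This sketch is purely illustrative: the allowed MIME types, the 10 MiB limit, and the `mimeType` / `sizeBytes` field names are assumptions, since this page does not state the platform's actual limits.

```javascript
// Assumed limits for illustration only; the real accepted types and
// maximum size are not stated on this page.
const ALLOWED_TYPES = new Set(['application/pdf', 'image/png', 'image/jpeg']);
const MAX_BYTES = 10 * 1024 * 1024; // assumed 10 MiB

// Check a file descriptor before upload and collect all problems found.
function validateUpload(file) {
  const errors = [];
  if (!ALLOWED_TYPES.has(file.mimeType)) {
    errors.push(`Unsupported type: ${file.mimeType}`);
  }
  if (!Number.isInteger(file.sizeBytes) || file.sizeBytes <= 0) {
    errors.push('Empty or invalid file size');
  } else if (file.sizeBytes > MAX_BYTES) {
    errors.push(`File too large: ${file.sizeBytes} bytes`);
  }
  return { ok: errors.length === 0, errors };
}
```

A client-side check like this is a convenience only; the server-side validation described above remains the authoritative gate.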

Tech Stack

Layer	Technology	Notes
Backend API	NestJS 11, TypeScript	Modular architecture, Swagger/OpenAPI
Database	MongoDB (Mongoose)	Document-oriented, flexible schema
Frontend	Next.js 14, React 18, Tailwind CSS 3	App Router, Server Components
WebSocket	NestJS + Socket.IO	Real-time event delivery
AI Providers	OpenAI, Google Gemini, Anthropic, DeepSeek, Qwen, Kimi	Provider selected per Template
File Storage	Supabase Storage / Google Cloud Storage	Bucket paths scoped by Organization
Auth	Passport.js (JWT, Google OAuth2), Supabase Auth	Session management + social login
Payments	Stripe	Subscriptions and credit purchases
Email	Nodemailer	Transactional email
Secrets	Doppler	Environment variable management
Deployment	Docker	Each service has its own Dockerfile

API Reference Summary

The full API is documented at /api/docs (Swagger UI). Main endpoint groups:

Group	Base Path	Description
Auth	`/auth`	Login, register, OAuth, token refresh
Organizations	`/organizations`	CRUD for Organizations
Projects	`/projects`	CRUD for Projects
Templates	`/templates`	Template management
Upload	`/upload`	Document submission (single and batch)
Processing History	`/processing-history`	Query results and status
Analytics	`/analytics`	Usage statistics
Webhooks	`/webhooks`	Webhook configuration
API Tokens	`/api-tokens`	Token management
Credits	`/credits`	Balance and usage
LINE	`/line`	LINE integration settings

Every endpoint requires authentication. Use Authorization: Bearer <token> with an API token, or authenticate through the session cookie from the web interface.
