Retry & Rate Limiting
The Ocriva API enforces rate limits to ensure fair usage and stable performance for all users. When your application exceeds the limit, the API returns 429 Too Many Requests. This guide explains the limits, the response headers you can inspect, and the retry strategies you should implement.
Rate Limits Overview
| Scope | Limit |
|---|---|
| Requests per minute | 60 per API key |
| Batch files per request | 50 files |
| Concurrent connections | Unlimited (subject to per-minute cap) |
Rate limits are tracked per API key, not per IP address or per organization. If you have multiple services sharing one key, their combined request count is measured against the same 60 req/min quota. Use separate API keys for separate services to isolate their limits.
The limit window is a fixed one-minute bucket that resets on the clock minute (e.g., 09:00:00 → 09:01:00). Requests are counted at the moment they reach the server. A burst of 60 requests at 09:00:59 exhausts the window; the next request is allowed at 09:01:00.
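Because the window resets on the clock minute, the time remaining in the current window can be computed client-side. A minimal sketch (not part of the API, and assuming your clock is reasonably in sync with the server's):

```typescript
// How many milliseconds are left before the fixed one-minute window resets.
// The window boundary is the clock minute, e.g. 09:00:00 → 09:01:00.
function msUntilWindowReset(now: Date = new Date()): number {
  const msIntoMinute = now.getSeconds() * 1_000 + now.getMilliseconds();
  return 60_000 - msIntoMinute;
}
```

A burst at 09:00:59 therefore has one second to wait; a request at the top of the minute has the full 60 seconds of quota ahead of it.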
NOTE
Credits and rate limits are independent. Exhausting your credit balance produces a different error (402 Payment Required or a credit-related 400), not a 429. Always monitor both independently.
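If you centralize error handling, the two failure modes can be told apart on status code alone for 402 and 429. An illustrative sketch (the labels are ours, and credit-related 400s would additionally require inspecting the response body):

```typescript
// Separate quota failures (wait and retry) from credit failures (retrying
// will not help; the balance must be topped up first).
function classifyFailure(status: number): 'rate-limit' | 'credits' | 'other' {
  if (status === 429) return 'rate-limit'; // wait for the window to reset
  if (status === 402) return 'credits'; // exhausted credit balance
  return 'other';
}
```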
Rate Limit Headers
Every API response includes headers that tell you your current quota state. Read these headers proactively rather than waiting for a 429.
| Header | Type | Description |
|---|---|---|
| `X-RateLimit-Limit` | number | Maximum requests allowed in the current window (always 60) |
| `X-RateLimit-Remaining` | number | Requests remaining in the current window |
| `X-RateLimit-Reset` | number | Unix timestamp (seconds) when the current window resets |
| `Retry-After` | number | Only on 429 responses. Seconds to wait before retrying. |
Example response headers after a successful request:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1743750060
```

Example headers on a 429 response:

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1743750060
Retry-After: 23
```

`Retry-After` is the authoritative signal. It tells you exactly how many seconds until the window resets — always prefer it over calculating the delay from `X-RateLimit-Reset` yourself.
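Since these headers arrive on every response, you can throttle voluntarily before ever receiving a 429. A sketch of one possible policy; the threshold of 10 and the even-spreading rule are illustrative choices, not part of the API:

```typescript
// Derive a voluntary pause from the rate-limit headers on the last response.
// Returns 0 when there is comfortable quota left.
function throttleDelayMs(
  headers: Headers,
  nowMs: number = Date.now(),
  threshold = 10,
): number {
  const remaining = Number(headers.get('X-RateLimit-Remaining'));
  const reset = Number(headers.get('X-RateLimit-Reset')); // Unix seconds
  if (!Number.isFinite(remaining) || !Number.isFinite(reset)) return 0;
  if (remaining >= threshold) return 0; // plenty of quota left
  const msLeft = Math.max(reset * 1_000 - nowMs, 0);
  // Spread the remaining requests evenly over what is left of the window
  return remaining > 0 ? msLeft / remaining : msLeft;
}
```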
Handling 429 Responses
When the rate limit is exceeded the API responds with HTTP 429 Too Many Requests and a JSON body:
```json
{
  "statusCode": 429,
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. You have sent 60 requests in the current minute. Please wait and retry."
}
```

A 429 is a transient error — it does not indicate a problem with your request. The identical request will succeed once the window resets. Your code must never discard the request on a 429; it should queue or retry it.
Common scenarios that trigger 429:
- Uploading files in a tight loop without delays between requests
- Running multiple parallel workers sharing a single API key
- Polling a status endpoint every second instead of using webhooks or exponential backoff
- Submitting several individual uploads that should have been a single batch operation
Retry Strategies
Exponential Backoff with Jitter
Exponential backoff doubles the delay after each failed attempt. Adding random jitter prevents multiple clients from retrying simultaneously and amplifying the load spike (the "thundering herd" problem).
Recommended delay formula:
```text
delay = min(base * 2^attempt, cap) + random(0, jitter)
```

Default parameters used by the Ocriva SDK:
| Parameter | Value |
|---|---|
| Base delay | 1 s |
| Cap | 30 s |
| Jitter range | 0–100 ms |
| Max retries | 3 |
Effective delays before each retry attempt:
| Attempt | Base delay | With jitter (approx.) |
|---|---|---|
| 1st retry | 1 s | 1.0–1.1 s |
| 2nd retry | 2 s | 2.0–2.1 s |
| 3rd retry | 4 s | 4.0–4.1 s |
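The schedule in the table follows directly from the formula. A sketch with the jitter exposed as a parameter so the deterministic part is easy to check (`attempt` is zero-based, so attempt 0 corresponds to the first retry):

```typescript
// delay = min(base * 2^attempt, cap) + random(0, jitter)
function backoffDelayMs(
  attempt: number,
  baseMs = 1_000,
  capMs = 30_000,
  jitterMs = 100,
): number {
  const deterministic = Math.min(baseMs * 2 ** attempt, capMs);
  return deterministic + Math.random() * jitterMs;
}
```

With the SDK defaults, delays grow 1 s, 2 s, 4 s and would plateau at the 30 s cap if more retries were configured.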
Manual Implementation (TypeScript)
Use this pattern when you are calling the API directly without the SDK, or when you need custom retry logic:
```typescript
interface RetryOptions {
  maxRetries?: number;
  baseDelayMs?: number;
  capMs?: number;
  jitterMs?: number;
}

async function fetchWithRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const {
    maxRetries = 3,
    baseDelayMs = 1_000,
    capMs = 30_000,
    jitterMs = 100,
  } = options;

  let lastError: unknown;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error: unknown) {
      lastError = error;
      if (!isRetryable(error) || attempt === maxRetries) {
        throw error;
      }
      // Honour Retry-After when present; otherwise fall back to backoff + jitter
      const retryAfterMs = extractRetryAfter(error);
      const backoff = Math.min(baseDelayMs * 2 ** attempt, capMs);
      const jitter = Math.random() * jitterMs;
      const delay = retryAfterMs ?? backoff + jitter;
      console.warn(
        `Request failed (attempt ${attempt + 1}/${maxRetries}). ` +
          `Retrying in ${Math.round(delay)}ms...`,
      );
      await sleep(delay);
    }
  }
  throw lastError;
}

function isRetryable(error: unknown): boolean {
  // A thrown Response also exposes a `status` property, so it is covered here
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: number }).status;
    // Retry on 429 and 5xx only — never retry other 4xx client errors
    return status === 429 || status >= 500;
  }
  // Retry network-level errors (no status code)
  return true;
}

function extractRetryAfter(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'headers' in error) {
    const headers = (error as { headers: Headers }).headers;
    const value = headers?.get?.('Retry-After');
    if (value) {
      const seconds = parseFloat(value);
      if (!Number.isNaN(seconds)) return seconds * 1_000;
    }
  }
  return undefined;
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Usage:

```typescript
const result = await fetchWithRetry(
  () =>
    fetch('https://api.ocriva.com/upload/YOUR_ORG_ID', {
      method: 'POST',
      headers: { Authorization: `Bearer ${process.env.OCRIVA_API_KEY}` },
      body: formData,
    }).then((res) => {
      if (!res.ok) {
        throw Object.assign(new Error('Request failed'), {
          status: res.status,
          headers: res.headers,
        });
      }
      return res.json();
    }),
  { maxRetries: 3, baseDelayMs: 1_000 },
);
```

IMPORTANT
Always set a maximum retry count. Retrying indefinitely on a persistent outage will exhaust your rate limit quota even faster and delay recovery. Three retries with backoff is the recommended default.
Respecting the Retry-After Header
When a 429 response includes a Retry-After header, always wait at least that many seconds — even if your backoff formula would suggest a shorter delay. The Retry-After value is authoritative: retrying before it elapses will almost certainly produce another 429.
```typescript
// Prefer Retry-After over calculated backoff when available
const retryAfterHeader = response.headers.get('Retry-After');
const waitMs = retryAfterHeader
  ? parseFloat(retryAfterHeader) * 1_000
  : baseDelayMs * 2 ** attempt;
await sleep(waitMs);
```

SDK Automatic Retries
If you are using @ocriva/sdk, retries are handled automatically. The SDK catches 429 and 5xx responses, waits the appropriate duration, and re-issues the request transparently.
```typescript
import { OcrivaClient } from '@ocriva/sdk';

const client = new OcrivaClient({
  apiKey: process.env.OCRIVA_API_KEY!,
  maxRetries: 3, // default; set to 0 to disable
  timeout: 30_000, // per-request timeout in ms
});
```

| Option | Default | Description |
|---|---|---|
| `maxRetries` | 3 | Number of retry attempts after the initial request fails |
| `timeout` | 30000 | Milliseconds before a request times out and is retried |
When the SDK receives a 429 with a Retry-After header it uses that value instead of exponential backoff. Once all retries are exhausted it throws a RateLimitError with the retryAfter property set:
```typescript
import { RateLimitError } from '@ocriva/sdk';

try {
  await client.upload.create(orgId, formData);
} catch (error) {
  if (error instanceof RateLimitError) {
    console.error(
      `Rate limit exhausted after all retries. ` +
        `Try again in ${error.retryAfter ?? 'a moment'} second(s).`,
    );
  }
}
```

For full error class details and the complete error hierarchy see Error Handling.
Batch Operations
The most effective way to reduce request count is to use batch endpoints instead of individual uploads. A batch of 50 files costs one API request instead of 50.
| Approach | API requests for 50 files |
|---|---|
| Individual uploads (loop) | 50 |
| Single batch upload | 1 |
```typescript
// Avoid: 50 individual requests, each consuming rate limit quota
for (const file of files) {
  await client.upload.create(orgId, buildFormData(file)); // one request per file
}

// Prefer: one batch request regardless of file count (up to 50)
const batchForm = new FormData();
for (const file of files) {
  batchForm.append('files', file);
}
batchForm.append('projectId', projectId);
batchForm.append('templateId', templateId);

await client.batch.upload(orgId, batchForm); // single request
```

For full batch API details, progress tracking, and export options see Batch Processing.
TIP
When you have more than 50 files, split them into multiple batches of 50. Even several batch requests are far more efficient than hundreds of individual uploads; submit the batches sequentially with a short deliberate delay between them.
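Splitting a larger file list into batches is a few lines. A generic sketch, where the default of 50 mirrors the per-request batch limit from the table above:

```typescript
// Split any list into batches of at most `batchSize` items.
function chunkIntoBatches<T>(items: T[], batchSize = 50): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

For example, 120 files become three batches of 50, 50, and 20, i.e. three API requests instead of 120.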
Idempotency
When you retry a request after a 429 or network error, you risk submitting the same operation twice — for example, uploading the same file twice if the first request actually succeeded but the response was lost. Design your code and your API usage to be idempotent.
Webhook Event Idempotency
Webhooks may be delivered more than once. Always use the `eventId` field in the webhook payload as an idempotency key before processing:
```typescript
import express from 'express';

const app = express();
app.use(express.json());

const processedEvents = new Set<string>();

app.post('/webhook', (req, res) => {
  const { eventId, event, data } = req.body;

  if (processedEvents.has(eventId)) {
    // Already handled — acknowledge without re-processing
    return res.status(200).json({ received: true });
  }
  processedEvents.add(eventId);

  // Safe to process
  handleWebhookEvent(event, data);
  res.status(200).json({ received: true });
});
```

In production, store processed `eventId` values in a database (e.g., Redis or MongoDB) with a TTL of at least 24 hours rather than in memory.
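The check-then-record pattern with a TTL can be sketched in-process; in production the same logic maps onto an atomic Redis `SET key value EX ttl NX`. Everything below is illustrative, not an SDK API:

```typescript
// In-memory seen-set that forgets entries after `ttlMs`, illustrating the
// idempotency check a durable store would perform.
class SeenEvents {
  private seen = new Map<string, number>(); // eventId → expiry timestamp (ms)

  constructor(private ttlMs: number) {}

  /** Returns true only the first time an eventId is seen within the TTL. */
  markIfNew(eventId: string, nowMs: number = Date.now()): boolean {
    const expiry = this.seen.get(eventId);
    if (expiry !== undefined && expiry > nowMs) return false; // duplicate
    this.seen.set(eventId, nowMs + this.ttlMs);
    return true;
  }
}
```

The webhook handler would call `markIfNew(eventId)` and acknowledge without processing when it returns false.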
Idempotent API Consumer Design
- Check for existing results before re-uploading. Query Processing History by filename or external reference before resubmitting a document.
- Use stable batch names. Naming a batch with a deterministic key (e.g., `"invoices-2026-03"`) makes it easy to detect if the batch was already submitted.
- Track upload IDs. Store the `id` returned by the upload or batch endpoint. If your job runner restarts, check whether the ID exists before creating a new request.
Best Practices
- Pre-check credit balance before large batches. A 50-file batch deducts 50 credits immediately. A `402` mid-batch wastes the quota for files already processed. Fetch the balance via the credits endpoint before submitting.
- Use batch endpoints instead of individual uploads. One batch request of 50 files costs one unit of rate limit quota. Uploading the same 50 files one at a time costs 50 units.
- Add deliberate delays in tight loops. If you must submit multiple requests sequentially (e.g., several batches), add a short sleep between them:

  ```typescript
  for (const chunk of chunks) {
    await client.batch.upload(orgId, buildBatchForm(chunk));
    await sleep(1_000); // 1-second pause keeps you well under 60 req/min
  }
  ```

- Monitor rate limit headers proactively. Do not wait for a `429`. Read `X-RateLimit-Remaining` on each response. If it drops below 10, slow down voluntarily.
- Use separate API keys per service. Each key has its own independent quota. If you have a background job and a user-facing API sharing one key, one heavy background run can block your users.
- Use webhooks instead of polling. Subscribe to `batch.completed` or `upload.completed` events and process asynchronously. Polling a status endpoint every second is a fast way to exhaust 60 req/min.
Common Mistakes
Polling Without Backoff
```typescript
// Wrong: hammers the API every second, exhausts quota in one minute
while (true) {
  const status = await client.processingHistory.get(id, projectId);
  if (status.status === 'completed') break;
  await sleep(1_000); // 1 second — 60 polls per minute = instant rate limit
}

// Correct: exponential backoff or use webhooks
let delay = 2_000;
while (true) {
  const status = await client.processingHistory.get(id, projectId);
  if (status.status === 'completed') break;
  await sleep(delay);
  delay = Math.min(delay * 2, 30_000); // back off up to 30 s
}
```

Ignoring the Retry-After Header
```typescript
// Wrong: uses a hardcoded delay that may be shorter than the server requires
} catch (error) {
  await sleep(500); // server said wait 23 s; this will fail again immediately
  return retry();
}

// Correct: always honour Retry-After
} catch (error) {
  if (error instanceof RateLimitError && error.retryAfter) {
    await sleep(error.retryAfter * 1_000);
  }
}
```

No Maximum Retry Cap
```typescript
// Wrong: retries forever — stalls the process and wastes quota during an outage
async function upload(data: FormData): Promise<unknown> {
  try {
    return await client.upload.create(orgId, data);
  } catch {
    return upload(data); // infinite recursion on persistent errors
  }
}

// Correct: bounded retries with backoff
const result = await fetchWithRetry(() => client.upload.create(orgId, data), {
  maxRetries: 3,
});
```

WARNING
Infinite retry loops during an API outage will saturate your rate limit window for the entire duration of the outage, preventing any successful requests from other parts of your application.
