Create batch

POST /v1/batches, after you have an input file id.

Retrieve batch

GET /v1/batches/{batch_id}, poll status and file ids.

List batches

GET /v1/batches, paginate with after.

Cancel batch

POST /v1/batches/{batch_id}/cancel.
ZeroGPU specifics on top of the OpenAI shape:
  • completion_window must be "24h" (the only supported value).
  • The endpoint field must be /v1/chat/completions (the only batchable endpoint, see supported endpoint).
  • Authentication uses x-api-key + x-project-id headers.

Endpoint summary

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/batches | Create a new batch from an uploaded JSONL file |
| GET | /v1/batches | List batches in this project |
| GET | /v1/batches/{batch_id} | Retrieve a single batch (with live counts) |
| POST | /v1/batches/{batch_id}/cancel | Request cancellation of an in-flight batch |

Authentication

Every request must include:
x-api-key:    <your-api-key>
x-project-id: <your-project-uuid>
See Errors reference for the auth failure codes (401, 403, 429).
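
As a minimal sketch, both headers can be attached to every call with a small helper. The name `build_request` is hypothetical (it is not part of any ZeroGPU SDK), and the base URL is the one used by the curl examples on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.zerogpu.ai"  # taken from the curl examples on this page


def build_request(method, path, api_key, project_id, body=None):
    """Build an authenticated request. Hypothetical helper, not an official client."""
    headers = {"x-api-key": api_key, "x-project-id": project_id}
    data = None
    if body is not None:
        # JSON bodies are only needed for POST /v1/batches.
        data = json.dumps(body).encode("utf-8")
        headers["content-type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

The returned object can be passed to `urllib.request.urlopen`; keeping construction separate from sending makes the headers easy to inspect in tests.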

The Batch object

This is the shape returned by create, retrieve, cancel, and inside data[] of list.
{
  "id":                "batch_01HZX...",
  "object":            "batch",
  "endpoint":          "/v1/chat/completions",
  "errors":            null,
  "input_file_id":     "file-abc123...",
  "completion_window": "24h",
  "status":            "in_progress",
  "output_file_id":    null,
  "error_file_id":     null,
  "created_at":        1736290000,
  "in_progress_at":    1736290001,
  "expires_at":        1736376400,
  "finalizing_at":     null,
  "completed_at":      null,
  "failed_at":         null,
  "expired_at":        null,
  "cancelling_at":     null,
  "cancelled_at":      null,
  "request_counts": {
    "total":     1500,
    "completed": 0,
    "failed":    0
  },
  "metadata": {
    "job": "nightly-classify"
  }
}

Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Batch identifier (prefix batch_). |
| object | string | Always "batch". |
| endpoint | string | The endpoint URL every line in this batch targets. |
| errors | object or null | When status is failed, this contains { object: "list", data: [BatchError] } describing the JSONL validation failure. Each BatchError is { code, message, line, param }. null for any non-failed batch. |
| input_file_id | string | The file-… ID you supplied at creation. |
| completion_window | string | Always "24h". |
| status | string | Lifecycle state. See Status lifecycle below. |
| output_file_id | string or null | The file containing successfully completed lines (populated when status is completed). Download via GET /v1/files/{id}/content. |
| error_file_id | string or null | The file containing failed lines (populated when status is completed and at least one line failed). Listed with purpose: "batch_output" and is_error: true. |
| created_at | integer | Unix timestamp (seconds) when the batch was created. |
| in_progress_at | integer or null | Unix timestamp when validation finished and processing started. Null while validating or failed. |
| expires_at | integer | Unix timestamp 24 hours after created_at. Anything not finished by this time becomes expired. |
| finalizing_at | integer or null | Unix timestamp when all lines finished and the output files began being written. |
| completed_at | integer or null | Unix timestamp when the batch reached completed. Null otherwise. |
| failed_at | integer or null | Unix timestamp when JSONL validation rejected the input. Null otherwise. |
| expired_at | integer or null | Unix timestamp when the 24-hour window elapsed without completion. Null otherwise. |
| cancelling_at | integer or null | Unix timestamp when cancellation was requested via POST /v1/batches/{id}/cancel. |
| cancelled_at | integer or null | Unix timestamp when cancellation finished draining and the batch became terminal. |
| request_counts.total | integer | Total number of lines in the input file. |
| request_counts.completed | integer | Lines that completed with a 2xx response. |
| request_counts.failed | integer | Lines that failed (HTTP non-2xx, retries exhausted, or other system error). |
| metadata | object | Arbitrary JSON you supplied at creation, echoed back unchanged. Empty {} when not supplied; never null. |
request_counts.completed + request_counts.failed equals total only when the batch is terminal.
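
To illustrate the relationship between the three counters, here is a small sketch that derives progress from a Batch object. The helper name `batch_progress` is illustrative; the input reuses the mid-flight counts shown in the retrieve example on this page.

```python
def batch_progress(batch):
    """Summarize request_counts from a Batch object (hypothetical helper)."""
    counts = batch["request_counts"]
    done = counts["completed"] + counts["failed"]
    total = counts["total"]
    return {
        "done": done,
        # Lines not yet accounted for; reaches 0 only once the batch is terminal.
        "pending": total - done,
        "fraction": done / total if total else 0.0,
    }


mid_flight = {"request_counts": {"total": 1500, "completed": 1342, "failed": 8}}
print(batch_progress(mid_flight))
# {'done': 1350, 'pending': 150, 'fraction': 0.9}
```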

Status lifecycle

| Status | Meaning |
| --- | --- |
| validating | The row is persisted; the JSONL is being streamed and validated synchronously. Brief, transient state. |
| in_progress | Validation succeeded; lines are being processed. |
| finalizing | All lines have been processed; the service is writing the output and error files. Brief, transient state. |
| completed | Terminal. output_file_id (and error_file_id if there were failures) are populated. |
| failed | Terminal. JSONL validation rejected the input. errors is populated with the offending line(s). The row is still retrievable via GET /v1/batches/{id} so SDKs can introspect what went wrong. |
| expired | Terminal. The 24-hour window elapsed before all lines completed. Whatever finished is still available via output_file_id / error_file_id. |
| cancelling | Cancellation was requested via POST /v1/batches/{id}/cancel. Lines already dispatched are allowed to finish; pending lines are short-circuited to the error file with code: "batch_cancelled". |
| cancelled | Terminal. Cancellation finished; output and error files reflect whatever was processed before the cancel request. |
Once a batch reaches completed, expired, failed, or cancelled, no further state changes occur.
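
Client code usually only needs to know whether a status is terminal. A minimal sketch, with the status names taken from the lifecycle table and the helper name `is_terminal` being illustrative:

```python
# Status names as listed in the lifecycle table on this page.
TERMINAL_STATUSES = frozenset({"completed", "failed", "expired", "cancelled"})
ACTIVE_STATUSES = frozenset({"validating", "in_progress", "finalizing", "cancelling"})


def is_terminal(status):
    """Return True when no further state changes are expected."""
    if status in TERMINAL_STATUSES:
        return True
    if status in ACTIVE_STATUSES:
        return False
    # Fail loudly on anything unknown instead of polling forever.
    raise ValueError(f"unknown batch status: {status!r}")
```

Raising on an unrecognized status is a design choice of this sketch: it surfaces a service-side change immediately rather than hiding it inside a polling loop.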

POST /v1/batches: Create a batch

Submits a new batch job. The response returns with status: "in_progress" as soon as validation succeeds; the actual processing happens asynchronously.

Request

POST /v1/batches
Content-Type: application/json
x-api-key:    <key>
x-project-id: <uuid>
{
  "input_file_id":     "file-abc123...",
  "endpoint":          "/v1/chat/completions",
  "completion_window": "24h",
  "metadata":          { "job": "nightly-classify" }
}
| Field | Required | Description |
| --- | --- | --- |
| input_file_id | Yes | The ID of a JSONL file you uploaded with purpose=batch. Must exist in this project. |
| endpoint | Yes | Must be /v1/chat/completions, the only batchable endpoint. See supported endpoint. Must match the url value used by every line in the JSONL file. |
| completion_window | No | Must be "24h" if provided. Other values are rejected. Defaults to "24h". |
| metadata | No | Any JSON object. Echoed back unchanged on retrieve. Useful for tagging batches (job ID, dataset version, etc.). |
Note: validation is synchronous; processing is asynchronous. All input-file and JSONL validation happens before the response returns; if it succeeds, the batch is durably committed. Lines are then processed asynchronously through an internal queue.
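
The request fields above leave little room for variation, so the body can be assembled by a small function. This is a sketch only: `build_create_body` is a hypothetical name, and the local checks simply mirror the constraints described in the request table.

```python
BATCH_ENDPOINT = "/v1/chat/completions"  # the only batchable endpoint


def build_create_body(input_file_id, metadata=None, completion_window="24h"):
    """Assemble the JSON body for POST /v1/batches (hypothetical helper)."""
    if not input_file_id:
        raise ValueError("input_file_id is required")
    if completion_window != "24h":
        raise ValueError('completion_window must be "24h"')
    if metadata is not None and not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")
    body = {
        "input_file_id": input_file_id,
        "endpoint": BATCH_ENDPOINT,
        "completion_window": completion_window,
    }
    if metadata is not None:
        body["metadata"] = metadata
    return body
```

Failing locally on a bad `completion_window` saves a round trip that would otherwise end in a 400.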

What happens at creation

The server performs all these checks synchronously before returning a response:
  1. The input file exists in the project.
  2. The JSONL is streamed from storage and validated line by line:
    • Each line is parseable JSON with custom_id, method, url, body.
    • custom_id values are unique within the batch.
    • All lines share the same url, and that url matches the endpoint parameter.
    • No line has stream: true.
    • Per-line size ≤ 1 MB; total file size ≤ 200 MB; total lines ≤ 50,000.
  3. The batch row is persisted; one queue message is enqueued per line.
Any validation failure returns 400 with the offending line. Once the response returns, the batch is durably committed and processing has begun.
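
Because a rejected file costs an upload and a round trip, it can be worth running the same checks locally first. The sketch below mirrors the checks listed above; it is not the server's implementation. `precheck_jsonl` is a hypothetical name, the byte limits assume decimal megabytes (the conservative reading of "1 MB" and "200 MB"), and treating `custom_id` as a string is also an assumption.

```python
import json

# Limits quoted above. Decimal megabytes are an assumption.
MAX_LINE_BYTES = 1_000_000
MAX_FILE_BYTES = 200_000_000
MAX_LINES = 50_000
REQUIRED_KEYS = ("custom_id", "method", "url", "body")


def precheck_jsonl(lines, endpoint="/v1/chat/completions"):
    """Return (line_number, message) problems; an empty list means none were found.

    `lines` is an iterable of JSONL strings without trailing newlines.
    File-level problems are reported with line_number None.
    """
    problems = []
    seen_ids = set()
    total_bytes = 0
    count = 0
    for number, raw in enumerate(lines, start=1):
        count = number
        size = len(raw.encode("utf-8"))
        total_bytes += size + 1  # +1 for the newline that ends the line on disk
        if size > MAX_LINE_BYTES:
            problems.append((number, "line exceeds 1 MB"))
        try:
            item = json.loads(raw)
        except json.JSONDecodeError:
            problems.append((number, "not parseable JSON"))
            continue
        if not isinstance(item, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append((number, "missing " + ", ".join(missing)))
            continue
        custom_id = item["custom_id"]
        if not isinstance(custom_id, str):  # assumption: custom_id is a string
            problems.append((number, "custom_id is not a string"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id {custom_id!r}"))
        else:
            seen_ids.add(custom_id)
        if item["url"] != endpoint:
            problems.append((number, f"url does not match endpoint {endpoint}"))
        body = item["body"]
        if isinstance(body, dict) and body.get("stream") is True:
            problems.append((number, "stream: true is not allowed"))
    if count > MAX_LINES:
        problems.append((None, "more than 50,000 lines"))
    if total_bytes > MAX_FILE_BYTES:
        problems.append((None, "file exceeds 200 MB"))
    return problems
```

Passing an open file works if each line is stripped first, for example `precheck_jsonl(line.rstrip("\n") for line in f)`. A clean local result does not guarantee the server will accept the file; it only catches the failures described on this page.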

Response: 200 OK

A Batch object with:
  • status: "in_progress" (the response is sent only after validation succeeds)
  • output_file_id: null, error_file_id: null
  • request_counts: { total: N, completed: 0, failed: 0 }

Errors

| Status | Cause |
| --- | --- |
| 400 | input_file_id is required |
| 400 | endpoint is required |
| 400 | completion_window must be "24h" |
| 400 | endpoint "X" does not match the url "Y" used by the input file (when the endpoint field disagrees with the url in the JSONL) |
| 400 | JSONL validation failure. The error body includes line (1-based) to point at the offending line. The batch is also persisted with status: "failed" so it can be retrieved later via GET /v1/batches/{id}. |
| 404 | Input file not found: <file_id>. The input file is missing, deleted, or in another project. |
| 401 / 403 / 429 / 500 | See Errors reference. |
Validation errors look like:
{
  "error": {
    "message": "Line 5 duplicates custom_id \"req-1\"",
    "type":    "invalid_request_error",
    "code":    "invalid_request_error",
    "param":   null,
    "line":    5
  }
}
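
Since `line` is 1-based, it maps directly onto the local copy of the input file. A short sketch, assuming the error body has the shape shown above; `offending_line` is a hypothetical helper.

```python
def offending_line(error_body, jsonl_lines):
    """Return (message, text of the offending line or None)."""
    error = error_body["error"]
    line_number = error.get("line")  # 1-based, absent or null for non-line errors
    if line_number is None or not 1 <= line_number <= len(jsonl_lines):
        return error["message"], None
    return error["message"], jsonl_lines[line_number - 1]
```

For errors that are not tied to a line (for example a missing `input_file_id`), the second element is `None` and only the message is useful.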

Example

curl -X POST https://api.zerogpu.ai/v1/batches \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -H "content-type: application/json" \
  -d '{
    "input_file_id":     "file-abc123...",
    "endpoint":          "/v1/chat/completions",
    "completion_window": "24h",
    "metadata":          { "job": "nightly-classify" }
  }'

GET /v1/batches/{batch_id}: Retrieve a batch

Returns a single batch with live counters if the batch is still in-progress, or the final snapshot if it has reached a terminal state. This is the endpoint you poll while waiting for a batch to finish.

Request

GET /v1/batches/{batch_id}
x-api-key:    <key>
x-project-id: <uuid>

Response: 200 OK

A Batch object. When status is in_progress or finalizing, the request_counts reflect the current real-time progress. Example mid-flight:
{
  "id":             "batch_01HZX...",
  "status":         "in_progress",
  "input_file_id":  "file-abc123...",
  "output_file_id": null,
  "error_file_id":  null,
  "request_counts": { "total": 1500, "completed": 1342, "failed": 8 },
  ...
}
Example after completion:
{
  "id":             "batch_01HZX...",
  "status":         "completed",
  "input_file_id":  "file-abc123...",
  "output_file_id": "file_out_xyz...",
  "error_file_id":  "file_err_xyz...",
  "completed_at":   1736295000,
  "request_counts": { "total": 1500, "completed": 1485, "failed": 15 },
  ...
}

Polling recommendations

  • Batches are asynchronous; the first response is always in_progress.
  • A 30-second poll interval is a reasonable default. You can poll faster for small batches.
  • The Retry-After header is not set; pick your own interval.
  • Once status is completed, expired, failed, or cancelled, no further state changes occur; stop polling.
Note: use this endpoint for live counts. Retrieve returns real-time request_counts while a batch is running. The list endpoint may lag by a few seconds; for accurate progress, retrieve specific batches.
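
The recommendations above can be sketched as a polling loop. The HTTP call is injected as `fetch_batch` (a hypothetical callable that performs GET /v1/batches/{batch_id} and returns the parsed Batch object), so the loop itself makes no assumptions about the client library. The default timeout of 24 hours plus a ten-minute margin is this sketch's choice, not a documented value.

```python
import time

TERMINAL_STATUSES = frozenset({"completed", "failed", "expired", "cancelled"})


def wait_for_batch(fetch_batch, batch_id, interval=30.0,
                   timeout=24 * 3600 + 600, sleep=time.sleep,
                   clock=time.monotonic):
    """Poll until the batch is terminal and return the final Batch object."""
    deadline = clock() + timeout
    while True:
        batch = fetch_batch(batch_id)
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        if clock() >= deadline:
            raise TimeoutError(f"{batch_id} still {batch['status']} after {timeout}s")
        # No Retry-After header is sent, so the interval is the caller's choice.
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting; in production the defaults are fine.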

Errors

| Status | Cause |
| --- | --- |
| 400 | Missing batch ID. |
| 404 | Batch not found, or belongs to another project. |
| 401 / 403 / 429 / 500 | See Errors reference. |

Example

curl https://api.zerogpu.ai/v1/batches/batch_01HZX... \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

GET /v1/batches: List batches

Returns batches in reverse-chronological order (newest first) with cursor-based pagination.

Request

GET /v1/batches[?limit=...&after=...]
x-api-key:    <key>
x-project-id: <uuid>
| Param | Default | Description |
| --- | --- | --- |
| limit | 20 | Number of batches per page. Clamped to [1, 100]. |
| after | - | Cursor: pass last_id from the previous response to get the next page. |

Response: 200 OK

{
  "object":   "list",
  "data":     [ { /* Batch object */ }, { /* Batch object */ } ],
  "first_id": "batch_01HZX...",
  "last_id":  "batch_01HZW...",
  "has_more": true
}
| Field | Description |
| --- | --- |
| object | Always "list". |
| data | Batch objects in newest-first order. |
| first_id | The id of the first item on the page (or null if data is empty). |
| last_id | The id of the last item on the page. Use this as after for the next page. |
| has_more | true if there are more batches beyond this page. |
Important: for batches still in flight, request_counts returned by the list endpoint may lag behind the real-time counters. If you need accurate progress for a specific batch, call GET /v1/batches/{id} instead.
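
Cursor pagination can be wrapped in a generator so callers iterate over batches rather than pages. In this sketch `fetch_page` is a hypothetical callable that performs GET /v1/batches with the given `limit` and `after` values and returns the parsed list response.

```python
def iter_batches(fetch_page, limit=20):
    """Yield every Batch object, newest first, following last_id cursors."""
    after = None
    while True:
        page = fetch_page(limit=limit, after=after)
        yield from page["data"]
        # Stop when the server reports no more pages, or on an empty page
        # (a guard against looping forever on an unexpected response).
        if not page["has_more"] or not page["data"]:
            return
        after = page["last_id"]
```

Because this is a generator, a caller that stops early (for example after finding a batch by its metadata) never requests the remaining pages.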

Errors

| Status | Cause |
| --- | --- |
| 401 / 403 / 429 / 500 | See Errors reference. |

Example

# Page 1
curl "https://api.zerogpu.ai/v1/batches?limit=20" \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

# Page 2 (using last_id from the previous response)
curl "https://api.zerogpu.ai/v1/batches?limit=20&after=batch_01HZW..." \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

POST /v1/batches/{batch_id}/cancel: Cancel a batch

Requests cancellation of an in-flight batch. Returns immediately with the batch object in status: "cancelling"; the batch then drains and transitions to cancelled once all pending lines have been short-circuited.

Request

POST /v1/batches/{batch_id}/cancel
x-api-key:    <key>
x-project-id: <uuid>

Response: 200 OK

The updated Batch object. On the first successful cancel, status is cancelling and cancelling_at is populated. Subsequent calls are idempotent and return the same shape; once the worker finishes draining, polling GET /v1/batches/{id} will show status: "cancelled" with cancelled_at populated. Lines that completed before the cancel request are written to output_file_id as normal. Lines that never dispatched appear in error_file_id with error.code: "batch_cancelled".
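
Cancellation is therefore a two-step flow: request it, then poll until the batch leaves cancelling. A sketch under stated assumptions: `post_cancel` and `fetch_batch` are hypothetical callables that wrap the cancel and retrieve endpoints, with `post_cancel` returning `(http_status, parsed_body)`. The 5-second interval and the poll cap are arbitrary choices.

```python
import time


def cancel_and_wait(post_cancel, fetch_batch, batch_id,
                    interval=5.0, max_polls=120, sleep=time.sleep):
    """Request cancellation, then poll until the batch stops draining."""
    http_status, body = post_cancel(batch_id)
    if http_status == 409:
        # Already terminal: nothing to cancel, return the current snapshot.
        return fetch_batch(batch_id)
    if http_status != 200:
        raise RuntimeError(f"cancel failed with HTTP {http_status}")
    batch = body
    polls = 0
    while batch["status"] == "cancelling":
        if polls >= max_polls:
            raise TimeoutError(f"{batch_id} still cancelling after {polls} polls")
        sleep(interval)
        batch = fetch_batch(batch_id)
        polls += 1
    return batch
```

Treating 409 as a non-error reflects that the caller's goal (a batch that is no longer running) is already met.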

Errors

| Status | Cause |
| --- | --- |
| 400 | Missing batch ID. |
| 404 | Batch not found, or belongs to another project. |
| 409 | Batch is already in a terminal state (completed, failed, expired) and cannot be cancelled. |
| 401 / 403 / 429 / 500 | See Errors reference. |

Example

curl -X POST https://api.zerogpu.ai/v1/batches/batch_01HZX.../cancel \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

Limits

| Limit | Value |
| --- | --- |
| Max lines per batch | 50,000 |
| Max total input size | 200 MB |
| Max per-line size | 1 MB |
| Completion window | 24 hours (fixed) |
| Concurrent batches per project | Enforced by quota, not by a hard cap |
| Streaming (stream: true in line bodies) | Rejected at creation |

After completion

When a batch reaches status: "completed":
  • output_file_id is set if at least one line returned 2xx. Download with GET /v1/files/{output_file_id}/content. See JSONL format for the line schema.
  • error_file_id is set if at least one line failed. Download with GET /v1/files/{error_file_id}/content. See JSONL format for the line schema.
  • Both files share purpose: "batch_output". Error files are additionally tagged with the ZeroGPU-specific is_error: true flag on the File object so you can distinguish them in GET /v1/files?purpose=batch_output. Both follow the same 30-day retention policy as your uploads.
When a batch reaches status: "expired", whatever work finished before the 24-hour deadline is still recorded in output_file_id and error_file_id. Anything that didn’t make it through appears in the error file with code: "batch_expired".
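
Since either file id can be null, download code should check both before fetching. A minimal sketch that maps a finished Batch object to the content paths worth requesting; `result_downloads` is a hypothetical helper and the file ids in the usage line are placeholders.

```python
def result_downloads(batch):
    """Return {"output": path, "error": path} for whichever files exist."""
    downloads = {}
    for kind, field in (("output", "output_file_id"), ("error", "error_file_id")):
        file_id = batch.get(field)
        if file_id:  # null when no line ended up in that file
            downloads[kind] = f"/v1/files/{file_id}/content"
    return downloads


finished = {"status": "completed", "output_file_id": "file-out", "error_file_id": None}
print(result_downloads(finished))
# {'output': '/v1/files/file-out/content'}
```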

Next steps

JSONL format →

Exact line schema for input, output, and error files.

Supported endpoints →

Body and response shape for /v1/chat/completions.

Errors reference →

Every HTTP status, every validation message, every JSONL error code.