Batches API - ZeroGPU

Create batch

POST /v1/batches, after you have an input file id.

Retrieve batch

GET /v1/batches/{batch_id}, poll status and file ids.

List batches

GET /v1/batches, paginate with after.

Cancel batch

POST /v1/batches/{batch_id}/cancel.

ZeroGPU specifics on top of the OpenAI shape:

completion_window must be "24h" (the only supported value).
The endpoint field must be /v1/chat/completions (the only batchable endpoint, see supported endpoint).
Authentication uses x-api-key + x-project-id headers.

Endpoint summary

Method	Path	Purpose
POST	`/v1/batches`	Create a new batch from an uploaded JSONL file
GET	`/v1/batches`	List batches in this project
GET	`/v1/batches/{batch_id}`	Retrieve a single batch (with live counts)
POST	`/v1/batches/{batch_id}/cancel`	Request cancellation of an in-flight batch

Authentication

Every request must include:

x-api-key:    <your-api-key>
x-project-id: <your-project-uuid>

See Errors reference for the auth failure codes (401, 403, 429).

The Batch object

This is the shape returned by create, retrieve, cancel, and inside data[] of list.

{
  "id":                "batch_01HZX...",
  "object":            "batch",
  "endpoint":          "/v1/chat/completions",
  "errors":            null,
  "input_file_id":     "file-abc123...",
  "completion_window": "24h",
  "status":            "in_progress",
  "output_file_id":    null,
  "error_file_id":     null,
  "created_at":        1736290000,
  "in_progress_at":    1736290001,
  "expires_at":        1736376400,
  "finalizing_at":     null,
  "completed_at":      null,
  "failed_at":         null,
  "expired_at":        null,
  "cancelling_at":     null,
  "cancelled_at":      null,
  "request_counts": {
    "total":     1500,
    "completed": 0,
    "failed":    0
  },
  "metadata": {
    "job": "nightly-classify"
  }
}

Fields

Field	Type	Description
`id`	string	Batch identifier (prefix `batch_`).
`object`	string	Always `"batch"`.
`endpoint`	string	The endpoint URL every line in this batch targets.
`errors`	object \| null	When `status` is `failed`, this contains `{ object: "list", data: [BatchError] }` describing the JSONL validation failure. Each `BatchError` is `{ code, message, line, param }`. `null` for any non-failed batch.
`input_file_id`	string	The `file-…` ID you supplied at creation.
`completion_window`	string	Always `"24h"`.
`status`	string	Lifecycle state. See Status lifecycle below.
`output_file_id`	string \| null	The file containing successfully completed lines. Set when `status` is `completed` and at least one line returned 2xx; `expired` and `cancelled` batches also expose whatever finished here. Download via `GET /v1/files/{id}/content`.
`error_file_id`	string \| null	The file containing failed lines (populated when `status` is `completed` and at least one line failed). Listed with `purpose: "batch_output"` and `is_error: true`.
`created_at`	integer	Unix timestamp (seconds) when the batch was created.
`in_progress_at`	integer \| null	Unix timestamp when validation finished and processing started. Null while `validating` or `failed`.
`expires_at`	integer	Unix timestamp 24 hours after `created_at`. Anything not finished by this time becomes `expired`.
`finalizing_at`	integer \| null	Unix timestamp when all lines finished and the output files began being written.
`completed_at`	integer \| null	Unix timestamp when the batch reached `completed`. Null otherwise.
`failed_at`	integer \| null	Unix timestamp when JSONL validation rejected the input. Null otherwise.
`expired_at`	integer \| null	Unix timestamp when the 24-hour window elapsed without completion. Null otherwise.
`cancelling_at`	integer \| null	Unix timestamp when cancellation was requested via `POST /v1/batches/{id}/cancel`.
`cancelled_at`	integer \| null	Unix timestamp when cancellation finished draining and the batch became terminal.
`request_counts.total`	integer	Total number of lines in the input file.
`request_counts.completed`	integer	Lines that completed with a 2xx response.
`request_counts.failed`	integer	Lines that failed (HTTP non-2xx, retries exhausted, or other system error).
`metadata`	object	Arbitrary JSON you supplied at creation, echoed back unchanged. Empty `{}` when not supplied; never `null`.

request_counts.completed + request_counts.failed equals total only when the batch is terminal.

Status lifecycle

Status	Meaning
`validating`	The row is persisted; the JSONL is being streamed and validated synchronously. Brief, transient state.
`in_progress`	Validation succeeded; lines are being processed.
`finalizing`	All lines have been processed; the service is writing the output and error files. Brief, transient state.
`completed`	Terminal. `output_file_id` (and `error_file_id` if there were failures) are populated.
`failed`	Terminal. JSONL validation rejected the input. `errors` is populated with the offending line(s). The row is still retrievable via `GET /v1/batches/{id}` so SDKs can introspect what went wrong.
`expired`	Terminal. The 24-hour window elapsed before all lines completed. Whatever finished is still available via `output_file_id` / `error_file_id`.
`cancelling`	Cancellation was requested via `POST /v1/batches/{id}/cancel`. Lines already dispatched are allowed to finish; pending lines are short-circuited to the error file with `code: "batch_cancelled"`.
`cancelled`	Terminal. Cancellation finished; output and error files reflect whatever was processed before the cancel request.

Once a batch reaches completed, expired, failed, or cancelled, no further state changes occur.
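As a rough illustration of the lifecycle above, the following sketch classifies a Batch object as terminal or not and derives a progress fraction from `request_counts`. The helper names `is_terminal` and `progress` are hypothetical, not part of any ZeroGPU SDK; only the status strings and field names come from this page.

```python
# Terminal statuses as listed in the Status lifecycle table.
TERMINAL_STATUSES = frozenset({"completed", "failed", "expired", "cancelled"})


def is_terminal(batch: dict) -> bool:
    """Return True once the batch can no longer change state."""
    return batch["status"] in TERMINAL_STATUSES


def progress(batch: dict) -> float:
    """Fraction of lines that have finished (completed or failed), 0.0 to 1.0."""
    counts = batch["request_counts"]
    total = counts["total"]
    if total == 0:
        return 0.0
    return (counts["completed"] + counts["failed"]) / total
```

Note that `cancelling` and `finalizing` are deliberately treated as non-terminal, since the batch is still draining or writing files in those states.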

POST `/v1/batches`: Create a batch

Submits a new batch job. Once validation passes, the response returns with status: "in_progress"; the actual processing happens asynchronously.

Request

POST /v1/batches
Content-Type: application/json
x-api-key:    <key>
x-project-id: <uuid>

{
  "input_file_id":     "file-abc123...",
  "endpoint":          "/v1/chat/completions",
  "completion_window": "24h",
  "metadata":          { "job": "nightly-classify" }
}

Field	Required	Description
`input_file_id`	Yes	The ID of a JSONL file you uploaded with `purpose=batch`. Must exist in this project.
`endpoint`	Yes	Must be `/v1/chat/completions`, the only batchable endpoint. See supported endpoint. Must match the `url` value used by every line in the JSONL file.
`completion_window`	No	Must be `"24h"` if provided. Other values are rejected. Defaults to `"24h"`.
`metadata`	No	Any JSON object. Echoed back unchanged on retrieve. Useful for tagging batches (job ID, dataset version, etc.).

Validation is synchronous; processing is asynchronous

All input-file and JSONL validation happens before the response returns. If it succeeds, the batch is durably committed. Lines are then processed asynchronously through an internal queue.

What happens at creation

The server performs all these checks synchronously before returning a response:

The input file exists in the project.
The JSONL is streamed from storage and validated line by line:
- Each line is parseable JSON with custom_id, method, url, body.
- custom_id values are unique within the batch.
- All lines share the same url, and that url matches the endpoint parameter.
- No line has stream: true.
- Per-line size ≤ 1 MB; total file size ≤ 200 MB; total lines ≤ 50,000.
The batch row is persisted; one queue message is enqueued per line.

Any validation failure returns 400 with the offending line. Once the response returns, the batch is durably committed and processing has begun.
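Because a rejected upload costs a round trip, it can help to run the same checks locally before creating the batch. The sketch below mirrors the rules listed above. It is not the server's implementation: the function name, the message wording, the requirement that `custom_id` be a string, and the reading of 1 MB as 2**20 bytes are all assumptions.

```python
import json

# Limits from this page; the byte interpretation of "MB" is an assumption.
MAX_LINE_BYTES = 1 * 1024 * 1024
MAX_FILE_BYTES = 200 * 1024 * 1024
MAX_LINES = 50_000
REQUIRED_KEYS = ("custom_id", "method", "url", "body")


def validate_batch_jsonl(lines, endpoint="/v1/chat/completions"):
    """Return a list of (line_number, message); an empty list means no problems.

    File-level problems use None as the line number.
    """
    problems = []
    seen_ids = set()
    total_bytes = 0
    count = 0
    for number, raw in enumerate(lines, start=1):
        count = number
        total_bytes += len(raw.encode("utf-8"))
        text = raw.rstrip("\n")
        if len(text.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append((number, "line exceeds 1 MB"))
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            problems.append((number, "not parseable JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((number, "missing " + ", ".join(missing)))
            continue
        custom_id = record["custom_id"]
        if not isinstance(custom_id, str):  # assumption: ids are strings
            problems.append((number, "custom_id is not a string"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id {custom_id!r}"))
        else:
            seen_ids.add(custom_id)
        # Every url must equal the endpoint, which also makes them all equal.
        if record["url"] != endpoint:
            problems.append((number, f"url {record['url']!r} does not match {endpoint!r}"))
        body = record["body"]
        if isinstance(body, dict) and body.get("stream") is True:
            problems.append((number, "stream: true is not allowed"))
    if count > MAX_LINES:
        problems.append((None, "more than 50,000 lines"))
    if total_bytes > MAX_FILE_BYTES:
        problems.append((None, "file exceeds 200 MB"))
    return problems
```

Passing an open text file works, since iterating a file yields one line at a time: `validate_batch_jsonl(open("input.jsonl", encoding="utf-8"))`. A local pass does not guarantee server acceptance; it only catches the failures documented here.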

Response: `200 OK`

A Batch object with:

status: "in_progress" (the response is sent after validation succeeds)
output_file_id: null, error_file_id: null
request_counts: { total: N, completed: 0, failed: 0 }

Errors

Status	Cause
`400`	`input_file_id is required`
`400`	`endpoint is required`
`400`	`completion_window must be "24h"`
`400`	`endpoint "X" does not match the url "Y" used by the input file` (when the `endpoint` field disagrees with the `url` in the JSONL)
`400`	JSONL validation failure. The error body includes `line` (1-based) to point at the offending line. The batch is also persisted with `status: "failed"` so it can be retrieved later via `GET /v1/batches/{id}`.
`404`	`Input file not found: <file_id>`, the input file is missing, deleted, or in another project.
`401` / `403` / `429` / `500`	See Errors reference.

Validation errors look like:

{
  "error": {
    "message": "Line 5 duplicates custom_id \"req-1\"",
    "type":    "invalid_request_error",
    "code":    "invalid_request_error",
    "param":   null,
    "line":    5
  }
}
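Since `line` is 1-based, mapping an error back to the input takes an off-by-one adjustment. A minimal sketch, assuming the error body has the shape shown above and that you still hold the input lines in memory; `locate_offending_line` is a hypothetical helper name.

```python
def locate_offending_line(error_body: dict, jsonl_lines: list):
    """Return (message, line_text) for a validation error.

    line_text is None when the error carries no usable `line` value.
    """
    error = error_body.get("error", {})
    message = error.get("message", "unknown error")
    line = error.get("line")
    # `line` is 1-based in the error body; Python lists are 0-based.
    if isinstance(line, int) and 1 <= line <= len(jsonl_lines):
        return message, jsonl_lines[line - 1]
    return message, None
```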

Example

curl -X POST https://api.zerogpu.ai/v1/batches \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -H "content-type: application/json" \
  -d '{
    "input_file_id":     "file-abc123...",
    "endpoint":          "/v1/chat/completions",
    "completion_window": "24h",
    "metadata":          { "job": "nightly-classify" }
  }'
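The same request can be sketched with Python's standard library. This is an illustration rather than an official client: the function names are hypothetical, and the host is simply the one used in the curl example above. Building the request is separated from sending it so the payload and headers can be inspected without network access.

```python
import json
import urllib.request

BASE_URL = "https://api.zerogpu.ai"  # host taken from the curl example


def build_create_batch_request(input_file_id, api_key, project_id, metadata=None):
    """Build (but do not send) a POST /v1/batches request."""
    payload = {
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",  # the only batchable endpoint
        "completion_window": "24h",          # the only supported value
    }
    if metadata is not None:
        payload["metadata"] = metadata
    return urllib.request.Request(
        BASE_URL + "/v1/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "x-project-id": project_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_batch(request):
    """Send the request and return the parsed Batch object.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical call would be `create_batch(build_create_batch_request("file-abc123...", key, project, {"job": "nightly-classify"}))`, with the key and project ID read from your environment.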

GET `/v1/batches/{batch_id}`: Retrieve a batch

Returns a single batch with live counters if the batch is still in progress, or the final snapshot if it has reached a terminal state. This is the endpoint you poll while waiting for a batch to finish.

Request

GET /v1/batches/{batch_id}
x-api-key:    <key>
x-project-id: <uuid>

Response: `200 OK`

A Batch object. When status is in_progress or finalizing, the request_counts reflect the current real-time progress. Example mid-flight:

{
  "id":             "batch_01HZX...",
  "status":         "in_progress",
  "input_file_id":  "file-abc123...",
  "output_file_id": null,
  "error_file_id":  null,
  "request_counts": { "total": 1500, "completed": 1342, "failed": 8 },
  ...
}

Example after completion:

{
  "id":             "batch_01HZX...",
  "status":         "completed",
  "input_file_id":  "file-abc123...",
  "output_file_id": "file_out_xyz...",
  "error_file_id":  "file_err_xyz...",
  "completed_at":   1736295000,
  "request_counts": { "total": 1500, "completed": 1485, "failed": 15 },
  ...
}

Polling recommendations

Batches are asynchronous; a successful create response always reports in_progress.
A 30-second poll interval is a reasonable default. You can poll faster for small batches.
The Retry-After header is not set; pick your own interval.
Once status is completed, expired, failed, or cancelled, no further state changes occur, so stop polling.

Use this endpoint for live counts

Retrieve returns real-time request_counts while a batch is running. The list endpoint may lag by a few seconds; for accurate progress, retrieve specific batches.
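The recommendations above amount to a simple loop. The sketch below is one possible shape, assuming you supply a `fetch_batch` callable that performs `GET /v1/batches/{batch_id}` and returns the parsed Batch object. The function name, the injectable `sleep` and `clock` parameters, and the 24-hour default timeout are illustrative choices, not part of the API.

```python
import time

TERMINAL = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch, batch_id, interval=30.0, timeout=24 * 60 * 60,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the batch is terminal and return the final Batch object.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        batch = fetch_batch(batch_id)
        if batch["status"] in TERMINAL:
            return batch
        if clock() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch['status']}")
        # No Retry-After header is sent, so the interval is the caller's choice.
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; in production the defaults are enough.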

Errors

Status	Cause
`400`	Missing batch ID.
`404`	Batch not found, or belongs to another project.
`401` / `403` / `429` / `500`	See Errors reference.

Example

curl https://api.zerogpu.ai/v1/batches/batch_01HZX... \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

GET `/v1/batches`: List batches

Returns batches in reverse-chronological order (newest first) with cursor-based pagination.

Request

GET /v1/batches[?limit=...&after=...]
x-api-key:    <key>
x-project-id: <uuid>

Param	Default	Description
`limit`	`20`	Number of batches per page. Clamped to `[1, 100]`.
`after`	-	Cursor: pass `last_id` from the previous response to get the next page.

Response: `200 OK`

{
  "object":   "list",
  "data":     [ { /* Batch object */ }, { /* Batch object */ } ],
  "first_id": "batch_01HZX...",
  "last_id":  "batch_01HZW...",
  "has_more": true
}

Field	Description
`object`	Always `"list"`.
`data`	Batch objects in newest-first order.
`first_id`	The `id` of the first item on the page (or `null` if `data` is empty).
`last_id`	The `id` of the last item on the page. Use this as `after` for the next page.
`has_more`	`true` if there are more batches beyond this page.

Important: for batches still in flight, request_counts returned by the list endpoint may lag behind the real-time counters. If you need accurate progress for a specific batch, call GET /v1/batches/{id} instead.
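Cursor pagination with `after`, `last_id`, and `has_more` can be wrapped in a generator. This is a minimal sketch: `fetch_page` is a hypothetical callable you provide that issues `GET /v1/batches` with the given `limit` and `after` and returns the parsed list response.

```python
def iter_batches(fetch_page, limit=20):
    """Yield every Batch object in the project, newest first."""
    after = None
    while True:
        page = fetch_page(limit=limit, after=after)
        yield from page["data"]
        last_id = page.get("last_id")
        # Stop when the server reports no further pages or gives no cursor.
        if not page.get("has_more") or last_id is None:
            return
        after = last_id
```

Because the list endpoint's counters may lag, treat the yielded objects as a directory of batch IDs and retrieve individual batches when you need exact progress.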

Errors

Status	Cause
`401` / `403` / `429` / `500`	See Errors reference.

Example

# Page 1
curl "https://api.zerogpu.ai/v1/batches?limit=20" \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

# Page 2 (using last_id from the previous response)
curl "https://api.zerogpu.ai/v1/batches?limit=20&after=batch_01HZW..." \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

POST `/v1/batches/{batch_id}/cancel`: Cancel a batch

Requests cancellation of an in-flight batch. Returns immediately with the batch object in status: "cancelling"; the batch then drains and transitions to cancelled once all pending lines have been short-circuited.

Request

POST /v1/batches/{batch_id}/cancel
x-api-key:    <key>
x-project-id: <uuid>

Response: `200 OK`

The updated Batch object. On the first successful cancel, status is cancelling and cancelling_at is populated. Subsequent calls are idempotent and return the same shape; once the worker finishes draining, polling GET /v1/batches/{id} will show status: "cancelled" with cancelled_at populated. Lines that completed before the cancel request are written to output_file_id as normal. Lines that never dispatched appear in error_file_id with error.code: "batch_cancelled".

Errors

Status	Cause
`400`	Missing batch ID.
`404`	Batch not found, or belongs to another project.
`409`	Batch is already in a terminal state (`completed`, `failed`, `expired`) and cannot be cancelled.
`401` / `403` / `429` / `500`	See Errors reference.

Example

curl -X POST https://api.zerogpu.ai/v1/batches/batch_01HZX.../cancel \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID"

Limits

Limit	Value
Max lines per batch	50,000
Max total input size	200 MB
Max per-line size	1 MB
Completion window	24 hours (fixed)
Concurrent batches per project	enforced by quota, not by a hard cap
Streaming (`stream: true` in line bodies)	rejected at creation

After completion

When a batch reaches status: "completed":

output_file_id is set if at least one line returned 2xx. Download with GET /v1/files/{output_file_id}/content. See JSONL format for the line schema.
error_file_id is set if at least one line failed. Download with GET /v1/files/{error_file_id}/content. See JSONL format for the line schema.
Both files share purpose: "batch_output". Error files are additionally tagged with the ZeroGPU-specific is_error: true flag on the File object so you can distinguish them in GET /v1/files?purpose=batch_output. Both follow the same 30-day retention policy as your uploads.

When a batch reaches status: "expired", whatever work finished before the 24-hour deadline is still recorded in output_file_id and error_file_id. Anything that didn’t make it through appears in the error file with code: "batch_expired".
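After downloading an error file, a quick tally of error codes shows how many lines expired or were cancelled versus failed for other reasons. The sketch below assumes each error-file line is a JSON object with an `error` object carrying a `code` field; the authoritative schema is on the JSONL format page, so treat that shape and the helper name as assumptions.

```python
import json
from collections import Counter


def tally_error_codes(error_lines):
    """Count error-file lines by error code.

    Assumes each line looks like {"custom_id": ..., "error": {"code": ...}}.
    """
    counts = Counter()
    for raw in error_lines:
        if not raw.strip():
            continue  # tolerate a trailing blank line
        record = json.loads(raw)
        error = record.get("error") or {}
        counts[error.get("code", "unknown")] += 1
    return counts
```

A high `batch_expired` count suggests the batch was too large for the 24-hour window; those lines can be resubmitted in a new batch.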

Next steps

JSONL format →

Exact line schema for input, output, and error files.

Supported endpoints →

Body and response shape for /v1/chat/completions.

Errors reference →

Every HTTP status, every validation message, every JSONL error code.
