- Create batch: POST /v1/batches, after you have an input file id.
- Retrieve batch: GET /v1/batches/{batch_id}, poll status and file ids.
- List batches: GET /v1/batches, paginate with after.
- Cancel batch: POST /v1/batches/{batch_id}/cancel.
ZeroGPU specifics on top of the OpenAI shape:
- completion_window must be "24h" (the only supported value).
- The endpoint field must be /v1/chat/completions (the only batchable endpoint, see supported endpoint).
- Authentication uses x-api-key + x-project-id headers.
Endpoint summary
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/batches | Create a new batch from an uploaded JSONL file |
| GET | /v1/batches | List batches in this project |
| GET | /v1/batches/{batch_id} | Retrieve a single batch (with live counts) |
| POST | /v1/batches/{batch_id}/cancel | Request cancellation of an in-flight batch |
Authentication
Every request must include:
x-api-key: <your-api-key>
x-project-id: <your-project-uuid>
See Errors reference for the auth failure codes (401, 403, 429).
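As a minimal sketch of the header requirement above, the helper below builds the two headers once so they can be reused on every call. The function names are illustrative, not part of any ZeroGPU SDK; the environment variable names are the ones used by the curl examples on this page.

```python
import os


def auth_headers(api_key: str, project_id: str) -> dict[str, str]:
    """Return the two headers that every batch request must carry."""
    if not api_key or not project_id:
        raise ValueError("both an API key and a project id are required")
    return {"x-api-key": api_key, "x-project-id": project_id}


def auth_headers_from_env() -> dict[str, str]:
    """Read the credentials from the variables used in the curl examples."""
    return auth_headers(
        os.environ.get("ZGPU_API_KEY", ""),
        os.environ.get("ZGPU_PROJECT_ID", ""),
    )
```

Failing early on an empty value keeps a missing credential from surfacing later as a 401 or 403.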
The Batch object
This is the shape returned by create, retrieve, cancel, and inside data[]
of list.
{
"id": "batch_01HZX...",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-abc123...",
"completion_window": "24h",
"status": "in_progress",
"output_file_id": null,
"error_file_id": null,
"created_at": 1736290000,
"in_progress_at": 1736290001,
"expires_at": 1736376400,
"finalizing_at": null,
"completed_at": null,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 1500,
"completed": 0,
"failed": 0
},
"metadata": {
"job": "nightly-classify"
}
}
Fields
| Field | Type | Description |
|---|---|---|
| id | string | Batch identifier (prefix batch_). |
| object | string | Always "batch". |
| endpoint | string | The endpoint URL every line in this batch targets. |
| errors | object \| null | When status is failed, this contains { object: "list", data: [BatchError] } describing the JSONL validation failure. Each BatchError is { code, message, line, param }. null for any non-failed batch. |
| input_file_id | string | The file-… ID you supplied at creation. |
| completion_window | string | Always "24h". |
| status | string | Lifecycle state. See Status lifecycle below. |
| output_file_id | string \| null | The file containing successfully completed lines. Populated when status is completed; expired and cancelled batches also expose whatever finished. Download via GET /v1/files/{id}/content. |
| error_file_id | string \| null | The file containing failed lines. Populated when status is completed and at least one line failed; expired and cancelled batches also list the lines that never ran. Listed with purpose: "batch_output" and is_error: true. |
| created_at | integer | Unix timestamp (seconds) when the batch was created. |
| in_progress_at | integer \| null | Unix timestamp when validation finished and processing started. null while the batch is validating, or if it failed validation. |
| expires_at | integer | Unix timestamp 24 hours after created_at. Anything not finished by this time becomes expired. |
| finalizing_at | integer \| null | Unix timestamp when all lines finished and the output files began being written. |
| completed_at | integer \| null | Unix timestamp when the batch reached completed. null otherwise. |
| failed_at | integer \| null | Unix timestamp when JSONL validation rejected the input. null otherwise. |
| expired_at | integer \| null | Unix timestamp when the 24-hour window elapsed without completion. null otherwise. |
| cancelling_at | integer \| null | Unix timestamp when cancellation was requested via POST /v1/batches/{id}/cancel. |
| cancelled_at | integer \| null | Unix timestamp when cancellation finished draining and the batch became terminal. |
| request_counts.total | integer | Total number of lines in the input file. |
| request_counts.completed | integer | Lines that completed with a 2xx response. |
| request_counts.failed | integer | Lines that failed (HTTP non-2xx, retries exhausted, or other system error). |
| metadata | object | Arbitrary JSON you supplied at creation, echoed back unchanged. Empty {} when not supplied; never null. |
request_counts.completed + request_counts.failed equals total only when
the batch is terminal.
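Because completed and failed only add up to total once the batch is terminal, the difference is the number of lines still pending. A small sketch of that arithmetic follows; batch_progress is a hypothetical helper name, and it only reads the request_counts fields documented above.

```python
def batch_progress(batch: dict) -> dict:
    """Summarise how far a batch has progressed from its request_counts."""
    counts = batch["request_counts"]
    total = counts["total"]
    done = counts["completed"] + counts["failed"]
    return {
        # Lines that have neither completed nor failed yet.
        "pending": total - done,
        # Guard against an empty batch to avoid dividing by zero.
        "fraction_done": done / total if total else 0.0,
    }
```

For a batch reporting 1500 total, 1342 completed, and 8 failed, this leaves 150 lines pending.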
Status lifecycle
| Status | Meaning |
|---|---|
| validating | The row is persisted; the JSONL is being streamed and validated synchronously. Brief, transient state. |
| in_progress | Validation succeeded; lines are being processed. |
| finalizing | All lines have been processed; the service is writing the output and error files. Brief, transient state. |
| completed | Terminal. output_file_id (and error_file_id if there were failures) are populated. |
| failed | Terminal. JSONL validation rejected the input. errors is populated with the offending line(s). The row is still retrievable via GET /v1/batches/{id} so SDKs can introspect what went wrong. |
| expired | Terminal. The 24-hour window elapsed before all lines completed. Whatever finished is still available via output_file_id / error_file_id. |
| cancelling | Cancellation was requested via POST /v1/batches/{id}/cancel. Lines already dispatched are allowed to finish; pending lines are short-circuited to the error file with code: "batch_cancelled". |
| cancelled | Terminal. Cancellation finished; output and error files reflect whatever was processed before the cancel request. |
Once a batch reaches completed, expired, failed, or cancelled, no
further state changes occur.
POST /v1/batches: Create a batch
Submits a new batch job. The response returns with status: "in_progress" as
soon as validation succeeds; the lines themselves are processed asynchronously.
Request
POST /v1/batches
Content-Type: application/json
x-api-key: <key>
x-project-id: <uuid>
{
"input_file_id": "file-abc123...",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata": { "job": "nightly-classify" }
}
| Field | Required | Description |
|---|---|---|
| input_file_id | Yes | The ID of a JSONL file you uploaded with purpose=batch. Must exist in this project. |
| endpoint | Yes | Must be /v1/chat/completions, the only batchable endpoint. See supported endpoint. Must match the url value used by every line in the JSONL file. |
| completion_window | No | Must be "24h" if provided. Other values are rejected. Defaults to "24h". |
| metadata | No | Any JSON object. Echoed back unchanged on retrieve. Useful for tagging batches (job ID, dataset version, etc.). |
Validation is synchronous; processing is asynchronous. All input-file and
JSONL validation happens before the response returns. If it succeeds, the
batch is durably committed, and lines are then processed asynchronously
through an internal queue.
What happens at creation
The server performs all these checks synchronously before returning a
response:
- The input file exists in the project.
- The JSONL is streamed from storage and validated line by line:
  - Each line is parseable JSON with custom_id, method, url, body.
  - custom_id values are unique within the batch.
  - All lines share the same url, and that url matches the endpoint parameter.
  - No line has stream: true.
  - Per-line size ≤ 1 MB; total file size ≤ 200 MB; total lines ≤ 50,000.
- The batch row is persisted; one queue message is enqueued per line.
Any validation failure returns 400 with the offending line. Once the
response returns, the batch is durably committed and processing has begun.
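The per-line checks listed above can be approximated locally before uploading, which avoids a round trip for obvious mistakes. The sketch below is a client-side approximation, not the server's implementation: precheck_jsonl is a hypothetical name, and reading "1 MB" and "200 MB" as decimal megabytes is an assumption.

```python
import json

MAX_LINE_BYTES = 1_000_000      # assumption: "1 MB" read as 10**6 bytes
MAX_TOTAL_BYTES = 200_000_000   # assumption: "200 MB" read as 2 * 10**8 bytes
MAX_LINES = 50_000
REQUIRED_KEYS = ("custom_id", "method", "url", "body")


def precheck_jsonl(lines, endpoint):
    """Return (line_number, message) problems; line_number is None for file-level issues."""
    problems = []
    seen_ids = set()
    total_bytes = 0
    count = 0
    for number, raw in enumerate(lines, start=1):  # 1-based, like the API's "line"
        count = number
        size = len(raw.encode("utf-8"))
        total_bytes += size
        if size > MAX_LINE_BYTES:
            problems.append((number, "line exceeds the per-line size limit"))
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            problems.append((number, "line is not valid JSON"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            problems.append((number, "missing keys: " + ", ".join(missing)))
            continue
        id_key = repr(entry["custom_id"])  # repr keeps the set lookup hashable
        if id_key in seen_ids:
            problems.append((number, "duplicate custom_id " + id_key))
        seen_ids.add(id_key)
        if entry["url"] != endpoint:
            problems.append((number, "url does not match the endpoint parameter"))
        body = entry["body"]
        if isinstance(body, dict) and body.get("stream") is True:
            problems.append((number, "stream: true is not allowed in a batch"))
    if count > MAX_LINES:
        problems.append((None, "too many lines"))
    if total_bytes > MAX_TOTAL_BYTES:
        problems.append((None, "file exceeds the total size limit"))
    return problems
```

An empty list means none of these local checks tripped; the server remains the authority on whether the file is accepted.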
Response: 200 OK
A Batch object with:
- status: "in_progress" (the response is sent after validation succeeds)
- output_file_id: null, error_file_id: null
- request_counts: { total: N, completed: 0, failed: 0 }
Errors
| Status | Cause |
|---|---|
| 400 | input_file_id is required |
| 400 | endpoint is required |
| 400 | completion_window must be "24h" |
| 400 | endpoint "X" does not match the url "Y" used by the input file (when the endpoint field disagrees with the url in the JSONL) |
| 400 | JSONL validation failure. The error body includes line (1-based) to point at the offending line. The batch is also persisted with status: "failed" so it can be retrieved later via GET /v1/batches/{id}. |
| 404 | Input file not found: <file_id>. The input file is missing, deleted, or in another project. |
| 401 / 403 / 429 / 500 | See Errors reference. |
Validation errors look like:
{
"error": {
"message": "Line 5 duplicates custom_id \"req-1\"",
"type": "invalid_request_error",
"code": "invalid_request_error",
"param": null,
"line": 5
}
}
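A client can turn the validation error body shown above into a one-line message that points at the offending input line. This is a sketch only; describe_validation_error is a hypothetical helper, and it assumes errors unrelated to a specific line simply omit the line field or set it to null.

```python
import json


def describe_validation_error(response_body: str) -> str:
    """Format a 400 error body, including the 1-based line number when present."""
    error = json.loads(response_body).get("error", {})
    message = error.get("message", "unknown error")
    line = error.get("line")
    if line is None:
        return message
    return f"input line {line}: {message}"
```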
Example
curl -X POST https://api.zerogpu.ai/v1/batches \
-H "x-api-key: $ZGPU_API_KEY" \
-H "x-project-id: $ZGPU_PROJECT_ID" \
-H "content-type: application/json" \
-d '{
"input_file_id": "file-abc123...",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata": { "job": "nightly-classify" }
}'
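The same request can be assembled with Python's standard library. The sketch below only builds the request object so it can be inspected before sending; build_create_batch_request is an illustrative name, and the base URL is taken from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.zerogpu.ai"  # host as used in the curl example


def build_create_batch_request(api_key, project_id, input_file_id, metadata=None):
    """Build (but do not send) the POST /v1/batches request."""
    payload = {
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",  # the only batchable endpoint
        "completion_window": "24h",          # the only supported value
    }
    if metadata is not None:
        payload["metadata"] = metadata
    return urllib.request.Request(
        BASE_URL + "/v1/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "x-project-id": project_id,
            "content-type": "application/json",
        },
        method="POST",
    )


# Sending it (not executed here):
#   with urllib.request.urlopen(request) as response:
#       batch = json.loads(response.read())
```

Note that urllib.request.urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the 400 and 404 cases listed above arrive as exceptions rather than return values.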
GET /v1/batches/{batch_id}: Retrieve a batch
Returns a single batch with live counters if the batch is still
in-progress, or the final snapshot if it has reached a terminal state.
This is the endpoint you poll while waiting for a batch to finish.
Request
GET /v1/batches/{batch_id}
x-api-key: <key>
x-project-id: <uuid>
Response: 200 OK
A Batch object. When status is in_progress or
finalizing, the request_counts reflect the current real-time progress.
Example mid-flight:
{
"id": "batch_01HZX...",
"status": "in_progress",
"input_file_id": "file-abc123...",
"output_file_id": null,
"error_file_id": null,
"request_counts": { "total": 1500, "completed": 1342, "failed": 8 },
...
}
Example after completion:
{
"id": "batch_01HZX...",
"status": "completed",
"input_file_id": "file-abc123...",
"output_file_id": "file_out_xyz...",
"error_file_id": "file_err_xyz...",
"completed_at": 1736295000,
"request_counts": { "total": 1500, "completed": 1485, "failed": 15 },
...
}
Polling recommendations
- Batches are asynchronous; the first response is always in_progress.
- A 30-second poll interval is a reasonable default. You can poll faster for small batches.
- The Retry-After header is not set; pick your own interval.
- Once status is completed, expired, failed, or cancelled, no further state changes occur, so stop polling.
Use this endpoint for live counts. Retrieve returns real-time request_counts
while a batch is running. The list endpoint may lag by a few seconds; for
accurate progress, retrieve the specific batch.
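The polling guidance above can be sketched as a small loop. The fetch_batch callable is a stand-in for whatever performs GET /v1/batches/{batch_id} and returns the parsed Batch object; it, wait_for_batch, and the max_polls safety valve are assumptions made for illustration.

```python
import time

# The four terminal states from the status lifecycle.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch, batch_id, interval_seconds=30.0,
                   sleep=time.sleep, max_polls=None):
    """Poll until the batch is terminal, then return the final Batch object."""
    polls = 0
    while True:
        batch = fetch_batch(batch_id)
        polls += 1
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"{batch_id} still {batch['status']} after {polls} polls")
        # No Retry-After header is sent, so the caller chooses the interval.
        sleep(interval_seconds)
```

Passing sleep as a parameter keeps the loop testable without real waiting; in production the default time.sleep applies.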
Errors
| Status | Cause |
|---|---|
| 400 | Missing batch ID. |
| 404 | Batch not found, or belongs to another project. |
| 401 / 403 / 429 / 500 | See Errors reference. |
Example
curl https://api.zerogpu.ai/v1/batches/batch_01HZX... \
-H "x-api-key: $ZGPU_API_KEY" \
-H "x-project-id: $ZGPU_PROJECT_ID"
GET /v1/batches: List batches
Returns batches in reverse-chronological order (newest first) with
cursor-based pagination.
Request
GET /v1/batches[?limit=...&after=...]
x-api-key: <key>
x-project-id: <uuid>
| Param | Default | Description |
|---|---|---|
| limit | 20 | Number of batches per page. Clamped to [1, 100]. |
| after | - | Cursor: pass last_id from the previous response to get the next page. |
Response: 200 OK
{
"object": "list",
"data": [ { /* Batch object */ }, { /* Batch object */ } ],
"first_id": "batch_01HZX...",
"last_id": "batch_01HZW...",
"has_more": true
}
| Field | Description |
|---|---|
| object | Always "list". |
| data | Batch objects in newest-first order. |
| first_id | The id of the first item on the page (or null if data is empty). |
| last_id | The id of the last item on the page. Use this as after for the next page. |
| has_more | true if there are more batches beyond this page. |
Important: for batches still in flight, request_counts returned by the
list endpoint may lag behind the real-time counters. If you need accurate
progress for a specific batch, call GET /v1/batches/{id} instead.
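Cursor pagination with after and last_id can be wrapped in a generator that walks every page. In this sketch, fetch_page is a hypothetical callable that performs GET /v1/batches with the given limit and after values and returns the parsed list response; only the documented has_more, last_id, and data fields are relied on.

```python
def iter_batches(fetch_page, limit=20):
    """Yield every batch in the project, newest first, following the cursor."""
    after = None  # no cursor on the first page
    while True:
        page = fetch_page(limit=limit, after=after)
        yield from page["data"]
        # Stop when the server reports no further pages, or on an empty page.
        if not page.get("has_more") or not page["data"]:
            return
        after = page["last_id"]
```

Since list counts may lag for in-flight batches, a caller that needs exact progress would still retrieve those batches individually.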
Errors
| Status | Cause |
|---|---|
| 401 / 403 / 429 / 500 | See Errors reference. |
Example
# Page 1
curl "https://api.zerogpu.ai/v1/batches?limit=20" \
-H "x-api-key: $ZGPU_API_KEY" \
-H "x-project-id: $ZGPU_PROJECT_ID"
# Page 2 (using last_id from the previous response)
curl "https://api.zerogpu.ai/v1/batches?limit=20&after=batch_01HZW..." \
-H "x-api-key: $ZGPU_API_KEY" \
-H "x-project-id: $ZGPU_PROJECT_ID"
POST /v1/batches/{batch_id}/cancel: Cancel a batch
Requests cancellation of an in-flight batch. Returns immediately with the
batch object in status: "cancelling"; the batch then drains and
transitions to cancelled once all pending lines have been short-circuited.
Request
POST /v1/batches/{batch_id}/cancel
x-api-key: <key>
x-project-id: <uuid>
Response: 200 OK
The updated Batch object. On the first successful
cancel, status is cancelling and cancelling_at is populated.
Subsequent calls are idempotent and return the same shape; once the worker
finishes draining, polling GET /v1/batches/{id} will show
status: "cancelled" with cancelled_at populated.
Lines that completed before the cancel request are written to
output_file_id as normal. Lines that never dispatched appear in
error_file_id with error.code: "batch_cancelled".
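After a cancel, a common follow-up is to find which requests never ran so they can be resubmitted. The sketch below assumes each error-file line is a JSON object carrying custom_id and a nested error.code; the exact line schema is defined on the JSONL format page, so treat these field names as assumptions. The function name is hypothetical.

```python
import json


def custom_ids_to_resubmit(error_file_lines, skipped_codes=("batch_cancelled",)):
    """Collect custom_id values whose error code shows the line never ran."""
    ids = []
    for raw in error_file_lines:
        if not raw.strip():
            continue  # tolerate a trailing blank line
        entry = json.loads(raw)
        error = entry.get("error") or {}  # assumed location of the code
        if error.get("code") in skipped_codes:
            ids.append(entry.get("custom_id"))
    return ids
```

Lines that failed for other reasons are deliberately left out, since resubmitting them unchanged may simply fail again.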
Errors
| Status | Cause |
|---|---|
| 400 | Missing batch ID. |
| 404 | Batch not found, or belongs to another project. |
| 409 | Batch is already in a terminal state (completed, failed, expired) and cannot be cancelled. |
| 401 / 403 / 429 / 500 | See Errors reference. |
Example
curl -X POST https://api.zerogpu.ai/v1/batches/batch_01HZX.../cancel \
-H "x-api-key: $ZGPU_API_KEY" \
-H "x-project-id: $ZGPU_PROJECT_ID"
Limits
| Limit | Value |
|---|---|
| Max lines per batch | 50,000 |
| Max total input size | 200 MB |
| Max per-line size | 1 MB |
| Completion window | 24 hours (fixed) |
| Concurrent batches per project | enforced by quota, not by a hard cap |
| Streaming (stream: true in line bodies) | rejected at creation |
After completion
When a batch reaches status: "completed":
- output_file_id is set if at least one line returned 2xx. Download with GET /v1/files/{output_file_id}/content. See JSONL format for the line schema.
- error_file_id is set if at least one line failed. Download with GET /v1/files/{error_file_id}/content. See JSONL format for the line schema.
- Both files share purpose: "batch_output". Error files are additionally tagged with the ZeroGPU-specific is_error: true flag on the File object so you can distinguish them in GET /v1/files?purpose=batch_output. Both follow the same 30-day retention policy as your uploads.
When a batch reaches status: "expired", whatever work finished before the
24-hour deadline is still recorded in output_file_id and error_file_id.
Anything that didn’t make it through appears in the error file with
code: "batch_expired".
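Since either file id can be null, a client should check both before downloading. The sketch below maps a Batch object to the download paths described above; result_file_paths is a hypothetical helper and returns paths only, leaving the actual HTTP call to the caller.

```python
def result_file_paths(batch: dict) -> dict[str, str]:
    """Return the content paths for whichever result files the batch exposes."""
    paths = {}
    for kind in ("output", "error"):
        file_id = batch.get(f"{kind}_file_id")
        if file_id:  # null when no lines ended up in that file
            paths[kind] = f"/v1/files/{file_id}/content"
    return paths
```

The same check applies to expired batches, where either file may or may not be present depending on how much work finished before the deadline.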
Next steps
- JSONL format: exact line schema for input, output, and error files.
- Supported endpoints: body and response shape for /v1/chat/completions.
- Errors reference: every HTTP status, every validation message, every JSONL error code.