Every batch is driven by three JSONL files:

| File | Direction | Purpose |
| --- | --- | --- |
| Input | You upload it | One request per line; describes what the batch should do |
| Output | Service writes it | One result per successful line |
| Error | Service writes it | One record per failed line |

This page is the canonical specification for all three line schemas.

Input JSONL

The input is a JSONL file (one JSON object per line, separated by \n). You upload it via POST /v1/files with purpose=batch, then reference its ID when you call POST /v1/batches.

Line schema

{
  "custom_id": "request-1",
  "method":    "POST",
  "url":       "/v1/chat/completions",
  "body":      { /* endpoint-specific payload */ }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| custom_id | string | Yes | Your identifier for this request. Must be non-empty and unique within the batch. Returned verbatim in the corresponding output or error line so you can match results to inputs. |
| method | string | Yes | HTTP method, case-insensitive. Only POST (any case) is accepted today. |
| url | string | Yes | Must be /v1/chat/completions, the only batchable endpoint. All lines in a single batch must share the same url, and it must match the endpoint field on the create call. |
| body | object | Yes | The request body that would be sent to the synchronous endpoint. See Supported endpoints for the shape. The field stream: true is rejected. |
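
As an illustration of the line schema above, here is a minimal sketch that builds an input file from (custom_id, body) pairs. The helper names `make_batch_line` and `write_batch_file` are hypothetical and not part of any ZeroGPU SDK; the sketch only assumes the four fields described in the table.

```python
import json

BATCH_URL = "/v1/chat/completions"  # the only batchable endpoint per this page


def make_batch_line(custom_id, body):
    """Serialize one request as a single JSONL line (no trailing newline)."""
    if not isinstance(custom_id, str) or not custom_id:
        raise ValueError("custom_id must be a non-empty string")
    if body.get("stream") is True:
        raise ValueError("stream: true is rejected in batch mode")
    record = {"custom_id": custom_id, "method": "POST", "url": BATCH_URL, "body": body}
    # json.dumps escapes newlines inside strings, so the record stays on one line.
    return json.dumps(record, ensure_ascii=False)


def write_batch_file(path, requests):
    """Write (custom_id, body) pairs to path as UTF-8, LF-separated JSONL."""
    seen = set()
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for custom_id, body in requests:
            if custom_id in seen:
                raise ValueError(f"duplicate custom_id: {custom_id}")
            seen.add(custom_id)
            f.write(make_batch_line(custom_id, body) + "\n")
    return len(seen)  # number of lines written
```

Rejecting duplicate or empty custom_id values while writing catches the mistake before the file is uploaded, rather than when the batch is created.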

Validation rules

The service performs all of the following checks when you call POST /v1/batches, not at upload time. A failed check returns 400 and identifies the offending line:
  • Each line must be a valid JSON object (not an array and not a primitive).
  • custom_id must be a non-empty string, unique across the whole batch.
  • method is case-insensitive; the upper-cased value must equal "POST".
  • url must be exactly /v1/chat/completions on every line in the file. Other ZeroGPU sync routes are rejected here.
  • body must be a non-empty object.
  • body.stream === true is rejected (streaming is not supported in batch mode).
  • Per-line size ≤ 1 MB.
  • Total file size ≤ 200 MB.
  • Total line count ≤ 50,000.
  • The file must contain at least one line.
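
Because these checks only run when the batch is created, it can help to run an equivalent check locally before uploading. The following is a rough sketch of such a pre-flight check, not the service's actual validator: `validate_batch_file` is a hypothetical name, and the byte limits assume decimal megabytes (the stricter reading of "MB").

```python
import json

MAX_LINE_BYTES = 1_000_000      # assumption: 1 MB read as 10**6 bytes
MAX_FILE_BYTES = 200_000_000    # assumption: 200 MB read as 2 * 10**8 bytes
MAX_LINES = 50_000
BATCH_URL = "/v1/chat/completions"


def validate_batch_file(raw):
    """Return a list of problems found in the raw file bytes (empty if none)."""
    problems = []
    if len(raw) > MAX_FILE_BYTES:
        problems.append("file exceeds 200 MB")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    seen, count = set(), 0
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # blank lines are skipped
        count += 1
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append(f"line {number}: larger than 1 MB")
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        if not isinstance(rec, dict):
            problems.append(f"line {number}: must be a JSON object")
            continue
        cid = rec.get("custom_id")
        if not isinstance(cid, str) or not cid:
            problems.append(f"line {number}: custom_id must be a non-empty string")
        elif cid in seen:
            problems.append(f"line {number}: duplicate custom_id {cid!r}")
        else:
            seen.add(cid)
        method = rec.get("method")
        if not isinstance(method, str) or method.upper() != "POST":
            problems.append(f"line {number}: method must be POST")
        if rec.get("url") != BATCH_URL:
            problems.append(f"line {number}: url must be {BATCH_URL}")
        body = rec.get("body")
        if not isinstance(body, dict) or not body:
            problems.append(f"line {number}: body must be a non-empty object")
        elif body.get("stream") is True:
            problems.append(f"line {number}: stream: true is not supported")
    if count == 0:
        problems.append("file must contain at least one line")
    elif count > MAX_LINES:
        problems.append("file exceeds 50,000 lines")
    return problems
```

A typical call would be `validate_batch_file(open("input.jsonl", "rb").read())`. Passing this check does not guarantee the service accepts the file; it only mirrors the rules listed above.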

Format details

  • Lines are separated by \n (LF). A trailing newline is allowed.
  • Blank lines are skipped.
  • UTF-8 encoding is required.
  • The url value must match the endpoint field on the create request exactly: no trailing slash, no query string, and no protocol/host prefix.

Example, chat completions

{"custom_id":"req-1","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of France?"}]}}
{"custom_id":"req-2","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Germany?"}]}}
{"custom_id":"req-3","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Italy?"}]}}
See Supported endpoints for the /v1/chat/completions body schema.

Output JSONL

When a batch reaches status: "completed", the service writes the successful results to a file referenced by output_file_id on the Batch object. You download it via GET /v1/files/{file_id}/content. Only lines that received a 2xx response from the underlying endpoint appear in the output file; failed lines go to the error file (see below). The order of lines in the output file is not guaranteed to match the order of lines in the input file, so match results to inputs by custom_id.

Line schema

{
  "id":        "batch_req_abc123...",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "request_id":  "req_xyz789...",
    "body":        { /* the endpoint's normal response body */ }
  }
}

| Field | Type | Description |
| --- | --- | --- |
| id | string | Line-level identifier (prefix batch_req_). Stable across retries. |
| custom_id | string | Echo of the custom_id you supplied on the input line. |
| response.status_code | integer | HTTP status from the underlying endpoint (will always be 2xx in this file). |
| response.request_id | string or null | The synchronous request id (req_…) of the underlying call to the orchestration API. Useful for cross-referencing logs. null if the upstream service didn’t surface a request id. |
| response.body | object | The same JSON body the synchronous endpoint would have returned for this request. |

See Supported endpoints for the exact response.body shape (OpenAI chat.completion).

Example, chat completions output

{"id":"batch_req_a","custom_id":"req-1","response":{"status_code":200,"request_id":"req_xyz","body":{"id":"chatcmpl-abc","object":"chat.completion","created":1736295000,"model":"<model-id>","choices":[{"index":0,"message":{"role":"assistant","content":"Paris."},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14},"system_fingerprint":null}}}
{"id":"batch_req_b","custom_id":"req-2","response":{"status_code":200,"request_id":"req_uvw","body":{"id":"chatcmpl-def","object":"chat.completion","created":1736295001,"model":"<model-id>","choices":[{"index":0,"message":{"role":"assistant","content":"Berlin."},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14},"system_fingerprint":null}}}

Reading the output

import json

with open("output.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        custom_id = record["custom_id"]
        body      = record["response"]["body"]
        answer    = body["choices"][0]["message"]["content"]
        print(f"{custom_id}: {answer}")
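
The example output lines above also carry a usage object in response.body. As a small sketch, assuming every successful line includes that OpenAI-style usage object (lines without one are counted as zero), token totals for a batch could be summed like this. `sum_usage` is a hypothetical helper name.

```python
import json
from collections import Counter

USAGE_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")


def sum_usage(lines):
    """Sum token usage over output JSONL lines (any iterable of strings)."""
    totals = Counter()
    for line in lines:
        if not line.strip():
            continue  # tolerate a trailing blank line
        body = json.loads(line)["response"]["body"]
        usage = body.get("usage") or {}  # treat a missing or null usage as empty
        for key in USAGE_KEYS:
            totals[key] += usage.get(key, 0)
    return dict(totals)
```

An open file object can be passed directly, for example `sum_usage(open("output.jsonl", encoding="utf-8"))`.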

Error JSONL

When a batch reaches status: "completed" (or expired) and at least one line failed, the service writes the failures to a file referenced by error_file_id. Successful lines go to the output file; the two are disjoint.

Line schema

{
  "id":        "batch_req_def456...",
  "custom_id": "request-5",
  "response":  null,
  "error": {
    "code":    "invalid_request_error",
    "message": "[legacy:http_400] request body is invalid",
    "param":   "model"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| id | string | Line-level identifier (prefix batch_req_). |
| custom_id | string | Echo of the custom_id you supplied on the input line. |
| response | null | Always null on error lines. Mirrors OpenAI’s {id, custom_id, response, error} discriminator so SDK type-guards work unchanged. |
| error.code | string | OpenAI semantic error code. See the table below. |
| error.message | string | Human-readable description. The original ZeroGPU internal code is preserved as a [legacy:<old>] prefix on this string for forensics. |
| error.param | string or null | Name of the offending parameter when known. |
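
If you want the legacy code as a separate value, the prefix can be split off error.message. This is a minimal sketch that assumes the prefix always has the exact form `[legacy:<old>]` at the start of the string, as in the examples on this page; `split_legacy` is a hypothetical helper.

```python
import re

# Matches "[legacy:<code>]" at the start, then optional whitespace and the rest.
LEGACY_PREFIX = re.compile(r"^\[legacy:([^\]]+)\]\s*(.*)$", re.DOTALL)


def split_legacy(message):
    """Return (legacy_code, remaining_message); legacy_code is None if absent."""
    match = LEGACY_PREFIX.match(message)
    if match is None:
        return None, message
    return match.group(1), match.group(2)
```

Messages without the prefix are returned unchanged, so the helper is safe to apply to every error line.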

Error codes you might see

| error.code | Triggered when | What to do |
| --- | --- | --- |
| invalid_request_error | The line body was rejected by the endpoint (bad shape, unknown model, missing field). Legacy prefix: http_400 / http_422. | Fix the line’s body and rerun just that line. |
| authentication_error | The API key used to submit the batch is no longer valid for this endpoint mid-flight. Legacy: http_401 / http_403 / api_key_unavailable. | Check the key status; issue a new key if needed. |
| not_found_error | The endpoint or a resource referenced in the body (e.g. model ID) doesn’t exist. Legacy: http_404. | Verify model IDs and endpoint URL. |
| request_too_large | Line body exceeded the per-request size or token limit. Legacy: http_413. | Reduce input length. |
| insufficient_quota | Organization ran out of quota partway through the batch. Legacy: http_420 / http_429. | Top up balance and rerun remaining lines. |
| rate_limit_exceeded | The endpoint rate-limited the request (non-quota). | Rerun the line; transient. |
| internal_error | The upstream service returned 5xx on every retry, or the queue exhausted its delivery attempts. Legacy: http_5xx / retries_exhausted / dlq_exhausted. | Rerun the line; transient. |
| batch_cancelled | The parent batch was cancelled via POST /v1/batches/{id}/cancel before this line was dispatched. | The line did not run; submit a fresh batch if you still need the result. |
| batch_expired | The 24-hour completion window elapsed before this line was dispatched. | Submit a fresh batch with the remaining lines. |
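
One way to act on this table is to split failed lines into those that can be resubmitted unchanged and those that need a fix first. The sketch below is one possible grouping based on the "What to do" column, not an official policy; `plan_retries` and the `RESUBMIT_UNCHANGED` set are hypothetical names.

```python
import json

# Codes whose guidance above is to rerun or resubmit the line as-is.
RESUBMIT_UNCHANGED = {
    "rate_limit_exceeded",
    "internal_error",
    "batch_cancelled",
    "batch_expired",
}


def plan_retries(input_lines, error_lines):
    """Return (retry_lines, needs_attention).

    retry_lines: original input lines that can go into a fresh batch unchanged.
    needs_attention: custom_id -> error code for lines that need a fix first
    (body, model ID, API key, or quota).
    """
    codes = {}
    for line in error_lines:
        if line.strip():
            rec = json.loads(line)
            codes[rec["custom_id"]] = rec["error"]["code"]

    retry_lines, needs_attention = [], {}
    for line in input_lines:
        if not line.strip():
            continue
        custom_id = json.loads(line)["custom_id"]
        code = codes.get(custom_id)
        if code is None:
            continue  # this line is not in the error file
        if code in RESUBMIT_UNCHANGED:
            retry_lines.append(line.rstrip("\n"))
        else:
            needs_attention[custom_id] = code
    return retry_lines, needs_attention
```

Writing `retry_lines` to a new file, one per line, gives an input file for a follow-up batch that reuses the original custom_id values.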

Example, error file

{"id":"batch_req_e1","custom_id":"req-5","response":null,"error":{"code":"invalid_request_error","message":"[legacy:http_400] {\"message\":\"messages: missing required field\",\"type\":\"invalid_request_error\"}","param":null}}
{"id":"batch_req_e2","custom_id":"req-9","response":null,"error":{"code":"internal_error","message":"[legacy:retries_exhausted] Line failed after 4 attempts: orchestration-api returned 503: upstream timeout","param":null}}
{"id":"batch_req_e3","custom_id":"req-12","response":null,"error":{"code":"internal_error","message":"[legacy:dlq_exhausted] Message exhausted main-queue retries and was routed to the DLQ","param":null}}

Reading the error file

import json

with open("errors.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        line_id   = record["id"]            # batch_req_…
        custom_id = record["custom_id"]
        code      = record["error"]["code"]  # OpenAI semantic code
        msg       = record["error"]["message"]
        print(f"FAIL {custom_id} ({line_id}) [{code}]: {msg}")

Matching outputs and errors back to inputs

The output and error files together account for all lines in your input file, but their order is not preserved and they may be split across the two files. To reconstruct results for your dataset:
import json

results = {}                                # custom_id -> result-or-error
with open("output.jsonl", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        results[rec["custom_id"]] = ("ok", rec["response"]["body"])
with open("errors.jsonl", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        results[rec["custom_id"]] = ("err", rec["error"])

# now iterate your original inputs and look up by custom_id
If a custom_id you submitted is missing from both files, the batch is not complete yet. Re-check its status and download the files again.
Always match by custom_id: output and error lines are not guaranteed to appear in input order. Index your inputs by custom_id in your code and look results up by that key.
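
To put the matching rule into practice, the following sketch walks the original input in order, looks each custom_id up in the combined results, and reports any custom_id that appears in neither file. `reconcile` is a hypothetical helper; it takes any iterable of lines, so open file objects work.

```python
import json


def _records(lines):
    """Yield parsed JSON objects from JSONL lines, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def reconcile(input_lines, output_lines, error_lines):
    """Return (ordered, missing).

    ordered: (custom_id, "ok" or "err", payload) tuples in input order.
    missing: custom_ids found in neither the output nor the error file.
    """
    results = {}
    for rec in _records(output_lines):
        results[rec["custom_id"]] = ("ok", rec["response"]["body"])
    for rec in _records(error_lines):
        results[rec["custom_id"]] = ("err", rec["error"])

    ordered, missing = [], []
    for rec in _records(input_lines):
        custom_id = rec["custom_id"]
        if custom_id in results:
            status, payload = results[custom_id]
            ordered.append((custom_id, status, payload))
        else:
            missing.append(custom_id)
    return ordered, missing
```

A non-empty `missing` list is the signal described above: the batch has not finished, so check its status and download both files again.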

Next steps

Supported endpoints →

Body and response shape for /v1/chat/completions.

Examples →

Full end-to-end walkthrough in curl and Python.

Errors reference →

Every JSONL validation message and error code, with recovery guidance.