This page walks through a complete batch from start to finish in both curl and the Python openai SDK. By the end you’ll have submitted three chat completions through the Batch API and read the results.

Prerequisites

| Credential | Where to find it |
|---|---|
| API key (x-api-key) | ZeroGPU dashboard → API Keys |
| Project ID (x-project-id) | ZeroGPU dashboard → Projects |

Both headers are required on every request. Missing either returns 401.
Keep your API key out of source control. Store it in environment variables, a secret manager, or your CI’s secret store, and never commit it.
export ZGPU_API_KEY="your-api-key"
export ZGPU_PROJECT_ID="your-project-uuid"
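
For scripts, the two exported variables can be turned into request headers once and reused. The following is a minimal Python sketch, not part of the ZeroGPU SDK: the helper name `auth_headers` is hypothetical, and it only assumes the variable and header names shown above.

```python
import os


def auth_headers(env=None):
    """Return the two required ZeroGPU headers, read from environment variables.

    `auth_headers` is a hypothetical helper name used for illustration only.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("ZGPU_API_KEY", "ZGPU_PROJECT_ID") if not env.get(name)]
    if missing:
        # Fail locally with a clear message instead of waiting for the API's 401.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "x-api-key": env["ZGPU_API_KEY"],
        "x-project-id": env["ZGPU_PROJECT_ID"],
    }
```

Checking for both variables up front surfaces a missing credential before any request is sent.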

Base URLs

| Environment | URL |
|---|---|
| Production | https://api.zerogpu.ai |
| Staging | https://staging.api.zerogpu.ai |
| Development | https://dev.api.zerogpu.ai |

The Batch and Files endpoints live under these hostnames:
  • POST /v1/files, GET /v1/files, GET /v1/files/{id}, GET /v1/files/{id}/content, DELETE /v1/files/{id}
  • POST /v1/batches, GET /v1/batches, GET /v1/batches/{batch_id}


1. Build the input JSONL

Every batch is driven by a JSONL file where each line is one inference request:
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": { ... }}
| Field | Required | Description |
|---|---|---|
| custom_id | Yes | Your identifier for the request. Must be unique within the batch. Echoed back in the output so you can match results to inputs. |
| method | Yes | Must be "POST". |
| url | Yes | Must be /v1/chat/completions, the only supported batch endpoint. All lines in a batch must share the same url. |
| body | Yes | The JSON body you would send to that endpoint synchronously. "stream": true is rejected because batches are non-streaming. |

Full schema and validation rules: JSONL format.
cat > input.jsonl <<'EOF'
{"custom_id":"q-1","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of France?"}]}}
{"custom_id":"q-2","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Germany?"}]}}
{"custom_id":"q-3","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Italy?"}]}}
EOF
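
When the prompts come from a program rather than a hand-written heredoc, the same file can be generated in code. This is a rough Python sketch under the field rules above; `build_batch_lines` is a hypothetical helper, and `<model-id>` remains a placeholder you must replace with a real model ID.

```python
import json


def build_batch_lines(prompts, model, url="/v1/chat/completions"):
    """Return one compact JSON string per prompt, in the batch input format.

    custom_id values are generated as q-1, q-2, ... so they are unique
    within the batch (a hypothetical naming scheme, any unique string works).
    """
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"q-{i}",
            "method": "POST",
            "url": url,
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        # One JSON object per line; json.dumps never emits raw newlines.
        lines.append(json.dumps(request, separators=(",", ":")))
    return lines


prompts = [
    "What is the capital of France?",
    "What is the capital of Germany?",
    "What is the capital of Italy?",
]
jsonl_text = "\n".join(build_batch_lines(prompts, model="<model-id>")) + "\n"
```

Writing `jsonl_text` to `input.jsonl` gives the same three lines as the heredoc above.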

2. Upload the file

Send the JSONL to POST /v1/files with purpose=batch. The response contains the file_id you’ll reference when creating the batch.
curl -X POST https://api.zerogpu.ai/v1/files \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -F purpose=batch \
  -F file=@input.jsonl
Response:
{
  "id":         "file-abc123...",
  "object":     "file",
  "bytes":      612,
  "created_at": 1736290000,
  "filename":   "input.jsonl",
  "purpose":    "batch",
  "status":     "processed",
  "expires_at": 1738882000
}

3. Create the batch

Submit the batch with the file ID, the target endpoint, and a 24-hour completion window. The response returns immediately with status: "in_progress"; the actual processing is asynchronous.
curl -X POST https://api.zerogpu.ai/v1/batches \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -H "content-type: application/json" \
  -d '{
    "input_file_id":     "file-abc123...",
    "endpoint":          "/v1/chat/completions",
    "completion_window": "24h"
  }'
Response:
{
  "id":             "batch_01HZX...",
  "object":         "batch",
  "endpoint":       "/v1/chat/completions",
  "status":         "in_progress",
  "input_file_id":  "file-abc123...",
  "output_file_id": null,
  "error_file_id":  null,
  "created_at":     1736290000,
  "expires_at":     1736376400,
  "request_counts": { "total": 3, "completed": 0, "failed": 0 }
}
Validation runs at create time. The server streams the entire input JSONL and validates every line before responding. If anything is wrong (a duplicate custom_id, a line over 1 MB, stream: true, or a mismatched url), you’ll get a 400 that identifies the offending line. Once the response returns, the batch is durably committed.
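
Because a single bad line rejects the whole batch, it can be worth running the same checks locally before uploading. The sketch below is an approximation of the rules listed on this page, not the server's actual validator: `validate_batch_lines` is a hypothetical name, and treating "1 MB" as 1,000,000 bytes is an assumption.

```python
import json

# Assumption: "1 MB" is read as 10**6 bytes; the server's exact limit may differ.
MAX_LINE_BYTES = 1_000_000
SUPPORTED_URL = "/v1/chat/completions"


def validate_batch_lines(lines):
    """Return a list of (line_number, message) problems; an empty list means none found."""
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append((number, "line exceeds 1 MB"))
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(request, dict):
            problems.append((number, "line must be a JSON object"))
            continue

        custom_id = request.get("custom_id")
        if not isinstance(custom_id, str) or not custom_id:
            problems.append((number, "custom_id is missing or not a string"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id {custom_id!r}"))
        else:
            seen_ids.add(custom_id)

        if request.get("method") != "POST":
            problems.append((number, 'method must be "POST"'))
        if request.get("url") != SUPPORTED_URL:
            problems.append((number, f"url must be {SUPPORTED_URL}"))

        body = request.get("body")
        if not isinstance(body, dict):
            problems.append((number, "body must be a JSON object"))
        elif body.get("stream") is True:
            problems.append((number, "stream: true is not allowed in batches"))
    return problems
```

An empty result does not guarantee the server accepts the file; it only catches the failure modes named above before you spend an upload.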

4. Poll until complete

Poll GET /v1/batches/{batch_id} until status is completed, expired, or failed. A 30-second interval is a reasonable default. The loop assumes BATCH_ID holds the id returned when the batch was created in step 3.
while true; do
  RESP=$(curl -s "https://api.zerogpu.ai/v1/batches/$BATCH_ID" \
    -H "x-api-key: $ZGPU_API_KEY" \
    -H "x-project-id: $ZGPU_PROJECT_ID")
  STATUS=$(echo "$RESP" | jq -r '.status')
  echo "$RESP" | jq -c '{status, request_counts}'
  case "$STATUS" in completed|failed|expired) break;; esac
  sleep 30
done
When status is completed, the response contains output_file_id (and, if any line failed, error_file_id).
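
The same polling logic can be expressed in Python. This sketch keeps the HTTP call out of the loop: `fetch_batch` is a hypothetical callable you supply that performs GET /v1/batches/{batch_id} and returns the parsed JSON object, and `max_polls` is an added safety limit that the shell loop does not have.

```python
import time

# Terminal statuses named in this guide.
TERMINAL_STATUSES = {"completed", "failed", "expired"}


def wait_for_batch(fetch_batch, interval_seconds=30.0, max_polls=None, sleep=time.sleep):
    """Call fetch_batch() until the batch reaches a terminal status, then return it.

    fetch_batch: hypothetical zero-argument callable returning the batch as a dict.
    max_polls:   optional cap on the number of polls; None means poll indefinitely.
    sleep:       injectable so the loop can be exercised without real delays.
    """
    polls = 0
    while True:
        batch = fetch_batch()
        polls += 1
        if batch.get("status") in TERMINAL_STATUSES:
            return batch
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"batch still {batch.get('status')!r} after {polls} polls")
        sleep(interval_seconds)
```

After it returns, check the status before reading `output_file_id`, since a failed or expired batch also ends the loop.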

5. Download the results

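Stream the output and error files via GET /v1/files/{file_id}/content. Here OUTPUT_FILE_ID is the output_file_id from the completed batch response; download the error file the same way using error_file_id.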
curl -s "https://api.zerogpu.ai/v1/files/$OUTPUT_FILE_ID/content" \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -o output.jsonl

jq -r '"\(.custom_id): \(.response.body.choices[0].message.content)"' output.jsonl
Output line shape (one per successful request; order is not preserved, so match by custom_id):
{"id": "batch_req_a", "custom_id": "q-1", "response": {"status_code": 200, "request_id": "req_xyz", "body": { ... }}}
Error line shape (one per failed line):
{"id": "batch_req_b", "custom_id": "q-2", "response": null, "error": {"code": "invalid_request_error", "message": "[legacy:http_400] ...", "param": null}}
Full schemas: JSONL format.
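
Since output order is not preserved, results are easiest to consume once indexed by custom_id. The sketch below assumes only the line shapes shown above and the `choices[0].message.content` path used in the jq command; `index_results` is a hypothetical helper name.

```python
import json


def index_results(output_text, error_text=""):
    """Map custom_id to answer text (successes) and to error message (failures).

    output_text / error_text are the raw contents of the downloaded
    output and error JSONL files.
    """
    answers, errors = {}, {}
    for line in output_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        choices = record["response"]["body"]["choices"]
        answers[record["custom_id"]] = choices[0]["message"]["content"]
    for line in error_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        errors[record["custom_id"]] = record["error"]["message"]
    return answers, errors
```

Looking up `answers["q-1"]` then gives the reply to the first input line regardless of where it landed in the output file.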

Next steps

JSONL format: Exact line schema for input, output, and error files.

Supported endpoints: Body and response shape for /v1/chat/completions.

Batches API reference: Status lifecycle, limits, full Batch object schema.

More examples: Full Python script, plus a realistic IAB-classification batch.