This page walks through a complete batch from start to finish in both curl and the Python openai SDK. By the end you’ll have submitted 3 chat completions through the Batch API and read the results.

Prerequisites

Credential | Where to find it
API key (x-api-key) | ZeroGPU dashboard → API Keys
Project ID (x-project-id) | ZeroGPU dashboard → Projects
Both headers are required on every request. Missing either returns 401.
Keep your API key out of source control. Store it in environment variables, a secret manager, or your CI’s secret store, and never commit it.
export ZGPU_API_KEY="your-api-key"
export ZGPU_PROJECT_ID="your-project-uuid"
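For Python callers, a minimal sketch of reading the same two variables and turning them into request headers. The helper name zerogpu_headers is hypothetical, not part of any SDK; it only fails early when a variable is missing, so you don't find out through a 401.

```python
import os


def zerogpu_headers(env=os.environ):
    """Build the two required auth headers from environment variables.

    zerogpu_headers is a hypothetical helper, not part of any SDK.
    """
    required = ("ZGPU_API_KEY", "ZGPU_PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail locally instead of discovering the problem through a 401.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "x-api-key": env["ZGPU_API_KEY"],
        "x-project-id": env["ZGPU_PROJECT_ID"],
    }
```

The returned dict can be passed as-is to any HTTP client that accepts a headers mapping.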

Base URLs

Environment | URL
Production | https://api.zerogpu.ai
Staging | https://staging.api.zerogpu.ai
Development | https://dev.api.zerogpu.ai
The Batch and Files endpoints live under these hostnames:
  • POST /v1/files, GET /v1/files, GET /v1/files/{id}, GET /v1/files/{id}/content, DELETE /v1/files/{id}
  • POST /v1/batches, GET /v1/batches, GET /v1/batches/{batch_id}

1. Build the input JSONL

Every batch is driven by a JSONL file where each line is one inference request:
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": { ... }}
Field | Required | Description
custom_id | Yes | Your identifier for the request. Must be unique within the batch. Echoed back in the output so you can match results to inputs.
method | Yes | Must be "POST".
url | Yes | Must be /v1/chat/completions, the only supported batch endpoint. All lines in a batch must share the same url.
body | Yes | The JSON body you would send to that endpoint synchronously. "stream": true is rejected; batches are non-streaming.
Full schema and validation rules: JSONL format.
cat > input.jsonl <<'EOF'
{"custom_id":"q-1","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of France?"}]}}
{"custom_id":"q-2","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Germany?"}]}}
{"custom_id":"q-3","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Italy?"}]}}
EOF
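If the prompts come from a program rather than being written by hand, the same file can be generated in Python. This is a sketch under stated assumptions: build_batch_lines and write_batch_file are hypothetical helper names, and the model ID stays a placeholder you supply.

```python
import json


def build_batch_lines(prompts, model):
    """Yield one JSONL line per (custom_id, prompt) pair.

    Hypothetical helper; `model` is whatever model ID you would send
    to /v1/chat/completions synchronously.
    """
    for custom_id, prompt in prompts:
        request = {
            "custom_id": custom_id,
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        # json.dumps escapes newlines inside prompts, so each request
        # stays on a single line of the file.
        yield json.dumps(request, separators=(",", ":"))


def write_batch_file(path, prompts, model):
    """Write the generated lines to `path`, one request per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in build_batch_lines(prompts, model):
            f.write(line + "\n")
```

Calling write_batch_file("input.jsonl", [("q-1", "What is the capital of France?")], "<model-id>") writes a line with the same shape as the first line of the heredoc above.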

2. Upload the file

Send the JSONL to POST /v1/files with purpose=batch. The response contains the file_id you’ll reference when creating the batch.
curl -X POST https://api.zerogpu.ai/v1/files \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -F purpose=batch \
  -F file=@input.jsonl
Response:
{
  "id":         "file-abc123...",
  "object":     "file",
  "bytes":      612,
  "created_at": 1736290000,
  "filename":   "input.jsonl",
  "purpose":    "batch",
  "status":     "processed",
  "expires_at": 1738882000
}

3. Create the batch

Submit the batch with the file ID, the target endpoint, and a 24-hour completion window. The response returns immediately with status: "in_progress"; the actual processing is asynchronous.
curl -X POST https://api.zerogpu.ai/v1/batches \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -H "content-type: application/json" \
  -d '{
    "input_file_id":     "file-abc123...",
    "endpoint":          "/v1/chat/completions",
    "completion_window": "24h"
  }'
Response:
{
  "id":             "batch_01HZX...",
  "object":         "batch",
  "endpoint":       "/v1/chat/completions",
  "status":         "in_progress",
  "input_file_id":  "file-abc123...",
  "output_file_id": null,
  "error_file_id":  null,
  "created_at":     1736290000,
  "expires_at":     1736376400,
  "request_counts": { "total": 3, "completed": 0, "failed": 0 }
}
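As a quick sanity check on the sample response above, the gap between created_at and expires_at works out to the 24-hour completion window. The snippet only does arithmetic on the example values. Whether real responses always follow this relationship is an assumption, not something this page states.

```python
from datetime import datetime, timezone

# Values copied from the sample response above.
created_at = 1736290000
expires_at = 1736376400

# Unix timestamps are in seconds, so divide by 3600 to get hours.
window_hours = (expires_at - created_at) / 3600
print(window_hours)  # 24.0

# Human-readable expiry in UTC.
expires = datetime.fromtimestamp(expires_at, tz=timezone.utc)
print(expires.isoformat())
```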
Validation runs at create time. The server streams the entire input JSONL and validates every line before responding. If anything is wrong (a duplicate custom_id, a line over 1 MB, stream: true, or a mismatched url), you’ll get a 400 with the offending line. Once the response returns, the batch is durably committed.
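Because a failed validation rejects the whole batch, it can be worth running the same checks locally before uploading. The sketch below covers only the four problems named above plus the method field; it is not the server's validator, preflight is a hypothetical name, and reading "1 MB" as 1,000,000 bytes is an assumption.

```python
import json

MAX_LINE_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes
BATCH_URL = "/v1/chat/completions"


def preflight(lines):
    """Return (line_number, problem) tuples for a list of JSONL lines."""
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append((number, "line over 1 MB"))
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(request, dict):
            problems.append((number, "not a JSON object"))
            continue
        custom_id = request.get("custom_id")
        if not custom_id:
            problems.append((number, "missing custom_id"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id {custom_id!r}"))
        else:
            seen_ids.add(custom_id)
        if request.get("method") != "POST":
            problems.append((number, 'method must be "POST"'))
        if request.get("url") != BATCH_URL:
            problems.append((number, f"url must be {BATCH_URL}"))
        body = request.get("body")
        if isinstance(body, dict) and body.get("stream") is True:
            problems.append((number, "stream: true is rejected"))
    return problems
```

An empty list means none of these checks fired. The server may still reject the file for reasons this sketch does not cover.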

4. Poll until complete

Poll GET /v1/batches/{batch_id} until status is completed, expired, or failed. A 30-second interval is a reasonable default. The loop below assumes BATCH_ID holds the id returned in step 3.
while true; do
  RESP=$(curl -s "https://api.zerogpu.ai/v1/batches/$BATCH_ID" \
    -H "x-api-key: $ZGPU_API_KEY" \
    -H "x-project-id: $ZGPU_PROJECT_ID")
  STATUS=$(echo "$RESP" | jq -r '.status')
  echo "$RESP" | jq -c '{status, request_counts}'
  case "$STATUS" in completed|failed|expired) break;; esac
  sleep 30
done
When status is completed, the response contains output_file_id (and, if any line failed, error_file_id).
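The same loop can be written in Python with an overall timeout, so a stuck batch does not poll forever. In this sketch, wait_for_batch is a hypothetical helper and fetch is any zero-argument callable that returns the parsed Batch JSON as a dict. The cancelled status is included because the larger script further down treats it as terminal.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch, interval=30.0, timeout=24 * 3600,
                   sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the batch reaches a terminal status.

    `sleep` and `clock` are injectable so the loop can be tested
    without actually waiting.
    """
    deadline = clock() + timeout
    while True:
        batch = fetch()
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        if clock() >= deadline:
            raise TimeoutError(
                f"batch still {batch['status']} after {timeout} seconds"
            )
        sleep(interval)
```

Returning the whole batch dict rather than just the status keeps output_file_id and error_file_id available to the caller for the download step.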

5. Download the results

Stream the output and error files via GET /v1/files/{file_id}/content. The command below assumes OUTPUT_FILE_ID holds the output_file_id from the completed batch.
curl -s "https://api.zerogpu.ai/v1/files/$OUTPUT_FILE_ID/content" \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -o output.jsonl

jq -r '"\(.custom_id): \(.response.body.choices[0].message.content)"' output.jsonl
Output line shape (one per successful line; order is not preserved, so match by custom_id):
{"id": "batch_req_a", "custom_id": "q-1", "response": {"status_code": 200, "request_id": "req_xyz", "body": { ... }}}
Error line shape (one per failed line):
{"id": "batch_req_b", "custom_id": "q-2", "response": null, "error": {"code": "invalid_request_error", "message": "[legacy:http_400] ...", "param": null}}
Full schemas: JSONL format.
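Given the two line shapes above, a minimal sketch of turning the downloaded files into lookups keyed by custom_id. The function name collect_results is hypothetical, and the code relies only on the fields shown in the examples above.

```python
import json


def collect_results(output_text, error_text=""):
    """Parse output and error JSONL into two dicts keyed by custom_id."""
    answers, errors = {}, {}
    for line in output_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        body = record["response"]["body"]
        answers[record["custom_id"]] = body["choices"][0]["message"]["content"]
    for line in error_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Keep the whole error object: code, message, and param.
        errors[record["custom_id"]] = record["error"]
    return answers, errors
```

Since output order is not preserved, keying both dicts by custom_id lets you join results back to your inputs.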

Scale it up

The same five steps handle a real workload. This script sends 1,000 prompts and collects the answers into a CSV, generating the JSONL from your own data, polling once a minute, and matching results back by custom_id:
import csv, json, os, time
from openai import OpenAI

client = OpenAI(
    api_key="ignored-by-zerogpu",
    base_url="https://api.zerogpu.ai/v1",
    default_headers={
        "x-api-key":    os.environ["ZGPU_API_KEY"],
        "x-project-id": os.environ["ZGPU_PROJECT_ID"],
    },
)

# 1. Write JSONL from your data (anything yielding (doc_id, prompt) tuples).
# load_my_dataset() is a placeholder: replace it with your own loader.
prompts = load_my_dataset()  # 1000 records
with open("chat.jsonl", "w") as f:
    for doc_id, prompt in prompts:
        f.write(json.dumps({
            "custom_id": doc_id,
            "method":    "POST",
            "url":       "/v1/chat/completions",
            "body":      {"model": "<model-id>", "messages": [{"role": "user", "content": prompt}]},
        }) + "\n")

# 2-3. Upload + create (the with-block closes the file handle after upload)
with open("chat.jsonl", "rb") as f:
    uploaded = client.files.create(file=f, purpose="batch")
batch    = client.batches.create(
    input_file_id=uploaded.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 4. Poll
while batch.status not in ("completed", "failed", "expired", "cancelled"):
    time.sleep(60)
    batch = client.batches.retrieve(batch.id)
    print(f"{batch.status}: {batch.request_counts.completed}/{batch.request_counts.total}")

if batch.status != "completed":
    raise SystemExit(f"Batch ended with status {batch.status}")

# 5. Download, parse, write CSV
output = client.files.content(batch.output_file_id).read().decode()
results = {
    rec["custom_id"]: rec["response"]["body"]["choices"][0]["message"]["content"]
    for rec in (json.loads(line) for line in output.splitlines() if line.strip())
}
with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["doc_id", "answer"])
    writer.writerows(results.items())
For larger jobs, also read error_file_id and resubmit only the failed lines. See Errors reference for the recovery pattern.
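One possible way to build the resubmission file is to keep only the input lines whose custom_id appears in the error file. This is a sketch, not the pattern from the Errors reference (which may differ): retry_lines is a hypothetical helper, and it does not judge whether a given error is worth retrying.

```python
import json


def retry_lines(input_lines, error_lines):
    """Return the original input lines whose custom_id appears in the error file."""
    failed = {
        json.loads(line)["custom_id"]
        for line in error_lines
        if line.strip()
    }
    return [
        line
        for line in input_lines
        if line.strip() and json.loads(line)["custom_id"] in failed
    ]
```

The returned lines can be written to a new JSONL file and submitted as a fresh batch using the same five steps.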

Next steps

JSONL format →

Exact line schema for input, output, and error files.

Supported endpoints →

Body and response shape for /v1/chat/completions.

Objects & lifecycle →

Status lifecycle, the full Batch object schema, and every endpoint.

Errors reference →

Recover from failed lines without re-running the whole batch.