Quickstart - ZeroGPU

This page walks through a complete batch from start to finish in both curl and the Python openai SDK. By the end you’ll have submitted three chat completions through the Batch API and read the results.

Prerequisites

Credential	Where to find it
API key (`x-api-key`)	ZeroGPU dashboard → API Keys
Project ID (`x-project-id`)	ZeroGPU dashboard → Projects

Both headers are required on every request. A request missing either one returns 401.

Keep your API key out of source control. Store it in environment variables, a secret manager, or your CI’s secret store, and never commit it.

export ZGPU_API_KEY="your-api-key"
export ZGPU_PROJECT_ID="your-project-uuid"
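For Python callers, the two variables exported above can be turned into the required request headers. This is a minimal sketch; `auth_headers` is a hypothetical helper name, not part of any SDK, and it assumes the variable names shown in the export lines.

```python
import os


def auth_headers(env=os.environ):
    """Build the two required headers from the environment variables above."""
    names = ("ZGPU_API_KEY", "ZGPU_PROJECT_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early: a request missing either header is rejected with 401.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "x-api-key": env["ZGPU_API_KEY"],
        "x-project-id": env["ZGPU_PROJECT_ID"],
    }
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.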

Base URLs

Environment	URL
Production	`https://api.zerogpu.ai`
Staging	`https://staging.api.zerogpu.ai`
Development	`https://dev.api.zerogpu.ai`

The Batch and Files endpoints live under these hostnames:

POST /v1/files, GET /v1/files, GET /v1/files/{id}, GET /v1/files/{id}/content, DELETE /v1/files/{id}
POST /v1/batches, GET /v1/batches, GET /v1/batches/{batch_id}

1. Build the input JSONL

Every batch is driven by a JSONL file where each line is one inference request:

{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": { ... }}

Field	Required	Description
`custom_id`	Yes	Your identifier for the request. Must be unique within the batch. Echoed back in the output so you can match results to inputs.
`method`	Yes	Must be `"POST"`.
`url`	Yes	Must be `/v1/chat/completions`, the only supported batch endpoint. All lines in a batch must share the same `url`.
`body`	Yes	The JSON body you would send to that endpoint synchronously. `"stream": true` is rejected because batches are non-streaming.

Full schema and validation rules: JSONL format.

cat > input.jsonl <<'EOF'
{"custom_id":"q-1","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of France?"}]}}
{"custom_id":"q-2","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Germany?"}]}}
{"custom_id":"q-3","method":"POST","url":"/v1/chat/completions","body":{"model":"<model-id>","messages":[{"role":"user","content":"What is the capital of Italy?"}]}}
EOF
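The same file can be generated programmatically, which helps once the batch grows beyond a few hand-written lines. The sketch below is one possible approach; `build_lines` and `write_jsonl` are hypothetical helper names, and `<model-id>` remains a placeholder exactly as in the heredoc.

```python
import json


def build_lines(questions, model="<model-id>"):
    """Return one compact JSONL line per question, with ids q-1, q-2, ..."""
    lines = []
    for i, question in enumerate(questions, start=1):
        request = {
            "custom_id": f"q-{i}",  # must be unique within the batch
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,  # placeholder, replace with a real model id
                "messages": [{"role": "user", "content": question}],
            },
        }
        # Compact separators keep each request on a single line.
        lines.append(json.dumps(request, separators=(",", ":")))
    return lines


def write_jsonl(path, lines):
    """Write the lines to path, one request per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

Calling `write_jsonl("input.jsonl", build_lines([...]))` with the three questions above should produce a file equivalent to the heredoc.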

2. Upload the file

Send the JSONL to POST /v1/files with purpose=batch. The response contains the file_id you’ll reference when creating the batch.

curl -X POST https://api.zerogpu.ai/v1/files \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -F purpose=batch \
  -F file=@input.jsonl

Response:

{
  "id":         "file-abc123...",
  "object":     "file",
  "bytes":      612,
  "created_at": 1736290000,
  "filename":   "input.jsonl",
  "purpose":    "batch",
  "status":     "processed",
  "expires_at": 1738882000
}
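For readers who prefer Python without extra dependencies, the upload can be sketched with the standard library alone. This is an illustration under assumptions: the multipart part is assumed to be named `file` (the usual convention for OpenAI-style Files APIs), and `encode_multipart` and `build_upload_request` are hypothetical helper names. The request is only built here, not sent.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url, headers, filename, file_bytes):
    """Build (but do not send) the POST /v1/files request with purpose=batch."""
    body, content_type = encode_multipart(
        {"purpose": "batch"}, "file", filename, file_bytes
    )
    return urllib.request.Request(
        f"{base_url}/v1/files",
        data=body,
        headers={**headers, "Content-Type": content_type},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` and parsing the JSON body should yield the `id` shown in the response above.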

3. Create the batch

Submit the batch with the file ID, the target endpoint, and a 24-hour completion window. The response returns immediately with status: "in_progress"; the actual processing is asynchronous.

curl -X POST https://api.zerogpu.ai/v1/batches \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -H "content-type: application/json" \
  -d '{
    "input_file_id":     "file-abc123...",
    "endpoint":          "/v1/chat/completions",
    "completion_window": "24h"
  }'

Response:

{
  "id":             "batch_01HZX...",
  "object":         "batch",
  "endpoint":       "/v1/chat/completions",
  "status":         "in_progress",
  "input_file_id":  "file-abc123...",
  "output_file_id": null,
  "error_file_id":  null,
  "created_at":     1736290000,
  "expires_at":     1736376400,
  "request_counts": { "total": 3, "completed": 0, "failed": 0 }
}
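A Python equivalent of the create call can be sketched with the standard library. `build_batch_request` is a hypothetical helper name, and the `headers` argument is assumed to carry the `x-api-key` and `x-project-id` values; the request is built but not sent.

```python
import json
import urllib.request


def build_batch_request(base_url, headers, input_file_id):
    """Build (but do not send) the POST /v1/batches request."""
    payload = {
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",  # the only supported batch endpoint
        "completion_window": "24h",
    }
    return urllib.request.Request(
        f"{base_url}/v1/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={**headers, "content-type": "application/json"},
        method="POST",
    )
```

In the sample response above, `expires_at` minus `created_at` is 86,400 seconds, which matches the 24-hour completion window.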

Validation runs at create time. The server streams the entire input JSONL and validates every line before responding. If anything is wrong (a duplicate custom_id, a line over 1 MB, stream: true, or a mismatched url), you’ll get a 400 that reports the offending line. Once the response returns, the batch is durably committed.
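Running the same checks locally before uploading can save a round trip. The sketch below mirrors only the four rules named above and is not the server's actual validator; `preflight` is a hypothetical helper, "1 MB" is assumed to mean 1,000,000 bytes, and `custom_id` is assumed to be a string as in the examples.

```python
import json

MAX_LINE_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes


def preflight(lines):
    """Return (line_number, problem) pairs for the rules listed above."""
    problems = []
    seen_ids = set()
    first_url = None
    for number, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append((number, "line over 1 MB"))
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(request, dict):
            problems.append((number, "not a JSON object"))
            continue
        custom_id = request.get("custom_id")
        if not isinstance(custom_id, str):
            problems.append((number, "custom_id missing or not a string"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id {custom_id!r}"))
        else:
            seen_ids.add(custom_id)
        url = request.get("url")
        if first_url is None:
            first_url = url  # every later line must share this url
        elif url != first_url:
            problems.append((number, "mismatched url"))
        body = request.get("body")
        if isinstance(body, dict) and body.get("stream") is True:
            problems.append((number, "stream: true"))
    return problems
```

An empty list means none of these four rules were violated; the server may still enforce rules this sketch does not cover.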

4. Poll until complete

Poll GET /v1/batches/{batch_id} until status is completed, expired, or failed. A 30-second interval is a reasonable default.

BATCH_ID="batch_01HZX..."  # the id returned by the create call in step 3

while true; do
  RESP=$(curl -s "https://api.zerogpu.ai/v1/batches/$BATCH_ID" \
    -H "x-api-key: $ZGPU_API_KEY" \
    -H "x-project-id: $ZGPU_PROJECT_ID")
  STATUS=$(echo "$RESP" | jq -r '.status')
  echo "$RESP" | jq -c '{status, request_counts}'
  # Stop on any terminal status.
  case "$STATUS" in completed|failed|expired) break;; esac
  sleep 30
done

When status is completed, the response contains output_file_id (and, if any line failed, error_file_id).
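The polling logic can be expressed in Python as a small loop around any function that fetches the batch object. This is a sketch: `wait_for_batch` and its `fetch_status` callback are hypothetical names, and the callback is expected to return the parsed JSON from GET /v1/batches/{batch_id}.

```python
import time

# Terminal statuses named in this step.
TERMINAL = {"completed", "failed", "expired"}


def wait_for_batch(fetch_status, interval=30.0, sleep=time.sleep):
    """Call fetch_status() until the batch reaches a terminal status."""
    while True:
        batch = fetch_status()
        if batch["status"] in TERMINAL:
            return batch
        # Not finished yet: wait before asking again.
        sleep(interval)
```

Injecting `sleep` keeps the loop testable without real waiting; in production the default `time.sleep` with the 30-second interval applies.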

5. Download the results

Stream the output and error files via GET /v1/files/{file_id}/content.

OUTPUT_FILE_ID=$(echo "$RESP" | jq -r '.output_file_id')  # from the final poll response

curl -s "https://api.zerogpu.ai/v1/files/$OUTPUT_FILE_ID/content" \
  -H "x-api-key: $ZGPU_API_KEY" \
  -H "x-project-id: $ZGPU_PROJECT_ID" \
  -o output.jsonl

jq -r '"\(.custom_id): \(.response.body.choices[0].message.content)"' output.jsonl

Output line shape (one per successful line; order is not preserved, so match results to inputs by custom_id):

{"id": "batch_req_a", "custom_id": "q-1", "response": {"status_code": 200, "request_id": "req_xyz", "body": { ... }}}

Error line shape (one per failed line):

{"id": "batch_req_b", "custom_id": "q-2", "response": null, "error": {"code": "invalid_request_error", "message": "[legacy:http_400] ...", "param": null}}

Full schemas: JSONL format.
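Because output order is not preserved, results are easiest to consume once indexed by custom_id. The sketch below assumes the two line shapes shown above and a chat-completions body with `choices[0].message.content`, as used in the jq command; `index_results` is a hypothetical helper name.

```python
import json


def index_results(output_lines, error_lines=()):
    """Map custom_id -> ("ok", content) or ("error", message)."""
    results = {}
    for line in output_lines:
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(line)
        body = record["response"]["body"]
        content = body["choices"][0]["message"]["content"]
        results[record["custom_id"]] = ("ok", content)
    for line in error_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        results[record["custom_id"]] = ("error", record["error"]["message"])
    return results
```

Looking up `results["q-1"]` then returns the outcome for that request regardless of where it appeared in either file.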

Next steps

JSONL format →

Exact line schema for input, output, and error files.

Supported endpoints →

Body and response shape for /v1/chat/completions.

Batches API reference →

Status lifecycle, limits, full Batch object schema.

More examples →

Full Python script, plus a realistic IAB-classification batch.
