The Batch API processes hundreds to thousands of inference requests (up to 50,000 per batch) at a discounted rate within a 24-hour completion window. It’s the right choice when you don’t need a real-time response and want to avoid per-request rate limits.
Quickstart
First batch in under 10 minutes: auth, upload, create, poll, download.
Upload file (playground)
POST /v1/files, attach JSONL with purpose=batch.
Create batch (playground)
POST /v1/batches after you have an input file id.
Retrieve batch (playground)
Poll GET /v1/batches/{batch_id} for status and output file ids.
JSONL format
Input line schema, output schema, error schema, validation rules.
Files API reference
All five /v1/files endpoints (prose).
Batches API reference
Create, list, retrieve, cancel (prose).
Examples
End-to-end walkthroughs in curl and the Python openai SDK.
Quick facts
| Setting | Value |
|---|---|
| Base URL (production) | https://api.zerogpu.ai |
| Auth headers | x-api-key, x-project-id |
| Completion window | 24 hours (fixed) |
| Supported batch endpoint | /v1/chat/completions (only) |
| Max requests per batch | 50,000 |
| Max input file size | 200 MB total, 1 MB per line |
| Max upload size | 100 MB |
| File retention | 30 days |
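The size limits in the table above can be checked locally before uploading. The following is a minimal sketch, not an official validator: the function names are hypothetical, megabytes are assumed to be decimal (the table does not say), and blank lines are assumed not to count as requests. Because the table lists both a 200 MB input file limit and a 100 MB upload limit, the sketch reports each one independently.

```python
MB = 1_000_000  # assumption: decimal megabytes; the quick facts do not specify

MAX_REQUESTS = 50_000        # max requests per batch
MAX_LINE_BYTES = 1 * MB      # max size of a single JSONL line
MAX_FILE_BYTES = 200 * MB    # max input file size
MAX_UPLOAD_BYTES = 100 * MB  # max upload size


def check_batch_lines(lines):
    """Return a list of problems for an iterable of raw JSONL lines (bytes)."""
    problems = []
    requests = 0
    total_bytes = 0
    for lineno, raw in enumerate(lines, start=1):
        total_bytes += len(raw)
        if not raw.strip():
            continue  # assumption: blank lines are not requests
        requests += 1
        if len(raw) > MAX_LINE_BYTES:
            problems.append(f"line {lineno}: {len(raw)} bytes exceeds 1 MB per line")
    if requests > MAX_REQUESTS:
        problems.append(f"{requests} requests exceeds the {MAX_REQUESTS} per-batch limit")
    if total_bytes > MAX_FILE_BYTES:
        problems.append(f"{total_bytes} bytes exceeds the 200 MB input file limit")
    if total_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"{total_bytes} bytes exceeds the 100 MB upload limit")
    return problems


def check_batch_file(path):
    """Run the same checks against a JSONL file on disk."""
    with open(path, "rb") as fh:
        return check_batch_lines(fh)
```

An empty list means none of the listed limits were hit; it says nothing about whether each line matches the input schema, which the JSONL format page covers.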
When to use the Batch API
| You need… | Use |
|---|---|
| A single immediate response | The synchronous endpoint directly (e.g. POST /v1/chat/completions) |
| Hundreds to thousands of completions that can wait minutes to hours | The Batch API |
| To avoid per-second rate limits during a backfill | The Batch API |
| Streaming responses | The synchronous endpoint (streaming is not supported in batch mode) |
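For the batch path, the upload, create, and poll steps listed earlier might be wired together roughly as follows. This is a sketch under several assumptions, not the documented client: the multipart field name `file`, the request fields `input_file_id`, `endpoint`, and `completion_window`, the response fields `id` and `status`, and the terminal status names are all borrowed from OpenAI-style batch APIs and are not confirmed by this page. Only the base URL, the auth header names, the three paths, and `purpose=batch` come from the text above. All function names are hypothetical.

```python
import json
import time
import urllib.request
import uuid

BASE_URL = "https://api.zerogpu.ai"  # production base URL from the quick facts

# Assumption: status names follow the OpenAI-style batch lifecycle.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def auth_headers(api_key, project_id):
    """Both auth headers named in the quick facts."""
    return {"x-api-key": api_key, "x-project-id": project_id}


def build_multipart(fields, filename, file_bytes):
    """Encode form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # Assumption: the file part is named "file".
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream'
        f"\r\n\r\n".encode() + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def call(method, path, headers, body=None):
    """Send one request and decode the JSON response."""
    req = urllib.request.Request(BASE_URL + path, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_batch(api_key, project_id, jsonl_bytes, poll_seconds=30):
    """Upload a JSONL file, create a batch, and poll until it stops changing."""
    auth = auth_headers(api_key, project_id)
    body, content_type = build_multipart({"purpose": "batch"}, "input.jsonl", jsonl_bytes)
    uploaded = call("POST", "/v1/files", {**auth, "Content-Type": content_type}, body)
    payload = json.dumps({
        "input_file_id": uploaded["id"],     # assumed field names
        "endpoint": "/v1/chat/completions",  # the only supported batch endpoint
        "completion_window": "24h",          # assumed spelling of the fixed window
    }).encode()
    batch = call("POST", "/v1/batches", {**auth, "Content-Type": "application/json"}, payload)
    while batch.get("status") not in TERMINAL_STATUSES:
        time.sleep(poll_seconds)
        batch = call("GET", f"/v1/batches/{batch['id']}", auth)
    return batch
```

The sketch uses only the standard library so it has no install step; the Examples page covers the same flow with curl and the Python openai SDK, which is likely the more convenient route in practice.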
Go deeper
Files API reference
Upload, list, retrieve, download, delete: every endpoint, every parameter.
Batches API reference
Create, list, and retrieve, including the full Batch object schema and lifecycle.
Errors reference
Every HTTP status, every validation message, every code that can appear in the error JSONL.

