Skip to main content
ZeroGPU is an inference provider. We run specialized small and nano language models across an edge-powered network, purpose-built for the high-volume tasks specific verticals depend on, from ad-tech classification to content moderation to document extraction. The result: more efficient compute, faster latency, and roughly 50% lower cost than frontier-model workflows. The cookbook turns those models into working code. Each recipe is copy, adapt, and run: a focused walkthrough that combines ZeroGPU with another tool, runtime, or SDK against the OpenAI-compatible endpoint at https://api.zerogpu.ai/v1. For single-endpoint request and response shapes, see the API reference and model pages. Before you begin, read the quickstart to provision an API key and pick a model from the model catalog. Bring your own API key, project ID, and model from the dashboard; see Authentication if you haven’t set up yet.

Tutorials

Longer, step-by-step walkthroughs with example data.

Resume & profile extraction

Three production-style extraction use cases over plain text (resumes, profile exports, job posts). No OCR.

Tag customer reviews overnight

Process a CSV of customer reviews with sentiment and topic tagging using the Batch API, including error recovery and result merging.

Integrations & plugins

Recipes that combine ZeroGPU with another tool, runtime, or SDK.

Sanitize a CSV with the Claude Code plugin

Redact PII from a feedback export and produce an audit log, from a single natural-language prompt.

Screen resumes with LangChain

Extract entities, redact PII, and route a candidate from a PDF resume.