Skip to main content
ZeroGPU is an inference provider. We run specialized small and nano language models across an edge-powered network, purpose-built for the high-volume tasks specific verticals depend on, from ad-tech classification to content moderation to document extraction. The result: more efficient compute, faster latency, and roughly 50% lower cost than frontier-model workflows. The cookbook turns those models into working code. Each recipe is copy, adapt, and run: a focused walkthrough that combines ZeroGPU with another tool, runtime, or SDK against the OpenAI-compatible endpoint at https://api.zerogpu.ai/v1. For single-endpoint request and response shapes, see the API reference and model pages. Before you begin, read the quickstart to provision an API key and pick a model from the model catalog. Bring your own API key, project ID, and model from the dashboard; see Authentication if you haven’t set up yet.

Tutorials

Longer, step-by-step walkthroughs with example data.

Resume & profile extraction

Three production-style extraction use cases over plain text (resumes, profile exports, job posts). No OCR.

Tag customer reviews overnight

Process a CSV of customer reviews with sentiment and topic tagging using the Batch API, including error recovery and result merging.

Integrations & plugins

Recipes that combine ZeroGPU with another tool, runtime, or SDK.

Sanitize a CSV with the Claude Code plugin

Redact PII from a feedback export and produce an audit log, from a single natural-language prompt.

Screen resumes with LangChain

Extract entities, redact PII, and route a candidate from a PDF resume.

Turn content into targeted copy with Claude

Classify an article with the IAB model, then have Claude write an ad brief, newsletter blurb, and content pitch from the signals.