Skills + CLI (Claude Code)

Claude Code is Anthropic’s agentic coding tool that runs Claude directly in your terminal, with file editing, command execution, and a plugin system that extends sessions with custom slash commands and skills. Plugins package skills that Claude can auto-invoke when a user request matches, or that users can call explicitly by name. The plugin marketplace lets teams share these extensions across projects. ZeroGPU is an ultra-fast, compute-efficient inference provider for apps and agents. We run purpose-built small and nano language models across an edge-powered network for the high-volume, purpose-specific tasks your app or agent runs constantly. Plug in our OpenAI-compatible API and you’re live - zero GPU infrastructure, serverless, auto-scaling by default.

Overview

This guide walks through installing the zerogpu-router plugin, which exposes every ZeroGPU CLI command as a Claude Code skill. You’ll install the CLI, authenticate, load the plugin from its marketplace, and try both auto-invoked and manually-invoked skills. By the end you’ll be able to offload cheap, well-defined NLP tasks (classification, entity and PII extraction, short chat) from Claude to ZeroGPU’s edge-optimized models without leaving your terminal session.

Video walkthrough

Quickstart

Prerequisites

Node.js 20 or newer.
Claude Code installed: npm install -g @anthropic-ai/claude-code.
A ZeroGPU API key and Project ID.
A model ID from the model catalog if you plan to call models by name.

Get your ZeroGPU API key

Sign in to the ZeroGPU dashboard.
Open API Keys and click Create key.
Copy the key (starts with zgpu-api-) and grab your Project ID (UUID) from the project settings page.

Install the ZeroGPU CLI

npm install -g zerogpu-cli
zerogpu --version

Then authenticate so every skill call works without re-prompting:

zerogpu login

You’ll be prompted for your API key and Project ID. For CI or non-interactive setups, pass them as flags:

zerogpu login \
  --api-key zgpu-api-XXXXXXXXXXXXXXXXXX \
  --project-id 4ed3e5bb-c2ed-4d4a-8a66-2b161a27fd1a

Your first request

Start a Claude Code session by running claude in your terminal, then add the marketplace and install the plugin:

/plugin marketplace add zerogpu/zerogpu-router
/plugin install zerogpu-router@zerogpu
/reload-plugins

Confirm it’s loaded with /plugin. You should see zerogpu-router - enabled. Now try a skill:

/zerogpu-router:redact-pii "Email John Smith at [email protected] about invoice 12345."

You’ll get back: Email [PERSON] at [EMAIL] about invoice 12345.

Usage

Every inference skill auto-invokes when Claude detects a matching request, or you can call any skill explicitly with /zerogpu-router:<name> <args>. Two skills (signin and status) are manual-only. This section walks through every skill in the plugin: what it does, which ZeroGPU model it routes to, the full argument surface, and the shape of the response you’ll get back.

How auto-invocation works

The plugin ships each skill with a description that Claude reads when deciding whether to invoke it for a given user message. You don’t have to remember slash command names: just describe the task. Claude picks based on intent words like “redact”, “scrub”, “classify by”, “extract”, or “tag this as”, and on the structure of the data you supply (a label list implies zero-shot; a JSON schema with category axes implies structured classification). Three common auto-invoked patterns:

Redact PII from this support ticket before I paste it into our public tracker:

"Hi team, this is Sarah Chen ([email protected], +1 415-555-0182).
Our prod database started throwing connection timeouts around 2:14 AM PT last
night. The on-call engineer Marcus Rivera restarted pgbouncer but the issue
came back. Billing should go to our CFO Priya Patel at
[email protected], billing address 1455 Market St, Suite 600,
San Francisco, CA 94103."

Claude routes to redact-pii. Names, emails, phone numbers, and street addresses come back replaced by uppercase label placeholders like [PERSON], [EMAIL], [PHONE_NUMBER], and [ADDRESS]. The raw PII never enters Claude’s context window, which is the point: you can paste the masked output into a public tracker or log without leaking customer data. Project-specific identifiers (internal hostnames, IPs, contract numbers, card last-fours) aren’t in the model’s label set; strip those yourself, or use extract-entities with custom labels.

Pull all the email addresses and phone numbers out of this:
"Reach Maria at [email protected] or 415-555-0188"

Claude routes to extract-pii, which returns structured JSON grouped by category instead of an in-line mask.

Classify this support ticket by sentiment and topic:
"Support replied quickly but the fix didn't work"

Claude routes to classify-structured and synthesizes a schema with sentiment and topic axes before calling the skill. If you ever want a specific skill instead of letting Claude choose, invoke it explicitly with the /zerogpu-router:<name> syntax shown below.

`/zerogpu-router:signin`

Sign in to ZeroGPU and persist your credentials so every subsequent skill call works without re-prompting. Manual only; not auto-invoked by Claude. Wraps zerogpu login. Synopsis

/zerogpu-router:signin [--api-key <key>] [--project-id <id>]

Flag	Required	Description
`--api-key <key>`	optional	API key. Must start with `zgpu-api-`. If omitted, you’ll be prompted (masked).
`--project-id <id>`	optional	Project ID (UUID v4). If omitted, you’ll be prompted.

On success the API key and Project ID are written to your config file, and ZEROGPU_API_KEY is added to your shell profile so other tools can pick it up.

`/zerogpu-router:status`

Show your current ZeroGPU sign-in status and the masked API key. Manual only. Wraps zerogpu status. Exit code is 0 when signed in, 1 when not. Use this in scripts that need a hard fail before running any inference skill.

`/zerogpu-router:chat`

Short, single-turn chat reply for things that don’t need Claude-level reasoning or prior conversation context. Use this when you’d otherwise burn Claude tokens on a one-liner.

Model: LFM2.5-1.2B-Instruct
Wraps: zerogpu chat
When Claude auto-invokes: quick factual answers, one-liners, basic rephrasings where you’ve signalled “use a small model.”

Synopsis

/zerogpu-router:chat <text> [-i <instructions>]

Name	Required	Description
`text`	yes	The user message / prompt (quoted).
`-i`, `--instructions <instructions>`	optional	System instructions to steer behavior.

Example

/zerogpu-router:chat "Explain WebSockets in two sentences." -i "You are a concise technical writer."

The skill returns raw assistant text (or pretty-printed JSON if the model returned a structured object).

`/zerogpu-router:chat-thinking`

Same shape as chat, but the model returns its reasoning trace alongside the answer.

Model: LFM2.5-1.2B-Thinking
Wraps: zerogpu chat_thinking
When Claude auto-invokes: short logic / math / word-problem questions where step-by-step reasoning is useful or explicitly requested.

Example

/zerogpu-router:chat-thinking "If a train leaves at 3 PM going 60 mph, when does it cover 150 miles?"

`/zerogpu-router:summarize`

Condense a longer passage into a short summary. Use this when you need the gist of a report, ticket thread, transcript, or document without spending Claude tokens on the read.

Model: llama-3-1-8b-instruct-fast
Wraps: zerogpu summarize
When Claude auto-invokes: “summarize this”, “give me the gist”, “TL;DR this passage”, “condense this into a few sentences.”

Synopsis

/zerogpu-router:summarize <text>

Example

/zerogpu-router:summarize "The board met Thursday to review Q3 results. Revenue rose 18% \
year-over-year to $42M, driven mainly by enterprise renewals and a strong launch in the EU \
market. Operating margin slipped to 11% from 14% as headcount grew 30% ahead of the new \
data-center buildout. The CFO flagged rising cloud costs as the top risk for Q4 and proposed \
a hiring freeze on non-engineering roles until margins recover. The board approved the freeze \
and asked for a revised 2025 budget by mid-December."

Illustrative output

Q3 revenue grew 18% YoY to $42M on enterprise renewals and EU growth, but operating margin
fell to 11% due to a 30% headcount increase for the data-center buildout. Citing cloud costs
as the main Q4 risk, the board approved a hiring freeze on non-engineering roles and requested
a revised 2025 budget by mid-December.

`/zerogpu-router:classify-iab`

Classify text against the IAB content / audience taxonomy (standard ad-tech category labels).

Model: zlm-v1-iab-classify-edge
Wraps: zerogpu classify_iab
When Claude auto-invokes: “what IAB category is this?”, “tag this article for ad targeting”, “give me the topic taxonomy.”

Example

/zerogpu-router:classify-iab "The Lakers signed a new point guard ahead of the playoffs."

Illustrative output

{
  "categories": [
    { "id": "IAB17-44", "name": "Basketball", "confidence": 0.97 }
  ]
}

`/zerogpu-router:classify-iab-enriched`

Enriched IAB classification that returns audience categories plus topics, keywords, and inferred user intent.

Model: zlm-v1-iab-classify-edge-enriched
Wraps: zerogpu classify_iab_enriched
When Claude auto-invokes: “give me topics, keywords, and intent”, any request asking for richer ad/audience signals than plain IAB labels.

Example

/zerogpu-router:classify-iab-enriched "Compare the Tesla Model Y and the Hyundai Ioniq 5 for a family of four."

Illustrative output

{
  "categories": [{ "id": "IAB2-1", "name": "Auto Buyers", "confidence": 0.92 }],
  "topics": ["electric vehicles", "family cars"],
  "keywords": ["Tesla Model Y", "Hyundai Ioniq 5"],
  "intent": "comparison-shopping"
}

`/zerogpu-router:classify-zero-shot`

Zero-shot classification against an arbitrary list of candidate labels you supply at call time. The model picks the single best fit and returns scores for all candidates.

Model: deberta-v3-small
Wraps: zerogpu classify_zero_shot
When Claude auto-invokes: “is this positive, negative, or neutral?”, “tag this as bug, feature, or question.”

Synopsis

/zerogpu-router:classify-zero-shot <text> (-l <label>...) | (--labels a,b,c)

Name	Required	Description
`text`	yes	Text to classify (quoted).
`-l <label>`	one of `-l` / `--labels`	A single label. Repeatable.
`--labels <a,b,c>`	one of `-l` / `--labels`	Comma-separated label list.

Example

/zerogpu-router:classify-zero-shot "I love how fast this laptop boots up." -l positive -l negative -l neutral

Illustrative output

{ "label": "positive", "scores": { "positive": 0.94, "neutral": 0.04, "negative": 0.02 } }

`/zerogpu-router:classify-structured`

Schema-driven, multi-axis classification. You define each category and its allowed labels, and the model returns one chosen label per category.

Model: gliner2-base-v1
Wraps: zerogpu classify_structured
When Claude auto-invokes: “classify by sentiment and topic”, any request that names multiple classification dimensions with explicit label sets.

Synopsis

/zerogpu-router:classify-structured <text> -s '<json schema>'

Name	Required	Description
`text`	yes	Text to classify.
`-s`, `--schema <json>`	yes	JSON object mapping each category to its allowed labels.

Example

/zerogpu-router:classify-structured "Support replied quickly but the fix didn't work." \
  -s '{"sentiment":["positive","negative","neutral"],"topic":["support","billing","product"]}'

Illustrative output

{ "sentiment": "negative", "topic": "support" }

`/zerogpu-router:extract-entities`

Custom-label named-entity recognition. You define the entity labels; the model finds matching spans with confidence scores.

Model: gliner2-base-v1
Wraps: zerogpu extract_entities
When Claude auto-invokes: “extract all people, organizations, and locations from this”, “find every product mention.”

Synopsis

/zerogpu-router:extract-entities <text> (-l <label>... | --labels a,b,c) [-t <0..1>]

Name	Required	Default	Description
`text`	yes	-	Source text.
`-l <label>` / `--labels <a,b,c>`	yes (one)	-	Entity labels to extract.
`-t`, `--threshold <number>`	optional	`0.3`	Minimum confidence in `[0, 1]`.

Example

/zerogpu-router:extract-entities "Apple CEO Tim Cook met with Sundar Pichai in Cupertino on Monday." \
  --labels person,organization,location -t 0.4

Illustrative output

[
  { "label": "organization", "text": "Apple",         "score": 0.98 },
  { "label": "person",       "text": "Tim Cook",      "score": 0.97 },
  { "label": "person",       "text": "Sundar Pichai", "score": 0.96 },
  { "label": "location",     "text": "Cupertino",     "score": 0.91 }
]

`/zerogpu-router:extract-pii`

Extract personally identifiable information entities, grouped by category, without modifying the source text. Use when you need structured data about PII (for redaction policies, audits, or downstream tooling) rather than a masked version.

Model: gliner-multi-pii-v1
Wraps: zerogpu extract_pii
When Claude auto-invokes: “find all PII”, “what personal info is in this?”, “list emails/phones/names.”

Synopsis

/zerogpu-router:extract-pii <text> [-t <threshold>] [(-c | --categories) <list>]

Name	Required	Default	Description
`text`	yes	-	Source text.
`-t`, `--threshold <number>`	optional	`0.5`	Minimum confidence.
`-c`, `--categories <list>`	optional	`identity,contact`	Comma-separated. Other values: `financial`, `medical`, `credentials`.

Example

/zerogpu-router:extract-pii "Contact Jane Doe at [email protected] or +1 (415) 555-1212." -t 0.6 -c identity,contact,financial

Illustrative output

[
  { "category": "identity", "label": "person", "text": "Jane Doe",          "score": 0.96 },
  { "category": "contact",  "label": "email",  "text": "[email protected]",  "score": 0.99 },
  { "category": "contact",  "label": "phone",  "text": "+1 (415) 555-1212", "score": 0.95 }
]

If you want to mask PII inline rather than extract it, use /zerogpu-router:redact-pii.

`/zerogpu-router:redact-pii`

Detect PII and replace each span in-line with a [LABEL] placeholder. Use this before sharing or logging sensitive text, or before forwarding user input to another LLM that you don’t want to expose raw PII to.

Model: gliner-multi-pii-v1 (with mask: "label")
Wraps: zerogpu redact_pii
When Claude auto-invokes: “redact”, “scrub”, “mask”, “anonymize”, or “sanitize this for sharing.”

Example

/zerogpu-router:redact-pii "Email John Smith at [email protected] about invoice 12345."

Output

Email [PERSON] at [EMAIL] about invoice 12345.

Note that 12345 is not masked: only spans the model recognizes as PII are replaced. Domain-specific identifiers (account numbers, internal ticket IDs) should be handled by your own redaction layer or by extract-entities with custom labels.

`/zerogpu-router:extract-json`

Pull specific named fields out of free text into a structured JSON object, defined by a per-field schema. Each field is declared as name::type::description.

Model: gliner2-base-v1
Wraps: zerogpu extract_json
When Claude auto-invokes: “extract the contact info as JSON”, “parse this invoice”, “pull these fields out.”

Synopsis

/zerogpu-router:extract-json <text> -s '<json schema>'

Example

/zerogpu-router:extract-json "Reach Maria Lopez at [email protected] or 415-555-0188." \
  -s '{"contact":["name::str::Full name","email::str::Email address","phone::str::Phone number"]}'

Illustrative output

{
  "contact": {
    "name": "Maria Lopez",
    "email": "[email protected]",
    "phone": "415-555-0188"
  }
}

Skills reference table

Every skill at a glance.

Skill	Purpose	Example
`/zerogpu-router:signin`	Sign in and persist API key + Project ID (manual only)	`/zerogpu-router:signin`
`/zerogpu-router:status`	Show current sign-in status (manual only)	`/zerogpu-router:status`
`/zerogpu-router:chat <text>`	Short chat reply via `LFM2.5-1.2B-Instruct`	`/zerogpu-router:chat "Explain WebSockets in two sentences."`
`/zerogpu-router:chat-thinking <text>`	Chat with the Thinking variant (returns reasoning)	`/zerogpu-router:chat-thinking "If a train leaves at 3 PM going 60 mph..."`
`/zerogpu-router:summarize <text>`	Condense a longer passage into a short summary	`/zerogpu-router:summarize "The board met Thursday to review Q3 results..."`
`/zerogpu-router:classify-iab <text>`	IAB taxonomy classification	`/zerogpu-router:classify-iab "The Lakers signed a new point guard."`
`/zerogpu-router:classify-iab-enriched <text>`	IAB + topics/keywords/intent	`/zerogpu-router:classify-iab-enriched "Compare the Tesla Model Y..."`
`/zerogpu-router:classify-zero-shot <text> -l ...`	Zero-shot against custom labels	`/zerogpu-router:classify-zero-shot "fast laptop" -l positive -l negative`
`/zerogpu-router:classify-structured <text> -s '...'`	Schema-based multi-axis classification	`/zerogpu-router:classify-structured "ticket text" -s '{"sentiment":[...]}'`
`/zerogpu-router:extract-entities <text> -l ...`	Custom-label NER	`/zerogpu-router:extract-entities "Tim Cook met Sundar Pichai..." -l person -l location`
`/zerogpu-router:extract-pii <text>`	Extract PII entities (returns JSON)	`/zerogpu-router:extract-pii "Contact Jane at [email protected]"`
`/zerogpu-router:redact-pii <text>`	Mask PII in-line with `[LABEL]` placeholders	`/zerogpu-router:redact-pii "Email John at [email protected]"`
`/zerogpu-router:extract-json <text> -s '...'`	Schema-driven JSON extraction	`/zerogpu-router:extract-json "..." -s '{"contact":["name::str::Full name"]}'`

For full flag reference on any command, run zerogpu <command> --help outside the Claude Code session.

Patterns and recipes

Sanitize before Claude sees raw input. Pipe untrusted text through redact-pii first when you don’t want personal data captured in Claude’s transcript or sent to downstream LLMs. Combine with extract-pii if you also need an audit log of what was masked. Cheap router in front of Claude. Use classify-zero-shot or classify-structured to triage an incoming message (bug / feature / question, urgent / normal, in-scope / out-of-scope) and only escalate the hard cases to Claude itself. The classifier call costs orders of magnitude less than a Claude turn. Structured extraction over free-form parsing. When you have semi-structured text (signatures, invoices, contact blocks), prefer extract-json over asking Claude to “parse this into JSON.” It’s deterministic on the schema, faster, and cheaper. Keep field descriptions short and specific - the description is what the model uses to find the span. Confidence thresholds. For NER and PII extraction, the default thresholds (0.3 and 0.5 respectively) are tuned for recall. Raise -t to 0.6 or higher when you need precision (e.g. compliance-grade redaction lists); lower it when you’d rather over-extract and filter downstream.

Troubleshooting

zerogpu: command not found - the CLI isn’t installed or isn’t on your PATH. Run npm install -g zerogpu-cli and restart your shell. If you use a Node version manager (nvm, fnm, volta), make sure the shell that launched Claude Code has the same Node version active. Skill returns “You’re not signed in yet.” - no credentials on disk. Run /zerogpu-router:signin inside Claude Code, or zerogpu login in your terminal. Check /zerogpu-router:status to confirm. /zerogpu-router:* skills don’t appear in /help - the plugin isn’t enabled. Run /plugin to view installed plugins, enable zerogpu-router, then /reload-plugins. If it’s not listed at all, re-run /plugin marketplace add zerogpu/zerogpu-router followed by /plugin install zerogpu-router@zerogpu. Request failed with status 401 - your API key is missing, revoked, or mistyped. Rotate the key in the dashboard and re-run /zerogpu-router:signin. Keys must start with zgpu-api-. Request failed with status 403 - the key is valid but doesn’t have access to the project, or the project doesn’t have access to the requested model. Confirm your Project ID matches the project that owns the key. Request failed with status 429 - you’re being rate-limited. Back off and retry with exponential delay, or switch heavy workloads to the Batch API, which has separate quotas tuned for bulk jobs. Wrong skill auto-invoked for a request. Claude picks based on phrasing. If you want a specific skill, call it explicitly with /zerogpu-router:<name>. If a skill keeps getting picked when you don’t want it, rephrase to remove the trigger words (“redact”, “classify”, “extract”) or invoke the right one by name. Schema parsing errors on classify-structured / extract-json. The -s flag expects a single-quoted JSON string. On Windows PowerShell, escape inner double quotes or use a here-string. Run the CLI directly (zerogpu classify_structured ... -s '...') to isolate quoting issues from Claude Code. Empty or low-confidence results. Lower -t to surface more candidates, or check that the label set you supplied matches the language of the source text (the underlying models are English-tuned for most label sets). For very short inputs (one or two words), expect lower confidence across the board. /reload-plugins doesn’t pick up a new CLI version. Plugin reload only touches Claude Code state; it doesn’t reinstall the CLI. Run npm install -g zerogpu-cli@latest in your terminal, then zerogpu --version to confirm.

Conclusion

The zerogpu-router plugin turns ZeroGPU’s nano language models into first-class Claude Code skills, so Claude can hand off classification, extraction, and short chat tasks to a cheaper, faster model without you leaving the session. It’s a fast way to keep raw PII out of Claude’s context, cut token spend on well-defined NLP work, and prototype routing patterns you can later promote to production.

Model Catalog

Browse every model the plugin can route to.

API Reference

Explore the full OpenAI-compatible API surface.

Cookbook

Worked examples for classification, extraction, and batch jobs.

Join Discord

Ask questions and share what you’re building.

​Overview

​Video walkthrough

​Quickstart

​Prerequisites

​Get your ZeroGPU API key

​Install the ZeroGPU CLI

​Your first request

​Usage

​How auto-invocation works

​/zerogpu-router:signin

​/zerogpu-router:status

​/zerogpu-router:chat

​/zerogpu-router:chat-thinking

​/zerogpu-router:summarize

​/zerogpu-router:classify-iab

​/zerogpu-router:classify-iab-enriched

​/zerogpu-router:classify-zero-shot

​/zerogpu-router:classify-structured

​/zerogpu-router:extract-entities

​/zerogpu-router:extract-pii

​/zerogpu-router:redact-pii

​/zerogpu-router:extract-json

​Skills reference table

​Patterns and recipes

​Troubleshooting

​Conclusion

Model Catalog

API Reference

Cookbook

Join Discord

Overview

Video walkthrough

Quickstart

Prerequisites

Get your ZeroGPU API key

Install the ZeroGPU CLI

Your first request

Usage

How auto-invocation works

`/zerogpu-router:signin`

`/zerogpu-router:status`

`/zerogpu-router:chat`

`/zerogpu-router:chat-thinking`

`/zerogpu-router:summarize`

`/zerogpu-router:classify-iab`

`/zerogpu-router:classify-iab-enriched`

`/zerogpu-router:classify-zero-shot`

`/zerogpu-router:classify-structured`

`/zerogpu-router:extract-entities`

`/zerogpu-router:extract-pii`

`/zerogpu-router:redact-pii`

`/zerogpu-router:extract-json`

Skills reference table

Patterns and recipes

Troubleshooting

Conclusion