POST /responses
deberta-v3-small: Responses
curl --request POST \
  --url https://api.zerogpu.ai/v1/responses \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --header 'x-project-id: <project-id>' \
  --data '
{
  "input": "Apple is expected to unveil its next-generation M5 chip at WWDC this June, promising a 40% boost in GPU performance and a new dedicated AI core for on-device machine learning tasks.",
  "model": "deberta-v3-small"
}
'
{}

Microsoft’s DeBERTa-v3-small uses a disentangled attention mechanism that encodes content and position separately, which gives it an edge in understanding context over models of similar size. With ELECTRA-style pre-training and gradient-disentangled embedding sharing, it punches well above its weight on NLU benchmarks, outperforming many frontier models on classification and inference tasks. It’s one of the most efficient classifiers you can run at the edge, and it is ideal for high-throughput classification workloads where every millisecond and every cent matter.
References: Model docs, Terms, Privacy

Specifications

| Property | Value |
| --- | --- |
| Model ID | deberta-v3-small |
| Task | Text Classification |
| Type | nli-deberta-v3 |
| Parameters | 142M |
| Version | 1 |
| Max Tokens | 400 |
| Provider | Microsoft |
| Input Price | $0.05 / 1M |
| Output Price | $0.40 / 1M |
| Total Price | $0.45 / 1M |
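
As a rough illustration of how the listed prices combine, the following Python sketch estimates the cost of a workload. It assumes that "/ 1M" means "per one million tokens" and that input and output are billed separately at the listed rates; neither is confirmed on this page. `estimate_cost_usd` is a hypothetical helper name.

```python
from decimal import Decimal

# Prices copied from the Specifications table. Reading "/ 1M" as
# "per one million tokens" is an assumption.
INPUT_PRICE_PER_1M = Decimal("0.05")
OUTPUT_PRICE_PER_1M = Decimal("0.40")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in USD for the given token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_PRICE_PER_1M * input_tokens / ONE_MILLION
    output_cost = OUTPUT_PRICE_PER_1M * output_tokens / ONE_MILLION
    return input_cost + output_cost
```

Under these assumptions, one million input tokens plus one million output tokens comes to $0.45, which matches the Total Price row.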

Try it

Send a live request with your x-api-key and x-project-id. The model is fixed to deberta-v3-small. Use the request examples below to switch between use cases (JSON extraction, NER, PII, and so on).

Authorizations

x-api-key: string, header, required
x-project-id: string, header, required
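
A minimal Python sketch of assembling the two required headers follows. `auth_headers` is a hypothetical helper name, and the `Content-Type` value is taken from the curl example at the top of this page.

```python
def auth_headers(api_key: str, project_id: str) -> dict[str, str]:
    """Build the request headers (hypothetical helper, not part of any SDK)."""
    # Both headers are marked as required, so reject empty values early.
    if not api_key or not project_id:
        raise ValueError("x-api-key and x-project-id are both required")
    return {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        "x-project-id": project_id,
    }
```

Reading the two values from environment variables or a secrets manager, rather than hard-coding them, keeps credentials out of source control.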

Body

Content type: application/json

input: string, required
Multi-line text or document content to send to the model. Required string length: 1 - 131072

instructions: string

metadata: object
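
The body fields above can be assembled and checked before sending, as in this minimal Python sketch. `build_body` is a hypothetical helper; the `model` field is taken from the curl example, and treating the 1 to 131072 length limit as a character count (rather than bytes or tokens) is an assumption.

```python
import json
from typing import Any, Optional

MODEL_ID = "deberta-v3-small"
MAX_INPUT_LENGTH = 131072  # assumed to count characters


def build_body(
    input_text: str,
    instructions: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> bytes:
    """Serialize the JSON request body (hypothetical helper)."""
    if not 1 <= len(input_text) <= MAX_INPUT_LENGTH:
        raise ValueError("input must be between 1 and 131072 characters long")
    payload: dict[str, Any] = {"model": MODEL_ID, "input": input_text}
    # Optional fields are only included when the caller provides them.
    if instructions is not None:
        payload["instructions"] = instructions
    if metadata is not None:
        payload["metadata"] = metadata
    return json.dumps(payload).encode("utf-8")
```

Validating the length locally avoids a round trip for requests that would presumably be rejected by the API.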

Response

Success

The response is of type object.
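
Since this page documents only that the response is a JSON object, a client can safely check that much and no more. The Python sketch below uses only the standard library; `parse_response` and `post_responses` are hypothetical helper names, the URL is taken from the curl example, and no particular response fields are assumed.

```python
import json
import urllib.request

RESPONSES_URL = "https://api.zerogpu.ai/v1/responses"


def parse_response(raw: bytes) -> dict:
    """Decode the response body and confirm it is a JSON object."""
    parsed = json.loads(raw.decode("utf-8"))
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object in the response body")
    return parsed


def post_responses(headers: dict, body: bytes, timeout: float = 30.0) -> dict:
    """POST a prepared JSON body with the given headers and parse the reply."""
    request = urllib.request.Request(
        RESPONSES_URL, data=body, headers=headers, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for non-2xx status codes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())
```

Keeping parsing separate from the network call makes the parsing step easy to test without credentials.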