POST /v1/responses

Send a list of input messages to an AI model and receive a generated response.

Request headers

| Header | Type | Required | Description |
|---|---|---|---|
| `x-api-key` | string | Yes | Your ZeroGPU API key |
| `x-project-id` | string | Yes | Your project UUID |
| `content-type` | string | Yes | Must be `application/json` |

Request body

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | The model identifier (available from your dashboard) |
| `input` | array | Yes | Array of input message objects |
| `text` | object | No | Response format configuration |

Input message object

| Field | Type | Description |
|---|---|---|
| `role` | string | The role of the message author: `user` or `system` |
| `content` | string | The content of the message |

Text format object

| Field | Type | Description |
|---|---|---|
| `text.format.type` | string | Response format type (e.g., `text`) |

Example request

```shell
curl --location 'https://api.zerogpu.ai/v1/responses' \
  --header 'content-type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --header 'x-project-id: YOUR_PROJECT_ID' \
  --data '{
    "model": "YOUR_MODEL",
    "input": [
      {
        "role": "user",
        "content": "Your input text here..."
      }
    ],
    "text": {
      "format": {
        "type": "text"
      }
    }
  }'
```
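The same request can be sketched in Python using only the standard library. The `build_request` helper below is hypothetical (not part of any official SDK); it just assembles the headers and JSON body shown in the curl example.

```python
import json

API_URL = "https://api.zerogpu.ai/v1/responses"

def build_request(api_key, project_id, model, user_text):
    """Assemble the headers and JSON body for POST /v1/responses,
    mirroring the curl example above."""
    headers = {
        "content-type": "application/json",
        "x-api-key": api_key,
        "x-project-id": project_id,
    }
    body = {
        "model": model,
        "input": [{"role": "user", "content": user_text}],
        "text": {"format": {"type": "text"}},
    }
    return headers, json.dumps(body)

# To actually send it (requires network access and valid credentials):
# import urllib.request
# headers, data = build_request("YOUR_API_KEY", "YOUR_PROJECT_ID",
#                               "YOUR_MODEL", "Your input text here...")
# req = urllib.request.Request(API_URL, data=data.encode("utf-8"),
#                              headers=headers, method="POST")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```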

Example response

```json
{
  "id": "resp_abc123",
  "object": "response",
  "created": 1710000000,
  "model": "your-selected-model",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The generated response from the model..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 24,
    "output_tokens": 32,
    "total_tokens": 56
  }
}
```

Response fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique identifier for the response |
| `object` | string | Object type (`response`) |
| `created` | integer | Unix timestamp of when the response was created |
| `model` | string | The model used for inference |
| `output` | array | Array of output message objects |
| `output[].role` | string | Always `assistant` |
| `output[].content[].text` | string | The generated text response |
| `usage` | object | Token usage statistics |
| `usage.input_tokens` | integer | Number of tokens in the input |
| `usage.output_tokens` | integer | Number of tokens generated |
| `usage.total_tokens` | integer | Total tokens consumed |
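Since `output` is an array of messages, each with its own `content` array, extracting the generated text takes two nested loops. The `extract_text` helper below is a hypothetical sketch based on the response shape documented above, not an SDK function.

```python
def extract_text(response: dict) -> str:
    """Join every output_text part from a /v1/responses payload,
    walking output[] -> content[] as documented above."""
    parts = []
    for message in response.get("output", []):
        if message.get("type") != "message":
            continue
        for part in message.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)
```

Skipping entries whose `type` is not `message` or `output_text` keeps the helper robust if future API versions add other entry types to these arrays.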

Context length

Each model has a maximum input token limit. If your input exceeds it:
  • The API may return `400` with `error.code` set to `context_length_exceeded` when the model is configured to reject over-length input.
  • Otherwise the input may be silently truncated to the limit, and `usage.input_tokens` in the response will reflect only the truncated input.
Keep requests within the model's token limit, or handle both the rejection error and silent truncation in your client.
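Both outcomes above can be detected client-side. These helpers are a hypothetical sketch: they assume the error body carries the documented `error.code` field, and the truncation check assumes you can count your input tokens yourself (the docs do not specify a client-side tokenizer, so treat the count as approximate).

```python
def is_context_length_error(payload: dict) -> bool:
    """True when the API rejected the request because the input
    exceeded the model's token limit."""
    error = payload.get("error") or {}
    return error.get("code") == "context_length_exceeded"

def was_truncated(payload: dict, sent_tokens: int) -> bool:
    """Heuristic truncation check: the model billed fewer input tokens
    than you sent, suggesting the input was clipped to the limit.
    sent_tokens must come from your own (approximate) token count."""
    usage = payload.get("usage") or {}
    return usage.get("input_tokens", sent_tokens) < sent_tokens
```

A client would typically call `is_context_length_error` on a non-2xx response body and `was_truncated` on a successful one, then shorten the input and retry.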