Errors & limits

Error envelope

Every error response has the same shape:

{
  "error": "INVALID_REQUEST",
  "message": "Human-readable explanation",
  "statusCode": 400
}

  • error - a stable, machine-readable code (switch on this).
  • message - a human-readable explanation (don't parse this).
  • statusCode - matches the HTTP status.
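As an illustration of switching on the code rather than the message, here is a minimal Python sketch. The ApiError type and the parse_error and describe helpers are hypothetical names invented for this example, not part of any official client; only the three envelope fields come from the shape shown above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiError:
    """One parsed error envelope (hypothetical helper type, not part of the API)."""
    code: str         # the stable "error" field
    message: str      # human-readable; log it, don't parse it
    status_code: int  # mirrors the HTTP status


def parse_error(body: str) -> ApiError:
    """Turn a JSON error body into an ApiError."""
    data = json.loads(body)
    return ApiError(data["error"], data["message"], data["statusCode"])


def describe(err: ApiError) -> str:
    """Branch on the machine-readable code, never on the message text."""
    if err.code == "RATE_LIMITED":
        return "slow down and retry"
    if err.code == "UNAUTHORIZED":
        return "check the API key"
    if err.code == "PAYMENT_REQUIRED":
        return "top up credit"
    return f"unhandled error code: {err.code}"


sample = '{"error": "INVALID_REQUEST", "message": "Human-readable explanation", "statusCode": 400}'
err = parse_error(sample)
print(err.code, err.status_code)
print(describe(err))
```

Keeping the message out of the control flow means its wording can change without breaking callers.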

Error codes

Code                 HTTP  Meaning
INVALID_REQUEST      400   Body failed validation, or a precondition is unmet (e.g. no vector store configured).
UNAUTHORIZED         401   Missing, malformed, or revoked API key.
PAYMENT_REQUIRED     402   Insufficient credit balance, or a card was declined.
FORBIDDEN            403   The key is missing the required scope.
ACCOUNT_TERMINATED   403   The account has been terminated.
NOT_FOUND            404   The resource doesn't exist (e.g. no vector store configured).
RATE_LIMITED         429   You exceeded your request rate.
SERVICE_UNAVAILABLE  503   A downstream dependency (model provider, vector store) failed transiently. Retry with backoff.

Failed requests are never billed

If generation fails, the request isn't charged. Up-front charges are refunded on failure, and usage-metered charges settle only on success.

Rate limiting

Requests are rate-limited per account. Defaults:

  • 60 requests/minute, with a short per-second burst allowance (roughly one-tenth of the per-minute limit, so about 6 requests in one second at the default). Your account may be provisioned with a higher limit.

Exceeding a window returns 429 RATE_LIMITED. Back off and retry.
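One way to avoid hitting the limit in the first place is to throttle on the client. The sketch below is a minimal sliding-window guard in Python, assuming the default of 60 requests per minute. The SlidingWindowThrottle class is a hypothetical helper, it does not model the per-second burst allowance, and the server's actual windowing algorithm is not documented here, so treat it as an approximation and still handle 429.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side guard allowing at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock    # injectable so the class can be tested without sleeping
        self.sent = deque()   # timestamps of calls still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before the next call fits (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest call must leave the window before another one fits.
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())


throttle = SlidingWindowThrottle()
delay = throttle.wait_time()
if delay > 0:
    time.sleep(delay)
throttle.record()  # call this right after sending each request
```

If your account is provisioned with a higher limit, pass it as `limit` instead of relying on the default.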

Retrying safely

  • 429 and 503 are transient - retry with exponential backoff.
  • 400, 401, 403, and 404 are not transient - fix the request or the key; retrying won't help.
  • 402 means you're out of credit (or a card was declined) - top up in the dashboard before retrying.
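The rules above can be sketched as a small retry loop. This is a minimal Python example under stated assumptions: `send` is a hypothetical zero-argument callable that performs one request and returns a (status_code, body) pair, and the base delay, cap, attempt count, and jitter strategy are illustrative choices, not values prescribed by the API.

```python
import random
import time

RETRYABLE = {429, 503}  # the transient statuses listed above


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential delay for a 0-based attempt number, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep, jitter=random.random):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) pair seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            # Success, or an error such as 400/401/402/403 that retrying won't fix.
            return status, body
        if attempt < max_attempts - 1:
            # Jitter: sleep a random fraction of the exponential delay.
            sleep(backoff_delay(attempt) * jitter())
    return status, body
```

A 402 falls through to the caller immediately, which is the point at which to top up credit rather than retry. The random jitter spreads out retries from many clients so they don't all return at the same instant.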