Errors & limits
Error envelope
Every error response has the same shape:
{
  "error": "INVALID_REQUEST",
  "message": "Human-readable explanation",
  "statusCode": 400
}
- error - a stable, machine-readable code (switch on this).
- message - a human-readable explanation (don't parse this).
- statusCode - matches the HTTP status.
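As an illustration of branching on the code rather than the message, here is a minimal Python sketch. The ApiError class and the parse_error and describe helpers are hypothetical names chosen for this example and are not part of any official SDK. Only the three field names come from the envelope above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiError:
    """Hypothetical container for the documented error envelope."""
    code: str         # the "error" field: stable, machine-readable
    message: str      # the "message" field: for humans and logs only
    status_code: int  # the "statusCode" field: mirrors the HTTP status


def parse_error(body: str) -> ApiError:
    """Parse an error envelope from a JSON response body."""
    data = json.loads(body)
    return ApiError(
        code=data["error"],
        message=data["message"],
        status_code=data["statusCode"],
    )


def describe(err: ApiError) -> str:
    """Branch on the stable code, never on the message text."""
    if err.code == "RATE_LIMITED":
        return "slow down and retry"
    if err.code == "UNAUTHORIZED":
        return "check the API key"
    # Unknown or unhandled codes: surface the message for a human to read.
    return f"unhandled error {err.code}: {err.message}"
```

The message is only ever logged or displayed here; all control flow depends on the code.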
Error codes
| Code | HTTP | Meaning |
|---|---|---|
| INVALID_REQUEST | 400 | Body failed validation, or a precondition is unmet (e.g. no vector store configured). |
| UNAUTHORIZED | 401 | Missing, malformed, or revoked API key. |
| PAYMENT_REQUIRED | 402 | Insufficient credit balance, or a card was declined. |
| FORBIDDEN | 403 | The key is missing the required scope. |
| ACCOUNT_TERMINATED | 403 | The account has been terminated. |
| NOT_FOUND | 404 | The resource doesn't exist (e.g. no vector store configured). |
| RATE_LIMITED | 429 | You exceeded your request rate. |
| SERVICE_UNAVAILABLE | 503 | A downstream dependency (model provider, vector store) failed transiently. Retry with backoff. |
Failed requests are never billed
If generation fails, the request isn't charged. Up-front charges are refunded on failure, and usage-metered charges settle only on success.
Rate limiting
Requests are rate-limited per account. Defaults:
- 60 requests/minute, with a short per-second burst allowance (roughly one-tenth of the per-minute limit). Your account may be provisioned with a higher limit.
Exceeding a window returns 429 RATE_LIMITED. Back off and retry.
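One way to stay under the default limits is to pace requests on the client. The sketch below is a sliding-window limiter that assumes the defaults above: 60 requests in any 60 seconds, and a burst of 6 in any one second (a reading of "roughly one-tenth" that may not match the server exactly). The WindowLimiter class is a hypothetical helper, and the server's actual algorithm is not specified here, so a 429 is still possible and should still be handled.

```python
import time
from collections import deque


class WindowLimiter:
    """Client-side sliding-window limiter (illustrative, not the server's algorithm).

    Allows at most per_minute requests in any 60 s window and at most
    per_second requests in any 1 s window.
    """

    def __init__(self, per_minute=60, per_second=6, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_second = per_second
        self.clock = clock
        self.sent = deque()  # timestamps of requests sent in the last 60 s

    def wait_time(self) -> float:
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the 60 s window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        wait = 0.0
        if len(self.sent) >= self.per_minute:
            # The per_minute-th most recent request must age out of the window.
            wait = max(wait, self.sent[-self.per_minute] + 60.0 - now)
        if len(self.sent) >= self.per_second:
            # Same idea for the 1 s burst window.
            burst_start = self.sent[-self.per_second]
            if now - burst_start < 1.0:
                wait = max(wait, burst_start + 1.0 - now)
        return wait

    def record(self) -> None:
        """Record that a request is being sent now."""
        self.sent.append(self.clock())
```

Before each request, a caller would run `time.sleep(limiter.wait_time())` and then `limiter.record()`. If your account is provisioned with a higher limit, pass that limit to the constructor.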
Retrying safely
- 429 and 503 are transient - retry with exponential backoff.
- 400, 401, 403 are not transient - fix the request or the key; retrying won't help.
- 402 means you're out of credit (or a card was declined) - top up in the dashboard before retrying.
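The retry rules above can be sketched as a small wrapper. This is a minimal example under stated assumptions: send is a hypothetical callable that performs one request and returns the HTTP status and parsed body. The base delay, cap, attempt count, and use of full jitter are illustrative choices, not values prescribed by the API.

```python
import random
import time
from typing import Callable, Tuple

# Per the rules above, only these statuses are worth retrying.
TRANSIENT = {429, 503}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-transient status or attempts run out.

    Non-transient statuses (400, 401, 402, 403, ...) are returned at once,
    because retrying them unchanged cannot succeed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if status not in TRANSIENT or last_attempt:
            return status, body
        sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")  # the loop always returns
```

The jitter spreads retries from many clients over time so they do not all hit the service at the same instant after an outage. A 402 comes straight back to the caller, which can then prompt for a top-up instead of looping.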