Errors can happen before a response starts or after a streaming response is already in progress.
Your client should handle both cases.
Standard error shape
All APIs use normal HTTP status codes for non-streaming failures, with error payloads that include a machine-readable type and a human-readable message.
Typical JSON error:
{
  "error": {
    "type": "invalid_request_error",
    "message": "Human-readable description"
  }
}
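A minimal sketch of reading this shape on the client side, assuming the body arrives as a string alongside the HTTP status. The `ApiError` class, the `parse_error_body` helper, and the `unknown_error` fallback label are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass
class ApiError:
    status: int
    type: str
    message: str


def parse_error_body(status: int, body: str) -> ApiError:
    """Extract error.type and error.message, tolerating unexpected bodies."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (for example an HTML page from a proxy).
        # "unknown_error" is a hypothetical fallback label.
        return ApiError(status, "unknown_error", body[:200])

    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return ApiError(status, "unknown_error", body[:200])

    return ApiError(
        status,
        str(err.get("type", "unknown_error")),
        str(err.get("message", "")),
    )
```

Falling back to a generic type instead of raising keeps the caller's branching logic in one place even when an intermediary returns a non-JSON body.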
Common status codes
| Code | Meaning | What to do |
|---|---|---|
| 400 | Invalid request or unsupported field combination | Fix the request before retrying |
| 401 | Missing or invalid API or provisioning key | Check which key type the endpoint expects |
| 402 | Insufficient credits | Top up balance or switch to an allowed model |
| 403 | Access blocked, denied, or filtered | Inspect the error body and request context |
| 429 | Rate limiting or abuse protection | Retry with backoff |
| 500 / 503 | Internal or upstream failure | Retry with capped backoff |
Streaming errors
Streaming APIs can fail after the server has already sent 200 OK. In that case, the error arrives inside the stream instead of as a normal JSON body.
A streamed 200 OK only means the connection started successfully. It does
not guarantee the model finished successfully. Keep parsing the stream until
you see the real terminal event or error frame.
Clients should parse frames structurally instead of blindly appending text.
| API | Late-stream failure pattern |
|---|---|
| Responses | error event, often followed by response.failed, then [DONE] |
| Chat Completions | protocol-native streamed error payload |
| Messages | Anthropic-style streamed error payload |
Example mid-stream error payload:
{
  "error": {
    "type": "inappropriate_content",
    "message": "We got a bad response from the source. Status 403. Error message: Unable to show the generated image."
  }
}
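The following sketch shows one way to parse frames structurally and stop at the first error frame. It makes several assumptions: the stream is delivered as server-sent-event style `data:` lines, an error frame carries a top-level `error` object like the payload above, and `[DONE]` is the terminal marker. The `StreamError` class, the `iter_frames` helper, and the `stream_truncated` label are hypothetical. Each API's real terminal event may differ, so treat this as a starting point.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an error frame appears inside an already-started stream."""

    def __init__(self, type_: str, message: str):
        super().__init__(f"{type_}: {message}")
        self.type = type_
        self.message = message


def iter_frames(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed data frames; raise StreamError on an error frame."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # assumed terminal marker: the stream finished cleanly
        frame = json.loads(data)
        err = frame.get("error") if isinstance(frame, dict) else None
        if isinstance(err, dict):
            # Stop normal processing as soon as an error frame appears.
            raise StreamError(
                str(err.get("type", "unknown_error")),
                str(err.get("message", "")),
            )
        yield frame
    # The connection closed without a terminal marker, so do not report success.
    raise StreamError("stream_truncated", "stream ended without a terminal marker")
```

A caller would wrap `for frame in iter_frames(response_lines): ...` in a `try`/`except StreamError` block, keeping any partial output it already received separate from the failure.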
Retry guidance
- do not blindly retry 400 or 401
- retry 429, 500, and 503 with exponential backoff
- keep retries bounded and log the final failure
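The guidance above can be sketched as a small retry wrapper. The `call_with_retries` name, the `send` callable returning `(status, body)`, the default attempt count and delays, and the added jitter are all assumptions chosen for illustration, not values the API prescribes.

```python
import logging
import random
import time
from typing import Callable, Tuple

log = logging.getLogger("client.retry")

# Status codes the guidance above marks as retryable.
RETRYABLE = {429, 500, 503}


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE:
            # Success, or an error such as 400/401 that a retry will not fix.
            return status, body
        if attempt < max_attempts:
            # Exponential backoff, capped, plus jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    log.error("giving up after %d attempts, last status %d", max_attempts, status)
    return status, body
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting. The final response is returned, not swallowed, so the caller can still inspect the error body.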
Practical advice
- always branch on the machine-readable error.type
- log unknown error payloads before normalizing them away
- stop normal stream processing as soon as an error frame appears
- keep HTTP-level errors and late-stream errors in the same client error model when possible
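One possible way to keep both error paths in a single model is sketched below. `ClientError`, `normalize`, the `source` field, and the `unknown_error` fallback are hypothetical names. The set of known types contains only the two values that appear in the examples on this page, so a real client would extend it.

```python
import logging
from dataclasses import dataclass
from typing import Any, Optional

log = logging.getLogger("client.errors")

# Only the types shown in this page's examples; extend as needed.
KNOWN_TYPES = {"invalid_request_error", "inappropriate_content"}


@dataclass
class ClientError:
    source: str             # "http" for status-code errors, "stream" for late-stream errors
    status: Optional[int]   # None when the error arrived after 200 OK
    type: str
    message: str


def normalize(payload: Any, source: str, status: Optional[int] = None) -> ClientError:
    """Map an HTTP error body or a stream error frame onto one error model."""
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict) or err.get("type") not in KNOWN_TYPES:
        # Log the raw payload before it is normalized away.
        log.warning("unrecognized error payload from %s: %r", source, payload)
    if not isinstance(err, dict):
        err = {}
    return ClientError(
        source,
        status,
        str(err.get("type", "unknown_error")),
        str(err.get("message", "")),
    )
```

Callers can then branch on `ClientError.type` in one place, regardless of whether the failure came from a status code or from a frame inside the stream.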