Errors can happen before a response starts or after a streaming response is already in progress. Your client should handle both cases.

Standard error shape

All APIs use normal HTTP status codes for non-streaming failures, with error payloads that include a machine-readable type and a human-readable message. Typical JSON error:
{
  "error": {
    "type": "invalid_request_error",
    "message": "Human-readable description"
  }
}
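
As a minimal sketch of reading this shape defensively, the helper below pulls the type and message out of a raw response body. The function name `parse_error_body` and the fallback value `"unknown_error"` are illustrative assumptions, not part of the API.

```python
import json
from typing import Tuple


def parse_error_body(body: str) -> Tuple[str, str]:
    """Return (type, message) from an error body, tolerating unexpected shapes."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        # Body did not match the documented shape (for example, an HTML page
        # from a proxy). "unknown_error" is a client-side placeholder.
        return "unknown_error", body[:200]
    return str(err.get("type", "unknown_error")), str(err.get("message", ""))
```

Returning a placeholder type instead of raising keeps callers on a single code path even when an intermediary replaces the JSON body.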

Common status codes

| Code | Meaning | What to do |
| --- | --- | --- |
| 400 | Invalid request or unsupported field combination | Fix the request before retrying |
| 401 | Missing or invalid API or provisioning key | Check which key type the endpoint expects |
| 402 | Insufficient credits | Top up balance or switch to an allowed model |
| 403 | Access blocked, denied, or filtered | Inspect the error body and request context |
| 429 | Rate limiting or abuse protection | Retry with backoff |
| 500 / 503 | Internal or upstream failure | Retry with capped backoff |

Streaming errors

Streaming APIs can fail after the server has already sent 200 OK. In that case, the error arrives inside the stream instead of as a normal JSON body.
A streamed 200 OK only means the connection started successfully. It does not guarantee the model finished successfully. Keep parsing the stream until you see the real terminal event or error frame.
Clients should parse frames structurally instead of blindly appending text.

| API | Late-stream failure pattern |
| --- | --- |
| Responses | error event, often followed by response.failed, then [DONE] |
| Chat Completions | protocol-native streamed error payload |
| Messages | Anthropic-style streamed error payload |

Example mid-stream error payload:
{
  "error": {
    "type": "inappropriate_content",
    "message": "We got a bad response from the source. Status 403. Error message: Unable to show the generated image."
  }
}
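
The sketch below shows one way to parse frames structurally. It rests on several assumptions that are not confirmed above: the stream is delivered as server-sent-event `data:` lines, each frame is a JSON object, an error frame carries a top-level `error` object like the payload shown, and the stream ends with `[DONE]`. The `delta` field and the `incomplete_stream` type are hypothetical placeholders; real frame shapes differ per API.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.message = message


def iter_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text chunks from SSE lines; raise StreamError on an error frame."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and "event:" fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # assumed terminal marker
        frame = json.loads(data)
        if not isinstance(frame, dict):
            continue
        err = frame.get("error")
        if isinstance(err, dict):
            # Stop normal processing as soon as an error frame appears.
            raise StreamError(str(err.get("type", "unknown_error")),
                              str(err.get("message", "")))
        # "delta" is a hypothetical field name for the text chunk.
        if isinstance(frame.get("delta"), str):
            yield frame["delta"]
    # The input ran out without a terminal marker. "incomplete_stream" is a
    # client-side label, not an API error type.
    raise StreamError("incomplete_stream", "stream ended without a terminal event")
```

Because the generator raises rather than yielding the error text, a caller that concatenates chunks can never mistake an error message for model output.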

Retry guidance

  • do not blindly retry 400 or 401
  • retry 429, 500, and 503 with exponential backoff
  • keep retries bounded and log the final failure
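
The rules above could be sketched as a small retry wrapper. Everything named here is illustrative: `send` stands in for whatever function performs the request and returns a `(status, body)` pair, and the defaults for `max_attempts`, `base_delay`, and `max_delay` are arbitrary. The jitter is an addition that the guidance above does not require.

```python
import logging
import random
import time
from typing import Callable, Tuple

log = logging.getLogger("api_client")

# Status codes the guidance above treats as retryable.
RETRYABLE = {429, 500, 503}


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE:
            # Success, or an error such as 400/401 that a retry will not fix.
            return status, body
        if attempt < max_attempts:
            # Exponential backoff capped at max_delay, with jitter so that
            # many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
    log.error("giving up after %d attempts, last status %d", max_attempts, status)
    return status, body
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays. A non-retryable status is returned to the caller on the first attempt.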

Practical advice

  • always branch on the machine-readable error.type
  • log unknown error payloads before normalizing them away
  • stop normal stream processing as soon as an error frame appears
  • keep HTTP-level errors and late-stream errors in the same client error model when possible
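
One possible shape for a shared client error model is sketched below. `ClientError`, `normalize`, and `describe` are hypothetical names. The sketch assumes both HTTP error bodies and late-stream error frames have already been decoded into Python objects, and that a `status` of `None` marks an error that arrived mid-stream.

```python
import logging
from dataclasses import dataclass
from typing import Any, Optional

log = logging.getLogger("api_client")


@dataclass(frozen=True)
class ClientError:
    error_type: str
    message: str
    status: Optional[int]  # None when the error arrived inside a stream


def normalize(payload: Any, status: Optional[int] = None) -> ClientError:
    """Map an HTTP error body or a stream error frame onto one error type."""
    err = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(err, dict) and isinstance(err.get("type"), str):
        return ClientError(err["type"], str(err.get("message", "")), status)
    # Log the raw payload before it is normalized away.
    log.warning("unrecognized error payload (status=%s): %r", status, payload)
    return ClientError("unknown_error", "", status)


def describe(err: ClientError) -> str:
    """Branch on the machine-readable type, not on the message text."""
    if err.error_type == "invalid_request_error":
        return "fix the request before retrying"
    if err.error_type == "inappropriate_content":
        return "content was blocked or filtered"
    return "unhandled error type"
```

Only the two error types that appear in the examples above are handled explicitly; a real client would extend the branches as it encounters more.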