Support Matrix
| API | Enable it with | Main text delta | Main tool delta | Terminal signal |
|---|---|---|---|---|
| Responses | `stream: true` | `response.output_text.delta` | `response.function_call_arguments.delta` | `response.completed` and `[DONE]` |
| Chat Completions | `stream: true` | `choices[0].delta.content` | `choices[0].delta.tool_calls` | final chunk with `finish_reason` |
| Messages | `stream: true` | `content_block_delta` with `text_delta` | `content_block_delta` with `input_json_delta` | `message_delta` and `message_stop` |
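The field paths in the matrix can be illustrated with one sample payload per API. The dicts below are hand-written in the documented shapes, not captured from real responses; a minimal sketch:

```python
# Illustrative streaming payloads, one per API, hand-written in the shapes
# the matrix above describes (not real captured API output).
responses_event = {"type": "response.output_text.delta", "delta": "Hi"}
chat_chunk = {"choices": [{"delta": {"content": "Hi"}, "finish_reason": None}]}
messages_event = {
    "type": "content_block_delta",
    "delta": {"type": "text_delta", "text": "Hi"},
}

def text_delta(api, payload):
    """Pull the main text delta out of one streaming payload, per API."""
    if api == "responses":
        if payload.get("type") == "response.output_text.delta":
            return payload["delta"]
    elif api == "chat":
        return payload["choices"][0]["delta"].get("content")
    elif api == "messages":
        if payload.get("type") == "content_block_delta":
            delta = payload["delta"]
            if delta.get("type") == "text_delta":
                return delta["text"]
    return None  # not a text delta for this API
```

A real client would apply the same field paths to each parsed frame as it arrives instead of to prebuilt dicts.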
When To Use It
- lower perceived latency in chat and assistant UIs
- show long answers as they are generated
- react to tool-call arguments before the final answer finishes
Recommended Example
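A minimal consumer loop, sketched offline against hardcoded sample events in the Responses shape (`stream: true`). The event names come from the matrix above; the payload contents are illustrative:

```python
# Sample events in the Responses streaming shape; the text fragments and
# ordering are illustrative, not real API output.
sample_stream = [
    {"type": "response.output_text.delta", "delta": "Hello"},
    {"type": "response.output_text.delta", "delta": ", world"},
    {"type": "response.completed"},
]

def consume(events):
    parts = []
    for event in events:
        if event["type"] == "response.output_text.delta":
            parts.append(event["delta"])   # render each fragment as it arrives
        elif event["type"] == "response.completed":
            break                          # terminal signal for Responses streams
    return "".join(parts)
```

In a live client the loop body is the same; only the event source changes from a list to the SDK's stream iterator.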
Protocol Differences
Responses streams named lifecycle events and typed deltas. Chat Completions streams OpenAI-compatible `chat.completion.chunk` payloads. Messages streams Anthropic-style event names such as `content_block_delta` and `message_stop`.
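For the Chat Completions style, accumulation means concatenating `choices[0].delta` fragments until a chunk carries a `finish_reason`. A sketch with hand-written sample chunks in that shape:

```python
# Hand-written samples in the chat.completion.chunk shape (illustrative only).
chunks = [
    {"object": "chat.completion.chunk",
     "choices": [{"delta": {"content": "Hi"}, "finish_reason": None}]},
    {"object": "chat.completion.chunk",
     "choices": [{"delta": {"content": " there"}, "finish_reason": None}]},
    {"object": "chat.completion.chunk",
     "choices": [{"delta": {}, "finish_reason": "stop"}]},
]

def collect(chunks):
    parts, finish_reason = [], None
    for chunk in chunks:
        choice = chunk["choices"][0]
        piece = choice["delta"].get("content")
        if piece:
            parts.append(piece)
        if choice["finish_reason"] is not None:
            finish_reason = choice["finish_reason"]  # final chunk carries it
    return "".join(parts), finish_reason
```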
Protocol Examples
- Responses streams semantic events, so you usually branch on `event.type`
- Chat Completions streams `chat.completion.chunk` payloads, so you read `choices[0].delta` and stop at a non-null `finish_reason`
- Messages streams named events, so you branch on `type` values such as `content_block_delta` and `message_stop`

Client Checklist
- parse the stream structurally instead of treating it as plain text
- stop normal stream processing if an error frame appears
- expect tool and reasoning deltas to be interleaved with text on some models
- if you need usage data, check the API-specific streaming docs for how it is delivered
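The checklist can be sketched against an Anthropic-style Messages stream. The event lines below are hand-written samples in that shape, including an error frame; the parsing and stop logic is what the checklist prescribes:

```python
import json

def run_stream(lines):
    parts = []
    for line in lines:
        event = json.loads(line)  # parse structurally, not as plain text
        etype = event.get("type")
        if etype == "error":
            # an error frame stops normal stream processing
            raise RuntimeError(event["error"]["message"])
        if etype == "content_block_delta" and event["delta"].get("type") == "text_delta":
            parts.append(event["delta"]["text"])
        if etype == "message_stop":
            break  # terminal signal
    return "".join(parts)

# Hand-written sample events (illustrative, not real API output).
ok = [
    '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "par"}}',
    '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "sed"}}',
    '{"type": "message_stop"}',
]
```

Tool and reasoning deltas would add extra `type` branches to the same loop; unknown event types should be skipped, not treated as failures.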