stream: true to receive chat.completion.chunk payloads over SSE.
Use this when your client already expects chat chunks and incremental deltas rather than Responses-style semantic events.
Request
Usestream_options.include_usage when you want the final usage trailer.
Chunk Shape
The first chunk usually establishes the assistant role:choices[0].delta.content.
What to listen for
choices[0].delta.rolefor the initial assistant rolechoices[0].delta.contentfor text deltaschoices[0].delta.tool_callsfor tool-call deltasfinish_reasonon the terminal chunk
Tool Call Deltas
Tool calls stream throughchoices[0].delta.tool_calls.
Final Chunks
When generation finishes, the stream ends with a chunk whosefinish_reason is set.
If stream_options.include_usage is true, the stream can then include a usage trailer with empty choices:
Error Behavior
If a failure happens after headers are sent, the stream can end with a payload that contains a top-levelerror object.
Common mistakes
- assuming all chunks contain text
- forgetting to enable
stream_options.include_usagewhen you need final usage data - treating chat chunks like Responses semantic events