Use streaming when you want tokens, tool arguments, or structured events to arrive before the full response is finished. NagaAI supports streaming across the main generation APIs, but each surface uses a different event format.
Support Matrix
| API | Enable it with | Main text delta | Main tool delta | Terminal signal |
|---|---|---|---|---|
| Responses | `stream: true` | `response.output_text.delta` | `response.function_call_arguments.delta` | `response.completed` and `[DONE]` |
| Chat Completions | `stream: true` | `choices[0].delta.content` | `choices[0].delta.tool_calls` | final chunk with `finish_reason` |
| Messages | `stream: true` | `content_block_delta` with `text_delta` | `content_block_delta` with `input_json_delta` | `message_delta` and `message_stop` |
When To Use It
- lower perceived latency in chat and assistant UIs
- show long answers as they are generated
- react to tool-call arguments before the final answer finishes
Recommended Example
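A minimal sketch of consuming a Responses stream, using the event names from the support matrix. It assumes the events have already been decoded into JSON objects with a `type` field, and that each delta event carries its new fragment in a `delta` field (as in the OpenAI Responses format; treat this as an assumption for NagaAI). The helper name `collect_response` is hypothetical, and the sketch tracks a single tool call only.

```python
from typing import Iterable


def collect_response(events: Iterable[dict]) -> dict:
    """Accumulate text and tool-call arguments from Responses-style events."""
    text_parts: list[str] = []
    tool_args_parts: list[str] = []
    completed = False

    for event in events:
        event_type = event.get("type")
        if event_type == "response.output_text.delta":
            # Assumed: the new text fragment is in the "delta" field.
            text_parts.append(event.get("delta", ""))
        elif event_type == "response.function_call_arguments.delta":
            # Tool arguments arrive as partial JSON text; join before parsing.
            tool_args_parts.append(event.get("delta", ""))
        elif event_type == "response.completed":
            # Terminal signal for this surface: stop reading.
            completed = True
            break
        # Other lifecycle events are ignored in this sketch.

    return {
        "text": "".join(text_parts),
        "tool_arguments": "".join(tool_args_parts),
        "completed": completed,
    }


# Hand-written sample events standing in for a decoded stream.
sample_events = [
    {"type": "response.created"},
    {"type": "response.output_text.delta", "delta": "Hel"},
    {"type": "response.output_text.delta", "delta": "lo"},
    {"type": "response.completed"},
]
result = collect_response(sample_events)
print(result["text"])  # Hello
```

In a UI you would render each text fragment as it arrives instead of joining at the end; the accumulation here only keeps the sketch easy to check.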
Protocol Differences
- Responses streams named lifecycle events and typed deltas.
- Chat Completions streams OpenAI-compatible `chat.completion.chunk` payloads.
- Messages streams Anthropic-style event names such as `content_block_delta` and `message_stop`.
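The differences above can be sketched as one function that pulls the text fragment out of a decoded payload for each surface. The field paths follow the support matrix; the nested layout for Messages (`delta.type` and `delta.text`) and the `delta` field for Responses are assumptions based on the Anthropic and OpenAI formats those surfaces mirror. The `api` labels and the name `text_delta` are hypothetical.

```python
from typing import Optional


def text_delta(api: str, payload: dict) -> Optional[str]:
    """Return the text fragment carried by one decoded stream payload, if any."""
    if api == "responses":
        # Named, typed events: the event type identifies a text delta.
        if payload.get("type") == "response.output_text.delta":
            return payload.get("delta")
    elif api == "chat_completions":
        # chat.completion.chunk payloads: text lives under choices[0].delta.
        choices = payload.get("choices") or []
        if choices:
            return (choices[0].get("delta") or {}).get("content")
    elif api == "messages":
        # content_block_delta events wrap a typed inner delta.
        if payload.get("type") == "content_block_delta":
            delta = payload.get("delta") or {}
            if delta.get("type") == "text_delta":
                return delta.get("text")
    return None


chunk = {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hi"}}]}
print(text_delta("chat_completions", chunk))  # Hi
```

Returning `None` for non-text payloads lets the caller route tool and lifecycle events elsewhere without special-casing each protocol.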
Protocol Examples
- Responses
- Chat Completions
- Messages
Responses streams semantic events, so you usually branch on `event.type`.
Client Checklist
- parse the stream structurally instead of treating it as plain text
- stop normal stream processing if an error frame appears
- expect tool and reasoning deltas to be interleaved with text on some models
- if you need usage data, check the API-specific streaming docs for how it is delivered
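The first two checklist items can be sketched as a small decoder. It assumes the stream is delivered as server-sent events with single-line `data:` payloads and a `[DONE]` sentinel, and that an error frame is a JSON object with a non-empty top-level `error` field or `type` equal to `error`. Both error shapes are assumptions, so check the API-specific docs for the real format. `StreamError` and `iter_payloads` are hypothetical names.

```python
import json
from typing import Iterable, Iterator


class StreamError(RuntimeError):
    """Raised when the stream carries an error frame (hypothetical helper)."""


def iter_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Decode SSE `data:` lines into JSON payloads; stop on [DONE] or an error."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, `event:` lines, and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # terminal sentinel: nothing after it is processed
        payload = json.loads(data)
        # Assumed error shapes; normal processing stops here.
        if payload.get("error") or payload.get("type") == "error":
            raise StreamError(json.dumps(payload))
        yield payload


# Hand-written sample lines standing in for a network response body.
sample = [
    'data: {"type": "response.output_text.delta", "delta": "Hi"}',
    "",
    "data: [DONE]",
]
print([p["delta"] for p in iter_payloads(sample)])  # ['Hi']
```

Because every payload is parsed as JSON before use, interleaved tool or reasoning deltas show up as distinct objects that the caller can branch on, not as stray text in the output.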