Thinking Blocks - NagaAI Documentation

The Messages API uses Anthropic-style thinking blocks rather than Responses reasoning items or chat-style reasoning_details fields. Use this page when you need reasoning controls and your integration depends on the Anthropic Messages protocol.

Request Controls

Two request fields affect reasoning behavior here:

thinking with type: enabled, disabled, or adaptive
output_config.effort with low, medium, high, or max

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.naga.ac",
    api_key="YOUR_API_KEY",
)

message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=512,
    messages=[
        {
            "role": "user",
            "content": "Explain the likely cause of a slow query and suggest two checks.",
        }
    ],
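    # Turn thinking on and give it a token budget.
    thinking={"type": "enabled", "budget_tokens": 16384},
    # Reasoning effort hint: low, medium, high, or max.
    output_config={"effort": "high"},
)

# content is a list of typed blocks; thinking blocks can precede text blocks.
print(message.content)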

On this API, output_config is a reasoning control, not a generic structured-output setting. Start with a modest reasoning budget and raise it only when the extra quality is worth the cost and latency.
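
As a minimal sketch, the two controls can be assembled and checked against the values listed above before a request is sent. The helper name reasoning_params is hypothetical, and the rule that budget_tokens only accompanies type "enabled" is an assumption taken from the example request, not something this page states.

```python
from typing import Any, Optional

# Allowed values as listed on this page.
THINKING_TYPES = {"enabled", "disabled", "adaptive"}
EFFORT_LEVELS = {"low", "medium", "high", "max"}


def reasoning_params(
    thinking_type: str,
    effort: Optional[str] = None,
    budget_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Build the thinking / output_config keyword arguments (hypothetical helper)."""
    if thinking_type not in THINKING_TYPES:
        raise ValueError(f"unknown thinking type: {thinking_type!r}")
    if effort is not None and effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort: {effort!r}")

    thinking: dict[str, Any] = {"type": thinking_type}
    if budget_tokens is not None:
        # Assumption: a budget is only meaningful when thinking is enabled.
        if thinking_type != "enabled":
            raise ValueError("budget_tokens is only used with type 'enabled'")
        thinking["budget_tokens"] = budget_tokens

    params: dict[str, Any] = {"thinking": thinking}
    if effort is not None:
        params["output_config"] = {"effort": effort}
    return params


params = reasoning_params("enabled", effort="high", budget_tokens=16384)
print(params)
```

The result can be unpacked into the call from the example, for instance client.messages.create(model=..., max_tokens=..., messages=..., **params).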

Non-Streaming Response Shape

Thinking can appear as a typed content block before text blocks.

{
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "First check the query plan and index usage.",
      "signature": "sig_1"
    },
    {
      "type": "text",
      "text": "Start by checking the query plan and whether the filter columns are indexed."
    }
  ]
}
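
A client that wants both parts can branch on each block's type. The sketch below works on a plain dict shaped like the response above; the function name split_blocks is hypothetical, and block types other than thinking and text are simply skipped.

```python
from typing import Any


def split_blocks(message: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (thinking_texts, answer_texts) from a response dict."""
    thinking: list[str] = []
    text: list[str] = []
    for block in message.get("content", []):
        kind = block.get("type")
        if kind == "thinking":
            thinking.append(block.get("thinking", ""))
        elif kind == "text":
            text.append(block.get("text", ""))
        # Other block types (for example tool_use) are ignored in this sketch.
    return thinking, text


response = {
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "thinking",
            "thinking": "First check the query plan and index usage.",
            "signature": "sig_1",
        },
        {
            "type": "text",
            "text": "Start by checking the query plan and whether the filter columns are indexed.",
        },
    ],
}

thinking_parts, text_parts = split_blocks(response)
print("thinking:", thinking_parts)
print("answer:", " ".join(text_parts))
```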

Streaming Behavior

The stream can emit thinking before text.

a thinking block starts with content_block_start
thinking text streams through content_block_delta
signatures arrive as content_block_delta with type: signature_delta
normal answer text can begin in a later content block

From the protocol tests, a streamed thinking block starts as:

{
  "type": "content_block_start",
  "content_block": {
    "type": "thinking",
    "thinking": ""
  }
}

And the signature can arrive as:

{
  "type": "content_block_delta",
  "delta": {
    "type": "signature_delta",
    "signature": "sig_1"
  }
}
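
The events above can be folded back into complete blocks on the client side. This is a sketch under assumptions: the delta type names thinking_delta and text_delta follow Anthropic's streaming protocol and are not named on this page (only signature_delta is), and the function name assemble_blocks is hypothetical. Events are handled as plain dicts, so the same logic applies to parsed server-sent events.

```python
from typing import Any, Iterable


def assemble_blocks(events: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Rebuild content blocks from a sequence of stream events."""
    blocks: list[dict[str, Any]] = []
    for event in events:
        etype = event.get("type")
        if etype == "content_block_start":
            # Copy so the original event is not modified later.
            blocks.append(dict(event["content_block"]))
        elif etype == "content_block_delta":
            # Use the event's index when present, else the most recent block.
            index = event.get("index", len(blocks) - 1)
            if not 0 <= index < len(blocks):
                continue
            block = blocks[index]
            delta = event["delta"]
            dtype = delta.get("type")
            if dtype == "thinking_delta":  # assumed name
                block["thinking"] = block.get("thinking", "") + delta.get("thinking", "")
            elif dtype == "text_delta":  # assumed name
                block["text"] = block.get("text", "") + delta.get("text", "")
            elif dtype == "signature_delta":
                # Store the signature exactly as received.
                block["signature"] = delta.get("signature", "")
    return blocks


events = [
    {"type": "content_block_start", "content_block": {"type": "thinking", "thinking": ""}},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "First check "}},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "the query plan."}},
    {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "sig_1"}},
    {"type": "content_block_start", "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Check the query plan."}},
]

blocks = assemble_blocks(events)
print(blocks)
```

Keeping the assembled thinking block as a dict with its signature intact makes it straightforward to replay in a later request.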

Preserve Thinking Blocks Across Tool Turns

If you continue a tool-using conversation in a later request, replay the assistant’s prior thinking blocks unchanged before you send the later tool_result block.

Preserve the replayed thinking text and signature exactly as generated. If you mutate them, reasoning continuity across later tool turns can break.

{
  "model": "claude-sonnet-4.5",
  "max_tokens": 256,
  "tools": [
    {
      "name": "lookup_weather",
      "description": "Look up current weather for a city.",
      "input_schema": {
        "type": "object",
        "properties": {
          "city": { "type": "string" }
        },
        "required": ["city"]
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in Prague and should I bring a coat?"
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "thinking",
          "thinking": "I need live weather data before answering.",
          "signature": "sig_1"
        },
        {
          "type": "tool_use",
          "id": "toolu_1",
          "name": "lookup_weather",
          "input": {
            "city": "Prague"
          }
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "toolu_1",
          "content": "{\"temperature_c\":7,\"raining\":true}"
        }
      ]
    }
  ]
}
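
One way to build that follow-up request body is to copy the assistant turn as-is and append the tool results after it. The sketch below is illustrative only: append_tool_turn is a hypothetical helper, and tool_results is assumed to map each tool_use id to its result string.

```python
import copy
from typing import Any


def append_tool_turn(
    messages: list[dict[str, Any]],
    assistant_content: list[dict[str, Any]],
    tool_results: dict[str, str],
) -> list[dict[str, Any]]:
    """Return a new message list with the assistant turn replayed unchanged,
    followed by a user turn carrying one tool_result per tool_use block."""
    # Deep copy so thinking text and signature cannot be altered by accident.
    replayed = copy.deepcopy(assistant_content)
    result_blocks = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": tool_results[block["id"]],
        }
        for block in replayed
        if block.get("type") == "tool_use"
    ]
    return [
        *copy.deepcopy(messages),
        {"role": "assistant", "content": replayed},
        {"role": "user", "content": result_blocks},
    ]


history = [
    {"role": "user", "content": "What is the weather in Prague and should I bring a coat?"}
]
assistant_content = [
    {
        "type": "thinking",
        "thinking": "I need live weather data before answering.",
        "signature": "sig_1",
    },
    {
        "type": "tool_use",
        "id": "toolu_1",
        "name": "lookup_weather",
        "input": {"city": "Prague"},
    },
]

follow_up = append_tool_turn(
    history,
    assistant_content,
    {"toolu_1": '{"temperature_c":7,"raining":true}'},
)
print(follow_up[-1])
```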


Caveats

clients that only read text blocks can miss thinking output completely
thinking: {"type":"disabled"} takes precedence over effort-style hints
adaptive maps to a medium-style reasoning setting internally
output_config.effort: "max" maps to the highest normalized reasoning effort internally

Common mistakes

treating output_config like a generic output-format feature instead of a reasoning control
modifying replayed thinking blocks before a later tool turn
assuming all models expose visible thinking blocks
