Tokens are the units models use to process input and produce output. Understanding usage helps you estimate cost, control context size, and interpret API responses correctly.

What is a token?

A token is the smallest unit of data a model processes. Depending on the input, it can represent:
  • text, usually a word or part of a word
  • image content converted into visual tokens
  • audio content converted into audio tokens
As a rough rule of thumb, 1 token ≈ 4 characters in English.
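The rule of thumb above can be sketched as a quick estimator. This is only a ballpark for English text; real counts come from the model's tokenizer or from the usage object returned by the API.

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4 characters/token heuristic."""
    return math.ceil(len(text) / chars_per_token)

# Example: a 13-character string estimates to 4 tokens
estimate_tokens("Hello, world!")
```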

Token Categories

  • Input tokens: tokens you send in prompts, messages, files, images, or audio
  • Output tokens: tokens the model generates in its response
  • Cached input tokens: reused input tokens on providers that support prompt caching
  • Reasoning tokens: extra tokens consumed by internal reasoning on supported reasoning models
Input tokens are usually cheaper than output tokens. Cached input tokens, when supported, are often discounted relative to normal input tokens.
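A minimal cost estimate can combine these categories with per-token prices. The rates and field names below are hypothetical examples, not real NagaAI pricing; substitute your provider's published rates and actual usage fields.

```python
def estimate_cost(usage: dict, prices: dict) -> float:
    """Estimate request cost in USD from token counts and prices per million tokens.

    Assumes cached tokens are a subset of input tokens and are billed
    at a discounted rate when the provider supports prompt caching.
    """
    cached = usage.get("cached_input_tokens", 0)
    uncached = usage.get("input_tokens", 0) - cached
    out = usage.get("output_tokens", 0)
    return (
        uncached * prices["input"]
        + cached * prices.get("cached_input", prices["input"])
        + out * prices["output"]
    ) / 1_000_000

# Hypothetical rates per million tokens
prices = {"input": 1.00, "cached_input": 0.25, "output": 4.00}
cost = estimate_cost(
    {"input_tokens": 10_000, "cached_input_tokens": 6_000, "output_tokens": 500},
    prices,
)
```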

How Usage is Reported

Every major API returns a usage object in its response, which you can log to track costs or analyze workloads. Because NagaAI supports multiple API surfaces, the exact JSON shape of that object varies by API.

Why usage shapes differ

  • Responses focuses on typed output items and can include richer usage details
  • Chat Completions uses OpenAI-style fields such as prompt_tokens and completion_tokens
  • Messages uses Anthropic-style fields such as input_tokens and output_tokens
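If you log usage across APIs, it helps to map the shapes onto common fields. A minimal sketch based on the field-name conventions above; any richer details your provider returns (cached or reasoning counts) are ignored here.

```python
def normalize_usage(usage: dict) -> dict:
    """Map per-API usage shapes onto common input/output fields."""
    if "prompt_tokens" in usage:
        # Chat Completions (OpenAI-style field names)
        return {"input": usage["prompt_tokens"],
                "output": usage.get("completion_tokens", 0)}
    # Responses and Messages both use input_tokens / output_tokens
    return {"input": usage.get("input_tokens", 0),
            "output": usage.get("output_tokens", 0)}
```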

Practical advice

  • log usage for both successful requests and streamed requests when available
  • watch for large input growth from long prompts, tools, or conversation history
  • treat cached and reasoning usage as separate cost drivers when your models expose them
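The advice above can be put into practice with a small accumulator that records usage per model, making input growth and cost drivers visible over time. A minimal sketch; the field names assume the input_tokens/output_tokens style, so adapt `record` to your API's shape.

```python
from collections import defaultdict

class UsageLogger:
    """Accumulate token usage per model across requests."""

    def __init__(self):
        self.totals = defaultdict(lambda: {"input": 0, "output": 0, "requests": 0})

    def record(self, model: str, usage: dict) -> None:
        # Call this with the usage object from each (streamed or non-streamed) response
        t = self.totals[model]
        t["input"] += usage.get("input_tokens", 0)
        t["output"] += usage.get("output_tokens", 0)
        t["requests"] += 1

    def avg_input(self, model: str) -> float:
        # Rising average input often signals growing prompts, tools, or history
        t = self.totals[model]
        return t["input"] / t["requests"] if t["requests"] else 0.0

logger = UsageLogger()
logger.record("example-model", {"input_tokens": 100, "output_tokens": 10})
logger.record("example-model", {"input_tokens": 300, "output_tokens": 20})
```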

API-Specific Guides

Learn how to read the usage object and handle streaming usage for your specific API:

Responses Usage

Usage tracking, cached tokens, and reasoning tokens in the primary Responses API.

Chat Completions Usage

prompt_tokens, completion_tokens, and include_usage in the OpenAI-compatible layer.

Messages Usage

input_tokens and output_tokens in the Anthropic-compatible layer.

Embeddings API

Input token tracking for vector generation.