The Messages API reports usage in the Anthropic-style format.
Use this page if your client expects input_tokens, output_tokens, and Messages streaming events.
Non-Streaming Usage
When making a synchronous request to the Messages API, the top-level response object includes a usage block containing the token metrics for the generation.
Field breakdown
- input_tokens: The number of tokens in the prompt (input), including any images or documents. This correlates directly with input token pricing.
- output_tokens: The number of tokens generated by the model (output). This correlates directly with output token pricing.
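For illustration, a trimmed synchronous response might look like the following; the message id, content, and token counts are made up:

```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "text", "text": "Hello!" }],
  "usage": {
    "input_tokens": 2095,
    "output_tokens": 503
  }
}
```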
Streaming Usage
When using stream: true in the Anthropic protocol, usage is not delivered in a single top-level object at the end. Instead, usage information arrives incrementally through specific SSE events during the stream lifecycle.
You do not need to request streaming usage explicitly; the protocol includes it by default.
message_start event
The very first event in the stream (message_start) includes the usage details for the input prompt.
Example message_start Event
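A minimal sketch of this event's shape; the message id and token counts are illustrative:

```json
{
  "type": "message_start",
  "message": {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [],
    "usage": {
      "input_tokens": 25,
      "output_tokens": 0
    }
  }
}
```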
output_tokens is 0 because generation has just begun. The input_tokens value represents your prompt cost.
message_delta event
Near the end of the stream, immediately before the message_stop event, the server emits a message_delta event. This event contains the final usage metrics for the output tokens generated during the stream.
Example message_delta Event
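A minimal sketch of this event's shape, with an illustrative token count:

```json
{
  "type": "message_delta",
  "delta": { "stop_reason": "end_turn" },
  "usage": { "output_tokens": 112 }
}
```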
The output_tokens value in this delta event represents the final output cost.
To determine the total cost of a streaming interaction, your application logic must capture input_tokens from the message_start event and output_tokens from the message_delta event.
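The capture logic above can be sketched as follows. This is a simplified stand-in, not the SDK's implementation: the event type is reduced to just the fields needed here, and the token counts in the example sequence are made up.

```typescript
// Minimal sketch of usage accounting over Anthropic-style stream events.
// The official @anthropic-ai/sdk exports richer event types and helpers;
// this simplified shape keeps only the fields relevant to usage.
type StreamEvent = {
  type: string;
  message?: { usage: { input_tokens: number; output_tokens: number } };
  usage?: { output_tokens: number };
};

function accumulateUsage(events: StreamEvent[]): {
  input_tokens: number;
  output_tokens: number;
} {
  let input_tokens = 0;
  let output_tokens = 0;
  for (const ev of events) {
    if (ev.type === "message_start" && ev.message) {
      // Prompt cost is known as soon as the stream opens.
      input_tokens = ev.message.usage.input_tokens;
    } else if (ev.type === "message_delta" && ev.usage) {
      // Final output count arrives just before message_stop.
      output_tokens = ev.usage.output_tokens;
    }
  }
  return { input_tokens, output_tokens };
}

// Illustrative event sequence with made-up token counts.
const totals = accumulateUsage([
  { type: "message_start", message: { usage: { input_tokens: 25, output_tokens: 0 } } },
  { type: "content_block_delta" },
  { type: "message_delta", usage: { output_tokens: 112 } },
  { type: "message_stop" },
]);
console.log(totals.input_tokens, totals.output_tokens);
```

In production code you would run the same two branches inside the `for await` loop that consumes the stream, rather than over an in-memory array.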
Practical Advice
- If you are using the official @anthropic-ai/sdk, you can usually access the final aggregated usage directly via the finalMessage() helper or by accumulating the events manually.
- Always log the usage object in your application database alongside the request metadata. It is the most reliable way to attribute costs to specific features or users before running aggregate reports against /v1/account/activity.
Common mistakes
- expecting one final OpenAI-style usage trailer chunk instead of Anthropic event-based usage
- only capturing message_start and forgetting the final message_delta
- treating Messages usage fields as interchangeable with Chat Completions usage fields