Usage Logging
Certain endpoints, such as chat completions, embeddings, and image generation, return a usage object in their responses. This object reports the number of tokens the request consumed, which is what you need to track consumption and calculate costs.
Chat Completions
For non-streaming chat completions, the usage object includes the prompt tokens, completion tokens, and total tokens.
Example:
{
  "usage": {
    "prompt_tokens": 194,
    "completion_tokens": 2,
    "total_tokens": 196
  }
}
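To illustrate how these fields feed a cost calculation, here is a minimal Python sketch. The per-token prices and the chat_cost helper are hypothetical, chosen only for illustration; substitute the rates that apply to your model.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens (assumptions, not real rates).
PROMPT_PRICE_PER_M = Decimal("0.15")
COMPLETION_PRICE_PER_M = Decimal("0.60")


def chat_cost(usage: dict) -> Decimal:
    """Return the cost of one chat completion from its usage object."""
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    total = prompt * PROMPT_PRICE_PER_M + completion * COMPLETION_PRICE_PER_M
    return total / Decimal(1_000_000)


# The usage object from the example above.
response = {
    "usage": {"prompt_tokens": 194, "completion_tokens": 2, "total_tokens": 196}
}
cost = chat_cost(response["usage"])
print(f"Request cost: ${cost}")
```

Decimal is used instead of float so that very small per-request amounts do not pick up rounding noise when summed over many requests.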
Streaming
For streaming requests, set stream_options.include_usage to true to receive usage information. With this option enabled, the usage object is included in one of the last chunks of the stream, so you can still account for usage and calculate costs.
Example:
{
  "model": "gpt-4o-mini",
  "messages": [...],
  "stream_options": {"include_usage": true}
}
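On the client side, the usage object has to be picked out of the stream as the chunks arrive. The following Python sketch shows one way to do that. It assumes each chunk has already been parsed into a dict and that chunks without usage data carry either no usage key or a null value; the usage_from_stream helper and the simulated chunks are hypothetical and stand in for a real stream.

```python
from typing import Iterable, Optional


def usage_from_stream(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the last non-empty usage object seen in the stream, if any."""
    usage = None
    for chunk in chunks:
        # Most chunks carry no usage; keep the latest one that does.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Simulated, already-parsed chunks (shapes are assumptions for illustration).
simulated_chunks = [
    {"choices": [{"delta": {"content": "Hi"}}], "usage": None},
    {"choices": [{"delta": {}, "finish_reason": "stop"}], "usage": None},
    {
        "choices": [],
        "usage": {"prompt_tokens": 194, "completion_tokens": 2, "total_tokens": 196},
    },
]

stream_usage = usage_from_stream(simulated_chunks)
print(stream_usage)
```

Scanning the whole stream rather than inspecting only the final chunk avoids depending on exactly which trailing chunk carries the usage object. If include_usage was not set, the function returns None.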
Embeddings
The embeddings endpoint returns a usage object with prompt tokens and total tokens (since there are no completion tokens).
Example:
{
  "usage": {
    "prompt_tokens": 125,
    "total_tokens": 125
  }
}
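Because embeddings are often requested in batches, it can help to keep a running total across responses. The sketch below is one possible approach in Python; the add_usage helper and the second usage object are made up for illustration.

```python
from collections import Counter


def add_usage(totals: Counter, usage: dict) -> None:
    """Add the token counts of one embeddings response to the running totals."""
    totals["prompt_tokens"] += usage["prompt_tokens"]
    totals["total_tokens"] += usage["total_tokens"]


totals = Counter()
# The first object is the example above; the second is a hypothetical response.
for usage in (
    {"prompt_tokens": 125, "total_tokens": 125},
    {"prompt_tokens": 75, "total_tokens": 75},
):
    add_usage(totals, usage)

print(dict(totals))
```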
Image Generation
For image generation requests, the usage object may include input tokens (often null), output tokens, and total tokens.
Example:
{
  "usage": {
    "input_tokens": null,
    "output_tokens": 1536,
    "total_tokens": 1536
  }
}
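Since the field names differ between endpoints and input_tokens can be null, a logging layer may want to map every usage object onto one common shape before storing it. The Python sketch below shows one way to do this; the normalize_usage helper and the output field names are assumptions for illustration, not part of the API.

```python
def normalize_usage(usage: dict) -> dict:
    """Map a usage object from any endpoint onto input/output/total counts."""
    # Images use input_tokens/output_tokens; chat and embeddings use
    # prompt_tokens/completion_tokens. Missing or null values count as 0.
    input_tokens = usage.get("input_tokens", usage.get("prompt_tokens")) or 0
    output_tokens = usage.get("output_tokens", usage.get("completion_tokens")) or 0
    total_tokens = usage.get("total_tokens")
    if total_tokens is None:
        total_tokens = input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


image_usage = normalize_usage(
    {"input_tokens": None, "output_tokens": 1536, "total_tokens": 1536}
)
chat_usage = normalize_usage(
    {"prompt_tokens": 194, "completion_tokens": 2, "total_tokens": 196}
)
print(image_usage)
print(chat_usage)
```

Treating a null input_tokens as 0 is a design choice of this sketch; if you need to distinguish "not reported" from "zero", keep the null instead.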