Usage Logging

Certain endpoints, such as chat completions, embeddings, and image generation, return a usage object in their responses. This object reports the number of tokens the request consumed, which you need for tracking consumption and calculating costs.

Chat Completions

For non-streaming chat completions, the usage object includes the prompt tokens, completion tokens, and total tokens.

Example:

{
  "usage": {
    "prompt_tokens": 194,
    "completion_tokens": 2,
    "total_tokens": 196
  }
}
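One way to turn these counts into a cost figure is to multiply each count by a per-token price. The sketch below is only an illustration: the function name `estimate_cost` and both price constants are hypothetical placeholders, not real rates, so substitute the prices that apply to your model.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1M tokens; placeholders, not real rates.
PROMPT_PRICE_PER_1M = Decimal("0.15")
COMPLETION_PRICE_PER_1M = Decimal("0.60")


def estimate_cost(usage: dict) -> Decimal:
    """Estimate the USD cost of one chat completion from its usage object."""
    # Treat a missing or null count as zero tokens.
    prompt = Decimal(usage.get("prompt_tokens") or 0)
    completion = Decimal(usage.get("completion_tokens") or 0)
    weighted = prompt * PROMPT_PRICE_PER_1M + completion * COMPLETION_PRICE_PER_1M
    return weighted / Decimal(1_000_000)


usage = {"prompt_tokens": 194, "completion_tokens": 2, "total_tokens": 196}
cost = estimate_cost(usage)
```

Decimal is used here rather than float so that very small per-request amounts add up without rounding drift when many requests are summed.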

Streaming

For streaming requests, set stream_options.include_usage to true to receive usage information. With this option enabled, the usage object is included in one of the last chunks of the stream, so you can still account for usage and calculate costs.

Example:

{
  "model": "gpt-4o-mini",
  "messages": [...],
  "stream_options": {"include_usage": true}
}
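On the client side, the usage object then has to be picked out of the stream. The sketch below assumes each chunk has already been parsed into a dict and that chunks without usage either omit the field or set it to null. The helper name `extract_stream_usage` and the sample chunks are hypothetical and only illustrate the shape.

```python
from typing import Iterable, Optional


def extract_stream_usage(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the last non-null usage object seen in a stream, or None."""
    usage = None
    for chunk in chunks:
        # Most chunks carry no usage (missing or null); keep the latest real one.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Illustrative chunks, already parsed from the stream into dicts.
chunks = [
    {"choices": [{"delta": {"content": "Hi"}}], "usage": None},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
    {"choices": [], "usage": {"prompt_tokens": 194, "completion_tokens": 2, "total_tokens": 196}},
]
usage = extract_stream_usage(chunks)
```

In a real client the same loop would usually also handle the content deltas, since iterating the stream consumes it. If the function returns None, the option was probably not set on the request.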

Embeddings

The embeddings endpoint returns a usage object with prompt tokens and total tokens (since there are no completion tokens).

Example:

{
  "usage": {
    "prompt_tokens": 125,
    "total_tokens": 125
  }
}

Images Generations

For image generation requests, the usage object may include input tokens (often null), output tokens, and total tokens.

Example:

{
  "usage": {
    "input_tokens": null,
    "output_tokens": 1536,
    "total_tokens": 1536
  }
}
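Because the field names differ between endpoints (prompt_tokens and completion_tokens for chat and embeddings, input_tokens and output_tokens for images), a logging layer may want a single shape. The following is a minimal sketch under that assumption. The function name `normalize_usage` and the choice to record null or missing counts as 0 are illustrative decisions, not part of any API.

```python
def normalize_usage(usage: dict) -> dict:
    """Map a usage object from any endpoint onto input/output/total counts."""
    # Chat and embeddings use prompt_*/completion_*; images use input_*/output_*.
    # A null or missing count is recorded as 0.
    input_tokens = usage.get("prompt_tokens", usage.get("input_tokens")) or 0
    output_tokens = usage.get("completion_tokens", usage.get("output_tokens")) or 0
    total_tokens = usage.get("total_tokens")
    if total_tokens is None:
        # Fall back to the sum when the response omits the total.
        total_tokens = input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


image_usage = {"input_tokens": None, "output_tokens": 1536, "total_tokens": 1536}
row = normalize_usage(image_usage)
```

Recording null as 0 keeps sums simple but hides the difference between "zero tokens" and "not reported". If that distinction matters for billing, keep the null instead.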
