POST /v1/chat/completions

Create a chat completion following the OpenAI-compatible Chat Completions format.

  • Method: POST
  • Path: /v1/chat/completions
  • Auth: Bearer token in Authorization header
  • Content-Type: application/json
OpenAI-compatible

The request/response format mirrors OpenAI's Chat Completions API. Bring your existing OpenAI SDK code and change only the base URL and API key.

Request parameters

  • model (string, required): Target model ID.
  • messages (array, required): Chat messages. Each item:
    • role (string): "system" | "user" | "assistant" | "tool"
    • content (string or array): Text, or an array of blocks including:
      • Text: { "type": "text", "text": "..." }
      • Image: { "type": "image_url", "image_url": { "url": "https://...", "detail": "low|high|auto" } }
      • File (PDF): { "type": "file", "file": { "filename": "...", "file_data": "https://..." } }
      • Audio (Gemini only): { "type": "input_audio", "input_audio": { "data": "BASE64", "format": "wav|mp3" } }
  • tools (array): Tool definitions (JSON), OpenAI-compatible; see the tool-calling sketch below this list.
  • tool_choice (string | object): Tool forcing strategy.
  • response_format:
    • Text: { "type": "text" }
    • JSON object: { "type": "json_object" }
    • JSON schema: { "type": "json_schema", "json_schema": { "name": "...", "schema": { ... }, "strict": true } }
  • temperature (0..2), top_p (0..1)
  • stream (boolean, default false)
  • stream_options (object): { "include_usage": boolean }
  • stop (string | string[])
  • max_completion_tokens (integer ≥ 1)
  • reasoning_effort ("minimal" | "low" | "medium" | "high")
  • presence_penalty, frequency_penalty (-2..2)
  • logit_bias (object)
  • parallel_tool_calls (boolean)
  • prediction.static_content (object): Pre-seeded content for structured tasks.
  • web_search_options (object): Optional web search config.

Note: See Features → Multimodal for details about text, image, file, and audio blocks in messages.
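
For example, tools and tool_choice take OpenAI-style function definitions. A minimal sketch, assuming a hypothetical get_weather tool and reusing gpt-4o-mini as the model; any JSON-Schema parameters object works the same way:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

# Hypothetical tool definition, for illustration only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
    tool_choice="auto",  # or {"type": "function", "function": {"name": "get_weather"}} to force the tool
)

# If the model chose to call a tool, the call(s) arrive on message.tool_calls.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)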

Example Requests

from openai import OpenAI

client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What's 2+2?"}
    ],
    temperature=0.2,
)
print(resp.choices[0].message.content)
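
To constrain the output shape, pass a JSON schema via response_format. A minimal sketch reusing the client above; the math_answer schema name and its fields are illustrative:

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's 2+2? Reply as JSON."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "math_answer",  # illustrative schema name
            "schema": {
                "type": "object",
                "properties": {"answer": {"type": "integer"}},
                "required": ["answer"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)
print(resp.choices[0].message.content)  # a JSON string matching the schema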

Multimodal Example

Image, audio, and file inputs follow the same content-array format:

{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this image, audio clip, and document?" },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        },
        {
          "type": "input_audio",
          "input_audio": { "data": "BASE64_AUDIO", "format": "wav" }
        },
        {
          "type": "file",
          "file": {
            "filename": "document.pdf",
            "file_data": "https://bitcoin.org/bitcoin.pdf"
          }
        }
      ]
    }
  ]
}
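
The same request through the OpenAI Python SDK, reusing the client from Example Requests (BASE64_AUDIO stands in for real base64-encoded audio):

resp = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image, audio clip, and document?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    },
                },
                {"type": "input_audio", "input_audio": {"data": "BASE64_AUDIO", "format": "wav"}},
                {"type": "file", "file": {"filename": "document.pdf", "file_data": "https://bitcoin.org/bitcoin.pdf"}},
            ],
        }
    ],
)
print(resp.choices[0].message.content)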

See Features → Multimodal for provider-specific modality support.

Authentication

Provide your key as a Bearer token:

Authorization: Bearer YOUR_API_KEY
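
For example, with raw HTTP, a minimal sketch using the requests library (the model and message are illustrative):

import requests

API_KEY = "YOUR_API_KEY"

resp = requests.post(
    "https://api.naga.ac/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",  # Bearer token auth
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])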

Response

A standard OpenAI-compatible response including:

  • id, object, created, model
  • choices (array) with message, finish_reason, etc.
  • usage (optional): token usage, returned in non-streaming responses and, when stream_options.include_usage is set, in the final streamed chunk

When stream=true, server-sent events follow the OpenAI streaming format.
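
A streaming sketch, assuming the client from Example Requests; with include_usage set, the final chunk carries usage and an empty choices list:

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Content arrives as incremental deltas.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    # The final chunk carries usage when include_usage is enabled.
    if chunk.usage:
        print()
        print(chunk.usage)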