POST /v1/chat/completions
Create a chat completion following the OpenAI-compatible Chat Completions format.
- Method: POST
- Path: /v1/chat/completions
- Auth: Bearer token in the `Authorization` header
- Content-Type: application/json
OpenAI-compatible
The request/response format mirrors OpenAI's Chat Completions API. You can reuse your existing OpenAI SDK code, changing only the base URL and API key.
Request Parameters
- `model` (string, required): Target model ID.
- `messages` (array, required): Chat messages. Each item:
  - `role` (string): `"system"` | `"user"` | `"assistant"` | `"tool"`
  - `content` (string or array): Text, or an array of blocks including:
    - Text: `{ "type": "text", "text": "..." }`
    - Image: `{ "type": "image_url", "image_url": { "url": "https://...", "detail": "low|high|auto" } }`
    - File (PDF): `{ "type": "file", "file": { "filename": "...", "file_data": "https://..." } }`
    - Audio (Gemini only): `{ "type": "input_audio", "input_audio": { "data": "BASE64", "format": "wav|mp3" } }`
- `tools` (array): Tool definitions (JSON), OpenAI-compatible.
- `tool_choice` (string | object): Tool-forcing strategy.
- `response_format`:
  - Text: `{ "type": "text" }`
  - JSON object: `{ "type": "json_object" }`
  - JSON schema: `{ "type": "json_schema", "json_schema": { "name": "...", "schema": { ... }, "strict": true } }`
- `temperature` (0..2), `top_p` (0..1)
- `stream` (boolean, default false)
- `stream_options` (object): `{ "include_usage": boolean }`
- `stop` (string | string[])
- `max_completion_tokens` (integer ≥ 1)
- `reasoning_effort` (`"minimal"` | `"low"` | `"medium"` | `"high"`)
- `presence_penalty`, `frequency_penalty` (-2..2)
- `logit_bias` (object)
- `parallel_tool_calls` (boolean)
- `prediction.static_content` (object): Pre-seeded content for structured tasks.
- `web_search_options` (object): Optional web search config.
- `image_config` (object): Image generation configuration, for models with native image generation such as `gemini-2.5-flash-image`:
  - `aspect_ratio` (string): Aspect ratio for generated images. Supported values: `"1:1"`, `"2:3"`, `"3:2"`, `"3:4"`, `"4:3"`, `"4:5"`, `"5:4"`, `"9:16"`, `"16:9"`, `"21:9"`
Note: See Features → Multimodal for details about text, image, file, and audio blocks in messages.
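To make the shape of the `tools` and `response_format` parameters concrete, here is a sketch of a request body that combines them. The `get_weather` function and its schema are hypothetical examples, not part of this API:

```python
import json

# Illustrative request body combining tool definitions with a JSON response
# format. The get_weather function and its parameter schema are made up.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
    "response_format": {"type": "json_object"},
}
print(json.dumps(payload, indent=2))
```

Sending this body (with the auth headers shown below) returns either an assistant message or a `tool_calls` entry in the first choice.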
Example Request
Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What's 2+2?"}
    ],
    temperature=0.2,
)
print(resp.choices[0].message.content)
```

Node.js

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.naga.ac/v1",
  apiKey: "YOUR_API_KEY",
});

const resp = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "What's 2+2?" }],
  temperature: 0.2,
});
console.log(resp.choices[0].message.content);
```

cURL

```shell
curl https://api.naga.ac/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "What'\''s 2+2?" }
    ],
    "temperature": 0.2
  }'
```
Multimodal Example
Image and file inputs follow the same content-array format:
```json
{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this image, audio clip, and document?" },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        },
        {
          "type": "input_audio",
          "input_audio": { "data": "BASE64_AUDIO", "format": "wav" }
        },
        {
          "type": "file",
          "file": {
            "filename": "document.pdf",
            "file_data": "https://bitcoin.org/bitcoin.pdf"
          }
        }
      ]
    }
  ]
}
```
See Features → Multimodal for provider-specific modality support.
Authentication
Provide your key as a Bearer token:
```
Authorization: Bearer YOUR_API_KEY
```
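If you are calling the endpoint without an SDK, the same scheme is just two request headers. A minimal sketch (the key value is a placeholder; no request is sent here):

```python
# Headers required on every request: Bearer auth plus the JSON content type.
API_KEY = "YOUR_API_KEY"  # placeholder; substitute your real key
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
print(headers["Authorization"])
```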
Response
A standard OpenAI-compatible response including:
- `id`, `object`, `created`, `model`
- `choices` (array) with `message`, `finish_reason`, etc.
- Optional `usage` if enabled (via `stream_options.include_usage`, or in the final response)

When `stream=true`, server-sent events follow the OpenAI streaming format.
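Each server-sent event is a `data: {...}` line carrying a `chat.completion.chunk` whose `delta` holds an incremental piece of the message, with `data: [DONE]` terminating the stream. A minimal parsing sketch (the sample chunk payload is illustrative):

```python
import json

# One SSE line as emitted when stream=true (illustrative payload).
sse_line = 'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"4"},"finish_reason":null}]}'

def parse_sse_line(line: str):
    """Extract the incremental delta content from one SSE data line, or None."""
    if not line.startswith("data: "):
        return None  # comments/blank keep-alive lines carry no chunk
    body = line[len("data: "):]
    if body.strip() == "[DONE]":
        return None  # stream terminator
    chunk = json.loads(body)
    return chunk["choices"][0]["delta"].get("content")

print(parse_sse_line(sse_line))
```

Concatenating the deltas across chunks reconstructs the full assistant message; SDK clients do this iteration for you.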
Response Fields
- `id` (string): Unique request identifier
- `object` (string): `"chat.completion"`, or `"chat.completion.chunk"` for streaming
- `created` (integer): Unix timestamp of request creation
- `model` (string): The model used for the completion
- `choices` (array): Array of completion choices
  - `index` (integer): Choice index
  - `message` (object): The completion message
    - `role` (string): Always `"assistant"`
    - `content` (string): The generated response text
    - `tool_calls` (array, optional): Tool calls if tools were used
  - `finish_reason` (string): Reason the completion stopped (`"stop"`, `"length"`, `"tool_calls"`, etc.)
- `usage` (object, optional): Token usage statistics
  - `prompt_tokens` (integer): Input tokens used
  - `completion_tokens` (integer): Output tokens generated
  - `total_tokens` (integer): Total tokens used
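Putting the field list together, here is how a client navigates a response body; the values in this sample (id, timestamp, token counts, answer text) are fabricated for illustration:

```python
# Fabricated example response matching the field list above.
sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "2 + 2 = 4."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 7, "total_tokens": 19},
}

# The generated text lives on the first choice's message.
answer = sample_response["choices"][0]["message"]["content"]
usage = sample_response["usage"]
print(answer)

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```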