OpenAI-compatible
The request/response format mirrors OpenAI's Chat Completions API. Bring your existing OpenAI SDK code and change only the `base_url` and `api_key`.
Request Parameters
- `messages` (array, required): Chat messages. Each item:
  - `role` (string): "system" | "user" | "assistant" | "tool"
  - `content` (string or array): Text, or an array of blocks including:
    - Text: `{ "type": "text", "text": "..." }`
    - Image: `{ "type": "image_url", "image_url": { "url": "https://...", "detail": "low|high|auto" } }`
    - File (PDF): `{ "type": "file", "file": { "filename": "...", "file_data": "https://..." } }`
    - Audio (Gemini only): `{ "type": "input_audio", "input_audio": { "data": "BASE64", "format": "wav|mp3" } }`
- `tools` (array): Tool definitions (JSON), OpenAI-compatible.
- `response_format` (object): Output format. One of:
  - Text: `{ "type": "text" }`
  - JSON object: `{ "type": "json_object" }`
  - JSON schema: `{ "type": "json_schema", "json_schema": { "name": "...", "schema": { ... }, "strict": true } }`
- `temperature` (number): Sampling temperature (0..2).
- `top_p` (number): Nucleus sampling probability (0..1).
- `stream` (boolean): Whether to stream the response.
- `stream_options` (object): `{ "include_usage": boolean }`
- `max_tokens` (integer): Maximum number of tokens to generate.
- `reasoning_effort` (string): "minimal" | "low" | "medium" | "high"
- `presence_penalty` (number): Penalizes new tokens based on their presence in the text so far (-2..2).
- `frequency_penalty` (number): Penalizes new tokens based on their frequency in the text so far (-2..2).
- `logit_bias` (object): Modifies the likelihood of specified tokens appearing in the completion.
- `parallel_tool_calls` (boolean): Whether to enable parallel tool calls.
- `prediction.static_content`: Pre-seeded content for structured tasks.
- Web search: Optional web search config.
- Image generation configuration (for models with native image generation such as `gemini-2.5-flash-image`):
  - `aspect_ratio` (string): Aspect ratio for generated images. Supported values: "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"
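As an illustration of an entry in the OpenAI-compatible `tools` array described above — the function name and schema here are made up for this sketch:

```python
# Illustrative OpenAI-style tool (function) definition.
# "get_weather" and its parameters are invented for this example.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Passed to the API as: tools=[get_weather_tool]
```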
See Features → Multimodal for details about text, image, file, and audio blocks in messages.
Example Request
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What's 2+2?"},
    ],
    temperature=0.2,
)
print(resp.choices[0].message.content)
```
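For structured output, the same request can carry a strict JSON-schema response format; the schema below ("math_answer" and its fields) is illustrative, not part of the API:

```python
# Illustrative response_format payload requesting strict JSON-schema output.
# The schema name and fields are invented for this sketch.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "math_answer",
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "number"}},
            "required": ["answer"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}

# Passed to the API as: response_format=response_format
```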
Multimodal Example
Image, audio, and file inputs follow the same content-array format:
```json
{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this audio and document?" },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        },
        {
          "type": "input_audio",
          "input_audio": { "data": "BASE64_AUDIO", "format": "wav" }
        },
        {
          "type": "file",
          "file": {
            "filename": "document.pdf",
            "file_data": "https://bitcoin.org/bitcoin.pdf"
          }
        }
      ]
    }
  ]
}
```
See Features → Multimodal for provider-specific modality support.
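The `input_audio` block expects base64-encoded bytes rather than a URL. A minimal sketch of building such a block from raw WAV bytes (the placeholder bytes stand in for a real audio file):

```python
import base64

# Sketch: build an "input_audio" content block from raw WAV bytes.
def audio_block(wav_bytes: bytes) -> dict:
    return {
        "type": "input_audio",
        "input_audio": {
            "data": base64.b64encode(wav_bytes).decode("ascii"),
            "format": "wav",
        },
    }

# Placeholder bytes, not a real WAV file:
block = audio_block(b"RIFF....WAVE")
```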
Response
A standard OpenAI-compatible response including:
- `id`, `object`, `created`, `model`
- `choices` (array) with `message`, `finish_reason`, etc.
- Optional `usage`, when enabled (`stream_options.include_usage`) or in the final response
When stream=true, server-sent events follow the OpenAI streaming format.
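Streamed text arrives as incremental deltas that the client concatenates. A sketch of that assembly — with the real SDK you would iterate the result of `client.chat.completions.create(..., stream=True)`; here `fake_chunks` stands in for that event stream:

```python
from types import SimpleNamespace

# Concatenate the content deltas from a stream of chat.completion.chunk events.
def collect_text(chunks) -> str:
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if getattr(delta, "content", None):
            parts.append(delta.content)
    return "".join(parts)

# Stand-in chunks shaped like streaming deltas (not real API objects):
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=piece))])
    for piece in ["2 + 2 ", "is ", "4."]
]
print(collect_text(fake_chunks))  # -> 2 + 2 is 4.
```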
Response Fields
- `id`: Unique request identifier
- `object`: Always "chat.completion", or "chat.completion.chunk" when streaming
- `created`: Unix timestamp of request creation
- `model`: The model used for the completion
- `choices`: Array of completion choices, each with:
  - `message`: The completion message
    - `content`: The generated response text
    - `tool_calls`: Tool calls, if tools were used
  - `finish_reason`: Reason the completion stopped ("stop", "length", "tool_calls", etc.)
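A sketch of branching on `finish_reason` when handling a parsed response; the `choice` dict below is a hand-built stand-in for one element of the `choices` array, not a live API result:

```python
# Summarize one choice from a parsed chat-completion response dict.
def summarize(choice: dict) -> str:
    reason = choice["finish_reason"]
    if reason == "tool_calls":
        # The model wants tools run instead of returning text.
        names = [c["function"]["name"] for c in choice["message"]["tool_calls"]]
        return "model requested tools: " + ", ".join(names)
    if reason == "length":
        return "output truncated by max_tokens"
    return choice["message"]["content"] or ""

# Stand-in choice (the tool call is invented for this sketch):
choice = {
    "finish_reason": "tool_calls",
    "message": {
        "content": None,
        "tool_calls": [
            {"function": {"name": "get_weather", "arguments": "{\"city\": \"Oslo\"}"}}
        ],
    },
}
print(summarize(choice))  # -> model requested tools: get_weather
```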