Audio Inputs
NagaAI supports audio input for Gemini models via the input_audio content block in Chat Completions. For other models, send text, image, or file content in Chat Completions, and use the dedicated Speech, Transcription, and Translation endpoints for audio processing.
How to Send Audio
Using Base64-Encoded Audio
Encode your audio file as base64 and provide it in the input_audio.data field:
{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Tell me what is said in this audio." },
        {
          "type": "input_audio",
          "input_audio": { "data": "BASE64_AUDIO", "format": "wav" }
        }
      ]
    }
  ]
}
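As a rough illustration, the request body above could be built and sent from Python along the following lines. This is a minimal sketch using only the standard library, not an official client. The helper names build_audio_request and send_chat_completion are hypothetical, and the /chat/completions path, the Bearer authorization header, and the base_url value are assumptions to check against the NagaAI API reference.

```python
import base64
import json
import urllib.request


def build_audio_request(audio_bytes: bytes, audio_format: str, prompt: str,
                        model: str = "gemini-2.5-flash") -> dict:
    """Build a Chat Completions body with one text block and one input_audio block."""
    # The input_audio.data field carries the audio as a base64 string.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": audio_format},
                    },
                ],
            }
        ],
    }


def send_chat_completion(payload: dict, api_key: str, base_url: str) -> dict:
    """POST the payload and return the parsed JSON response.

    Assumption: an OpenAI-style endpoint at {base_url}/chat/completions
    with Bearer authentication. Verify both against the API reference.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical call would read the file in binary mode, for example open("clip.wav", "rb").read(), pass the bytes to build_audio_request with "wav" as the format, and hand the result to send_chat_completion.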
Supported Audio Types
audio/wav
audio/mp3
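Before encoding a file, it can help to confirm that it is one of the supported types. The sketch below derives the input_audio.format value from the file extension. The helper audio_format_for is hypothetical, and it assumes the format value is the part after "audio/" in the types listed above (wav or mp3), which matches the wav example in the request body but is not stated explicitly.

```python
from pathlib import Path

# Assumption: the accepted "format" values mirror the listed types audio/wav and audio/mp3.
SUPPORTED_FORMATS = {"wav", "mp3"}


def audio_format_for(path: str) -> str:
    """Return the input_audio.format value for a file, based on its extension."""
    extension = Path(path).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported audio type {extension!r}; expected one of {sorted(SUPPORTED_FORMATS)}"
        )
    return extension
```

Checking the extension is only a convenience. It does not inspect the file contents, so a mislabeled file would still pass this check.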
Discover Supported Models
You can see which models accept audio input on the NagaAI Models page (audio filter).