Audio Inputs

NagaAI supports audio input for Gemini models through the input_audio content block in Chat Completions. For other models, send text, image, or file content in Chat Completions and use the dedicated Speech, Transcription, and Translation endpoints for audio processing.

How to Send Audio

Using Base64-Encoded Audio

Base64-encode your audio file and pass the resulting string in the input_audio.data field. Set input_audio.format to the audio format (wav in this example):

{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Tell me what is said in this audio." },
        {
          "type": "input_audio",
          "input_audio": { "data": "BASE64_AUDIO", "format": "wav" }
        }
      ]
    }
  ]
}
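
The request body above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names build_audio_request and build_audio_request_from_file are hypothetical, and the code only builds the JSON body. Sending it (endpoint URL, API key header) is left out because those details are not given on this page.

```python
import base64
import json


def build_audio_request(
    audio_bytes: bytes,
    audio_format: str = "wav",
    prompt: str = "Tell me what is said in this audio.",
    model: str = "gemini-2.5-flash",
) -> dict:
    """Build a Chat Completions body with one text block and one input_audio block.

    Hypothetical helper for illustration; the field names mirror the JSON
    example in this section.
    """
    # Base64 output is pure ASCII, so it can be embedded in JSON as a string.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": audio_format},
                    },
                ],
            }
        ],
    }


def build_audio_request_from_file(path: str, audio_format: str = "wav") -> str:
    """Read an audio file in binary mode and return the serialized JSON body."""
    with open(path, "rb") as f:
        return json.dumps(build_audio_request(f.read(), audio_format))
```

Calling build_audio_request_from_file("sample.wav") (a placeholder path) returns a JSON string that can be used as the POST body with any HTTP client.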

Supported Audio Types

  • audio/wav
  • audio/mp3
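
Because only these two types are listed, it can be useful to validate a file before encoding it. The sketch below is a hypothetical helper (audio_format_for is not part of any NagaAI SDK) that picks the format value from the file extension. It assumes that "mp3" is the format value for audio/mp3; only "wav" appears in the request example on this page.

```python
from pathlib import Path

# Assumed mapping from file extension to the input_audio.format value.
# "wav" is taken from the request example; "mp3" is an assumption.
SUPPORTED_FORMATS = {".wav": "wav", ".mp3": "mp3"}


def audio_format_for(path: str) -> str:
    """Return the format value for a supported audio file, or raise ValueError."""
    suffix = Path(path).suffix.lower()  # case-insensitive extension check
    if suffix not in SUPPORTED_FORMATS:
        supported = ", ".join(sorted(SUPPORTED_FORMATS))
        raise ValueError(f"Unsupported audio type {suffix!r}; expected one of: {supported}")
    return SUPPORTED_FORMATS[suffix]
```

This checks the extension only, not the file contents, so a mislabeled file would still pass.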

Discover Supported Models

To see which models accept audio input, open the NagaAI Models page and apply the audio filter.