Text to Speech - NagaAI Documentation

Use POST /v1/audio/speech when you want audio output instead of generated text. This endpoint returns streamed audio bytes, not a JSON text response.

Request fields

| Field | Required | Notes |
| --- | --- | --- |
| `model` | Yes | Text-to-speech model |
| `input` | Yes | Text to synthesize |
| `voice` | Yes | Voice preset |
| `response_format` | No | Output audio format, such as `mp3` or `wav` |
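The fields above can be assembled into a request body before anything is sent. The following is a minimal sketch, not part of the NagaAI API: `build_speech_payload` and `REQUIRED_FIELDS` are hypothetical names introduced here for illustration, and the only field names used are the ones listed in the table.

```python
from typing import Optional

# Field names come from the table above; the helper itself is hypothetical.
REQUIRED_FIELDS = ("model", "input", "voice")


def build_speech_payload(
    model: str,
    input_text: str,
    voice: str,
    response_format: Optional[str] = None,
) -> dict:
    """Return a JSON-serializable request body for POST /v1/audio/speech."""
    payload = {"model": model, "input": input_text, "voice": voice}

    # Reject empty required values before any request is sent.
    missing = [name for name in REQUIRED_FIELDS if not payload[name]]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")

    # response_format is optional, so include it only when the caller sets it.
    if response_format is not None:
        payload["response_format"] = response_format
    return payload


payload = build_speech_payload(
    model="gpt-4o-mini-tts",
    input_text="Welcome to NagaAI. Your job finished successfully.",
    voice="alloy",
    response_format="mp3",
)
```

Validating locally like this only catches empty values; whether a given model or voice name is accepted is still decided by the server.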

Request Example

from pathlib import Path
from openai import OpenAI

# Point the OpenAI SDK at the NagaAI base URL.
client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

# Destination file; the extension matches response_format below.
speech_file = Path("speech.mp3")

# Stream the audio bytes to disk as they arrive instead of parsing JSON.
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    input="Welcome to NagaAI. Your job finished successfully.",
    voice="alloy",
    response_format="mp3",
) as response:
    response.stream_to_file(speech_file)

Important Difference From Text APIs

This endpoint does not return a JSON body containing generated text. It returns streamed audio bytes with a media type such as audio/mpeg, audio/opus, or audio/wav, so the body must be handled as binary data.
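To make the difference concrete, here is a rough sketch of calling the endpoint with only the Python standard library and checking the media type before treating the body as audio. The URL is the documented base URL joined with the endpoint path. The `Authorization: Bearer` header is an assumption based on how OpenAI-compatible APIs usually authenticate, and `is_audio_content_type` and `fetch_speech` are hypothetical helper names.

```python
import json
import urllib.request

SPEECH_URL = "https://api.naga.ac/v1/audio/speech"


def is_audio_content_type(content_type: str) -> bool:
    """True when a Content-Type header value names an audio media type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("audio/")


def fetch_speech(api_key: str, payload: dict) -> bytes:
    """POST the payload and return raw audio bytes (performs a network call)."""
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer auth, as in other OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read()
    if not is_audio_content_type(content_type):
        raise RuntimeError(f"expected audio, got {content_type!r}")
    # Do not call json.loads() on this value: it is binary audio data.
    return body
```

Calling `fetch_speech("YOUR_API_KEY", {...})` with the request fields would return bytes that can be written to a file opened in binary mode. Note that this version reads the whole body into memory rather than streaming it.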

Response Formats

Supported output formats include:

mp3
opus
aac
flac
wav
pcm
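One way to keep the output file name in step with the requested format is to derive the extension from `response_format` and reject anything not in the list above. This is a small sketch under that assumption; `SUPPORTED_FORMATS` and `output_path_for` are hypothetical names, and using the format name as the file extension is a convention chosen here, not something the API requires.

```python
from pathlib import Path

# Formats copied from the list above.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def output_path_for(stem: str, response_format: str) -> Path:
    """Build an output path whose extension matches the requested format."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported response_format {response_format!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return Path(f"{stem}.{fmt}")


speech_file = output_path_for("speech", "wav")
```

The same `response_format` value would then be passed to the request, so the file extension and the actual audio encoding cannot drift apart.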

Common mistakes

Expecting a normal JSON response instead of binary audio.
Forgetting to write the response to a file or audio buffer.
Choosing a response format your playback pipeline does not handle.
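When the audio should go to an in-memory buffer rather than a file, the streamed chunks can be collected with `io.BytesIO`. The sketch below is illustrative only: `collect_audio` is a hypothetical helper that accepts any iterable of byte chunks, so it can be exercised without a network call.

```python
import io
from typing import Iterable


def collect_audio(chunks: Iterable[bytes]) -> bytes:
    """Join streamed byte chunks into one in-memory audio payload."""
    buffer = io.BytesIO()
    for chunk in chunks:
        buffer.write(chunk)
    audio = buffer.getvalue()
    # An empty result usually means the response body was never consumed.
    if not audio:
        raise ValueError("no audio bytes received")
    return audio


# Stand-in chunks; a real call would supply chunks from the response stream.
audio_bytes = collect_audio([b"ID3", b"\x00\x01"])
```

With the SDK used in the request example, the chunks would likely come from `response.iter_bytes()` inside the `with` block, assuming the installed SDK version exposes that method on streamed responses.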

Reference

Create speech
