Use POST /v1/audio/speech when you want audio output instead of generated text.
This endpoint returns streamed audio bytes, not a JSON text response.
Required fields
| Field | Required | Notes |
|---|---|---|
| model | Yes | Text-to-speech model |
| input | Yes | Text to synthesize |
| voice | Yes | Voice preset |
| response_format | No | Output audio format such as mp3 or wav |
Request Example
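The following is a minimal sketch of a request in Python, using only the standard library. The base URL, the API key, the bearer-token Authorization header, and the model and voice names used with it are placeholders and assumptions, not values confirmed by this page; only the path and the four body fields come from the table above.

```python
import json
import urllib.request

# Hypothetical values: replace with your provider's base URL and key.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_speech_request(model, text, voice, response_format=None):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"model": model, "input": text, "voice": voice}
    # response_format is optional, so only include it when given.
    if response_format is not None:
        payload["response_format"] = response_format
    return urllib.request.Request(
        url=BASE_URL + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(request, path, chunk_size=8192):
    """Send the request and stream the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
```

A call might look like `synthesize_to_file(build_speech_request("example-tts-model", "Hello, world.", "example-voice", "mp3"), "speech.mp3")`, where the model and voice names are again placeholders. The file is opened in binary mode because the body is audio, not text.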
Important Difference From Text APIs
This endpoint does not return generated text JSON. It returns streamed audio bytes with a media type such as audio/mpeg, audio/opus, or audio/wav.
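One way to guard against treating a non-audio body as audio is to inspect the Content-Type header before writing anything. The sketch below assumes that a successful response carries an audio/* media type, as described above; the helper names are hypothetical, and what a failed request returns is not specified here.

```python
def parse_media_type(content_type_header):
    """Return the lowercase media type without parameters, e.g. 'audio/mpeg'."""
    return content_type_header.split(";", 1)[0].strip().lower()


def ensure_audio(content_type_header):
    """Return the media type if it is audio/*, otherwise raise ValueError."""
    media_type = parse_media_type(content_type_header)
    if not media_type.startswith("audio/"):
        raise ValueError(f"expected an audio media type, got {media_type!r}")
    return media_type
```

With `urllib`, the header value could be read from `response.headers.get("Content-Type", "")` and passed to `ensure_audio` before the body is streamed to disk.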
Response Formats
Supported output formats include:
- mp3
- opus
- aac
- flac
- wav
- pcm
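A small client-side check can catch a mistyped format before the request is sent. This sketch uses only the format names listed above; since that list is introduced with "include", the endpoint may accept others. Using the format name as the file extension is a convention assumed here, and `output_filename` is a hypothetical helper.

```python
# Formats named on this page; the endpoint may support more.
KNOWN_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def output_filename(stem, response_format):
    """Return '<stem>.<format>' after checking the format is a known one."""
    if response_format not in KNOWN_FORMATS:
        raise ValueError(
            f"unknown response_format {response_format!r}; "
            f"expected one of {', '.join(KNOWN_FORMATS)}"
        )
    # Assumption: the format name doubles as the file extension.
    return f"{stem}.{response_format}"
```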
Common mistakes
- expecting a normal JSON response instead of binary audio
- forgetting to write the response to a file or audio buffer
- choosing a response format your playback pipeline does not handle
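When the audio is needed in memory rather than on disk, the streamed chunks can be collected into a buffer. This is a minimal sketch: `collect_audio` is a hypothetical helper that accepts any iterable of byte chunks, however your HTTP client produces them.

```python
import io


def collect_audio(chunks):
    """Concatenate streamed byte chunks into a rewound in-memory buffer."""
    buffer = io.BytesIO()
    for chunk in chunks:
        buffer.write(chunk)
    # Rewind so a consumer can read from the start.
    buffer.seek(0)
    return buffer
```

The returned buffer behaves like a binary file, so it can be handed to a playback or decoding library that accepts file-like objects, provided that library supports the chosen response format.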