Transcription and translation uploads
For `POST /v1/audio/transcriptions` and `POST /v1/audio/translations`, the request body is `multipart/form-data`.
Required and optional fields
| Field | Required | Notes |
|---|---|---|
| model | Yes | The transcription or translation model |
| file | Yes | Binary audio file upload |
| prompt | No | Short hint for formatting or name preservation |
| language | No | Optional language hint |
Example upload
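The following is a minimal sketch of a transcription upload that uses only the Python standard library. It encodes the `model`, `file`, `prompt`, and `language` fields from the table above as `multipart/form-data`. The base URL, the bearer-token `Authorization` header, the model name, and the helper names `encode_multipart` and `build_transcription_request` are assumptions made for illustration and are not taken from the API contract.

```python
import urllib.request
import uuid


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        if value is None:  # skip optional fields that were not set
            continue
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(base_url, api_key, model, filename, file_bytes,
                                prompt=None, language=None):
    """Build (but do not send) a POST request for /v1/audio/transcriptions."""
    body, content_type = encode_multipart(
        {"model": model, "prompt": prompt, "language": language},
        filename,
        file_bytes,
    )
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={
            "Content-Type": content_type,
            # Assumption: the gateway accepts a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Usage (hypothetical host, key, and model name):
#   with open("meeting.wav", "rb") as f:
#       req = build_transcription_request(
#           "https://api.example.com", "YOUR_API_KEY", "your-model",
#           "meeting.wav", f.read(), prompt="Names: Ada, Grace")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

Swapping the path for `/v1/audio/translations` should produce the equivalent translation request, since both endpoints take the same body type.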
Validation and decoding
The gateway attempts to decode the uploaded audio before processing it. If the file cannot be decoded, the API returns an `invalid_request_error`.
The OpenAPI contract requires a binary file field, but it does not publish a strict file-extension allowlist here. In practice, use standard decodable audio files and validate them in your own pipeline before upload.
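One way to validate files before upload is a quick local decode check. The sketch below covers only uncompressed WAV input, using Python's standard `wave` module; other containers would need a different decoder. The function name `is_decodable_wav` is hypothetical, and passing this check does not guarantee that the gateway will accept the file.

```python
import io
import wave


def is_decodable_wav(data: bytes) -> bool:
    """Return True if `data` parses as a WAV file with at least one audio frame."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            # A usable file needs a positive sample rate and some audio frames.
            return wav.getframerate() > 0 and wav.getnframes() > 0
    except (wave.Error, EOFError):
        # wave.Error: not a valid or supported WAV; EOFError: truncated or empty input.
        return False
```

Running a check like this in your pipeline lets you reject empty or truncated files before spending an upload on them.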
Text-to-speech output formats
For `POST /v1/audio/speech`, the `response_format` field supports:
- mp3
- opus
- aac
- flac
- wav
- pcm
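A client can check the requested format against this list before sending a speech request. The helper below is a small illustrative sketch; the names `SUPPORTED_SPEECH_FORMATS` and `check_response_format` are hypothetical, and lowercasing the input is a client-side convenience, not documented API behavior.

```python
# Formats listed for the response_format field of POST /v1/audio/speech.
SUPPORTED_SPEECH_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def check_response_format(fmt: str) -> str:
    """Return the normalized format name, or raise ValueError if it is not listed."""
    normalized = fmt.strip().lower()
    if normalized not in SUPPORTED_SPEECH_FORMATS:
        allowed = ", ".join(SUPPORTED_SPEECH_FORMATS)
        raise ValueError(f"unsupported response_format {fmt!r}; expected one of: {allowed}")
    return normalized
```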
Practical advice
- use clean source files whenever possible
- prefer standard, decodable audio containers and codecs
- keep prompts short and specific when you need name preservation or formatting hints