Use the Audio API for direct audio workflows: speech synthesis, transcription, and translation.

Audio Operations

  • POST /v1/audio/speech converts text into audio output
  • POST /v1/audio/transcriptions converts uploaded audio into text
  • POST /v1/audio/translations translates uploaded audio into text
These are separate operations with different request and response contracts:
  • text-to-speech returns streamed audio bytes
  • transcription returns JSON text
  • translation returns JSON text
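
The split between the three contracts can be written down as a small lookup table, which a client can consult before deciding how to send a request and how to read the response. This is only a sketch: the `AudioOperation` class, the `request_body` labels, and the helper functions are hypothetical names introduced for illustration, and the assumption that speech takes a JSON body while the upload operations take multipart form data follows the usual OpenAI-compatible convention rather than anything stated on this page.

```python
from dataclasses import dataclass

# Base URL taken from the quick example on this page.
BASE_URL = "https://api.naga.ac/v1"


@dataclass(frozen=True)
class AudioOperation:
    """Hypothetical description of one audio endpoint's contract."""
    path: str
    request_body: str   # assumption: "json" or "multipart", per OpenAI-compatible usage
    response_kind: str  # from the list above: audio bytes or JSON text


AUDIO_OPERATIONS = {
    "speech": AudioOperation("/audio/speech", "json", "audio-bytes"),
    "transcription": AudioOperation("/audio/transcriptions", "multipart", "json-text"),
    "translation": AudioOperation("/audio/translations", "multipart", "json-text"),
}


def endpoint_url(operation: str) -> str:
    """Return the full POST URL for a named audio operation."""
    return BASE_URL + AUDIO_OPERATIONS[operation].path


def returns_json(operation: str) -> bool:
    """True when the response should be parsed as JSON rather than saved as bytes."""
    return AUDIO_OPERATIONS[operation].response_kind == "json-text"
```

A caller would parse the body as JSON when `returns_json(...)` is true and otherwise write the streamed bytes to a file or audio sink.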

Text to Speech

Convert text into streamed audio output.

Speech to Text

Transcribe uploaded audio into text.

Speech Translation

Translate uploaded audio into text.

Formats and Uploads

Check supported file formats and upload guidance.

Quick Example

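from pathlib import Path
from openai import OpenAI

# Point the OpenAI SDK at the Audio API base URL.
client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",  # replace with your own key
)

# Upload a local audio file for transcription.
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=Path("sample.mp3"),  # the SDK accepts a path-like object or an open binary file
)

# The JSON response exposes the transcribed text.
print(transcription.text)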
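
Text-to-speech returns streamed audio bytes rather than JSON, so the calling pattern differs from the transcription example. The sketch below shows one way to write the stream to disk chunk by chunk. The helper names `save_audio_stream` and `synthesize_to_file` are hypothetical, and the model name `tts-1` and voice `alloy` are placeholder assumptions; check which models and voices your account actually supports.

```python
from pathlib import Path
from typing import Iterable


def save_audio_stream(chunks: Iterable[bytes], destination: Path) -> int:
    """Write audio chunks to `destination` as they arrive; return bytes written."""
    written = 0
    with destination.open("wb") as out:
        for chunk in chunks:
            written += out.write(chunk)
    return written


def synthesize_to_file(text: str, destination: Path) -> int:
    """Request speech for `text` and stream the audio into `destination`."""
    # Imported here so save_audio_stream stays usable without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.naga.ac/v1",
        api_key="YOUR_API_KEY",
    )
    # Assumption: "tts-1" and "alloy" are placeholder model and voice names.
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice="alloy",
        input=text,
    ) as response:
        # iter_bytes() yields the body incrementally instead of buffering it all.
        return save_audio_stream(response.iter_bytes(), destination)
```

Calling `synthesize_to_file("Hello there.", Path("hello.mp3"))` would then write the audio without holding the full response in memory.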

Reference