Endpoints

Audio Generation

POST https://api.oneinfer.ai/v1/ula/generate-audio

Generate high-quality audio (Text-to-Speech) or transcribe audio files (Speech-to-Text) using the universal API, which supports state-of-the-art audio models for both tasks.

01 Request Headers

Authorization (string, required)

Bearer token for authentication. Format: Bearer <YOUR_TOKEN>. Exchange your API key for a token via the Authentication endpoint.

Content-Type (string, required)

Set to application/json for all audio generation requests.
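The two required headers can be assembled in one place. The following is a minimal sketch, not an official client: `build_headers` is a hypothetical helper name, and accepting a token that already carries the `Bearer ` prefix is a convenience added here rather than documented behavior.

```python
def build_headers(token: str) -> dict[str, str]:
    """Return the Authorization and Content-Type headers described above."""
    token = token.strip()
    if not token:
        raise ValueError("a bearer token is required")
    # Convenience (assumption of this sketch): tolerate a token that
    # was already prefixed with "Bearer ".
    if not token.lower().startswith("bearer "):
        token = f"Bearer {token}"
    return {
        "Authorization": token,
        "Content-Type": "application/json",
    }
```

The token itself is obtained from the Authentication endpoint; this helper only formats it.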

02 Request Parameters

provider (string, required)

Audio provider. Supported: 'minimax' and 'sarvam'. Use the GET Models endpoint to retrieve available provider and model details.

model (string, required)

Audio model name. For Sarvam TTS use 'bulbul:v2' or 'bulbul:v3'.

prompt (string, required)

Text to synthesize into speech.

stream (boolean)

If true, returns chunked audio bytes directly instead of JSON metadata.

voice_id (string)

Voice identifier. For Sarvam models, fetch supported values from the Get Supported Voice for Audio Models endpoint.

format (string)

Output audio format (e.g. mp3, wav). For MiniMax streaming, mp3 is required.
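The parameter rules above can be checked on the client before a request is sent. This sketch is illustrative only: `build_payload` and `audio_format` are hypothetical names, and defaulting an unset format to mp3 for MiniMax streaming is a choice made here, not documented API behavior.

```python
from typing import Any, Optional

# Providers listed in the `provider` description above.
SUPPORTED_PROVIDERS = ("minimax", "sarvam")


def build_payload(
    provider: str,
    model: str,
    prompt: str,
    *,
    stream: bool = False,
    voice_id: Optional[str] = None,
    audio_format: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body, omitting optional fields left unset."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    if not model:
        raise ValueError("model is required")
    if not prompt:
        raise ValueError("prompt is required")

    # The docs state that MiniMax streaming requires mp3. Filling in
    # "mp3" when no format was given is this sketch's own choice.
    if provider == "minimax" and stream:
        if audio_format is None:
            audio_format = "mp3"
        elif audio_format != "mp3":
            raise ValueError("MiniMax streaming requires format='mp3'")

    payload: dict[str, Any] = {
        "provider": provider,
        "model": model,
        "prompt": prompt,
        "stream": stream,
    }
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if audio_format is not None:
        payload["format"] = audio_format
    return payload
```

Validating locally avoids a round trip that would otherwise end in a 400 or 422 response.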

03 Example Request

Example TTS Request
{
    "provider": "sarvam",
    "model": "bulbul:v3",
    "prompt": "Namaste from OneInfer audio generation.",
    "stream": false,
    "voice_id": "shubh",
    "format": "mp3",
    "speed": 1.0,
    "volume": 1.0,
    "pitch": 0
}
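A request body like the one above could be sent with Python's standard library as sketched below. The function names and the 60-second timeout are assumptions of this sketch. `stream_audio` relies on the documented behavior that `stream: true` returns raw audio bytes rather than JSON.

```python
import json
import shutil
import urllib.request
from typing import Any, BinaryIO

ENDPOINT = "https://api.oneinfer.ai/v1/ula/generate-audio"


def make_request(token: str, payload: dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request for the endpoint."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_audio(token: str, payload: dict[str, Any],
                   timeout: float = 60.0) -> dict[str, Any]:
    """Send a non-streaming request and return the parsed JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = make_request(token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def stream_audio(token: str, payload: dict[str, Any], sink: BinaryIO,
                 timeout: float = 60.0) -> None:
    """Send a streaming request and copy the audio bytes into `sink`."""
    request = make_request(token, {**payload, "stream": True})
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        shutil.copyfileobj(resp, sink)
```

For example, `generate_audio(token, payload)` returns the JSON document, while `stream_audio(token, payload, open("out.mp3", "wb"))` writes the chunks to a file as they arrive.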

04 Response

{
    "api_details": {
        "api_status": "success",
        "api_message": "API has returned response successfully."
    },
    "data": {
        "id": "aud_12345abcde",
        "created": 1711468800,
        "text": "Generated audio for prompt: Namaste from OneInfer audio generation.",
        "finish_reason": "stop",
        "provider": "sarvam",
        "model": "bulbul:v3",
        "usage": {
            "prompt_tokens": 7,
            "completion_tokens": 0,
            "total_tokens": 7
        },
        "latency_ms": 980,
        "audios": [
            {
                "url": "data:audio/mpeg;base64,....",
                "format": "mp3",
                "base64_data": "...",
                "mime_type": "audio/mpeg"
            }
        ]
    },
    "error": {}
}
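To turn a response like this into playable audio, the `base64_data` field of each `audios` entry needs decoding. The sketch below assumes that field holds standard base64 of the encoded audio file, which the field name suggests but the reference does not state. `extract_audio` is a hypothetical helper name.

```python
import base64
from typing import Any


def extract_audio(response: dict[str, Any]) -> list[tuple[str, bytes]]:
    """Return (format, raw_bytes) for each entry in data.audios."""
    status = response.get("api_details", {}).get("api_status")
    if status != "success":
        raise RuntimeError(f"request did not succeed: {response.get('error')}")

    clips = []
    for audio in response.get("data", {}).get("audios", []):
        # Assumption: base64_data is standard base64 of the audio bytes.
        raw = base64.b64decode(audio["base64_data"])
        clips.append((audio["format"], raw))
    return clips
```

Each returned pair can then be written to disk, for example with `open(f"clip.{fmt}", "wb").write(raw)`.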

Error Status Codes

Code  Status                  Description
200   OK                      Audio generated successfully.
400   Bad Request             Invalid request body or unsupported provider/model.
401   Unauthorized            Missing or invalid Authorization header / Bearer token.
403   Forbidden               Insufficient credit balance.
422   Unprocessable Entity    Request body failed schema validation.
500   Internal Server Error   Unexpected error during audio generation.
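A client can map these codes to a single exception type so that callers handle failures uniformly. This is a sketch only: `AudioApiError` and `check_status` are hypothetical names, and the descriptions are copied from the table above.

```python
# Descriptions taken from the status code table above.
STATUS_DESCRIPTIONS = {
    200: "Audio generated successfully.",
    400: "Invalid request body or unsupported provider/model.",
    401: "Missing or invalid Authorization header / Bearer token.",
    403: "Insufficient credit balance.",
    422: "Request body failed schema validation.",
    500: "Unexpected error during audio generation.",
}


class AudioApiError(Exception):
    """Raised for any status code other than 200."""

    def __init__(self, code: int, description: str) -> None:
        super().__init__(f"{code}: {description}")
        self.code = code
        self.description = description


def check_status(code: int) -> None:
    """Return silently on 200, otherwise raise AudioApiError."""
    if code == 200:
        return
    description = STATUS_DESCRIPTIONS.get(code, "Undocumented status code.")
    raise AudioApiError(code, description)
```

Keeping the numeric code on the exception lets callers distinguish, say, a 403 that calls for topping up credit from a 401 that calls for a fresh token.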
