Audio Generation
https://api.oneinfer.ai/v1/ula/generate-audio
Generate high-quality audio (Text-to-Speech) or transcribe audio files (Speech-to-Text) using the universal API, which supports state-of-the-art audio models for both tasks.
01 Request Headers
Authorization: Bearer token for authentication. Format: Bearer <YOUR_TOKEN>. Exchange your API key for a token via the Authentication endpoint.
Content-Type: Set to application/json for all audio generation requests.
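As a minimal sketch, the two headers described above could be assembled like this. The header names `Authorization` and `Content-Type` are the standard HTTP names assumed for the two descriptions, `build_headers` is a hypothetical helper rather than part of any OneInfer SDK, and the token value is a placeholder.

```python
def build_headers(token: str) -> dict:
    """Build the request headers for an audio generation call.

    Assumption: the bearer token goes in the standard Authorization header
    and the JSON media type goes in Content-Type.
    """
    if not token:
        raise ValueError("a bearer token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_TOKEN")  # placeholder, not a real token
```

Rejecting an empty token up front is a design choice of this sketch: it fails before any request is sent, rather than waiting for the server to reject the call.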
02 Request Parameters
provider: Audio provider. Supported values: 'minimax' and 'sarvam'. Use the GET Models endpoint to retrieve available provider and model details.
model: Audio model name. For Sarvam TTS, use 'bulbul:v2' or 'bulbul:v3'.
prompt: Text to synthesize into speech.
stream: If true, the endpoint returns chunked audio bytes directly instead of JSON metadata.
voice_id: Voice identifier. For Sarvam models, fetch supported values from the Get Supported Voice for Audio Models endpoint.
format: Output audio format (e.g. mp3, wav). For MiniMax streaming, mp3 is required.
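The parameter rules above can be sketched as a small payload builder. This is an illustration under stated assumptions, not an official client: `build_audio_payload` and `SUPPORTED_PROVIDERS` are hypothetical names, the JSON field names are taken from the example request in this document, and the provider set is hard-coded from the text here even though the GET Models endpoint is the authoritative source.

```python
import json

# Assumption: mirrors the two providers named in the text; may go stale.
SUPPORTED_PROVIDERS = {"minimax", "sarvam"}


def build_audio_payload(provider, model, prompt, voice_id,
                        audio_format="mp3", stream=False):
    """Return a request body dict, rejecting combinations the text rules out."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    # The format description states that MiniMax streaming requires mp3.
    if provider == "minimax" and stream and audio_format != "mp3":
        raise ValueError("MiniMax streaming requires format='mp3'")
    return {
        "provider": provider,
        "model": model,
        "prompt": prompt,
        "stream": stream,
        "voice_id": voice_id,
        "format": audio_format,
    }


payload = build_audio_payload("sarvam", "bulbul:v3", "Namaste", "shubh")
body = json.dumps(payload)  # JSON string to send as the request body
```

Checking the MiniMax streaming constraint on the client side surfaces the mistake before a request is sent; the server remains the final authority on which combinations are valid.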
03 Example Request
{
  "provider": "sarvam",
  "model": "bulbul:v3",
  "prompt": "Namaste from OneInfer audio generation.",
  "stream": false,
  "voice_id": "shubh",
  "format": "mp3",
  "speed": 1.0,
  "volume": 1.0,
  "pitch": 0
}
04 Response
Status Codes
| Code | Status | Description |
|---|---|---|
| 200 | OK | Audio generated successfully. |
| 400 | Bad Request | Invalid request body or unsupported provider/model. |
| 401 | Unauthorized | Missing or invalid Authorization header / Bearer token. |
| 403 | Forbidden | Insufficient credit balance. |
| 422 | Unprocessable Entity | Request body failed schema validation. |
| 500 | Internal Server Error | Unexpected error during audio generation. |
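
One way a client might act on the codes in the table is sketched below. The descriptions are copied from the table; `AudioApiError` and `check_status` are hypothetical names introduced for illustration, and the fallback message for unlisted codes is an assumption of this sketch.

```python
# Descriptions copied from the status code table above.
STATUS_DESCRIPTIONS = {
    200: "Audio generated successfully.",
    400: "Invalid request body or unsupported provider/model.",
    401: "Missing or invalid Authorization header / Bearer token.",
    403: "Insufficient credit balance.",
    422: "Request body failed schema validation.",
    500: "Unexpected error during audio generation.",
}


class AudioApiError(Exception):
    """Hypothetical exception type that carries the HTTP status code."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status


def check_status(status: int) -> None:
    """Return silently on 200; raise AudioApiError for any other code."""
    if status == 200:
        return
    # Codes not listed in the table get a generic fallback message.
    message = STATUS_DESCRIPTIONS.get(status, "Undocumented status code.")
    raise AudioApiError(status, message)
```

A caller would pass the status of the HTTP response to `check_status` and catch `AudioApiError`, for example to prompt for a new token on 401 or to report a credit problem on 403.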