Soniox is a speech-to-text API that provides accurate audio transcription with speaker diarization and language detection. It is best suited to workflows that convert spoken audio to text, such as meeting transcription, voice command processing, and audio content indexing. Unlike ElevenLabs (text-to-speech), Soniox handles the inverse: converting speech audio into structured text. Six endpoints are available through Lava’s AI Gateway. See the Soniox API docs for full documentation.
Supports both managed (Lava’s API keys) and unmanaged (bring your own credentials) modes.

Endpoints

Create a transcription job

POST https://api.soniox.com/v1/transcriptions — $1.50 input, $3.50 output / 1M tokens
const data = await lava.gateway('https://api.soniox.com/v1/transcriptions', { body: {"audio_url":"https://example.com/audio.mp3"} });
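The per-token pricing listed for this endpoint can be turned into a rough cost estimate. The sketch below assumes the rates are $1.50 per 1M input tokens and $3.50 per 1M output tokens, as listed above. `estimateCostUsd` is a hypothetical helper, not part of the Lava SDK, and how tokens are counted for audio is not specified here, so the token counts in the example are purely illustrative.

```javascript
// Rates as listed for this endpoint, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 1.5;
const OUTPUT_USD_PER_MILLION = 3.5;

// Hypothetical helper: estimate the charge for one request from token counts.
function estimateCostUsd(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Illustrative token counts only: 200k input tokens, 50k output tokens.
console.log(estimateCostUsd(200_000, 50_000).toFixed(4)); // 0.4750
```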

Get transcript text

GET https://api.soniox.com/v1/transcriptions/{id}/transcript — Free

Get transcription status and results

GET https://api.soniox.com/v1/transcriptions/{id} — $1.50 input, $3.50 output / 1M tokens
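The three transcription endpoints above are typically used together: create a job, check its status until it finishes, then fetch the transcript text. The following is a minimal sketch of that flow, not a definitive implementation. It assumes that `gateway` has the same call shape as `lava.gateway(url, options)` and resolves to parsed JSON, that the create response carries an `id`, and that the status response carries a `status` field that becomes `"completed"` or `"error"`. Those field names and values are assumptions to verify against the Soniox API docs. `transcribe` and its options are hypothetical names.

```javascript
const BASE = 'https://api.soniox.com/v1';

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: create a job, poll until it finishes, return the transcript.
// ASSUMPTION: response fields `id` and `status` ("completed" / "error").
async function transcribe(gateway, audioUrl, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  const job = await gateway(`${BASE}/transcriptions`, {
    method: 'POST',
    body: { audio_url: audioUrl },
  });
  const jobUrl = `${BASE}/transcriptions/${encodeURIComponent(job.id)}`;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await gateway(jobUrl, { method: 'GET' });
    if (current.status === 'completed') {
      // The transcript endpoint is listed as free.
      return gateway(`${jobUrl}/transcript`, { method: 'GET' });
    }
    if (current.status === 'error') {
      throw new Error(`Transcription ${job.id} failed`);
    }
    await sleep(intervalMs);
  }
  throw new Error(`Transcription ${job.id} did not finish after ${maxAttempts} polls`);
}
```

With a configured client this might be called as `await transcribe((url, options) => lava.gateway(url, options), 'https://example.com/audio.mp3')`. Because the status endpoint is listed with per-token pricing while the transcript endpoint is free, a longer polling interval may be preferable to frequent checks.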

Upload an audio file

POST https://api.soniox.com/v1/files — Free
const data = await lava.gateway('https://api.soniox.com/v1/files', { method: 'POST' });

Get file details

GET https://api.soniox.com/v1/files/{id} — Free
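Several endpoints on this page contain an `{id}` placeholder that must be replaced before the URL is passed to the gateway. One way to do that safely is sketched below. `fillPath` is a hypothetical helper, not part of the Lava SDK, and `FILE_ID` stands in for an identifier returned by an earlier call.

```javascript
// Hypothetical helper: substitute `{name}` placeholders in an endpoint template,
// URL-encoding each value so unusual characters cannot alter the path.
function fillPath(template, params) {
  return template.replace(/\{(\w+)\}/g, (_, name) => {
    if (!(name in params)) {
      throw new Error(`Missing value for {${name}}`);
    }
    return encodeURIComponent(String(params[name]));
  });
}

const url = fillPath('https://api.soniox.com/v1/files/{id}', { id: 'FILE_ID' });
console.log(url); // https://api.soniox.com/v1/files/FILE_ID
```

The resulting URL can then be passed to `lava.gateway` with `{ method: 'GET' }`, in the same way as the other GET examples on this page.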

List available speech models

GET https://api.soniox.com/v1/models — Free
const data = await lava.gateway('https://api.soniox.com/v1/models', { method: 'GET' });

Next Steps

All Providers

Browse all supported AI providers

Forward Proxy

Learn how to construct proxy URLs and authenticate requests