SenseVoiceSmall

Brief: Transcribes audio to text with the SenseVoiceSmall model, which can detect the spoken language automatically.


Overview

  • Method: POST
  • Path: /v1/audio/transcriptions
  • Content-Type: multipart/form-data

Authentication

  • Header: Authorization: Bearer <token>
  • Supports bearer token authentication
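The samples further down inline a placeholder token (sk-xxxx). As a minimal sketch, the header can instead be built from an environment variable so the token stays out of source code. The variable name SENSEVOICE_API_KEY and the helper auth_headers are illustrative assumptions, not part of the API.

```python
import os


def auth_headers(env_var="SENSEVOICE_API_KEY"):
    """Build the Authorization header from an environment variable.

    The variable name is a hypothetical choice for this sketch; use
    whatever name fits your deployment.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable to your API token")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of requests.post.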

Request Body Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file | file | Yes | Audio file object. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size: 25 MB. |
| model | string | Yes | Model name. Set to SenseVoiceSmall. |
| language | string | No | Audio language. Supported values: auto, zh, en, de, es, yue, ja, ko, nospeech. Default: auto. |

Note: When language is set to auto, the model will automatically detect the audio language.
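The constraints in the parameter table can be checked on the client before uploading, which avoids a wasted round trip. The following is a rough sketch, not part of the API: the helper name validate_request is hypothetical, the extension check assumes the file name reflects the actual format, and "25 MB" is interpreted as 25 × 1024 × 1024 bytes, which the documentation does not specify.

```python
import os

# Values copied from the parameter table.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_LANGUAGES = {"auto", "zh", "en", "de", "es", "yue", "ja", "ko", "nospeech"}
# Assumption: the 25 MB limit is binary megabytes; the docs do not say.
MAX_FILE_BYTES = 25 * 1024 * 1024


def validate_request(path, language="auto", size_bytes=None):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []

    # Compare the file extension (without the dot, case-insensitive).
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file extension: {ext or '(none)'}")

    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")

    # Read the size from disk unless the caller already knows it.
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")

    return problems
```

Calling validate_request("audio.wav") before sending the request returns an empty list when the file passes all three checks. The server remains the authority on what it accepts.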


curl Example

bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=SenseVoiceSmall" \
  -F "language=auto"

JavaScript (fetch) Example

javascript
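// audioFile must be a File or Blob, for example one selected via <input type="file">.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'SenseVoiceSmall');
formData.append('language', 'auto');

// Do not set Content-Type manually: fetch adds the multipart boundary itself.
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => {
    // Surface HTTP errors instead of parsing an error body as a result.
    if (!r.ok) throw new Error(`Request failed with HTTP ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);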

Python (requests) Example

python
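import requests

# Open the audio in binary mode; requests builds the multipart/form-data body.
with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'SenseVoiceSmall',
        'language': 'auto'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )

# Raise an exception on 4xx/5xx responses before reading the body.
response.raise_for_status()
print(response.json())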

Response Example (200)

json
{
  "text": "Hello, this is the SenseVoiceSmall transcription result."
}
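A small helper can pull the transcript out of a response shaped like the one above. This is a sketch under stated assumptions: the name extract_text is hypothetical, and because the documentation shows only the 200 response, any other status is treated as a failure without assuming a particular error body format.

```python
def extract_text(status_code, payload):
    """Return the transcript from a parsed JSON response.

    status_code: the HTTP status of the response.
    payload: the decoded JSON body (expected to be a dict with a "text" key).
    """
    if status_code != 200:
        # Only the 200 shape is documented, so include the raw payload for debugging.
        raise RuntimeError(f"transcription failed with HTTP {status_code}: {payload!r}")

    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        raise ValueError("response JSON has no string 'text' field")
    return text.strip()
```

With the requests sample, this would be called as extract_text(response.status_code, response.json()).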

Note: SenseVoiceSmall supports multi-language recognition and automatic language detection, making it suitable for fast audio transcription.