whisper-large-v3-turbo
Brief: Transcribe audio to text using OpenAI whisper-large-v3-turbo, optimized for low-latency multi-language recognition.
Overview
- Method: POST
- Path: /v1/audio/transcriptions
- Content-Type: multipart/form-data
Authentication
- Header: Authorization: Bearer <token>
- Supports bearer token authentication
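A minimal sketch of building this header in Python, assuming the token is kept in an environment variable rather than hard-coded. The helper name `build_auth_headers` and the variable name `WHISPER_API_TOKEN` are hypothetical and not part of the API.

```python
import os


def build_auth_headers(token=None):
    """Return the Authorization header for a bearer token.

    Falls back to the WHISPER_API_TOKEN environment variable
    (a hypothetical name chosen for this sketch).
    """
    token = token or os.environ.get("WHISPER_API_TOKEN", "")
    if not token:
        raise ValueError("no API token provided")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed directly as the `headers` argument of `requests.post`.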
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | Audio file object, supporting flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Max file size 25 MB |
| model | string | Yes | Model name, set to whisper-large-v3-turbo |
| response_format | string | No | Output format. Supported values: json, text, srt, verbose_json, vtt. Default: json |
| language | string | No | Language of the audio as an ISO-639-1 code. Supported values: zh, en, de, es. Specifying the language improves accuracy |
whisper-large-v3-turbo is a high-performance speech recognition model optimized for low latency and large-scale usage.
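The constraints in the parameter table can be checked on the client before uploading, which avoids a round trip for requests that would be rejected. The following is a sketch only: the helper `validate_request` is hypothetical, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption, since the source does not say which definition of a megabyte applies.

```python
import os

# Values copied from the request body parameter table.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "es"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB


def validate_request(filename, size_bytes, response_format="json", language=None):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    # Take the extension without the leading dot, case-insensitively.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 25 MB")
    if response_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported response_format: {response_format}")
    # language is optional, so only check it when it is supplied.
    if language is not None and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

For a local file, `validate_request(path, os.path.getsize(path), "json", "zh")` gives the list of problems to report before sending anything.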
curl Example
bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
-H "Authorization: Bearer sk-xxxx" \
-F "file=@./audio.wav" \
-F "model=whisper-large-v3-turbo" \
-F "response_format=json" \
-F "language=zh"
JavaScript (fetch) Example
javascript
// audioFile is assumed to be a File or Blob, e.g. from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large-v3-turbo');
formData.append('response_format', 'json');
formData.append('language', 'zh');
// Do not set Content-Type manually; fetch adds the multipart boundary itself.
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
}).then(r => r.json()).then(console.log);
Python (requests) Example
python
import requests
# Open the audio in binary mode; requests builds the multipart/form-data body.
with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-large-v3-turbo',
        'response_format': 'json',
        'language': 'zh'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )
print(response.json())
Response Example (200)
json
{
  "text": "Hello, this is the OpenAI Whisper Large V3 Turbo transcription result."
}
Note: whisper-large-v3-turbo offers a strong balance between low latency and accuracy, making it suitable for real-time transcription.
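Because `response_format` changes the shape of the response body, client code may want a single place that extracts the transcript. The sketch below assumes that `json` and `verbose_json` bodies are JSON objects with a `text` field, as in the 200 example, and that `text`, `srt`, and `vtt` bodies are plain text. The second assumption follows the usual OpenAI-compatible convention and is not stated in this document. The helper `extract_transcript` is hypothetical.

```python
import json


def extract_transcript(body, response_format="json"):
    """Return the transcript carried by a response body given as a str.

    Assumption: json and verbose_json bodies are JSON objects with a
    "text" field; text, srt, and vtt bodies are returned unchanged.
    """
    if response_format in ("json", "verbose_json"):
        return json.loads(body)["text"]
    return body
```

With the `requests` example, this would be called as `extract_transcript(response.text, "json")`.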