
whisper-large-v3-turbo

Brief: Transcribe audio to text using OpenAI whisper-large-v3-turbo, optimized for low-latency multi-language recognition.


Overview

  • Method: POST
  • Path: /v1/audio/transcriptions
  • Content-Type: multipart/form-data

Authentication

  • Header: Authorization: Bearer <token>
  • Scheme: bearer token. Replace <token> with your own token; the examples use the placeholder sk-xxxx
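
The header above can be built without hardcoding the token in source files. The following is a minimal sketch; the environment variable name WHISPER_API_TOKEN and the helper auth_headers are hypothetical and not part of the API.

```python
import os


def auth_headers(env_var: str = "WHISPER_API_TOKEN") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name your deployment already relies on.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of requests.post in place of the literal 'Bearer sk-xxxx' string.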

Request Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | file | Yes | Audio file object, supporting flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Max file size 25 MB |
| model | string | Yes | Model name, set to whisper-large-v3-turbo |
| response_format | string | No | Output format. Supported values: json, text, srt, verbose_json, vtt. Default: json |
| language | string | No | Audio language. Supported values: zh, en, de, es. Use ISO 639-1 codes to improve accuracy |

whisper-large-v3-turbo is a high-performance speech recognition model optimized for low latency and large-scale usage.
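
The parameter constraints listed above can be checked locally before uploading, which avoids sending a request that is likely to be rejected. This is a sketch only: the helper find_request_problems is hypothetical, "25 MB" is assumed to mean 25 × 1024 × 1024 bytes, and the file type is judged by extension alone rather than by content.

```python
from pathlib import Path

# Values copied from the parameter table.
AUDIO_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
LANGUAGES = {"zh", "en", "de", "es"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB


def find_request_problems(filename, size_bytes, response_format="json", language=None):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in AUDIO_EXTENSIONS:
        problems.append(f"unsupported audio format: {extension or 'none'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    if response_format not in RESPONSE_FORMATS:
        problems.append(f"unsupported response_format: {response_format}")
    if language is not None and language not in LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

In practice the size would come from os.path.getsize(path), and the request would only be sent when the returned list is empty.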


curl Example

bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=whisper-large-v3-turbo" \
  -F "response_format=json" \
  -F "language=zh"

JavaScript (fetch) Example

javascript
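// audioFile is a File or Blob, e.g. taken from an <input type="file"> element
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large-v3-turbo');
formData.append('response_format', 'json');
formData.append('language', 'zh');

// Do not set Content-Type manually: fetch adds the multipart boundary itself
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => {
    // Surface HTTP errors instead of parsing an error body as a transcript
    if (!r.ok) throw new Error(`Transcription failed: HTTP ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);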

Python (requests) Example

python
import requests

with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-large-v3-turbo',
        'response_format': 'json',
        'language': 'zh'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())

Response Example (200)

json
{
  "text": "Hello, this is the OpenAI Whisper Large V3 Turbo transcription result."
}
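
The response above corresponds to response_format=json. A client that lets callers pick other formats needs to read the body differently. The sketch below rests on two assumptions that this page does not confirm: that verbose_json also carries a top-level "text" field, and that text, srt, and vtt come back as a plain-text body rather than JSON. The helper extract_transcript is hypothetical.

```python
import json


def extract_transcript(response_format: str, body: str) -> str:
    """Return the transcript (or subtitle text) from a raw response body."""
    if response_format in ("json", "verbose_json"):
        # JSON formats: the transcript is in the "text" field, as in the example above.
        return json.loads(body)["text"]
    # Assumption: text, srt, and vtt are returned as-is in the response body.
    return body
```

With requests, the raw body is available as response.text, so the call would be extract_transcript(data['response_format'], response.text).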

Note: whisper-large-v3-turbo offers a strong balance between low latency and accuracy, making it suitable for real-time transcription.