whisper-large-v3

Brief: Transcribe audio to text with OpenAI's whisper-large-v3, a high-accuracy multi-language speech recognition model.


Overview

  • Method: POST
  • Path: /v1/audio/transcriptions
  • Content-Type: multipart/form-data

Authentication

  • Header: Authorization: Bearer <token>
  • Supports bearer token authentication
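
The header above can be built in Python roughly as follows. This is a minimal sketch, not part of the API: the helper name `auth_headers` and the environment variable `TRANSCRIBE_API_TOKEN` are hypothetical names chosen for illustration.

```python
import os


def auth_headers(token=None):
    """Return the Authorization header for a bearer token.

    Falls back to the TRANSCRIBE_API_TOKEN environment variable,
    a hypothetical name used only in this sketch.
    """
    token = token or os.environ.get("TRANSCRIBE_API_TOKEN")
    if not token:
        raise ValueError("no API token provided")
    return {"Authorization": f"Bearer {token}"}
```

Reading the token from the environment keeps it out of source files; the resulting dictionary can be passed as the `headers` argument of `requests.post`.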

Request Body Parameters

Parameter | Type   | Required | Description
----------|--------|----------|------------
file      | file   | Yes      | Audio file object, supporting flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Max file size 25 MB
model     | string | Yes      | Model name, set to whisper-large-v3
language  | string | No       | Audio language. Supported values: zh, en, de, es. Use ISO-639-1 codes to improve accuracy
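
The constraints listed for these parameters can be checked on the client before uploading. The sketch below is one possible pre-flight check, under two assumptions: that 25 MB means 25 × 1024 × 1024 bytes, and that the file type can be judged from the filename extension (the server may inspect the content instead). The function `validate_request` and its constants are hypothetical helpers, not part of the API.

```python
import os

# Values taken from the parameter descriptions above
ALLOWED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "es"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def validate_request(filename, size_bytes, language=None):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    # Compare the extension case-insensitively, without the leading dot
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 25 MB")
    # language is optional, so only check it when it is given
    if language is not None and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

For example, `validate_request("audio.wav", os.path.getsize("audio.wav"), "zh")` returns an empty list when the file is within the limit.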

whisper-large-v3 is a high-accuracy, multi-language speech recognition model, suitable for real-time transcription and voice interaction applications.


curl Example

bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=whisper-large-v3" \
  -F "language=zh"

JavaScript (fetch) Example

javascript
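// audioFile is a File or Blob, e.g. taken from an <input type="file"> element
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large-v3');
formData.append('language', 'zh');

// Do not set Content-Type manually; fetch adds the multipart boundary itself
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => {
    // Surface HTTP errors instead of parsing an error body as a result
    if (!r.ok) throw new Error(`HTTP ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);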

Python (requests) Example

python
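import requests

# Open the audio in binary mode; requests builds the multipart/form-data body
with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-large-v3',
        'language': 'zh'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )

# Raise on 4xx/5xx responses before parsing the body
response.raise_for_status()
print(response.json())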

Response Example (200)

json
{
  "text": "Hello, this is the OpenAI Whisper Large V3 transcription result."
}
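
A small sketch of pulling the transcript out of a response shaped like the one above. The helper `extract_text` is hypothetical, and since the error response format is not documented here, the sketch only assumes that a successful body is a JSON object with a string `text` field.

```python
import json


def extract_text(body):
    """Parse a JSON response body and return the transcript string."""
    payload = json.loads(body)
    # Guard against bodies that are valid JSON but not the expected shape
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        raise ValueError("response has no 'text' field")
    return text
```

With the `requests` example, this could be called as `extract_text(response.text)`.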

Note: whisper-large-v3 is suitable for high-accuracy multi-language transcription, especially for voice interaction and real-time scenarios.