whisper-large

Brief: Transcribe audio to text using OpenAI whisper-large.


Overview

  • Method: POST
  • Path: /v1/audio/transcriptions
  • Content-Type: multipart/form-data

Authentication

  • Header: Authorization: Bearer <token>
  • Supports bearer token authentication

Request Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | file | Yes | Audio file object. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size: 25 MB |
| model | string | Yes | Model name; set to whisper-large |
| language | string | No | Language of the audio as an ISO-639-1 code. Supported values: zh, en, de, es. Supplying it improves accuracy |

whisper-large is the larger OpenAI Whisper model. It is suited to higher-accuracy transcription of audio that contains multiple languages, accents, or background noise.
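The constraints in the parameter table can be checked on the client before uploading, which avoids sending a request that is likely to be rejected. The following is a minimal sketch, not part of the API: `validate_request` is a hypothetical helper, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption, since the page does not say which definition of megabyte applies.

```python
from pathlib import Path

# Values copied from the parameter table.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "es"}
# Assumption: 25 MB is interpreted as 25 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 25 * 1024 * 1024


def validate_request(filename, size_bytes, language=None):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    # Compare the file extension case-insensitively, without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    # language is optional, so only check it when it is supplied.
    if language is not None and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

In practice the size would come from something like `os.path.getsize(path)`. The check only looks at the file extension, so it cannot detect a file whose contents do not match its name.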


curl Example

bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=whisper-large" \
  -F "language=zh"

JavaScript (fetch) Example

javascript
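// audioFile is a File or Blob, e.g. taken from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large');
formData.append('language', 'zh');

// Do not set Content-Type by hand; fetch adds the multipart boundary itself.
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => r.json())
  .then(console.log)
  .catch(console.error);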

Python (requests) Example

python
import requests

with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-large',
        'language': 'zh'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )
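# Raise an exception on HTTP 4xx/5xx instead of printing an error body.
response.raise_for_status()
print(response.json())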

Response Example (200)

json
{
  "text": "Hello, this is the OpenAI Whisper Large transcription result."
}
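A successful response carries the transcription in a single `text` field. The sketch below shows one way to read it defensively; `extract_text` is a hypothetical helper, and because this page does not document the error format, anything without a string `text` field is simply treated as a failure.

```python
import json


def extract_text(body):
    """Parse a JSON response body and return the transcription text."""
    payload = json.loads(body)
    # The documented 200 response is an object with a "text" string.
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        raise ValueError("response has no string 'text' field")
    return text
```

With the `requests` example, this would be called as `extract_text(response.text)`.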

Note: whisper-large is trained on 680,000 hours of multilingual data, which makes it suited to high-accuracy transcription of complex audio.