whisper-large-v3
Brief: Transcribe audio to text using OpenAI whisper-large-v3, a high-accuracy multi-language speech recognition model.
Overview
- Method: POST
- Path: /v1/audio/transcriptions
- Content-Type: multipart/form-data
Authentication
- Header: Authorization: Bearer <token>
- Supports bearer token authentication
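The header described above can be assembled in one place so that the token never has to be hard-coded in request code. The following is a minimal sketch, not part of the API itself: the helper name `build_auth_headers` and the environment variable `WHISPER_API_TOKEN` are hypothetical choices made for illustration.

```python
import os


def build_auth_headers(token=None):
    """Build the Authorization header for bearer token authentication."""
    # Assumption: when no token is passed in, it is read from WHISPER_API_TOKEN.
    # This variable name is hypothetical and not defined by the API.
    token = token or os.environ.get("WHISPER_API_TOKEN", "")
    token = token.strip()
    if not token:
        raise ValueError("no API token provided")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the `headers` argument of `requests.post`.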
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | Audio file object, supporting flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Max file size 25 MB |
| model | string | Yes | Model name, set to whisper-large-v3 |
| language | string | No | Language of the audio, given as an ISO-639-1 code. Supported values: zh, en, de, es. Specifying the language improves accuracy |
whisper-large-v3 is a high-accuracy, multi-language speech recognition model, suitable for real-time transcription and voice interaction applications.
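The constraints in the parameter table can be checked on the client before uploading, which avoids sending a request that is likely to be rejected. This is a rough sketch under stated assumptions: the helper `validate_request` is hypothetical, the file format is judged by file extension only, and "25 MB" is interpreted as 25 × 1024 × 1024 bytes because the table does not say which definition applies.

```python
from pathlib import Path

# Values taken from the parameter table above.
ALLOWED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "es"}
# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; the table does not specify.
MAX_FILE_BYTES = 25 * 1024 * 1024


def validate_request(filename, size_bytes, language=None):
    """Return a list of problems found; an empty list means the request looks valid."""
    problems = []
    # Compare the extension case-insensitively and without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    # language is optional, so it is only checked when provided.
    if language is not None and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language code: {language}")
    return problems
```

For a file on disk, `os.path.getsize(path)` gives a suitable value for `size_bytes`.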
curl Example
```bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=whisper-large-v3" \
  -F "language=zh"
```
JavaScript (fetch) Example
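```javascript
// audioFile is a File or Blob, for example one selected through an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large-v3');
formData.append('language', 'zh');

// Do not set Content-Type manually; fetch adds the multipart boundary for FormData bodies.
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
}).then(r => r.json()).then(console.log);
```
Python (requests) Example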
```python
import requests

# Keep the request inside the with block so the file is still open during upload.
with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-large-v3',
        'language': 'zh'
    }
    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data
    )

print(response.json())
```
Response Example (200)
```json
{
  "text": "Hello, this is the OpenAI Whisper Large V3 transcription result."
}
```
Note: whisper-large-v3 is suitable for high-accuracy multi-language transcription, especially for voice interaction and real-time scenarios.
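A successful response carries the transcription in its `text` field, so client code usually needs only that one value. The sketch below shows one way to extract it defensively. The helper `extract_transcript` is hypothetical, and because the format of error responses is not documented here, the raw body is surfaced unchanged for any status other than 200.

```python
import json


def extract_transcript(status_code, body):
    """Return the transcribed text from a response body, or raise on failure."""
    if status_code != 200:
        # Assumption: the error body format is unknown, so it is reported as-is.
        raise RuntimeError(f"transcription failed with HTTP {status_code}: {body}")
    payload = json.loads(body)
    if not isinstance(payload, dict) or "text" not in payload:
        raise ValueError("response JSON has no 'text' field")
    return payload["text"]
```

With the `requests` library, this could be called as `extract_transcript(response.status_code, response.text)`.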