whisper-1

Brief: Transcribe audio to text using OpenAI whisper-1.


Overview

  • Method: POST
  • Path: /v1/audio/transcriptions
  • Content-Type: multipart/form-data

Authentication

  • Scheme: bearer token
  • Header: Authorization: Bearer <token>
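
The header above can be built in Python as in the following minimal sketch. It assumes the token is kept in an environment variable; the variable name `TRANSCRIBE_API_KEY` and the helper `auth_headers` are hypothetical and not part of the API.

```python
import os


def auth_headers(env_var: str = "TRANSCRIBE_API_KEY") -> dict:
    """Build the Authorization header from a token stored in the environment.

    The environment variable name is an illustrative assumption.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your API token first")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as `headers=` to an HTTP client, which keeps the token out of source code.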

Request Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | file | Yes | Audio file object. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size: 25 MB |
| model | string | Yes | Model name. Set to whisper-1 |
| prompt | string | No | Optional text to guide the transcription style. Should be in the same language as the audio |
| response_format | string | No | Output format. Supported: json, text, srt, verbose_json, vtt. Default: json |
| temperature | number | No | Sampling temperature, from 0 to 1. Higher values make the output more random; lower values make it more deterministic |
| timestamp_granularities | array | No | Timestamp granularities to include. Options: word, segment. Requires response_format set to verbose_json |
| language | string | No | Language of the audio as an ISO-639-1 code (for example, zh). Setting it improves transcription accuracy |
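
The constraints in the table can be checked on the client before uploading. The sketch below is one way to do that in Python, not an official client. It makes three assumptions: "25 MB" is read as 25 MiB, timestamp granularities take the values `word` and `segment` and are only accepted together with `verbose_json` (the upstream OpenAI behavior), and array values are sent as repeated `timestamp_granularities[]` form fields. The helper name `build_form_fields` is hypothetical.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
GRANULARITIES = {"word", "segment"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB


def build_form_fields(path, size_bytes, response_format="json", temperature=None,
                      language=None, prompt=None, timestamp_granularities=()):
    """Validate the request parameters and return them as (name, value) pairs."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    unknown = set(timestamp_granularities) - GRANULARITIES
    if unknown:
        raise ValueError(f"unknown timestamp granularities: {sorted(unknown)}")
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format=verbose_json")

    fields = [("model", "whisper-1"), ("response_format", response_format)]
    if temperature is not None:
        fields.append(("temperature", str(temperature)))
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    # Assumption: arrays are encoded as repeated "name[]" multipart fields.
    for granularity in timestamp_granularities:
        fields.append(("timestamp_granularities[]", granularity))
    return fields
```

The returned list can be passed as `data=` to `requests.post`, with the audio file supplied separately through `files=`.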

curl Example

bash
curl -X POST "https://api.gpt.ge/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@./audio.wav" \
  -F "model=whisper-1" \
  -F "response_format=json" \
  -F "language=zh"

JavaScript (fetch) Example

javascript
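// audioFile is assumed to be a File or Blob, e.g. taken from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-1');
formData.append('response_format', 'json');
formData.append('language', 'zh');

// Do not set Content-Type yourself; fetch adds the multipart boundary automatically.
fetch('https://api.gpt.ge/v1/audio/transcriptions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => {
    if (!r.ok) throw new Error(`Request failed with status ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);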

Python (requests) Example

python
import requests

with open('audio.wav', 'rb') as f:
    files = {'file': f}
    data = {
        'model': 'whisper-1',
        'response_format': 'json',
        'language': 'zh'
    }
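    response = requests.post(
        'https://api.gpt.ge/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer sk-xxxx'},
        files=files,
        data=data,
        timeout=60  # avoid hanging indefinitely on a stalled upload
    )
response.raise_for_status()  # raise on 4xx/5xx before parsing the body
print(response.json())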

Response Example (200)

json
{
  "text": "Hello, I am OpenAI Whisper."
}

Note: If response_format is srt or vtt, the response is subtitle text with timestamps rather than the JSON object shown above. If it is verbose_json, the response is JSON that also carries detailed timestamp information.
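
Because the body shape depends on response_format, client code usually has to branch on it. The following Python sketch shows one way to do so. It assumes the upstream OpenAI behavior, where text, srt, and vtt come back as plain text and verbose_json carries a `segments` list; the helper name `extract_transcript` is hypothetical.

```python
import json


def extract_transcript(body: str, response_format: str = "json"):
    """Return (content, segments) for a raw response body.

    For json and verbose_json, content is the "text" field. For text, srt,
    and vtt, content is the body itself (plain text or a subtitle document).
    segments is empty unless the JSON body includes a "segments" list,
    which is assumed to be the case only for verbose_json.
    """
    if response_format in ("json", "verbose_json"):
        payload = json.loads(body)
        return payload["text"], payload.get("segments", [])
    if response_format in ("text", "srt", "vtt"):
        return body.strip(), []
    raise ValueError(f"unsupported response_format: {response_format!r}")
```

With the `requests` example above, this would be called as `extract_transcript(response.text, 'json')`, which avoids a JSON decoding error when a subtitle format was requested.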