Task: OCR Service

Short description: Use the OCR service to recognize text in images or documents and generate downloadable text or document results.

Overview

Method: POST
Path: /task/pic/ocr
Content-Type: multipart/form-data

Authentication

Header: Authorization: Bearer <token>
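
As a minimal sketch of building this header without hard-coding the token, the helper below reads it from an environment variable. The variable name `OCR_API_TOKEN` and the function name `build_auth_headers` are hypothetical and are not defined by the API.

```python
import os


def build_auth_headers(token=None):
    """Return the Authorization header expected by the OCR endpoint."""
    # OCR_API_TOKEN is a hypothetical variable name chosen for this sketch.
    token = token or os.environ.get("OCR_API_TOKEN")
    if not token:
        raise ValueError("no API token provided")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call.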

Request Example

Form Parameters

Parameter	Type	Required	Description
image_file	file	No	Source image or document file (binary). Mutually exclusive with `image_url`; submit exactly one of the two. Supported types: pdf, ppt, pptx, xls, xlsx, doc, docx, jpeg, jpg, png, gif, bmp
image_url	string	No	URL of the source image or document. Mutually exclusive with `image_file`. The URL must be served on port 80 or 443
format	string	No	Output format: one of `txt`, `pdf`, `docx`, `xlsx`, `pptx`
language	string	No	Language of the input file; default `ChinesePRC`. Separate multiple languages with commas, for example `English,ChinesePRC,Digits`
password	string	No	Password for the input document if it is protected; maximum 32 characters
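
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_form_fields` is a hypothetical helper, and it only reflects the rules stated above (one source, a known output format, a password of at most 32 characters). The server may apply further checks.

```python
ALLOWED_FORMATS = {"txt", "pdf", "docx", "xlsx", "pptx"}
MAX_PASSWORD_LENGTH = 32


def build_form_fields(image_url=None, has_image_file=False,
                      fmt=None, language=None, password=None):
    """Validate the documented constraints and return the text form fields.

    The binary image_file part is attached separately by the HTTP client;
    has_image_file only tells this helper that a file will be sent.
    """
    # image_url and image_file are mutually exclusive, and one is needed.
    if (image_url is not None) == has_image_file:
        raise ValueError("provide exactly one of image_url or image_file")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if password is not None and len(password) > MAX_PASSWORD_LENGTH:
        raise ValueError("password is longer than 32 characters")

    candidates = {
        "image_url": image_url,
        "format": fmt,
        "language": language,
        "password": password,
    }
    # Optional parameters that were not supplied are left out of the form.
    return {name: value for name, value in candidates.items() if value is not None}
```

For example, `build_form_fields(image_url="https://example.com/document.pdf", fmt="txt")` returns a dictionary with only the `image_url` and `format` fields.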

curl Example

bash

curl -X POST "https://api.gpt.ge/task/pic/ocr" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "image_url=https://example.com/document.pdf" \
  -F "format=txt" \
  -F "language=ChinesePRC,English" \
  -F "password=123456"

JavaScript (fetch) Example

javascript

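// Build the multipart body. When the body is a FormData object, fetch sets
// the Content-Type header (including the boundary) automatically.
const formData = new FormData();
formData.append('image_url', 'https://example.com/document.pdf');
formData.append('format', 'txt');
formData.append('language', 'ChinesePRC,English');
formData.append('password', '123456');

fetch('https://api.gpt.ge/task/pic/ocr', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
})
  .then(r => {
    // Surface HTTP errors instead of trying to parse an error page as JSON.
    if (!r.ok) throw new Error(`OCR request failed: HTTP ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);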

Python (requests) Example

python

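import requests

url = 'https://api.gpt.ge/task/pic/ocr'
headers = {
    'Authorization': 'Bearer sk-xxxx'
}

# Passing image_url through `files` with a filename of None makes requests
# encode the body as multipart/form-data, which this endpoint expects.
files = {
    'image_url': (None, 'https://example.com/document.pdf')
}
data = {
    'format': 'txt',
    'language': 'ChinesePRC,English',
    'password': '123456'
}

response = requests.post(url, headers=headers, files=files, data=data, timeout=60)
response.raise_for_status()  # raise on HTTP 4xx/5xx responses
print(response.json())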
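
The examples above all pass `image_url`. As a sketch of the other option, the code below uploads a local file through `image_file`. The helper names `build_upload_kwargs` and `submit_file` are hypothetical, and the `txt` default for the output format is only borrowed from the examples, not a documented default.

```python
from pathlib import Path

OCR_URL = "https://api.gpt.ge/task/pic/ocr"


def build_upload_kwargs(path, token, fmt="txt", language="ChinesePRC"):
    """Build keyword arguments for requests.post() to upload a local file."""
    path = Path(path)
    return {
        "headers": {"Authorization": f"Bearer {token}"},
        # A (filename, bytes) tuple is sent by requests as a multipart file part.
        "files": {"image_file": (path.name, path.read_bytes())},
        "data": {"format": fmt, "language": language},
    }


def submit_file(path, token, **options):
    """Upload the file and return the decoded JSON response."""
    # Imported here so that build_upload_kwargs works without requests installed.
    import requests

    response = requests.post(
        OCR_URL, timeout=60, **build_upload_kwargs(path, token, **options)
    )
    response.raise_for_status()
    return response.json()
```

A call such as `submit_file("scan.png", "sk-xxxx", language="English")` would send the file with `image_file` set and `image_url` omitted, in line with the mutual-exclusion rule.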

Response Example (200)

json

{
  "status": 200,
  "data": {
    "file": "https://wxtechsz.oss-cn-shenzhen.aliyuncs.com/tasks/output/ocr/a695981c-5c4f-45c4-a931-92bf4f58077f.txt",
    "type": 101,
    "state": 1,
    "task_id": "a695981c-5c4f-45c4-a931-92bf4f58077f",
    "progress": 100,
    "ocr_pages": 1,
    "created_at": 1746953927,
    "file_pages": 1,
    "input_size": 116929,
    "output_size": 164,
    "completed_at": 1746953930,
    "processed_at": 1746953927,
    "state_detail": "Complete"
  }
}
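
The field meanings are not spelled out beyond the example, so the following sketch rests on assumptions: that `created_at` and `completed_at` are Unix timestamps in seconds, and that `progress` of 100 together with `state_detail` of `Complete` marks a finished task. `summarize_task` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Trimmed copy of the response example above.
SAMPLE_RESPONSE = {
    "status": 200,
    "data": {
        "file": "https://wxtechsz.oss-cn-shenzhen.aliyuncs.com/tasks/output/ocr/"
                "a695981c-5c4f-45c4-a931-92bf4f58077f.txt",
        "task_id": "a695981c-5c4f-45c4-a931-92bf4f58077f",
        "progress": 100,
        "created_at": 1746953927,
        "completed_at": 1746953930,
        "state_detail": "Complete",
    },
}


def summarize_task(payload):
    """Extract the fields a client typically needs from an OCR response."""
    data = payload["data"]
    created_at = data["created_at"]
    completed_at = data.get("completed_at")
    return {
        "task_id": data["task_id"],
        # Assumption: these two values together indicate a finished task.
        "finished": data.get("progress") == 100
        and data.get("state_detail") == "Complete",
        "download_url": data.get("file"),
        # Assumption: timestamps are Unix seconds.
        "created_utc": datetime.fromtimestamp(created_at, tz=timezone.utc).isoformat(),
        "elapsed_seconds": completed_at - created_at if completed_at else None,
    }


summary = summarize_task(SAMPLE_RESPONSE)
print(summary["finished"], summary["elapsed_seconds"], summary["download_url"])
```

For the sample above, the task is reported as finished with an elapsed time of 3 seconds, and `download_url` holds the link to the generated text file.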

Note: This endpoint uses multipart/form-data to upload files. Submit either image_file or image_url, not both. The language parameter supports multiple comma-separated values and is case-sensitive.
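
Because the `language` value is comma-separated and case-sensitive, it can help to build it from a list rather than by hand. The sketch below uses a hypothetical helper, `join_languages`; it trims whitespace and drops duplicates but deliberately leaves the case of each name untouched.

```python
def join_languages(languages):
    """Join language names into the comma-separated form used by `language`."""
    cleaned = []
    for name in languages:
        name = name.strip()
        if not name:
            raise ValueError("empty language name")
        if "," in name:
            raise ValueError(f"language name must not contain a comma: {name!r}")
        # Case is preserved on purpose: the parameter is case-sensitive.
        if name not in cleaned:
            cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one language is required")
    return ",".join(cleaned)
```

For example, `join_languages(["English", " ChinesePRC", "English", "Digits"])` yields `English,ChinesePRC,Digits`.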
