Task: PDF Parse

Short description: Parses an uploaded PDF and returns the extracted text or structured results. The request is sent as multipart/form-data.

Overview

Method: POST
Path: /task/gi/pdf-parse
Content-Type: multipart/form-data

Authentication

Header: Authorization: Bearer <token>
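As a minimal sketch of how the header above might be assembled without hard-coding the key, the helper below reads the token from an environment variable. The function name `build_auth_headers` and the variable name `PDF_PARSE_API_TOKEN` are hypothetical and are not defined by this API.

```python
import os


def build_auth_headers(token=None):
    """Return the Authorization header for the PDF Parse endpoint.

    PDF_PARSE_API_TOKEN is a hypothetical environment variable name
    chosen for this sketch; use whatever your deployment provides.
    """
    token = token or os.environ.get("PDF_PARSE_API_TOKEN", "")
    if not token:
        raise ValueError("missing API token")
    return {"Authorization": f"Bearer {token}"}


headers = build_auth_headers("sk-xxxx")
```

When the result is passed to `requests` together with `files=`, there is no need to set Content-Type by hand: the library generates the multipart boundary itself.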

Request Example

Form Parameters

Parameter	Type	Required	Description
file	file	Yes	PDF file to be parsed
end_pages	integer	Yes	Number of PDF pages to process
is_ocr	boolean	No	Whether to enable OCR
language	string	No	Specify language to improve recognition accuracy; default is auto-detect. Supported values: `ch`, `en`, `korean`, `japan`, `chinese_cht`, `ta`, `te`, `ka`, `latin`, `arabic`, `cyrillic`, `devanagari`
formula_enable	boolean	No	Whether to enable formula parsing
table_enable	boolean	No	Whether to enable table parsing
layout_model	string	No	Layout analysis model: `layoutlmv3` or `doclayout_yolo`
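The parameter table above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_form_fields` is a hypothetical helper, the rule that `end_pages` must be a positive integer is an assumption, and sending booleans as lowercase `true`/`false` strings follows the request examples in this document rather than a documented rule.

```python
# Value lists copied from the parameter table; the helper itself is hypothetical.
SUPPORTED_LANGUAGES = {
    "ch", "en", "korean", "japan", "chinese_cht", "ta", "te", "ka",
    "latin", "arabic", "cyrillic", "devanagari",
}
SUPPORTED_LAYOUT_MODELS = {"layoutlmv3", "doclayout_yolo"}


def build_form_fields(end_pages, is_ocr=None, language=None,
                      formula_enable=None, table_enable=None,
                      layout_model=None):
    """Return the non-file form fields as a dict of strings."""
    # Assumption: end_pages is a positive page count.
    if isinstance(end_pages, bool) or not isinstance(end_pages, int) or end_pages < 1:
        raise ValueError("end_pages must be a positive integer")
    if language is not None and language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if layout_model is not None and layout_model not in SUPPORTED_LAYOUT_MODELS:
        raise ValueError(f"unsupported layout_model: {layout_model!r}")

    fields = {"end_pages": str(end_pages)}
    # Assumption: booleans travel as "true"/"false", as in the examples.
    for name, value in (("is_ocr", is_ocr),
                        ("formula_enable", formula_enable),
                        ("table_enable", table_enable)):
        if value is not None:
            fields[name] = "true" if value else "false"
    # Optional string parameters are omitted when not set,
    # so the server-side defaults (such as language auto-detect) apply.
    if language is not None:
        fields["language"] = language
    if layout_model is not None:
        fields["layout_model"] = layout_model
    return fields


fields = build_form_fields(10, is_ocr=False, language="en", table_enable=True)
```

The returned dict can be passed as the `data=` argument of `requests.post`, with the PDF supplied separately through `files=`.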

curl Example

bash

curl -X POST "https://api.gpt.ge/task/gi/pdf-parse" \
  -H "Authorization: Bearer sk-xxxx" \
  -F "file=@/path/to/document.pdf" \
  -F "end_pages=10" \
  -F "is_ocr=false" \
  -F "language=en" \
  -F "formula_enable=false" \
  -F "table_enable=true" \
  -F "layout_model=layoutlmv3"

JavaScript (fetch) Example

javascript

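// fileInput is assumed to be an <input type="file"> element on the page
const fileInput = document.querySelector('input[type="file"]');

const formData = new FormData();
formData.append('file', fileInput.files[0]);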
formData.append('end_pages', '10');
formData.append('is_ocr', 'false');
formData.append('language', 'en');
formData.append('formula_enable', 'false');
formData.append('table_enable', 'true');
formData.append('layout_model', 'layoutlmv3');

fetch('https://api.gpt.ge/task/gi/pdf-parse', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-xxxx'
  },
  body: formData
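})
  .then(r => {
    // Surface HTTP errors instead of trying to parse an error body as a result
    if (!r.ok) throw new Error(`HTTP ${r.status}`);
    return r.json();
  })
  .then(console.log)
  .catch(console.error);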

Python (requests) Example

python

import requests

url = 'https://api.gpt.ge/task/gi/pdf-parse'
headers = {
    'Authorization': 'Bearer sk-xxxx'
}
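data = {
    'end_pages': 10,
    'is_ocr': 'false',
    'language': 'en',
    'formula_enable': 'false',
    'table_enable': 'true',
    'layout_model': 'layoutlmv3'
}

# Open the PDF in a context manager so the file handle is closed after the upload
with open('document.pdf', 'rb') as f:
    response = requests.post(url, headers=headers, files={'file': f}, data=data)

# Raise an exception on 4xx/5xx responses before reading the body
response.raise_for_status()
print(response.json())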

Response Example (200)

json

{
  "text": "我是morphogen api智能客服，小墨\n需要我的帮助吗？"
}

Note: This endpoint accepts the PDF as a multipart/form-data upload. The response body contains a `text` field holding the extracted text.
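Reading the `text` field described in the note can be sketched as follows. This is an illustration only: `extract_text` is a hypothetical helper, the sample body is made up for the example, and the error handling assumes nothing about how the API reports failures beyond the field being absent.

```python
import json


def extract_text(body):
    """Return the `text` field from a raw or already-decoded JSON body."""
    payload = json.loads(body) if isinstance(body, (str, bytes)) else body
    if not isinstance(payload, dict) or "text" not in payload:
        raise ValueError("response has no 'text' field")
    return payload["text"]


# Hypothetical response body; "\\n" here is a JSON newline escape.
sample = '{"text": "first line\\nsecond line"}'
lines = extract_text(sample).splitlines()
```

With `requests`, the same helper accepts the decoded result directly, for example `extract_text(response.json())`.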
