Documentation

API documentation

About

Pdfocr.org is one of the best online PDF conversion tools. It helps customers convert PDFs and images to various file formats, including Word, Excel, PowerPoint, plain text and more.

The Pdfocr.org online service is free for everyone, with a limit on its core functionality, OCR. In general, you can convert up to five scanned PDF files for free every day, each limited to 20 pages and a maximum size of 15M. Once you use up your daily limit, a small fee applies: you can either pay or wait a day.

Pdfocr also offers API access for customers who have lots of files to convert. You can sign up to get an API key, buy some credits, and you are good to go. If you are not familiar with coding, we also provide a web page where you can make conversions.

At the moment, PDF to images and images to PDF conversions are not supported.

Please unlock your PDF before uploading if it has password or permission restrictions.

API Usage

Once you have the API key, use POST to send your file (under the key 'file', max 60M) and the request parameters to the endpoint: https://pdfocr.org/ocrapiv1.html

To avoid unexpected errors, the file name of your PDF should contain only letters, numbers and hyphens.
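The two upload constraints above (file name characters and the 60M size cap) can be checked locally before sending anything. The following is a minimal sketch, not part of the service: the helper names are hypothetical, "letters" is assumed to mean ASCII letters, and "60M" is assumed to mean 60 MiB.

```python
import os
import re

# Assumption: "60M" in the text above is read as 60 MiB.
MAX_UPLOAD_BYTES = 60 * 1024 * 1024


def is_safe_name(name):
    """True if the name (without extension) uses only ASCII letters, digits and hyphens."""
    stem, _ext = os.path.splitext(os.path.basename(name))
    return re.fullmatch(r"[A-Za-z0-9-]+", stem) is not None


def sanitize_name(name):
    """Replace each run of disallowed characters with a hyphen, keeping the extension."""
    stem, ext = os.path.splitext(os.path.basename(name))
    cleaned = re.sub(r"[^A-Za-z0-9-]+", "-", stem).strip("-")
    # Fall back to a generic name if nothing usable is left.
    return (cleaned or "file") + ext


def fits_upload_limit(size_bytes):
    """True if a file of this size is within the assumed upload cap."""
    return size_bytes <= MAX_UPLOAD_BYTES
```

For a file on disk, `fits_upload_limit(os.path.getsize(path))` gives the size check, and `sanitize_name` can be used to pick a safe name before uploading.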

Post Parameters

Key | Value | Description
API key | The API key you get when you sign up | Sign up to get yours.
lang | The language of your PDF file; for supported values see ref [1] | English is the default language. If your PDF contains another language in addition to English, choose the other language.
filext | The extension of the file you want to convert to, for example 'docx', 'xlsx', 'txt' etc. | Please note this is not the extension of your PDF file; we get that extension from the file you upload.
ocr | Specifies whether OCR is required | 1 for yes, 0 for no
filename | File name plus filext. For instance, example.pptx is the file name for example.pdf | This parameter is used to query whether the conversion is complete if you are disconnected from the server for whatever reason; see the explanation in ref [2].

[1]. At the moment, supported language values include: English, French, Spanish, German, Italian, Turkish, Thai, Arabic, Russian, Korean, Japanese, Indonesian, Portuguese, Vietnamese, ChineseSimplified, ChineseTraditional

[2]. Please note that pdfocr.org uses Cloudflare as its DNS server, which disconnects you from the server after about 100 seconds. If your file is big enough that conversion takes over 100 seconds, you will get the timeout status code 524. In this case, you can send a few requests to the same URL at intervals of a few seconds, with the parameter downfile set to your file name plus the filext (such as example.xlsx), and you will get a response once the conversion is complete.
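As an illustration of the parameter table, the form fields for one request could be assembled as in the sketch below. The function name `build_fields` is hypothetical, and the form field used for the API key is an assumption copied from the table as written ("API key"); check your account page for the exact name. The language set comes from ref [1].

```python
import os

# Language values listed in ref [1].
SUPPORTED_LANGS = {
    "English", "French", "Spanish", "German", "Italian", "Turkish", "Thai",
    "Arabic", "Russian", "Korean", "Japanese", "Indonesian", "Portuguese",
    "Vietnamese", "ChineseSimplified", "ChineseTraditional",
}

# Assumption: the field name is taken literally from the parameter table.
API_KEY_FIELD = "API key"


def build_fields(api_key, pdf_name, filext, lang="English", ocr=True):
    """Return the POST fields for converting pdf_name to the filext format."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported lang: {lang!r}")
    filext = filext.lstrip(".")  # accept "docx" as well as ".docx"
    stem = os.path.splitext(os.path.basename(pdf_name))[0]
    return {
        API_KEY_FIELD: api_key,
        "lang": lang,
        "filext": filext,
        "ocr": "1" if ocr else "0",
        # Output name: the PDF's name with the target extension.
        "filename": f"{stem}.{filext}",
    }
```

Validating `lang` locally avoids spending a request on a value the service does not list.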

Response

The API returns results in JSON format. A typical successful response looks like this:

{"retcode":"200","msg":"Conversion success.","data":{"filename":"example.docx","link":"https://pdfocr.org/doc/example.docx","available pages":2000}}

A failed response looks like this:

{"retcode":"404","msg":"API key does not exist, please sign up."}

Key | Value | Description
retcode | Shows whether your conversion succeeded | 200 for success, 4xx for failure
msg | An explanation of the retcode | Tells you why a conversion failed so you can act accordingly
data | Information about the conversion result | Contains the converted file name, download link, etc.
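A response body with these three keys can be handled along the following lines. This is a minimal sketch: `parse_response` and `ConversionError` are hypothetical names, and it assumes that any retcode other than "200" should be treated as a failure.

```python
import json


class ConversionError(Exception):
    """Raised when the API reports a retcode other than 200."""

    def __init__(self, retcode, msg):
        super().__init__(f"{retcode}: {msg}")
        self.retcode = retcode
        self.msg = msg


def parse_response(body):
    """Parse a JSON response body and return its 'data' part on success."""
    payload = json.loads(body)
    retcode = str(payload.get("retcode", ""))
    if retcode != "200":
        raise ConversionError(retcode, payload.get("msg", ""))
    return payload.get("data", {})
```

With the successful example above, `parse_response(body)["link"]` gives the download link; with the failed example, the raised error carries the retcode and the message.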

Sample code

Here is a simple Python sample for reference. In production, limit the while loop so that it runs for a few minutes only.
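The sketch below is a reconstruction written for this page, not the original sample. It uploads one file, and if Cloudflare cuts the connection (status 524) it polls with downfile until a deadline. Assumptions: the third-party `requests` library is used for HTTP, the API key field is named "API key" as in the parameter table, polling requests also carry the API key, and a poll response counts as finished only when its retcode is "200". The function name `convert` and its parameters are hypothetical.

```python
import os
import time

API_URL = "https://pdfocr.org/ocrapiv1.html"
API_KEY_FIELD = "API key"  # assumption: field name copied from the parameter table


def convert(path, api_key, lang="English", filext="docx", ocr=1,
            post=None, interval=5.0, max_wait=180.0, sleep=time.sleep):
    """Upload one file and return the parsed JSON result."""
    if post is None:
        import requests  # third-party HTTP client, installed separately
        post = requests.post

    stem = os.path.splitext(os.path.basename(path))[0]
    filename = f"{stem}.{filext}"
    fields = {API_KEY_FIELD: api_key, "lang": lang, "filext": filext,
              "ocr": str(ocr), "filename": filename}

    with open(path, "rb") as fh:
        resp = post(API_URL, data=fields, files={"file": fh})
    if resp.status_code != 524:
        return resp.json()

    # Timed out after about 100 seconds: ask for the result every few
    # seconds, but stop once max_wait seconds have passed.
    poll_fields = {API_KEY_FIELD: api_key, "downfile": filename}
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        sleep(interval)
        resp = post(API_URL, data=poll_fields)
        try:
            result = resp.json()
        except ValueError:
            continue  # no JSON body yet, try again
        if isinstance(result, dict) and result.get("retcode") == "200":
            return result
    raise TimeoutError(f"{filename} was not ready after {max_wait} seconds")
```

A call such as `convert("example.pdf", "YOUR-API-KEY", filext="xlsx")` returns the decoded JSON response. The `post` and `sleep` parameters exist only so the loop can be exercised without network access; the `max_wait` deadline is what keeps the while loop bounded.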

Support

If you need help, drop me a line at https://www.facebook.com/pdfocr