Documentation

API documentation

About

Pdfocr.org is one of the best online PDF conversion tools. It helps customers convert PDFs and images to various file formats, including Word, Excel, PowerPoint, plain text and more.

The Pdfocr.org online service is free for everyone, with a limit on its core feature, OCR. In general, you can convert up to five scanned PDF files per day for free; each file is limited to 20 pages and a maximum size of 15 MB.

Pdfocr also offers API access for customers who have many files to convert. Sign up to get an API key, buy some credits, and you are good to go. If you are not familiar with coding, we also provide a web page where you can make conversions.

At the moment, PDF-to-image and image-to-PDF conversions are not supported.

Please unlock your PDF before uploading it if it has password or permission restrictions.

API Usage

Once you have an API key, there are two ways to make a request to the endpoint https://pdfocr.org/ocrapiv1.html: upload your file with POST, or pass a URL to your file with GET (make sure the file is accessible and downloadable). Either way, send the other required parameters along with the request.

To avoid unexpected errors, the name of your PDF file should contain only letters, numbers and hyphens.
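The file name rule above can be checked before a file is sent. The following is a minimal sketch; the helper names are hypothetical, and treating ".pdf" as the only allowed extension is an assumption, since the text only names the allowed characters.

```python
import re

# Letters, digits and hyphens in the base name, followed by ".pdf".
# Accepting only the ".pdf" extension is an assumption made for this sketch.
_SAFE_PDF_NAME = re.compile(r"[A-Za-z0-9-]+\.[Pp][Dd][Ff]")


def is_safe_pdf_name(name: str) -> bool:
    """Return True if the base name uses only letters, digits and hyphens."""
    return _SAFE_PDF_NAME.fullmatch(name) is not None


def sanitize_pdf_name(name: str) -> str:
    """Replace each run of disallowed characters in the base name with one hyphen."""
    base = name[:-4] if name.lower().endswith(".pdf") else name
    cleaned = re.sub(r"[^A-Za-z0-9]+", "-", base).strip("-")
    # Fall back to a generic name if nothing usable is left.
    return (cleaned or "file") + ".pdf"
```

For example, `sanitize_pdf_name("my file (1).pdf")` yields a name that passes `is_safe_pdf_name`.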

Parameters

Key | Value | Description
API key | You get an API key when you sign up | Get yours by signing up.
lang | The language of your PDF file; for supported values see [1] | English is the default. If your PDF contains another language in addition to English, specify that other language.
outformat | The output format you want to convert the file to | For example, 'docx', 'xlsx', 'txt', 'pdf', 'pptx'.
ocr | Whether OCR is required | 1 for yes, 0 for no.
url | URL of your file (no special characters) | If you do not want to upload your file, provide a URL and we will download it directly from your server. The URL should not contain any special characters, otherwise the download may fail.
filename | Name of your file (should match your uploaded file) | Used together with outformat to check whether the conversion is complete if you are disconnected from the server for any reason; see [2].

[1]. At the moment, the supported language values are: English, French, Spanish, German, Italian, Turkish, Thai, Arabic, Russian, Korean, Japanese, Indonesian, Portuguese, Vietnamese, ChineseSimplified, ChineseTraditional

[2]. Please note that pdfocr.org is served through Cloudflare, which drops the connection to the server after about 100 seconds. If your file is large and takes more than 100 seconds to convert, you will get a timeout with status code 524. In this case, send requests every few seconds with the filename and outformat parameters to the endpoint to check the status of your conversion.
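The retry approach described in note [2] can be sketched as a small polling loop. This is only an illustration: `check_status` is a hypothetical callable that you supply, which should send filename plus outformat to the endpoint and return the parsed JSON, or None if the request timed out again. The interval and attempt count are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def poll_conversion(
    check_status: Callable[[], Optional[dict]],
    interval_seconds: float = 5.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call check_status until it returns a result or the attempts run out.

    check_status is a hypothetical caller-supplied function, not part of the
    API: it returns the parsed JSON response, or None after another timeout
    (for example status code 524).
    """
    for attempt in range(1, max_attempts + 1):
        result = check_status()
        if result is not None:
            return result
        # Wait before the next check, but not after the final attempt.
        if attempt < max_attempts:
            sleep(interval_seconds)
    raise TimeoutError(f"No result after {max_attempts} status checks")
```

The `sleep` argument is injectable so the loop can be exercised without real waiting.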

Response

The API returns results in JSON format. A typical successful response looks like this:

{"retcode":"200","msg":"Conversion success.","data":{"filename":"example.docx","link":"https://pdfocr.org/doc/example.docx","available pages":2000}}

A failed response looks like this:

{"retcode":"404","msg":"API key does not exist, please sign up."}

Key | Value | Description
retcode | Shows whether your conversion succeeded | 200 for success; 4xx means failure.
msg | An explanation of the retcode | Tells you why a conversion failed so you can act accordingly.
data | Information about the conversion result | Contains the converted file name, the download link, etc.
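The response fields can be handled along these lines. This is a minimal sketch based only on the two sample responses shown in this section; the `ConversionError` class and `parse_response` function are hypothetical names, not part of the API.

```python
import json


class ConversionError(Exception):
    """Raised when the response carries a retcode other than 200."""


def parse_response(body: str) -> dict:
    """Return the data object of a successful response, or raise ConversionError."""
    payload = json.loads(body)
    # The samples show retcode as a string, so compare as strings.
    retcode = str(payload.get("retcode", ""))
    if retcode != "200":
        raise ConversionError(f"{retcode}: {payload.get('msg', 'no message')}")
    return payload.get("data", {})
```

On success, the download link is then available as `parse_response(body)["link"]`.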

Sample code

Here is a simple Python sample for reference; adapt it as needed.
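A minimal sketch of the GET variant using only the Python standard library. The query parameter name `apikey` is an assumption (the parameter table does not give the exact key name), and the function names are hypothetical; check both against your account details before use.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://pdfocr.org/ocrapiv1.html"


def build_request_url(api_key, file_url, outformat="docx", lang="English", ocr=True):
    """Build the GET URL for converting a file that the service downloads itself."""
    params = {
        "apikey": api_key,  # assumed parameter name, not confirmed by the docs
        "lang": lang,
        "outformat": outformat,
        "ocr": "1" if ocr else "0",
        "url": file_url,
    }
    return ENDPOINT + "?" + urllib.parse.urlencode(params)


def convert_by_url(api_key, file_url, **options):
    """Send the request and return the parsed JSON response.

    urlopen raises urllib.error.HTTPError on error statuses such as 524,
    so callers may want to catch that and check the conversion status later.
    """
    request_url = build_request_url(api_key, file_url, **options)
    with urllib.request.urlopen(request_url, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a valid key and a publicly downloadable file):
# result = convert_by_url("YOUR_API_KEY", "https://example.com/scan.pdf", outformat="docx")
# print(result["msg"])
```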

Pricing

Pdfocr.org does not offer monthly or yearly plans at the moment; you simply purchase credits for the number of pages you need to convert.

Number of pages | Price | Price per page ($)
1000 | $40 | 0.04
3000 | $80 | 0.027
5000 | $100 | 0.02
10000 | $150 | 0.015
30000 | $300 | 0.01
50000 | $400 | 0.008
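The per-page figures follow from dividing the price by the number of pages. As a rough sketch of that arithmetic, the snippet below also picks the cheapest single package that covers a given page count; it assumes you buy exactly one package, which the pricing text does not state, and the function names are hypothetical.

```python
# (pages, price in USD) taken from the pricing table.
PACKAGES = [(1000, 40), (3000, 80), (5000, 100), (10000, 150), (30000, 300), (50000, 400)]


def price_per_page(pages, price):
    """Price per page in USD, rounded to three decimals as in the table."""
    return round(price / pages, 3)


def cheapest_package(pages_needed):
    """Return the cheapest single (pages, price) package covering pages_needed.

    Returns None if no single package is large enough. Combining several
    packages is not considered in this sketch.
    """
    candidates = [pkg for pkg in PACKAGES if pkg[0] >= pages_needed]
    return min(candidates, key=lambda pkg: pkg[1]) if candidates else None
```

For instance, a job of 4000 pages fits the 5000-page package at $100.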

Support

If you need help, drop me a line at https://www.facebook.com/pdfocr