Documentation
About
Pdfocr.org is one of the best online PDF conversion tools. It helps customers convert PDFs and images to various file formats, including Word, Excel, PowerPoint, plain text, and more.
The Pdfocr.org online service is free for everyone, with a limit on its core feature, OCR. In general, you can convert up to five scanned PDF files for free each day; each file is limited to 20 pages and 15 MB in size.
Pdfocr also offers API access for customers who have many files to convert. Sign up to get an API key, buy some credits, and you are good to go. If you are not familiar with coding, we also provide a webpage for you to make conversions here.
At the moment, PDF-to-image and image-to-PDF conversions are not supported.
Please unlock your PDF before uploading if it has password or permission restrictions.
API Usage
Once you have an API key, there are two ways to make requests: use POST to upload your file, or use GET and provide a URL to your file (make sure it is accessible and downloadable). In both cases, send the other required parameters along with the request to the endpoint: https://pdfocr.org/ocrapiv1.html
To avoid unexpected errors, the name of your PDF file should contain only letters, numbers, and hyphens.
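The file name rule can be checked before sending a request. The following is a minimal sketch, assuming that "letters" means ASCII letters and that the rule applies to the name without its extension; the helper name `is_safe_filename` is hypothetical and not part of the Pdfocr API.

```python
import re
from pathlib import Path

# Assumption: only ASCII letters, digits, and hyphens are allowed in the stem.
_SAFE_STEM = re.compile(r"[A-Za-z0-9-]+")


def is_safe_filename(filename: str) -> bool:
    """Return True if the name (minus its extension) uses only letters, digits, and hyphens."""
    path = Path(filename)
    # Reject anything that carries a directory component.
    if path.name != filename:
        return False
    return bool(_SAFE_STEM.fullmatch(path.stem))
```

For example, `is_safe_filename("report-2024.pdf")` is true, while a name with a space such as `"my report.pdf"` is rejected and should be renamed before upload.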
Parameters
Key | Value | Description |
---|---|---|
API key | You get an API key when you sign up | Get yours here |
lang | The language of your PDF file; for supported values, see ref [1] | English is the default. If your PDF contains another language in addition to English, specify the other language. |
outformat | The output format you want to convert the file to | For example, 'docx', 'xlsx', 'txt', 'pdf', 'pptx', etc. |
ocr | Whether OCR is required | 1 for yes, 0 for no, 4 for other conversions such as docx/xlsx/pptx to pdf. |
url | URL of your file (no special characters) | If you do not want to upload your file, provide a URL and we will download it directly from your server. The URL should not contain any special characters, otherwise the download may fail. |
filename | Name of your file (should match your uploaded file) | Used together with outformat to check whether the conversion is complete if you are disconnected from the server for any reason; see ref [2]. |
[1]. At the moment, supported language values include: English, French, Spanish, German, Italian, Turkish, Thai, Arabic, Russian, Korean, Japanese, Indonesian, Portuguese, Vietnamese, ChineseSimplified, ChineseTraditional
[2]. Please note that pdfocr.org sits behind Cloudflare, which disconnects you from the server after about 100 seconds. If your file is large and takes more than 100 seconds to convert, you will get a timeout with status code 524. In this case, send requests to the endpoint every few seconds with the filename and outformat parameters to check the status of your conversion.
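The polling described in ref [2] could look roughly like the sketch below. It makes several assumptions: the API key is sent as a query parameter named `apikey` (the table does not give the exact name), the status query also needs the key, and a finished conversion is reported with retcode 200 while anything else means "not ready yet". The function names are hypothetical.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

ENDPOINT = "https://pdfocr.org/ocrapiv1.html"


def fetch_json(url: str) -> dict:
    """GET the URL and decode the JSON body; map HTTP errors (such as 524) to a retcode."""
    try:
        with urllib.request.urlopen(url, timeout=120) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return {"retcode": str(err.code), "msg": str(err.reason)}


def poll_status(filename, outformat, api_key, attempts=10, delay=5.0,
                fetch=fetch_json, sleep=time.sleep):
    """Ask the endpoint for the conversion status until it succeeds or attempts run out."""
    query = urllib.parse.urlencode({
        "apikey": api_key,      # assumed parameter name
        "filename": filename,   # must match the uploaded file name
        "outformat": outformat,
    })
    url = f"{ENDPOINT}?{query}"
    for attempt in range(attempts):
        result = fetch(url)
        if str(result.get("retcode")) == "200":
            return result
        if attempt < attempts - 1:
            sleep(delay)  # wait a few seconds between requests
    return None  # still not finished after all attempts
```

The `fetch` and `sleep` arguments are injectable so the loop can be exercised without touching the network.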
Response
The API returns results in JSON format. A typical successful response looks like this:
{"retcode":"200","msg":"Conversion success.","data":{"filename":"example.docx","link":"https://pdfocr.org/doc/example.docx","available pages":2000}}
A failed response looks like this:
{"retcode":"404","msg":"API key does not exist, please sign up."}
Key | Value | Description |
---|---|---|
retcode | Shows whether your conversion succeeded | 200 for success; 4xx means failure |
msg | An explanation of the retcode | Tells you why a conversion failed so you can act accordingly |
data | Information about the conversion result | Contains the converted filename, download link, etc. |
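A response with the fields above can be handled with a few lines of standard-library code. This is a minimal sketch based only on the two sample responses shown; the names `ConversionError` and `parse_response` are hypothetical, and it assumes every non-200 retcode should be treated as a failure.

```python
import json


class ConversionError(Exception):
    """Raised when the API reports a retcode other than 200."""


def parse_response(body: str) -> dict:
    """Return the `data` object of a successful response, or raise ConversionError."""
    payload = json.loads(body)
    # The samples show retcode as a string; str() also tolerates a numeric value.
    retcode = str(payload.get("retcode", ""))
    if retcode != "200":
        raise ConversionError(f"{retcode}: {payload.get('msg', '')}")
    return payload.get("data", {})
```

With the successful sample above, `parse_response(body)["link"]` gives the download link of the converted file.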
Sample code
Here is a simple Python sample for reference; adjust it to your needs.
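A minimal sketch of the GET variant, using only the Python standard library. The parameters lang, outformat, ocr, and url come from the table in the Parameters section; the query parameter name `apikey` is an assumption, since the exact name of the key parameter is not given there, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://pdfocr.org/ocrapiv1.html"


def build_request_url(api_key, file_url, lang="English", outformat="docx", ocr=1):
    """Build the GET URL for converting a file that the service downloads itself."""
    params = {
        "apikey": api_key,   # assumed parameter name
        "lang": lang,
        "outformat": outformat,
        "ocr": ocr,          # 1 = OCR, 0 = no OCR, 4 = other conversions
        "url": file_url,     # must be publicly downloadable
    }
    return f"{ENDPOINT}?{urllib.parse.urlencode(params)}"


def convert_by_url(api_key, file_url, **options):
    """Send the request and return the decoded JSON response."""
    request_url = build_request_url(api_key, file_url, **options)
    with urllib.request.urlopen(request_url, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (needs a real key and a reachable file):
# result = convert_by_url("YOUR_API_KEY", "https://example.com/report-2024.pdf")
# print(result["msg"])
```

Long conversions may end in a 524 timeout, in which case `urlopen` raises `urllib.error.HTTPError` and the status has to be queried again with filename and outformat.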
Pricing
Pdfocr.org does not offer monthly or yearly plans at the moment; simply purchase credits for the conversions you need.
Number of Pages | Price | $ per page |
---|---|---|
1000 | $40 | 0.04 |
3000 | $80 | 0.027 |
5000 | $100 | 0.02 |
10000 | $150 | 0.015 |
30000 | $300 | 0.01 |
Support
Email support is available. If you want a custom plan or have any questions, please drop me a line at znxhw#hotmail.com (replace # with @).