Documentation
About
Pdfocr.org is an online PDF conversion tool that helps customers convert PDFs and images to various file formats, including Word, Excel, PowerPoint, plain text, and more.
The Pdfocr.org online service is free for everyone, with a limit on the core OCR functionality. In general, you can convert up to five scanned PDF files for free every day; each file is limited to 20 pages and a maximum size of 15 MB.
Pdfocr also offers API access for customers who have many files to convert. Sign up to get an API key, buy some credits, and you are good to go. If you are not familiar with coding, we also provide a webpage where you can make conversions.
At the moment, PDF to images and images to PDF conversions are not supported.
Please unlock your PDF before uploading if it has password or permission restrictions.
Request
Upload your file to the endpoint https://pdfocr.org/ocrapiv1.html with the required parameters in the request headers. Sample JavaScript code looks like this:
xhr.setRequestHeader('X-Filename', encodeURIComponent(filename));
xhr.setRequestHeader('X-Lang', "English");
xhr.setRequestHeader('X-Ocr', 1);
xhr.setRequestHeader('X-Outformat', ".docx");
xhr.setRequestHeader("Authorization", "Apikey " + 'Your-key-here');
To avoid unexpected errors, the name of your PDF file should contain only letters, numbers, and hyphens.
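The naming rule above can be checked before uploading. The following is a minimal sketch, not part of the Pdfocr.org API: `is_safe_filename` and `make_safe_filename` are hypothetical helper names, and the sketch assumes that "letters" means ASCII letters and that a single dot before the extension is acceptable.

```python
import re

# Assumption: ASCII letters, digits, and hyphens in the stem, plus one
# dot separating the extension (the source does not mention the dot).
_SAFE_NAME = re.compile(r"[A-Za-z0-9-]+\.[A-Za-z0-9]+")


def is_safe_filename(name: str) -> bool:
    """Return True if the name uses only letters, digits, and hyphens."""
    return _SAFE_NAME.fullmatch(name) is not None


def make_safe_filename(name: str) -> str:
    """Replace runs of disallowed characters in the stem with a hyphen."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "-", stem).strip("-") or "file"
    ext = re.sub(r"[^A-Za-z0-9]", "", ext)
    return f"{cleaned}.{ext}" if ext else cleaned
```

For example, `make_safe_filename("my report (final).pdf")` yields a name that passes `is_safe_filename`.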
Parameters
| Key | Value | Description |
|---|---|---|
| Apikey | You get an API key when you sign up | Get yours here |
| X-Lang | The language of your PDF file. | English is the default language. If your PDF file contains another language in addition to English, specify the other language. Supported languages: English, French, Spanish, German, Italian, Turkish, Thai, Arabic, Russian, Korean, Japanese, Indonesian, Portuguese, Vietnamese, ChineseSimplified, ChineseTraditional. |
| X-Outformat | The output format you want to convert the file to | For example, 'docx', 'xlsx', 'txt', 'pdf', 'pptx', etc. |
| X-Ocr | Specifies whether OCR is required | 1 for yes, 0 for no, 4 for other conversions such as docx/xlsx/pptx to PDF. |
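The parameters in the table can be assembled and validated before a request is sent. This is a minimal sketch in Python rather than an official client: `build_headers` is a hypothetical helper, the language list and OCR values are copied from the table, and `urllib.parse.quote` only approximates JavaScript's `encodeURIComponent`. The output format is passed through unchanged, since the JavaScript sample uses ".docx" while the table lists 'docx'.

```python
from urllib.parse import quote

# Values copied from the parameter table.
SUPPORTED_LANGUAGES = {
    "English", "French", "Spanish", "German", "Italian", "Turkish",
    "Thai", "Arabic", "Russian", "Korean", "Japanese", "Indonesian",
    "Portuguese", "Vietnamese", "ChineseSimplified", "ChineseTraditional",
}
OCR_MODES = {0, 1, 4}  # 0 = no OCR, 1 = OCR, 4 = docx/xlsx/pptx to PDF


def build_headers(filename, api_key, lang="English", ocr=1, outformat="docx"):
    """Return the request headers described in the Parameters table."""
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    if ocr not in OCR_MODES:
        raise ValueError(f"X-Ocr must be one of {sorted(OCR_MODES)}")
    return {
        # Percent-encode the name, similar to encodeURIComponent.
        "X-Filename": quote(filename, safe=""),
        "X-Lang": lang,
        "X-Ocr": str(ocr),
        "X-Outformat": outformat,
        "Authorization": f"Apikey {api_key}",
    }
```

Rejecting an unknown language or OCR mode locally avoids spending a request on a conversion that would fail anyway.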
Please note that pdfocr.org is served through Cloudflare, which closes the connection after about 100 seconds. If your file is large and takes more than 100 seconds to convert, you will receive a timeout with status code 524. In this case, send a few requests to the endpoint at intervals of a few seconds, passing the filename plus the output format as the parameter, to check the status of your conversion.
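The retry-at-intervals approach after a 524 timeout can be sketched as a small polling loop. This is an illustration under assumptions: `poll_until_done` and its `check` callback are hypothetical, and how the service reports a conversion that is still running is not documented here, so the callback is simply expected to return `None` until a result is available.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[dict]],
    interval: float = 3.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call check() every `interval` seconds until it returns a result.

    check is a caller-supplied function that performs one status request
    and returns the parsed result, or None while the conversion is still
    in progress (an assumption about how "not ready" is represented).
    """
    for attempt in range(max_attempts):
        result = check()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait a few seconds between requests
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real delays, and `max_attempts` bounds how long a stuck conversion is waited on.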
Response
The API returns results in JSON format. A typical successful response looks like this:
{"retcode":"200","msg":"Conversion success.","data":{"filename":"example.docx"}}
A failed response looks like this:
{"retcode":"404","msg":"API key does not exist, please sign up."}
Once you have the filename, query the conversion result at the endpoint https://pdfocr.org/query?filename=Your-filename. For more information, please refer to the Python sample code.
If the conversion is successful, you will get the filename of the converted file, which you can download at https://pdfocr.org/doc/Your-filename
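The two URLs above can be built from the returned filename as follows. This is a small sketch using only the Python standard library; `query_url` and `download_url` are hypothetical helper names, and percent-encoding the filename is an assumption made for safety.

```python
from urllib.parse import quote, urlencode

QUERY_ENDPOINT = "https://pdfocr.org/query"
DOWNLOAD_BASE = "https://pdfocr.org/doc/"


def query_url(filename: str) -> str:
    """URL for checking the conversion result of `filename`."""
    return f"{QUERY_ENDPOINT}?{urlencode({'filename': filename})}"


def download_url(filename: str) -> str:
    """URL from which the converted file can be downloaded."""
    return DOWNLOAD_BASE + quote(filename)
```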
| Key | Value | Description |
|---|---|---|
| retcode | Indicates whether your conversion succeeded | 200 for success; 4xx means failure |
| msg | An explanation of the retcode | Tells you why a conversion failed so that you can take action accordingly |
| data | Information about the conversion results | Contains the converted filename, download link, etc. |
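Handling the three response fields might look like the sketch below. `parse_response` and `ConversionError` are hypothetical names introduced for illustration; the sketch relies only on the `retcode`, `msg`, and `data` keys described in the table and treats any retcode other than "200" as a failure.

```python
import json


class ConversionError(Exception):
    """Raised when the API reports a retcode other than 200."""

    def __init__(self, retcode: str, msg: str):
        super().__init__(f"{retcode}: {msg}")
        self.retcode = retcode
        self.msg = msg


def parse_response(body: str) -> dict:
    """Return the `data` object on success, raise ConversionError otherwise."""
    payload = json.loads(body)
    retcode = str(payload.get("retcode", ""))
    if retcode != "200":
        raise ConversionError(retcode, payload.get("msg", ""))
    return payload.get("data", {})
```

With the sample responses shown earlier, the successful one returns `{"filename": "example.docx"}`, while the failed one raises `ConversionError` carrying retcode "404" and the message from `msg`.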
Sample code
Here is a simple Python sample for your reference; please adjust it to your needs.
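The following is a minimal upload sketch and not necessarily the service's original sample. It assumes the file is sent with an HTTP POST and the raw file bytes as the request body; neither detail is stated on this page. `build_upload_request` and `convert` are hypothetical names, and `headers` is a dictionary holding the X-Filename, X-Lang, X-Ocr, X-Outformat, and Authorization values described under Request.

```python
import json
import urllib.request

ENDPOINT = "https://pdfocr.org/ocrapiv1.html"


def build_upload_request(file_bytes: bytes, headers: dict) -> urllib.request.Request:
    """Build the upload request without sending it.

    Assumption: the API accepts a POST whose body is the raw file content.
    """
    return urllib.request.Request(
        ENDPOINT, data=file_bytes, headers=headers, method="POST"
    )


def convert(path: str, headers: dict, timeout: float = 120.0) -> dict:
    """Upload the file at `path` and return the decoded JSON response."""
    with open(path, "rb") as fh:
        request = build_upload_request(fh.read(), headers)
    # urlopen raises urllib.error.HTTPError for error statuses such as 524.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `convert("report-2024.pdf", headers)` would return the parsed JSON, from which the `retcode` and `data` fields can be read.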
Pricing
Pdfocr.org does not currently offer monthly or yearly plans; you simply purchase credits for the conversions you need.
| Number of pages | Price | Price per page ($) |
|---|---|---|
| 1000 | $40 | 0.04 |
| 3000 | $80 | 0.027 |
| 5000 | $100 | 0.02 |
| 10000 | $150 | 0.015 |
| 30000 | $300 | 0.01 |
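The per-page figures in the table are the bundle price divided by the page count, rounded to three decimals. The sketch below reproduces that arithmetic and picks the cheapest single bundle that covers a given page count. `cheapest_bundle` is a hypothetical helper, and it assumes credits are bought as one bundle; whether bundles can be combined is not stated here.

```python
# (pages, price in USD) copied from the pricing table.
BUNDLES = [(1000, 40), (3000, 80), (5000, 100), (10000, 150), (30000, 300)]


def price_per_page(pages: int, price: float) -> float:
    """Bundle price divided by page count, rounded as in the table."""
    return round(price / pages, 3)


def cheapest_bundle(pages_needed: int) -> tuple:
    """Return (pages, price) of the cheapest single bundle covering the need."""
    candidates = [(price, pages) for pages, price in BUNDLES if pages >= pages_needed]
    if not candidates:
        raise ValueError("no single bundle covers that many pages")
    price, pages = min(candidates)
    return pages, price
```

For 2,500 pages, for instance, the 3,000-page bundle at $80 is the smallest one that fits.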
Support
Email support is available. If you want a custom plan or have any questions, please drop me a line at znxhw#hotmail.com (replace # with @).