How to get the binary data of each page of a PDF file?
Hello everyone. I need to parse a PDF file page by page. How can I get the binary data of each individual page in PHP, the way file_get_contents() returns it for the whole file? I thought PDF Parser would help me, but I did not find a method that does this.
In essence, tasks of this kind boil down to the following:
1. Render the PDF pages as separate images (for example, using ImageMagick)
2. Run images through some kind of OCR (for example, Tesseract)
3. Parse the received data
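The first two steps can be sketched in PHP roughly as follows. This is only an illustration, not a tested solution: it assumes that ImageMagick (with Ghostscript for PDF input) and Tesseract are installed and available on PATH as `convert` and `tesseract`, that the code runs on a Unix-like system, and the function names and paths are made up for the example. On ImageMagick 7 the command may be `magick` instead of `convert`.

```php
<?php
// Sketch: render each PDF page to an image, then OCR each image.
// Assumptions: `convert` (ImageMagick + Ghostscript) and `tesseract`
// are on PATH; all function names and paths here are hypothetical.

/** Builds the command that renders pages to page-0.png, page-1.png, ... */
function renderCommand(string $pdf, string $outDir, int $dpi = 300): string
{
    return sprintf(
        'convert -density %d %s %s',
        $dpi,
        escapeshellarg($pdf),
        escapeshellarg($outDir . '/page-%d.png') // %d is expanded by ImageMagick
    );
}

/** Builds the command that OCRs one image and prints the text to stdout. */
function ocrCommand(string $image, string $lang = 'eng'): string
{
    return sprintf(
        'tesseract %s stdout -l %s',
        escapeshellarg($image),
        escapeshellarg($lang)
    );
}

/** Returns the recognized text of every page, keyed by page index. */
function ocrPdf(string $pdf, string $workDir): array
{
    exec(renderCommand($pdf, $workDir), $output, $status);
    if ($status !== 0) {
        throw new RuntimeException("Rendering failed with exit code $status");
    }

    $images = glob($workDir . '/page-*.png') ?: [];
    natsort($images); // natural order, so page-10 comes after page-9

    $texts = [];
    foreach (array_values($images) as $index => $image) {
        $texts[$index] = (string) shell_exec(ocrCommand($image));
    }
    return $texts;
}
```

A call such as `ocrPdf('/path/to/file.pdf', '/tmp/pages')` would then return an array of page texts, which is the input for step 3. The working directory has to exist and be writable beforehand.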
Why do you need the binary data of each PDF page?