Is it possible to recognize the text in a PDF and embed it in that same PDF, and to do it for free?
I have a number of JPG files. The first step is to batch-convert them to PDF, which I don't expect to be a problem.
After that, I want to recognize the text in the PDFs and embed it in the files, with as little effort as possible.
The files will then be uploaded to the free edition of the LogicalDoc electronic archive, which parses text documents and can search them but, unfortunately, cannot recognize text in images.
Why does the text need to be processed and recognized in the PDF rather than earlier, in the JPEG?
Tesseract is a free, open-source set of text-recognition utilities. The image is usually preprocessed first, with filters or other logic, so that Tesseract can recognize it (for example, if the image is a photo of a paper document rather than a scan, uneven lighting and geometric distortion have to be removed).
P.S. "Without much effort" will not work.
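For what it's worth, here is a minimal sketch of how the Tesseract route might look for a folder of JPGs. It assumes the `tesseract` binary is installed and on the PATH, and that the images are already clean enough to recognize (no preprocessing is done here). The helper names `build_ocr_command` and `ocr_folder` and the `eng` language code are assumptions made for illustration, not anything from the question. With the `pdf` output config, Tesseract writes a PDF containing the page image plus a text layer, so the JPG-to-PDF conversion and the recognition can happen in one step.

```python
import subprocess
from pathlib import Path


def build_ocr_command(image: Path, out_dir: Path, lang: str = "eng") -> list[str]:
    """Build the tesseract call that turns one image into a searchable PDF."""
    # Tesseract takes an output *base* name and appends ".pdf" itself.
    out_base = out_dir / image.stem
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_folder(src: Path, dst: Path, lang: str = "eng") -> list[Path]:
    """Run tesseract on every .jpg/.jpeg in src; return the expected PDF paths."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for image in sorted(src.iterdir()):
        if image.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        # check=True raises CalledProcessError if tesseract fails on a file.
        subprocess.run(build_ocr_command(image, dst, lang), check=True)
        written.append(dst / f"{image.stem}.pdf")
    return written
```

Calling `ocr_folder(Path("scans"), Path("searchable"))` (both folder names are hypothetical) would leave one PDF per image in the second folder, ready to upload to the archive. Recognition quality still depends on the source images, so photos of documents will likely need the preprocessing described above first.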