Text recognition
KvanTTT, 2012-01-22 17:05:31

Determining positions and sizes of text blocks with Tesseract in console mode

For text recognition I use Tesseract in console mode.
The input arguments are:
imagename - the path to the image;
outputbase - the base name of the output file with the recognized text.

You can also choose the page segmentation mode with -psm pagesegmode.
The pagesegmode values are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
-l lang and/or -psm pagesegmode must occur before any configfile.
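
The argument order described above can be sketched as a small helper that assembles the command line. This is only an illustration, not part of Tesseract: the function name build_tesseract_args, the file names, and the config name "myconfig" are hypothetical, and the sketch assumes the 3.x-style -psm flag quoted in this question.

```python
def build_tesseract_args(imagename, outputbase, lang=None, psm=None, configfiles=()):
    """Return argv in the order: tesseract imagename outputbase [-l lang] [-psm N] [configfile...]."""
    args = ["tesseract", imagename, outputbase]
    if lang is not None:
        args += ["-l", lang]
    if psm is not None:
        # The mode list in the question runs from 0 to 10.
        if not 0 <= psm <= 10:
            raise ValueError("pagesegmode must be between 0 and 10")
        args += ["-psm", str(psm)]
    # Config files always go last, after -l and -psm.
    args += list(configfiles)
    return args


if __name__ == "__main__":
    # "page.png", "result" and "myconfig" are placeholder names.
    args = build_tesseract_args("page.png", "result", lang="eng", psm=6,
                                configfiles=["myconfig"])
    print(" ".join(args))  # tesseract page.png result -l eng -psm 6 myconfig
```

The resulting list could be handed to subprocess.run(args, check=True) on a machine where the tesseract executable is on the PATH.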

But I still could not find out whether it is possible to determine the exact positions and sizes of the blocks of text and images. If it is possible, how do I do it?
Do these settings have to be set in the configfile?

P.S. I am writing my program in C# in Visual Studio, and it uses Tesseract.

1 answer(s)
KvanTTT, 2012-02-01

I will answer my own question: Tesseract 3.0 has an "hocr" option, which returns not just the recognized text but a page in HTML format containing the recognized words and their coordinates.
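
In Tesseract 3.x, hocr is normally passed as the configfile argument, for example: tesseract imagename outputbase hocr. Reading positions and sizes out of the resulting page could then look roughly like the sketch below. It relies on the hOCR convention that each element's title attribute carries "bbox x0 y0 x1 y1"; the sample markup is hand-written for illustration rather than real Tesseract output, and the names HocrBoxParser and extract_boxes are hypothetical.

```python
import re
from html.parser import HTMLParser

# hOCR stores geometry in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrBoxParser(HTMLParser):
    """Collects (class, left, top, width, height) for every element with a bbox."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        match = BBOX_RE.search(attr.get("title") or "")
        if match:
            x0, y0, x1, y1 = map(int, match.groups())
            # Convert corner coordinates into position plus size.
            self.boxes.append((attr.get("class"), x0, y0, x1 - x0, y1 - y0))


def extract_boxes(hocr_text):
    parser = HocrBoxParser()
    parser.feed(hocr_text)
    return parser.boxes


# Hand-written sample in hOCR style (an assumption, not real Tesseract output).
SAMPLE = """
<div class='ocr_page' title='image "page.png"; bbox 0 0 800 600'>
 <div class='ocr_carea' title="bbox 30 80 400 200">
  <span class='ocr_line' title="bbox 36 92 300 116">
   <span class='ocr_word' title="bbox 36 92 96 116">Hello</span>
  </span>
 </div>
</div>
"""

if __name__ == "__main__":
    for cls, left, top, width, height in extract_boxes(SAMPLE):
        print(cls, left, top, width, height)
```

The class names produced by a particular Tesseract version may differ from the sample, so filtering on them should be checked against real output first.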
