How to use neural networks for digit recognition?


Image processing

Dima Petruk, 2015-11-07 21:11:53

Hello,
I want to write a Python script that recognizes digits in streaming video.
For now I am practicing on still images. First I convert the image to grayscale, then binarize it with the Otsu method, and then, using the findContours and boundingRect methods in OpenCV, I get an array of bounding rectangles, one per contour found.
I noticed that these detected objects include a lot of noise (rectangles only a few pixels in size), so I simply filter out rectangles that are too small on one side or disproportionate (one side 2+ times longer than the other). Perhaps you can suggest a better way here, since a digit "1" written as a plain vertical/horizontal stroke gets cut off immediately.
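One possible variation on this filter, sketched under assumptions: instead of rejecting a box because its short side is small, require the long side to be big enough and allow a looser aspect ratio, so that a thin "1" survives while specks and long line-like artifacts are dropped. The function name and all thresholds (`min_long_side`, `min_short_side`, `max_ratio`) are hypothetical and would need tuning on real frames.

```python
def filter_boxes(boxes, min_long_side=12, min_short_side=2, max_ratio=8.0):
    """Keep (x, y, w, h) boxes that could plausibly contain a digit.

    Hypothetical thresholds; the size test is applied to the LONG side so
    that a narrow stroke such as "1" is not rejected for being thin.
    """
    kept = []
    for (x, y, w, h) in boxes:
        long_side, short_side = max(w, h), min(w, h)
        if long_side < min_long_side or short_side < min_short_side:
            continue  # a speck of noise
        if long_side / short_side > max_ratio:
            continue  # far more elongated than a single stroke digit
        kept.append((x, y, w, h))
    return kept


candidates = [
    (0, 0, 3, 3),       # tiny speck
    (10, 10, 20, 24),   # roughly square, digit-like
    (50, 10, 4, 24),    # thin vertical stroke, e.g. "1"
    (0, 90, 200, 2),    # long horizontal line
]
kept = filter_boxes(candidates)
print(kept)
```

This does not remove every non-digit region; it only narrows down what is passed on to the classifier.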
Then I resize all of these regions of interest (ROIs) to a common size (28x28).
Before that, I trained a classifier (I used a ready-made one for MNIST, an SVM), and I pass each ROI to its input. But since the ROI search finds ALL sufficiently large contours in the image, the question arises: what should I do when an ROI does not contain a digit? How should the network be designed, or can this be handled in preprocessing?


1 answer(s)


_ _, 2015-11-07
@AMar4enko

What do you get out of the network?
There can be two options:
1. There is one neuron in the output layer that gives a value from 0 to 1, and you denormalize the output back to the original range, e.g. 0="0" ... 1="9".
In this case it will be hard to distinguish the signal for a character that is not a digit.
2. The output layer has N neurons, each corresponding to one digit. In this case, correct recognition gives a strong signal at the output of one neuron. That is, if the signal on neuron "4" is 0.8 and on all the others it is 0.08, then the input is very likely a 4.
And if two or three outputs have similar values, then the input is very likely something other than a digit.
Of course, with this approach the network also has to be trained on such non-digit cases.
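The second option can be turned into a simple rejection rule, sketched below under assumptions: `scores` is a list of ten output values indexed by digit, and the thresholds `min_top` and `min_margin` are made-up numbers that would have to be tuned on validation data. The same idea should apply to per-class scores from an SVM, though their scale differs from neuron outputs.

```python
def classify_or_reject(scores, min_top=0.5, min_margin=0.3):
    """Return the recognized digit, or None if the outputs look ambiguous.

    scores: one value per digit (index = digit).
    Hypothetical rule: accept only when the strongest output is high enough
    AND clearly ahead of the second strongest.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if scores[best] < min_top:
        return None  # no output is confident at all
    if scores[best] - scores[runner_up] < min_margin:
        return None  # two outputs compete, probably not a digit
    return best


# The example from the answer: 0.8 on neuron "4", 0.08 on all the others.
clear = [0.08] * 10
clear[4] = 0.8
print(classify_or_reject(clear))      # 4

# Two outputs with similar values: treated as "not a digit".
ambiguous = [0.05] * 10
ambiguous[3] = 0.6
ambiguous[8] = 0.55
print(classify_or_reject(ambiguous))  # None
```

A threshold alone is not a guarantee: a network trained only on digits can still produce one confident output for a non-digit shape, which is why training on non-digit examples matters as well.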