CCTV
Pavel, 2011-09-27 18:48:43

Camera specifications for recognizing text in video

I need advice from someone experienced with video equipment: what minimum resolution (and number of TV lines) must a camera have so that text in a 12 pt font, filmed from a distance of about 3 meters, can be recognized in the recorded video?

We are talking about stationary or portable analog video cameras (not hand-held HD).

3 answer(s)
rPman, 2011-09-27
@Pavluha

A simple case first: a single frame (a photo).
A 12 pt character is about 4.2 mm tall. For the characters to be recognizable, even by eye, each one needs 5-8 pixels of height (feel free to multiply by 2 to allow for noise and distortion), i.e. about 0.42 mm per pixel.
Next, either choose a camera whose focal length makes the whole object fill the frame at a distance of 3 meters (almost certainly unrealistic in your case, although it is only a matter of optics), or work out how large a character will appear for a given camera (they usually do not differ much). The first HD camera Google returned for the query 'HD video camera viewing angles' was the Microsoft LifeCam HD-5000, with a viewing angle of 66° (manufacturers do not quote the vertical viewing angle, so for simplicity I take the frame height as width / 1.33).
So at 3 meters the frame covers an area 4.4 m wide and 3.3 m high. Each pixel must cover no more than 0.42 mm, so we need 3.3 * 1000 / 0.42 = 7857 vertical pixels and, by analogy with the HD aspect ratio (* 1.77777...), 13967 horizontal pixels.
If you devise an ingenious algorithm that identifies a character from a dot matrix only 3-4 pixels high (theoretically possible, if brightness information is used as well), the requirements drop by a factor of 2-3, i.e. to 4655x2619, which is still no ordinary camera.
Bottom line: look for a camera with a very narrow viewing angle (about 10° for FullHD), or reduce the distance, or ...
P.S. As far as I know, cameras perform interpolation, partly based on the principles described below, so the requirements may not be quite so severe, but it is better to experiment.
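
The arithmetic above can be checked with a short Python sketch. It assumes a simple pinhole model, in which the scene width at distance d for a viewing angle θ is 2·d·tan(θ/2), and it treats the quoted 66° as the horizontal angle (an assumption; the post does not say which angle it is). The function names are made up for illustration. Under these assumptions the frame at 3 m comes out about 3.9 m wide rather than 4.4 m, so the figures in the answer err on the conservative side.

```python
import math

def frame_size_m(distance_m, hfov_deg, aspect=4 / 3):
    """Width and height (meters) of the scene covered at distance_m.

    Assumes a pinhole camera and that hfov_deg is the horizontal angle.
    """
    width = 2 * distance_m * math.tan(math.radians(hfov_deg) / 2)
    return width, width / aspect

def required_pixels(extent_m, mm_per_pixel):
    """Pixels needed along one axis so each pixel covers at most mm_per_pixel."""
    return math.ceil(extent_m * 1000 / mm_per_pixel)  # 1000 converts m to mm

def max_view_angle_deg(pixels, mm_per_pixel, distance_m):
    """Widest viewing angle at which `pixels` still give mm_per_pixel at distance_m."""
    extent_m = pixels * mm_per_pixel / 1000
    return math.degrees(2 * math.atan(extent_m / (2 * distance_m)))

mm_per_pixel = 4.2 / 10  # 4.2 mm character height spread over about 10 pixels

width_m, height_m = frame_size_m(3.0, 66.0)
print(f"frame at 3 m: {width_m:.2f} m x {height_m:.2f} m")
print("vertical pixels for a 3.3 m frame:", required_pixels(3.3, mm_per_pixel))
print(f"vertical angle for 1080 rows: {max_view_angle_deg(1080, mm_per_pixel, 3.0):.1f} deg")
```

The factor 1000 only converts the frame height from meters to millimeters so that it can be divided by the per-pixel size in millimeters. The last line works the calculation backwards: with 1080 rows at 0.42 mm per pixel, the vertical viewing angle has to be under 9°, which agrees with the "about 10°" estimate for FullHD.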



A video contains much more useful information than a single frame, because it provides several different images of the same object (the camera or the object moves, hands shake, the lighting changes...). A person, for example, can read text in a video at a much smaller pixel size than described above, partly by applying intelligence.
During processing, several adjacent frames can be combined by determining the offsets between them (the object can be made to move, or the camera can be forced to move, for example by swinging it, or a mirror or prism, on a pendulum). Any algorithm used for shake compensation in video processors or advanced video cameras will do (they are not that complex; I believe there were reviews on Habr).
With more frames, the image resolution can be increased (in theory it is limited only by physical limits, namely the wavelength of light, but in practice it would be rather inconvenient to film the same object for several years to obtain an image accurate to a micron).
P.S. I have not gotten around to this task myself, since I cannot find ready-made implementation examples.
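
The multi-frame idea can be illustrated with a toy one-dimensional sketch in Python. It is not a ready-made implementation: it assumes ideal point sampling, no noise, and offsets that are known exactly (one fine-grid step per frame), whereas a real camera averages light over each pixel and the offsets have to be estimated from the footage. The names `capture` and `merge` are hypothetical.

```python
def capture(scene, factor, offset):
    """Simulate one low-resolution frame: every factor-th sample, starting at offset."""
    return scene[offset::factor]

def merge(frames, factor):
    """Interleave frames taken at offsets 0..factor-1 back onto the fine grid."""
    fine = [None] * sum(len(frame) for frame in frames)
    for offset, frame in enumerate(frames):
        fine[offset::factor] = frame  # each frame fills its own sub-grid
    return fine

# A fine-grained "scene" (one row of brightness values).
scene = [0, 0, 9, 9, 0, 9, 0, 0, 9, 0, 9, 9]

# Three frames, each with a third of the resolution, shifted by one step each.
frames = [capture(scene, 3, k) for k in range(3)]
restored = merge(frames, 3)

print(frames)
print(restored == scene)
```

No single frame here has enough samples to show the fine pattern, but the three shifted frames together do. With real footage, the same principle needs frame registration to find the sub-pixel offsets and some form of deblurring on top.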

Pavel, 2011-09-30
@Pavluha

rPman, thank you very much for the detailed answer!
So the question about analog cameras and TV lines is settled at once: analog cameras are not suitable for such detailed shooting.
Regarding the calculations: could you please write out the formula you used to compute the frame width at a distance of 3 meters from the viewing angle? (I can't work it out, damn it!)
Question 2: in the expression "3.3*1000/0.42 = 7857", why the factor of 1000?

SpCreator, 2014-09-17
@SpCreator

I don't know about others, but I installed a Zenith G7W IP camera at work (yourcamera.ru/ip-kamery/zenith-g7-detail) specifically for internal security, to see which documents my subordinates carry around and read. Once I read a magazine lying on an employee's desk through the camera, and everything was perfectly visible, so things like badges with first and last names can be read without any trouble.
