Image processing
ivandzemianchyk, 2014-03-04 11:38:15

How does a Histogram of Oriented Gradients (HOG) work?

What does the object detection process look like with a sliding window, and what metric is used to decide object / not object?
Please share a method or a link.

3 answer(s)
niosus, 2014-03-04
@ivandzemianchyk

Actually, the article @Fesor cited is THE HOG article. So I second that: everything you need is there.
To make it easier to read, I will try to retell in simple language what happens there (the article is still required reading).
So.
To begin with, you need a large amount of training data. Let's look at a specific example: detecting the fronts of cars. You need about 1000 pictures, for example 128x128 pixels, each containing only a car. You also need several times more pictures of the same size (I recently used 8000) that definitely contain no cars.
All these pictures are run through HOG, which in essence computes the gradient at each pixel and, depending on the gradient's direction, adds a value to the corresponding bin of a histogram (this is shown well even on Wikipedia). The resulting HOG descriptor is a very long vector of values, each of which is the value of one histogram bin.
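To make the "gradient at each point goes into a histogram" step concrete, here is a minimal sketch in plain Python. It is not the code from the article or from my repository: the function name `hog_descriptor`, the 8x8 cell size and the 9 unsigned orientation bins are assumptions chosen for illustration, and the block normalization step described in the article is left out.

```python
import math

def hog_descriptor(image, cell=8, bins=9):
    """Simplified HOG: one orientation histogram per cell, concatenated.

    `image` is a list of rows of grayscale values. Block normalization
    is deliberately omitted to keep the sketch short.
    """
    h, w = len(image), len(image[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    bin_width = 180.0 / bins

    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation: opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(angle // bin_width) % bins
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[y // cell][x // cell][b] += magnitude

    # Flatten cell histograms into one long feature vector.
    return [v for row in hist for c in row for v in c]
```

For a 128x128 picture with these assumed settings, the vector has 16 * 16 * 9 = 2304 values. That is the "point in n-dimensional space" that goes to the classifier.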
After processing all your training data, you will have a whole bunch of these vectors. Essentially, each vector is a point in n-dimensional space, and you need to split these points into 2 classes (object / not object). Most often this is done with an SVM (Support Vector Machine), which looks for the boundary that separates the two classes with the largest margin.
The output of the SVM is a boundary (most often a linear one) that passes between your points, and this boundary is the decision (detection) criterion.
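As a rough illustration of what "a linear boundary as the decision criterion" means, the sketch below fits a weight vector `w` and offset `b` by plain subgradient descent on the hinge loss. This is a toy stand-in for a real SVM solver, not what the article or any particular library does internally. The names `train_linear_svm` and `decision_score` and the learning-rate and regularization values are assumptions.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss.

    `labels` must be +1 (object) or -1 (not object).
    """
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization shrinks w a little on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin: push the boundary.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def decision_score(w, b, x):
    """Signed score: positive means the 'object' side of the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b
```

The "metric" from the question is then just the sign of `decision_score`: positive means object, negative means not object. In practice you would feed in HOG vectors instead of toy points and probably use a ready-made SVM implementation.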
Then, in short and in plain terms: for each sliding window, the HOG is computed and checked for which side of the boundary it lies on. If it is on the "cars" side, the program shouts that it has detected something; if it is on the other side, it silently moves on to the next window.
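The window loop itself can be sketched roughly like this. The function name `sliding_window_detect`, its parameters, and the toy `mean_intensity` descriptor (which stands in for a real HOG function so the example stays self-contained) are all assumptions for illustration.

```python
def sliding_window_detect(image, win, stride, describe, w, b):
    """Return (x, y, score) for every window on the 'object' side of the boundary."""
    h, width = len(image), len(image[0])
    hits = []
    for y in range(0, h - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            # Cut out the current window.
            patch = [row[x:x + win] for row in image[y:y + win]]
            features = describe(patch)
            # Linear decision: which side of the boundary is this window on?
            score = sum(wi * fi for wi, fi in zip(w, features)) + b
            if score > 0:
                hits.append((x, y, score))
    return hits

def mean_intensity(patch):
    # Toy stand-in for a HOG descriptor: one feature, the mean pixel value.
    total = sum(sum(row) for row in patch)
    return [total / (len(patch) * len(patch[0]))]

# Usage: an 8x8 image with a bright 4x4 square in the bottom-right corner.
image = [[1 if (x >= 4 and y >= 4) else 0 for x in range(8)] for y in range(8)]
hits = sliding_window_detect(image, win=4, stride=4,
                             describe=mean_intensity, w=[1.0], b=-0.5)
```

Objects larger than the window are typically handled by repeating the same scan on downscaled copies of the image; that step is left out here.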
I hope it didn't come out too messy.
If you need a code reference, you can take a look at my github: https://github.com/niosus/HOGclassifier
There is still some extra stuff in there, but I think you can catch the main idea. In about a month the project will be cut down to pure detection and everything unnecessary will be removed from it.

Sergey, 2014-03-04
Protko @Fesor

lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf
describes all the stages of searching for objects using HOG descriptors. What else is actually needed?

ZaStalin, 2014-08-28
@ZaStalin

What if the objects are of the same type but look quite different, for example if you need to find a helicopter? I also realized that I can't simply download reference photos from the Internet, since they contain other objects besides the reference one and also vary greatly in size. But crops of just the reference object won't all come out the same size either; is it a problem if there is a white background at the sides? And it's not clear how exactly win size should be specified: what if my object, at a larger scale, doesn't fill the window but sits somewhere in the center?
