Do I understand correctly one of the advantages of a convolutional neural network over a regular one for image classification?

A

aIan312022-03-05 17:49:49

Neural networks

aIan31, 2022-03-05 17:49:49

I'm new to deep learning, don't judge too harshly. So let's say I want to use a regular neural network to classify images, whether it's a person or not.

The whole picture has 680 pixels, let's say.

The neurons of the input layer receive the value of the pixels, the neurons of the hidden layer, on the other hand, reveal some facial features, and the neuron marked in green is, for example, responsible for finding the nose. -75 (let's say), this means that the weights between neurons 50-75 of the input layer and the neuron of the hidden layer responsible for the nose (green) are much greater than between the other neurons of the input layer and this neuron of the hidden layer.

In this case, if we want to classify a picture that the neural network has not seen before, in which the nose is on other pixels, let's say 220-245 , in this case, the weights between neurons 220-245 and the neuron of the hidden layer responsible for finding the nose will not be suitable .

And do I understand correctly that this is just the big advantage of the convolutional neural network and the use of filters in it over the usual one for image classification?

Reply

Answer the question

In order to leave comments, you need to log in

2 answer(s)

R

rPman, 2022-03-05
@rPman

the neural network on images that it has not seen will be in an indeterminate state, i.e. nothing can be guaranteed
, but the main feature of neural networks (and multidimensional optimization algorithms in general comes from mathematics itself) is that it loves to find patterns, unusual, strange, even where they do not exist , and draw a conclusion based on them,
i.e. your green nose neurons may not turn out to be a green nose marker, but 'successfully' matched with a completely different criterion on the training sample, a protruding lip, for example, in all green-nosed pictures, and when the network sees something unknown, it can find a protruding lip, for example, in the bend of a squid tentacle and react.

F

freeExec, 2022-03-05
@freeExec

Yes, in general you are right. With the classical approach and unfolding the image into one row of pixels, connections with neighboring pixels of the original image are lost.