What is the principle of selecting weight coefficients when training an artificial neural network?


Neural networks

Vokanim13, 2017-10-14 22:31:08

How are the network's weights selected? Are they adjusted across the different data samples so that they settle on some average value? Or are they computed for each sample separately and then averaged?


2 answer(s)

⚡ Kotobotov ⚡, 2017-10-15
@angrySCV

The classical approach takes the neural network as given: the architecture (the number of layers, the decision function, the connections between layers) is set by its creator, and during training only the coefficients are "adjusted" to fit the chosen architecture.
The weights are NOT reduced to average values. You can of course average them, but it is pointless. What would you average over? With photos, for example, over the number of pixels (incoming signals)? The photos are normalized to a standard where they all have the same resolution (the same number of pixels), so that is just an extra division by a constant, which can be dropped. And if you average over the number of examples, then as the sample grows any result will simply tend to zero.
Moreover, there are NO coefficients in the training data. The samples contain signals (for example, whether a pixel is on or off: one or zero), and we cannot influence these signals in any way. All we can do is take some kind of "decision function" (for example, the sum of all incoming signals multiplied by unknown coefficients) and decide at what value that function answers "yes" or "no" to a question such as "is there a CAT in the photo?". For example, we sum the coefficients multiplied by the incoming signals (ones or zeros) and compare the result with 10: more than 10 means CAT, less than 10 means NOT CAT.
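The decision function described above can be sketched in a few lines of Python. This is only an illustration: the name `decide`, the four-pixel "photo" and the coefficient values are all made up, and treating a sum of exactly 10 as NOT CAT is an assumption, since the answer does not say what happens at the boundary.

```python
def decide(signals, weights, threshold=10.0):
    """Return True ("CAT") when the weighted sum of the signals exceeds the threshold."""
    total = sum(w * s for w, s in zip(weights, signals))
    # Assumption: a sum exactly equal to the threshold counts as NOT CAT.
    return total > threshold


# Hypothetical 4-pixel "photos": each signal is 1 (pixel on) or 0 (pixel off).
weights = [6.0, 2.0, 3.0, 4.0]  # made-up coefficients, not learned from anything

print(decide([1, 0, 1, 1], weights))  # weighted sum is 6 + 3 + 4 = 13
print(decide([0, 1, 0, 0], weights))  # weighted sum is 2
```

The signals come straight from the sample and are never modified; only `weights` (and the choice of threshold) determine the answer.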
Then we can select the weights so that, with the chosen "decision function", we get the correct answer as often as possible. It does not matter which "cut-off value" we pick for separating YES from NO, 10 or 100000: we adjust the coefficients so that it works either way, and in the latter case the coefficients will simply be 10,000 times larger.
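As a rough sketch of what "selecting the weights" might look like, the snippet below uses a perceptron-style update rule. The answer does not name a training method, so the rule itself, the helper names `decide` and `train`, the learning rate and the tiny two-pixel dataset are all assumptions made for illustration. The weights are nudged one sample at a time, whenever the answer is wrong, rather than being computed per sample and averaged.

```python
def decide(signals, weights, threshold):
    """Return True ("YES") when the weighted sum of the signals exceeds the threshold."""
    return sum(w * s for w, s in zip(weights, signals)) > threshold


def train(samples, threshold, lr=1.0, max_epochs=100):
    """Adjust the weights sample by sample until every sample is answered correctly."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for signals, target in samples:
            # error is +1 (answered NO, should be YES), -1 (the reverse) or 0.
            error = int(target) - int(decide(signals, weights, threshold))
            if error != 0:
                mistakes += 1
                # Only the coefficients change; the signals stay as they are.
                weights = [w + lr * error * s for w, s in zip(weights, signals)]
        if mistakes == 0:
            break
    return weights


# Hypothetical dataset: the answer is YES only when both pixels are on.
samples = [([0, 0], False), ([0, 1], False), ([1, 0], False), ([1, 1], True)]

weights = train(samples, threshold=10.0)
print(weights)  # [6.0, 6.0]

# Raising the cut-off from 10 to 100000 only rescales the coefficients.
scaled = [w * 10_000 for w in weights]
print(all(decide(s, scaled, 100_000.0) == t for s, t in samples))  # True
```

The final check illustrates the point about the cut-off value: multiplying both the threshold and the coefficients by 10,000 leaves every answer unchanged.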

Griboks, 2017-10-14
@Griboks

Isn't that the same thing, though? With a large number of samples, each weight does tend toward its normal value.