Do I understand regularization correctly?
Let's say there is a network with 4 input neurons and 1 output. The weights are w1=0.2, w2=0.5, w3=0.1, w4=0.9.
The loss function is root-mean-square error; let's assume it equals 0.5. Then we add the penalty: the manually specified parameter a=0.1 is multiplied by the sum of the squares of all the weights, i.e. L2 = (0.2^2 + 0.5^2 + 0.1^2 + 0.9^2) * a.
That gives a penalty of 0.111, so the total loss = 0.111 + 0.5 = 0.611.
Then I update the weights: new_w1 = w1 * (L2 * w1 + gradient) / batch_size, and so on for the rest.
Have I understood the formula correctly? Am I updating the weights correctly?
After several epochs all the weights become zero; please tell me what I'm doing wrong.
The loss will look like this:
Loss = value of the RMS error function + L1 * a + L2 * b, where a and b are hyperparameters that control how strongly we penalize the model for large weights.
Then we simply take the gradient of that whole expression and use it to update the weights.
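A minimal NumPy sketch of one gradient step with an L2 penalty, under the numbers from the question; the input `x`, target `y`, and learning rate `lr` are illustrative assumptions, not values from the original post:

```python
import numpy as np

# Toy setup: 4 inputs, 1 linear output neuron (assumed for illustration)
w = np.array([0.2, 0.5, 0.1, 0.9])      # weights from the question
x = np.array([[1.0, 2.0, 0.5, 1.5]])    # one assumed input sample
y = np.array([1.0])                      # assumed target
a = 0.1                                  # L2 strength (hyperparameter)
lr = 0.01                                # assumed learning rate

y_pred = x @ w                           # forward pass
mse = np.mean((y_pred - y) ** 2)         # data term
loss = mse + a * np.sum(w ** 2)          # total loss with L2 penalty

# Gradient of the total loss w.r.t. the weights:
# d(mse)/dw + d(a * sum(w^2))/dw = d(mse)/dw + 2 * a * w
grad_mse = 2 * x.T @ (y_pred - y) / len(y)
grad = grad_mse.ravel() + 2 * a * w

# Standard gradient-descent update: subtract lr * gradient.
# The penalty only adds 2*a*w to the gradient; the weight is never
# multiplied by the loss value itself.
w = w - lr * grad
```

Note that in the update from the question the weight is multiplied by a term containing the loss, whereas in the usual scheme the weight is only nudged by lr * (gradient + 2 * a * w); if a or the learning rate is too large, that shrinkage term alone can drive the weights toward zero.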