Python
Anton Savchenko, 2017-05-30 09:24:40

Why does the error rate keep jumping while training the network?

Why does the training error rate behave like this? [attached screenshot: de816a1450504f62b313bf83b88baab3.PNG, a plot of the training error rate]
There is a three-layer neural network that takes images from the MNIST database as input. Accordingly, the input layer has 784 neurons, one per pixel; the hidden layer has 30 neurons; and the output layer has 10, one per class. The hidden layer uses tanh as its activation function, and the output layer uses softmax. Cross-entropy is the loss function. The training set contains 60,000 images. I train the network with stochastic gradient descent on mini-batches of 5, 10, or 100 random elements of the training set (the batch size doesn't matter; the overall result does not change).
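For reference, the architecture described above (784-30-10, tanh hidden layer, softmax output, cross-entropy, mini-batch SGD) can be sketched in NumPy roughly as follows. The weight initialization, learning rate, and the synthetic stand-in data are my assumptions, not taken from the question; real MNIST images would be loaded separately.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from the question: 784 inputs, 30 hidden neurons, 10 classes.
n_in, n_hidden, n_out = 784, 30, 10

# Small random weights (hypothetical initialization; the question gives none).
W1 = rng.normal(0, 0.1, (n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out))
b2 = np.zeros(n_out)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X):
    h = np.tanh(X @ W1 + b1)   # hidden layer: tanh activation
    p = softmax(h @ W2 + b2)   # output layer: softmax over 10 classes
    return h, p

def sgd_step(X, y, lr):
    """One mini-batch SGD step; y is a vector of integer class labels."""
    global W1, b1, W2, b2
    m = X.shape[0]
    h, p = forward(X)
    # Softmax + cross-entropy gives the gradient p - onehot(y) at the logits.
    dz2 = p.copy()
    dz2[np.arange(m), y] -= 1.0
    dz2 /= m
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1.0 - h ** 2)  # tanh'(x) = 1 - tanh(x)^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return -np.log(p[np.arange(m), y] + 1e-12).mean()  # cross-entropy loss

# Synthetic stand-in for MNIST mini-batches of size 10.
X = rng.normal(0, 1, (100, n_in))
y = rng.integers(0, n_out, 100)
losses = [sgd_step(X[i:i + 10], y[i:i + 10], lr=0.1) for i in range(0, 100, 10)]
```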

1 answer
Sergey Sokolov, 2017-05-30
@sergiks

The gradient descent step (learning rate) is too big. Try halving it.
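A toy illustration of why an oversized step makes the error jump around instead of decreasing, using a 1-D quadratic rather than the questioner's network (the learning rates here are arbitrary example values):

```python
def gradient_descent(lr, steps=50):
    """Minimize f(w) = w^2 (gradient 2w) starting from w = 1.0."""
    w = 1.0
    history = []
    for _ in range(steps):
        w -= lr * 2 * w          # one gradient step
        history.append(abs(w))   # distance from the minimum at w = 0
    return history

big = gradient_descent(lr=1.1)    # step too large: |w| overshoots and grows
small = gradient_descent(lr=0.1)  # smaller step: |w| shrinks steadily
```

With `lr=1.1` each update multiplies `w` by `-1.2`, so the iterate overshoots the minimum and diverges; with `lr=0.1` the factor is `0.8` and it converges. The same mechanism makes a network's training error oscillate when the step is too large.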
