Algorithms
Filipp42, 2021-06-11 16:59:15

How does backpropagation work in a neural network with a threshold function?

I am writing a simple neural network. The output should use a threshold activation function. But here is the problem: backpropagation needs a derivative, and the derivative of the threshold function is not only zero almost everywhere, it is also undefined at the threshold itself. How can I train such a network?
About the network: two input neurons, two hidden layers of three neurons each, and one output neuron with a threshold function. The other neurons use leaky ReLU. The network solves the XOR (exclusive or) problem.
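For concreteness, here is a minimal sketch (Python/NumPy; the function name `step` is my own illustration) of the threshold activation in question:

```python
import numpy as np

def step(x):
    """Threshold (Heaviside step) activation: 1 where x > 0, else 0."""
    return (np.asarray(x) > 0).astype(float)

# Its derivative is 0 everywhere except at x = 0, where it is undefined,
# so any gradient backpropagated through this unit is 0 (or undefined).
```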


1 answer
berng, 2021-08-22

It won't work at all. For gradient descent, the activation function must be continuous, preferably monotonic, and ideally without zeros in its derivative. The derivative of the threshold function is zero almost everywhere, so the gradient of the loss function (which determines the weight updates, and which includes the derivative of the activation function as a factor) will almost always be zero. The weight updates will therefore be zero, and training will not happen.

You can try a genetic algorithm instead of gradient descent; that will work. Alternatively, if the threshold function is only at the output, replace it with a sigmoid with a very small temperature (so that the 0-to-1 transition is sharp). This will let the network train with gradient descent, albeit very slowly.
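A sketch of that last suggestion (the name `sigmoid_with_temperature` and the value T = 0.05 are my own illustrative choices, not anything fixed):

```python
import numpy as np

def sigmoid_with_temperature(x, T=0.05):
    """Sigmoid whose 0-to-1 transition sharpens as the temperature T
    shrinks; as T -> 0 it approaches the threshold function, but it
    stays differentiable, so gradients can still flow."""
    return 1.0 / (1.0 + np.exp(-x / T))

def sigmoid_with_temperature_grad(x, T=0.05):
    """Derivative w.r.t. x: nonzero everywhere, but it decays quickly
    away from x = 0 when T is small -- hence the slow training."""
    s = sigmoid_with_temperature(x, T)
    return s * (1.0 - s) / T
```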
Also, XOR can be solved optimally by three threshold neurons analytically, without any training: decompose XOR into basic logic gates (AND, OR, NOT) in normal form, then find neuron weights and thresholds that implement those gates.
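A minimal sketch of that analytic construction, using the decomposition XOR(a, b) = (a OR b) AND NOT (a AND b); the specific weights and biases below are one valid choice, not the only one:

```python
def step(x):
    """Threshold activation: 1 if x > 0, else 0."""
    return 1 if x > 0 else 0

def xor(a, b):
    """XOR from three threshold neurons: (a OR b) AND NOT (a AND b)."""
    h_or  = step(a + b - 0.5)        # OR: fires when at least one input is 1
    h_and = step(a + b - 1.5)        # AND: fires only when both inputs are 1
    return step(h_or - h_and - 0.5)  # output: h_or AND NOT h_and

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```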
