Neural networks
1Tima1, 2020-04-28 23:37:33

How can a beginner understand neural networks?

In general, I read this article: https://habr.com/en/post/312450/
It sparked some interest,
and I understand that deep knowledge requires higher mathematics, of course, but the interest keeps growing.
I'll keep it short.
1) A simple neural network uses the sigmoid function. As I understand it, it outputs values within (-1; 1).
Why these limits? If it were (0; 1), I would chalk it up to binary code, but there is that -1.
I thought about it for a long time: maybe it's for simplicity, or to save memory, or so that epochs run faster. I never arrived at the right idea. Help!
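To make the range question concrete, here is a minimal Python sketch (my own illustration, not from the article) comparing the classic sigmoid, whose range is (0, 1), with tanh, whose range is (-1, 1):

import math

def sigmoid(x):
    # Classic logistic sigmoid: output is always in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Hyperbolic tangent: output is always in (-1, 1)
    return math.tanh(x)

for x in (-5.0, -0.672, 0.0, 0.672, 5.0):
    print(f"x={x:+.3f}  sigmoid={sigmoid(x):.3f}  tanh={tanh(x):.3f}")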
2) Why the mean squared error?
Say the desired value is 1 and we got 0.36.
If you simply subtract, you get 0.64,
but with MSE it is 0.4096.
The answers differ, so I assume it is deeper than it seems.
Why the error is 0.4096 and not 0.64 is not obvious to me; for some reason they also decided to square it.
I'll leave this topic for you to explain (hopefully).
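A tiny sketch of the arithmetic in question (my own illustration): the squared error is just the plain difference squared, and squaring both keeps the error positive and penalizes one large error more heavily than several small ones of the same total:

target, output = 1.0, 0.36

plain_error = target - output            # 0.64
squared_error = (target - output) ** 2   # 0.64 ** 2 = 0.4096

print(plain_error, squared_error)

# One error of 0.64 costs 0.4096, while two errors of 0.32
# cost only 2 * 0.32 ** 2 = 0.2048 between them.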
3) A more organizational question. Is it true that only the weights w1, w2, ..., wn change after each epoch?
They each hold a value. So after each epoch, does each value change outright, or do the weights shift positions, say w1's value moving to w2 and w2's to w3? That second option just seems logical to me)).
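For what it's worth, in standard training each weight is updated in place by gradient descent; nothing migrates from w1 to w2. A minimal Python sketch under that assumption (the values, gradients, and learning rate here are invented for illustration):

# Each weight is nudged against its own gradient; positions never swap.
weights = [0.61, 0.69, -0.23]     # arbitrary example values
gradients = [0.05, -0.12, 0.02]   # pretend these came from backpropagation
learning_rate = 0.1

weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
print(weights)   # approximately [0.605, 0.702, -0.232]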
4) What does the neural network ultimately come down to?
There is an ordinary truth table for exclusive OR: there are 2 input variables and you need to find the third.
At the 1st input there is 1, at the 2nd input there is 0.
We got an answer of 0.33, an error of 45% (from the task in the linked article).
So what? 1) The answer could even be negative; it is neither 0 nor 1.
2) What are we striving for: to reduce the error percentage, or to bring the output number closer to the correct answer?
To get 1, the correct answer:
O1input = 0.61*1.5 + 0.69*(-2.3) = -0.672
O1output = sigmoid(-0.672) = 0.33
For the output to reach 1, x would have to tend to infinity,
and even then it will never actually be 1; is it more accurate to speak of the limit of the function?
For the input to be that large, the weights must also be large.
And how large can they get if they are often fed numbers less than -1?
And to get the correct answer, all 5 weights must be large. Except that doesn't hold up: if one weight is large enough to compensate for another, the other can be neglected. I just don't really understand the structure of the neural network, in what way it approaches the answer with each epoch. On top of that, we use a random function, which means no data is stored in memory and no neuron with the closest answer is selected. There is no selection in that example, so what is the neural network actually based on?
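To check the arithmetic above, here is a minimal Python sketch of that forward pass (the inputs 0.61, 0.69 and weights 1.5, -2.3 are taken from the calculation above; the variable names are my own):

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hidden-layer outputs and output-layer weights from the calculation above
h1, h2 = 0.61, 0.69
w_a, w_b = 1.5, -2.3

o1_input = h1 * w_a + h2 * w_b    # -0.672
o1_output = sigmoid(o1_input)     # ~0.338, i.e. 0.33 rounded
print(o1_input, o1_output)

# The sigmoid only approaches 1 as its input grows; it never reaches it
for x in (1.0, 5.0, 10.0):
    print(x, sigmoid(x))          # 0.731..., 0.993..., 0.99995...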
sorry)

1 answer
dmshar, 2020-04-29
@1Tima1

First of all, do you expect to get a mini-course on neural networks here?
Second, if you are too lazy to figure it out yourself (though how can there be laziness where there is interest? most likely it is not interest then, just curiosity, but oh well), break your question into separate sub-questions and ask them one at a time. Then you will have a free consultation instead of a mini-course.
Along the way, I will answer the first question:

1) A simple neural network uses the sigmoid function. As I understand it, it outputs values within (-1; 1).

Firstly, there is a great variety of activation functions: step, sigmoid, exponential, rectified linear (ReLU), arctangent, and so on.
The choice of a specific function is dictated by the specific task and the analyst's experience. The same goes for the limits of the sigmoid function, which can be either [-1, 1] or [0, 1] (by the way, generally speaking, it is the latter that is the classic). This has nothing to do with binary code: a function with the range [-1, 1] describes a dichotomy just as easily as one with [0, 1]. As for "clogging memory", that is also beside the point. The training speed really is affected, but not so much by the particular limits you mentioned as by whether the range is bounded (both of yours are) or unbounded. Incidentally, if the range is unbounded, training is generally more efficient and a lower learning rate is required; and if it is bounded, then gradient-based learning methods are more stable.
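To illustrate the variety of activation functions mentioned above, here is a minimal Python sketch (the ranges noted in the comments are standard; the code itself is my own illustration):

import math

# A few common activation functions and their output ranges
def step(x):    return 1.0 if x >= 0 else 0.0      # {0, 1}
def sigmoid(x): return 1.0 / (1.0 + math.exp(-x))  # (0, 1) -- the classic, bounded
def tanh(x):    return math.tanh(x)                # (-1, 1) -- bounded
def relu(x):    return max(0.0, x)                 # [0, +inf) -- unbounded above
def arctan(x):  return math.atan(x)                # (-pi/2, pi/2) -- bounded

for f in (step, sigmoid, tanh, relu, arctan):
    print(f.__name__, [round(f(x), 3) for x in (-2.0, 0.0, 2.0)])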
