How do I correctly represent the training data for a neural network?

Neural networks

ivodopyanov, 2015-07-12 15:31:04

Suppose I have the following data set for supervised learning, a classification problem:
The input is 20 discrete parameters, each taking a value from 1 to 200. Each value denotes a class. The class numbers are unordered labels: there is no greater-than/less-than relationship between them, so the numbers themselves carry no meaning. The parameters are also interchangeable: the first is no different from the second.
The output is one of the 20 input options, i.e. which one to choose.
An analogy: you hold several playing cards of different denominations and have to pick one of them.
Input:
1) Feed the 20 values in the range 1-200 directly.
2) Or count the occurrences of each class among the input variables, i.e. 200 parameters in the range 0-20. Most of them will be 0, some will be 1, and with a very small probability a few will be 2 or more. But then the network has many more connections.
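To make the two input options concrete, here is a minimal NumPy sketch. The helper names `as_raw_ids` and `as_class_counts` and the random sample are made up for illustration; only the sizes (20 parameters, 200 classes) come from the question.

```python
import numpy as np

NUM_CLASSES = 200  # class ids run from 1 to 200, as in the question
NUM_SLOTS = 20     # number of input parameters

def as_raw_ids(sample):
    """Option 1: pass the 20 class ids to the network as 20 plain numbers."""
    return np.asarray(sample, dtype=np.float32)

def as_class_counts(sample):
    """Option 2: 200 inputs, each counting how often that class occurs."""
    ids = np.asarray(sample, dtype=np.int64) - 1  # shift ids to 0-based
    return np.bincount(ids, minlength=NUM_CLASSES).astype(np.float32)

# Made-up sample: 20 random class ids in the range 1..200.
rng = np.random.default_rng(0)
sample = rng.integers(1, NUM_CLASSES + 1, size=NUM_SLOTS)

raw = as_raw_ids(sample)
counts = as_class_counts(sample)
print(raw.shape, counts.shape)  # (20,) (200,)

# The count vector does not depend on the order of the parameters.
print(np.array_equal(counts, as_class_counts(sample[::-1])))  # True
```

One property worth noting: the count vector is unchanged when the parameters are permuted, which fits the statement that the parameters are interchangeable, at the cost of ten times as many inputs.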
And for the output:
1) It can be 200 values, one per class, indicating the selected class. But this representation lets the network choose a class that is not present among the input values.
2) Or the index of the selected parameter (1 of 20). But then permuting the input values can easily lead to a different result, I guess.
What is the right way to represent such data?
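The difference between the two output options can be sketched as follows. The helpers `target_as_class` and `target_as_slot` and the sample values are hypothetical. The sketch only builds the training targets, it is not a network.

```python
import numpy as np

NUM_CLASSES = 200
NUM_SLOTS = 20

def target_as_class(sample, chosen_slot):
    """Output option 1: 200 outputs, with a 1 at the chosen class id."""
    target = np.zeros(NUM_CLASSES, dtype=np.float32)
    target[sample[chosen_slot] - 1] = 1.0  # class ids are 1-based
    return target

def target_as_slot(chosen_slot):
    """Output option 2: 20 outputs, with a 1 at the chosen input position."""
    target = np.zeros(NUM_SLOTS, dtype=np.float32)
    target[chosen_slot] = 1.0
    return target

# Made-up sample with 20 distinct class ids: 7, 14, ..., 140.
sample = np.arange(1, NUM_SLOTS + 1) * 7
chosen_slot = 3  # the correct choice is the class in position 3

# The same sample in reverse order; the chosen class moves to another position.
reversed_sample = sample[::-1]
reversed_slot = NUM_SLOTS - 1 - chosen_slot

same_class_target = np.array_equal(
    target_as_class(sample, chosen_slot),
    target_as_class(reversed_sample, reversed_slot),
)
same_slot_target = np.array_equal(
    target_as_slot(chosen_slot),
    target_as_slot(reversed_slot),
)
print(same_class_target, same_slot_target)  # True False
```

Under these assumptions the class-indexed target survives a permutation of the input, while the position-indexed target changes with it, which is the concern raised in option 2.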

3 answers

[email protected]><e, 2015-07-12
@ivodopyanov

Features must be represented correctly. The mathematics of neural networks is such that if you represent a feature as a number, observations with feature values 5 and 4 will be "closer" to each other than, say, 5 and 100. But if the feature is a group identifier (a so-called categorical feature), for example a user's group, numerical closeness means nothing.
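A small illustration of the closeness argument. The `one_hot` helper is a hypothetical name, and the indicator-vector encoding it builds is shown only for contrast with the plain-number representation.

```python
import numpy as np

def one_hot(class_id, num_classes=200):
    """Indicator vector with a single 1 at the (1-based) class id."""
    vec = np.zeros(num_classes)
    vec[class_id - 1] = 1.0
    return vec

# As plain numbers, class 5 looks much closer to class 4 than to class 100.
print(abs(5 - 4), abs(5 - 100))  # 1 95

# As indicator vectors, any two different classes are equally far apart.
d_near = np.linalg.norm(one_hot(5) - one_hot(4))
d_far = np.linalg.norm(one_hot(5) - one_hot(100))
print(np.isclose(d_near, d_far))  # True
```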
The same goes for the output. Predicting a single number amounts to assuming that predicting 21 instead of 20 is not as bad as predicting 1000. Again, this is not true for categorical features.
To sum up, if the input consists of 20 categorical features, each feature must be replaced with a set of new ones obtained by one-hot encoding (also known as dummy variables): for each possible value of the feature, a new indicator feature is created that equals 1 only if the feature takes that value in the given observation, and 0 otherwise.
The output should be encoded the same way.
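A minimal sketch of this encoding for the data in the question, assuming 20 features with 200 possible values each and an output expressed as one of the 200 class ids. The function names and the random sample are made up for illustration.

```python
import numpy as np

NUM_CLASSES = 200   # possible values of each categorical feature
NUM_FEATURES = 20   # number of categorical input features

def one_hot_features(sample):
    """Replace each of the 20 features with 200 indicator features."""
    ids = np.asarray(sample, dtype=np.int64) - 1  # shift ids to 0-based
    encoded = np.zeros((NUM_FEATURES, NUM_CLASSES), dtype=np.float32)
    encoded[np.arange(NUM_FEATURES), ids] = 1.0   # one 1 per feature row
    return encoded.reshape(-1)                    # 20 * 200 = 4000 inputs

def one_hot_label(class_id):
    """Encode the target class as a 200-element indicator vector."""
    label = np.zeros(NUM_CLASSES, dtype=np.float32)
    label[class_id - 1] = 1.0
    return label

# Made-up observation: 20 random class ids, with the first one as the answer.
rng = np.random.default_rng(0)
sample = rng.integers(1, NUM_CLASSES + 1, size=NUM_FEATURES)

x = one_hot_features(sample)
y = one_hot_label(sample[0])
print(x.shape, y.shape)  # (4000,) (200,)
```

Each feature contributes exactly one 1 to the input vector, so `x` has 20 nonzero entries out of 4000. In practice a library encoder could replace these helpers.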

Sergey, 2015-07-12
@begemot_sun

Training neural networks is an art. Different representations of both the input and the output can give better or worse results.
Experiment.

Roman Mirilaczvili, 2015-07-13
@2ord

The encoding of the values looks off. Read "Application of Neural Networks for Classification Problems"
and section 2.2, "Setting Problems with Categorical Features".
More generally, search for the topic "Information Coding Methods" and for material on quantitative versus qualitative features.