Neural networks
LionelCrowl, 2018-04-30 10:36:01

How can I increase the sample size for training a neural network?

Classification with a neural network.
There are 1700 objects: 130 belong to class A and 1570 to class B. Each object has 130 characteristics; after screening for multicollinearity (Kendall's tau above 0.7) and applying genetic algorithms for probabilistic networks (Statistica 6.1), 50 significant features were selected.

Next, in the same package, I want to train an MLP to classify these objects, but I can only feed it 260 objects (130 of each class), because otherwise the network will simply assign all objects to class B a priori. On the other hand, I have read that the number of parameters (weights?) in a neural network should be ten times smaller than the sample size. Obviously, if I follow this rule, only a couple of hidden-layer neurons are left, which in theory is not enough. So I need to somehow increase those 130 class-A objects. My thoughts go towards replicating them with random noise added to each characteristic, but I am not sure this is sound (a sketch of this idea is below).

I don't know any programming languages, so please suggest a software product with sample enlargement already implemented, or other ways to solve this problem, preferably also implemented in software :)
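Since the post mentions replication with random noise, here is a minimal Python sketch of that idea, purely as an illustration; the array names X and y, the 5% noise scale, and the balancing target are assumptions, not something from the original question:

```python
# Minimal sketch of the "replicate class A with random noise" idea.
# Assumptions: X is a (1700, 50) NumPy array of features, y is an array of "A"/"B" labels.
import numpy as np

def oversample_with_noise(X, y, minority_label="A", target_count=1570,
                          noise_scale=0.05, seed=0):
    """Duplicate minority-class rows with small Gaussian noise until the classes are balanced."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_extra = target_count - len(X_min)
    # Scale the noise per feature so characteristics with large ranges are not distorted.
    feature_std = X_min.std(axis=0)
    picked = rng.integers(0, len(X_min), size=n_extra)
    noisy = X_min[picked] + rng.normal(0.0, noise_scale, size=(n_extra, X.shape[1])) * feature_std
    X_aug = np.vstack([X, noisy])
    y_aug = np.concatenate([y, np.full(n_extra, minority_label)])
    return X_aug, y_aug
```

A more principled variant of the same idea is SMOTE, which interpolates between minority-class neighbours instead of adding noise; it is available in the imbalanced-learn Python package.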


3 answers
Vladimir Olohtonov, 2018-04-30
@sgjurano

Usually this is done differently: you create every feature you can think of, expand the dataset by whatever methods are available, and then sample batches from it so that the whole dataset is covered within one epoch, repeating this for 100 epochs or more while watching the loss curve on the validation set (a minimal batching sketch is below).
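A minimal sketch of that training loop in Python, assuming NumPy arrays X_train / y_train and an arbitrary batch size; the data here is a random placeholder and the model call is only indicated in a comment:

```python
# Shuffle once per epoch, cover the whole dataset in mini-batches, and repeat
# for many epochs while evaluating on a validation set after each pass.
import numpy as np

X_train = np.random.rand(1700, 50)             # placeholder for the 1700 x 50 feature matrix
y_train = np.random.randint(0, 2, size=1700)   # placeholder labels

def iterate_minibatches(X, y, batch_size, rng):
    """Yield mini-batches that together cover the entire dataset exactly once."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

rng = np.random.default_rng(42)
for epoch in range(100):                       # 100+ passes over the data
    for X_batch, y_batch in iterate_minibatches(X_train, y_train, 32, rng):
        pass                                   # a call like model.train_on_batch(X_batch, y_batch) goes here
    # after each epoch: compute the loss on a held-out validation set and plot the curve
```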

Arseny Kravchenko, 2018-05-07
@Arseny_Info

It is better not to use neural networks on a dataset of this size; augmentation will not help.

imageman, 2018-12-22
@imageman

Well, first, you can still try training on the entire array of available data (even though it is skewed towards B). If you want, you can simply duplicate the class-A objects.
Second, look at other classifiers, for example a decision tree (or a random forest); a minimal sketch follows below.
If the problem has already been solved, could you tell us how you solved it?
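A minimal random-forest sketch in Python using scikit-learn, offered only as an illustration of the suggestion above; the arrays X and y are placeholders for the real 1700 x 50 data, and the hyperparameters are arbitrary:

```python
# Random forest with class weighting, so the 130-vs-1570 skew is handled
# without duplicating rows; scored with balanced accuracy, which is more
# informative than plain accuracy on imbalanced classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X = np.random.rand(1700, 50)                  # placeholder features
y = np.array(["A"] * 130 + ["B"] * 1570)      # placeholder labels

clf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
print("balanced accuracy per fold:", scores)
```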
