How to even out the True/False percentage distribution across a dataset?

numpy

Anton Tarara, 2018-03-30 13:29:13

Hello colleagues, I have a question. I have a dataset whose features (columns) are binary, but the ratio of True to False within these columns is very uneven, for example 90% True and 10% False. How can I balance all the features of this dataset at once? By randomly adding new records? Studio ML has a tool for this called SMOTE, but it only works with one feature. Are there any mechanisms for this in pandas or numpy? Thanks.

2 answers

longclaps, 2018-03-30
@longclaps

Mixing fake data into real data in the hope of getting something meaningful is a crazy idea.
That said, tools for implementing crazy ideas can be found, yes.

Arseny Kravchenko, 2018-03-30
@Arseny_Info

Take a look at imbalanced-learn: contrib.scikit-learn.org/imbalanced-learn/stable
But in general, 90/10 is a healthy ratio for most cases.
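
If you still want to try it by hand, here is a rough numpy-only sketch of plain random oversampling (not SMOTE, and not the imbalanced-learn API). The helper names `true_share` and `oversample_column` and the toy data are made up for illustration. It measures the share of True values per column, then duplicates randomly chosen minority rows until one chosen column is 50/50.

```python
import numpy as np

def true_share(data):
    """Fraction of True values in each binary column."""
    return np.asarray(data, dtype=bool).mean(axis=0)

def oversample_column(data, col, seed=0):
    """Duplicate randomly chosen minority rows until column `col` is 50/50.

    Hypothetical helper: plain random oversampling, no synthetic rows.
    """
    data = np.asarray(data, dtype=bool)
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(data[:, col])    # rows where the column is True
    neg = np.flatnonzero(~data[:, col])   # rows where the column is False
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if len(minority) == 0:
        raise ValueError("column has only one class; nothing to resample from")
    # Draw, with replacement, as many minority rows as are missing.
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    return np.vstack([data, data[extra]])

# Toy data (made-up numbers): 1000 rows, 3 binary features, each about 90% True.
rng = np.random.default_rng(42)
X = rng.random((1000, 3)) < 0.9

print(true_share(X))       # share of True per column before resampling
X_bal = oversample_column(X, col=0)
print(true_share(X_bal))   # column 0 is now balanced; the others are not
```

Note that this balances only the chosen column. Every duplicated row also carries its values for the other columns, so in general they stay skewed, which is the same limitation as the one-feature tool mentioned in the question.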