Database
ifaceman, 2014-04-29 10:50:06

What algorithms are there for finding the best-matching sample (slice)?

Hello everyone!
There is a database of users with n fields describing them (gender, age, occupation, etc.). For each user, an arbitrary subset of these fields is filled in, or none at all.
Each user also has some statistics (for example, the number of system logins per month).
So for any combination of parameters (for example: gender+age, gender+age+marital_status, ...+...+*) we can compute average statistics, e.g. 30-year-old men logged into the system 32 times a month on average. Each such combination forms a slice.
What algorithms are there for finding the slice that best matches a particular user? That is, knowing some data about a user, we want to estimate their statistics from the average of the most appropriate slice.
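
To make the setup concrete, here is a minimal sketch of how the slice averages described above could be computed. The field names, the `logins` statistic, the sample rows and the helper `slice_averages` are all hypothetical and only illustrate the idea; this is not the actual schema.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data; None marks a field the user left empty.
users = [
    {"gender": "m", "age": 30, "marital_status": "single", "logins": 30},
    {"gender": "m", "age": 30, "marital_status": None, "logins": 34},
    {"gender": "f", "age": 30, "marital_status": "married", "logins": 20},
    {"gender": "m", "age": 45, "marital_status": "married", "logins": 10},
]
FIELDS = ("gender", "age", "marital_status")


def slice_averages(users, fields, stat="logins"):
    """Return {slice_key: (mean, count)} for every combination of filled fields.

    A slice key is a tuple of (field, value) pairs in the order given by `fields`.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for user in users:
        # Only fields that are actually filled in can define a slice.
        filled = [(f, user[f]) for f in fields if user.get(f) is not None]
        for size in range(1, len(filled) + 1):
            for combo in combinations(filled, size):
                totals[combo][0] += user[stat]
                totals[combo][1] += 1
    return {key: (total / count, count) for key, (total, count) in totals.items()}


averages = slice_averages(users, FIELDS)
# Average logins and sample size for the slice gender=m, age=30.
print(averages[(("gender", "m"), ("age", 30))])
```

The number of slices grows exponentially with the number of fields, and narrow slices contain few users, so the count stored next to each mean matters when deciding how much to trust a slice.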


2 answers
Andrew, 2014-04-29
@ifaceman

k-Nearest Neighbors (kNN)
In terms of this algorithm, your task comes down to three questions:
1) how to tune the weights (significance) of each parameter's contribution to the distance between neighbors
2) which kernel to choose
3) how to determine the optimal k for that kernel
All three have concrete answers in the form of algorithms, and there is a lot of literature on them.
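
A rough sketch of how the three choices above fit together, in plain Python. Everything specific here is an assumption made for illustration: the field names, the values in `WEIGHTS`, `AGE_SCALE`, the Gaussian kernel, the `bandwidth` and `k` defaults, and the sample users. Choosing them well (for example by cross-validation) is exactly what the literature covers and is not shown.

```python
import math

# Assumed per-field weights (question 1); in practice these would be tuned.
WEIGHTS = {"gender": 1.0, "age": 2.0, "occupation": 0.5}
AGE_SCALE = 50.0  # assumed age span used to normalise age differences


def distance(a, b):
    """Weighted distance over the fields that are filled in for both users."""
    total, weight_sum = 0.0, 0.0
    for field, weight in WEIGHTS.items():
        x, y = a.get(field), b.get(field)
        if x is None or y is None:
            continue  # ignore fields that either user left empty
        if field == "age":
            d = min(abs(x - y) / AGE_SCALE, 1.0)  # numeric field
        else:
            d = 0.0 if x == y else 1.0  # categorical field
        total += weight * d
        weight_sum += weight
    # With no shared fields, treat the users as maximally distant.
    return total / weight_sum if weight_sum else 1.0


def predict(query, users, k=3, bandwidth=0.5, stat="logins"):
    """Kernel-weighted mean of `stat` over the k nearest users (questions 2 and 3)."""
    nearest = sorted(users, key=lambda u: distance(query, u))[:k]
    kernel = [math.exp(-((distance(query, u) / bandwidth) ** 2)) for u in nearest]
    return sum(w * u[stat] for w, u in zip(kernel, nearest)) / sum(kernel)


# Hypothetical users; the query has only two of the three fields filled in.
users = [
    {"gender": "m", "age": 30, "occupation": "engineer", "logins": 32},
    {"gender": "m", "age": 33, "occupation": None, "logins": 28},
    {"gender": "f", "age": 60, "occupation": "doctor", "logins": 5},
]
query = {"gender": "m", "age": 31}
print(predict(query, users, k=2))
```

Unlike a fixed slice, this approach does not need every field to be filled in: the distance is computed only over the fields the two users share.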

Andrey Vershinin, 2014-04-29
@WolfdalE

Regression analysis
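
As a minimal illustration of the idea, the sketch below fits ordinary least squares with a single numeric predictor (age) to the login count. This is a deliberate simplification: the data is made up, `fit_line` is a hypothetical helper, and a real model would also need to encode the categorical fields and handle the unfilled ones.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: user ages and their logins per month.
ages = [20, 30, 40, 50]
logins = [40, 32, 24, 16]

slope, intercept = fit_line(ages, logins)
estimate = slope * 35 + intercept  # estimated logins for a 35-year-old user
print(slope, intercept, estimate)
```

Instead of looking up the average of one slice, the fitted model produces an estimate for any value of the predictor, including ones no existing user has.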
