What are the algorithms for finding the optimal sample/slice?
Hello everyone!
There is a user database with n fields describing each user (gender, age, occupation, etc.). For any given user, an arbitrary subset of these fields is filled in, or none at all.
There are also some statistics for each user (for example, the number of system logins per month).
Accordingly, for any combination of parameters (for example: gender+age, gender+age+marital_status, ...+...+*), it is possible to compute an average statistic: say, 30-year-old men logged into the system 32 times a month on average. Each such combination of parameter values defines a slice.
What algorithms are there for finding the slice that most closely matches a particular user? That is, knowing some data about a user, we can estimate that user's statistics by looking at the average for the most appropriate slice.
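To make the setup concrete, here is a minimal sketch of how the slice averages described above could be computed. The records, the field names and the `slice_averages` helper are all made up for illustration; a real system would probably do this with SQL `GROUP BY` or a dataframe library instead.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical user records; None marks a field that was not filled in.
users = [
    {"gender": "m", "age": 30, "occupation": "engineer", "logins": 30},
    {"gender": "m", "age": 30, "occupation": None, "logins": 34},
    {"gender": "f", "age": 25, "occupation": "teacher", "logins": 12},
    {"gender": "f", "age": 30, "occupation": None, "logins": 20},
]
FIELDS = ("gender", "age", "occupation")


def slice_averages(records, fields=FIELDS, stat="logins"):
    """Return {((field, value), ...): (mean, count)} for every slice.

    A record contributes to every combination of its filled-in fields.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for rec in records:
        filled = [(f, rec[f]) for f in fields if rec.get(f) is not None]
        for size in range(1, len(filled) + 1):
            for key in combinations(filled, size):
                totals[key][0] += rec[stat]
                totals[key][1] += 1
    return {key: (s / n, n) for key, (s, n) in totals.items()}


averages = slice_averages(users)
# Slice "gender=m, age=30": mean logins and number of users in the slice.
print(averages[(("gender", "m"), ("age", 30))])  # (32.0, 2)
```

Keeping the user count next to each mean matters here: the number of slices grows exponentially with the number of fields, so narrow slices quickly end up with only one or two users behind them.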
k-Nearest Neighbors (kNN)
In terms of this algorithm, your task comes down to three questions:
1) how to tune the weight (significance) of each parameter in the distance between neighbors
2) which kernel to choose
3) how to determine the optimal k for that kernel
All three have specific answers in the form of algorithms, and there is plenty of literature on them.
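A rough sketch of how the three pieces might fit together, under several assumptions of mine: the records, the `WEIGHTS` and `AGE_SCALE` values, the Gaussian kernel and the leave-one-out selection of k are illustrative choices, not something prescribed above. Missing fields are simply skipped in the distance, in the spirit of Gower distance.

```python
import math

# Hypothetical records; None marks an unfilled field.
users = [
    {"gender": "m", "age": 30, "logins": 30},
    {"gender": "m", "age": 31, "logins": 34},
    {"gender": "m", "age": None, "logins": 28},
    {"gender": "f", "age": 25, "logins": 12},
    {"gender": "f", "age": 27, "logins": 14},
    {"gender": None, "age": 26, "logins": 15},
]
# Assumed feature weights and age scale; in practice these would be tuned.
WEIGHTS = {"gender": 1.0, "age": 1.0}
AGE_SCALE = 10.0


def distance(a, b):
    """Weighted distance over the fields filled in for both records."""
    total, weight_sum = 0.0, 0.0
    for field, w in WEIGHTS.items():
        x, y = a.get(field), b.get(field)
        if x is None or y is None:
            continue  # skip fields missing on either side
        if field == "age":
            d = min(abs(x - y) / AGE_SCALE, 1.0)  # numeric, capped at 1
        else:
            d = 0.0 if x == y else 1.0  # categorical mismatch
        total += w * d
        weight_sum += w
    # No comparable fields at all: treat the records as maximally distant.
    return total / weight_sum if weight_sum else 1.0


def gaussian_kernel(d, bandwidth=0.5):
    return math.exp(-((d / bandwidth) ** 2) / 2)


def predict(query, records, k, stat="logins"):
    """Kernel-weighted mean of the statistic over the k nearest records."""
    nearest = sorted(records, key=lambda r: distance(query, r))[:k]
    weights = [gaussian_kernel(distance(query, r)) for r in nearest]
    return sum(w * r[stat] for w, r in zip(weights, nearest)) / sum(weights)


def best_k(records, candidates=(1, 2, 3), stat="logins"):
    """Pick k by leave-one-out mean absolute error."""
    def loo_error(k):
        errors = []
        for i, rec in enumerate(records):
            rest = records[:i] + records[i + 1:]
            errors.append(abs(predict(rec, rest, k, stat) - rec[stat]))
        return sum(errors) / len(errors)
    return min(candidates, key=loo_error)


k = best_k(users)
estimate = predict({"gender": "m", "age": 30}, users, k)
print(k, round(estimate, 1))
```

The same leave-one-out error can be reused to compare different weight settings or kernel bandwidths, which is one simple way to approach questions 1 and 2.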