Answer the question
In order to leave comments, you need to log in
Proper organization of the database for working with statistical data
I'm not a database guru, so I can't solve the problem I'm facing without risking more problems later. :)
So, the initial conditions : PHP5 and MySQL with a minimum of features, the choice of MyISAM and InnoDB, but after reading about one and the other, I refused MyISAM (correct me if suddenly I was wrong).
Task : unfortunately, I cannot describe the task specifically, but I will try to state the essence as much as possible - the wording, in my opinion, is quite trivial.
In short, you can imagine that I collect statistics about fruit lovers.
On the one hand, there is Vasya, about whom I know the following: what fruits he eats, how much he eats, and which ones he loves the most. The favorite fruit is considered based on a five-point rating that the eater himself can once give, and the number of times this fruit has been eaten.
On the other hand, there is Petya and hundreds of other people about whom I know the same thing. Based on the available statistics, I need to offer Vasya those fruits that he did not eat, but which he might like.
The fruits recommended for Vasya are searched as follows: we take Vasya's favorite fruits and look for those people who also love them. Then we look for those fruits from these people that they like and that Vasya did not eat, sort them by the number of matches and issue the Top 10.
Thus, there is a table with rows of the following form:
[User] [Fruit] [How Much Eaten] [Five Points]
From these records, I need to get the very fruits that Vasya might like.
Purpose : due to limited resources, it is necessary to minimize the load on the database, which I planned to do through properly organized work with the database, that is:
Answer the question
In order to leave comments, you need to log in
The fruits recommended for Vasya are searched as follows: we take Vasya's favorite fruits and look for those people who also love them. Then we look for those fruits from these people that they like and that Vasya did not eat, sort them by the number of matches and issue the Top 10.
function buy( userId, fruitId ) {
for fruit in SELECT * FROM kuplenno {
INSERT INTO recomendacii VALUES ( fruitId, fruit, 1 ) ON DUBLICATE KEY
UPDATE recomendacii where fruitID = fruitID AND fruitID2 = fruit
}
INSERT INTO kuplenno VALUES ( userId, fruitId )
}
function getBestFruits( userId ) {
return
SELECT fruitID2, sum( cnt ) as ocenka FROM kuplenno, recomendacii
WHERE userID = userId AND kuplenno.fruitID = recomendacii.fruitID AND frutID not in (SELECT fruitID FROM kuplenno where userID = userId )
GRUP BY fruitID2
ORDER BY ocenka
LIMIT 10
}
It seems like this method better predicts the next purchase.
You need several tables:
User table
- id
- name
Fruit table
- id
- fruit name
User preference table
- user
id - fruit id
- how many times eaten
- rating
Here, of course, the question arises how the rating is calculated on a five-point scale, and when the fruit is evaluated, once or every time after eating?
For a day, such a structure will be enough to operate on data for everyone
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question