Social network wall: selecting data
This is about a social network wall like the ones on VKontakte or Facebook, where a wall post can contain different types of content: news, a blog post, several images, a poll, etc.
I've settled on the following database structure for now:
Wall table:
id | description
1 | "This is the description of the first entry on the wall"
2 | "And this is the second post on the wall"
Table "wall_element":
id | wall_id | component | element | element_id
1 | 1 | image | item | 1
2 | 1 | image | item | 2
3 | 1 | image | item | 3
4 | 2 | catalog | item | 1
5 | 2 | catalog | item | 2
6 | 2 | catalog | shop | 1
7 | 2 | catalog | shop | 2
The sample records, like the table structures, are schematic. The "wall" table should be self-explanatory. The "wall_element" table lists the pieces of content of different types attached to a wall post. Its fields mean the following:
wall_id is the id of the post on the wall,
component is the name of the component. My components are, for example, catalog, image, news, blog ...
element is a structural unit of the component. The "catalog" component, for example, can have several such units: "item" or "product" is the product itself, "shop" is a store, "category" is a product category, etc. The same goes for the "image" component, which is responsible for uploading and displaying photos.
element_id is the id of the structural unit of the component, i.e. the id of the product, store, photo, blog post, news item, etc. Structural units also have their own tables, such as catalog_product, catalog_shop, image_item, blog_item, news_item, etc., which hold the information about them.
Question: what is the right way to pull this information from the database? One query is not enough here, and even if it could somehow be done in one, I would rather not, because that query would be heavy. From the "wall_element" table I will get the ids of, for example, all the photos or all the videos attached to a wall post. What queries are best for pulling out these ids? And what queries should pull the data from the image_item table, i.e. how do I retrieve the photo data?
PS Of course, I know how to get everything out somehow, but I would like other points of view, preferably from more experienced developers.
UPD: Still, I'll describe how I see it:
1) The first query pulls 10-20 records from the "wall" table. This query is very light; even with several million records there should be no problems.
2) In a loop over these 10-20 records, I pull the data from the "wall_element" table. Each iteration gives me a list of photos, blog posts, news, etc. This data then has to be processed, i.e. handed to the component it belongs to, which returns html. But those same 20 photos need to be grouped somehow: the information from the "image_item" table is fetched either with 20 queries (very light ones, since they go by id) or, more elegantly, with a single query ... This is how, in one iteration, I collect the html of the different structural units of a component…
I.e. the upside is that the queries are light, and the downside is that rendering a page takes a lot of queries ... Caching is always an option, of course, but first everything should be done as correctly as possible ...
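A minimal sketch of the grouping idea from step 2, written in Python with sqlite3 only so that it runs as-is. Instead of one query per photo, it collects the ids per (component, element) pair and fetches each type with a single IN (...) query, so a page costs two queries plus one per content type. The ELEMENT_TABLES mapping, the load_wall function and the shape of the returned data are assumptions made for illustration; only the table names come from the question.

```python
import sqlite3
from collections import defaultdict

# Hypothetical mapping from (component, element) to the table that stores it.
ELEMENT_TABLES = {
    ("image", "item"): "image_item",
    ("catalog", "item"): "catalog_product",
    ("catalog", "shop"): "catalog_shop",
}


def placeholders(values):
    """Return '?, ?, ?' with one marker per value, for an IN (...) clause."""
    return ", ".join("?" for _ in values)


def load_wall(conn, limit=20):
    """Load posts and their attachments with one query per table, not per row."""
    posts = conn.execute(
        "SELECT id, description FROM wall ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    if not posts:
        return []
    post_ids = [post_id for post_id, _ in posts]

    # One query for every attachment of every selected post.
    elements = conn.execute(
        "SELECT wall_id, component, element, element_id FROM wall_element "
        f"WHERE wall_id IN ({placeholders(post_ids)}) ORDER BY id",
        post_ids,
    ).fetchall()

    # Group the referenced ids by type so that each table is queried once.
    ids_by_type = defaultdict(set)
    for _, component, element, element_id in elements:
        ids_by_type[(component, element)].add(element_id)

    rows_by_type = {}
    for key, ids in ids_by_type.items():
        table = ELEMENT_TABLES[key]  # table names come from code, never from user input
        id_list = sorted(ids)
        cursor = conn.execute(
            f"SELECT * FROM {table} WHERE id IN ({placeholders(id_list)})", id_list
        )
        columns = [column[0] for column in cursor.description]
        rows = [dict(zip(columns, row)) for row in cursor]
        rows_by_type[key] = {row["id"]: row for row in rows}

    # Attach the loaded rows back to the posts they belong to.
    attachments = defaultdict(list)
    for wall_id, component, element, element_id in elements:
        data = rows_by_type[(component, element)].get(element_id)
        if data is not None:
            attachments[wall_id].append(
                {"component": component, "element": element, "data": data}
            )

    return [
        {"id": post_id, "description": description, "elements": attachments[post_id]}
        for post_id, description in posts
    ]
```

Each component can then render its own entries from the "elements" list without touching the database again.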
UNION is not so bad, especially UNION ALL, where there is no need to deduplicate records. At the very least it will be no slower than executing each query separately, because in theory it can be executed in parallel on several cores.
But if that doesn't suit you, let's start from the basics. You postulate the following: "we can't select everything in one query, so as not to interfere with the model".
This means that the minimum number of queries equals the number of data types. For that you have to run separately the queries that I combined with UNION, and then manually sort this mess by date in code and group it by wall_id. This is a bad way.
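For reference, a UNION ALL over the element tables could look roughly like the sketch below. It is not the exact query from this answer: the path and title columns, the sort_key and payload aliases, and the fetch_elements helper are all assumptions, and Python's sqlite3 is used only to make the SQL runnable. Each branch reduces one content type to the same column list, and the database does the sorting.

```python
import sqlite3

# Hypothetical columns: image_item.path and blog_item.title are not in the question.
WALL_ELEMENTS_SQL = """
SELECT we.wall_id AS wall_id,
       we.id      AS sort_key,
       'image'    AS kind,
       i.id       AS element_id,
       i.path     AS payload
FROM wall_element we
JOIN image_item i ON i.id = we.element_id
WHERE we.component = 'image' AND we.element = 'item'
UNION ALL
SELECT we.wall_id, we.id, 'blog', b.id, b.title
FROM wall_element we
JOIN blog_item b ON b.id = we.element_id
WHERE we.component = 'blog' AND we.element = 'item'
ORDER BY wall_id, sort_key
"""


def fetch_elements(conn):
    """Return every attached element as (wall_id, sort_key, kind, element_id, payload)."""
    return conn.execute(WALL_ELEMENTS_SQL).fetchall()
```

In a real page query each branch would also be restricted to the 10-20 selected wall ids.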
I offer this option:
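SELECT
    w.id,
    w.description,
    COUNT(DISTINCT i.id) AS i_cnt,    -- images attached to the post
    COUNT(DISTINCT bp.id) AS bp_cnt   -- blog posts attached to the post
FROM wall w
-- LEFT JOINs keep posts that have only one of the two content types
LEFT JOIN wall_element we_i
    ON we_i.wall_id = w.id AND we_i.component = 'image'
LEFT JOIN image i
    ON i.id = we_i.element_id
LEFT JOIN wall_element we_bp
    ON we_bp.wall_id = w.id AND we_bp.component = 'blog'
LEFT JOIN blog_post bp
    ON bp.id = we_bp.element_id
GROUP BY w.id, w.description, w.timestamp
ORDER BY w.timestamp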
I would do this: (read to the end)
1) Create a wall table with the fields id, author_user_id, author_user_name, owner_user_id, wall_content, where
- id is the identifier of the entry in the table
- author_user_id is the identifier of the user who created the post
- author_user_name is that user's name, stored here as well so as not to make an extra query to the user tables
- owner_user_id is the identifier of the owner of the wall
- wall_content is the content of the post: a json with the fields title, description, images, videos, polls and whatever else you have. Example:
content: {
    title: "SuperProPost",
    Mysuper_Puper_Images: ["Photo_link1", "Photo_link2"],
    Mysuper_Puper_Video_links: ["videolink1", "videolink2", ...],
    etc.
}
Select * from wall_1234 (it is better to use sharding so that there are not 20 million records in a single table) where owner_id=Owner_id_from_request Limit 0, 20 (for example)
You shouldn't assemble the whole wall from separate tables; that is too heavy.
As for "caching": what is the problem with caching the response to my query? Memcached, for instance, will happily cache the results of queries like
Select * from wall_1234 where owner_id=Owner_id_from_request Limit 0, 20
content: {
    description: "Quick, rate these super-cool photos!",
    Mysuper_Puper_Images: [{
        Photo_link1: {
            preview: "images.site.com/dsfgdfgsdg.jpg",
            link: "www.site.com/viewphoto/345345345345"
        },
        Photo_link2: {
            preview: "images.site.com/dsfgdfgsdg.jpg",
            link: "www.site.com/viewphoto/345345345345"
        }
    }],
    Mysuper_Puper_Video_links: [{
        videolink1: {}, videolink2: {}
    }],
    etc.
}
Select * from wall_1234 where owner_id=Owner_ID Limit 0, 20
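A rough sketch of this denormalized approach, assuming a single unsharded wall table with the owner_user_id and wall_content fields described above. The save_post and load_wall_page helpers are hypothetical, and a plain dict stands in for memcached so that the example runs on its own.

```python
import json
import sqlite3

# Stand-in for memcached: a dict keyed by (owner_user_id, offset, limit).
_cache = {}


def save_post(conn, owner_user_id, content):
    """Store the whole post content as one JSON document."""
    conn.execute(
        "INSERT INTO wall (owner_user_id, wall_content) VALUES (?, ?)",
        (owner_user_id, json.dumps(content)),
    )
    # Drop the cached pages of this wall so that the new post becomes visible.
    for key in [k for k in _cache if k[0] == owner_user_id]:
        del _cache[key]


def load_wall_page(conn, owner_user_id, offset=0, limit=20):
    """Return one page of a wall, newest first, using the cache when possible."""
    key = (owner_user_id, offset, limit)
    if key not in _cache:
        rows = conn.execute(
            "SELECT id, wall_content FROM wall WHERE owner_user_id = ? "
            "ORDER BY id DESC LIMIT ? OFFSET ?",
            (owner_user_id, limit, offset),
        ).fetchall()
        _cache[key] = [
            {"id": post_id, "content": json.loads(raw)} for post_id, raw in rows
        ]
    return _cache[key]
```

The trade-off is that a page needs only one query and no joins, but the JSON copies of image and video data have to be kept in sync with their source tables by the application.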
How are you going to pull out "those 20 photos that were added to the wall" if there is no user ID anywhere?
If you just need all the photos on the wall, then like this:
SELECT
w.id,
w.description,
we.*
FROM
wall w
INNER JOIN wall_element we
ON w.id = we.wall_id
What you describe in the question, if you actually do it that way, is bad practice. Don't solve SQL problems in PHP or any other language, and above all don't query the database in a loop or recursion.
It should be the same with classes and with databases: each class is responsible for its own thing, and each table is responsible for its own. The pictures table stores everything you need about pictures, the video table likewise. But the posts table should be linked to all the components of a post, as well as to its author.
Martin Gruber, "Understanding SQL": they say the translation is not the best, but I really liked the book, and I recommend it to you.
By the way, sorry for the off-topic, but I took a quick look at your site. The first thing I looked at was this:
excalibur.com.ua/blog/php-c3/%D0%B3%D0%B5%D0%BD%D0%B5%D1%80%D0%B0%D1%82%D0%BE%D1%80-%D0%BF%D0%B0%D1%80%D0%BE%D0%BB%D0%B5%D0%B9-%D0%BD%D0%B0-php-i22.html
You are trying to determine which one works faster, yet both call count($arr) and strlen($chars) INSIDE the loop. Didn't it occur to you that if you move the calculation of the array length and the string length out of the loop, your function will speed up by about $length times? And after that you reproach someone for lack of experience?
If I understood correctly:
With a single query using joins you pull out all the data. A row whose description, for example, is NULL is not a post but a post element (photo, video ...). In the loop that processes the result, you can immediately see how to render each row.
That is, the whole idea is to reduce the query result to a single format, which your code then routes wherever it is needed.
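One possible reading of this idea, as a sketch: posts and their elements come back in one result set with the same columns, and the rendering loop decides what each row is by checking whether description is NULL. It relies on wall.description never being NULL for a real post. FEED_SQL, the RENDERERS table and the html snippets are assumptions for illustration, and UNION ALL is used here instead of joins to get the NULL-marked rows.

```python
import sqlite3
from html import escape

# Post rows carry a description; element rows carry NULL there instead.
FEED_SQL = """
SELECT w.id AS wall_id, 0 AS sort_key, w.description AS description,
       NULL AS component, NULL AS element_id
FROM wall w
UNION ALL
SELECT we.wall_id, we.id, NULL, we.component, we.element_id
FROM wall_element we
ORDER BY wall_id, sort_key
"""

# Hypothetical renderers, one per component name.
RENDERERS = {
    "image": lambda element_id: f'<img data-id="{element_id}">',
    "blog": lambda element_id: f'<a data-blog-id="{element_id}"></a>',
}


def render_unknown(element_id):
    """Fallback for components that have no renderer registered."""
    return f"<!-- unknown element {element_id} -->"


def render_wall(conn):
    """Return the html fragments of the wall in display order."""
    parts = []
    for _wall_id, _sort_key, description, component, element_id in conn.execute(FEED_SQL):
        if description is not None:
            # A row with a description is the post itself.
            parts.append(f"<h2>{escape(description)}</h2>")
        else:
            # A NULL description marks an element attached to the preceding post.
            parts.append(RENDERERS.get(component, render_unknown)(element_id))
    return parts
```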