hbuser, 2014-01-30 11:49:54
memcached

memcached. What to cache?

Hello.
The theoretical model is this. There is a simple site, a social network. The site has two types of pages: permanent (static) ones and ones with dynamic content. Static: the data is taken from normalized database tables using various JOINs. Dynamic: pages with various blocks of information, such as the current user's friends list. The site is used by a huge number of people, and memcached has to be used (this is just a model to isolate the essence of the question). There are plenty of descriptions online of how memcached itself works, but unfortunately I could not find anything suitable about caching strategies.
Memcached is a simple key => value store. As far as I understand, database queries are the most common performance bottleneck. So, to improve performance in this respect, the number of database queries has to be minimized (and they can be complex and heavy, especially with normalized tables). Here memcached acts as an intermediary between the database and the web application. For the static pages in the site model described above, we can move them into a separate "static pages" entity table with an edited_at field. By comparing the value of this field with the time the data was added to the cache, we either serve the page content from the cache or rebuild it with database queries (the same principle as a simple file cache).
Now for other, more specific cases whose implementation is less obvious.
1) Catalog pages (content that is displayed page by page and changes often: news, lists, product catalogs, although those are off-topic here, etc.). To be specific, let it be a very frequently updated news page.
2) Pages with, so to speak, widgets, i.e. small blocks of information whose content may differ from the previous load each time. For example, the user's friends list. To be specific, let this be the second case.
What is the best way to implement caching in these two cases? To start somewhere, here are my thoughts.
1. In this case the whole page cannot be the caching object, i.e. what I put into the value of that (key => value) pair, because depending on the amount of news the page content changes and the news items "shift". The scale of the caching object has to be "narrowed". A news item is a separate entity. How items are grouped on the page doesn't matter. This means the caching object can be a single news item, whose main table has an edited_at field that is updated every time this table or the tables related to it change. The first time a news item is fetched from the database (let these be complex JOIN queries too, to make it more meaningful), it goes into the cache. Later, when each news item is output, the time it was placed in the cache is compared with the time it was modified, and the most recent version is displayed. Everything seems fine, but there are problems. First, if the table has many relations, including many-to-many, it can be tricky to keep the edited_at field in the main table up to date. Second, in the loop that outputs the news there will be a freshness check for every item (the comparison of the cache time and the database time described above), and that takes time too. Wouldn't this time cost negate the very idea of using the cache in this way?
2. Friends list. It is displayed for each user. There can be many queries involved (fetching the list itself, information about each friend, pictures). And this set of queries can be repeated on every page each user opens. That makes it a serious performance bottleneck. My thoughts are these. Each such block of information is treated as a widget. Each widget has its own entity table, and an edited_at field is created for each widget. And... the same scheme as above. If the entry in memcached is fresh, I take it from there. If not, I delete the cache entry, read from the database, and write to the cache. But again I catch myself thinking that something is not right here. The widget is fine, but what about the friends page, which displays even more friends page by page, each of whom drags along a "bunch" of queries?
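The edited_at comparison for static pages described above could look roughly like the sketch below. It is only an illustration under assumptions: `FakeCache` is an in-memory stand-in for a memcached client, and `PAGES`, `db_edited_at`, `db_render_page` and `get_page` are hypothetical names, not part of any real library.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption, not a real client)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


# Hypothetical "static pages" table: page_id -> row with an edited_at timestamp.
PAGES = {1: {"body": "About us", "edited_at": 100}}
db_reads = {"light": 0, "heavy": 0}  # counters, only to make the effect visible


def db_edited_at(page_id):
    """Cheap query: read only the edited_at column."""
    db_reads["light"] += 1
    return PAGES[page_id]["edited_at"]


def db_render_page(page_id):
    """Stands in for the heavy JOIN queries that assemble the page."""
    db_reads["heavy"] += 1
    return "<h1>%s</h1>" % PAGES[page_id]["body"]


def get_page(cache, page_id, now):
    """Serve from the cache if the entry is at least as new as edited_at."""
    key = "page:%d" % page_id
    entry = cache.get(key)
    if entry is not None and entry["cached_at"] >= db_edited_at(page_id):
        return entry["html"]
    html = db_render_page(page_id)
    cache.set(key, {"html": html, "cached_at": now})
    return html
```

In this sketch a cache hit still costs one light query for edited_at, so the saving comes only from skipping the heavy JOINs.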
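For point 1 (news), one possible way around the per-item freshness check is to drop the cached item when the news is edited and to fetch all items of a page with a single multi-key read. This is a different technique from the edited_at comparison in the post, shown as a minimal sketch: `FakeCache` is an in-memory stand-in whose `get_multi` imitates the multi-key get that memcached clients commonly offer, and all other names are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption, not a real client)."""
    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        # Return only the keys that are present, in one round trip.
        return {k: self._data[k] for k in keys if k in self._data}

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


NEWS = {1: "first", 2: "second", 3: "third"}  # hypothetical news table
heavy_queries = []  # ids loaded from the "database", to make misses visible


def db_load_news(news_id):
    """Stands in for the complex JOIN query that builds one news item."""
    heavy_queries.append(news_id)
    return {"id": news_id, "title": NEWS[news_id]}


def db_update_news(cache, news_id, title):
    """Write path: change the row, then drop the stale cache entry."""
    NEWS[news_id] = title
    cache.delete("news:%d" % news_id)


def get_news_page(cache, ids):
    """Read path: one multi-get for the page, database only for the misses."""
    keys = {"news:%d" % i: i for i in ids}
    found = cache.get_multi(list(keys))
    items = []
    for key, news_id in keys.items():
        item = found.get(key)
        if item is None:
            item = db_load_news(news_id)
            cache.set(key, item)
        items.append(item)
    return items
```

With this layout the read path never compares timestamps. The cost moves to the write path, which has to delete the key wherever the item or its related tables change.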
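For point 2 (friends), the widget and the paginated friends page could share one per-user version number that is part of every cache key, so a single change invalidates all of them at once. This key-versioning idea is not from the post. The sketch below assumes an in-memory `FakeCache` stand-in, and every function name and key format in it is hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption, not a real client)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


FRIENDS = {7: ["ann", "bob", "cid", "dee", "eve"]}  # hypothetical friends table
PAGE_SIZE = 2
db_queries = []  # (user_id, page) pairs that reached the "database"


def db_friends_page(user_id, page):
    """Stands in for the bunch of queries behind one page of friends."""
    db_queries.append((user_id, page))
    start = page * PAGE_SIZE
    return FRIENDS[user_id][start:start + PAGE_SIZE]


def friends_version(cache, user_id):
    """Current version of this user's friends data, created on first use."""
    key = "friends_ver:%d" % user_id
    ver = cache.get(key)
    if ver is None:
        ver = 1
        cache.set(key, ver)
    return ver


def get_friends_page(cache, user_id, page):
    """The version is part of the key, so old entries are simply never read."""
    ver = friends_version(cache, user_id)
    key = "friends:%d:v%d:p%d" % (user_id, ver, page)
    names = cache.get(key)
    if names is None:
        names = db_friends_page(user_id, page)
        cache.set(key, names)
    return names


def add_friend(cache, user_id, name):
    """Write path: change the data, then bump the version to invalidate all pages."""
    FRIENDS[user_id].append(name)
    cache.set("friends_ver:%d" % user_id, friends_version(cache, user_id) + 1)
```

Entries stored under an old version are left behind and are expected to fall out of a real memcached through its normal eviction. Each read costs one extra cache lookup for the version key.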

1 answer(s)
webportal, 2014-01-30

A simple answer really won't be enough here... This is a question for a whole series of articles... I can consult by voice over Skype, but on the condition that you write an article afterwards))
