D
D
Dmitry T.2012-08-29 19:09:01
Database
Dmitry T., 2012-08-29 19:09:01

How to share a large table of private messages?

Yes:
There is a table of personal messages, the number of entries is approaching 100M :( and the speed of working with this table is decreasing and decreasing. The table is often written and often read.

Messages:
msg_id - auto_increment
from_id - sender user ID
to_id - recipient user ID
subj - subject
date - creation time
status - new, read, replied

Question:
How (in what) way can such a table be divided? I would like to hear different opinions about the separation itself and how all this can be implemented in code.
They thought about partitions, they refused due to the fact that restoring the entire table from backup takes a lot of time. Yes, and creating partitions on such a table takes a lot of time.
Thanks for the help. I will answer any additional questions :) Well, I will write an article if it is interesting and relevant on this issue.
UPDATE:
Thanks everyone. So far, I like the following ideas the most.
  • Make archive tables and make a table of contents for them
  • And also split the table into several, leaving only the most necessary fields in the main one

Who will have ideas how it can be improved or done in a completely different way? ..
I thought in the direction of sharding, I have no idea how this can be done with messages. When one object is owned by more than one user :(
UPDATE2:
To the first update of the addition
  • Move all this stuff to InnoDB
  • Consider moving to uuid

Answer the question

In order to leave comments, you need to log in

6 answer(s)
J
jarvis, 2012-08-30
@jarvis

Wow, well, you tightened up with scaling! I will share a link to the Pinterest architecture presentation , in which they talk about how they overcame this problem and share code examples. Perhaps it will help.

I
Ivan, 2012-08-30
@Praeses

Automatic (user click) archiving to arch_msg table by date? With the same fields and archiving date entry?

S
sl_bug, 2012-08-31
@sl_bug

As for sharding, I advise you to watch the video univertv.ru/video/informatika/programmirovanie/ruby_on_rails_moscow/rails_do_scale/ there are rails, but sharding is also affected well.

B
bugman, 2012-08-30
@bugman

It is necessary to stomp on how the selects are spinning on this table. Based on the assumption that most often the client needs to show all his messages, it is logical to do hash partitioning by the recipient ID. The number of partitions to pick up empirically. If the muscle supports double partitioning, the second level of partitions can be sliced ​​by date. The most distant partitions can be dropped or transferred to the archive.

E
edogs, 2012-08-30
@edogs

For some reason, there is a feeling that you do not have indexes on the table.
100Mb, while 99.99% of your selections (if admins do not monitor the PM by keywords) then the selection by from_id to_id, if there are indices on these fields, everything should fly at 100mb, then everything, even if you have users by 10% personals generating.
In extreme cases, you can cut the table into 2 parts. Load auto_id into one, from_id to_id, into another auto_id and everything else.

T
TimTowdy, 2012-08-31
@TimTowdy

Real examples of slow queries, they explain what type of table (innodb/myisam), what indexes, how performance changes under load, have you tried query profiling?
Surprised by the local telepaths who give advice without knowing the conditions.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question