MySQL
meta4, 2012-05-28 17:23:16

What is the optimal number of partitions for a large table in MySQL?

I have a table that may eventually hold a lot of data (~10^8 rows), something like a log. The table has a "grouping" field, and rows are selected by it. I want to use standard MySQL partitioning with hash(grouping field). So the question is: how many partitions should I choose? MySQL allows up to 1024 partitions. Is it true that, in theory, more partitions mean higher performance at the cost of more disk space? Or does it work differently?
More generally, I've heard rumors about problems with MySQL's built-in partitioning (partitions suddenly "collapsing" under unknown circumstances). Has anyone had a bad experience with MySQL's built-in partitioning mechanism?

2 answer(s)
ztxn, 2012-05-29
@ztxn

>> Is it true that, in theory, more partitions mean higher performance at the cost of more disk space?
No, that's not true. In the general case, partitioning costs performance. It only gives a win in specific cases.
As a rule, the gain comes when selecting on a predicate with low selectivity (one that matches a large share of the rows, i.e. the ratio of selected rows to total rows is high), for which using an index is less efficient than a full scan of the dataset. If that predicate is on the partition key, the scan is limited to the matching partitions, so the more partitions there are, the smaller the scan becomes.
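To make the "more partitions, smaller scan" point concrete, here is a rough Python sketch rather than anything MySQL actually runs. It assumes an integer, non-negative partition key and relies on the documented rule that `PARTITION BY HASH(expr)` places a row in partition `MOD(expr, N)`. The data (100,000 rows over 1,000 group ids) and the function names are made up for illustration.

```python
from collections import Counter

def partition_for(key: int, num_partitions: int) -> int:
    """Partition index for HASH partitioning: MOD(key, num_partitions).

    Assumes a non-negative integer key (an assumption of this sketch).
    """
    return key % num_partitions

def rows_per_partition(keys, num_partitions):
    """Count how many rows land in each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)

# Hypothetical data: 100,000 log rows spread evenly over 1,000 group ids.
keys = [row % 1000 for row in range(100_000)]

# A query with "group_id = 42" only has to scan the partition that holds 42.
for num_partitions in (1, 10, 100):
    counts = rows_per_partition(keys, num_partitions)
    target = partition_for(42, num_partitions)
    print(num_partitions, "partitions ->", counts[target], "rows to scan")
```

With evenly distributed keys the scanned partition holds about (total rows / number of partitions) rows. This only helps when the query filters on the partition key; a query that does not filter on it still touches every partition.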
A gain is also possible because partitions can be spread across different disks. In that case, two sessions scanning different partitions that physically sit on different disks do not compete for disk access, which gives a very tangible benefit, because disk operations are among the most expensive.
>> The table has a "grouping" field
It's not entirely clear what you mean here. If you are grouping by a field that is the partition key, you will most likely have to scan the entire recordset, all partitions, so the first case I described definitely does not apply to you. I very much doubt that MySQL can parallelize the execution of a statement the way Oracle does, so the second case is unlikely to apply to you either.

ivnik, 2012-05-28
@ivnik

One problem we ran into concerns unique indexes: "every unique key on the table must use every column in the table's partitioning expression".
I may be misremembering, since it was a while ago, but it seems that MySQL opens a file for each partition, i.e. you need to raise the limit on the number of open files (ulimit).
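To show what the unique-key rule means in practice, here is a small Python sketch that checks a table layout against it. The column names (`id`, `group_id`) and the helper function are hypothetical; this is not a MySQL API, only the rule from the quote expressed as a set comparison.

```python
def keys_missing_partition_columns(partition_columns, unique_keys):
    """Return {key_name: missing columns} for unique keys that break the rule.

    partition_columns: columns used in the partitioning expression.
    unique_keys: mapping of key name -> list of columns (PRIMARY KEY included).
    """
    required = set(partition_columns)
    return {
        name: sorted(required - set(columns))
        for name, columns in unique_keys.items()
        if not required <= set(columns)
    }

# Hypothetical log table partitioned by HASH(group_id).
partition_columns = ["group_id"]

# PRIMARY KEY (id) alone lacks the partitioning column.
bad = {"PRIMARY": ["id"]}
# PRIMARY KEY (id, group_id) includes it.
good = {"PRIMARY": ["id", "group_id"]}

print(keys_missing_partition_columns(partition_columns, bad))
print(keys_missing_partition_columns(partition_columns, good))
```

The usual consequence is that an auto-increment `id` can no longer be the whole primary key on its own: the partitioning column has to be added to it, and to every other unique key as well.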
