MySQL
Ivan Ivanov, 2017-03-02 15:22:23

Which DBMS should I choose for large amounts of data (tens of gigabytes to a hundred gigabytes)?

I needed to add a couple of indexes to a 60 GB InnoDB table in MySQL. I used pt-online-schema-change, but the new table with the new indexes has been building for almost a day, and its size has not even reached 4 GB yet. Is there perhaps a DBMS where changing the table structure (adding fields, creating indexes) is not so painful? I also need the Kohana framework's ORM to work with it - very convenient - and something similar to phpMyAdmin. What do you advise?
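For what it's worth, if staying on MySQL is an option: since MySQL 5.6, InnoDB can usually add a secondary index in place, without copying the whole table. A minimal sketch, assuming a post table and a user_id column (hypothetical names):

    -- build the secondary index in place, allowing concurrent reads and writes
    ALTER TABLE post
        ADD INDEX idx_user_id (user_id),
        ALGORITHM=INPLACE, LOCK=NONE;

With the explicit ALGORITHM/LOCK clauses, the statement fails immediately if the server cannot do it in place, instead of silently falling back to a full table copy.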


4 answers
Alexander N++, 2017-03-02
@sanchezzzhak

Postgres has fast support for altering tables, and it doesn't have MySQL's problem
where you delete data and the data file never shrinks...
Anyway, here is how I solve it:
create a new table,
add the new fields (and create the indexes right away - doing it later will be unrealistic),
then set AUTO_INCREMENT with a margin of 100k+ above the current maximum,
then write a copy script (see below).
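Before the script, those preparation steps might look like this in MySQL - a minimal sketch, assuming the table is called post as in the script below; the extra_field column, the idx_user_id index, and the AUTO_INCREMENT value are illustrative:

    -- clone the structure, then add the new field and index up front
    CREATE TABLE new_post LIKE post;
    ALTER TABLE new_post
        ADD COLUMN extra_field INT NULL,
        ADD INDEX idx_user_id (user_id);
    -- reserve id space 100k+ above the current maximum so new writes don't collide
    ALTER TABLE new_post AUTO_INCREMENT = 70270509;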

// assumes $connection is the DB connection and $file is the path to a log file
$limit = 10000;
$count = 0;
$lastId = 1;        // last copied id; can be changed by hand (if the script hangs)
$endId = 70170509;  // maximum id in the table, up to which we copy

$sqlTemplate = "INSERT INTO new_post (id, user_id, text)
    SELECT id, user_id, text FROM post
    WHERE id > :lastId: AND id < {$endId} ORDER BY id ASC LIMIT {$limit}";

$sql = str_replace(':lastId:', $lastId, $sqlTemplate);
while ($res = $connection->createCommand($sql)->execute()) {
    // get the id of the last row copied into the new table
    $lastId = $connection->createCommand('SELECT id FROM new_post ORDER BY id DESC LIMIT 1')->queryScalar();

    $count += $limit;

    file_put_contents($file,
        "processed " . number_format($count, 0, '.', ' ') . " rows\nlast id " . $lastId . "\n\n", FILE_APPEND);

    $sql = str_replace(':lastId:', $lastId, $sqlTemplate);
}
file_put_contents($file, "--done---\n\n", FILE_APPEND);

Then we rename the tables - that operation is fast.
30 GB gets copied over in about 20 minutes.
Check the difference and copy over the remaining rows before switching.
It's roughly an analogue of the Percona tool, but it doesn't slow things down :)
Run the script from the console; it's best to start it inside
`screen` and keep it in the background in case the terminal session hangs or the internet connection drops.
Choose the database for the task: my table held 120 GB of statistics, I chose an analytical database and have had no trouble with it.
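The final switch can be a single atomic statement - a sketch assuming the post/new_post names from the script above, keeping the old table as a backup:

    -- swap the tables atomically; the old data stays available as post_old
    RENAME TABLE post TO post_old, new_post TO post;

RENAME TABLE swaps all the listed tables in one step, so readers never see a missing table.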

sim3x, 2017-03-02
@sim3x

There are not that many DBMSs to choose from.
There is Postgres, there is MySQL, and there are enterprise DBMSs whose licenses you can't afford.
But since you use phpMyAdmin, the problem is not in the DBMS, it's in you.
Hire an administrator who will solve the problem.

Artem, 2017-03-02
@devspec

In principle, any modern database can handle storing the specified amount of data.
The question is what exactly you want to do with that data.
If you accumulate it, rarely access it, and query speed does not matter much, then any SQL database will do.
If you accumulate it and need frequent, fast access, look towards NoSQL databases.
If you need full-text search, look towards ElasticSearch and/or Lucene.
In general, focus not on the amount of stored data and indexes, but on your specific tasks.

Draconian, 2017-03-03
@Draconian

Among other things, I would advise you to first fully optimize the existing structure: for example, partition this table, create the necessary indexes for each partition separately, etc.
It looks like the problem is not in the DBMS at all.
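For the partitioning suggestion, a minimal MySQL sketch - the table name and range boundaries are assumptions, and keep in mind that in MySQL every unique key (including the primary key) must include the partitioning column, so id is the usual choice:

    -- split the table into ranges by id; boundaries here are illustrative only
    ALTER TABLE post
        PARTITION BY RANGE (id) (
            PARTITION p0 VALUES LESS THAN (25000000),
            PARTITION p1 VALUES LESS THAN (50000000),
            PARTITION p2 VALUES LESS THAN MAXVALUE
        );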
