M
M
Maxim Kadochnikov2016-01-19 09:27:03
MySQL
Maxim Kadochnikov, 2016-01-19 09:27:03

MariaDB 10.1.7: Why do queries that require a table fullscan with a MEDIUMTEXT field take an incredibly long time to complete?

There is a table
id (int),
url (varchar)
html (mediumtext)
There are other fields - but it doesn't matter.
The average size of an html field is 1 MB. There are 20,000 entries in total.
If you make a query to this table that will cause a fullscan of the table, then this query will run for ~ 30 minutes. For example (just for theory) SELECT * FROM `table_name` WHERE url RLIKE " http://no_domain " LIMIT 1.
If we explicitly indicate that we do not need the html field in the output, then the request will be completed in ~ 10 milliseconds. For example: SELECT id FROM `table_name` WHERE url RLIKE " http://no_domain " LIMIT 1
It seems that during a fullscan of a table, the database pulls out TEXT type fields from disk for each row, even if it is not needed at all to perform a search (in this case, only the url appears in the request). In my opinion, it would be better if she first found the necessary lines and, when issuing them, gave them with fields of the TEXT type, because I only need one line. But the base stubbornly sorts out the volume of the entire table for this query.
I understand that in most cases this can be avoided by using indexes for a query. But what if the request is made once an hour, and building an index for it is too expensive or not possible at all, in the case of a regular expression search?

Answer the question

In order to leave comments, you need to log in

2 answer(s)
M
Max, 2016-01-19
@makcbrain

SELECT * FROM `table_name` WHERE id IN ( SELECT id FROM `table_name` url RLIKE "http://no_domain" LIMIT 1)
?

W
wol_fi, 2016-01-19
@wol_fi

Didn't quite understand your problem. First pull the id through SELECT id FROM `table_name` WHERE url RLIKE " http://no_domain " LIMIT 1 , and then everything through SELECT * FROM table_name WHERE id = previos_result_id .
And the fact that * requests pull out all fields, and in the case of a request without using an index, each row is taken and compared - this is described in the documentation.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question