A
A
Anton Shelestov2016-11-09 11:47:10
MySQL
Anton Shelestov, 2016-11-09 11:47:10

Which query is more profitable in terms of performance?

Hello.
Brief description:
The database contains a table with all the coordinates of the Moscow metro (id, name, lat, lng) - lat and lng are latitude and longitude coordinates.
There is also another table in the database, for example, a table with organizations, it has a lot of fields and also has lat and lng fields (coordinates of this organization)
Task:
Get all metros within a radius of 2000 meters from the organization.
And here the question is: which of the solutions is better in terms of performance?
Solution 1: We
immediately calculate the entry into the radius directly in the sql query: the
example is written under laravel, I think someone who knows will figure it out

$metro = \App\Metro::select(\DB::raw('*, (
                                                6378137 * acos (
                                                  cos ( radians(' . $res['lat'] . ') )
                                                  * cos( radians( lat ) )
                                                  * cos( radians( lng ) - radians(' . $res['lng'] . ') )
                                                  + sin ( radians(' . $res['lat'] . ') )
                                                  * sin( radians( lat ) )
                                                )
                                              ) AS distance'))
            ->having('distance', '<=', $res['radius_metro'])
            ->orderBy('distance')
            ->get();

With this option, we will always have a fresh result, for example, if a new metro station is added.
Solution 2:
Enter the result of the search for the nearest metro in a separate table and periodically update it by cron, for example.
I'm interested in which option is better to use, I like the 1st.
A heavily loaded portal is planned
. Any thoughts?
Thank you!

Answer the question

In order to leave comments, you need to log in

6 answer(s)
S
Stanislav Pochepko, 2016-11-09
@DJZT

Think about caching. I think the first option is better, but you need to enable caching. And if something related changes (New metro station) then clear the cache.

M
Maxim Timofeev, 2016-11-09
@webinar

The idea with the table is nonsense, but the direction is right. It is necessary to use your request, but to cache its results. I don’t know how in laravel, but in Yii you can definitely make the cache dependent on the count () request, for example, which will return the number of stations. And the cache can already be stored in a database or in a file - this is the second question, it seems to me that filecache is more suitable for this. In principle, you can make the cache eternal and clear it when a new station is added.

E
Eugene Wolf, 2016-11-09
@Wolfnsex

Everything really depends on what you personally mean by the definition of "heavily loaded portal" and on many other factors, including the hardware used, the version of MySQL (and / or its fork) and final parameters and settings, and so Further. And also, from the "quality" of the administrator who will be entrusted with this database.
Subjectively, evaluating your question, I would recommend something like the following:
On the "Internet", the approach of caching data is actively practiced, no matter what kind of data it is. Everything is cached, from individual blocks on the site to entire pages with cookies, pictures and scripts.
Many factors speak in favor of the cache. Against caching, two main arguments are usually given:
a) Excessive consumption of RAM and / or hard disk
b) The cache should not be used in cases where the resource costs (machine) for caching are higher than the resulting savings effect from obtaining data from the cache (usually these are cases where the number of write cycles is less, or slightly more than the number of data read cycles
) In your case, caching is possible in two ways:
1. MySQL (unlike some other databases) itself caches query results, and it does this by default, but caching settings can be (and most often need to) be configured additionally
2. Using a cache with using programs specially designed for this purpose, such as Redis or Memcached. PHP (and Laravel including, if my memory serves me) are great friends with them.
Continuing the thought, in your case, you can use one of two options: either limit yourself to the fact that MySQL itself will cache the results of many queries and will take them from the cache itself (and this applies not only to queries regarding coordinates, but also to almost all other queries),
or , You can use to store such data (for example, the relationship of coordinates with search results) either in memory tables (MySQL has a "MEMORY" table type), or store this data in Redis/Memcached/etc, globally the essence will not change, the speed the performance of both options (In-Memory Tables vs Redis/Memcached) will be approximately comparable.
You can update the data in the storage based on events, the implementation of which is sufficiently present in Laravel. So, briefly, it looks like this:
1. Updates (adds, changes) some data block associated with the company, this in turn generates some event (one of the events)
2. The event listener receives the data and makes the appropriate changes in the data cache, thus the data in the cache are always up to date (or the data desynchronization is rather meager, within about a second) and no CRONs that will torment the database or cache storage for no reason are needed.

S
sim3x, 2016-11-09
@sim3x

The DBMS should give a simpler API, without arctangents and fast
Postgres can do this out of the box
gis.stackexchange.com/questions/77072/return-all-r...
Perhaps the muscle too
At least compare their performance on your case
No need to optimize beforehand

V
Viktor Vsk, 2016-11-09
@viktorvsk

How often is a new subway added or company address changed?
You have a table with metro data, a table with company data and a table with a link to the company, to the metro and the distance between them
. Enter data in the last table when creating / updating the company / metro. Of
course, to save money, only records in the last table should be stored, which satisfy the condition (distance <= 2000 meters), but 2000 meters you will very soon become 1500 meters or 2500 meters.
It is not clear at all why there is talk about caches and high workload if you have a choice (according to the condition of the problem) there will always be one organization and it does not matter to you how many companies there will be

A
Andrzej Wielski, 2016-11-09
@wielski

You are talking about a large project but you are using MySql. Do not do it this way.
Dig towards PostgreSQL, it has great, powerful tools for working with geo-data.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question