Beautiful URLs and database searches - how do you work with this?

NewTypes2013-11-14 23:04:19

MySQL

NewTypes, 2013-11-14 23:04:19

I mean the classic task - to find the page slug, the field in the database table responsible for pretty URLs. Let's say I need a page site.com/pretty-url and I start looking for "pretty-url" in the slug field. But after all, this is expensive (or will the database itself solve this problem of a quick search for me? Considering that IDs were initially set for each record).

I don’t know if I sucked this problem or not, but in my layman’s opinion this is a headache for large databases, which would be less if I accessed immediately by ID without beautiful urls.

Answer the question

In order to leave comments, you need to log in

10 answer(s)

schmooser, 2013-11-14
@schmooser

If there is an index on this field, then the search speed will be about the same as when accessing by id. There is exactly the same index, only not by text but by number.

timatecue, 2013-11-14
@timatecue

Your concern is understandable. To get started, put an index on this field so that a query with a page search by its URL passes faster (anyway, you are looking for data, if not by this field, then by id). Ideally - caching.

DrNemo, 2013-11-14
@DrNemo

The main disadvantage of storing a beautiful url in the database is an additional request. if there are not many pages that require such a URL, then in my opinion it is easier to store it in a simple array, in a separate file

return array('my_castom_url_page' => 'real_url_page', ...);

Andrew, 2013-11-14
@kaasius

If the site traffic is not several million people a day, there is no need to worry. I know this not from theory, but from practice, my sites have been working this way since about 2000, there are no problems with search performance. If there are a lot of visitors, you will probably want to do caching in redis, for example. It's better to think about what happens when you want to make a hierarchy (/uri, /uri/suburi, etc.). Here, too, there is no particular problem, but the task is definitely more interesting.

FanatPHP, 2013-11-15
@FanatPHP

These ideas about the difference of indices are not very clear. Yes, an index of four bytes of an integer field will be smaller and therefore faster than an index of the first, say, 20 characters of a text field. But the difference will not be that significant.

For a computer, both a number and a string are a set of bytes. What exactly is written in these bytes is all the same to him. A search on an ordered set of bytes will be performed in the same way.

You don't have to worry about performance. This is one problem.

AGvin, 2013-11-15
@AGvin

In this case, what for at once to a DB to climb? After all, there is such a thing as caches. Use "prefix" + CRC32(URL) as a key.

grendel, 2013-11-15
@grendel

Alternatively, you can work with a url like this site.com/id/pretty-urland just ignore the "pretty-url" part. Look at the news sites - some do just that.

Upd. Didn't see your comment above. Then yes, indexing by the url field will work tolerably well.

afiskon, 2013-11-15
@afiskon

SELECT ... FROM articles WHERE slug = .... (LIMIT 1) and the index on the slug field, of course!

Alexey Sundukov, 2013-11-17
@alekciy

Just do not forget that the text field index is limited in length to X characters. If X is small (the value depends on the URLs used), then the gain from the index may not be received.

romeo7, 2013-11-29
@romeo7

For a quick search in the database, if the selection is made by a text field, in your case by a "beautiful" url, I offer two options:
Option 1 . Field indexing as FULLTEXT with a fixed length, for example 255 characters.
Cons: Redundant index + MySQL DBMS problem in using FULLTEXT index on InnoDB engine (MySQL >= 5.6 is available).
Option 2 . Add add an additional field with the binary hash of this url.
How to do it:
Suppose there is a url field in the articles table , create an additional url_hash field with type "binary" of length 16. Request to add a binary hash
Binary hash query

SELECT * 
FROM articles
WHERE url_hash =  UNHEX(MD5('pretty-url'));

If there is a lot of data, then in order to avoid collisions, you can add a prefix to the hash, for example, the name of the table:
Pluses: Fast index with a fixed length.
Cons: Redundant data in the form of an additional field.
This option is considered Best Practice:
MySQL binary against non-binary for hash IDs
Benchmark