How to detect that the user forgot to switch the keyboard layout (wrong layout)?
There is a search bar. Sometimes the user makes a mistake and types a Russian word with the English layout active. How can this be detected without extra queries (that is, without running a search)? The first thing that comes to mind is to check for the characters that share keys with Cyrillic letters (х, ъ, ж, э, б, ю, which come out as [ ] ; ' , .). But these characters sometimes occur in ordinary queries too (especially the apostrophe). Are there any ready-made algorithms or functions (PHP)?
What I ended up doing:
1. Collected all the strings that can be searched for (about 100 megabytes in total).
2. Built a list of all two-letter combinations (aa, ab, ac, ..., zz).
3. Searched the words from step 1 for every combination from step 2 and saved the combinations that were never found. That gave 165 combinations that never occur.
Now I look for these non-existent combinations in each search string. If one is found, I convert the query to the other layout.
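The steps above could be sketched roughly as follows in JavaScript. This is only an illustration, not the answerer's actual code: the three-line `corpus` is a toy stand-in for the 100 MB of real data, and the function names are made up for the example. With such a tiny corpus almost every bigram counts as missing, so real use needs the full data set.

```javascript
const LETTERS = 'abcdefghijklmnopqrstuvwxyz';

// Step 1 (assumed input): collect every two-letter combination that
// actually occurs inside a word of the searchable text.
function seenBigrams(lines) {
  const seen = new Set();
  for (const line of lines) {
    for (const word of line.toLowerCase().split(/[^a-z]+/)) {
      for (let i = 0; i + 1 < word.length; i++) {
        seen.add(word.slice(i, i + 2));
      }
    }
  }
  return seen;
}

// Steps 2 and 3: enumerate aa..zz and keep the combinations never seen.
function missingBigrams(lines) {
  const seen = seenBigrams(lines);
  const missing = new Set();
  for (const a of LETTERS) {
    for (const b of LETTERS) {
      if (!seen.has(a + b)) missing.add(a + b);
    }
  }
  return missing;
}

// True if the query contains at least one bigram absent from the corpus.
function looksLikeWrongLayout(query, missing) {
  for (const word of query.toLowerCase().split(/[^a-z]+/)) {
    for (let i = 0; i + 1 < word.length; i++) {
      if (missing.has(word.slice(i, i + 2))) return true;
    }
  }
  return false;
}

// Toy corpus (hypothetical); the real list would be built once and cached.
const corpus = ['hello world', 'search engine', 'keyboard layout'];
const missing = missingBigrams(corpus);

console.log(looksLikeWrongLayout('hello', missing));  // false
console.log(looksLikeWrongLayout('ghbdtn', missing)); // true ("привет" typed on the English layout)
```

The expensive part (building the list) runs once offline; the per-query check is just a few set lookups.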
I use Sphinx for search and take the blunt approach: if something is found in the other layout but not in the current one, I return the results for the other layout.
With up to a million records, Sphinx returns results in less than 1 millisecond.
The most correct approach, of course, is to repeat the search in the other layout when there are no results. But if you really do not want to make a second request, you can, for example, determine the language by bigrams (this will be faster than looking the words up in a particular language).
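A minimal sketch of the "search again in the other layout when nothing is found" idea, in JavaScript. The `search` function and the `documents` array are hypothetical stand-ins for a real backend such as Sphinx, and the key table assumes the standard QWERTY and ЙЦУКЕН layouts (lowercase, unshifted keys only).

```javascript
// The same physical keys in both layouts, in matching order (assumed standard layouts).
const EN = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`";
const RU = "йцукенгшщзхъфывапролджэячсмитьбюё";

// Re-read English-layout keystrokes as the Russian letters on the same keys.
function toRussianLayout(text) {
  return [...text.toLowerCase()]
    .map(ch => (EN.includes(ch) ? RU[EN.indexOf(ch)] : ch))
    .join('');
}

// Hypothetical search backend: a substring match over an in-memory list.
const documents = ['привет мир', 'поиск по сайту'];
function search(query) {
  return documents.filter(doc => doc.includes(query));
}

// Try the query as typed; only if it finds nothing, retry in the other layout.
function searchWithLayoutFallback(query) {
  const direct = search(query);
  if (direct.length > 0) return { query, results: direct };
  const converted = toRussianLayout(query);
  if (converted === query) return { query, results: direct };
  return { query: converted, results: search(converted) };
}

console.log(searchWithLayoutFallback('ghbdtn'));
// { query: 'привет', results: [ 'привет мир' ] }
```

The second request is only issued for queries that would otherwise show an empty result page, so the extra cost applies to a small share of searches.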
Another option is to dig into the source code of XNeur. There was also an article here on Habr with a strange implementation of this algorithm.
Why not do it in JavaScript?
// Warn as soon as the field contains Cyrillic characters (requires jQuery).
$('#txt').on('keyup', function () {
  if (/[а-яёА-ЯЁ]/.test($(this).val())) {
    alert('Switch the keyboard layout');
  }
});
The easiest way: if the search returns no results, repeat it after running the input string through a layout-mapping function such as str_replace(['q', 'w', 'e', ...], ['й', 'ц', 'у', ...], $queryString)
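For illustration, here is what such a mapping function might look like with the full key table, written in JavaScript rather than PHP. It is a sketch under assumptions: it covers only the lowercase, unshifted keys of the standard QWERTY and ЙЦУКЕН layouts, and the name `toRussianLayout` is made up for the example.

```javascript
// The same physical keys in both layouts, in matching order (assumed standard layouts).
const EN = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`";
const RU = "йцукенгшщзхъфывапролджэячсмитьбюё";

// Lookup table: English-layout character -> Russian letter on the same key.
const EN_TO_RU = new Map([...EN].map((ch, i) => [ch, RU[i]]));

// Convert a string typed on the English layout to what the user meant.
// Characters without a mapping (spaces, digits, Cyrillic) pass through unchanged.
function toRussianLayout(text) {
  return [...text.toLowerCase()]
    .map(ch => (EN_TO_RU.has(ch) ? EN_TO_RU.get(ch) : ch))
    .join('');
}

console.log(toRussianLayout('ghbdtn'));   // "привет"
console.log(toRussianLayout('rfr ltkf')); // "как дела"
```

Note that punctuation such as `,` and `.` is converted too (to `б` and `ю`), so the function should only be applied to queries already suspected of being typed in the wrong layout.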
That said, analyzing letter combinations is more accurate, so it is better to combine the two approaches.
Automatically switch the layout when you detect letter combinations that do not exist in the given language.