How do I normalize text written in a mixed alphabet?
I don't know what other tags to add.
I solved this kind of problem a long time ago, more than 10 years back.
Here is a link to my article on Habr: https://habr.com/ru/post/86303/
We assume that Cyrillic and Latin letters cannot be mixed within one word: a word must consist entirely of Cyrillic or entirely of Latin letters. If the alphabets are mixed, the word has to be converted to a single alphabet.
The idea is simple: the program tries to determine which language a word is written in by looking for unambiguously Russian letters, such as Е, Ж, З, Ф, Я, etc., and likewise for English: F, L, Q, S, V, W, Z, etc.
After that, all ambiguous letters (А/A, О/O, Е/E, У/Y, Х/X ...) in the word are forcibly replaced with the corresponding letters of the language we detected.
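Roughly, the idea above could look like this in Python. This is only a sketch, not the code from the article: the look-alike table is a small illustrative subset, the names `normalize_word`, `CYR_LOOKALIKES` and `LAT_LOOKALIKES` are made up for the example, and "unambiguous" is taken to mean "has no look-alike in the table".

```python
# Look-alike letters, position by position (assumed subset, not a complete table).
CYR_LOOKALIKES = "АВЕКМНОРСТУХаеорсух"
LAT_LOOKALIKES = "ABEKMHOPCTYXaeopcyx"
CYR_TO_LAT = dict(zip(CYR_LOOKALIKES, LAT_LOOKALIKES))
LAT_TO_CYR = dict(zip(LAT_LOOKALIKES, CYR_LOOKALIKES))


def is_cyrillic(ch):
    # Basic Cyrillic Unicode block.
    return "\u0400" <= ch <= "\u04FF"


def is_latin(ch):
    return ch.isascii() and ch.isalpha()


def normalize_word(word):
    """Return the word in one alphabet, or unchanged if the language is unclear."""
    # Only letters without a look-alike "vote" for a language.
    cyr_votes = sum(1 for ch in word if is_cyrillic(ch) and ch not in CYR_TO_LAT)
    lat_votes = sum(1 for ch in word if is_latin(ch) and ch not in LAT_TO_CYR)
    if cyr_votes > lat_votes:
        table = LAT_TO_CYR   # word looks Russian: replace Latin look-alikes
    elif lat_votes > cyr_votes:
        table = CYR_TO_LAT   # word looks English: replace Cyrillic look-alikes
    else:
        return word          # tie or no evidence: leave the word alone
    return "".join(table.get(ch, ch) for ch in word)
```

A word made up only of look-alike letters gets no votes either way, so this sketch leaves it untouched; that is the case where a dictionary would be needed.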
You can also go the other way: convert the word first entirely to Cyrillic, then entirely to Latin, and look up each result in a dictionary. If one of them is found there, use that word. My algorithm still needs this refinement; I'll get around to it at some point.
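A possible sketch of the dictionary variant, again in Python. The word list here is a toy set invented for the example (a real one would be loaded from a dictionary file), and `normalize_by_dictionary` is a hypothetical name.

```python
# Look-alike letters, position by position (assumed subset, not a complete table).
CYR_LOOKALIKES = "АВЕКМНОРСТУХаеорсух"
LAT_LOOKALIKES = "ABEKMHOPCTYXaeopcyx"
TO_CYR = str.maketrans(LAT_LOOKALIKES, CYR_LOOKALIKES)
TO_LAT = str.maketrans(CYR_LOOKALIKES, LAT_LOOKALIKES)

# Hypothetical toy dictionary, lowercase.
KNOWN_WORDS = {"сахар", "торт", "car", "home"}


def normalize_by_dictionary(word, known_words=KNOWN_WORDS):
    """Try the all-Cyrillic and all-Latin variants; keep the one the dictionary knows."""
    for candidate in (word.translate(TO_CYR), word.translate(TO_LAT)):
        if candidate.lower() in known_words:
            return candidate
    return word  # neither variant is a known word: leave it as it was
```

The Cyrillic variant is tried first, so a spelling that is valid in both alphabets resolves to Cyrillic; swapping the order in the tuple changes that preference.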
I hope I explained clearly.
The code from this article helped. I made an add-on, textnormalizer.
The easiest option is to replace all Latin characters with Cyrillic ones. But this method has a significant drawback: it will also replace letters in ordinary words that are legitimately written in Latin.
A harder option is to find the words in which Cyrillic and Latin are mixed, and apply the replacement only to them.
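Finding the mixed words can be done with a couple of regular expressions. A minimal Python sketch, assuming that "Cyrillic" means the basic Unicode block U+0400 to U+04FF and that the actual repair is passed in as a function (`fix_word` is a placeholder for whatever replacement you choose):

```python
import re

# A "word" here is any run of Latin or Cyrillic letters.
WORD_RE = re.compile(r"[A-Za-z\u0400-\u04FF]+")
LATIN_RE = re.compile(r"[A-Za-z]")
CYRILLIC_RE = re.compile(r"[\u0400-\u04FF]")


def is_mixed(word):
    """True if the word contains both Latin and Cyrillic letters."""
    return bool(LATIN_RE.search(word)) and bool(CYRILLIC_RE.search(word))


def fix_mixed_words(text, fix_word):
    """Apply fix_word only to mixed-alphabet words; everything else stays as is."""
    def repl(match):
        word = match.group()
        return fix_word(word) if is_mixed(word) else word
    return WORD_RE.sub(repl, text)
```

Purely Latin words such as "Windows" never reach `fix_word`, which avoids the drawback of the blanket replacement.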
With a dictionary you can do even better: when replacing, check the word and its word forms against the dictionary, and if it is not found, show a warning or the original spelling in brackets, or whatever is more convenient for you.
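The "original spelling in brackets" fallback might look like this. It is a sketch only: `with_fallback` is a made-up name, the dictionary is just a set of lowercase words, and word forms are not handled (that would need a morphological analyzer).

```python
def with_fallback(original, fixed, known_words):
    """Return the fix if the dictionary confirms it, otherwise show both spellings."""
    if fixed == original or fixed.lower() in known_words:
        return fixed
    # Unconfirmed fix: keep the original visible so the reader can judge.
    return f"{fixed} ({original})"
```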
If the text is read in a browser, you can write an extension or a user script. For editors such as Microsoft Word, you can write VBA scripts. And some screen readers surely have a plugin API.