How to write a regular expression taking into account Cyrillic and Unicode?
I have this regular expression:
$se = preg_replace('%[^A-Za-zА-Яа-я0-9]%', '', $se);
It strips all special characters from a string, leaving only letters and digits. I need to rewrite it so that it also works on machines that do not know about the Cyrillic alphabet. If I understand correctly, the Cyrillic range should be written as hex sequences (something like \x0410), but I can't find the correct way to do it.
Please help.
First, you need to use the u modifier (if the file is saved in UTF-8, the literal letter ranges will then work). Second, you can use constructions like \p{xx}.
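A minimal sketch of both suggestions, assuming the input string is valid UTF-8. The function names stripSpecialHex and stripSpecialScript are made up for illustration. The first variant writes the Cyrillic range as escaped code points in PCRE's \x{hhhh} form, so the source file itself stays pure ASCII; the second uses the \p{Cyrillic} script property, which matches any character of the Cyrillic script rather than only А-Я and а-я.

```php
<?php
// Sketch only: function names are hypothetical, input is assumed to be valid UTF-8.

// Variant 1: escaped code points with the u modifier.
// U+0410..U+044F is the same range as the literal "А-Яа-я" in the original pattern.
// Ё (U+0401) and ё (U+0451) lie outside that range, exactly as in the original;
// add \x{0401}\x{0451} to the class if they should be kept.
function stripSpecialHex(string $s): ?string
{
    // preg_replace returns null on failure, e.g. when $s is not valid UTF-8.
    return preg_replace('%[^A-Za-z\x{0410}-\x{044F}0-9]%u', '', $s);
}

// Variant 2: Unicode script property with the u modifier.
// \p{Cyrillic} is broader than А-Яа-я: it covers every Cyrillic-script character.
function stripSpecialScript(string $s): ?string
{
    return preg_replace('%[^A-Za-z\p{Cyrillic}0-9]%u', '', $s);
}
```

For example, stripSpecialHex("Привет, World! 123") gives "ПриветWorld123". Both variants return null instead of a string if the input is not valid UTF-8, so check the result before using it.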