Kirill Zhilyaev, 2016-01-24 19:21:44
PHP

How to solve the problem with encodings in the preg_match_all function?

I wrote a regular expression to match all Russian words: /([А-Яа-я]+)/. The source text is encoded in UTF-8. preg_match_all flatly refused to work with UTF-8, so I had to convert both the text and the regular expression to cp866. That seems to work, but as soon as emoji appear in the text, the regular expression finds nothing. How can I remove the emoji, or make the regex work with them? The text comes from the VKontakte API.

1 answer(s)
Andrey, 2016-01-24
@kirill_782

Use the regular expression /([а-я]+)/ui, and preg_match_all will miraculously learn to work with UTF-8 (the u modifier) and to ignore case (the i modifier).
By the way, why should, for example, "blood-red grapes" count as three words rather than two?
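
A minimal sketch of that suggestion, assuming the script file itself is saved as UTF-8. The sample string and variable names are made up for illustration; real input would come from the VKontakte API. The letter ё is listed explicitly because it lies outside the а-я range in Unicode, and the second pattern is just one possible way to treat hyphenated compounds as a single word.

```php
<?php
// Hypothetical sample text with an emoji; not taken from the original question.
$text = "Привет, мир! 😀 Кроваво-красный виноград";

// u: pattern and subject are treated as UTF-8; i: case-insensitive matching.
$count = preg_match_all('/[а-яё]+/ui', $text, $matches);

// Variant (an assumption about the desired behaviour): keep hyphenated
// compounds together as one word.
$countHyphen = preg_match_all('/[а-яё]+(?:-[а-яё]+)*/ui', $text, $hyphenated);

// If emoji still need to be stripped, one rough option is to drop every
// code point outside the Basic Multilingual Plane (this covers most emoji,
// but not all of them).
$withoutEmoji = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $text);

print_r($matches[0]);
print_r($hyphenated[0]);
echo $withoutEmoji, PHP_EOL;
```

With the u modifier there is no need to convert anything to cp866: the emoji is simply skipped as a non-matching character.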