Problem with encodings in PHP on different Linux distributions
Hello!
Today an encoding problem drove me up the wall. There are thousands of threads on the Internet about this kind of thing, and I have run into it a million times myself, but here it is again, and it sucks!
I had to find all the words in the page content and replace the relevant ones (keywords) with links. It seemed like a no-brainer, and it really was as long as the plugin was running on Debian + PHP 5.2.
The word search used a primitive regular expression:
'/\b'.$keyword.'\b/'
I installed the plugin on CentOS, and it did not find Russian words. I figured out that I needed to add
setlocale(LC_ALL, 'ru_RU.UTF-8');
and everything started to work. Then I installed it on another CentOS server, and it does not work there! And with that change it stops working on Debian!!!
So I tried this:
setlocale(LC_ALL, 'ru_RU.CP1251', 'rus_RUS.CP1251', 'Russian_Russia.1251', 'ru_RU.UTF-8');
Funny, isn't it? The result: it works on CentOS and stops working on Debian.
I was at my wits' end! I was really worried)
Then I thought: what if I replace the single regex with two, using a cannon to shoot sparrows, so to speak?
I removed the locale calls and wrote:
$regex = array('/\b'.$keyword.'\b/', '/\s'.$keyword.'\s/');
And yes, comrades, it worked! But I don't like it!
Question: how do I make this universal and not clumsy?!
Thank you all for your replies =)
In the end, this is what I found.
Here is the code I tortured in every way:
preg_match_all('#\bпхп|php\b#', 'пхп php', $m);
var_dump($m) -> array(1) { [0]=> array(1) { [0]=> string(3) "php" } }
var_dump($m) -> array(1) { [0]=> array(1) { [0]=> string(3) "php" } }
setlocale(LC_ALL, array('ru_RU.cp1251'));
var_dump($m) -> array(1) { [0]=> array(2) { [0]=> string(6) "пхп" [1]=> string(3) "php" } }
It's sad when there are downvotes and no comments! That is just not a decent way to behave!
By the way, have you tried telling the regular expression that it is working with Unicode by specifying the “u” modifier?
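A minimal sketch of that idea, assuming the page content and the keyword are both UTF-8. The function name link_keyword and the URL are hypothetical and not taken from the original plugin. Recent PHP builds make \b Unicode-aware under “u”, but to avoid depending on that, the sketch spells out the word boundary with lookarounds on \p{L} (letters) and \p{N} (digits), so no setlocale() call is involved:

```php
<?php
// Hypothetical helper (not from the original plugin): wraps whole-word
// occurrences of $keyword in a link. Assumes $content and $keyword are UTF-8.
function link_keyword($content, $keyword, $url)
{
    // preg_quote() escapes regex metacharacters inside the keyword;
    // its second argument also escapes the '/' delimiter.
    $quoted = preg_quote($keyword, '/');

    // Explicit Unicode "word boundary": no letter, digit or underscore
    // directly before or after the keyword. The trailing 'u' makes PCRE
    // read the pattern and the subject as UTF-8.
    $pattern = '/(?<![\p{L}\p{N}_])' . $quoted . '(?![\p{L}\p{N}_])/u';

    // $0 in the replacement stands for the matched text.
    $link = '<a href="' . htmlspecialchars($url) . '">$0</a>';

    return preg_replace($pattern, $link, $content);
}

echo link_keyword('пхп и php', 'пхп', '/tag/php-ru'), "\n";
```

If the input is not valid UTF-8, preg_replace() returns null under “u”, so real code would want to check for that before using the result.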
To find out which locales the system supports, run “locale -a” instead of guessing.
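The same check can be made from PHP itself, because setlocale() returns false when the requested locale is not installed. A rough sketch follows. The helper name pick_locale and the candidate list are assumptions for illustration, and it sets only LC_CTYPE on the assumption that character classification is the category that matters for matching:

```php
<?php
// Hypothetical helper: returns the first locale from $candidates that
// setlocale() accepts, or false if none of them is available.
function pick_locale(array $candidates)
{
    foreach ($candidates as $name) {
        // setlocale() returns the new locale name on success, false on failure.
        $result = setlocale(LC_CTYPE, $name);
        if ($result !== false) {
            return $result;
        }
    }
    return false;
}

// The candidate names are examples only; compare them with `locale -a`.
$chosen = pick_locale(array('ru_RU.UTF-8', 'ru_RU.utf8', 'C'));
echo $chosen === false ? "no candidate locale is installed\n" : "using $chosen\n";
```

Logging the return value this way shows which server is missing which locale, rather than leaving the regex to fail silently.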