PHP
skvot, 2012-02-11 22:10:54

How do I change the case of Russian characters in UTF-8 in PHP?

Greetings, habr!
I ran into a problem: the case-conversion functions do not work correctly on Russian text in UTF-8 encoding.
Here is the function I came up with. It works, but, IMHO, it looks extremely ugly:

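function reverseStringCharactersCase($string)
{
    $reversedString = '';
    // Convert to single-byte cp1251 so that $string[$i] addresses exactly one character.
    $string = iconv('UTF-8', 'cp1251', $string);
    $length = strlen($string);

    for ($i = 0; $i < $length; $i++) {
        // isUpperCase() is a helper whose definition is not shown in the post.
        if (isUpperCase($string[$i])) {
            $reversedString .= mb_strtolower($string[$i], 'cp1251');
        } else {
            $reversedString .= mb_strtoupper($string[$i], 'cp1251');
        }
    }

    // Convert the result back to UTF-8 for the caller.
    return iconv('cp1251', 'UTF-8', $reversedString);
}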

What I tried that didn't work:
1. The regular string functions, not the ones from the multibyte library. They did not affect Russian strings at all, and various setlocale() calls changed nothing (Ubuntu Server 10.10).
2. mb_strtoupper with 'utf-8' as the second argument, which did not help either.
I want to achieve beautiful code, without using multibyte functions and explicit encoding conversion using iconv(). Forgive me if the question is a newbie one; I am counting on help from the audience of this wonderful IT resource.
Thanks in advance!


3 answer(s)
edogs, 2012-02-11
@skvot

1) It doesn't matter how pretty a simple function looks: if it works and is general-purpose, that's enough.
2) Do you have a UTF-8 locale installed? Run locale -a in the console and check that the name you pass to setlocale() matches one of the listed locales exactly.
3) If PHP runs as an Apache module, read the warning at php.net/setlocale: the locale is maintained per process, not per thread, so neighboring threads of the same process can change it under you. In that configuration the locale is effectively global.
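
To illustrate point 2, here is a minimal sketch of checking whether setlocale() actually applied anything. The locale names in the list are assumptions; substitute whatever locale -a prints on your server. setlocale() returns the name it applied, or false if none of the candidates is installed.

```php
<?php
// Candidate locale names are assumptions; compare them with the output of `locale -a`.
$candidates = ['ru_RU.UTF-8', 'ru_RU.utf8', 'en_US.UTF-8'];

// setlocale() accepts an array and tries each name in order.
// It returns the applied locale name, or false if none is available.
$applied = setlocale(LC_CTYPE, $candidates);

if ($applied === false) {
    echo "None of the candidate locales is installed\n";
} else {
    echo "LC_CTYPE is now: $applied\n";
}
```

If this prints the "none installed" message, the earlier setlocale() calls were most likely silently failing as well.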

Vladimir Chernyshev, 2012-02-11
@VolCh

"I want to achieve beautiful code, without using multibyte functions and explicit encoding conversion using iconv()."
In effect, that means you want to write your own UTF-8 parser, at least for the Russian subset of characters? :) What for?
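
For comparison, a sketch of what the task looks like when the multibyte functions are used instead of a hand-written parser. It assumes the mbstring extension is enabled and the input is valid UTF-8; the function name invertCaseUtf8 is made up for this example. Note that $string[$i] addresses a single byte, not a character, so calling mb_strtoupper on $string[$i] of a UTF-8 string cannot work for Cyrillic letters, which may be why the 'utf-8' attempt in the question appeared to fail.

```php
<?php
// Hypothetical helper (the name is illustrative): inverts the case of every
// character in a UTF-8 string without converting to cp1251.
function invertCaseUtf8(string $text): string
{
    // Split into whole characters; the "u" modifier makes PCRE treat the subject as UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $result = '';
    foreach ($chars as $char) {
        $lower = mb_strtolower($char, 'UTF-8');
        // A character that differs from its lowercase form is treated as uppercase.
        $result .= ($char !== $lower) ? $lower : mb_strtoupper($char, 'UTF-8');
    }
    return $result;
}

echo invertCaseUtf8('Привет, Habr!'), "\n"; // пРИВЕТ, hABR!
```

Punctuation and spaces pass through unchanged, because both conversions return them as-is.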

Evgeny Nikolaev, 2020-03-13
@nikolaevevge

How about using the following class: blog.ivru.net/?id=187
Usage examples:
mystrto::lower("ABCDABCD"); result: abcabcd
mystrto::upper("abcabcd"); result: ABCDABCD
