PHP
Alexey Krupsky, 2014-07-15 15:19:24

How do I convert text with mixed encodings to UTF-8 in PHP?

I have some text with a partly broken encoding: part of the string is in UTF-8 and the rest is in Windows-1251.
The question is how to bring the whole text to UTF-8.
Solution:

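function mbe_detect_encoding($string, $enc = null)
{
    // Candidate encodings, checked in order. UTF-8 goes first because
    // almost any byte sequence also passes as Windows-1251.
    static $list = array('utf-8', 'windows-1251');

    foreach ($list as $item) {
        // Round-trip through iconv: the string should come back unchanged
        // only if it is valid in $item (iconv returns false on failure).
        $sample = @iconv($item, $item, $string);
        // Strict comparison; the loose md5() == md5() check can misfire
        // on hashes that look like numbers (e.g. "0e...").
        if ($sample === $string) {
            if ($enc === $item) {
                return true;
            } else {
                return $item;
            }
        }
    }
    return null;
}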

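// Split into tokens, keeping the delimiters so the text can be reassembled.
$text = preg_split('!([ ,<>="\':])!', $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
foreach ($text as $key => $c) {
    // Re-encode only the tokens that are not valid UTF-8.
    if (mbe_detect_encoding($c) === 'windows-1251') {
        $text[$key] = iconv('WINDOWS-1251', 'utf-8', $c);
    }
}
$text = implode('', $text);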


2 answer(s)
Max, 2014-07-15
@Snickersmix

You can split the text into words and call iconv(mb_detect_encoding($str), 'UTF-8', $str) on each word.
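
A minimal sketch of this per-word idea, under a few assumptions: the mbstring extension is enabled, every word is entirely in one of the two encodings, and words are separated by ASCII whitespace. The helper name words_to_utf8 is hypothetical. Instead of mb_detect_encoding it uses mb_check_encoding, treating anything that is not valid UTF-8 as Windows-1251, which is a deliberate swap and not exactly what the answer suggests.

```php
<?php
// Hypothetical helper: converts each whitespace-delimited chunk to UTF-8.
// Assumption: every chunk is either valid UTF-8 or Windows-1251.
function words_to_utf8(string $text): string
{
    // Capture the whitespace runs too, so the text is reassembled unchanged.
    $parts = preg_split('/([ \t\r\n]+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $i => $part) {
        // Valid UTF-8 (which includes plain ASCII) is left untouched.
        if (!mb_check_encoding($part, 'UTF-8')) {
            $parts[$i] = mb_convert_encoding($part, 'UTF-8', 'Windows-1251');
        }
    }

    return implode('', $parts);
}

// First word is UTF-8, second word is the Windows-1251 bytes of the same alphabet.
$mixed = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82 \xEC\xE8\xF0";
$fixed = words_to_utf8($mixed);
```

A word that switches encoding in the middle is not handled, and a Windows-1251 word whose bytes happen to form valid UTF-8 would be left as it is.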

Dmitry Entelis, 2014-07-15
@DmitriyEntelis

Split the string into two parts, convert the part that is in Windows-1251 to UTF-8,
and join the parts back together.
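
A rough sketch of the split-and-join idea. The answer does not say how to find the split point, so this assumes the string starts in UTF-8 and switches to Windows-1251 exactly once, and takes the boundary to be the end of the longest valid UTF-8 prefix. The helper name convert_tail_to_utf8 is hypothetical.

```php
<?php
// Hypothetical helper. Assumption: $text starts in UTF-8 and switches to
// Windows-1251 exactly once; the switch point is guessed as the end of the
// longest prefix made of well-formed UTF-8 sequences.
function convert_tail_to_utf8(string $text): string
{
    // Simplified UTF-8 pattern (it does not reject overlong forms or
    // surrogates). The possessive *+ avoids needless backtracking.
    $utf8 = '/^(?:[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF4][\x80-\xBF]{3})*+/';
    $cut = preg_match($utf8, $text, $m) === 1 ? strlen($m[0]) : 0;

    $head = substr($text, 0, $cut); // already UTF-8
    $tail = substr($text, $cut);    // assumed to be Windows-1251

    // Glue the untouched head and the converted tail back together.
    return $head . iconv('WINDOWS-1251', 'UTF-8', $tail);
}

// UTF-8 head followed by a Windows-1251 tail.
$joined = convert_tail_to_utf8("\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82 \xEC\xE8\xF0");
```

If the Windows-1251 part happens to begin with bytes that also form a valid UTF-8 sequence, the guessed boundary will be slightly off, so this only suits text where the switch point is unambiguous.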
