PHP
Eugene Ordinary, 2021-11-21 17:05:56

Why does preg_match determine the position incorrectly if there are diacritics in the string?

$str = 'ab'. mb_convert_encoding( '́', 'UTF-8', 'HTML-ENTITIES' ). 'cdef'; 
// $str = ab́cdef
preg_match( '#de#ui', $str, $matches, PREG_OFFSET_CAPTURE );
// $matches = Array ( [0] => Array ( [0] => de [1] => 5 ) ) 
$subs = mb_substr( $str, $matches[0][1], null, 'UTF-8' );
// $subs = ef


I expected preg_match to report the position of de as 4, not 5. Because of this, mb_substr copies the substring from the wrong position. Why does this happen, and how can I make preg_match and mb_substr work together?


1 answer(s)
rPman, 2021-11-21
@rPman

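Try mb_ereg_match instead of preg_match: preg_match works with bytes in the string (the offset returned with PREG_OFFSET_CAPTURE is a byte offset), not with multibyte characters the way all the mb_... functions do.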
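
If you want to keep preg_match, here is a minimal sketch of two ways to reconcile the offsets. It assumes the string is UTF-8 and PHP 7.0 or later; the "\u{0301}" escape (combining acute accent) stands in for the mb_convert_encoding call from the question, and the variable names are only illustrative.

```php
<?php
// Same string as in the question: "ab" + U+0301 (2 bytes in UTF-8) + "cdef".
$str = "ab\u{0301}cdef";

preg_match('#de#ui', $str, $matches, PREG_OFFSET_CAPTURE);
$byteOffset = $matches[0][1]; // offset counted in bytes, not characters

// Option 1: convert the byte offset to a character offset for mb_* functions.
// substr() is byte-based, so it cuts exactly the prefix before the match;
// mb_strlen() then counts how many characters that prefix contains.
$charOffset = mb_strlen(substr($str, 0, $byteOffset), 'UTF-8');
$subsMb = mb_substr($str, $charOffset, null, 'UTF-8');

// Option 2: stay in bytes and use the byte-based substr() directly.
$subsBytes = substr($str, $byteOffset);
```

Both $subsMb and $subsBytes should come out as def. Option 2 is simpler when you only need the tail of the string; Option 1 is useful when the character position itself is needed later by other mb_... calls.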
