How do I correctly truncate a UTF-8 string stored in a std::string in C++?
I have a std::string that holds UTF-8 encoded text (Russian and English letters, digits).
How can I correctly truncate it, or copy part of it into a new variable, with a limit of, say, 10 characters?
Do you mean 10 Unicode characters or 10 UTF-8 bytes?
Either way, UTF-8 bytes fall into three categories:
• Initial (starts a character): 0x00…0x7F and 0xC0…0xF4
• Continuation (never occurs at the start of a character): 0x80…0xBF
• Forbidden: 0xF5…0xFF. For our purposes these can be treated as initial bytes too.
If the task is to keep 10 characters, find the 11th initial byte and cut the string just before it. If there is no 11th initial byte, the string already fits.
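A minimal sketch of the character-limited case, assuming the input is valid UTF-8 and that "character" means a Unicode code point (so a letter followed by a combining accent counts as two). The names is_continuation and utf8_prefix_chars are made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// True for a continuation byte (bit pattern 10xxxxxx, i.e. 0x80..0xBF).
// Every other byte is treated as an initial byte, forbidden ones included.
inline bool is_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Hypothetical helper: returns the first max_chars code points of s.
// Assumes s holds valid UTF-8.
std::string utf8_prefix_chars(const std::string& s, std::size_t max_chars) {
    std::size_t seen = 0;  // initial bytes encountered so far
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!is_continuation(static_cast<unsigned char>(s[i]))) {
            if (seen == max_chars) {
                // s[i] starts character number max_chars + 1: cut before it.
                return s.substr(0, i);
            }
            ++seen;
        }
    }
    return s;  // fewer than max_chars + 1 characters: nothing to cut
}
```

Calling utf8_prefix_chars(text, 10) returns a copy; the original string is left untouched, which also covers the "copy part of it into a new variable" variant of the question.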
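If the task is to keep at most 10 bytes, look at the 11th byte (s[10], if it exists, of course). If it is an initial byte, simply cut after 10 bytes. If it is not, the cut would land in the middle of a character, so keep removing bytes from the end until you have removed an initial byte.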
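The byte-limited case can be sketched the same way, again assuming valid UTF-8 input. Instead of removing bytes one by one, this version moves the cut position back until the first dropped byte is not a continuation byte (0x80…0xBF), which gives the same result. The name utf8_prefix_bytes is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the longest prefix of s that is at most
// max_bytes long and does not end in the middle of a character.
// Assumes s holds valid UTF-8.
std::string utf8_prefix_bytes(const std::string& s, std::size_t max_bytes) {
    if (s.size() <= max_bytes) {
        return s;  // already short enough
    }
    std::size_t cut = max_bytes;  // s[cut] is the first byte that gets dropped
    // While the first dropped byte is a continuation byte (10xxxxxx),
    // the cut splits a character, so move it back by one byte.
    while (cut > 0 && (static_cast<unsigned char>(s[cut]) & 0xC0) == 0x80) {
        --cut;
    }
    return s.substr(0, cut);
}
```

The result may be shorter than max_bytes by up to three bytes, since a UTF-8 character occupies at most four bytes.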