Ivan Kryazhev, 2017-11-01 08:37:02
PHP

What is the difference between utf8mb4, 'UCS-4BE', and 'UTF-32' in PHP/MySQL?

I need to determine the numeric code point of a UTF-8 character.
unpack('N', mb_convert_encoding($utf8Character, 'UCS-4BE', 'UTF-8'))[1]; works fine.
Some people also suggest converting to 'UTF-32BE' instead. Why?
In MySQL the data is stored in utf8mb4, which is 4 bytes per character.
I want to understand why the conversion to UCS-4BE is necessary if the data is already guaranteed to be stored in 4 bytes.
Without the conversion, the construct does not work. Why?
I also don't quite understand the difference between UCS-4BE and UTF-32.


2 answer(s)
Rsa97, 2017-11-01
@t9221823420

MySQL stores data in utf8mb4. That's 4 bytes per character.

Not four, but one to four.
UTF-32 is a subset of UCS-4. UCS-4 supports code points from 0 to 7FFFFFFF, while UTF-32 is limited to code points from 0 to 10FFFF. The LE and BE suffixes specify the byte order: little-endian and big-endian.
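
A minimal PHP sketch of why the conversion matters, assuming the mbstring extension is loaded and PHP 7 or later. unpack('N', ...) reads exactly four bytes as a big-endian unsigned integer, so it only yields the code point when the input is in a fixed-width big-endian form. The helper name codePoint is hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper: returns the Unicode code point of one UTF-8 character.
function codePoint(string $utf8Character, string $fixedWidthEncoding = 'UCS-4BE'): int
{
    // Re-encode to a fixed-width, big-endian form: always 4 bytes per character.
    $fourBytes = mb_convert_encoding($utf8Character, $fixedWidthEncoding, 'UTF-8');
    // 'N' = unsigned 32-bit integer, big-endian byte order.
    return unpack('N', $fourBytes)[1];
}

$euro  = "\u{20AC}";  // 3 bytes in UTF-8: E2 82 AC
$smile = "\u{1F600}"; // 4 bytes in UTF-8: F0 9F 98 80

// Both target encodings give the same number for valid Unicode characters.
printf("U+%04X\n", codePoint($euro, 'UCS-4BE'));
printf("U+%04X\n", codePoint($euro, 'UTF-32BE'));

// Unpacking the raw UTF-8 bytes of a 4-byte character gives the packed
// UTF-8 byte pattern, not the code point.
$rawValue = unpack('N', $smile)[1];
printf("raw UTF-8 as integer: 0x%X, code point: 0x%X\n", $rawValue, codePoint($smile));
```

For characters of fewer than four UTF-8 bytes, unpack('N', ...) does not even have enough input. On PHP 7.2 and later, mb_ord($utf8Character, 'UTF-8') returns the code point directly without the manual conversion.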

egor_nullptr, 2017-11-01
@egor_nullptr

In the utf8mb4 encoding, a character takes from 1 to 4 bytes. In the UCS-4 and UTF-32 encodings, every character takes exactly 4 bytes. How the two differ is described on Wikipedia: https://en.wikipedia.org/wiki/UTF-32
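
The length difference can be checked with a short sketch, assuming the mbstring extension and PHP 7.1 or later. It treats the string PHP receives from a utf8mb4 column as ordinary UTF-8, which is an assumption about the connection charset. The helper name byteLengths is hypothetical.

```php
<?php
// Hypothetical helper: byte length of one character in UTF-8 and in UTF-32BE.
function byteLengths(string $utf8Character): array
{
    $utf32 = mb_convert_encoding($utf8Character, 'UTF-32BE', 'UTF-8');
    // strlen() counts bytes, not characters.
    return [strlen($utf8Character), strlen($utf32)];
}

// U+0041, U+00E9, U+20AC, U+1F600: one sample for each UTF-8 width.
foreach (["A", "\u{E9}", "\u{20AC}", "\u{1F600}"] as $char) {
    [$utf8Bytes, $utf32Bytes] = byteLengths($char);
    printf("%s: UTF-8 = %d byte(s), UTF-32BE = %d bytes\n", bin2hex($char), $utf8Bytes, $utf32Bytes);
}
```

The UTF-8 length grows from 1 to 4 bytes across the samples, while the UTF-32BE length stays at 4.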
