go
bubbleboom, 2015-08-10 19:46:16

Why does Go only let a string be converted to slices of byte (uint8) and rune (int32)?

Good afternoon.
Here is my question, with an example. We have:

charLine := "слово"
fmt.Println(charLine) // Cyrillic is encoded as 2 bytes per character. By default the element (character) type is byte, i.e. uint8

Question 1.
Why can't I explicitly specify which element type the variable should use?
----
Okay, let's go the other way and convert the string to a slice:
sliceLine := []byte(charLine)
fmt.Println(sliceLine) // byte is an alias for uint8, so now we see the binary representation of the data, the same data the string type holds before it is rendered as text. When making the slice we can choose which element type to use: uint8 or int32 (rune)

Question 2.
Why can't I specify uint16 as the slice element type? For Cyrillic that is more than enough. Then I could convert the numbers back to a string with the standard string() conversion, as usual.
The question has some practical value. Let's say I'm a big miser :) and I don't want to allocate 4 bytes of memory per character by using rune. So I want to allocate only 2 bytes per character with uint16 (a character code has no use for a sign bit, so Cyrillic will fit), but Go doesn't understand me. Or maybe I don't understand it :)
And uint8 (byte) doesn't suit me, because Cyrillic is encoded as several bytes per character, which makes it inconvenient to work with as an array of characters.


1 answer(s)
David Mzareulyan, 2015-08-11
@bubbleboom

What are you actually trying to achieve with these actions?
When it comes to memory, Go's internal representation of strings is UTF-8. When you write charLine := "слово", you are specifying a UTF-8 string.
In Go, a string can be converted (back and forth) to only two slice types: []byte and []rune. []byte is a mutable copy of the bytes of the immutable string; []rune is the result of decoding the string's UTF-8 bytes into 4-byte Unicode code points.
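A minimal sketch of those two conversions, using the charLine from the question. The helper name byteAndRuneLen is made up for illustration and is not part of any library.

```go
package main

import "fmt"

// byteAndRuneLen is a hypothetical helper: it reports how many elements
// a string has when viewed as []byte (raw UTF-8 bytes) and as []rune
// (decoded code points).
func byteAndRuneLen(s string) (nBytes, nRunes int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	charLine := "слово"

	b := []byte(charLine) // mutable copy of the UTF-8 bytes
	r := []rune(charLine) // decoded code points, 4 bytes each in memory

	fmt.Println(len(b), b) // 10 elements: each letter of this word takes 2 bytes
	fmt.Println(len(r), r) // 5 elements: one per letter

	// Both conversions round-trip back to the same string.
	fmt.Println(string(b) == charLine, string(r) == charLine)

	// Changing the slice does not touch the original string.
	r[0] = 'С'
	fmt.Println(string(r), charLine)
}
```

The []rune form is the one that can be indexed letter by letter; the []byte form only gives the encoded bytes.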
The question of why only these two, and not, say, []float, is meaningless: that's how the language is designed. If you need []uint16, write your own converter; it's not difficult.
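Such a converter could look roughly like this. It is only a sketch: it assumes UTF-16 code units are what you want in the []uint16, it relies on the standard unicode/utf16 package, and the names toUint16 and fromUint16 are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toUint16 is a hypothetical helper: it decodes the UTF-8 string into
// runes and re-encodes them as UTF-16 code units. Characters in the
// Basic Multilingual Plane (Cyrillic included) take one uint16 each.
func toUint16(s string) []uint16 {
	return utf16.Encode([]rune(s))
}

// fromUint16 is the hypothetical inverse: it decodes UTF-16 code units
// back into runes and builds a normal Go (UTF-8) string from them.
func fromUint16(u []uint16) string {
	return string(utf16.Decode(u))
}

func main() {
	charLine := "слово"

	u := toUint16(charLine)
	fmt.Println(len(u), u) // 5 elements, 2 bytes each

	fmt.Println(fromUint16(u) == charLine) // round-trips to the same string
}
```

Two caveats. Going through []rune still allocates the 4-byte slice temporarily during conversion. And characters outside the Basic Multilingual Plane come out as two uint16 values (a surrogate pair), so one element is not always one character.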
It's all described in the documentation here: golang.org/ref/spec#Conversions_to_and_from_a_stri...
