Why is char 1 byte while a character literal ('A') is 4?

C++ / C#

David Park, 2021-09-21 11:13:16

sizeof(char) = 1 byte
sizeof('A') = 4 bytes.

I realized that what we call characters is actually a numeric code, and therefore character literals are allocated the same amount of memory as the int type (4 bytes).
But I didn’t quite understand how a four-byte character fits into a single-byte char?
And when I declare char test = 'A'; then how much memory was allocated in the computer: 1 byte or 4?

(If you try sizeof(test), it will still be 1. But isn't 'A' 4 bytes?)

5 answer(s)

Mercury13, 2021-09-21
@Xproz

And now for the correct answer.
In C, a character literal has type int, so its sizeof equals sizeof(int), which is 4 bytes on typical platforms.
In C++, its type is char, so its sizeof is 1. That is why those who compiled the code as a .cpp file did not see the problem. The C++ rule evidently exists because of function overloading: you would not want foo('A') to call the int overload.

#include <stdio.h>

int main()
{
    int sz = sizeof('A');  // Latin letter A
    printf("sz = %d\n", sz);
    return 0;
}

C: 4
C++: 1
When you write char test='A', there will be 1 byte on the stack (+ alignment). Here C, roughly speaking, performs a type conversion right at compile time. If you write char test=L'Й', the compiler will say that the compile-time conversion ushort→char truncates the value from 1049 to 25.

Saboteur, 2021-09-21
@saboteur_kiev

I realized that what we call characters is actually a numeric code

Everything in a computer is stored as bits grouped into bytes.
A character is an abstraction to simplify programming, and there are various encoding tables for converting bytes to characters when displayed.
The number of bytes needed per character depends on the encoding itself.
In old encodings one byte meant one character. In modern UTF-8 the number of bytes varies: from 1 to 4 per character under the current standard (the original design allowed up to 6), with CJK ideographs typically taking 3.

and therefore character literals are allocated the same amount of memory as the int type (4 bytes).

Use typeid to clarify the data type

But I didn’t quite understand how a four-byte character fits into a single-byte char?

It doesn't: the literal is not a char.
In C, char is by definition a single byte, which usually holds an ASCII code.

And when I declare char test = 'A'; then how much memory was allocated in the computer: 1 byte or 4?

You specify the type yourself in the declaration. You should have posted the entire code.

(If you try sizeof(test), it's still 1. But isn't 'A' 4 bytes?)

'A' is a value, not a type. Maybe it's an int?

galaxy, 2021-09-21
@galaxy

And there is no point in stuffing Russian letters into a char.

In (1), if c-char is not a numeric character sequence and is not representable as a single byte in the execution character set, the character literal is conditionally supported, has type int and implementation-defined value.

https://en.cppreference.com/w/cpp/language/charact...

GavriKos, 2021-09-21
@GavriKos

#include <iostream>
#include <typeinfo>  // needed for typeid

int main()
{
    // name() is implementation-defined; GCC and Clang print "c" for char
    std::cout << typeid('A').name() << std::endl;
    std::cout << sizeof('A') << std::endl;
}

Output:
c
1

How did you get 4 bytes?