Why are these strings stored in memory differently?

C++ / C#

sddvxd, 2019-05-15 21:16:47

Good afternoon!
While trying to figure out why I can successfully send a request to a web server and receive a response in C, but cannot do the same in assembly language, I noticed the following:

request db 	"GET / HTTP/1.1\r\nHost: myip.ru\r\n\r\n"

char* c = "GET / HTTP/1.1\r\nHost: myip.ru\r\n\r\n";

These two absolutely identical strings containing a GET request are stored in RAM differently. This is the hex dump of the first string, taken from the program the assembler produced:

47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 5c 72 5c 6e 48 6f 73 74 3a 20
6d 79 69 70 2e 72 75 5c 72 5c 6e 5c 72 5c 6e

And this is the one from the C program:

47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 6d 79
69 70 2e 72 75 0d 0a 0d 0a

Both are UTF-8 strings. For the assembler version, an online hex-to-UTF-8 converter gave the following result:

GET / HTTP/1.1\r\nHost: myip.ru\r\n\r\n

That looks like exactly what is needed, but the request did not work.
Then I checked what the C compiler had produced for the same string:

GET / HTTP/1.1
Host: myip.ru

They look different: there are no visible escape sequences here, and it is this string that successfully fetches the site's main page. Question: why does this happen, why do the control characters stay escaped in the first case, and how can this be avoided?

1 answer(s)

zed, 2019-05-15
@sddvxd

Because in C, "\r" inside a string literal is an escape sequence, and the compiler writes it to memory as the single byte 0x0d (likewise, "\n" becomes 0x0a).
The assembler does no such translation: the string is stored in memory exactly as you typed it, so the character "\" is stored as 0x5c with the character "r" (0x72) right after it. An HTTP request, however, needs an actual line break (the byte sequence 0x0d, 0x0a), not the text "\r\n".