Grigory Makhanko, 2022-03-07 12:15:53
Unicode

Data scraped from a site is received and saved incorrectly, and the links do not work. What am I doing wrong?

At work I was given the task of writing a site parser, and I have run into problems with the data it receives:
1. The links to the product cards that I get from the site do not work: the addresses I receive differ from the addresses on the site. Example of a link on the site: 6225c761c2e18839363404.png
What is saved to the csv file and printed in the terminal:
6225c85a393bc639189703.png
Code:
6225c9ee0461e896100854.png
Opening any of the received links gives the message "The address is typed incorrectly, or such a page on the site no longer exists."
2. Cyrillic. Instead of readable Russian letters I get gibberish. Someone here already asked this question and was advised to use r.encoding = r.apparent_encoding. That worked, but only in the terminal: the csv file still contains the same unreadable characters. In the terminal:
6225cb4497696766653004.png
In csv:
6225cb94d0e47305255812.png
3. Numbers. The values arrive as strings with no-break spaces (\xa0) and a currency sign rather than as plain numbers.
Example of received data: '1\xa0030\xa0000\xa0₽'.

Please help. To be honest, I have googled a lot without success. Thank you in advance.
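
For reference, here is a minimal Python sketch of how the three points above are commonly handled. The parser code is only available as a screenshot, so everything here rests on assumptions: that the hrefs on the page are relative and need to be resolved against the page address, that the csv file is opened in Excel (which tends to misread UTF-8 without a byte order mark), and that prices are whole rubles. BASE_URL, the helper names, and the column names are hypothetical and are not taken from the original code.

```python
import csv
import re
from urllib.parse import urljoin

# Hypothetical address of the page being parsed; replace with the real one.
BASE_URL = "https://example.com/catalog/"


def absolute_url(href, page_url=BASE_URL):
    """Resolve an href against the URL of the page it was found on."""
    return urljoin(page_url, href)


def parse_price(raw):
    """Keep only the digits, dropping no-break spaces and the currency sign.

    Assumes whole-number prices; a decimal separator would be lost.
    """
    return int(re.sub(r"\D", "", raw))


def save_rows(path, rows):
    """Write rows as UTF-8 with a BOM so that Excel detects the encoding."""
    # Column names are hypothetical; use whatever fields the parser collects.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "price", "url"])
        writer.writeheader()
        writer.writerows(rows)
```

With these helpers, a row could be built as {"title": title, "price": parse_price(raw_price), "url": absolute_url(href)} and passed to save_rows. If the links are still wrong after resolving them, it is worth comparing the href attribute in the page source with the address the browser shows, since the site may rewrite links with JavaScript.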
