Python
rodion_ilnitskiy, 2022-02-09 01:05:31

JSON does not write the received data correctly. How to solve the problem?

I am trying to parse the page by running the code:

import scrapy


class NewSpider(scrapy.Spider):
    name = "new"
    start_urls = [
        'https://tsum.ua/ua/nova-kolekcija.html',
    ]

    def parse(self, response):
        # Iterate over the product blocks found on the page.
        for category in response.xpath('//div[@class="product-detail-content"]'):
            # extract() returns a list of text nodes, hence the lists in the output.
            yield {
                'name': category.xpath('h5[@class="product-item-brand-name"]/a[@class="product-item-link"]/text()').extract(),
                'deskription': category.xpath('h5[@class="product name product-item-name"]/a[@class="product-item-link"]/text()').extract(),
            }


When I launch it with
scrapy crawl new -o alles.json
the terminal shows what I need:
{'name': ['\n                                                REDValentino                                            '], 'deskription': ['Джинси']}

However, in the JSON file itself, everything is written like this:
{"name": ["\n                                                REDValentino                                            "], "deskription": ["\u0417\u0430\u043c\u0448\u0435\u0432\u0456 \u043c\u044e\u043b\u0456"]},

How can I make the description in the file look like what I got in the terminal?


1 answer(s)
Vindicar, 2022-02-09
@rodion_ilnitskiy

Well, first of all, \u notation exists not only in Python but also in JSON itself. It is enough to type the following into the Python interpreter

print("\u0417\u0430\u043c\u0448\u0435\u0432\u0456 \u043c\u044e\u043b\u0456")
to make sure the string is fine. A machine will read it without any problems.
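The same check can be done on the file contents with the standard json module. This is only a small sketch: the sample line is cut down from the item in the question, and the variable names are made up for illustration.

```python
import json

# One item in the escaped form the JSON file uses (raw string, so the
# backslashes reach the JSON parser untouched).
escaped = r'{"deskription": ["\u0417\u0430\u043c\u0448\u0435\u0432\u0456 \u043c\u044e\u043b\u0456"]}'

# Parsing turns the \uXXXX escapes back into ordinary characters.
item = json.loads(escaped)
print(item["deskription"][0])
```

Any program that reads alles.json through a JSON parser gets the original text back, so the escapes only affect how the file looks in an editor.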
But if you still want a human-readable file (at the cost of complicating the program a little), read up on the ensure_ascii parameter of the json module, and do not forget to open the target file in UTF-8 or a similar encoding.
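A minimal sketch of what that looks like with plain json, outside of Scrapy. The temporary directory and the variable names are assumptions made for the example; only ensure_ascii and the file encoding matter here.

```python
import json
import os
import tempfile

item = {"deskription": ["Замшеві мюлі"]}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
default_text = json.dumps(item)
print(default_text)

# ensure_ascii=False keeps the characters as they are.
readable_text = json.dumps(item, ensure_ascii=False)
print(readable_text)

# When writing to a file, open it explicitly as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "alles.json")  # hypothetical location
with open(path, "w", encoding="utf-8") as f:
    json.dump(item, f, ensure_ascii=False)

# Read the file back to see what actually landed on disk.
with open(path, encoding="utf-8") as f:
    written = f.read()
```

In the Scrapy case the file is written by the feed exporter rather than by your own code, so the place to change this is most likely the FEED_EXPORT_ENCODING setting, for example FEED_EXPORT_ENCODING = 'utf-8' in settings.py, or -s FEED_EXPORT_ENCODING=utf-8 on the command line. Check the documentation for your Scrapy version.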
