Python
Ninzalo, 2021-11-26 13:36:08

Which library should I use to parse a large number of pages?

I've tried different options: aiohttp+asyncio+bs4 / grequests+bs4 / requests+bs4 / multiprocessing+requests+bs4 / threading+requests+bs4.
Now I have a task to parse data from 2 million+ pages, and I don't know which of these will handle it faster and better.
I'd also like to hear comments about Scrapy: speed / convenience.
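For reference, a minimal sketch of the aiohttp+asyncio+bs4 route mentioned above, assuming the goal is to pull each page's <title>; the URL list, the concurrency limit, and the parse target are placeholders, not part of the original question:

import asyncio

import aiohttp
from bs4 import BeautifulSoup

CONCURRENCY = 100  # placeholder: requests in flight at once, tune per target site

async def fetch_and_parse(session, semaphore, url):
    # The semaphore keeps millions of queued URLs from opening sockets all at once.
    async with semaphore:
        async with session.get(url) as response:
            html = await response.text()
    soup = BeautifulSoup(html, "html.parser")
    return url, soup.title.string if soup.title else None

async def main(urls):
    semaphore = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_and_parse(session, semaphore, url) for url in urls]
        for finished in asyncio.as_completed(tasks):
            url, title = await finished
            print(url, title)

if __name__ == "__main__":
    # Placeholder URL list; for 2M+ pages you would stream URLs from a file
    # in batches rather than build one giant task list.
    urls = ["https://example.com/page/%d" % i for i in range(1, 101)]
    asyncio.run(main(urls))

And, for comparison, a bare-bones Scrapy spider for the same job; Scrapy handles concurrency, retries, and throttling itself. The spider name, start URL, and settings here are assumptions:

import scrapy

class PagesSpider(scrapy.Spider):
    name = "pages"  # placeholder spider name
    start_urls = ["https://example.com/page/1"]  # placeholder start URL
    custom_settings = {"CONCURRENT_REQUESTS": 100}  # placeholder tuning

    def parse(self, response):
        # Extract the page title; swap the selector for the real data fields.
        yield {"url": response.url, "title": response.css("title::text").get()}

Run it with "scrapy runspider pages_spider.py -o items.jsonl" to stream results to disk.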

1 answer
Kirill Gorelov, 2021-11-26
@Ninzalo

Just build it already with whatever you have))))
The speed difference between these tools will be an hour or two at most...
As for me, I'm for hardcore: plain requests.
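A minimal sketch of the plain-requests route the answer recommends, reusing one Session for connection pooling; the URLs and the <title> extraction are placeholders:

import requests
from bs4 import BeautifulSoup

def scrape(urls):
    # One Session reuses TCP connections instead of reconnecting per page.
    with requests.Session() as session:
        for url in urls:
            response = session.get(url, timeout=30)
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "html.parser")
            yield url, soup.title.string if soup.title else None

if __name__ == "__main__":
    # Placeholder URL list.
    for url, title in scrape("https://example.com/page/%d" % i for i in range(1, 11)):
        print(url, title)

A purely sequential loop like this will crawl 2 million pages slowly, so in practice you would wrap it in the multiprocessing or threading setups the question already mentions.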
