What library to use for parsing a large number of pages?
I've tried several stacks: aiohttp+asyncio+bs4, grequests+bs4, requests+bs4, multiprocessing+requests+bs4, and multithreading+requests+bs4.
Now I have a task to parse data from 2 million+ pages, and I'm not sure which stack will handle it faster and better.
I'd also like to hear feedback on Scrapy: speed and convenience.
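For reference, here is a minimal sketch of the aiohttp+asyncio+bs4 approach with bounded concurrency, which is usually the main knob at this scale. The URLs, the semaphore limit, and the title extraction are placeholder assumptions, not a recommendation:

```python
import asyncio
import aiohttp
from bs4 import BeautifulSoup

CONCURRENCY = 100  # assumption: tune to what the target hosts tolerate

async def fetch_and_parse(session, sem, url):
    # Cap the number of requests in flight at any moment.
    async with sem:
        try:
            async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
                html = await resp.text()
        except Exception:
            return url, None  # in a real run: log and queue for retry
    soup = BeautifulSoup(html, "html.parser")
    # Placeholder extraction; replace with the fields you actually need.
    return url, soup.title.string if soup.title else None

async def main(urls):
    sem = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_and_parse(session, sem, u) for u in urls]
        # For 2M+ pages, feed URLs in batches instead of creating all tasks at once.
        for coro in asyncio.as_completed(tasks):
            url, title = await coro
            print(url, title)

if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(1, 101)]  # hypothetical URLs
    asyncio.run(main(urls))
```

As for Scrapy, the equivalent spider is short, since concurrency, retries, throttling, and export come from the framework's settings rather than hand-written code. `CONCURRENT_REQUESTS` is a real Scrapy setting; the URLs and selector are again placeholders:

```python
import scrapy

class PageSpider(scrapy.Spider):
    name = "pages"
    start_urls = [f"https://example.com/page/{i}" for i in range(1, 101)]  # hypothetical
    custom_settings = {"CONCURRENT_REQUESTS": 64}  # throttle to taste

    def parse(self, response):
        # Placeholder extraction; swap in your own selectors.
        yield {"url": response.url, "title": response.css("title::text").get()}
```

Run it with `scrapy runspider pages_spider.py -o items.jsonl` to stream results to disk.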
Just build it with whatever you already know))))
The difference in speed between these tools will be an hour or two at most.
As for me, I'm for hardcore: plain requests.
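A minimal sketch of that "plain requests" route, assuming a simple sequential crawl over one reused connection; the URLs and the title extraction are placeholders:

```python
import requests
from bs4 import BeautifulSoup

def scrape(urls):
    # A single Session reuses TCP connections across requests,
    # which adds up over millions of pages.
    with requests.Session() as session:
        for url in urls:
            try:
                resp = session.get(url, timeout=30)
                resp.raise_for_status()
            except requests.RequestException:
                continue  # in a real run: log and queue for retry
            soup = BeautifulSoup(resp.text, "html.parser")
            # Placeholder extraction; replace with the fields you need.
            yield url, soup.title.string if soup.title else None

if __name__ == "__main__":
    for url, title in scrape(f"https://example.com/page/{i}" for i in range(1, 11)):
        print(url, title)
```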