Which components should I use for multi-threaded HTML parsing in VC++ with proxies?
Problem statement: I need to parse a large number of websites every day (> 100 sites, > 1000 pages) and extract product information from them, for example from online stores. The work has to be multi-threaded and go through proxies, with one proxy per page (not per site).
The actual question: please recommend a complete stack of components for the final solution, with a focus on:
I processed 1k pages in Python 2.7 using `from multiprocessing import Pool`. Look through my favorites, there were some links in there somewhere.
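For anyone who wants to see roughly what that looks like, here is a minimal sketch, not my original code. It is written for Python 3 rather than 2.7 and uses only the standard library. The names `assign_proxies`, `extract_title`, `fetch_and_parse` and `crawl` are made up for illustration, and pulling out the `<title>` is just a stand-in for whatever product fields you actually need. Proxies are handed out round-robin, so each page gets its own proxy as long as you have at least as many proxies as pages.

```python
from html.parser import HTMLParser
from itertools import cycle
from multiprocessing import Pool
from urllib.request import ProxyHandler, build_opener


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Placeholder extraction step: return the page title, stripped."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def assign_proxies(urls, proxies):
    """Pair every page URL with a proxy, reusing proxies round-robin."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return list(zip(urls, cycle(proxies)))


def fetch_and_parse(task):
    """Download one page through its assigned proxy and parse it.

    Returns (url, title, error); exactly one of title/error is None.
    """
    url, proxy = task
    opener = build_opener(ProxyHandler({"http": proxy, "https": proxy}))
    try:
        with opener.open(url, timeout=30) as response:
            html = response.read().decode("utf-8", errors="replace")
        return url, extract_title(html), None
    except OSError as exc:  # URLError and socket timeouts derive from OSError
        return url, None, str(exc)


def crawl(urls, proxies, workers=8):
    """Fetch all pages in parallel with a pool of worker processes."""
    tasks = assign_proxies(urls, proxies)
    with Pool(processes=workers) as pool:
        return pool.map(fetch_and_parse, tasks)
```

Call `crawl(urls, proxies)` from under an `if __name__ == "__main__":` guard, which `multiprocessing` requires on Windows. `Pool` starts separate processes; if you specifically want threads, `multiprocessing.pool.ThreadPool` offers the same `map` interface and is usually enough when the work is mostly waiting on the network.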