Tornado crawler (AsyncHTTPClient). Can it be easier?
Good evening, everyone.
I am rewriting a fast crawler from [curl multi & c-ares] to tornado.httpclient.AsyncHTTPClient.
I went through the documentation and put together a simple script:

    import tornado.gen
    import tornado.httpclient
    import tornado.ioloop

    @tornado.gen.coroutine
    def test():
        def handle_response(response):
            print 'handle %s' % response.code

        num_of_try, num_of_conn = 10000, 500
        # Use the libcurl-based client, with at most num_of_conn simultaneous requests
        tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", max_clients=num_of_conn)
        http_client = tornado.httpclient.AsyncHTTPClient()
        # Start all requests at once and wait until every one has finished
        responses = yield [http_client.fetch("http://ya.ru/", callback=handle_response) for i in xrange(num_of_try)]

    if __name__ == '__main__':
        tornado.ioloop.IOLoop.current().run_sync(test)

    import tornado.gen
    import tornado.httpclient
    import tornado.ioloop
    import yieldpoints

    @tornado.gen.coroutine
    def test():
        def handle_response(response):
            print 'handle %s' % response.code

        num_of_try, num_of_conn = 10000, 500
        tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", max_clients=num_of_conn)
        http_client = tornado.httpclient.AsyncHTTPClient()
        keys = set(range(num_of_try))
        # Register one gen.Callback per request, keyed by its index
        for i in keys:
            http_client.fetch("http://ya.ru/", callback=(yield tornado.gen.Callback(i)))
        # Handle each response as soon as any of the pending requests completes
        while keys:
            key, res = yield yieldpoints.WaitAny(keys)
            handle_response(res)
            keys.remove(key)

    if __name__ == '__main__':
        tornado.ioloop.IOLoop.current().run_sync(test)
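
For comparison, here is a minimal sketch of the same fan-out without callbacks or yieldpoints: one small coroutine per URL, with a semaphore capping how many requests are active. It is written for Python 3 and plain asyncio rather than the Python 2 / Tornado setup above, so treat it as an illustration of the pattern, not a drop-in replacement. The names `crawl`, `fetch_one` and `max_in_flight` are made up for this example, and `fetch` is assumed to be any callable that takes a URL and returns an awaitable.

```python
import asyncio


async def crawl(fetch, urls, handle_response, max_in_flight=500):
    """Fetch every URL, keeping at most max_in_flight requests active.

    fetch is any callable that takes a URL and returns an awaitable;
    handle_response is called once per response, as soon as it arrives.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def fetch_one(url):
        # Wait for a free slot, then hold it for the duration of the request.
        async with semaphore:
            response = await fetch(url)
        handle_response(response)
        return response

    # gather() returns the results in the same order as urls.
    return await asyncio.gather(*(fetch_one(url) for url in urls))
```

With Tornado 5 or later, where the futures returned by `fetch` can be awaited from native coroutines, `fetch` could presumably be a thin wrapper such as `lambda url: http_client.fetch(url, raise_error=False)`. That wiring is an assumption and is not exercised here.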