Python
Alexey Samoilov, 2016-01-19 13:11:24

How to speed up http (GET) data loading?

I have the following program

import requests

# The urls list contains 35000+ entries
urls = get_urls()

for url in urls:
    response = requests.get(url)
    text = response.text
    do_something(text)

Each requests.get(url) takes 1-2 seconds to complete.
How can I speed up the script? Is it possible to somehow run the requests in parallel?
Thanks.

Answer the question


3 answer(s)
asd111, 2016-01-19
@tibhar940

Haven't tested, but should work.
chriskiehl.com/article/parallelism-in-one-line

import requests
from multiprocessing.dummy import Pool as ThreadPool  # thread pool

def my_mega_function(url):  # called once per request
    response = requests.get(url)
    text = response.text
    do_something(text)

# The urls list contains 35000+ entries
urls = get_urls()
pool = ThreadPool(4)  # create 4 threads, matching the number of CPU cores
# pool.map runs 4 threads in parallel; each thread processes one value from
# urls with my_mega_function. results holds whatever my_mega_function returns.
results = pool.map(my_mega_function, urls)
pool.close()
pool.join()

nirvimel, 2016-01-19
@nirvimel

Is it possible to somehow start parallel execution of queries?
  • threading
  • gevent
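The threading option can be sketched with the stdlib `concurrent.futures` API. In this sketch, `fetch` is a stand-in for the real `requests.get(url).text` call, so the snippet runs without network access:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # stand-in for requests.get(url).text; swap in the real call here
    return "body of " + url

urls = ["http://example.com/%d" % i for i in range(10)]

# map runs up to 8 downloads at once and yields results in input order
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch, urls))
```

Since the workload is I/O-bound (waiting on sockets), threads help despite the GIL, and `max_workers` can go well above the core count.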

un1t, 2016-01-19
@un1t

Either spin up a lot of python-rq
workers, or use asyncio for asynchronous requests.
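A minimal asyncio sketch of the second option: in real code `fetch` would do an HTTP GET (e.g. via aiohttp); here it is a stand-in coroutine so the snippet runs offline, and a semaphore caps how many "requests" are in flight at once:

```python
import asyncio

async def fetch(url, sem):
    # stand-in for an async HTTP GET; the semaphore limits concurrency
    async with sem:
        await asyncio.sleep(0)  # simulate waiting on network I/O
        return "body of " + url

async def main(urls, limit=100):
    sem = asyncio.Semaphore(limit)
    # gather runs all coroutines concurrently, returning results in input order
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

urls = ["http://example.com/%d" % i for i in range(10)]
results = asyncio.run(main(urls))
```

With 35000+ URLs, a single event loop with a concurrency limit in the hundreds usually outperforms a small thread pool, since each pending request costs only a coroutine, not a thread.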
