Python
stayHARD, 2015-10-30 20:21:22

How can I optimize this code so it consumes fewer system resources?

import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
from grab import Grab
import random
import psycopg2

# Open connection to the database
connection = psycopg2.connect(database="<....>",
                              user="<....>",
                              password="<....>",
                              host="127.0.0.1",
                              port="5432")

# Create a new cursor for it
c = connection.cursor()

# Select settings from database
c.execute("SELECT * FROM <....> WHERE id=1;")
data = c.fetchall()

# Get time starting script
start_time = time.time()

def operation(link):
    # open a new connection to the database
    conn = psycopg2.connect(database="<....>",
                                user="<....>",
                                password="<....>",
                                host="127.0.0.1",
                                port="5432")
    curs = conn.cursor()
    # init grab framework
    g = Grab()
    # try to find some elements on the page
    try:
        # open link
        g.go(link)
    except:
        pass
    conn.close()


@asyncio.coroutine
def main(item):
    yield from loop.run_in_executor(p, operation, item)

# Create async loop, declare number of worker processes
loop = asyncio.get_event_loop()
p = ProcessPoolExecutor(data[0][13])  # =200

# Init tasks list - empty
tasks = []

# Select all urls which need to process
c.execute("SELECT url FROM <....> ORDER BY id;")

# Forming tasks
for item in c.fetchall():
    tasks.append(main(item[0]))

# Close main connection to the database
connection.close()
# Run async tasks
loop.run_until_complete(asyncio.wait(tasks))
# Get script finish time
print("--- %s seconds ---" % (time.time() - start_time))

The question is in the title. I don't know how to optimize this: the 4 GB of RAM is used up to the limit and CPU load is above 90%. The server simply goes down under this load. What can be done?

3 answers
Denis, 2015-10-30
@Ayahuaska

How many database connections do you have there?

Roman Kitaev, 2015-10-30
@deliro

Gather these bits and pieces into a class and open a single connection.
And really, nothing here is optimized at all.
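
A rough sketch of the connection advice, not a drop-in fix: instead of opening a connection inside every call to operation, each worker thread opens one on first use and keeps it. This is one connection per worker rather than literally a single one, which is a variation on the suggestion above. sqlite3 stands in for psycopg2.connect(...) only so the example runs anywhere; get_connection, opened and run are hypothetical names for illustration.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()       # per-thread storage for the connection
opened = []                      # every connection opened (illustration only)
_opened_lock = threading.Lock()


def get_connection():
    """Return this worker thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        # Stand-in for psycopg2.connect(...) from the question.
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
        with _opened_lock:
            opened.append(conn)
    return conn


def operation(link):
    # Reuses the thread's connection instead of connecting per link.
    cur = get_connection().cursor()
    cur.execute("SELECT ?", (link,))
    return cur.fetchone()[0]


def run(links, workers=4):
    # The pool size caps how many connections can ever be opened.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(operation, links))
```

With this layout the number of open connections is bounded by the number of workers, not by the number of URLs.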

angru, 2015-10-30
@angru

You should not run all the tasks at the same time; use some kind of worker pool. Connections have already been discussed.
P.S. I happened to notice that you even import ProcessPoolExecutor but don't use it in any way.
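
For illustration, a minimal sketch of a bounded worker pool. The posted code does hand a ProcessPoolExecutor to run_in_executor, sized from the settings row (200, per the comment in the question), and 200 processes are a likely cause of the memory and CPU pressure. Since fetching pages is mostly waiting on the network, a small thread pool is assumed here instead; that is a swapped-in technique, not something the question uses. The body of operation is a placeholder for g.go(link), and process_all, run and the default of 8 workers are hypothetical choices.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def operation(link):
    # Placeholder for the real work (g.go(link) in the question).
    return len(link)


async def process_all(links, max_workers=8):
    loop = asyncio.get_running_loop()
    # At most max_workers calls run at once; the rest wait in the pool's queue.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, operation, link) for link in links]
        return await asyncio.gather(*futures)


def run(links, max_workers=8):
    return asyncio.run(process_all(links, max_workers))
```

Calling run(urls, max_workers=8) would return the results in the same order as urls. The worker count is something to tune against available memory rather than a recommended value.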
