A
A
Andrew2013-02-17 15:01:49
Python
Andrew, 2013-02-17 15:01:49

Organization of parallel requests

Good time of the day. Please help with the task. The conditions are as follows:
1. The "black box" system writes data to MS SQL Server, data from 2 to 4 million records daily. The developers of the "black box" provided that the service writes every day to a new table with the names: "table-2013-02-01", "table-2013-02-02", "table-2013-02-03" ... and so Further.
2. For the search, I wrote a simple script in python using Pyodbc, in which the input data is: a period from such and such a number, to such, and what to search for in which field. Now the script is executed in a loop (or using "map") where the table name is changed in the query. And the results are added to the list.
3. How can I parallelize a query in python, for example, 30 queries at a time (to 30 tables) to the database?

PS Due to little experience with Python, I tried to use PP (Parallel Python), by analogy:
f ('name_table') - the function takes the name of the table, returns a list.
period = ('table-2013-02-01','table-2013-02-02',...)
jobs = [job_server.submit(f,(input,), (,),("pyodbc", )) for input in period]
for job in jobs:
job()
But apparently I use it out of place, and I constantly “catch” an error like “unpickle” ...

Answer the question

In order to leave comments, you need to log in

4 answer(s)
G
gelas, 2013-02-17
@AndrewFoma

And why not shift the task to SQL Server by making, for example, union all 30 tables?

A
alz, 2013-02-17
@alz

If you have a large idle time for i / o, then you can probably parallelize on ordinary threads

M
MrFrizzy, 2013-02-17
@MrFrizzy

Try to look at the multiprocessing library, I use with python 2.7

from multiprocessing import Pool
import os

def main():
  pool = Pool(os.sysconf('SC_NPROCESSORS_ONLN'))
  result = pool.map(f, range(10))

def f(x):
  return x*x

almost standard example.
I similarly parallel simple ssh actions on different machines when there is no access to puppet.
True, in your case, everything can rest on the speed of reading from one database, but several parallel queries should keep ...

N
Nikolai Turnaviotov, 2013-02-17
@foxmuldercp

I would also think that the old tables closed by dates can be taken out to the database, for example, on another machine, and make queries there + the main server, if necessary, the current day.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question