Python
coderisimo, 2019-09-25 14:17:13

How to correctly work with one object when using Pool (multiprocessing)?

Could someone explain to a beginner why the same values are sometimes picked when working in several threads, even though I explicitly check for them to avoid duplicates?
In other words, I need a list that all processes can modify. Before running a task, a process goes to the list and takes an element from it. When the task completes, the element is returned to the list and becomes available to other processes again.
Below is a simplified synthetic example, but it captures the essence.

import random 
from multiprocessing import Pool

my_list = [1,2,3,4,5,6,7,8,9]
used = []

def test(i):
  indx = random.randint(0,len(my_list)-1)
  while my_list[indx] in used: # look for an element that has not been used yet
    indx = random.randint(0,len(my_list)-1)
  used.append(my_list[indx]) # add the element to the used list to avoid reusing it
  print(my_list[indx]) # print the unique element
  
with Pool(4) as p:
  p.map(test, [1,2,3,4,5,6,7])

For example, I get:
9
7
6
6
3
1
4
Two 6s in a row. Multithreading, damn it!
Thank you!


1 answer(s)
Ivan Yakushenko, 2019-09-25
@coderisimo

Because you are using processes, not threads. Each process has its own memory space, so the variables of process_1 are not visible from process_2, and each worker appends to its own copy of the used list. To share data between processes, you need to use a Manager.
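To see that isolation directly, here is a minimal sketch (the names mark_used and run_demo are made up for illustration, not taken from the question). Each worker appends to a module-level list and reports what it sees, while the parent's copy stays untouched:

```python
from multiprocessing import Pool

used = []  # module-level list; every worker process gets its own copy


def mark_used(value):
    # This append changes only the copy that lives in the current worker.
    used.append(value)
    return list(used)


def run_demo():
    with Pool(2) as pool:
        worker_views = pool.map(mark_used, [1, 2, 3, 4])
    # The parent never appended anything, so its own list is still empty.
    return worker_views, list(used)


if __name__ == "__main__":
    worker_views, parent_view = run_demo()
    print("seen inside workers:", worker_views)  # varies from run to run
    print("seen in the parent: ", parent_view)   # prints []
```

What each worker reports depends on how the tasks happen to be distributed, but no worker ever sees the appends made by another one, which is why the "in used" check in the question cannot prevent duplicates.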
Here is an example of a simple rotator:

from multiprocessing import Pool, Manager


def rotator(data_list):
    # Take the first element and put it back at the end of the shared list.
    data = data_list.pop(0)
    data_list.append(data)
    return data


def print_data(data_list):
    data = rotator(data_list)
    print(data)


if __name__ == "__main__":
    manager = Manager()
    data_list = manager.list()

    for x in range(5):
        data_list.append(x)
    
    with Pool(4) as pool:
        for _ in range(10):
            pool.apply_async(print_data, [data_list])
        pool.close()
        pool.join()

The principle is dead simple: there is a list of elements; when the function is called, the first available element is removed from the list and appended to the end, so that the next process does not take the same element, and so on in a circle.
Here is the output:
0
1
2
3
4
0
1
2
3
4
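Note that the rotator puts the element back right away, so it can come around again while an earlier task is still using it. If an element has to stay unavailable until the task finishes, as the question describes, one possible variation is to keep the free elements in a Manager queue. This is only a sketch under that assumption: use_item and its work_seconds parameter are hypothetical names, and the sleep stands in for real work.

```python
import time
from multiprocessing import Manager, Pool


def use_item(free_items, work_seconds=0.01):
    # get() removes an element and blocks while none is free,
    # so no two tasks can hold the same element at the same time.
    item = free_items.get()
    try:
        time.sleep(work_seconds)  # placeholder for the real task
        return item
    finally:
        # Return the element only after the task is done.
        free_items.put(item)


if __name__ == "__main__":
    with Manager() as manager:
        free_items = manager.Queue()
        for value in range(3):
            free_items.put(value)

        with Pool(4) as pool:
            # 10 tasks share 3 elements; a task waits if all are taken.
            results = pool.starmap(use_item, [(free_items,)] * 10)

        print(results)  # which task got which element varies between runs
```

The try/finally matters here: if a task raised without returning its element, the queue would shrink and later tasks could block forever.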

Just keep in mind that although processes, like threads, start in a certain order, that does not mean they will finish in the same order. That is, it is quite normal for the process that started fourth to run a few milliseconds faster and finish first, so the output can look like this:
4
2
0
1
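If the order matters, it is usually easier to return values from the workers than to print inside them. A small sketch, assuming a made-up work function whose delay is chosen so that later tasks finish sooner: pool.map hands the results back in input order no matter which task finished first, while pool.imap_unordered yields them as they complete.

```python
import time
from multiprocessing import Pool


def work(n):
    # Hypothetical workload: smaller n sleeps longer, so later tasks finish first.
    time.sleep(0.02 * (4 - n))
    return n


if __name__ == "__main__":
    with Pool(4) as pool:
        in_input_order = pool.map(work, range(4))
        in_completion_order = list(pool.imap_unordered(work, range(4)))

    print(in_input_order)        # [0, 1, 2, 3]
    print(in_completion_order)   # depends on timing
```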
