Python
Boris [A-Z][a-z]+, 2019-12-01 11:55:40

Why is multiprocessing.pool not working?

Hello! I'm working on a parser, and it occurred to me to try out the Pool() object from multiprocessing.pool.
I ran the following code:

from multiprocessing.pool import Pool


def a(s):
    with open('test.txt', 'a', encoding='utf-8') as f:
        f.write(s + '!\n')


def main():

    arr = []
    for i in range(1000):
        arr.append(str(i))

    with Pool(20) as p:
        p.map(a, arr)


if __name__ == '__main__':
    main()

And what did it give me?
The txt file ends up with 500-something lines (:
I tried different pool sizes. To no avail: always fewer than 1000 lines.
It only worked correctly with Pool(1) :)
Why does this module behave incorrectly, and how can I fix this?
Thanks, everyone!


1 answer(s)
ScriptKiddo, 2019-12-01
@7esoterik7

Alas, you can't just write to one file from several processes like that.
Either use FileLock to synchronize the processes, or switch to SQLite or another DBMS.
SQLite locks the database while one process is writing; once the write completes, it unlocks and lets the other processes write.
https://www.sqlite.org/threadsafe.html
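A rough sketch of the SQLite route, using only the standard-library sqlite3 module (the database filename `test.db` and the table name `lines` are made up for the example; each worker opens its own connection, and the `timeout` makes it wait out the lock instead of failing):

```python
import sqlite3
from multiprocessing.pool import Pool

DB = 'test.db'  # hypothetical filename for this sketch


def write_row(s):
    # Each process gets its own connection; SQLite serializes the writers.
    # timeout=30 means "retry for up to 30 s if the DB is locked".
    with sqlite3.connect(DB, timeout=30) as conn:
        conn.execute('INSERT INTO lines(value) VALUES (?)', (s,))


def main():
    with sqlite3.connect(DB) as conn:
        conn.execute('CREATE TABLE IF NOT EXISTS lines(value TEXT)')

    with Pool(20) as p:
        p.map(write_row, [str(i) for i in range(1000)])

    with sqlite3.connect(DB) as conn:
        print(conn.execute('SELECT COUNT(*) FROM lines').fetchone()[0])


if __name__ == '__main__':
    main()
```

Unlike the plain-file version, all 1000 rows arrive, because a locked database makes the other writers wait rather than letting their writes clobber each other.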
Snippet for FileLock:

from filelock import FileLock

lock = FileLock("high_ground.txt.lock")
with lock:
    with open("high_ground.txt", "a") as f:
        f.write("You were the chosen one.")

https://pypi.org/project/filelock/
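Putting that together with the code from the question might look like this (a sketch, assuming the filelock package is installed; the lock-file name `test.txt.lock` is arbitrary):

```python
from multiprocessing.pool import Pool

from filelock import FileLock

# One lock file shared by all processes; the FileLock object itself
# is per-process, but they all coordinate through the same .lock file.
lock = FileLock('test.txt.lock')


def a(s):
    # Only one process at a time holds the lock, so appends never collide.
    with lock:
        with open('test.txt', 'a', encoding='utf-8') as f:
            f.write(s + '!\n')


def main():
    arr = [str(i) for i in range(1000)]
    with Pool(20) as p:
        p.map(a, arr)


if __name__ == '__main__':
    main()
```

With the lock in place, test.txt should contain all 1000 lines regardless of the pool size.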
