Python
Artur Bekerov, 2015-01-16 15:02:16

How to improve the performance of a Python script?

Greetings. I have a simple loop that iterates over images of about 50 MB each.
OpenFile is the library call I use to open the files. If I open the first 20 files in order, it works quickly, but after about 50 files it starts to slow down. I thought there was some kind of local caching going on, but the speed keeps dropping in proportion to the number of files opened.
What should I do?

for file in files[:20]:
    print file
    dataset = OpenFile(file)
    data = dataset.ReadAsArray()
    print data[3000, 5000]


4 answers
Yuri Shikanov, 2015-01-16
@dizballanze

Open files in a `with` block so that the resources are freed as soon as the body of the block exits and you are done with the file.
I also advise you to read this post on finding memory leaks.
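A minimal sketch of that idea, assuming the object returned by OpenFile has a close() method (OpenFile and files come from the question; contextlib.closing wraps an object that does not support the `with` protocol directly):

from contextlib import closing

for file in files[:20]:
    print file
    # closing() calls dataset.close() as soon as the block exits,
    # so each file's resources are released before the next one is opened
    with closing(OpenFile(file)) as dataset:
        data = dataset.ReadAsArray()
        print data[3000, 5000]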

Vladimir Abramov, 2015-01-16
@kivsiak

Start with this: https://docs.python.org/2/library/profile.html
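For example, a quick way to profile the loop with the standard cProfile module (process_files here is a hypothetical wrapper around the code from the question):

import cProfile

def process_files(files):
    for file in files[:20]:
        dataset = OpenFile(file)
        data = dataset.ReadAsArray()
        print data[3000, 5000]

# sort the report by cumulative time to see where the script spends it
cProfile.run('process_files(files)', sort='cumulative')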

Alexey Cheremisin, 2015-01-16
@leahch

Use numpy to work with large arrays, it handles them well! There is also scipy, which has image-processing routines that may come in handy...

import numpy
...
for file in files[:20]:
    print file
    dataset = OpenFile(file)
    # data = dataset.ReadAsArray()
    data = numpy.fromfile(dataset, dtype=numpy.dtype(numpy.int16))
    dataset.close()
    print data[3000, 5000]

unclechu, 2015-01-17
@unclechu

If you want to go hardcore, you can process each file (or at least a batch of files, say 10 at a time) asynchronously in a separate process. Each process does its work, returns a result, and exits, while the main script manages these processes and collects the data they return. See the sketch below.
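A minimal sketch of that approach with the standard multiprocessing module (process_one is a hypothetical worker; OpenFile and files come from the question):

from multiprocessing import Pool

def process_one(file):
    # worker: runs in a separate process, so its memory is released
    # back to the OS when the process is done with the file
    dataset = OpenFile(file)
    data = dataset.ReadAsArray()
    return data[3000, 5000]

if __name__ == '__main__':
    # a pool of 4 worker processes; map() hands out the files
    # and collects the results in the main script
    pool = Pool(processes=4)
    results = pool.map(process_one, files[:20])
    pool.close()
    pool.join()
    print results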
