How to optimize a function that uses OpenCV?


Python

Alexey Belov, 2019-05-21 14:30:01

Hello, experts. I have a function that runs quite slowly, and I have 72 cores at my disposal. I would like to hear recommendations for optimizing it. The input is an RGBA image, not grayscale.

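import cv2
import numpy as np

# MASK_THRESHOLD is a constant defined elsewhere in the project (not shown here).

def color_transfer_init(image_with_alpha):
    # Split the RGBA input into the color channels and the alpha mask.
    image, image_mask = image_with_alpha[..., :3], image_with_alpha[..., 3]

    # Hard mask: used below for the mean/std statistics.
    _, mask_hard = cv2.threshold(
        image_mask, MASK_THRESHOLD, 1.0, cv2.THRESH_BINARY)

    # Loose mask: every pixel with alpha above 0.01, used to find the crop region.
    _x, mask_crop = cv2.threshold(
        image_mask, 0.01, 1.0, cv2.THRESH_BINARY)

    img_lab = cv2.cvtColor(image, cv2.COLOR_RGB2LAB)

    # Bounding box of the largest external contour.
    mask = np.uint8(mask_crop)
    cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    cnt = sorted(cnts, key=cv2.contourArea)[-1]

    x, y, w, h = cv2.boundingRect(cnt)

    # Stretch the box height depending on its aspect ratio.
    if h <= w:
        h = int(h * 1.8)
    elif h * 1.2 <= w:
        h = int(h * 1.3)
    elif h * 0.8 <= w:
        h = int(h * 1.3)

    # Crop the mask, the Lab image and the original RGBA image to the box.
    mask_hard = mask_hard[y:y+h, x:x+w]
    img_lab = img_lab[y:y+h, x:x+w]
    img_rgba = image_with_alpha[y:y+h, x:x+w]

    # Per-channel Lab mean and standard deviation inside the hard mask.
    img_mean, img_std = cv2.meanStdDev(
        img_lab, mask=mask_hard.astype(np.uint8))

    return img_lab, img_mean.reshape((3,)), img_std.reshape((3,)), img_rgba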


3 answer(s)

f4f, 2019-05-21
@Alenorze

It seems to me that the order of the conditions should be changed (h * 1.2 > h > h * 0.8). In the current version the h * 1.2 <= w branch is never reached, because whenever it holds, the first condition (h <= w) is already satisfied.
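To make the unreachable branch visible, here is a minimal sketch that returns both the new height and the name of the branch that fired. The function names are hypothetical, and the reordered version is only one reading of the suggestion above (strictest condition first); which scale factor belongs to which branch is for the asker to decide.

```python
def stretched_height(w, h):
    """Branch order as written in the question."""
    if h <= w:
        return int(h * 1.8), "h <= w"
    elif h * 1.2 <= w:          # never reached: implies h <= w
        return int(h * 1.3), "h * 1.2 <= w"
    elif h * 0.8 <= w:
        return int(h * 1.3), "h * 0.8 <= w"
    return h, "none"


def stretched_height_reordered(w, h):
    """Strictest condition first, so every branch can fire."""
    if h * 1.2 <= w:
        return int(h * 1.3), "h * 1.2 <= w"
    elif h <= w:
        return int(h * 1.8), "h <= w"
    elif h * 0.8 <= w:
        return int(h * 1.3), "h * 0.8 <= w"
    return h, "none"


if __name__ == "__main__":
    # A wide box: the original order takes the first branch,
    # the reordered version takes the stricter one.
    print(stretched_height(200, 100))
    print(stretched_height_reordered(200, 100))
```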
You don't need to sort the cnts array; just find the maximum element. Depending on the nature of the masks and how many contours there usually are, though, the gain may be completely imperceptible.
The obvious point regarding OpenCV: there is no sense in converting the whole image to Lab if an ROI is cut out of it afterwards. Cut out the ROI first, then convert it to Lab. The same can be done with mask_hard (although with adaptive binarization methods the result may differ slightly at the edges of the ROI).
I'm not that familiar with OpenCV for Python, but the C++ version parallelizes by default where possible. It makes sense to build the library with OpenMP / TBB and see what kind of speedup that gives.
If a batch of images is being processed, run it explicitly in several threads, aggregating the results if necessary.
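A minimal sketch of the batch approach using only the standard library. process_batch and the worker argument are hypothetical names; in the asker's case the worker would be color_transfer_init. Threads only help if the worker spends most of its time in native code that releases the GIL, which OpenCV calls generally do; if the per-image work is mostly pure Python, a process pool may be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(items, worker, max_workers=8):
    """Apply worker to every item in parallel threads.

    Results come back in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))


if __name__ == "__main__":
    # Stand-in worker; replace with the real per-image function.
    print(process_batch([1, 2, 3], lambda v: v * 2, max_workers=2))
```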

Andrey Dugin, 2019-10-04
@adugin

1) Allocate memory for the result in advance and pass it as the dst parameter to the OpenCV functions; almost all of them accept it.
2) If you need the single contour with the maximum area, there is no need to sort the entire list with sorted(cnts, key=cv2.contourArea)[-1]; use max(cnts, key=cv2.contourArea).
3) It makes sense to cut out the ROI as the very first step, and only then do the transformations.
4) The mix of conversion to uint8 and a threshold value of 1.0 is not very clear; can't it be done differently?
5) If you are on Linux, it makes sense to try the pillow-simd library, a fork of Pillow (PIL) optimized for vector (SIMD) processor instructions.