I
I
ImVeryStupid2019-04-19 08:29:22
PHP
ImVeryStupid, 2019-04-19 08:29:22

Optimize the code or how to allocate all the computing power of the PC to its execution?

There is an array of integers. Task:
Represent all values ​​as ranges: value .. (value + 50)
Take the first value (range)
cut off all the ranges below it that intersect with it
take the second
cut off the intersecting
next
and so on
Preserving the order is mandatory.
The code:

<?php
$arr = [100,125,75,175,25,300,275,325,375];
$step = 50;

for ($i = 0; $i <= count($arr)-1; $i++) {
    $value = $arr[$i];
    $arr = array_values(array_filter($arr, function($v) use ($value,$step) {
        return ($value+$step < $v or $v+$step < $value or $v == $value);
    }));
}

var_dump($arr); // [100, 175, 25, 300, 375]
?>

Process 100k numbers with this code for a day.
To iterate without functions "manually"?
Or what else can be done?

Answer the question

In order to leave comments, you need to log in

6 answer(s)
D
dollar, 2019-04-19
@ImVeryStupid

The first thing that catches your eye is the repeated copying of the array . Imagine that when sorting, after every two or three permutations, we would make a complete duplicate of the array. This is horror! And this is a weak point, a constant re-allocation of large memory.
The second, which is also important, is the complexity of O(N*N) . In your case, this is critical, because there are many elements in the original array.
I propose to slightly change the algorithm. We make one pass, but slightly increase the memory consumption in which we store the intervals. This way, we get rid of constant copying of the array, and also reduce the complexity to about O(N) .
And a little optimization trick - the search for the interval occurs by index, that is, O (1). It takes a little thought to figure this out, but overall it's simple.

The code
<?php
$arr = [100,125,75,175,25,300,275,325,375];
$step = 50;

$b = []; //-1 - deny, 0 - not set, 1 - has interval
$int = []; //intervals if necessary 
$step2 = intdiv($step,2);
$arr = array_values(array_filter($arr, function($v) use ($step2,&$b,&$int) {
    $i = intdiv($v,$step2);
    $mod = $v % $step2;
    $res = true;
    if (isset($b[$i])) {
        if ($b[$i] === -1) $res = false;
        elseif ($mod < $int[$i][0] or $mod > $int[$i][1]) $res = false;
    }
    $b[$i] = -1;
    $b[$i+1] = -1;
    $b[$i-1] = -1;
    if (!isset($b[$i+2])) {
        $b[$i+2] = 1;
        $int[$i+2] = [$mod,$step2];
    } elseif ($b[$i+2] === 1) {
        if ($int[$i+2][0] < $mod) {
            $int[$i+2][0] = $mod;
            if ($int[$i+2][0] >= $int[$i+2][1]) $b[$i+2] = -1;
        }
    }
    if (!isset($b[$i-2])) {
        $b[$i-2] = 1;
        $int[$i-2] = [0,$mod];
    } elseif ($b[$i-2] === 1) {
        if ($int[$i-2][1] > $mod) {
            $int[$i-2][1] = $mod;
            if ($int[$i-2][0] >= $int[$i-2][1]) $b[$i-2] = -1;
        }
    }
    return $res;
}));

var_dump($arr); // [100, 175, 25, 300, 375]
?>

By rewriting the algorithm in C++, you get an additional 50x speed increase.

J
jcmvbkbc, 2019-04-19
@jcmvbkbc

Copy source array A to array B, pad each element with index in source array, sort by element value. Get an array C of the same size as the original one, fill it with zeros.
Make the first element current A.
Loop start.
Mark in C the current element. Find the current element in the array B element. View neighboring elements of B and mark in C as clipped all those whose ranges intersect with the current one. Find the next unchecked element C, make it current, go to the beginning of the loop, if all elements of C are checked, finish.
The number of actions depends on the number of intersecting segments, it will be more than O(N log N) but less than N^2.

S
Sergey Tikhonov, 2019-04-19
@tumbler

The task is to single out 2 dynamic minutes from each video.
VLC player counts and records the number of changes in adjacent frames.

The most dynamic will be 120 seconds of video with the maximum amount of changes. Accordingly, we take a window with a width of 120 seconds, and by passing through the entire array we calculate the sum of numbers (at each iteration +1 new and -1 old value). On the way, we note the time step of the maximum.
By a linear pass through the entire array, we keep this top up-to-date (inserting into a sorted list, cutting off the smallest value), at the end of the pass, re-sort the top by seconds timestamps.

S
Stalker_RED, 2019-04-19
@Stalker_RED

find the maximum value
if there are several identical ones - count how many. it's all in one pass.
write them in the result as ranges. all.

T
ThunderCat, 2019-04-19
@ThunderCat

The task is to single out 2 dynamic minutes from each video. ... I
summarize the number of changes in each second by shifting the frame: the
sum is 0-30...

maybe the sum is 0 - 120*30 and so on?
or do you need cutting from second pieces?
Or 2 minutes in a row?

S
SagePtr, 2019-04-19
@SagePtr

Greedy solution for 120*(n+30) iterations:
1) We run through the array with a window of 30 frames, find the maximum amount, remember (put the result in the array).
2) We assign a negative weight to the received 30 frames (for example, equal to -30 * (max value in the array), so that it definitely does not fall into the next sample).
3) Repeat n 1-2 120 times until we get 120 non-intersecting ranges (if the negative value in step 2 is chosen correctly), sorted in descending order of weight. Naturally, if we store them as a pair (time, weight), then we can then sort back by time.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question