Sergey Mironov, 2020-10-23 11:24:46
PHP

How to detect a strong deviation in an array?

There is an array with prices:

$prices = array(
'10300',
'10200',
'1250',
'1260',
'1240',
'1140',
'20',
'30'
);


I need to detect the values that deviate strongly from the average price.
That is, the values 10300, 10200, 20, and 30 must be removed.
Please help with the algorithm.

3 answer(s)
Developer, 2020-10-23
@samodum

Look at the mathematical expectation (the mean)
and the standard deviation.
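
A minimal sketch of how that hint could be applied, assuming a simple "keep everything within k standard deviations of the mean" rule. The helper name filterByStdDev and the multiplier $k are illustrative assumptions, not something the answer specifies.

```php
<?php
// Hypothetical helper (not from the thread): keep the values that lie within
// $k population standard deviations of the arithmetic mean.
function filterByStdDev(array $values, float $k): array
{
    $values = array_map('floatval', $values);
    $n = count($values);
    if ($n === 0) {
        return [];
    }

    $mean = array_sum($values) / $n;

    // Population variance: the average squared distance from the mean.
    $sumSq = 0.0;
    foreach ($values as $v) {
        $sumSq += ($v - $mean) ** 2;
    }
    $std = sqrt($sumSq / $n);

    // array_values() reindexes the result from 0.
    return array_values(array_filter($values, function ($v) use ($mean, $std, $k) {
        return abs($v - $mean) <= $k * $std;
    }));
}

$prices = ['10300', '10200', '1250', '1260', '1240', '1140', '20', '30'];
$kept = filterByStdDev($prices, 1.0);
print_r($kept);
```

For the prices in the question the mean is 3180 and the standard deviation is about 4111, so with $k = 1.0 this drops 10300 and 10200 but keeps 20 and 30. The two large values pull the mean up and inflate the spread, so this rule alone may not be enough for such a small, skewed sample.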

dollar, 2020-10-23
@dollar

Ideally, you need to estimate the cost of the product somehow: by weight, material, brand, country of assembly, and so on. Then add a markup of, say, 20%. That gives you the "red" price, i.e. the fair price (not to be confused with the average).
Next, you need to decide what deviation is acceptable.

  • For example, it can be expressed as a percentage of the red price: say, plus or minus 40% is acceptable.
  • Another option is to look at neighboring prices: if the jump to the next price exceeds 10%, we assume that this seller has gotten greedy, and so has everyone after him, so we cut them all off.
  • Any other numeric criterion. It can be a combination of the above methods or an even more complex formula.
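
The first two criteria above could be sketched roughly as follows. Everything here is an assumption made for illustration: the function names are hypothetical, the median stands in for the "red" price (which the answer derives from cost, not from the sample), and the second function is a variation that groups sorted prices by jump and keeps the largest group instead of cutting off everything after the first jump.

```php
<?php
// Hypothetical helpers; names and thresholds are illustrative, not from the thread.

// Median of a list of numbers, used here only as a stand-in for the "red" price.
function median(array $values): float
{
    if ($values === []) {
        throw new InvalidArgumentException('median() needs at least one value');
    }
    $values = array_map('floatval', $values);
    sort($values);
    $n = count($values);
    $mid = intdiv($n, 2);
    return $n % 2 ? $values[$mid] : ($values[$mid - 1] + $values[$mid]) / 2;
}

// Criterion 1: keep the prices within +/- $percent of a reference price.
function filterByPercentBand(array $prices, float $reference, float $percent): array
{
    $limit = $reference * $percent / 100;
    return array_values(array_filter($prices, function ($p) use ($reference, $limit) {
        return abs($p - $reference) <= $limit;
    }));
}

// Criterion 2 (variation): sort ascending, start a new group whenever the next
// price is more than $maxJumpPercent above the previous one, and return the
// largest group. Assumes non-negative prices.
function largestGroupByJump(array $prices, float $maxJumpPercent): array
{
    $sorted = array_map('floatval', $prices);
    sort($sorted);

    $groups = [];
    $current = [];
    foreach ($sorted as $p) {
        if ($current !== []) {
            $prev = end($current);
            if ($p > $prev * (1 + $maxJumpPercent / 100)) {
                $groups[] = $current;
                $current = [];
            }
        }
        $current[] = $p;
    }
    if ($current !== []) {
        $groups[] = $current;
    }

    // Pick the group with the most members (the first one wins on a tie).
    $best = [];
    foreach ($groups as $group) {
        if (count($group) > count($best)) {
            $best = $group;
        }
    }
    return $best;
}

$prices = ['10300', '10200', '1250', '1260', '1240', '1140', '20', '30'];
$byBand = filterByPercentBand($prices, median($prices), 40.0);
$byJump = largestGroupByJump($prices, 10.0);
print_r($byBand);
print_r($byJump);
```

With the prices from the question the median is 1245, and both criteria leave only 1250, 1260, 1240 and 1140 (the jump-based version returns them sorted). With real data the reference price should come from the cost estimate described above, not from the sample itself.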

In statistics, such awkward values are called outliers. It is customary to exclude them from the sample because they are so far off that they distort the statistics. But the criterion for what counts as "too far off" is up to you. For that, the nature of the phenomenon under study matters, not just the bare numbers. In your case we know that the numbers are prices, and my answer is built on that fact.

wassapman72, 2020-10-23
@wassapman72

From the comments: if you remove every number that deviates from the average by more than 100%, then 20 and 30 will still remain.
So just choose the threshold you need; you can tie it to $avg.
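$prices = array(
    '10300',
    '10200',
    '1250',
    '1260',
    '1240',
    '1140',
    '20',
    '30'
);

// Arithmetic mean of all prices (3180 for this data).
$avg = array_sum($prices) / count($prices);

// Maximum allowed absolute distance from the mean; tune it for your data.
$threshold = 2100;

// Keep only the prices strictly inside ($avg - $threshold, $avg + $threshold).
// The callback returns a boolean; array_filter() preserves the original keys.
$result = array_filter($prices, function ($p) use ($avg, $threshold) {
    return abs($p - $avg) < $threshold;
});

var_dump($result);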
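
One possible way to tie the threshold to $avg, as suggested above, is to express it as a fraction of the average. This is only a sketch: the variable $ratio and its value 0.65 are assumptions chosen to fit this particular data set.

```php
<?php
$prices = ['10300', '10200', '1250', '1260', '1240', '1140', '20', '30'];

// Arithmetic mean of all prices (3180 for this data).
$avg = array_sum($prices) / count($prices);

// Assumed ratio: allow a deviation of up to 65% of the average.
$ratio = 0.65;
$threshold = $avg * $ratio;

// Keep the prices closer to the mean than the threshold; reindex from 0.
$result = array_values(array_filter($prices, function ($p) use ($avg, $threshold) {
    return abs($p - $avg) < $threshold;
}));

print_r($result);
```

For this data any ratio between roughly 0.65 and 0.99 keeps exactly 1250, 1260, 1240 and 1140. At a ratio of 1.0 or more, 20 and 30 come back, which matches the remark about a 100% deviation.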
