Serhii Silov, 2017-01-13 23:06:59
Probability theory

How to calculate the variance (cannot generate pseudo-samples)?

There are 100 observations.
For each of the observations,
one of four events can occur:
event A - 15.9%
event B - 13.0%
event C - 31.0%
event D - 40.1%
(together - 100%)
In total, we get 100 values (for example, A happened 16 times, B - 13, C - 31, D - 40).
I need to find the SD (sigma) for each event in order to judge, for example, whether it would be a significant deviation from the norm if event C occurred 37 times instead of 31.
(pseudo-samples cannot be generated)

1 answer
Rsa97, 2017-01-14
@Bioinformator

IIRC, the variance of a random variable is the expected value of the squared deviation of that variable from its expected value:
D(X) = M((X − M(X))²)
That is, the concept of "variance" is not applicable to your problem.
What you are describing is, rather, a test of a statistical hypothesis.
k = 100 - number of trials
m = 37 - number of occurrences of event C
p = 0.31 - hypothetical probability of event C
ε = |m/k − p| = |0.37 − 0.31| = 0.06
By Chebyshev's inequality, the probability of a deviation at least this large is bounded by
P{|m/k − p| ≥ ε} ≤ p∙(1 − p)/(ε²∙k) = 0.31∙0.69/(0.06²∙100) ≈ 0.59
So a result like this is entirely plausible.
If instead there had been 1,000,000 trials and C had occurred 370,000 times, the same bound would give a probability ≤ 0.000059, which is extremely unlikely.
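
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the bound used in this answer: the function name chebyshev_bound is made up for illustration, and the cap at 1 is an addition so the returned value always reads as a probability. Keep in mind that this is an upper bound, and usually a loose one, not the exact probability.

```python
def chebyshev_bound(k: int, m: int, p: float) -> float:
    """Chebyshev upper bound on P(|m/k - p| >= eps), where eps = |m/k - p|.

    k: number of trials
    m: observed number of occurrences of the event
    p: hypothetical probability of the event
    (hypothetical helper, named here only for illustration)
    """
    eps = abs(m / k - p)
    if eps == 0:
        # No deviation at all: the bound is trivially 1.
        return 1.0
    # p * (1 - p) / k is the variance of the observed frequency m/k.
    return min(1.0, p * (1 - p) / (eps ** 2 * k))


small = chebyshev_bound(100, 37, 0.31)
large = chebyshev_bound(1_000_000, 370_000, 0.31)

print(f"{small:.2f}")  # 0.59
print(f"{large:.6f}")  # 0.000059
```

The same call works for the other events, for example chebyshev_bound(100, m, 0.401) for event D with whatever count m was observed.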
