What are the algorithms for issuing a result from a given list with a certain probability?

V

vitvad2015-04-17 16:07:02

Algorithms

vitvad, 2015-04-17 16:07:02

Task:
There is an initial set of data in the form of an array. For each element of the array, the probability of its occurrence in the result is indicated.
You need to write a function that each time it is called will return one random element from the array with a given probability.
application examples:
- AB testing - there are 3 pages that should be shown to users when entering the site with a probability of 50%, 35% and 15% respectively
- A dictionary that will show one of the words contained in memory every 3 minutes. In this case, new words will be shown more often than old ones.
The only solution that comes to my mind is the "forehead". SIMPLIFIED version below (JavaScript):

function probabilityRandom(chanceA, chanceB, chanceC, items) {
  var totalWeight = chanceA + chanceB + chanceC;
  // generate random number in range from 1 to `totalWeight`
  var tmp = 1 + (Math.random() * totalWeight);

  if(0 < tmp && tmp <= chanceA){
    return items[0];
  }
  if(chanceA < tmp && tmp <= chanceA + chanceB){
    return items[0];
  }

  if(chanceA + chanceB < tmp && tmp <= chanceA + chanceB + chanceC){
    return items[0];
  }
}

probabilityRandom(50, 35, 15, ["first", "second", "third"]);

Actually the question is - are there any algorithms for this type of problem, and how would you solve them?
links to articles describing similar things, pseudocode and tips are welcome.
PS.

Reply

Answer the question

In order to leave comments, you need to log in

5 answer(s)

[

[email protected]><e, 2015-04-17
@vitvad

Walker method (see here , below). The good thing is that after linear precomputation, you can answer the request in O (1).

B

bobrovskyserg, 2015-04-17
@bobrovskyserg

The most difficult math:
on the x-axis, lay off the segments:
[0.0 .. 0.2) - Vasya
[0.2 .. 0.5) - Petya
[0.5 .. 1.0) - Kuzya Generate
random in the range 0 .. 1
.
Python code:

from random import random
from bisect import bisect_left

data = {"Вася": 2,
        "Петя": 3,
        "Кузя": 5}

scalefactor = sum(data.values())
names, scale = [], [0.]
for name, weight in data.items():
    names.append(name)
    scale.append(scale[-1] + weight / scalefactor)

#  посмотрим, хорошо ли работает
counter = {name: 0 for name in names}

for i in range(10000):
    x = random()
    idx = bisect_left(scale, x) - 1
    name = names[idx]
    counter[name] += 1
print(counter)

A

Adamos, 2015-04-17
@Adamos

The simplest math: generate TWO random numbers.
One is the index of the element in the array.
The second - from 0 to 1, compare with the probability value for this element, given in the same form.
If the generated is greater than the specified one, repeat the algorithm.

K

Konstantin Kitmanov, 2015-04-17
@k12th

Probability and Games , pseudo-code
github://jotson/LootTable.js , JS implementation.

D

DISaccount, 2015-04-17
@DISaccount

Elementary Watson.
With your permission, I will reformulate the problem.
You have a histogram (frequencies of occurrence) of elements. You need that for a thousand experiments, you have the frequency of occurrence of a given element approaching a certain given frequency from the histogram. Then, take an arbitrary range, say from 1 to 1000, and divide it in proportions, according to your histogram. For example. You have 2 elements: one appears with a frequency of 2/3, the second with a frequency of 1/3. Then we divide the interval from 1-1000 into 2 pieces, one from 1 to ~300, the second from ~301 to 1000. Depending on which area the randomly generated value fell into, you take that element.