1. From the beginning of the stream, the algorithm counts every new element that arrives. It tracks the counting using the variable named n. When the algorithm gets into action, the value of n is equivalent to k. 2. Now, new elements arrive, and they increment the value of n. A new element arriving from the stream has a probability of being inserted into the reservoir sample of k/n and a probability of not being inserted equal to (1 – k/n). 3. The probability is verified for each new element that arrives. It’s like a lottery: If the probability is verified, the new element is inserted. On the other hand, if it isn’t inserted, the new element is discarded. If it’s inserted, the algorithm discards an old element in the sample according to some rule (the easiest being to pick an old element at random) and replaces it with the new element. a simple example in Python so that you can see this algorithm in action
1. From the beginning of the stream, the
that arrives. It tracks the counting using the variable named n. When the
algorithm gets into action, the value of n is equivalent to k.
2. Now, new elements arrive, and they increment the value of n. A new element
arriving from the stream has a probability of being inserted into the reservoir
sample of k/n and a probability of not being inserted equal to (1 – k/n).
3. The probability is verified for each new element that arrives. It’s like a lottery:
If the probability is verified, the new element is inserted. On the other hand,
if it isn’t inserted, the new element is discarded. If it’s inserted, the algorithm
discards an old element in the sample according to some rule (the easiest
being to pick an old element at random) and replaces it with the new element.
a simple example in Python so that you can see this algorithm in action
Step by step
Solved in 3 steps with 1 images