#!/usr/bin/python
# -*- coding: utf-8 -*-
"""Attempt to numerically solve a statistics problem posed by Nick Johnson.

As he originally posed it:

> So, I have a stats problem similar to the card collector’s paradox.
> Suppose each item is an n-bit bit vector. Each bit has probability
> 0.5 of being set to 1. How many items do I need to collect such
> that between them, there is a 0 in every position of the vector?
>
> Now, how does that change if the probability of each bit being 1
> can vary? In particular, how many items do you need to collect to
> get to probability 0.5?

I proposed that it would be easy to solve with the method of secants,
but that turns out to be surprisingly tricky: the exponentials
involved easily provoke division by zero and loss of precision in the
method of secants.

Consider the case where n = 256 and each position’s per-trial success
probability is 0.25. For fewer than six trials, the probability of
overall success is indistinguishable from 0 once you subtract 0.5
from it, as you would do to get a function whose root you want to
find. But for 44 or more trials, the probability of success is 0.999
or greater. That means that dividing by the slope, as one does,
projects one rather far out, easily into an effectively constant
region.

Moreover, the straightforward expression Πᵢ (1 - (1 - pᵢ)ʳ) (where
now r is the number of trials) misbehaves for negative r, which poses
problems for the method of secants. Consider ∀i: pᵢ = ½ as before and
evaluate at r = -10: each factor is 1 - 1024 = -1023, and thus,
depending on whether we have an odd or even number of bits in the bit
vector, we get a large negative or a large positive answer. Either
way, we get exponentially large numbers and entirely unhelpful
behavior out of the method of secants.

Even if we hack around that, we still have the problem that the
function is a good approximation of the constant 0 for values that
are too small, then a good approximation of the constant 1 for values
that are too large. For ∀i: pᵢ = 0.025 and n = 256, the success
probability is between 0.001 and 0.999 only from m = 144 to m = 492.
This means that to get the method of secants to converge at all, we
need to start by guessing the right answer to within about an order
of magnitude. (But then it finishes converging to machine precision
within about 10 iterations, and to within an error of 1 in about 7
iterations.)

The standard way to use Newton–Raphson iteration is to first use
binary chop to get “close enough” to guarantee convergence, and only
then start the Newton–Raphson algorithm. It seems that, on functions
like this, the method of secants has the same problem; see the
bracket_then_secants sketch at the end of this file.

A different problem, maybe an easier one to approximate numerically,
is to find the *expected* number of items you need to collect to get
a success at every bit position; what is discussed above is the
*median* number. The expected number is what you get if you iterate
through the item-collecting procedure many times and divide the total
number of items you’ve collected across all the iterations by the
number of times you went through the procedure. For some
distributions, like the Cauchy distribution, such a number doesn’t
even exist; that quotient does not tend toward any well-defined
limit. This distribution does have a well-defined expectation (it’s
the geometric distribution in the case n = 1, and even thinner-tailed
for n > 1), but I don’t know what it is, and it’s probably not the
median; the expected_trials sketch just below attempts to approximate
it numerically.

"""
import itertools
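

# A sketch, not part of the original program, of one way to
# approximate the *expected* number of trials discussed at the end of
# the docstring above. It rests on the standard identity
# E[T] = Σ_{m≥0} P(T > m) for a random variable T taking values in
# the nonnegative integers, where here P(T > m) = 1 - Πᵢ (1 - (1-pᵢ)ᵐ)
# is the probability that some position is still without a success
# after m trials. The truncation threshold tail_eps is an arbitrary
# choice of mine; the tail decays geometrically, so cutting it off
# costs only a small error.
def expected_trials(ps, tail_eps=1e-12):
    """Approximate the expected number of trials to succeed at every position.

    As a sanity check, expected_trials([0.5]) returns ≈2.0, the mean
    1/p of the geometric distribution mentioned in the docstring.
    """
    total = 0.0
    m = 0
    while True:
        everywhere = 1.0           # probability of a success at every position by trial m
        for pi in ps:
            everywhere *= 1 - (1 - pi)**m
        survival = 1 - everywhere  # P(T > m)
        total += survival
        if survival < tail_eps:
            return total
        m += 1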


def p(ps, m):
    """Calculate the universal success probability for m trials with probabilities ps.

    That is, we have some vector of success probabilities ps for n
    different Bernoulli processes, and we go through m trials. What
    is the probability that we have at least one success in every
    process?
    """
    if m < 0:
        # This is a bogus case, so let’s return something continuous
        # with the right sign and a reasonable derivative to appease
        # the method of secants.
        return -(-m)**0.5
    success = 1.0
    for pi in ps:                  # renamed from p so it doesn’t shadow this function
        success *= 1 - (1 - pi)**m
    return success


def sec_seq(f, x0, x1):
    "The method of secants for finding a zero of a continuous function."
    y = f(x0), f(x1)
    # Stop when the secant is flat: either we have converged (both
    # values equal, typically 0) or we have landed in one of the
    # effectively constant regions described in the module docstring.
    while y[1] != y[0]:
        yield x1
        # Slide to the point where the secant line crosses zero,
        # keeping the previous point around for the next slope.
        x0, x1 = x1, x1 - y[1]*(x1 - x0)/(y[1] - y[0])
        y = y[1], f(x1)


if __name__ == '__main__':
    f = lambda x: p([0.025 for i in range(256)], x) - 0.5
    for x in itertools.islice(sec_seq(f, 50, 500), 0, 20):
        print(x)
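

# A hedged sketch, not part of the original program: do the “binary
# chop first” step that the module docstring describes for
# Newton–Raphson, then hand the narrowed bracket off to sec_seq. It
# assumes f is increasing with f(lo) < 0, which holds for the
# success-probability function above; the doubling schedule and the
# bracket width of 4 are arbitrary choices of mine, not tuned values.
def bracket_then_secants(f, lo=1.0, hi=2.0, width=4.0):
    while f(hi) < 0:               # grow hi geometrically until we bracket the root
        lo, hi = hi, hi * 2
    while hi - lo > width:         # binary chop the bracket down to a narrow one
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return sec_seq(f, lo, hi)      # starting points close enough for the secants

if __name__ == '__main__':
    # The same example as above, but without having to guess 50..500.
    g = lambda x: p([0.025 for i in range(256)], x) - 0.5
    for x in itertools.islice(bracket_then_secants(g), 0, 20):
        print(x)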