#!/usr/bin/python
# -*- coding: utf-8 -*-
"""Optimal addition chains.
In Stepanov's Three Algorithmic Journeys, he poses the problem
(Exercise 1.4 on p. 18, the 27th page, of draft 0.3) of finding
optimal addition chains for multiplying by N, for any N under 100. He
illustrates the problem by providing a multiply-by-15 algorithm:

    b = a + a + a
    c = b + b
    c + c + b

This uses five additions, while the standard Rhind-papyrus
multiplication algorithm (aka mediation and duplation, Ethiopian
multiplication, or Russian peasant multiplication, and used more or
less universally in modern computers) uses six:

    d = a + a + a
    e = d + d + a
    e + e + a

This seems like a problem that's eminently suited to dynamic
programming. But it turns out that it doesn't yield quite so easily;
in the above case, c is 6a, and is achieved in only 3 additions, but
in a way that, it turns out, pleasantly computes 2a and 3a rather than
the usual 2a and 4a as its components. Computing 6a as 2a + 4a rather
than 3a + 3a is just fine as an optimal addition chain for 6a, but the
second one is better when you want to compute 15a, because if you
computed 12a in a way using four additions that gives you 3a for free,
then you only need one more addition to get to 15a.

Worse, the best addition chain to use to compute some subpart of your
final sum may not even be optimal for that subpart, I think. Two
suboptimal chains for different subparts might have enough overlap
that they turn out to jointly comprise the optimal addition chain.
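
To make the pitfall concrete, here is a quick check, using a chain
test like the is_chain function below, that both four-addition routes
to 12a are valid chains, but only the one passing through 3a extends
to 15a with a single further addition:

```python
def is_chain(xs):
    # Same test as is_chain below: every member except 1 must be the
    # sum of two members (possibly the same member used twice).
    return all(c == 1 or any(c - a in xs for a in xs) for c in xs)

via_doubling = {1, 2, 4, 8, 12}  # 12a as 8a + 4a
via_three = {1, 2, 3, 6, 12}     # 12a as 6a + 6a, computing 3a on the way

assert is_chain(via_doubling) and is_chain(via_three)  # both: 4 additions
assert is_chain(via_three | {15})         # 15a = 12a + 3a, one more addition
assert not is_chain(via_doubling | {15})  # no two members here sum to 15
```
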
An "addition chain for K" is some set of numbers such that:
- K is in the set.
- Every number in the set, except 1, is the sum of two members of
  the set (not necessarily distinct).

This represents a procedure for calculating K times a number x:
calculate each of the multiples nx, except for 1x itself, by adding two
of the multiples you've already calculated. A smallest such set
represents an optimal addition chain for K; if it has N members, it
represents a chain needing N-1 additions.
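
That procedure can be sketched directly; evaluate_chain is an
illustrative name introduced here, and it assumes it is handed a
valid chain:

```python
def evaluate_chain(chain, x):
    """Multiply x by max(chain) using one addition per member above 1."""
    multiples = {1: x}  # n -> n*x for each chain member computed so far
    additions = 0
    for n in sorted(chain):
        if n == 1:
            continue
        # Each member above 1 is a sum of two strictly smaller members,
        # so by the time we reach n, some split n = a + (n - a) is known.
        a = next(a for a in multiples if n - a in multiples)
        multiples[n] = multiples[a] + multiples[n - a]
        additions += 1
    return multiples[max(chain)], additions

# The 6-member chain for 15 above takes 5 additions:
# evaluate_chain({1, 2, 3, 6, 12, 15}, x) returns (15*x, 5).
```
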
A brute-force approach needing no worse than 2**(K/2) time is simply
to search. For any partial addition chain, you can search through all
the possible ways to complete it, recursively, starting by trying to
complete {1, K-1} (unless the chain is already complete), and then
trying {2, K-2}, and so on, until the two summands cross. Then you just
take the shortest of those K/2 possibilities and add K to it.

In that form, it's pretty much exactly 2**(K/2) (it needs 2872
recursions to find the optimal chain for 15), but with memoization,
it's probably better (needing only 302). But with memoization, it
needs potentially 2**(K/2) space.

An optimal chain for 63, needing 8 additions, is [1, 2, 3, 6, 12, 24,
48, 60, 63]. The Rhind algorithm instead uses the chain [1, 2, 4, 8,
16, 32, 48, 56, 60, 62, 63], needing 10 additions. I expect these
cases of 2^N-1 to be a kind of worst case for the Rhind algorithm,
while 2^N should be the best case.
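
Assuming the Rhind method doubles up to the highest power of two and
then adds one row per extra set bit, its addition count is easy to
tabulate (rhind_additions is a name introduced here, not part of the
program below):

```python
def rhind_additions(n):
    # floor(log2 n) doublings, plus one addition for each set bit
    # in n's binary representation beyond the first.
    return (n.bit_length() - 1) + (bin(n).count('1') - 1)

# 2**N - 1 (all bits set) maximizes the second term, while 2**N
# (a single bit) zeroes it out.
assert rhind_additions(63) == 10  # matches the 10-addition chain above
assert rhind_additions(64) == 6   # pure doubling: the best case
assert rhind_additions(15) == 6   # the six-addition example for 15
```
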
This program is able to calculate optimal addition chains for numbers
up to 40 almost instantly, and up to 75 in about a minute. It gets to
90 in about 5 minutes, and 100 in 143 minutes (just over 17 minutes of
CPU time).

What is this useful for? Probably nothing, in practice, because it
counts doubling a number as being just as expensive as adding two
arbitrary numbers. But, in circuitry, doubling a number — or actually
multiplying it by any power of 2 — is just a matter of hooking up the
wires a little to the left. And for manual decimal calculation,
multiplying 2830 by 63 is probably easier by using the multiplication
table:

        2830
     *    63
     -------
        8490
    + 16980
     -------
      178290

than by using the addition chain:

    2830 + 2830 = 5660 (2x)
    5660 + 2830 = 8490 (3x)
    8490 + 8490 = 16980 (6x)
    16980 + 16980 = 33960 (12x)
    33960 + 33960 = 67920 (24x)
    67920 + 67920 = 135840 (48x)
    135840 + 33960 = 169800 (60x) (note that in base ten this is 6x shifted 1)
    169800 + 8490 = 178290 (63x)

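
As a sanity check on the arithmetic above, the eight chain steps can
be replayed directly:

```python
x = 2830
multiples = {1: x}
# Each tuple records (result, summand, summand) from the worked chain.
for n, a, b in [(2, 1, 1), (3, 2, 1), (6, 3, 3), (12, 6, 6),
                (24, 12, 12), (48, 24, 24), (60, 48, 12), (63, 60, 3)]:
    multiples[n] = multiples[a] + multiples[b]

assert multiples[63] == 63 * x == 178290
assert multiples[60] == 169800  # the 6x partial product, shifted one place
```
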
But I thought it was an interesting problem, anyway.
"""
import sys


def main(n):
    for ii in range(1, n):
        print(sorted(optimal_chain({ii})))


chains = {}


def optimal_chain(start):
    start = frozenset(start)
    # Avoid uselessly duplicating work (and storage); 1x will always
    # be known, so there's no use considering, and memoizing, the case
    # where we don't compute it.
    if 1 not in start:
        start = frozenset(start | {1})
    # print("trying to reach", start)
    if start in chains:
        return chains[start]
    if is_chain(start):
        # Empirically, memoizing this case to avoid calls to is_chain
        # slows the program down.
        return start
    K = max(start)
    candidates = []
    reduced = start - {K}
    for ii in range(1, K):
        Ki = K - ii
        if Ki < ii:
            break
        candidates.append(optimal_chain(reduced | {Ki, ii}))
    result = min(candidates, key=len) | {K}
    chains[start] = result
    return result


def is_chain(xs):
    # This is an O(N^2) algorithm, and since N gets up to about 10,
    # you might conceivably be able to get a speedup of up to around
    # 10 by optimizing it.
    return all(c == 1 or any(c - a in xs for a in xs) for c in xs)


if __name__ == '__main__':
    main(int(sys.argv[1]) if len(sys.argv) > 1 else 101)