#!/usr/bin/python3
"""Manipulate Boolean expressions as Жега́лкин polynomials.

Overview
========

In 01927 Ива́н Ива́нович Жега́лкин ("Ivan Ivanovich Zhegalkin")
reformulated Boolean algebra as the theory of the polynomial ring
over ℤ/2ℤ.  This module implements these Zhegalkin polynomials (aka
"algebraic normal form"), which are a particularly simple way to
calculate Boolean functions, though one that is not as efficient as
OBDDs (and, arguably, no simpler).  Actually, it implements Zhegalkin
*expressions*, not limited to polynomials.  It implements an embedded
domain-specific language in Python for them using the standard Python
bitwise operators, which it compiles down to mod-2 addition and
multiplication.

Usage
=====

You can import some variables into your namespace with the
``symbols()`` function, similar to the sympy function of the same
name::

    >>> x, y, z = symbols('x y z')
    >>> x + y
    >>> ~x

That's x XOR 1, which is NOT x::

    >>> x + 1

Note that the polynomial has been normalized.  If we XOR with 1 again
we get the original variable::

    >>> ~x + 1

Equality testing is supported using this polynomial normalization::

    >>> x + 1 == x ^ 1
    True

Variable equality is by name rather than by identity::

    >>> x == Var('x')
    True

(Variable names can be whatever you want, but the output will be
ambiguous and confusing if you name your variables things like
"x + 1".)

XORing in a variable again cancels it out::

    >>> x + y + 1 + y

Boolean disjunction is implemented in terms of AND and XOR::

    >>> x | y

.distribute()
-------------

Nested polynomials are not automatically distributed out::

    >>> ~x | ~y

For that there is a ``.distribute()`` method which guarantees
normalization, so that intensional equality of the polynomials
corresponds exactly to extensional equality of the Boolean function
expressed::

    >>> (~x | ~y).distribute()

That is, NOT x OR NOT y is 1 XOR x AND y.  In this way we can check
De Morgan's laws::

    >>> (~(~x | ~y)).distribute() == (x & y).distribute()
    True
    >>> (~(~x & ~y)).distribute() == (x | y).distribute()
    True

.subst()
--------

There's a ``.subst()`` method that allows us to evaluate such an
expression for a given value::

    >>> (x | y).subst({x: 0})
    >>> (x | y).subst({x: 1})

Or to see what happens if two inputs are tied together::

    >>> (x | y).subst({x: y})

Or if we replace one of them with some other function::

    >>> (x | y).subst({y: z & y})

Full adder example and function-call syntactic sugar
----------------------------------------------------

In this way we can define, for example, a generic half-adder::

    >>> sumbit = x ^ y
    >>> carry = x & y

And then we can chain it off itself to get a full adder by
reconnecting its inputs to something else that isn't x and y, then
combining the carries::

    >>> sumbit.subst({x: sumbit, y: z})
    >>> (carry.subst({x: sumbit, y: z}) | carry).distribute()

We can see that this is correct: in cases where only one input bit is
1, all three terms will be 0; in cases where two input bits are 1,
one of the terms will be 1; and in cases where all three inputs are
1, all three terms will be 1, and so too will their XOR.

For convenience we can also write this with Python function-call
syntax with named arguments::

    >>> sumbit(x=sumbit, y=z)
    >>> (carry(x=sumbit, y=z) | carry).distribute()

.st()
-----

There's an ``.st()`` method to produce a short string representation,
which can be ambiguous if you have multi-letter variable names::

    >>> (~(carry(x=sumbit, y=z) | carry)).distribute().st()
    '1 + xy + xz + yz'
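
As a more exhaustive check of the correctness argument above, we can
substitute every combination of inputs into the full-adder carry and
confirm that it computes the three-input majority function; the
``fulladder_carry`` name here is introduced only for this
illustration::

    >>> fulladder_carry = (carry(x=sumbit, y=z) | carry).distribute()
    >>> [int(str(fulladder_carry.subst({x: a, y: b, z: c})))
    ...  for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    [0, 0, 0, 1, 0, 1, 1, 1]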

Further examples
================

Full adder with Python functions
--------------------------------

Of course the usual complement of Python conveniences is available,
so we can also define a half adder as a real Python function::

    >>> halfadder = lambda a, b: (a ^ b, a & b)
    >>> halfadder(x, y)
    >>> s1, c1 = halfadder(x, y)
    >>> s2, c2 = halfadder(s1, z)
    >>> c3 = c1 | c2   # carry out of a full adder
    >>> s2
    >>> c3.distribute()

We can use ``.subst()`` to evaluate this for particular cases if
we're concerned that it might be wrong::

    >>> c3.subst({x: 1, y: 0, z: 1})

Or, again, we can tie two inputs together::

    >>> c3.subst({x: y})

That is, the carry out of a full adder two of whose inputs are tied
together is the value found on those two inputs.

Multiplexers
------------

In the same way, we can write a function that computes a 2-input
multiplexer::

    >>> mux2 = lambda s, d0, d1: s & d1 | ~s & d0
    >>> mux2(x, y, z).distribute()

We can see that this expression is correct, even if unfamiliar; if x
is 0, then only the first term y can be nonzero, because it's the
only x-free term, while if x is 1, then the second term cancels out y
from the output, and the third term gates z through.

We can also mechanically verify this with ``.subst()``::

    >>> mux2(x, y, z).subst({x: 1})
    >>> mux2(x, y, z).subst({x: 0})

We can compose three such 2-muxes to make a 4-mux::

    >>> u, v, w = symbols('u v w')
    >>> mux2(u, mux2(v, w, x), mux2(v, y, z)).distribute()
    >>> m4 = _
    >>> [m4.subst({u: ui, v: vi}) for ui in [0, 1] for vi in [0, 1]]

Satisfiability and performance
------------------------------

Because intensional equality of normalized Zhegalkin polynomials is
equivalent to extensional equality, you can evaluate the
satisfiability of an expression by normalizing it and checking to see
if it normalized to 0; all unsatisfiable expressions will normalize
to zero::

    >>> ((~x | ~y) & (x | ~z) & (y | ~z) & z).distribute()

Though we can't fault Jorge Fernandez Davila for writing ZPSAT, this
does not prove that P = NP because a CNF expression can explode to an
exponentially large size in this process.  In the following case
``.distribute()`` is noticeably slow even with a very modest number
of terms due to a lack of attention to efficiency in the
normalization process, though that could be remedied::

    >>> ((~u|v) & (~v|w) & (~w|x) & (~x|y) & (~y|z))
    >>> _.distribute()
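
As a quick check, the unsatisfiable conjunction above really does
normalize to ``zero``, while the same conjunction without its first
clause is satisfiable and does not::

    >>> ((~x | ~y) & (x | ~z) & (y | ~z) & z).distribute() == zero
    True
    >>> ((x | ~z) & (y | ~z) & z).distribute() == zero
    False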

Primitive binary function enumeration
-------------------------------------

We can compute the normalized Zhegalkin polynomial for an arbitrary
truth table with three lines of code; here we list all the two-input
binary functions::

    >>> for n in range(16):
    ...     print('{:04b}'.format(n),
    ...           sum([(x + (~i >> 1 & 1))
    ...                * (y + (~i & 1))
    ...                * ((n >> i) & 1) for i in range(4)]).distribute())
    ...
    0000 0
    0001 1 + x + y + x * y
    0010 y + x * y
    0011 1 + x
    0100 x + x * y
    0101 1 + y
    0110 x + y
    0111 1 + x * y
    1000 x * y
    1001 1 + x + y
    1010 y
    1011 1 + x + x * y
    1100 x
    1101 1 + y + x * y
    1110 x + y + x * y
    1111 1

If we assume that each XOR or AND operator has unit cost, then of
these, the normalized polynomial is a cheapest way to compute all but
``1 + x + y + x * y``, which can be computed in only three operations
instead of four::

    >>> print(((x+1)*(y+1)).distribute())
    1 + x + y + x * y

We can look at this as calculating a given row as a mod-2 sum of some
subset of the following four Zhegalkin-monomial rows::

    1111 1
    1010 y
    1100 x
    1000 x * y

Though you might think so, this is not the Hadamard-Walsh matrix of
order 2 (note the missing 1 in the last row); rather, it's a
symmetric Sierpiński triangle matrix::

    >>> import functools, operator
    >>> monom = lambda n, vs: product(vs[i] for i in range(len(vs))
    ...                               if 1 & (n >> i))
    >>> product = lambda xs: functools.reduce(operator.mul, xs, one)
    >>> tbl = lambda f, vs: ''.join(
    ...     str(f.subst({vs[i]: 1 & (n >> i) for i in range(len(vs))}))
    ...     for n in range(2**len(vs)))[::-1]
    >>> row = lambda f, vs: (tbl(f, vs), f)
    >>> for i in range(8): print('{} {}'.format(*row(monom(i, [x, y, z]),
    ...                                              [x, y, z])))
    ...
    11111111 1
    10101010 x
    11001100 y
    10001000 x * y
    11110000 z
    10100000 x * z
    11000000 y * z
    10000000 x * y * z

We can think of the Zhegalkin polynomials as being a linear function
of a coefficient vector *a* so that the polynomial in, say, three
variables is a₀1 + a₁x + a₂y + a₃xy + a₄z + a₅xz + a₆yz + a₇xyz.  The
truth table is a linear function of that coefficient vector (it's the
product of the matrix above and the coefficient vector), and since
the matrix is evidently nonsingular, the coefficient vector is also a
linear function of the truth table.  We can easily rearrange that
into a lower triangular matrix, which is evidently easy to solve with
linear algebra for any desired truth table::

    10000000 x * y * z
    11000000 y * z
    10100000 x * z
    11110000 z
    10001000 x * y
    11001100 y
    10101010 x
    11111111 1

Or you can just solve it by brute force::

    >>> for n in range(8): print('{:08b}'.format(1 << n),
    ...                          sum([(x + (~i & 1))
    ...                               * (y + (~i >> 1 & 1))
    ...                               * (z + (~i >> 2 & 1))
    ...                               * (((1 << n) >> i) & 1)
    ...                               for i in range(8)]).distribute())
    ...
    00000001 1 + x + y + z + x * y + x * z + y * z + x * y * z
    00000010 x + x * y + x * z + x * y * z
    00000100 y + x * y + y * z + x * y * z
    00001000 x * y + x * y * z
    00010000 z + x * z + y * z + x * y * z
    00100000 x * z + x * y * z
    01000000 y * z + x * y * z
    10000000 x * y * z

That is, if you want the truth table 10010010, you can add together
the polynomials for each of the 1 bits::

    >>> f = ( x * y * z
    ...       + z + x * z + y * z + x * y * z
    ...       + x + x * y + x * z + x * y * z)
    >>> f
    >>> print(*row(f, [x, y, z]))
    10010010 x + z + x * y + y * z + x * y * z

The two ``x * z`` occurrences cancel out.

Some notes on the implementation
================================

The module attempts to do shallow normalization of algebraic
expressions immediately upon creation, but not distribution.  All the
operators we're using are commutative and associative, so we
normalize N-ary sums and products to be sorted.  That is, both
``x + y`` and ``y + x`` are represented as ``x + y``, and similarly
for ``x * y`` and ``y * x``.  Also, all Python's operators are
left-associative except exponentiation, which we aren't using, so we
also normalize them to associate to the left.  That is, both
``(x + y) + z`` and ``x + (y + z)`` are converted into ``(x + y) + z``.
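
For example, both orderings and both groupings normalize to the same
representation, which we can check by printing the normalized forms
or by comparing them for equality::

    >>> print(y + x)
    x + y
    >>> print(x + (y + z))
    x + y + z
    >>> x + (y + z) == (x + y) + z
    True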

That way shallow equality can be preserved.  Making it possible to
sort a heterogeneous sum means we need an ordering on not just
variable names but arbitrary expressions.  This in turn means we need
to be careful to exclude sub-sums from the ordering in a sum, and
sub-products from the ordering in a product.

The function-call syntax only works because variables are compared by
name instead of identity.

Now that I've written this, an obvious improvement would be to use
N-ary sums and N-ary products rather than binary ones.  The 0-ary sum
is 0; the 0-ary product is 1; and it becomes easy to sort the terms
and factors.  Normalization into a flat polynomial (where none of the
factors are sums) can then proceed more rapidly by maintaining a
"context stack" of factors that we are currently multiplying by on
the left as we iterate over the terms of the sum to the right, or
something like that.

"""


class Expr:
    # XOR group delegates to + for overriding
    def __add__(self, other):
        other = as_expr(other)
        if other == self:
            return zero         # x ^ x == 0
        # normalize x + (a + b) to (x + a) + b
        if isinstance(other, Sum):
            return self + other.a + other.b
        # insertion sort, base case of two terms
        if not isinstance(self, Sum) and other < self:
            return other + self
        return Sum(self, as_expr(other))

    def __radd__(self, other):
        return as_expr(other) + self

    def __xor__(self, other):
        return self + other

    def __rxor__(self, other):
        return as_expr(other) ^ self

    # AND group delegates to * for overriding
    def __mul__(self, other):
        other = as_expr(other)
        if other == self:
            return self         # x & x == x
        # normalize x * (a * b) to (x * a) * b
        if isinstance(other, Product):
            return self * other.a * other.b
        # insertion sort, base case of two factors
        if not isinstance(self, Product) and other < self:
            return other * self
        return Product(self, as_expr(other))

    def __rmul__(self, other):
        return as_expr(other) * self

    def __and__(self, other):
        return self * other

    def __rand__(self, other):
        return as_expr(other) & self

    # OR and NOT are defined in terms of the above.
    def __or__(self, other):
        return self + other + self * other

    def __ror__(self, other):
        return as_expr(other) | self

    def __invert__(self):
        return self + 1

    def __zhegalkin__(self):
        "Magic method for converting things to Zhegalkin expressions."
        return self

    def str_in_env(self, left_prec, right_prec):
        "Format for syntactic environment of specified precedences."
        s = str(self)
        # Assume all operators are left-associative, as the relevant ones
        # are in Python.
        return s if left_prec < self.precedence >= right_prec else '(%s)' % s

    def st_in_env(self, left_prec, right_prec):
        "Same, but for short output."
        s = self.st()
        # Assume all operators are left-associative, as the relevant ones
        # are in Python.
        return s if left_prec < self.precedence >= right_prec else '(%s)' % s

    def __repr__(self):
        return "<%s %s>" % (self.__class__.__name__, self)

    # Python 3 eliminated traditional __cmp__, so here we reimplement
    # it so as to avoid needing to write even *more* cases.  Sigh.
    def __le__(self, other):
        return not self > other

    def __ge__(self, other):
        return not self < other

    def __ne__(self, other):
        return not self == other

    def __lt__(self, other):
        return cmp(self, other) < 0

    def __eq__(self, other):
        return cmp(self, other) == 0

    def __gt__(self, other):
        return other < self

    def __call__(self, **kwargs):
        return self.subst({Var(k): v for k, v in kwargs.items()})


def as_expr(obj):
    "Coerce an object to a zhegalkin.Expr, if possible."
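    # Anything with a __zhegalkin__ method (i.e. any Expr) converts
    # itself; the Python ints 0 and 1 map onto the constant singletons
    # defined below.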
    if hasattr(obj, '__zhegalkin__'):
        return obj.__zhegalkin__()
    if obj == 0:
        return zero
    if obj == 1:
        return one
    raise ValueError(obj)


try:
    cmp
except NameError:
    def cmp(a, b):
        "Reimplementation of some of Python 1/2 cmp()."
        if type(a) is type(b):
            if type(a) is str:
                return -1 if a < b else 0 if a == b else 1
            if type(a) is int:
                return a - b
        return a.__cmp__(b)


class Atom(Expr):
    precedence = 20

    def distribute(self):
        return self

    def st(self):
        return str(self)


class Zero(Atom):
    def __str__(self):
        return '0'

    def __cmp__(self, other):
        if not isinstance(other, Expr):
            raise TypeError(other)
        if isinstance(other, Zero):
            return 0
        return -1               # everything else is bigger

    def __mul__(self, other):
        return self

    def __add__(self, other):
        return as_expr(other)

    def subst(self, _):
        return self

zero = Zero()


class One(Atom):
    def __str__(self):
        return '1'

    def __cmp__(self, other):
        if not isinstance(other, Expr):
            raise TypeError(other)
        if isinstance(other, Zero):
            return 1            # Zero is littler
        if isinstance(other, One):
            return 0
        return -1

    def __add__(self, other):
        other = as_expr(other)
        if other is zero:
            return self
        if other is one:
            return zero
        return super().__add__(other)

    def __mul__(self, other):
        return as_expr(other)

    def subst(self, _):
        return self

one = One()


class Var(Atom):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __cmp__(self, other):
        if not isinstance(other, Expr):
            raise TypeError(other)
        if isinstance(other, Zero) or isinstance(other, One):
            return 1            # Constants are the littlest XXX backwards?
        if not isinstance(other, Var):
            return -1           # Vars are the littlest except constants
        return cmp(self.name, other.name)

    def __hash__(self):
        return -hash(self.name)

    def subst(self, bindings):
        b = bindings.get(self, None)
        if b is not None:
            return as_expr(b)
        return self


def symbols(s):
    "Convenience function similar to sympy.symbols()."
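    # One Var per whitespace-separated name, e.g. symbols('x y z').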
    return [Var(si) for si in s.strip().split()]


class Sum(Expr):
    precedence = 10

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __cmp__(self, other):
        if not isinstance(other, Expr):
            raise TypeError(other)
        if isinstance(other, Atom):
            return 1            # Vars and constants are littler
        if isinstance(other, Sum):
            return cmp(self.a, other.a) or cmp(self.b, other.b)
        return -1               # Sums are littler than anything else

    def __add__(self, other):
        other = as_expr(other)
        # recursively flatten
        if isinstance(other, Sum):
            return self + other.a + other.b
        # self-XOR not at the end of a list
        if other == self.b:
            return self.a
        # insertion sort, recursive case
        if other < self.b:
            return self.a + other + self.b
        return super().__add__(other)

    def __str__(self):
        return '%s + %s' % (self.a.str_in_env(0, self.precedence),
                            self.b.str_in_env(self.precedence, 0))

    def distribute(self):
        return self.a.distribute() + self.b.distribute()

    def subst(self, bindings):
        return self.a.subst(bindings) + self.b.subst(bindings)

    def st(self):
        return '%s + %s' % (self.a.st_in_env(0, self.precedence),
                            self.b.st_in_env(self.precedence, 0))


class Product(Expr):
    precedence = 15

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __mul__(self, other):
        other = as_expr(other)
        # recursively flatten
        if isinstance(other, Product):
            return self * other.a * other.b
        # self-AND not at the end of a list absorbs
        if other == self.b:
            return self
        # insertion sort, recursive case
        if other < self.b:
            return self.a * other * self.b
        return super().__mul__(other)

    def __cmp__(self, other):
        if not isinstance(other, Expr):
            raise TypeError(other)
        if isinstance(other, Atom) or isinstance(other, Sum):
            return 1            # Vars, constants, and sums are littler
        if isinstance(other, Product):
            return cmp(self.a, other.a) or cmp(self.b, other.b)
        return -1               # Products are littler than anything else

    def __str__(self):
        return '%s * %s' % (self.a.str_in_env(0, self.precedence),
                            self.b.str_in_env(self.precedence, 0))

    def distribute(self):
        if isinstance(self.a, Sum):
            return (self.a.a * self.b + self.a.b * self.b).distribute()
        if isinstance(self.b, Sum):
            return (self.a * self.b.a + self.a * self.b.b).distribute()
        return self.a.distribute() * self.b.distribute()

    def subst(self, bindings):
        return self.a.subst(bindings) * self.b.subst(bindings)

    def st(self):
        return '%s%s' % (self.a.st_in_env(0, self.precedence),
                         self.b.st_in_env(self.precedence, 0))


if __name__ == '__main__':
    import doctest
    doctest.testmod(optionflags=doctest.NORMALIZE_WHITESPACE)