The library jaakko
====

I was thinking about APL, Codd's relational model, modal logic, and Pandas,
and I designed a little library called jaakko.

In [1]:
import jaakko

Let's consider a quantity whose value is 5 in the possible world where x is 3 and y is 4.  Such conditional
quantities are called "columns", by analogy to database table columns.

In [11]:
r = jaakko.Column(('x', 'y'), {(3, 4): 5})
r

x,y,Unnamed: 2
3,4,5


You can add new values to a column; perhaps we want $r$ to take the value 10 when $x = 6 \wedge y = 8$.

In [12]:
r.append((6, 8), 10)
r

x,y,Unnamed: 2
3,4,5
6,8,10


You can do the usual arithmetic operations on column values.

In [7]:
r * r

x,y,Unnamed: 2
3,4,25
6,8,100


In [9]:
r + 2

x,y,Unnamed: 2
3,4,7
6,8,12
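
jaakko's internals are not shown here, but the arithmetic above can be approximated with a plain dict from key tuples to values. The following is a rough sketch under that assumption; `lift` is a hypothetical helper for illustration, not part of jaakko.

```python
import operator

# Assumption: a column is modelled as a plain dict from key tuples to values.
# This is an illustration only, not jaakko's implementation.
r = {(3, 4): 5, (6, 8): 10}

def lift(op, a, b):
    """Apply op row by row; b is either a column with the same key variables or a scalar."""
    if isinstance(b, dict):
        # Column-with-column: combine the values found under the same key.
        return {k: op(a[k], b[k]) for k in a if k in b}
    # Column-with-scalar: apply the scalar to every row.
    return {k: op(v, b) for k, v in a.items()}

squared = lift(operator.mul, r, r)  # plays the role of r * r
shifted = lift(operator.add, r, 2)  # plays the role of r + 2
print(squared)
print(shifted)
```

In this model a scalar behaves like a column with an empty key: it has the same value in every possible world.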


You can also compute aggregate functions.  For this purpose, let's look at a different column, one from Codd's
paper from 01970: the quantity $q$ of part number $p$ that project $pj$ requests from supplier $s$.

In [14]:
q = jaakko.Column(('s', 'p', 'pj'), {
    (1, 2, 5): 17,
    (1, 3, 5): 23,
    (2, 3, 7): 9,
    (2, 7, 5): 4,
    (4, 1, 1): 12,
})
q

s,p,pj,Unnamed: 3
1,2,5,17
1,3,5,23
2,3,7,9
2,7,5,4
4,1,1,12


We might want to know, for example, the total number of parts required by each project.  This requires
summing $q$ across values of $s$ and $p$.  Note that this is the opposite of how we formulate the problem
in SQL, where we would name the variable to keep (GROUP BY pj) rather than the variables to sum away.

In [15]:
q.aggregate(('s', 'p'), sum)

pj,Unnamed: 1
5,44
7,9
1,12
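
To make "summing across values of $s$ and $p$" concrete, here is a minimal sketch in plain Python. It assumes a column can be modelled as a tuple of key names plus a dict of rows; the `aggregate` function below is a hypothetical stand-in and says nothing about how jaakko implements its own method.

```python
from collections import defaultdict

def aggregate(names, rows, drop, fn):
    """Group rows by the key variables NOT listed in drop, then apply fn to each group."""
    keep = [i for i, n in enumerate(names) if n not in drop]
    groups = defaultdict(list)
    for key, value in rows.items():
        # Project the key onto the surviving variables.
        groups[tuple(key[i] for i in keep)].append(value)
    kept_names = tuple(names[i] for i in keep)
    return kept_names, {k: fn(vs) for k, vs in groups.items()}

# Same data as q above, modelled as key names plus a dict (an assumption).
q_names = ('s', 'p', 'pj')
q_rows = {
    (1, 2, 5): 17,
    (1, 3, 5): 23,
    (2, 3, 7): 9,
    (2, 7, 5): 4,
    (4, 1, 1): 12,
}

kept, totals = aggregate(q_names, q_rows, ('s', 'p'), sum)
print(kept, totals)
```

Because Python dicts preserve insertion order, the sketch yields the projects in the order 5, 7, 1, the order in which they first appear in the data.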


More plausibly, we'd like to know the total cost of the parts required by each project.  Perhaps the price
depends on both the supplier and the part number.

In [26]:
price = jaakko.Column(('s', 'p'), {
    (1, 2): 1.99,
    (1, 3): 2.79,
    (2, 3): 3.09,
    (2, 7): 8.99,
    (4, 1): 0.49,
})
price

s,p,Unnamed: 2
1,2,1.99
1,3,2.79
2,3,3.09
2,7,8.99
4,1,0.49


We can simply multiply the two columns, even though they have different keys;
rows are matched on the key variables the columns share ($s$ and $p$), and the
result is keyed by all the variables of both.

In [27]:
q * price

p,pj,s,Unnamed: 3
1,1,4,5.88
2,5,1,33.83
3,5,1,64.17
3,7,2,27.81
7,5,2,35.96
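
Multiplying columns with different keys behaves like a natural join followed by a per-row product. The sketch below illustrates that reading with plain dicts. `multiply` is a hypothetical helper, and the nested loop is chosen for clarity rather than speed; jaakko may well do this differently.

```python
def multiply(names_a, a, names_b, b):
    """Multiply two columns, matching rows on the key variables they share."""
    shared = [n for n in names_a if n in names_b]
    ia = [names_a.index(n) for n in shared]
    ib = [names_b.index(n) for n in shared]
    # Key variables that appear only in b are appended to the result key.
    extra = [i for i, n in enumerate(names_b) if n not in names_a]
    out_names = tuple(names_a) + tuple(names_b[i] for i in extra)
    out = {}
    for ka, va in a.items():
        for kb, vb in b.items():
            if all(ka[i] == kb[j] for i, j in zip(ia, ib)):
                out[ka + tuple(kb[i] for i in extra)] = va * vb
    return out_names, out

# Same data as q and price above, modelled as key names plus dicts (an assumption).
q_names = ('s', 'p', 'pj')
q_rows = {(1, 2, 5): 17, (1, 3, 5): 23, (2, 3, 7): 9, (2, 7, 5): 4, (4, 1, 1): 12}
price_names = ('s', 'p')
price_rows = {(1, 2): 1.99, (1, 3): 2.79, (2, 3): 3.09, (2, 7): 8.99, (4, 1): 0.49}

names, cost = multiply(q_names, q_rows, price_names, price_rows)
for key, value in cost.items():
    print(dict(zip(names, key)), round(value, 2))
```

The sketch keeps the key variables in the order s, p, pj, whereas the output above lists them as p, pj, s; the rows themselves are the same.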


And from there we can easily calculate the totals as before.

In [29]:
def materials_budget():
    return (q * price).aggregate(('s', 'p'), sum)

materials_budget()

pj,Unnamed: 1
1,5.88
5,133.96
7,27.81


But suppose, not so implausibly, that projects require different numbers of parts in different years.

In [34]:
q = jaakko.Column(('s', 'p', 'pj', 'year'), {
    (1, 2, 5, 2025): 17,
    (1, 3, 5, 2025): 23,
    (2, 3, 7, 2025): 9,
    (2, 7, 5, 2025): 4,
    (4, 1, 1, 2025): 12,
    (1, 2, 5, 2026): 15,
    (1, 3, 5, 2026): 30,
    (2, 3, 7, 2026): 10,
    (2, 7, 5, 2026): 16,
    (4, 1, 1, 2026): 6,
})
q

s,p,pj,year,Unnamed: 4
1,2,5,2025,17
1,3,5,2025,23
2,3,7,2025,9
2,7,5,2025,4
4,1,1,2025,12
1,2,5,2026,15
1,3,5,2026,30
2,3,7,2026,10
2,7,5,2026,16
4,1,1,2026,6


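This new dependency implicitly flows through our existing code: materials_budget looks up q when it is called, and year, which is not summed away, simply remains in the key of the result: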

In [35]:
materials_budget()

pj,year,Unnamed: 2
1,2025,5.88
1,2026,2.94
5,2025,133.96
5,2026,257.39
7,2025,27.81
7,2026,30.9
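
The reason no code had to change is that the computation only names the variables to sum away, so any other variable survives into the result. Here is a small self-contained sketch of that idea in plain Python. `budget` is a hypothetical function, and it assumes every key variable of the price column also appears in the quantity column.

```python
from collections import defaultdict

def budget(q_names, q, price_names, price, drop=('s', 'p')):
    """Multiply q by price on price's key variables, then sum out the variables in drop."""
    pos = [q_names.index(n) for n in price_names]  # where price's keys sit inside q's key
    keep = [i for i, n in enumerate(q_names) if n not in drop]
    totals = defaultdict(float)
    for key, qty in q.items():
        unit = price.get(tuple(key[i] for i in pos))
        if unit is not None:  # rows without a matching price are skipped
            totals[tuple(key[i] for i in keep)] += qty * unit
    return tuple(q_names[i] for i in keep), dict(totals)

price_names = ('s', 'p')
price = {(1, 2): 1.99, (1, 3): 2.79, (2, 3): 3.09, (2, 7): 8.99, (4, 1): 0.49}

q4_names = ('s', 'p', 'pj', 'year')
q4 = {
    (1, 2, 5, 2025): 17, (1, 3, 5, 2025): 23, (2, 3, 7, 2025): 9,
    (2, 7, 5, 2025): 4, (4, 1, 1, 2025): 12,
    (1, 2, 5, 2026): 15, (1, 3, 5, 2026): 30, (2, 3, 7, 2026): 10,
    (2, 7, 5, 2026): 16, (4, 1, 1, 2026): 6,
}
# The original three-variable column is the 2025 slice with the year removed.
q3_names = ('s', 'p', 'pj')
q3 = {k[:3]: v for k, v in q4.items() if k[3] == 2025}

names3, totals3 = budget(q3_names, q3, price_names, price)
names4, totals4 = budget(q4_names, q4, price_names, price)
print(names3, {k: round(v, 2) for k, v in totals3.items()})
print(names4, {k: round(v, 2) for k, v in totals4.items()})
```

The same function is called twice without modification; only the shape of its input differs, and the result is keyed by pj in the first call and by pj and year in the second.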
