A paper algorithm notation

By Kragen Javier Sitaker, originally written 2014-08-26, last substantive update 2014-10-15, this paragraph added 2016-11-30.

Pseudocode on paper is an important thinking tool for a lot of programmers, and on the whiteboard for programming teams. But our programming languages are very poorly suited for handwriting: they waste the spatial arrangement, ideographic symbols, text size variation, and long lines (e.g. horizontal and vertical rules, boxes, and arrows) that we can easily draw by hand, instead using textual identifiers and nested grammatical structure that can easily be rendered in ASCII (and, in the case of older languages like FORTRAN and COBOL, EBCDIC and FIELDATA too.) This makes whiteboard programming and paper pseudocoding unnecessarily cumbersome; even if you do it in Python, you end up having to scrawl out class and while and return and self. self. self. in longhand. So this page shows some examples of a paper-optimized algorithmic notation this author has been working on, on and off, over the last few years, to solve this problem.

Mathematics has been pencil-optimized for a long time, so this notation rips off as much as it can from those guys.

Example: a simple “point” class

point @x @y

@r = @x²+@y² @θ = atan2 @y @x

r

↑ @r

θ

↑ @θ

⼻ Δx Δy

@x, @y ← @x + Δx, @y + Δy @r ← @x²+@y² @θ ← atan2 @y @x

Alternatively you could write that last method this way, which is probably how this author would normally do it:

⼻ Δx Δy

@x += Δx @y += Δy @r ← @x²+@y² @θ ← atan2 @y @x

Actually, if caching @r and @θ is warranted, writing a method to recompute them would be better, but the code is left duplicated to have something to put into the constructor-body.

Perhaps the notation should use indentation or a left or right vertical rule joined with the underline to delimit the extent of the class or function, so that you can have a function below a class without it looking like a method of that class.

Complex-number arithmetic

Let’s look at a somewhat more complete and useful example: a complex-number class.

ℂ @re @im

+ z

↑ ℂ (@re+z.@re) (@im+z.@im)

neg

↑ ℂ (-@re) (-@im)

- z

↑ self + z.neg

· z

re = @re · z.@re - @im · z.@im im = @im · z.@re + @re · z.@im ↑ ℂ re im

conj

↑ ℂ @re (-@im)

÷ z

n = self · z.conj d = z.@re² + z.@im² ↑ ℂ (n.@re ÷ d) (n.@im ÷ d)

See, that’s compact enough that you could quite reasonably write it down in the margin of your textbook as you’re riding the train, or on the whiteboard in your job interview, or whatever.

Loops and conditionals

So far, though, there’s nothing particularly “algorithmic” here; this is nearly just a set of equations describing arithmetic on complex numbers, although they do happen to be effective. So let’s look at some things that contain loops. Here’s Euclid's Algorithm for ﬁnding the greatest common divisor of two numbers, which is sometimes claimed to be the first algorithm. A vertical line separates the while-loop condition on the left from the body of the loop on the right:

gcd a b

b	a, b ← b, a % b

↑ a

Here’s Imran Ghory’s FizzBuzz. This uses a conditional, which is written as a vertical line crossed with horizontal lines to separate the cases, with the condition for each case on the left and its consequent on the right. The last case has an empty condition, indicating that it’s always taken; it’s an “else” case.

i 1:101

i % 5 ∧ i % 3	i
i % 5	“Fizz”
i % 3	“Buzz”
	“FizzBuzz”

And here’s a linked-list lookup subroutine which returns the list node at which the match was found, if any.

ﬁnd ℓ k

ℓ

ℓ.k == k	↑ ℓ
	ℓ ← ℓ.next

↑ Λ

You need at least two cases so that the lines cross; the “else” case can be empty. This version is equivalent because the consequent contains an early return:

ﬁnd ℓ k

ℓ

ℓ.k == k	↑ ℓ

ℓ ← ℓ.next

↑ Λ

This author thinks the OCaml and Python versions are more understandable, but a lot slower to write with a pencil and paper or on a whiteboard.

Slices and Quicksort

This author likes using Golang-style slices, which are subsequences that don’t copy the original sequence; but the notation uses subscripts for array indexing and slicing. Here’s Quicksort:

qsort S

#S > 1	p = part S qsort S_:p qsort S_p+1:

part S

i = 0

j :#S

S_j ≤ S_^	S_i, S_j ← S_j, S_i i++

↑ i - 1

This version of Lomuto’s algorithm is somewhat simpler than its usual form, but it is equivalent.

With regard to the weird S_^ thing, Python’s ability to refer to the last element of a list easily by saying s[-1] is endearing, but it’s bug-prone and sort of hard to read for non-Python people; really what we want is some kind of other number for from-the-end counts, one you can’t accidentally reach by erroneous address arithmetic going off the beginning of the list. Thus S_1̂, S_2̂, and so on, for the last few elements of the list, with S_^ being shorthand for S_1̂. Is this a good notation?

If you wanted an actually practical version of Quicksort, you’d need to avoid its quadratic worst-case behavior on sorted input. If we use “?”, APL-style, to give us a random integer less than a limit and greater than or equal to 0, then we get a practical in-place Quicksort with a tiny amount of extra pseudocode:

part S

r = ?#S S_r, S_^ ← S_^, S_r i = 0

j :#S

S_j ≤ S_^	S_i, S_j ← S_j, S_i i++

↑ i - 1

That’s probably the simplest practical implementation of a general-purpose sorting algorithm. There are lots of tweaks you can make to make it more efficient or robust (Hoare partitioning, median-of-N pivot selection, dual pivots, recursion elimination with choice of direction to worst-case limit stack depth, and fallback to insertion sort on small arrays, to name a few), but this is the minimum.

Once you have sorted arrays, one of the most basic things to do with them is to binary-search them, which gives you a dictionary. Binary search is a notoriously bug-prone algorithm; it’s most cleanly stated in recursive form. Note the three-way if-elseif-else here.

bsearch S k

p = #S // 2

S_p < k	↑ p+1 + bsearch S_p+1: k
S_p == k	↑ p
	↑ bsearch S_:p k

↑ 0

This version of binary search returns the position at which the element should be inserted if it’s not present; if the caller is doing an exact lookup, they need to check whether the element at that index is the correct element or not. This allows the same binary search routine to be used for exact-search, insertion, and ﬁnding-next-highest-after. That’s why it returns 0 in the case of an empty sequence.

But you could reasonably complain that it is unreasonable for a mere binary search routine to be not only recursive but even non-tail-recursive, so that it necessarily allocates space and can therefore crash. The above can be easily transformed into this iterative version, which solves these problems:

bsearch S k

b = 0

p = #S // 2

S_p < k	b += p + 1 S ← S_p+1:
S_p == k	↑ p + b
	S ← S_:p

↑ b

Comparison with traditional algorithmic notation: an extended meditation on heapify

Here we have the Max-Heapify algorithm as CLRS describes it. This is probably the most common way that algorithms are described to people studying algorithms; the language is essentially Algol-60, minus type declarations and the tricky bits like call-by-name.

Fair use analysis: this copying, for the nonprofit purpose of scholarship, is transformative because the copied work is the subject of commentary; the copyrighted work is a published, factual work; the amount used is a quarter-page of a work of over a thousand pages; and it has no significant effect on the market for the original work. As all four fair-use factors weigh overwhelmingly in favor of the legality of this copying under US law, it is indisputably legal. It’s ridiculous that this author even feels that they must worry about this fucking bullshit, but they do.

In the paper-oriented notation presented here, it takes somewhat less ink. CLRS uses this separate “heap-size” pseudo-function to keep track of how much of your array contains the heap, but that’s unnecessary if we package up upper and lower bounds and a reference to the array into a single “slice” or “range” object that we pass to the algorithm; as shown before with Quicksort, this yields some simplification.

max-heapify A i

ℓ, r, m = left i, right i, i

ℓ < #A ∧ A_ℓ > A_i	m ← ℓ

r < #A ∧ A_r > A_m	m ← r

m ≠ i	A_i, A_m ← A_m, A_i max-heapify A m

There are a couple of annoying things about this version, though. It uses mutation and recursion in a totally unnecessary fashion. The recursion is tail-recursion, so it can be replaced with iteration. It is better to write it as follows, using conditional expressions and an infinite loop with an early return in the middle, showing that it runs in constant space.

heapify A i

ℓ, r = left i, right i a =

ℓ < #A ∧ A_ℓ > A_i	ℓ
	i

b =

r < #A ∧ A_r > A_a	r
	a

b == i	↑

A_i, A_b ← A_b, A_i i ← b

But if we use a filtered list-comprehension instead, it becomes substantially clearer, although less efficient if you implement it without doing loop fusion (between the listcomp and the argmax) and loop unrolling (to avoid testing a superfluous loop counter):

heapify A i

J =

j [i, left i, right i]

j < #A	j

m = argmax _{k ∈ J} A_k

m == i	↑

A_i, A_m ← A_m, A_i i ← m

The giant parentheses here are to avoid visual ambiguity about whether the “J =” is part of the loop condition or is a declaration of a variable to receive the results of the loop. This is especially helpful on loops, since loops don’t have a horizontal line to delimit their horizontal extent the way conditionals do.

That may be easier to read, but which one would you rather scribble on a whiteboard or a napkin?

Dijkstra’s algorithm

dijkstra ⌂ e

d = { ⌂: 0 } a = {} p = { ⌂ }

n = argmin _{x ∈ p} d_x

o, d' e_n

o ∉ d.keys	p.add o

c = d_n + d'

o ∉ d.keys ∨ c < d_o	d_o ← c a_o ← n

p.remove n

↑ d, a

Compared to Python on some random real code

How does this notation compare to, say, Python for messy real-world code, rather than textbook algorithms? A first function randomly selected from a randomly selected file from /usr/lib/python2.7 ended up selecting quopri.encode, which is long enough that it would be a lot of work. A second random selection was quopri.needsquoting:

needsquoting c quotetabs header

c ∈ “␣␉”	↑ quotetabs
c == “_”	↑ header
	↑ (c == ESCAPE ∨ ¬ (‘␣’ ≤ c ≤ ‘~’))

There is a linebreak inside that final parenthesized expression that doesn’t indicate sequencing of statements, but it would have unbalanced parentheses if you ended the statement there, not to mention a trailing logical-or.

(You may be wondering why there’s a special case for ESCAPE, since the ESC character comes before space and the other condition should catch that, but it turns out that quopri.ESCAPE is “=”, the quoted-printable escape character, not the ASCII ESC character. This is one of those examples where using a named constant reduced clarity rather than increasing it.)

If you were actually writing this in longhand on paper, though, you’d probably write something like this:

nq? c qt h

c ∈ “␣␉”	↑ qt
c == “_”	↑ h
	↑ (c == “=” ∨ ¬ (‘␣’ ≤ c ≤ ‘~’))

Here’s another random function from the Python standard library, mhlib.MH.openfolder. (Did you have any idea Python’s standard library included functions for operating on MH mailboxes? Have you even heard of MH? Are you very old?)

openfolder name

↑Folder self name

Here’s another random function, ftplib.FTP.getline, which was the next random choice after imputil.Importer._reload_hook turned out to be shit.

getline

line = @file.readline

@debugging > 1	print “get” (sanitize line)

¬line	raise EOFError

line_2̂: == CRLF	line ← line_:2̂
line_1̂: ∈ CRLF	line ← line_:1̂

↑line

Pattern matching

Symbolic, rather than numerical or I/O, computation often involves pattern-matching on data structures. Common examples include computer algebra systems, compilers and interpreters, and database query engines. These all involve case analysis on the structure and partial contents of data structures. Doing that kind of pattern-matching with explicit conditionals is tedious and hard to read, so functional languages like ML and Haskell provide special syntax to make it easier.

This notation writes pattern matching just like any other conditional, with the conditions on the left (although now they are patterns) and consequents on the right, with the addition of a horizontal line across the top with the expression to be matched against the patterns above it.

A very simple example is a function to turn a nonempty list of strings into an English list using the Oxford comma:

commalist S

↑

S
[a]	a
[a, b]	a \|\| “ and ” \|\| b
[a, b, c]	a \|\| “, ” \|\| b \|\| “, and ” \|\| c
[a, b, c, d...]	a \|\| “, ” \|\| commalist [b, c, d...]

Here the pattern-match provides values for the variables in the pattern. The first three patterns match lists of specific lengths, while the fourth matches lists of any length of three or more (although it won’t get a chance on lists of length three, since they’ve already been snapped up by the previous case.)

In C, the closest we get to pattern-matching is the switch statement, which is already quite useful. For example, in vicissicalc.c, we have

reactor_loop

refresh

getchar

‘f’

view ← formulas

‘h’

col ←

col == 0	0
	col - 1

‘j’

row ←

row+1 == rows	row
	row+1

‘k’

row ←

row == 0	0
	row-1

‘l’

col ←

col+1 == cols	col
	col+1

‘q’

↑

…

error "Unknown key"