#!/usr/bin/python3
r"""Parse Douglas Hofstadter’s gridfonts data and render it to PostScript.

I got the data from but the format is not well documented. Papers
such as (“Letter Spirit: Recognition and Creation of Letterforms
Based on Fluid Concepts”, Gary McGraw, 01992-06-11, 41 pages)
describe the domain it represents: a 2×6 grid of squares with edges
and diagonals (“quanta”), each of which can be present or absent, a
single bit of information, to encode a letterform. Each square has
four edges and two diagonals, but 10 of the horizontal edges and 6 of
the vertical edges are shared between two squares, so rather than
2×6×6 = 72 bits of information we only have 2×6×6 - 16 = 56 bits.

McGraw’s paper says, “So far, about 600 gridfonts have been
designed”, so the 287 gridfonts in the file (117 from 01991, 169 from
01994, and one undated) are only a drop in the bucket. He also
presents a Wang-tile-like “pipeface” transformation which converts
gridfonts into more conventional novelty fonts “using straightforward
and totally mechanical shape-for-shape substitutions”.

The file begins as follows:

    font : noname2
    creator : doug
    create date : Tue Feb 19 15:39:48 EST 1991
    last edit :
    a 048006000B0000
    b 04810F00230000
    c 08400C00000020

Each letter is associated with 14 hexadecimal digits, which would
encode 56 bits, and clear vertical correlations are visible, so it
seems likely that each bit describes whether a given quantum is drawn
(1) or not drawn (0). But which?

Fortunately the “Benzene Right” gridfont in McGraw’s paper is
included in the file, providing a Rosetta stone.
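
As a sanity check on the arithmetic above, here is a minimal sketch
that counts the quanta by orientation, assuming the grid is 2 boxes
wide and 6 boxes tall as McGraw’s paper describes. The function name
count_quanta is made up for illustration; it is not part of this
program.

```python
def count_quanta(cols=2, rows=6):
    # Hypothetical helper: count quanta per orientation in a grid of boxes.
    horizontals = cols * (rows + 1)   # rows+1 horizontal grid lines, cols edges each
    verticals = (cols + 1) * rows     # cols+1 vertical grid lines, rows edges each
    diagonals = 2 * cols * rows       # two diagonals per box, never shared
    return horizontals, verticals, diagonals


h, v, d = count_quanta()
total = h + v + d
hex_digits = total // 4               # four bits per hexadecimal digit
print(h, v, d, total, hex_digits)     # 14 18 24 56 14
```

The total of 56 bits matches the 14 hexadecimal digits stored for each
letter.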

The letters “c” and “d” are identical except for the vertical stem of
the “d”, and, promisingly, they are also very similar in the gridfont
data:

    c 04800100090000
    d 04809300090000

This suggests that the three vertical segments of the “d” are these
bits:

    X 00009200000000

The “c” is spatially pretty localized (all in the middle two rows of
boxes), but its 5 quanta are spread over 4 different nibbles (in 4
different bytes), so whatever the bit ordering is, it doesn’t have a
lot of horizontal locality. The “d” stem is considerably more
localized, suggesting that there may be some vertical locality.

The “c” and the “o” differ only by one of those three quanta and by
only one bit:

    c 04800100090000
    o 04800300090000

So this tells us that this quantum, which is indeed one of the “X”
bits, is the third vertical quantum down from the top on the right
border:

    Y 00000200000000

The simplest letter is the “r”, with only three segments set, the
three top ones of the “c”:

    r 04000100080000
    c 04800100090000

This gives the following representation for the bottom two segments of
the “c”:

    Z 00800000010000

The difference between “o” and “q” is also appealing; it’s the other
three vertical quanta of the right border that we don’t know yet:

    o 04800300090000
    q 04800349090000
    [ 00000049000000

Here are the pieces of the right border we know:

    [ 00000049000000
    Y 00000200000000
    X 00009200000000

So that tells us that the whole right border being set would be

    { 00009249000000

This is definitely not six consecutive bits; it is six bits spread
across sixteen, in the pattern 1001001001001001, which is an
appealingly regular pattern. Two hypotheses occur: this could mean
that the right-border bits are interspersed with other vertical
quanta, or it could mean that they’re interspersed with other quanta
in the right column of boxes. But if they were other quanta in that
column, which ones would they be? A more likely representation is
that the quanta of a single orientation are contiguous.
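
The differencing done by hand above can be reproduced with bitwise
operators. This is only a sketch of the reasoning, not part of the
program: the variable names are made up for illustration, and the hex
values are copied from the Benzene Right data.

```python
# Letter codes copied from the Benzene Right gridfont.
c = 0x04800100090000
d = 0x04809300090000
o = 0x04800300090000
q = 0x04800349090000

stem_d = c ^ d                  # quanta in exactly one of c and d ("X")
gap_c = c ^ o                   # the one quantum that closes the c ("Y")
tail_q = q & ~o                 # quanta in q that are not in o ("[")
right_border = stem_d | tail_q  # all six right-border quanta ("{")

print(f'{stem_d:014X}')               # 00009200000000
print(f'{gap_c:014X}')                # 00000200000000
print(f'{tail_q:014X}')               # 00000049000000
print(f'{right_border:014X}')         # 00009249000000
print(f'{right_border >> 24:016b}')   # 1001001001001001
```

XOR works for the first two comparisons because one letter’s quanta
are a subset of the other’s; the and-not form is the safer choice when
that is not known in advance.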

The difference between “o” and “p” consists of two vertical quanta,
the lower ones on the left border. Let’s compare to our right-border
characters:

    o 04800300090000
    p 04800324090000
    ] 00000024000000
    { 00009249000000 (all six right-border quanta)
    Y 00000200000000 (the gap in the C, the third one down)

Yes, they’re also 3 bits apart, interspersed with the right-border
bits, occurring one bit after them, the second one being two bits
before the last right-border bit, so we can probably guess that the 18
vertical-quantum bits are laid out contiguously in a left-to-right
top-to-bottom Latin order, filling out the bitfield as follows:

    { 00009249000000
    | 0003FFFF000000

Before them are 14 other bits; after them are 24 other bits. If the
quanta are laid out by orientation, which orientations have 14 bits?
Each of the 12 boxes has two diagonals, so there are two categories of
12 diagonal quanta. And it’s the horizontals that have 14 bits. This
bitfield width reinforces the hypothesis that the quanta are laid out
by orientation.

“o” and “u” differ by two bits: the tail of the “u” and the top of the
“o”. So we should be able to get the top of the “o” by abjunction,
o & ~u.

    o 04800300090000
    u 00800340090000
    } 04000000000000
    | 0003FFFF000000

This is indeed in our hypothesized horizontal-quanta bitfield. It’s
the sixth bit, as it should be. The “8” in “0480” would then be the
common bottom of the “o” and the “u”, with two more 0 bits in between
it and the 4 (0100 1000), just as there are two more horizontal
quanta. This seems like good enough evidence to be worth trying to
code up the horizontals.

So here is our horizontals bitfield, with the verticals for context:

    _ FFFC0000000000
    | 0003FFFF000000

How about the diagonals? Are they interleaved by orientation, or
first one orientation then the other? The common diagonals between
"o" and "u" are the 9, and those are both "/" diagonals rather than "\".
There are only two non-drawn "/" diagonals following the first one and
preceding the second one (in the conventional order), and there are
two 0 bits inside the 9, so it seems likely that we have the 12 "/"
diagonals followed by, by process of elimination, the 12 "\" ones.

    _ FFFC0000000000
    | 0003FFFF000000
    o 04800300090000
    / 00000000FFF000
    \ 00000000000FFF

So let’s try to decode the Benzene Right gridfont and draw it in
PostScript.

Okay, that seems to be working! I’ve successfully decoded “benzene
right”, “parallels”, “noname2”, and “buxtehude”. I’m not sure they
look as intended but at least they look like the correct letters. And
I’ve decoded numerous others.

The page at displays “bowtie”, which is in the data file. This
program decodes “bowtie” correctly, which I think is strong enough
evidence that it works. The page also shows “Benzene Left” and
“Boat”, which are not in the data file, and it shows a “Checkmark”
that differs by one stroke on the “a” from the one this program parses
from the data file, but is otherwise identical.

"""

import sys

benzene_right = """
font : benzene right
creator : doug
create date : Tue Feb 19 15:39:48 EST 1991
last edit : feb 24 94
a 058002400B0000
b 04824B00090000
c 04800100090000
d 04809300090000
e 068001000D0000
f 04002480480000
g 04880348091000
h 04024B00090000
i 00014480A80000
j 00014490A82000
k 07024900090000
l 80012480010000
m 080007000D0000
n 0C000300090000
o 04800300090000
p 04800324090000
q 04800349090000
r 04000100080000
s 07800000090000
t 04002480090000
u 00800340090000
v 08000680010000
w 000003800B0000
x 08400480060000
y 00880348091000
z 048000000F0000
comment :
"""


def decode(lines, name, scale=8):
    # Print PostScript that draws each letter of the gridfont `name`,
    # reading the gridfont data from the iterable `lines`.
    found = False
    x, y = 36, 756              # ½" from the upper left

    for line in lines:
        line = line.strip()
        if line == f'font : {name}':
            found = True

        if not line:            # blank lines separate fonts
            if found:
                return

        if not found:
            continue

        # Skip header lines; also guard against one-character lines,
        # which would otherwise raise IndexError.
        if len(line) < 2 or line[1] != ' ':   # not a character
            continue

        print(f'{x} {y} moveto ({line[0]}) show')
        bits = int(line[2:], 16)

        # Draw horizontal quanta.
        for shift in range(42, 56):
            if not (bits & (1 << shift)):
                continue
            yi = y - scale * (7 - (shift - 42) // 2)
            xi = x - scale if shift & 1 else x
            print(f'{xi} {yi} moveto {scale} 0 rlineto')

        # Draw vertical quanta.
        for shift in range(24, 42):
            if not (bits & (1 << shift)):
                continue
            yi = y - scale * (7 - (shift - 24) // 3)
            xi = (x - scale if shift % 3 == 2
                  else x + scale if shift % 3 == 0
                  else x)
            print(f'{xi} {yi} moveto 0 {scale} rlineto')

        # Draw “/” diagonal quanta.
        for shift in range(12, 24):
            if not (bits & (1 << shift)):
                continue
            yi = y - scale * (7 - (shift - 12) // 2)
            xi = x - scale if shift & 1 else x
            print(f'{xi} {yi} moveto {scale} {scale} rlineto')

        # Draw “\” diagonal quanta.
        for shift in range(12):
            if not (bits & (1 << shift)):
                continue
            yi = y - scale * (7 - shift // 2)
            xi = x if shift & 1 else x + scale
            print(f'{xi} {yi} moveto {-scale} {scale} rlineto')

        x += 36                 # ½"
        if x > 576:             # ½" from the right
            x = 36              # back to left margin
            y -= 72             # down one line

    # The input may end without a blank line after the last font, so
    # only complain if the font was never seen.
    if not found:
        raise ValueError('font not found', name)


if __name__ == '__main__':
    print('%!\n/Palatino findfont 12 scalefont setfont\n1 setlinecap\n')
    if len(sys.argv) == 1:
        decode(benzene_right.split("\n"), "benzene right")
    elif sys.argv[1] == '--all':
        input_file = sys.stdin.read().split('\n')
        names = [line.strip()[7:] for line in input_file
                 if line.startswith('font : ')]
        # XXX ridiculously inefficient, takes almost 400 milliseconds,
        # not gonna fix it
        for name in names:
            decode(input_file, name)
            print('/Palatino findfont 24 scalefont setfont')
            print(f'36 600 moveto ({name}) show stroke showpage')
            print('/Palatino findfont 12 scalefont setfont')
    else:
        name = sys.argv[1]
        sys.stderr.write(f'reading gridfont data for {name} from stdin\n')
        decode(sys.stdin, name)

    print('stroke showpage\n')