I was thinking about the problem of automatically generating condensed versions of a font, inspired by the difficulties of displaying 80-column text well on cellphones that I tackled with the Dercuano PDF renderer and by Android’s predictive text option display.
This program executes a simple linear-time skeletonization algorithm on the given pixels. (I don’t know who invented it originally, probably Blum; I didn’t understand it until I reinvented it.) For each black pixel, the distance to the nearest white pixel, a sort of signed distance function, is computed. (To really be a signed distance function, some of the white pixels would need to be assigned negative values.) If some pixel’s SDF is, say, 3, then we know from the triangle inequality that no pixel at distance 1 from that pixel can possibly have an SDF of less than 2. The skeleton is the minimal set of pixels whose SDFs are adequate to recompute the SDFs of all the other pixels in this fashion; these ridges of the SDF are highlighted in blue.
Calculating the skeleton from the tabulated SDF in this way is linear time because it merely involves comparing each pixel’s SDF to those of its neighbors, once. The SDF itself is calculated in linear time by calculating the SDFs of the pixels in order of increasing SDF: first the 0 pixels, then the 1 pixels, etc. Each pixel is only enqueued to have its neighbors’ SDFs computed once its own SDF has been computed, so each pixel’s SDF is updated only once. (This code considers the eight pixels around each pixel to be at distance 1 from it.)
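Concretely, the whole SDF-plus-ridge computation might look something like this in Python; this is my reconstruction rather than the page’s actual code, and the names are mine. `black` is a set of (x, y) pixel coordinates, and everything outside it is white.

```python
from collections import deque

def chebyshev_sdf(black):
    """Multi-source BFS from the white region.  All eight neighbors count
    as distance 1, so this is the chessboard (Chebyshev) metric."""
    dist = {}
    frontier = deque()
    for (x, y) in black:  # black pixels touching white have SDF 1
        if any((x + dx, y + dy) not in black
               for dx in (-1, 0, 1) for dy in (-1, 0, 1)):
            dist[x, y] = 1
            frontier.append((x, y))
    while frontier:  # each pixel is enqueued and settled exactly once
        x, y = frontier.popleft()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                q = (x + dx, y + dy)
                if q in black and q not in dist:
                    dist[q] = dist[x, y] + 1
                    frontier.append(q)
    return dist

def skeleton(dist):
    """Ridge pixels: those whose SDF no neighbor accounts for.
    A neighbor q accounts for p when dist[q] == dist[p] + 1."""
    return {p for p, d in dist.items()
            if not any(dist.get((p[0] + dx, p[1] + dy)) == d + 1
                       for dx in (-1, 0, 1) for dy in (-1, 0, 1))}
```

On a solid 5×5 square, for example, the SDF peaks at 3 in the center, the skeleton is just that one center pixel, and dilating each skeleton pixel of SDF s by a (2s−1)×(2s−1) square reconstructs the original shape.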
I think this algorithm works for any “distance metric” as long as it’s symmetric and respects the triangle inequality, but with a more continuous distance metric, you could imagine needing to use a priority queue that supports the increase-priority operation to ensure that pixels are processed strictly in the order of the SDF. If, instead, you were to use distances of, say, 5 and 7 for orthogonally and diagonally adjacent pixels, you could implement a discrete-priority priority queue as an array of doubly-linked-list stacks, each bucket n containing the pixels whose SDF was known to be no greater than n. You might promote them to an earlier bucket such as n - 2 if you found a shorter path to them via some other pixel. (Basically this is just Dijkstra’s shortest-path algorithm, with the starting node being the white area.) This is still guaranteed to be linear time since the promotion operation is constant time.
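Here is a sketch of that bucket-queue variant with the 5-and-7 weights. It’s mine, not tested against anything authoritative: I use a dict of sets rather than doubly-linked-list stacks, which gives the same O(1) promotion, and linearity holds because the weights are small bounded integers.

```python
def weighted_sdf(black, ortho=5, diag=7):
    """Dijkstra from the white region using a bucket ("dial") queue:
    bucket d holds pixels whose tentative distance is d."""
    INF = float('inf')
    dist = {p: INF for p in black}
    buckets = {}  # tentative distance -> set of unsettled pixels

    def relax(p, d):
        if d < dist[p]:
            if dist[p] in buckets:
                buckets[dist[p]].discard(p)  # O(1) promotion to an earlier bucket
            dist[p] = d
            buckets.setdefault(d, set()).add(p)

    steps = [(dx, dy, diag if dx and dy else ortho)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    for (x, y) in black:  # seed: black pixels one step from the white area
        for dx, dy, w in steps:
            if (x + dx, y + dy) not in black:
                relax((x, y), w)
    d, settled = 0, 0
    while settled < len(black):  # sweep buckets in increasing distance order
        for p in list(buckets.pop(d, ())):
            settled += 1
            for dx, dy, w in steps:
                q = (p[0] + dx, p[1] + dy)
                if q in black:
                    relax(q, dist[p] + w)
        d += 1
    return dist
```

On a solid 3×3 square, every perimeter pixel gets 5 (one orthogonal step to white) and the center gets 10 (two orthogonal steps), never 7 + something longer.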
I was interested in using skeletonization this way in order to be able to nonuniformly scale the letterforms by moderate amounts without changing their line widths, but I haven’t actually done that yet.
In the process, I may have stumbled upon a better way to factor some important morphological erosion kernels. The sample data when you load this page is a solid circle of radius 9.5. Note that its skeleton, under this goofy distance metric, consists of 13 pixels, out of the 57 pixels it has set across 19 scan lines with a maximum width of 19. If you wanted to erode (or dilate) by it using the Urbach-Wilkinson algorithm I investigated in some depth in Dercuano, it would require 4 pairwise min operations per pixel to set up power-of-2 chords up to 16 pixels wide, 6 pairwise min operations per pixel to calculate erosions with chords of 7, 11, 13, 15, 17, and 19 pixels, and then 18 pairwise min operations per pixel to calculate the erosion by the circle as a whole, for a total of 28 operations per pixel.
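In one dimension, the chord cascade that accounting rests on can be sketched as follows; the helper names are mine, not Urbach and Wilkinson’s. Each `shift_min` call is one pairwise min per pixel, and out-of-range samples are treated as imposing no constraint, per the usual erosion boundary convention.

```python
def shift_min(f, g, lag):
    """One pairwise-min operation per pixel: min(f[i], g[i + lag])."""
    n = len(f)
    return [min(f[i], g[i + lag]) if 0 <= i + lag < n else f[i]
            for i in range(n)]

def chords(f):
    """Erosions of a scan line by chords of length 2, 4, 8, and 16:
    one pairwise min per pixel per doubling, four in all."""
    e2 = shift_min(f, f, 1)
    e4 = shift_min(e2, e2, 2)
    e8 = shift_min(e4, e4, 4)
    e16 = shift_min(e8, e8, 8)
    return e2, e4, e8, e16
```

Each odd chord then costs one more min: length 7 is `shift_min(e4, e4, 3)`, 11 is `shift_min(e8, e8, 3)`, 13 is `shift_min(e8, e8, 5)`, 15 is `shift_min(e8, e8, 7)`, 17 is `shift_min(e16, e16, 1)`, and 19 is `shift_min(e16, e16, 3)`, which is where the count of 6 above comes from.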
But this skeleton consists of 13 pixels, of SDF values 4, 6, and 7. The SDF=4 skeleton pixels on the vertical axis induce the length-7 chords at the top and bottom; the SDF=6 skeleton pixels on the vertical axis induce the length-11 chords; and the SDF=7 skeleton pixels on the vertical axis but not in the center induce the length-13 chords. The extra SDF=7 skeleton pixels on the horizontal axis induce a length-15 chord each, the SDF=6 skeleton pixels on the horizontal axis combine to induce two length-17 chords, and the SDF=4 skeleton pixels on the horizontal axis combine to induce seven length-19 chords. That is, this circle is a union of four 7×7 squares, four 11×11 squares, and four 13×13 squares, centered on 12 of the 13 skeleton pixels. So if you had precomputed erosions of the image by square kernels of those sizes, the skeleton of the circle tells you how to use 11 pairwise pixel-min operations per pixel to compute the erosion by this circle.
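The reason the skeleton gives a legal factorization is the standard identity that erosion by a union of structuring elements is the pointwise min of the erosions by the pieces. A brute-force sketch for checking that identity, with helper names of my own invention; under this metric a skeleton pixel of SDF s contributes a (2s−1)×(2s−1) square centered on it:

```python
INF = float('inf')

def erode(img, offsets):
    """Brute-force grayscale erosion: out[y][x] is the min of img over
    the structuring element translated to (x, y); off-image samples
    count as +infinity."""
    h, w = len(img), len(img[0])
    return [[min(img[y + dy][x + dx] if 0 <= x + dx < w and 0 <= y + dy < h
                 else INF for dx, dy in offsets)
             for x in range(w)] for y in range(h)]

def square(s, cx=0, cy=0):
    """Offsets of a (2s-1)x(2s-1) square centered at (cx, cy): the shape
    a skeleton pixel of SDF s contributes under the chessboard metric."""
    return [(cx + dx, cy + dy)
            for dx in range(-(s - 1), s) for dy in range(-(s - 1), s)]
```

Eroding by `square(2, -2, 0) + square(2, 2, 0)` really does equal the pointwise min of the two separate square erosions, and the same argument applies to all twelve squares of the circle at once.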
If you have the 7×7 square erosion computed, you can compute the 11×11 and 13×13 erosions from it in two operations per pixel each. Computing the 7×7 erosion can be achieved by computing a 2×2 erosion from the original image in two operations per pixel, computing a 4×4 erosion from it in two operations per pixel, and then computing the 7×7 erosion from that in two operations per pixel. So the whole cascade of square kernels requires 10 operations per pixel.
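That cascade might be sketched like this; the names are mine, and `grow` widens and then heightens a square erosion, two pairwise mins per pixel per stage, so five stages give the ten operations.

```python
def shift_min2(f, g, dx, dy):
    """One pairwise min per pixel: min(f[y][x], g[y + dy][x + dx]);
    off-image samples impose no constraint."""
    h, w = len(f), len(f[0])
    return [[min(f[y][x], g[y + dy][x + dx])
             if 0 <= x + dx < w and 0 <= y + dy < h else f[y][x]
             for x in range(w)] for y in range(h)]

def grow(e, lag):
    """Turn a kxk square erosion into a (k+lag)x(k+lag) one in two
    pairwise mins per pixel: widen horizontally, then vertically."""
    wide = shift_min2(e, e, lag, 0)
    return shift_min2(wide, wide, 0, lag)

def square_cascade(img):
    """2x2, 4x4, 7x7, 11x11, and 13x13 square erosions: ten mins/pixel."""
    e2 = grow(img, 1)   # 1x1 -> 2x2
    e4 = grow(e2, 2)    # 2x2 -> 4x4
    e7 = grow(e4, 3)    # 4x4 -> 7x7
    e11 = grow(e7, 4)   # 7x7 -> 11x11
    e13 = grow(e7, 6)   # 7x7 -> 13x13
    return e2, e4, e7, e11, e13
```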
This skeletonization-based structuring-element factorization strategy factors this circle kernel into 21 pairwise pixel-min operations per pixel rather than the 28 given by the Urbach-Wilkinson strategy. Similar approaches should work well for other shapes that are pretty round and not hollow and consequently have very sparse skeletons in this sense. Horizontal or vertical runs of skeletal pixels with the same value induce a rectangular box rather than a square one; for example, a horizontal run of five SDF=6 skeleton pixels induces a 15×11 box, erosion with which can be computed from erosion with an 8×8 box in two pairwise pixel-min operations, one with a horizontal lag of 7 and the other with a vertical lag of 3. A 4×8 box is indicated by a 6×2 array of SDF=2 skeleton pixels; erosion with the 4×8 box can be calculated from a 4×4 box with a single pairwise min operation per pixel.
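The 15×11 example can be made concrete with the same shifted-min primitive; this is a self-contained sketch with my own helper names, checked against a brute-force box erosion.

```python
def shifted_min(f, g, dx, dy):
    """One pairwise min per pixel; off-image samples impose no constraint."""
    h, w = len(f), len(f[0])
    return [[min(f[y][x], g[y + dy][x + dx])
             if 0 <= x + dx < w and 0 <= y + dy < h else f[y][x]
             for x in range(w)] for y in range(h)]

def erode_box(img, bw, bh):
    """Brute-force bw-wide, bh-tall box erosion, truncated at the borders,
    for checking the two-min construction below."""
    h, w = len(img), len(img[0])
    return [[min(img[y + dy][x + dx]
                 for dx in range(bw) for dy in range(bh)
                 if x + dx < w and y + dy < h)
             for x in range(w)] for y in range(h)]

def box_15x11(e8):
    """15x11 box erosion from an 8x8 square erosion, two mins per pixel."""
    wide = shifted_min(e8, e8, 7, 0)      # 8x8 -> 15x8, horizontal lag 7
    return shifted_min(wide, wide, 0, 3)  # 15x8 -> 15x11, vertical lag 3
```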
I learned some things: