Smolhershey is a 2-clause BSD-licensed C library for vector font rendering. It is suitable for small microcontrollers; it does no heap allocation and is under 400 bytes compiled, and the fonts are only a few kilobytes each. You can download it from smolhershey-1.0.tar.gz (57KB).
It can render adequate scalable vector fonts for English and Russian text and mathematics at a cost about 2–3 orders of magnitude lower than FreeType. It’s even smaller than Kamal Mostafa’s excellent Hershey font library, which is itself very small. It’s suitable for even relatively small microcontrollers, though not the smallest (>4KiB of ROM, >256 bytes of RAM, von Neumann architecture unless you want to hack it). However, so far I’ve only run it on my laptop (sometimes emulating an ARM or RISC-V).
Normally, to render a scalable font, we use outline fonts interpreted with FreeType 2 (785K text size) and a TrueType font such as lmroman12-regular.otf (110K) or Noto Serif (15869K). This exposes a lot of complexity. The FreeType 2 API is 4470 lines of code in 54 header files; libfreetype.so exports 217 entry points. The FreeType 2.12.1+dfsg source base is 170 kloc. It has had 109 security holes discovered since 02006 (or 92 according to SecurityScorecard), though only one in the last year. These staggering costs rule it out for many computing applications; many of them shamble along with ugly bitmap fonts as a result.
The text size of Smolhershey itself is under 1K, rather than 785K, and compiled with size optimization for Cortex-M4, it’s under 200 bytes. The Hershey fonts it uses average 7K per font, rather than 110K or 16 megs, and the Optima-like font is under 4K. Its API is 20 lines of code in one header file. It’s 57 lines of code instead of 170000 lines of code. It might have some security holes in it, but probably not more than two or three.
Here’s the code size compiled for different architectures with -Os, with GCC 12.2.0 unless otherwise specified:
Bytes | Instructions | CPU |
---|---|---|
188 | 74 | Cortex-M4 |
250 | 79 | RV64G (with -msave-restore) |
296 | 74 | Cortex-A53 (without -mthumb) |
381 | 85 | AMD64 |
474 | 237 | AVR (with avr-gcc 5.4.0) |
There are smaller TrueType (and other) outline font engines than FreeType; Vidar Hokstad’s Skrift is only 600 lines of Ruby, and Thomas Oltmann’s libschrift, which Skrift is based on, is only 1500 lines of C. But Smolhershey is still an order of magnitude cheaper even than these.
Since I wrote Smolhershey the other morning, I have used it to produce ASCII art, HP-GL plotter art, and PostScript, and to visualize a vector font from the IMLAC PDS-1D from the 01970s (a font which is 2.5K in Hershey form). So far I’m impressed with how good the results have been with how little effort on my part, though there are some problems.
lines | words | bytes | sloc | filename |
---|---|---|---|---|
82 | 521 | 3100 | 20 | smolhershey.h, the API |
65 | 414 | 2350 | 37 | smolhershey.c, the implementation |
141 | 460 | 3193 | 95 | smolhersheyexample.c, the ASCII art example |
150 | 636 | 4470 | 92 | smolhersheyhpgl.c, the HP-GL plotter art example |
25 | 127 | 874 | 17 | smolhersheymin.c, the super-minimal PostScript example |
120 | 373 | 2849 | 85 | smolhersheyspecimen.c, a utility to produce font specimens in PostScript |
The public-domain Hershey fonts were created by Dr. Allen Vincent Hershey in 01967, after some eight years of work using some of the most powerful computers in the world, such as the NORC and Stretch, in order to make his math papers look better. Chris Lott at Hackaday claims that he was originally plotting them with the period (.) character of the Stromberg-Carlson SC4020 Charactron microfilm optical printer installed on the NORC, but Lott may not be a reliable source; he claims Hershey was using James Hurt’s file format (see below).
There are several books about Hershey fonts, including Wolcott and Hilsenrath’s “Contribution to Computer Typesetting Techniques” in 01976, Patrick Michael Doyle’s master’s thesis in 01977, and David MacMillan’s “Exploring Dr. Hershey’s Typography” in 02003–02006. For years, the US National Technical Information Service provided copies of the fonts on demand for a nominal copying cost, but requested no further redistribution in the same format.
In 01986, an ad-hoc group known as the Usenet Font Consortium republished most of the Hershey fonts in a new James Hurt Format, or JHF. Most copies of the Hershey fonts in use today derive from their work.
Since the 01970s, Hershey fonts have been largely supplanted by alternative approaches to computerized typography which can produce higher-quality results. The most powerful computers in the world in the early 01960s could only manage about a million instructions per second and only had a few hundred K of RAM and a few megs of disk, and programming was either assembly or Fortran, so Hershey spent enormous effort to construct fonts that could provide high-quality results with what we now consider minimal computing resources. This required compromises to visual quality.
Even so, Hershey fonts are still supported today in Inkscape, plotutils, R, VMD, GrADS, IDL, etc.
The Hershey fonts are stroke fonts rather than outline fonts; rather than defining an area to fill by describing its boundary, they describe how to move a pen to draw a shape. Some of them describe letterforms with thin single-stroke lines and thicker lines built up with multiple parallel pen strokes, while others do not.
Being stroke fonts gives them a real advantage in compactness, though less so for these multi-pass strokes; these four relatively elaborate glyphs are drawn with 189 coordinate pairs describing 135 lines. Each coordinate is encoded in a single byte.
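To make the one-byte-per-coordinate encoding concrete, here is a minimal sketch of decoding a single JHF glyph record. It assumes the commonly documented JHF layout (five columns of glyph number, three columns of pair count, one pair giving the left and right bounds, then vertex pairs, each character offset from the letter R, with a space followed by R meaning "lift the pen"). The names `jhf_vertex` and `jhf_decode` are made up for this illustration; this is not Smolhershey's own code.

```c
#include <stdlib.h>   /* atoi */
#include <string.h>   /* strlen */

/* One decoded entry; pen_up marks a gap between strokes. */
typedef struct { int x, y, pen_up; } jhf_vertex;

/* Decode one glyph record under the assumed layout described above.
   Stores the left/right bounds and up to max_out entries; returns the
   number of entries stored, or -1 if the record is too short. */
int jhf_decode(const char *line, int *left, int *right,
               jhf_vertex *out, int max_out)
{
    size_t len = strlen(line);
    if (len < 10) return -1;

    /* Columns 5..7 hold the pair count, which includes the bounds pair. */
    char count_field[4] = { line[5], line[6], line[7], '\0' };
    int pairs = atoi(count_field);
    if (pairs < 1 || len < 8 + 2 * (size_t)pairs) return -1;

    /* Every coordinate is a single character relative to 'R'. */
    *left  = line[8] - 'R';
    *right = line[9] - 'R';

    int n = 0;
    for (int i = 1; i < pairs && n < max_out; i++, n++) {
        char cx = line[8 + 2 * i], cy = line[9 + 2 * i];
        out[n].pen_up = (cx == ' ' && cy == 'R');
        out[n].x = out[n].pen_up ? 0 : cx - 'R';
        out[n].y = out[n].pen_up ? 0 : cy - 'R';
    }
    return n;
}
```

Under that assumed layout, a 26-byte record such as `    1  9MWRMNV RRMVV RPSTS` carries two bounds and eight entries (six vertices and two pen lifts), which is where the compactness comes from.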
Aside from compactness of representation, the single-thickness fonts have advantages in flexibility that outline fonts lack: you can distort their geometry (for example to oblique or condense the font) without changing their stroke weight, you can adjust the stroke weight at will, and you can even vary the stroke weight by stroke, for example depending on the stroke angle.
For the above ampersands, I’ve just rerendered a single HP-GL output file several times with hp2xx with different arguments for -p, plotter pen width. We can see that the graphical output quality of hp2xx itself leaves something to be desired with respect to line joins.
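The geometric distortions mentioned above (obliquing, condensing) can be sketched as a small affine map applied to each stroke endpoint before the stroke is drawn. The names `point`, `slant`, and `slant_apply` are invented for this illustration and are not part of Smolhershey; the sketch assumes Y grows downward, as in the Hershey data, so a positive shear leans the tops of the letters to the right.

```c
/* A stroke endpoint in font units. */
typedef struct { int x, y; } point;

/* Hypothetical distortion parameters, kept as integer ratios so that
   no floating point is needed. */
typedef struct {
    int width_num, width_den;   /* horizontal scale: 3/4 condenses */
    int shear_num, shear_den;   /* x shift per unit of height: 1/4 obliques */
} slant;

/* Map one endpoint. Stroke weight is untouched, because it is chosen
   later, when the line between the mapped endpoints is drawn. */
point slant_apply(slant s, point p)
{
    point q;
    q.x = p.x * s.width_num / s.width_den - p.y * s.shear_num / s.shear_den;
    q.y = p.y;
    return q;
}
```

Because only the endpoints move, the pen width used for the final stroke stays whatever the renderer chooses, which is the property outline fonts lack.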
Here’s an example of using Smolhershey in C without any error checking, omitting everything that isn’t essential to making it run; this shows that in 17 lines of code you can get high-quality graphical output from Smolhershey, if you don’t mind missing fonts being reported via segfaults.
// This will segfault if run on a system where the font is not
// installed.
#include <stdio.h>
#include "smolhershey.h"

// This is the callback that Smolhershey will invoke for each line
// that needs to be drawn.
void draw_postscript_line(sh_point start, sh_point end, void *userdata)
{
    // These PostScript commands add a new line segment to the current
    // path, which will be drawn when `stroke` is invoked at the end of
    // the file.
    printf("%d %d moveto %d %d lineto\n", start.x, start.y, end.x, end.y);
}

int main()
{
    // Stack-allocate an 8-kilobyte buffer for the font file contents
    // and a pointer array for use in rendering the font. Smolhershey
    // leaves all memory allocation up to the caller; it never
    // allocates anything dynamically itself.
    u8 buf[8192], *glyph_pointers[97]; // One extra for EOF
    // This is the other data structure the caller needs to allocate.
    sh_font my_font = { .lines = glyph_pointers, .n = 97 };
    // Read in the font with stdio. Since it’s plain ASCII text, "rb"
    // would be undesirable. `fread` returns the number of items read,
    // which tells `sh_load_font` how big the file is.
    FILE *f = fopen("/usr/share/hershey-fonts/timesr.jhf", "r");
    sh_load_font(&my_font, buf, fread(buf, 1, sizeof buf, f));
    // This graphics context specifies what font to use, what function
    // to call for each line, and what the current point is. It’s
    // important to create it with an initializer `= { ... }` to ensure
    // that the current point is zero-initialized.
    sh_gc gc = { .font = &my_font, .draw_line = draw_postscript_line };
    // This is PostScript code to move the origin onto the page, flip
    // the Y-axis (because Hershey coordinates increase downwards), and
    // make line ends rounded and lines thick.
    puts("%!\n100 400 translate 1 -1 scale 1 setlinecap 1.5 setlinewidth");
    // As long as you don’t change gc.cp, the characters from `sh_show`
    // get displayed one after the other.
    for (char *p = "hello, world"; *p; p++) sh_show(&gc, *p - ' ');
    // Finally, emit the PostScript commands to draw the built-up path
    // and end the page.
    puts("stroke showpage");
}
If you bundle the font into your executable, as with C23 #embed, you can avoid opening and reading files at runtime. In earlier C standards, there isn’t a way to include an external file at build time (though some hack was pretty much always possible), but the JHF format used by Smolhershey is text, so it’s relatively easy to copy and paste the font file into your source code as a string; you only have to encode the newlines.
Smolhershey exports two functions, sh_load_font, which loads a Hershey font, and sh_show, which renders a glyph and advances the current point. Because it doesn’t do any allocation itself, it also exports three struct types, sh_font, sh_point, and sh_gc.
sh_font
When you load a font file, Smolhershey builds a rapid-access index in memory pointed to by an sh_font; this involves reading the entire font file and putting a pointer to the line of text for each glyph into an array:
typedef struct { u8 **lines; int n; } sh_font;
The sh_font is passed by reference to sh_load_font and to sh_show.
sh_load_font
You must initialize lines to point to a writable array of pointers before passing the sh_font to sh_load_font, and you must initialize its n to say how many pointers can be safely written. The return value from sh_load_font tells you how many glyphs were found in the font:
int sh_load_font(sh_font *f, u8 *buf, int n);
sh_load_font also reduces the n in the sh_font to say how many pointers were successfully stored, which is either its original value or the return value of sh_load_font, whichever is smaller.
You can use this information in at least three ways:
1. Allocate a fixed-size array for lines, as is done in the example code above, because you know how big the font you’re loading is. In that case the return value of sh_load_font only tells you if there’s some kind of super crazy error where invalid data was passed. This is difficult to do for fonts chosen at run time or modified after your program is built.
2. Set n to 0 so that no pointers will be written; in this case the return value tells you how much space you need to allocate so that you can call sh_load_font a second time and successfully load the font.
3. Point lines to a very large buffer and set n to its size (in pointers); in this case, when sh_load_font returns, you know how much of the buffer it needed, and you can safely allocate the rest of it to other purposes, such as loading another font.
sh_point
sh_point is just an (x, y) pair of integers:
typedef struct { int x, y; } sh_point;
This is used for keeping track of the current point and also to provide line endpoints to your draw_line callback.
sh_show and sh_gc
The sh_show function takes an sh_gc pointer parameter:
void sh_show(sh_gc *gc, unsigned glyph_index);
This is the graphics context which specifies in what font the character will be drawn, at what current point, and how to draw the character:
// Graphics context. Can be safely copied and mutated.
typedef struct {
    sh_font *font;
    sh_point cp;
    void (*draw_line)(sh_point start, sh_point end, void *userdata);
    void *userdata;
} sh_gc;
sh_show consults the font to find the requested glyph, invokes draw_line zero or more times to draw the glyph starting at the position cp, and updates cp to be the position after the character so that multiple successive calls to sh_show will draw a whole line of text.
You can point font at different sh_font objects to display text in different styles on the same line, and you can change cp to display text in different positions.
For a given glyph index, sh_show always advances cp by the same amount; it does not do, for example, any kerning.
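Since cp is writable, an application that wants kerning can nudge it between calls. Here is a minimal sketch of such a lookup; the table, its values, and the name `kern_adjust` are all hypothetical, and Smolhershey supplies nothing like them.

```c
#include <stddef.h>   /* size_t */

/* One kerning pair: shift the right-hand glyph by dx font units. */
typedef struct { char left, right; int dx; } kern_pair;

/* Hypothetical adjustments; real values would need tuning by eye
   for each font. */
static const kern_pair kern_table[] = {
    { 'W', 'o', -3 },
    { 'T', 'o', -3 },
    { 'A', 'V', -2 },
};

/* Return the horizontal adjustment to apply between two characters,
   or 0 when the pair is not in the table. */
int kern_adjust(char left, char right)
{
    size_t count = sizeof kern_table / sizeof kern_table[0];
    for (size_t i = 0; i < count; i++) {
        if (kern_table[i].left == left && kern_table[i].right == right)
            return kern_table[i].dx;
    }
    return 0;
}
```

Before each sh_show call after the first, the application would add kern_adjust(previous, current) to gc.cp.x. A linear scan is enough for a table this small and keeps the code size in the same spirit as the library.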
userdata
When draw_line is invoked, its third argument is the userdata value from the graphics context, which is a workaround for C’s lack of closures. In simple cases, this is unnecessary and can be ignored, and any communication of parameters or drawing state to draw_line other than the start and end point can be accomplished with global, static, or thread-local variables; in hairier cases, you might have multiple graphics contexts active concurrently in the same thread, each of which has its own userdata.
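As an illustration of per-context state carried through userdata, here is a sketch of a callback that draws nothing and instead accumulates a bounding box, which could be used to measure text before placing it. The `bbox` type and `measure_line` are names invented for this example; `sh_point` is redeclared locally only so that the fragment compiles on its own, and would come from smolhershey.h in a real program.

```c
/* Local copy of the point type shown earlier, so this sketch stands alone. */
typedef struct { int x, y; } sh_point;

/* Hypothetical per-context state: extent of everything drawn so far. */
typedef struct { int min_x, min_y, max_x, max_y, segments; } bbox;

/* Grow the box so that it contains p. */
static void bbox_add(bbox *b, sh_point p)
{
    if (p.x < b->min_x) b->min_x = p.x;
    if (p.x > b->max_x) b->max_x = p.x;
    if (p.y < b->min_y) b->min_y = p.y;
    if (p.y > b->max_y) b->max_y = p.y;
}

/* Has the same signature as the draw_line callback; userdata must
   point to a bbox whose segments field starts at 0. */
void measure_line(sh_point start, sh_point end, void *userdata)
{
    bbox *b = userdata;
    if (b->segments == 0) {
        /* First segment: start the box at a real point, not at (0, 0). */
        b->min_x = b->max_x = start.x;
        b->min_y = b->max_y = start.y;
    }
    bbox_add(b, start);
    bbox_add(b, end);
    b->segments++;
}
```

Two graphics contexts, each with .draw_line set to measure_line and .userdata pointing at its own bbox, would then keep separate measurements without any global variables.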
Building Smolhershey should be super simple and take under a second. There are no dependencies beyond the C standard library, and it’s all standard ANSI C99 without VLAs, so even Microsoft's crippled C compiler ought to be able to handle it. Here's what it looks like (my shell prompt ends with ;):
: Downloads; tar xf smolhershey-1.0.tar.gz
: Downloads; cd smolhershey-1.0/
: smolhershey-1.0; make
cc -c -o smolhersheyexample.o smolhersheyexample.c
cc -c -o smolhershey.o smolhershey.c
cc smolhersheyexample.o smolhershey.o -o smolhersheyexample
cc -c -o smolhersheyhpgl.o smolhersheyhpgl.c
cc smolhersheyhpgl.o smolhershey.o -o smolhersheyhpgl -lm
cc -c -o smolhersheymin.o smolhersheymin.c
cc smolhersheymin.o smolhershey.o -o smolhersheymin -lm
cc -c -o smolhersheyspecimen.o smolhersheyspecimen.c
cc smolhersheyspecimen.o smolhershey.o -o smolhersheyspecimen -lm
: smolhershey-1.0; ./smolhersheyexample
######## ######## ##### #####
## ## ## ##
## ## ## ##
## ## ## ##
## ## ## ##
## ## ## ##
## ## ## ##
## ## ###### ## ##
## ## ### # ## ##
## ## ## ## ## ##
############### ## # ## ##
## ## ## ## ## ##
(Example output is truncated vertically and horizontally.)
It won’t build with BCC (Bruce’s C Compiler) because it’s not K&R, and BCC only supports K&R C. It won’t build with SDCC 4.2.0 (Sandeep Dutta’s Small Device C Compiler) because SDCC doesn’t support passing structs by value or compound literals. (The SDCC User’s Guide says it does support passing structs by value, except on PIC14 and PIC16, but I haven’t been able to convince it.) It won’t build with cc65 because cc65 doesn’t support C99.
On Debian or related distributions you should install Mostafa’s data to get most of Hershey’s original fonts in a format Smolhershey can use in /usr/share/hershey-fonts:
sudo apt install hershey-fonts-data
I think these are derived from versions in GNU plotutils.
Also, I’ve included a fixed-pitch vector font in JHF format for ASCII in imlac-pds-1-ssvchr.22.jhf, which is 2.5K. This is smaller than any actual Hershey font. It’s the font from Scroll Saver, a program for the IMLAC PDS-1D “graphical minicomputer” (programmable vector-graphics terminal). Its font program is preserved by Tom Uban, extracted from the copy from his IMLAC, at http://www.ubanproductions.com/Imlac/ssv, and has been disassembled by the ITS preservationists. Using IMLAC documentation archived by Bitsavers, I wrote a stupid Python program to extract the vector paths from the IMLAC program (which constructs them incrementally, according to the limitations of the PDS-1 hardware) and produce this JHF file.
David MacMillan has put the original Usenet Font Consortium shar files of the Hershey fonts in JHF format on the web; Dener Rosa Silva’s Hershey-TTF project has another copy of the files. Paul Bourke’s Hershey fonts page also includes occidental and Japanese JHF files. All of these use non-ASCII glyph numberings, and it seems that Smolhershey is having trouble rendering the Japanese font; I get a specimen sheet of character components rather than entire characters, quite different from Bourke’s own visualization.
In two or three hundred bytes, there’s a limit to how much functionality is included. In Smolhershey, the desirable things omitted include the following:
Drawing lines. Hershey fonts are made of lines, but Smolhershey doesn’t contain any code to draw lines. It doesn’t even have a concept of a pixel. Instead, Smolhershey invokes your draw_line callback; your application needs to provide the code to draw a line, which you can do in whatever way you like. (In the PostScript example above, the line is drawn by outputting a PostScript command to draw it.) At this scale, this is not an insignificant omission! Optimized for size on amd64, Smolhershey is 381 bytes of code, while the Bresenham line-drawing subroutine in smolhersheyexample.c is 179 bytes. But many programs that produce graphical output already have some way to draw a line at an arbitrary angle.
Reading TrueType fonts. Unless you spend some major quality time with some graph paper or write a Hershey-font editor, you’re pretty much stuck with the 33 fonts Hershey digitized 57 years ago. You could write a program to generate optimized Hershey-font versions of TrueType fonts, but so far nobody has.
Curves. Hershey approximated curves with a number of line segments with obtuse angles between them. I’ve been thinking about trying to automatically turn these into Bézier curves, and that is definitely a thing you can do in your draw_line subroutine without changing Smolhershey itself, but that functionality doesn’t exist yet.
Stroke thickness information. It’s great that you can change the thickness of the strokes, but it would be nice if the font files contained some sort of base stroke width information about what a normal stroke width would be. Lacking this, people often draw them as thin as they possibly can, which is rarely aesthetically beneficial, and often exacerbates the lack of curves. Of course, your draw_line callback can use whatever thickness it wants, even a variable thickness.
Combining characters, and therefore support for languages like Spanish, Albanian, French, German, Norwegian, Polish, Finnish, Danish, Romanian, Hungarian, and Portuguese, which would otherwise be well-supported. This could be added with a few lines of code, but then you’d have to add the combining characters to the fonts (which I think was done previously by c’t in Germany).
Unicode. There have been efforts to map out the relationships between Unicode code points and the Hershey glyphs, but Smolhershey does not include them. And there are only a few thousand Hershey glyphs in all.
Kerning. There is no kerning information, so sometimes you get ugly keming; the “Wo” in the introductory cursive graphic and the “W&” in the illustration of multiple parallel strokes are good examples of this. Your application code can apply kerning tables, but none are supplied, and this is not done automatically.
Colors. Emoji need colors. On the other hand, the Hershey fonts don’t include any emoji, even in stroke form.
Vertical escapement. If you’re doing CJKV work, you may want to lay out characters in columns rather than lines, but although the Hershey fonts do include some Japanese characters (all the standard kana but not even all the Joyo kanji), they do not include vertical spacing information.
RTL. For Hebrew, it would be straightforward to define Hershey glyphs with negative horizontal escapement, but they wouldn’t combine properly with LTR characters like Latin letters.
Ligatures. No ligatures, so no hope of supporting Arabic or Devanagari.
Graphical effects. Though one of the advantages I cited above for stroke fonts is that you can do things like scale them, automatically make a compressed version, or add calligraphic emphasis at some angle, Smolhershey won’t do this for you; you have to do it yourself in your draw_line function.
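Of the omissions above, line drawing is the one every raster application has to fill in. Here is a minimal sketch of a draw_line callback that plots into a character grid using the textbook integer Bresenham algorithm. It is not the routine from smolhersheyexample.c; `canvas`, `canvas_clear`, and `draw_ascii_line` are names invented here, and `sh_point` is redeclared locally only so that the fragment compiles on its own.

```c
#include <stdlib.h>   /* abs */
#include <string.h>   /* memset */

/* Local copy of the point type, so this sketch stands alone. */
typedef struct { int x, y; } sh_point;

enum { CANVAS_W = 40, CANVAS_H = 12 };

/* Hypothetical drawing surface: one character per pixel. */
typedef struct { char cell[CANVAS_H][CANVAS_W]; } canvas;

void canvas_clear(canvas *c)
{
    memset(c->cell, ' ', sizeof c->cell);
}

/* Set one pixel, silently clipping anything outside the grid. */
static void plot(canvas *c, int x, int y)
{
    if (x >= 0 && x < CANVAS_W && y >= 0 && y < CANVAS_H)
        c->cell[y][x] = '#';
}

/* Has the same signature as the draw_line callback; userdata must
   point to a canvas. Works in all octants using only integers. */
void draw_ascii_line(sh_point a, sh_point b, void *userdata)
{
    canvas *c = userdata;
    int dx = abs(b.x - a.x), sx = a.x < b.x ? 1 : -1;
    int dy = -abs(b.y - a.y), sy = a.y < b.y ? 1 : -1;
    int err = dx + dy;

    for (;;) {
        plot(c, a.x, a.y);
        if (a.x == b.x && a.y == b.y) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; a.x += sx; }   /* step in x */
        if (e2 <= dx) { err += dx; a.y += sy; }   /* step in y */
    }
}
```

Clipping in `plot` matters because glyph coordinates can fall outside the grid, for example when cp starts at the origin and part of a glyph lies above or to the left of it.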
Copyright 02024 Kragen Javier Sitaker
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.