/* Actors, based on the Win16/32/64 WndProc mechanism. This is a surprisingly usable and high-performance object system in 11 lines of C89, although its usability benefits greatly from GCC's nested function extension. The basic architecture is that you write a function for each "class" of "objects", which is called every time a "method" is "called" on your "object", and create a struct (whose contents are never used) to identify each "method name". The linker assigns each struct a separate address in memory, and that address is used to identify which "method" is being "called" (with a conditional in your function). In the terminology of the actor model (as I have mangled it), an "object" is called an "actor", the code it runs (the "class") is called the actor's "script", and instead of "calling methods", we "send cues". This example program displays a grid on the screen where you can move around with your cursor keys and interact with different kinds of objects in the grid: editable text, uneditable text, numbers, bits. The "actor" is represented by a struct whose first item is a pointer to its script, and whose other three items are available for the creator to supply an initial state to the actor. In cases where you need more than three fields, or if you want a little more type-safety, you can define a struct for your internal state. Overall structure of example program ------------------------------------ In this example program, we have a grid actor, which contains cell actors; the cell actors use several different scripts, giving them different behavior. One of the cell actors is a user interface for a byte-buffer actor. The output from the grid is passed to a buffered-output actor, which uses the same kind of text-buffer actor to buffer the output before handing it off to the OS. (There's commented-out code for an older buffered-output actor that used stdio.) The main program doesn't interact directly with the grid actor; instead, it interacts with an arrow-key decorator actor that acts as a proxy for the grid, rewriting arrow-key escape sequences from the input stream into single control characters. Interfaces and privacy ---------------------- Because cue types (or methods) are named with memory addresses, rather than strings or vtable offsets, they are subject to all the visibility rules of C: you can say `static cue_t x = { "x" }` so that `x` can be called only from within the same module; you can pass a `cue_t` as an argument to a function that wouldn't normally be able to send that cue type; you can conditionally declare them in header files; and so on. Given that, there are no explicit declarations of interfaces. If someone tries to send an `q_fire` cue to an event detector, but instead they send it to a nuclear-weapons launcher, you'd better hope the nuclear-weapons launcher script didn't use the same `q_fire` cue type. You can totally have more than one of them, as long as they're static. If you depend on conditionally declaring cue types in header files, be aware that people can work around that by adding their own declaration. Limited return values --------------------- The response to sending a cue can be a single `void*`, so single-word getters work okay with return values, but for more general "return values", we use, and rely on, the traditional stack of function calls, taking advantage of GCC's nested functions to avoid the need for return values. For example, instead of writing: int len = send(b, &q_get_len, 0); if (len) send(b, &q_truncate_to, (void*)len-1); we manually rewrite the code in explicit continuation-passing style: int len; void get_len(char *s, int thelen) { len = thelen; } send(b, &q_read, (void*)get_len); if (len) send(b, &q_truncate_to, (void*)len-1); For objects larger than single registers, such as strings, it dramatically simplifies lifecycle management. When an integer cell is asked for its string value, it can construct that value in a stack-allocated buffer, then deallocate it once the callback returns; `implement_get_text_as_number` is an example: void * implement_get_text_as_number(int number, const cue_t *cue, void *arg) { if (cue == &q_get_text) { char numbuf[32]; my_itoa(numbuf, number); (*(get_text_callback)arg)(numbuf, strlen(numbuf)); } return 0; } If you were using this scheme with a compiler that doesn't provide nested functions, you'd almost certainly want to increase the number of arguments to the cue from 1 to 2, the traditional Windows wParam and lParam, so that you could pass both a function pointer and a user-data pointer to be passed to it. Allocation, initialization, and pointers ---------------------------------------- The only heap allocation in this program is to handle string buffers, which resize. Although you could certainly allocate actors on the heap, all of the actors in this example program are stack-allocated, a circumstance which should be familiar to C++ users and bizarre to anyone from any other OO language. (The stack allocation means the program as a whole needs about 580 bytes of memory, not counting the never-accessed space for the cue structs.) Two of the actor types (byte buffers and buffered output) take a "destroy" cue to deallocate their memory. This implies that the creator of each actor knows its size, and actually also its initial contents. This is somewhat suboptimal, but I claim it's not as bad as it sounds. Why not? Well, the primary flexibility provided by OO is that when you send a message, it can be handled by any of several different concrete classes, with possibly different implementations; someone can pass you a dummy file output object, or a compressing file output object, and your code will continue to work as before, but with added functionality. This obviously doesn't work if you instantiate the object yourself with some concrete class. If you use dependency injection or factory objects, then the apparent limitation goes away, along with the other inflexibilities that those patterns remove. Type-checking of method parameters and actor state -------------------------------------------------- Nope, sorry. FWIW, although I had a heck of a lot of bugs writing this code, only one of them came from those type errors. Actually, it would be good if there was a way to blow up when you call a method that's not defined, but this code isn't even doing that much type-checking. Performance ----------- I haven't tried doing anything hard with it. It looks like it should be comparably efficient to C++ vtable dispatch. Crudely, I get about 100 million serial cue sends per second on simple classes on a 1.6GHz Atom, so about 16 clock cycles per cue send; presumably you spend about another clock cycle on average per type of cue you handle. You suffer function prologue and epilogue overhead around the "method dispatch" script, but then you don't need to suffer them again; the method body can be inlined into the dispatch function. I specify regparm(3) for the scripts themselves to cut the prologue and epilogue overhead a bit. The nested-functions-for-getter trick is pretty costly, particularly since GCC constructs trampolines for them for you upon function entry whether you're using the trampoline or not. It seems to be fairly space-efficient. The first script in the object file is `textstring_script`, which is a very simple class: it displays an ASCIZ string in a cell of the grid. The entire implementation of this class is a single function that compiles to 9 i386 instructions, 25 bytes. (That happens to be the class I used to test the cue sends.) Possible improvements --------------------- 4. Use functions instead of structs to identify methods? That way the debugger knows to print their names. Provenance ---------- Written by Kragen Javier Sitaker, with many helpful improvements contributed by Darius Bacon. */ #include #include #include #include #include #include #include /******************** Object system. ********************/ typedef struct { char *name; } cue_t; #define SCRIPT(x) __attribute__((regparm(3))) void * \ x(actor *me, const cue_t *cue, void *arg) typedef struct actor actor; typedef SCRIPT((*script)); struct actor { script scr; void *state, *state2, *state3; }; static inline void * send(actor *a, const cue_t *cue, void *arg) { return a->scr(a, cue, arg); } /******************** misc. ********************/ #ifdef CTRL #undef CTRL #endif #define CTRL(c) ((c) & 0x1f) void panic(char *what) { char *err = strerror(errno); write(2, what, strlen(what)); write(2, ": ", 2); write(2, err, strlen(err)); write(2, "\n", 1); kill(getpid(), SIGABRT); _exit(-1); } /******************** Output stream protocol. ********************/ // Cues for output streams: cue_t q_print = { "print" }; typedef struct { char *bytes; int len; } bytes_with_len; cue_t q_flush = { "flush" }; // Convenience functions for q_print static inline void print(actor *actor, char *bytes, int len) { bytes_with_len s = { bytes, len }; send(actor, &q_print, &s); } static inline void print_asciz(actor *actor, char *bytes) { print(actor, bytes, strlen(bytes)); } /******************** grid. ********************/ // The grid is a rectangular array of cell actors; it routes // keystrokes to the actor in the current cell, and redisplays the // grid upon request. // Cues for cells: cue_t q_get_text = { "get-text" }; cue_t q_keystroke = { "keystroke" }; // Cue for the grid (which also handles q_keystroke): cue_t q_draw_grid = { "draw-grid" }; // Convenience function which helps a little with type-checking. static inline void send_key(actor *actor, char key) { send(actor, &q_keystroke, (void*)(int)key); } typedef struct { actor *cells; int width; int height; int x, y; } grid; static inline actor * cell_at(grid *g, int x, int y) { return &g->cells[g->width * y + x]; } static inline actor *current_cell(grid *g) { return cell_at(g, g->x, g->y); } #define IF break; case #define ELSE break; default void grid_handle_key(grid *g, char k) { switch(k) { IF CTRL('f'): g->x = (g->x + 1) % g->width; IF CTRL('b'): g->x = (g->x + g->width - 1) % g->width; IF CTRL('p'): g->y = (g->y + g->height - 1) % g->height; IF CTRL('n'): g->y = (g->y + 1) % g->height; ELSE: send_key(current_cell(g), k); } } void grid_draw(grid *g, actor *output) { int i, j, column_widths[g->width]; for (i = 0; i < g->width; i++) { int maxlen = 0; for (j = 0; j < g->height; j++) { void get_length(char *string, int len) { if (len > maxlen) maxlen = len; } send(cell_at(g, i, j), &q_get_text, get_length); } column_widths[i] = maxlen; } print_asciz(output, "\e[H\e[J"); // clear screen for (i = 0; i < g->height; i++) { for (j = 0; j < g->width; j++) { char *after = ""; if (j == g->x && i == g->y) { print_asciz(output, "\e[47;35m"); // highlight magenta on white after = "\e[0m"; } void emit_output(char *string, int len) { print(output, string, len); for (; len < column_widths[j] + 1; len++) { print_asciz(output, " "); } } send(cell_at(g, j, i), &q_get_text, emit_output); print_asciz(output, after); } print_asciz(output, "\n"); } send(output, &q_flush, NULL); } SCRIPT(grid_script) { grid *g = me->state; if (cue == &q_draw_grid) { grid_draw(g, arg); } else if (cue == &q_keystroke) { grid_handle_key(g, (char)(int)arg); } return 0; } /* SCRIPT(stdio_script) { FILE *f = me->state; bytes_with_len *b = arg; if (cue == &q_print) { fwrite(arg->bytes, arg->len, 1, f); } else if (cue == &q_flush) { fflush(f); } } */ /******************** Buffers. ********************/ // This is an autogrowing buffer of bytes, like stralloc or // std::string. It supports five methods. // Previously, it was a separate struct, but now that the actor // struct has three user-data fields, we use them as follows: // state: char* to the beginning of the buffer // state2: current size of buffer contents // state3: allocated size of buffer cue_t q_truncate_to = { "truncate-to" }; cue_t q_read = { "read" }; cue_t q_append = { "append" }; cue_t q_append_asciz = { "append-asciz" }; cue_t q_get_maxlen = { "get-maxlen" }; cue_t q_destroy = { "destroy" }; void ensure_buf_can_hold(actor *me, int newlen) { int newmaxlen = newlen * 2 + 16; char *ns; if (newlen <= (int)me->state3) return; ns = realloc(me->state, newmaxlen); if (!ns) panic("realloc"); me->state = ns; me->state3 = (void*)newmaxlen; } SCRIPT(buf_script) { char *s = me->state; int len = (int)me->state2, maxlen = (int)me->state3; if (cue == &q_read) { (*(void(*)())arg)(s, len); } else if (cue == &q_truncate_to) { int newlen = (int)arg; if (newlen < len) me->state2 = (void*)newlen; } else if (cue == &q_append) { bytes_with_len *b = arg; ensure_buf_can_hold(me, b->len + len); memcpy(me->state + len, b->bytes, b->len); me->state2 = (void*)(b->len + len); } else if (cue == &q_append_asciz) { bytes_with_len b = { arg, strlen(arg) }; return send(me, &q_append, &b); } else if (cue == &q_get_maxlen) { return (void*)maxlen; } else if (cue == &q_destroy) { free(s); me->state = me->state2 = me->state3 = NULL; } return 0; } /******************** cell types ********************/ // Currently there are textstring_script, strlen_cell_script, // buf_cell_script, buf_len_cell_script, buf_maxlen_cell_script, // bit_cell_script, and integer_script. typedef void (*get_text_callback)(char *string, int len); SCRIPT(textstring_script) { if (cue == &q_get_text) (*(get_text_callback)arg)(me->state, strlen(me->state)); return 0; } static inline char * _my_itoa(char *numbuf, int number) { int more_digits = number / 10; if (more_digits) numbuf = _my_itoa(numbuf, more_digits); *numbuf = '0' + number % 10; return numbuf + 1; } void my_itoa(char *numbuf, int number) { *_my_itoa(numbuf, number) = '\0'; } void * implement_get_text_as_number(int number, const cue_t *cue, void *arg) { if (cue == &q_get_text) { char numbuf[32]; my_itoa(numbuf, number); (*(get_text_callback)arg)(numbuf, strlen(numbuf)); } return 0; } SCRIPT(strlen_cell_script) { return implement_get_text_as_number(strlen(me->state), cue, arg); } SCRIPT(bit_cell_script) { int *my_number_p = me->state; int bit_offset = (int)me->state2; if (cue == &q_get_text) { char *to_display = " "; if (*my_number_p & 1 << bit_offset) to_display = "x"; (*(get_text_callback)arg)(to_display, 1); } else if (cue == &q_keystroke) { *my_number_p ^= 1 << bit_offset; } return 0; } SCRIPT(integer_script) { int *num = me->state; if (cue == &q_keystroke) { switch((int)arg % 4) { IF 0: --*num; IF 1: ++*num; IF 2: *num *= 2; ELSE: *num /= 2; } } return implement_get_text_as_number(*num, cue, arg); } SCRIPT(buf_cell_script) { actor *b = me->state; if (cue == &q_get_text) { send(b, &q_read, arg); } else if (cue == &q_keystroke) { char c = (char)(int)arg; if (c == CTRL('h') || c == '\x7f') { // backspace and delete int len; void get_len(char *s, int thelen) { len = thelen; } send(b, &q_read, get_len); if (len) send(b, &q_truncate_to, (void*)len-1); } else if (c < ' ') { // other control characters // do nothing } else { bytes_with_len buf = { &c, 1 }; send(b, &q_append, &buf); } } return 0; } SCRIPT(buf_len_cell_script) { int length; void get_length(char *s, int len) { length = len; } send(me->state, &q_read, get_length); return implement_get_text_as_number(length, cue, arg); } SCRIPT(buf_maxlen_cell_script) { int length = (int)send(me->state, &q_get_maxlen, 0); return implement_get_text_as_number(length, cue, arg); } /******************** buffered output ********************/ // I added this in part because I thought stdio was too inefficient, // in part to demonstrate the idea, and in part to avoid linking in // the whole stdio implementation. SCRIPT(buffered_output_script) { actor *buf = me->state; int fd = (int)me->state2; int length; void get_length(char *s, int len) { length = len; } if (cue == &q_print) { send(buf, &q_append, arg); } send(buf, &q_read, get_length); if (cue == &q_flush || cue == &q_destroy || length > 4096) { void output(char *s, int len) { write(fd, s, len); } send(buf, &q_read, output); send(buf, &q_truncate_to, 0); } if (cue == &q_destroy) { send(buf, &q_destroy, NULL); } return 0; } /******************** arrow key decorator ********************/ // This is an explicit state machine that rewrites ANSI arrow key // escape sequences (^[[A, ^[[B, ^[[C, and ^[[D, // or the equivalents with O instead of [) into the Emacs // control keys understood by the grid as movement commands. It's a // proxy decorating another actor (which happens to be the grid) and // passes through any cues it doesn't understand unchanged. // // Note that the state-machine state is after the next-actor field, // and its initial value is zero, so that the creator doesn't have to // initialize it explicitly. enum ak_state { ak_normal = 0, ak_esc_seen, ak_lsq_seen }; SCRIPT(arrow_key_script) { actor *next = me->state; enum ak_state state2 = (enum ak_state)me->state2; inline void go_to(enum ak_state next_state) { me->state2 = (void*)next_state; } if (cue == &q_keystroke) { char c = (char)(int)arg; switch (state2) { IF ak_normal: if (c == '\e') { go_to(ak_esc_seen); } else { send(next, cue, arg); go_to(ak_normal); } IF ak_esc_seen: if (c == '[' || c == 'O') { me->state3 = (void*)(int)c; go_to(ak_lsq_seen); } else { send(next, cue, (void*)'\e'); send(next, cue, arg); go_to(ak_normal); } IF ak_lsq_seen: switch (c) { IF 'A': send(next, cue, (void*)CTRL('p')); IF 'B': send(next, cue, (void*)CTRL('n')); IF 'C': send(next, cue, (void*)CTRL('f')); IF 'D': send(next, cue, (void*)CTRL('b')); ELSE: send(next, cue, (void*)'\e'); send(next, cue, me->state3); send(next, cue, arg); } go_to(ak_normal); } } else { send(next, cue, arg); } return 0; } /******************** Example keystroke-driven program ********************/ /* because curses makes me curse */ struct termios my_cbreak(int fd) { struct termios new_termios, old_termios; if (tcgetattr(fd, &old_termios) < 0) panic("tcgetattr"); new_termios = old_termios; new_termios.c_lflag &= ~(ECHO | ICANON); new_termios.c_cc[VMIN] = 0; new_termios.c_cc[VTIME] = 0; if (tcsetattr(fd, TCSANOW, &new_termios) < 0) panic("tcsetattr"); return old_termios; } enum { stdin_fd = 0, stdout_fd = 1 }; int mymain() { // actor stdout_actor = { stdio_script, stdout }; actor output_buf = {buf_script}; actor output_actor = {buffered_output_script, &output_buf, (void*)stdout_fd}; actor buf1buf = {buf_script}; int my_number = 42; actor cells[] = { {buf_len_cell_script, &buf1buf}, {buf_maxlen_cell_script, &buf1buf}, {buf_cell_script, &buf1buf}, {textstring_script, ""}, {bit_cell_script, &my_number, (void*)7}, {bit_cell_script, &my_number, (void*)6}, {bit_cell_script, &my_number, (void*)5}, {bit_cell_script, &my_number, (void*)4}, {bit_cell_script, &my_number, (void*)3}, {bit_cell_script, &my_number, (void*)2}, {bit_cell_script, &my_number, (void*)1}, {bit_cell_script, &my_number, (void*)0}, {integer_script, &my_number}, {textstring_script, "In the end,"}, {textstring_script, "only kindness matters."}, {textstring_script, "I'm sensitive and I'd like to stay that way"}, }; grid mygrid = { .width = 4, .height = 4, .x = 2, .y = 0, .cells = cells }; actor real_grid_actor = { grid_script, &mygrid }; actor grid_actor = { arrow_key_script, &real_grid_actor }; struct termios old_termios = my_cbreak(stdin_fd); send(&buf1buf, &q_append_asciz, "this is a test"); /* cue sending speed test: * / int i; for (i = 0; i < 100*1000*1000; i++) { send_key(&cells[3], 'x'); } //*/ for (;;) { char c; send(&grid_actor, &q_draw_grid, &output_actor); while (!read(stdin_fd, &c, 1)) usleep(20*1000); if (c == 'q') break; send_key(&grid_actor, c); } send(&buf1buf, &q_destroy, NULL); send(&output_actor, &q_destroy, NULL); tcsetattr(stdin_fd, TCSANOW, &old_termios); return 0; } // This is sort of the equivalent of Python's "if __name__ == '__main__':" int main() __attribute__((weak, alias("mymain")));