Last updated 1997-10-31. This paragraph added 1999-02-26.
This paper describes a way to run the enormous installed base of programs written in C, assembly, and other similar languages on such an OS. The implementation is by no means without penalty -- every memory access would incur bounds-checking and data-conversion overhead. My wild-ass guess is that this will slow down most programs by a factor of between 10 and 1,000. See the end for why I guess this.
In C, every parameter to a function, every local variable, every statically-allocated variable, and every object allocated on the free store has an address. These addresses are interchangeable; it must be possible to write a function that takes the address of any of the above types of variables and writes a value to the pointed-to variable.
Furthermore, most C programs assume that different types of data live in the same memory space, and do things such as copying an array of integers using memcpy().
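Here's a rough sketch of what that one memory space might look like in Java; the class name, the little-endian layout, and all the method names are made up for illustration, not part of any real runtime:

```java
// Hypothetical sketch: all of C memory as one bounds-checked byte array.
public class Ram {
    private final byte[] mem;

    public Ram(int size) { mem = new byte[size]; }

    // Every load and store pays for a bounds check (done here by the
    // Java array itself) plus data conversion to and from bytes.
    public int loadByte(int addr) { return mem[addr] & 0xFF; }
    public void storeByte(int addr, int value) { mem[addr] = (byte) value; }

    // A 32-bit int is assembled from four bytes, little-endian.
    public int loadInt(int addr) {
        return loadByte(addr)
             | loadByte(addr + 1) << 8
             | loadByte(addr + 2) << 16
             | loadByte(addr + 3) << 24;
    }
    public void storeInt(int addr, int value) {
        storeByte(addr, value);
        storeByte(addr + 1, value >> 8);
        storeByte(addr + 2, value >> 16);
        storeByte(addr + 3, value >> 24);
    }

    // memcpy() needs no type information: it just moves bytes, which is
    // why ints, chars, and everything else must share one memory space.
    public void memcpy(int dst, int src, int n) {
        for (int i = 0; i < n; i++) storeByte(dst + i, loadByte(src + i));
    }
}
```

A Lisp version would be much the same: one big vector of bytes, with every load and store going through bounds-checking accessors.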
However, there are two exceptions to this everything-must-go-in-one-memory-space rule: function addresses and return values.
While most C implementations put the addresses of functions in the same namespace as the addresses of data, few C programs -- except those that access hardware functions directly -- assume this to be the case. This is fortunate, since I'm not sure how I could implement this in Java otherwise.
Return values are strictly rvalues, and cannot have their addresses taken, so they never need to live in the memory array.
It might be better, under some circumstances, to implement RAM as a sparse array, the way most modern OSes do -- enabling the stack to start at the top of the address space, for instance, and enabling unmapped space to cause errors when accessed. This would be much slower in software than in hardware, however.
However, this approach -- especially if pursued enthusiastically -- would find most pointer bugs sooner or later.
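Here's a made-up sketch of the sparse approach in Java -- the 4 KB page size and all the names are arbitrary choices of mine. Pages spring into existence on first write, and a read of an unmapped page signals an error, which is how stray pointers get caught:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: sparse C memory as a software page table.
public class SparseRam {
    private static final int PAGE_SHIFT = 12;            // 4 KB pages
    private static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    private final Map<Integer, byte[]> pages = new HashMap<>();

    public void storeByte(int addr, int value) {
        // Pages are allocated on first write, so the stack can live at
        // the top of the address space without allocating everything
        // in between up front.
        byte[] page = pages.computeIfAbsent(addr >>> PAGE_SHIFT,
                                            k -> new byte[PAGE_SIZE]);
        page[addr & (PAGE_SIZE - 1)] = (byte) value;
    }

    public int loadByte(int addr) {
        byte[] page = pages.get(addr >>> PAGE_SHIFT);
        if (page == null)              // unmapped space: a pointer bug
            throw new RuntimeException("segmentation fault at " + addr);
        return page[addr & (PAGE_SIZE - 1)] & 0xFF;
    }
}
```

Every access now costs a hash lookup on top of the bounds check, which is why this is so much slower in software than in an MMU.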
As an optimization, parameters and locals that never have their addresses taken can be passed directly, rather than being pushed onto the slower stack in the memory array. In general, parameters can be passed directly; if the called function needs the addresses of the parameters, it can push them itself.
In general, the memory array should be passed to all C functions; it should not be bound to an atom. This makes it possible to run more than one C program simultaneously, in separate memory spaces.
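To make the calling convention concrete, here's a hypothetical translation of a tiny C function; the C source, the word-addressed int[] memory, and the stack-pointer convention are all my own inventions for illustration:

```java
// Hypothetical sketch of the calling convention: every translated C
// function takes the memory array explicitly, so two C "processes"
// can run side by side in two different arrays.
public class CallingConvention {
    // C source being translated (made up):
    //     int twice(int x) { int *p = &x; return *p + *p; }
    // x arrives directly; because its address is taken, the function
    // spills it onto the simulated stack itself.
    static int twice(int[] mem, int sp, int x) {
        sp -= 1;           // push x onto the in-array stack
        mem[sp] = x;
        int p = sp;        // a C pointer is just an index into mem
        return mem[p] + mem[p];
    }

    // A parameter whose address is never taken stays out of mem entirely.
    static int square(int[] mem, int sp, int x) {
        return x * x;
    }
}
```

Because mem is a parameter rather than a global, each call site chooses which memory space it runs in.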
Calling native Lisp or Java code is likely to require some sort of argument unmarshalling; it may or may not be necessary to pass the RAM to the native code, and it may be necessary to pass other native objects to it. Also, it's likely to require some sort of return-value marshalling, since C doesn't allow rich return values.
However, you can store indices into an array, and (in Lisp) you can store pointers to functions in an array. When linking a C program, you need only insert into this array a reference to each function whose address is taken -- this works equally well for C and native functions.
In Java, you can't store function pointers, but you can do function indirection through interfaces. Each C function can be declared as a class with a single call() method and no instance variables; an instance of each of these classes can be stored in a function-pointer array. Native-Java methods may require an object to work on; a stub class with a call() function can be generated which takes an object reference as its first argument.
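Here's a minimal made-up sketch of that arrangement; the CFunction interface, the uniform call() signature, and the sample functions are all assumptions of mine:

```java
// Hypothetical sketch: C function pointers as an array of objects
// that all implement a one-method interface.
public class FunctionPointers {
    interface CFunction {
        int call(int[] mem, int arg);   // one uniform signature
    }

    // One class per C function whose address is taken; no instance state.
    static class Negate implements CFunction {
        public int call(int[] mem, int arg) { return -arg; }
    }
    static class Successor implements CFunction {
        public int call(int[] mem, int arg) { return arg + 1; }
    }

    // The linker fills in this table; a C "function pointer" is an
    // index into it, storable in the memory array like any other int.
    static final CFunction[] table = { new Negate(), new Successor() };

    static int callIndirect(int[] mem, int fnIndex, int arg) {
        return table[fnIndex].call(mem, arg);
    }
}
```

A stub for a native method would look the same, except that its call() would fetch the object to work on (from its first argument) before dispatching.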
This poses a memory-management problem. The environments I've been discussing generally do garbage-collecting memory management; when an object is no longer referenced by any pointers, it is asked to release any resources it controls, and then it gets returned to the pool of free memory.
Once a native object has been inserted into the native-object array, it may not be possible to determine when it is no longer referenced in the C program. I expect the life story of a native-object pointer will go something like this:
If the last step is forgotten, we have a memory leak. C programs are very prone to leaks; a full discussion of them and how to fix them could greatly extend this paper. See http://www.geodesic.com/greatcircle/index.html for more information.
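For illustration, here's one made-up way the native-object array might work -- the handle-table design and every name in it are assumptions of mine. The C side sees only small integer handles, and forgetting the final release() is exactly the leak in question:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: native objects exposed to C as integer handles.
public class HandleTable {
    private final List<Object> objects = new ArrayList<>();

    // Inserting the object makes it reachable from this table, so the
    // garbage collector can no longer reclaim it on its own.
    public int insert(Object nativeObject) {
        objects.add(nativeObject);
        return objects.size() - 1;     // the handle the C code stores
    }

    public Object get(int handle) { return objects.get(handle); }

    // The C program must call this when it's done with the handle;
    // if it never does, the native object is pinned forever.
    public void release(int handle) { objects.set(handle, null); }

    public boolean isLive(int handle) { return objects.get(handle) != null; }
}
```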
Interpreted languages vary in efficiency. Pre-8.0 versions of Tcl typically had a performance penalty factor of about 1,000; byte-compiled Java typically has a performance penalty factor of about 20; threaded Forth code typically has a performance penalty factor of about 5. All of these are off the top of my head, and I haven't checked any of them.
I'm assuming there's no performance penalty for writing the bounds-checking code in Lisp instead of C or assembly. This is somewhat optimistic.
It might be possible to reduce this performance penalty in a few ways.
There are some problems with this strategy, which is not surprising, since it's essentially the traditional UNIX strategy: