Saving initialized data structures into an executable was the traditional way to build large Lisp systems, and was a built-in capability of PDP-10 operating systems in the 1970s. When I was a student at Utah porting PSL (Portable Standard Lisp) to VAX Unix around 1981, we noticed that there was no such capability available. For a while our workaround was to dump a core file by sending SIGQUIT (^\) to the process, then start (resume) our system in a debugger. Spencer Thomas, who was also a student at Utah at the time, wrote the function he named "unexec()" to give us a more sensible path to the same functionality. exec() takes a file and turns it into a process; unexec() takes a process and turns it into a file. This code served our needs very nicely at the time, allowing us to load compiled Lisp code into a bare interpreter and save a complete system. Later, this code was incorporated into GNU Emacs for essentially the same purpose.
At the time, building these systems took several minutes, so it really wasn't feasible to expect users to just load everything they needed on startup. unexec() is highly non-portable, of course, and has caused headaches for Lisp builders ever since. Amortizing startup time over a larger amount of work is still the only portable solution I know, along with keeping initialized application state in databases rather than in-memory data structures.
And really, this is just (a hackish implementation of) an image-based runtime, à la Smalltalk. All ELISP is missing is three things: a big list of all the globals it needs to care about saving and restoring (so that it can skip all the random other memory-garbage it happens to still be holding onto), a serialize()/deserialize() pair of functions that turn those globals into a standard on-disk representation and back, and a boot strategy that deserializes those structs into memory.
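To make that concrete, here is a minimal sketch of the registry plus serialize()/deserialize() plus boot idea, in Python rather than Elisp. The names REGISTRY, register() and boot(), and the choice of JSON as the on-disk format, are assumptions made up for illustration; none of this is how Emacs actually works.

```python
import json
import os
import tempfile

# Hypothetical registry: the explicit "big list of globals" worth persisting.
# Anything not registered here is simply never written to the image.
REGISTRY = {}

def register(name, value):
    """Mark a piece of global state as part of the image."""
    REGISTRY[name] = value

def serialize(path):
    """Write only the registered state, in a standard on-disk format (JSON here)."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(REGISTRY, f, sort_keys=True)
    # Rename over the old image so a reader never sees a half-written file.
    os.replace(tmp, path)

def deserialize(path):
    """Load a saved image back into the registry."""
    with open(path, encoding="utf-8") as f:
        REGISTRY.update(json.load(f))

def boot(path):
    """Boot strategy: restore from the image if one exists, else start bare."""
    if os.path.exists(path):
        deserialize(path)
        return "restored"
    return "bare"

# Usage: save in one "session", restore in the next.
image = os.path.join(tempfile.mkdtemp(), "image.json")
first = boot(image)                    # no image on disk yet
register("kill-ring", ["foo", "bar"])
register("fill-column", 72)
serialize(image)
REGISTRY.clear()                       # simulate a fresh process
second = boot(image)                   # state comes back from disk
```

The point of the explicit registry is that the image contains only what you asked for, which is exactly what a raw memory dump cannot promise.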
If you want to be fancy, you can make the on-disk VM-image format a database (SQLite, LevelDB, whatever) so as to avoid writing it all out every time. Writing out a differential state then becomes cheap enough that you can make the runtime do it automatically: at intervals, after certain operations, manually with a sync(1)-equivalent call, etc.
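A rough sketch of the differential version, assuming SQLite via Python's standard sqlite3 module. The Image class, its table layout, and the dirty-set bookkeeping are hypothetical choices for illustration, and deletions are not handled.

```python
import json
import os
import sqlite3
import tempfile

class Image:
    """Hypothetical database-backed image that only writes changed entries."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS image "
            "(name TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        # Boot: load whatever a previous session left on disk.
        self.state = {
            name: json.loads(value)
            for name, value in self.db.execute("SELECT name, value FROM image")
        }
        self.dirty = set()

    def set(self, name, value):
        self.state[name] = value
        self.dirty.add(name)          # remember what changed since the last sync

    def sync(self):
        """Write only the dirty entries, in one transaction. Returns rows written."""
        rows = [(name, json.dumps(self.state[name])) for name in sorted(self.dirty)]
        with self.db:                 # commits on success, rolls back on error
            self.db.executemany(
                "INSERT OR REPLACE INTO image (name, value) VALUES (?, ?)", rows
            )
        self.dirty.clear()
        return len(rows)

# Usage: the second sync touches one row, not the whole image.
path = os.path.join(tempfile.mkdtemp(), "image.db")
img = Image(path)
img.set("a", 1)
img.set("b", [1, 2])
n1 = img.sync()                       # both entries are new
img.set("a", 2)
n2 = img.sync()                       # only "a" changed
img.db.close()
restored = Image(path).state          # a later session sees the latest values
```

Calling sync() from a timer or after selected operations would give the automatic behavior described above; when to call it is a policy decision this sketch leaves open.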
> If you want to be fancy, you can make the on-disk VM-image format a database (SQLite, LevelDB, whatever) so as to avoid writing it all out every time.
I took this approach in a game engine I developed at one point. Common Lisp has a very general meta-object protocol that allows you to do things like this transparently (see e.g. [1]). I believe I used Berkeley DB as the backing store, which supports in-memory caching of objects. With this approach, I didn't need an explicit save-file format; everything was just "there" on disk, automatically. As far as that was concerned, it was pretty cool.
Unfortunately, it was not fast. At one point, I did an experiment where I ripped out the DB and replaced it with an in-memory hash-map implementation. This was about 10x faster (despite the supposed in-memory caching at the DB layer). I got an additional similar speedup when I ripped out the meta-classes for the objects.
Turns out, these abstractions are expensive. Writing nice sequential code on compact in-memory data structures has substantial benefits (if you want performance).
Was the DB cache a write-through cache or a write-back cache? Memory-canonical persistent databases (e.g. Redis) and disk-canonical persistent databases (e.g. SQLite) have very different persistence strategies. Only the memory-canonical type can really be used sensibly to persistently back (or, really, partially-crash-restore) an OLTP process's "hot spots." Basically you want the same characteristics for such persistence that you want for a logging engine: nonblocking behavior being first and foremost.
EDIT: "There are currently three different data stores that support the Elephant API: Berkeley DB, Postgresql via the postmodern library, and any database supported by the CLSQL library including SQLite3." — so, write-through, then.
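For anyone unsure of the distinction, a toy sketch in Python. The SlowStore, WriteThrough and WriteBack classes are invented for illustration and say nothing about how Berkeley DB or Elephant actually behave; the store just counts how often it gets hit.

```python
class SlowStore:
    """Stand-in for the disk-backed store; counts writes instead of doing I/O."""
    def __init__(self):
        self.data = {}
        self.writes = 0

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value

class WriteThrough:
    """Every update goes to the cache AND straight through to the store."""
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def put(self, key, value):
        self.cache[key] = value
        self.store.put(key, value)    # the caller pays the store cost every time

class WriteBack:
    """Updates land in the cache only; the store sees them on flush()."""
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)           # cheap: no store access on the hot path

    def flush(self):
        for key in self.dirty:
            self.store.put(key, self.cache[key])
        self.dirty.clear()

# Usage: hammer one "hot spot" key 1000 times under each policy.
through_store, back_store = SlowStore(), SlowStore()
through, back = WriteThrough(through_store), WriteBack(back_store)
for i in range(1000):
    through.put("player.x", i)
    back.put("player.x", i)
before_flush = back_store.writes      # nothing has reached the store yet
back.flush()
```

After the loop the write-through store has been written 1000 times and the write-back store once, at the price of losing unflushed updates if the process dies before flush().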
I had a similar realisation recently. I had to learn Smalltalk (for my new job, believe it or not!), and Smalltalk really does strike me as image-based programming done right. My previous exposure was Common Lisp, but the image and the source code getting desynchronized was a recurring pain. After deleting a function but missing some of its uses, for example, the code might work fine until you reloaded the source into a clean image. In Smalltalk, that doesn't happen, because the image is the code.
That's because Lisp isn't image-based at all. It merely has a live environment, much to the frustration of anybody who wants to dump the running state of their Lisp system to disk.
The Lisp Machine keyboard had dedicated open and close parenthesis keys, so you could hold a hefty bag of nitrous oxide in your other hand while you typed s-expressions.
I've been sort of thinking of fiddling with a Lisp on the Cog VM for a while. That VM does a bunch of things I want (including the images people are talking about here).