- I know that Redis does this too (RDB snapshots): while the child process saves the actual snapshot, the parent can operate continuously, relying on the system-provided copy-on-write mechanism
- I have also found this: https://github.com/thomasballinger/rlundo - it overrides the readline library to get an undo function in any REPL that uses it
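The fork()/copy-on-write snapshot trick described in the first bullet can be sketched in a few lines of Python (the dict standing in for Redis's dataset is hypothetical; the mechanism, fork() giving the child a frozen view of the parent's memory, is the real one):

```python
import os

# Sketch of Redis-style RDB snapshotting: after fork(), the child sees a
# copy-on-write snapshot of the parent's memory, while the parent keeps
# mutating its own pages.
state = {"counter": 41}   # hypothetical stand-in for the dataset

r, w = os.pipe()
pid = os.fork()
if pid == 0:                        # child: "saves" the snapshot
    os.close(r)
    os.write(w, str(state["counter"]).encode())
    os._exit(0)

os.close(w)
state["counter"] = 100              # parent mutates freely after the fork
snapshot = int(os.read(r, 64))      # what the child saw: 41, not 100
os.waitpid(pid, 0)
```

The child reports 41 no matter when the parent's write lands, because the write only touches the parent's copy of the page.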
original_conn = conn
conn = do_something_expensive(conn, params)
conn = original_conn # this "grabs" the earlier state, no matter how complex it is
conn = do_something_else(conn, params)
assert <something having to do with both calls>
You could say that changes to the state are forked from the earlier state.
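A minimal sketch of that pattern with immutable values, where rebinding a name really does restore the earlier state (the do_something_* functions are hypothetical stand-ins for real work):

```python
# If state is immutable, every operation returns a NEW value, so a plain
# rebinding "grabs" the earlier state, no matter how complex it is.
def do_something_expensive(conn, params):
    return conn + (("expensive", params),)   # returns a new state

def do_something_else(conn, params):
    return conn + (("else", params),)

conn = ()                      # hypothetical connection state
original_conn = conn
conn = do_something_expensive(conn, {"x": 1})
conn = original_conn           # rewind to the earlier state
conn = do_something_else(conn, {"x": 1})
assert conn == (("else", {"x": 1}),)   # only the second call's effect remains
```

Fork-based snapshotting buys you the same property for mutable state: each child process is effectively an immutable copy of the world at fork() time.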
http://emacshorrors.com/posts/unexecute.html (quite unflattering, as the link suggests, but factually accurate nonetheless)
More precisely, the first version of the TeX program (now called TeX78 or TeX80), which was written in the SAIL programming language at Stanford, worked this way because it was apparently the natural thing to do, and lots of programs at the time were written like that. The present version (a rewrite) of the TeX program (briefly called TeX82, but now just called TeX) was written in (a language/system based on) Pascal and was meant to be portable across different OSes. So in addition to using the system's fork or dump/undump (if available) to snapshot the application state, it contains its own implementation of dumping all the relevant application state into a "format file" and loading from it. In fact, the dumping and undumping were traditionally done by different binaries:
INITEX -> VIRTEX -> TEX
See https://news.ycombinator.com/item?id=13076098 where drfuchs (David Fuchs, who was Knuth's "right-hand man" during his rewrite of TeX) describes it in more detail. (Originally posted on https://news.ycombinator.com/item?id=13073566 "The Emacs dumper dispute", and also quoted in https://news.ycombinator.com/item?id=14140421 "Improving startup time in Atom with V8 snapshots".) See also https://tex.stackexchange.com/questions/417624/installation-... for some elaboration (though it's confusing and now I realize it has at least one error).
This one is also kind of old:
If anyone has any updates on these I'd be interested.
As far as I remember App Engine also did a similar thing to avoid reinitializing a Python VM for every request.
> Crash the already difficult to drive vehicle into a nearby rock
> You died
Why not? (Experimental) versioned file systems exist.
But I think the idea has been around forever.
There's more to checkpoints than just fork(), though. For example, when two processes are sharing memory, checkpointing the process group fork()s both processes and then must explicitly duplicate the shared memory region; otherwise the checkpoint would share memory with the original, which would be disastrous. Unfortunately, this means we don't get COW behaviour for shared memory regions; the Linux/POSIX APIs are a bit deficient here.
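The caveat above can be demonstrated in a few lines: a MAP_SHARED region is still shared with the child after fork(), so a fork()-based checkpoint that doesn't copy it explicitly would see the original's later writes. This is a sketch of the failure mode, not of any particular checkpointing tool:

```python
import mmap, os

# An anonymous MAP_SHARED mapping (the default for fd -1 on POSIX).
shared = mmap.mmap(-1, 4096)
shared[:5] = b"PARNT"

pid = os.fork()
if pid == 0:
    shared[:5] = b"CHILD"   # child writes to the shared region...
    os._exit(0)

os.waitpid(pid, 0)
leaked = bytes(shared[:5])  # ...and the parent sees it: no COW isolation
```

Private (MAP_PRIVATE) pages get copy-on-write semantics across fork(); shared mappings deliberately do not, which is exactly why a checkpointer has to duplicate them by hand.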
It has another mode where you can record and rewind execution traces, which sounds more like the description, but I'm almost certain that doesn't use fork. https://sourceware.org/gdb/onlinedocs/gdb/Process-Record-and...
LVM - low (block) level, hardware supported, generic, unaware of the filesystem layer above (so it may not be optimal for some use cases), but at least this gives some separation of concerns, and it can still provide some impressive features: resize, encryption, snapshots
CoW filesystems (btrfs, zfs, ...) - specialized, carefully designed data structures/layout; they can have more advanced features and be more optimized for given use cases (they have more information about what is actually happening), but they are not a generic, separable layer - all the fancy advanced features are tied to the high-level fs implementation
- I’ve used fork() in Ruby for similar reasons when a library had instability or memory leaks and I wanted to isolate it rather than try to find the bug (I’m looking at you, ImageMagick).
A while back, MRI Ruby got an improved garbage collector that keeps the mark bits separate from the objects just so copy-on-write would (mostly) work in cases like this.
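The isolate-in-a-child pattern from the comment above can be sketched generically; this is Python for brevity, but the Ruby version works the same way with fork and IO.pipe (the `isolated` helper is a hypothetical name, not a library API):

```python
import os, pickle

def isolated(fn):
    """Run fn in a forked child; leaks and crashes die with the child,
    and only the (pickled) result travels back over a pipe."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child
        os.close(r)
        os.write(w, pickle.dumps(fn()))
        os._exit(0)
    os.close(w)                         # parent
    with os.fdopen(r, "rb") as f:
        data = f.read()                 # EOF once the child exits
    os.waitpid(pid, 0)
    return pickle.loads(data)

result = isolated(lambda: sum(range(10)))
```

Any memory the buggy library leaks is reclaimed wholesale when the child exits, which is the whole point.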
Unfortunately the maintainers didn't like the idea so it won't be merged. But if your project is large enough that ninja takes more than a second to parse your build files every time you build (true for Chromium and Android at least), you might want to try the patch.
On the topic of Windows forking, the RtlCloneUserProcess function does exist and it works. The problem is that Microsoft's implementation of the Win32 API keeps state (such as an IPC connection to csrss.exe) that is invalid in the forked process and there's no easy way to clone or reinitialize that state. The child process will run until its next Win32 API call and then probably die. Since almost every Windows program uses Win32 APIs, RtlCloneUserProcess is not useful for forking Windows programs.
Microsoft could fix this if they wanted to, at least for a useful subset of Win32. But I imagine it requires a lot of hacking in deep, dark parts of Windows that nobody wants to touch.
Obviously this won’t happen unless Bazel adds, like, a million things that Chromium developers think are essential features, but still, I wouldn’t be surprised. Bazel already has a ton of developer effort thrown at it to make it work well with large code bases and recompile large, complicated projects quickly. The forking idea (or at least, the client/server model) is already used by Bazel, and Bazel is capable of managing subprocesses for compiling specific languages. This is how Bazel’s TypeScript support works, for example. When you run Bazel, it spawns a server which hangs out in the background, and that server will run a TypeScript compiler in the background. Compiling a TypeScript project with Bazel is significantly faster than compiling from the command line, without the hacky/fragile bits that tsc --watch gives you. And obviously none of this would be possible unless the TypeScript compiler were written to heavily cache things and run as a library (see https://youtu.be/f6TCB61fDwY).
When started up, it would do all kinds of slow (at the time) initialization of its data structures. Then the genius move kicked in - it saved the memory image to disk as the executable file! Next time it ran, it was already initialized and started up instantly. It blew my mind.
I liked the idea so much, that years later when I worked on the text editor I use (MicroEmacs), I did the user configuration the same way. Just change the global variables, then patch the executable file on the disk with the changed values. It was so, so much simpler than defining a configuration file, serializing the data, writing/reading the file, and deserializing it. It worked especially great on floppy disk systems, which were incredibly slow.
But then came viruses, malware, etc. Patching the exe file became a huge "malware alert", and besides, Microsoft would mark the executable as "read only" while it was running. Too bad.
Come to think of it, I think there are applications to my own work... I'd have to be careful, though.
Seeing is believing! :)
> I don't want to wait for dmd to know how much faster you made it.
Here are the timings from the video in table form:
Normal compilation (without forking): 9.163 seconds
Full compilation + creating forks: 10.464 seconds
Final step only (code generation + linking): 2.847 seconds
After editing entry point file only: 3.792 seconds
After editing deep dependency (first build): 9.562 seconds
After editing deep dependency (second build): 4.675 seconds
I've been working on building the Zig self-hosted compiler this way from the ground up, except with reference counting the cached stuff rather than using fork(). This lets me do the caching at the function level, even more fine grained than the file level. Here's a 1min demo: https://www.youtube.com/watch?v=b_Pm29crq6Q
If it is not possible with common infrastructure, what do you base your estimate on?
Also, I'm following Jonathan Blow's streams (the Jai language). With his own trivial x64 backend he has/had a very real compilation speed of 80K lines/s, and only the parser was parallelized. I think he intends to improve that to 800K lines/s. Note that his language is quite a bit more work to compile than a basic C-like language (which is about as easy as you can get if you discount the parsing model).
7 seconds is plenty fast enough to make “watch” the cheapest and most reliable solution to the speed problem.
With clang/gcc/msvc, I think I'm more in the ballpark of 3K-30K lines/sec (-O0, 100-1000 lines/file, not counting basic std* includes).
We tried plugging in the rlundo tool https://github.com/thomasballinger/rlundo to IPython, it works ok, see http://ballingt.com/interactive-interpreter-undo/ for the long story.
I wanted something like this for live programming and felt like I needed to write my own interpreter because the behavior you need is so different: http://dalsegno.ballingt.com
I'm basically thinking about ArcGIS Model Builder that I used a lot in university. It was a great way to make complex process pipelines for GIS data, but only re-run the pieces that change. It allowed me to experiment at a very fast pace.
Tangentially, it’s not clear to me what relevance D should have for any greenfield project, where there is not already a heavy investment in a code base to consider.
First there’s the whole Rust comparison, and then the incredible impact of the comparative size and momentum of ecosystems for various languages.
This reddit thread has a few interesting comments. It was quite surprising to see D’s architect/designer discount the value of memory-management issues in a sub-link there. Anecdotally, from what I’ve read and experienced, memory management is one of the most important sources of bugs when you consider both how many there are and the time it takes to debug and fix them.
Another one of Cybershadow's articles on D that blew my mind is, IMO, one of the greatest arguments for why you should care about D. The beauty, conciseness and flexibility of the implementation here is extremely cool. And this is not a one-off case; there are many programmers in the D community that create awesome stuff like this all the time. Writing in D is like having superpowers, and going back to a less expressive language feels horribly constraining.
Vladimir has one hell of a blog.
1 - It doesn't match the code one-to-one and is in a funky non-S-expr form where the function name is outside the parens
2 - As expressions are evaluated and you pop up the stack, it dynamically fills in parts of the higher expression, so the "stack" is morphing and changing. It's cool and convenient, but also disorienting.
3 - Each line/frame can be horribly long (like a whole let or cond expression on one line), and it's not clear which section/term/subexpression is being called in the frames below
I'm using the normal debug-on-entry and I'm definitely no pro, so maybe there is a more ergonomic way to debug? (In the little Clojure I've done, it seems to be the same.)
For example: A depends on B, which depends on C, so the graph is A->B->C. To compile A, B must first be compiled, and for B, C must first be compiled. The compilation order has to be the reverse of the dependency graph.
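In other words, a valid compilation order is a topological sort of the dependency graph. A minimal sketch of the A->B->C example (module names are from the example above; this is not DMD's actual scheduler):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Map each module to the modules it depends on (its predecessors in the
# compilation order): A depends on B, B depends on C.
deps = {"A": {"B"}, "B": {"C"}, "C": set()}

# static_order() yields dependencies before their dependents.
order = list(TopologicalSorter(deps).static_order())
# order == ['C', 'B', 'A']: C first, then B, then A
```

For a chain like this the order is unique; in a general graph any topological order works, and independent modules can be compiled in parallel.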
>how that's possible?
The compiler first determines dependencies and then compiles. The compilation then happens per file, so that part is trivial.
I haven't tried applying this to either compiler. Being able to perform code generation serially would benefit those the most, because their backends are considerably slower than DMD's, however the template instantiation problem currently prevents that. I don't think it's insurmountable, probably just needs someone very familiar with that part of the compiler to look at it.
It turns out there isn't much purpose to that. DMD can build an AST from source so fast, that building an AST from parsing a serialized version will hardly be any faster.
Vladimir sidestepped that by saving a memory image via fork.
Hacker News threads: like Hacker News fibers, but less cooperative.
It didn't click that the title was also talking about the fork() syscall. I think it's a pun though: "To try it yourself, check out the dmdforker branch in my dmd GitHub fork."
I guess I've learned to forget/ignore the titles on hackernews after using them to decide whether to click through, since they so often mess with them.
That reminds me that "fork" when used as a verb on GitHub's UI should more accurately be "cloud clone".
GitHub's terminology has muddied the waters between the concept of actual forks (longterm divergent development) and a particular implementation of a permission system for branches.
Calling a branch a fork is just a water-muddying term, probably coined by some suit in marketing who saw the buzzword somewhere in an article in Internet Explorer and thought they would re-purpose it as a product differentiator.
Like I said, "fork" already has a useful meaning with regard to an OSS project, so there's no need to overload it. Is the following sentence not confusing? When you fork on GitHub you create a fork but not every fork is a fork.
I say the former situation is an obfuscated permission system and the latter is a fork.
[ I've gone rather off-topic from TFA so will stop commenting now! ]
However, D's compile-time metaprogramming facilities allowed us to get ambitious in some places... for instance, std.uni precomputes some Unicode lookup tables during compilation, and std.regex makes heavy use of metaprogramming to compile regular expression strings to D code, again during compilation. As a result, making use of those features will result in heavier load on the compiler.
On the other hand, for the little things it would be weird. Would you do that for a single regular expression?