Hacker News
D compilation is too slow and I am forking the compiler (thecybershadow.net)
251 points by ingve on Nov 21, 2018 | 116 comments

It is interesting to use fork() to snapshot application state. It seems so high-level and powerful (compared to other C/POSIX-level stuff - so I'm not comparing it to Scheme/Lisp/"your favorite high/low level environment" here!). We may collect some other examples here - just for fun:

- I know that Redis does this too (RDB snapshots): while the child process saves the actual snapshot, the parent can operate continuously, relying on the system-provided copy-on-write mechanism

- I have also found this: https://github.com/thomasballinger/rlundo - it overrides the readline library to add an undo function to any REPL that uses it

Other examples?

This is, incidentally, one of the built-in advantages of immutable data structures (in the languages that feature them). For example, I recently had to write a test that worked with a fairly complex connection object. The execution of the test changed that object, but later on I wanted to reset it to its earlier state - this was as simple as doing something like this in Elixir:

    original_conn = conn
    conn = do_something_expensive(conn, params)
    assert <whatever>
    conn = original_conn # this "grabs" the earlier state, no matter how complex it is
    conn = do_something_else(conn, params)
    assert <something having to do with both calls>
Basically, if your language is immutable, then any reference to any state (no matter how complex the data structure) is the state it was at that point in time

You could say that changes to the state are forked from the earlier state.
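The same idea can be sketched outside Elixir; here Python tuples stand in for immutable data, and the names (`conn`, `original_conn`) mirror the example above but are otherwise made up:

```python
# With immutable values, holding a reference *is* the snapshot: nothing
# can mutate the old value out from under you.
conn = ("connected", 1)
original_conn = conn                      # "snapshot": just a reference
conn = conn + ("expensive_change",)       # builds a new value; old one intact
assert original_conn == ("connected", 1)  # earlier state, unchanged
assert conn == ("connected", 1, "expensive_change")
```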

The TeX program (by Donald Knuth) does this.

More precisely, the first version of the TeX program (now called TeX78 or TeX80) which was written in the SAIL programming language at Stanford was written in this way, because it was apparently the natural thing to do and lots of programs at the time were written like that. The present version (rewrite) of the TeX program (briefly called TeX82 but now just called TeX), was written in (a language/system based on) Pascal and meant to be portable for different OSes. So in addition to using the system's fork or dump/undump (if available) to snapshot the application state, it contains its own implementation of dumping all the relevant application state into a "format file", and loading from it. In fact, the dumping and undumping were traditionally done by different binaries:


See https://news.ycombinator.com/item?id=13076098 where drfuchs (David Fuchs, who was Knuth's "right-hand man" during his rewrite of TeX) describes it in more detail. (Originally posted on https://news.ycombinator.com/item?id=13073566 "The Emacs dumper dispute", and also quoted in https://news.ycombinator.com/item?id=14140421 "Improving startup time in Atom with V8 snapshots".) See also https://tex.stackexchange.com/questions/417624/installation-... for some elaboration (though it's confusing and now I realize it has at least one error).

Not exactly a fork, but Emacs is famous (or notorious) for saving its initial environment by basically coredumping and loading the coredump on startup if the startup files haven't changed:


http://emacshorrors.com/posts/unexecute.html (quite unflattering, as the link suggests, but factually accurate nonetheless)

Only of historical relevance, but it sure gave me a good chuckle of the "omg I can't believe that's how they made that work!" sort when I first read it: Ultima Online fork()ed its server process(es) for marshalling in-process data into files on NFS, and used that mechanism to persist world state: https://news.ycombinator.com/item?id=9820645

Chrome and Android have both fork()ed warmed-up processes to speed up application startup. Although, looking at this 2011 post, it's more complicated than I thought in Chrome:


This one is also kind of old:


If anyone has any updates on these I'd be interested.

As far as I remember, App Engine also did a similar thing to avoid reinitializing a Python VM for every request.

Web servers did this all the time to skip parsing all of your CGI scripts on each request.


There is a time-travelling debugger that forks the process at each statement, allowing you to go back to a previous statement and continue from that point.
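A minimal sketch of the underlying trick (POSIX only; real time-travel debuggers such as rr are far more sophisticated, and the pipe-based handshake here is just one made-up way to park and wake the snapshot):

```python
import os

counter = 1                              # some mutable program state

resume_r, resume_w = os.pipe()           # parent -> child: "rewind now"
result_r, result_w = os.pipe()           # child -> parent: state at rewind

snap = os.fork()                         # checkpoint: child is a frozen copy
if snap == 0:
    os.close(resume_w); os.close(result_r)
    os.read(resume_r, 1)                       # sleep until "rewind"
    os.write(result_w, str(counter).encode())  # still 1, thanks to COW
    os._exit(0)

os.close(resume_r); os.close(result_w)
counter = 2                              # keep executing past the snapshot
os.write(resume_w, b"g")                 # "rewind": wake the frozen copy
rewound = int(os.read(result_r, 16))     # the child saw the old state
os.waitpid(snap, 0)
print(f"now={counter}, rewound={rewound}")  # now=2, rewound=1
```

Copy-on-write means taking the snapshot is nearly free no matter how big the process is; only pages written after the fork get duplicated.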

It seems like forking so often would use up a system's resources very quickly. I'd imagine that each fork needs at least a handful of individual pages for OS bookkeeping overhead, so you'd get into the gigabyte range fast even with perfect copy-on-write.

Yes, you can't run like that forever. But you could break, step forward and backward a bit, and then resume. I often step forward while debugging, realise that whatever is going wrong has already gone wrong, and would like to step back a few steps.

It's more than a handful of pages - each forked process requires its own set of page tables.

It's cool but you can't un-write to disk, or un-send network packets making its applicability limited.

> you can’t un-write to disk

Why not? (Experimental) versioned file systems exist.

I think the broader point I'm making is that these time traveling debuggers can't work in the general case due to externalities.

Very interesting. Do you have a link where I can read more about this?


But I think the idea has been around forever.

Mozilla's rr (mentioned in the article) is a time-traveling debugger, but as far as I know it doesn't use fork() for snapshots / time-travel.

rr does use fork() internally to create checkpoints, and checkpoints are created implicitly as the program executes to support time-travel. The COW behaviour of fork() is extremely helpful for performance.

There's more to checkpoints than just fork(), though; for example, when two processes are sharing memory, checkpointing the process group fork()s both processes and then must explicitly duplicate the shared memory region - otherwise the checkpoint would share memory with the original, which would be disastrous. Unfortunately this means we don't get COW behaviour for shared memory regions; the Linux/POSIX APIs are a bit deficient here.
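The shared-memory pitfall is easy to demonstrate: an anonymous shared mapping is not subject to fork()'s copy-on-write, so a post-fork write in one process is visible in the other. A minimal POSIX-only sketch (not how rr actually does it, just an illustration of the API gap):

```python
import mmap, os

shared = mmap.mmap(-1, 4096)     # anonymous mapping, shared with children
shared[0:5] = b"begin"

resume_r, resume_w = os.pipe()
seen_r, seen_w = os.pipe()

if os.fork() == 0:               # child plays the role of the checkpoint
    os.close(resume_w); os.close(seen_r)
    os.read(resume_r, 1)           # wait for the parent's post-fork write
    os.write(seen_w, shared[0:5])  # report what the "snapshot" now sees
    os._exit(0)

os.close(resume_r); os.close(seen_w)
shared[0:5] = b"after"           # mutate *after* the fork
os.write(resume_w, b"x")
leaked = os.read(seen_r, 5)
os.waitpid(-1, 0)
print(leaked)                    # b'after': the write leaked into the child
```

A checkpointer therefore has to memcpy such regions by hand, losing the COW advantage for them.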

gdb has checkpoint/restart that uses fork(), though you have to specify the checkpoints manually. https://sourceware.org/gdb/onlinedocs/gdb/Checkpoint_002fRes...

It has another mode where you can record and rewind execution traces, which sounds more like the description, but I'm almost certain that doesn't use fork. https://sourceware.org/gdb/onlinedocs/gdb/Process-Record-and...

The Factorio dedicated server has an 'experimental' fork-to-save option, which is massively helpful when late-game saves take 10-15 seconds.

Alright, I'm geared up, I'm going to go take out this hive-

> Saving

> Crash the already difficult to drive vehicle into a nearby rock

> You died

But I suppose immutable data structures can be stored more efficiently using a heap with byte-size granularity, rather than by forking, which uses pages of perhaps thousands of bytes.

There is another fairly similar design decision split:

LVM - low (block) level, hardware supported, generic, unaware of the filesystem layer above (so it may not be optimal for some use cases), but at least this gives some separation of concerns, and it can still provide some impressive features: resize, encrypt, snapshots

CoW filesystems (btrfs, zfs, ...) - specialized, carefully designed data structures/layout; they can have more advanced features and be more optimized for given use cases (they have more information about what is actually happening), but they are not a generic separable layer - all the fancy advanced features are tied to the high-level fs implementation

Why would you think so? A page is 4096 bytes, and it's hard-coded into the kernel (and the hardware). With byte-sized granularity, you are just introducing extremely large overhead for page table entries and making the TLB a lot less effective.

I wasn't suggesting to make the page size smaller, but instead to not fork the compiler and use a single heap for all versions of the data.

- The Resque background job queue (popular in the Ruby world) forks itself to run each job, to isolate the scheduler from any badness in the jobs.

- I’ve used fork() in Ruby for similar reasons when a library had instability or memory leaks and I wanted to isolate it rather than try to find the bug (I’m looking at you, ImageMagick).

A while back, MRI Ruby got an improved garbage collector that keeps the mark bits separate from the objects just so copy-on-write would (mostly) work in cases like this.

I used this forking trick to speed up the ninja build system too: https://github.com/ninja-build/ninja/pull/1438

Unfortunately the maintainers didn't like the idea so it won't be merged. But if your project is large enough that ninja takes more than a second to parse your build files every time you build (true for Chromium and Android at least), you might want to try the patch.

On the topic of Windows forking, the RtlCloneUserProcess function does exist and it works. The problem is that Microsoft's implementation of the Win32 API keeps state (such as an IPC connection to csrss.exe) that is invalid in the forked process and there's no easy way to clone or reinitialize that state. The child process will run until its next Win32 API call and then probably die. Since almost every Windows program uses Win32 APIs, RtlCloneUserProcess is not useful for forking Windows programs.

Microsoft could fix this if they wanted to, at least for a useful subset of Win32. But I imagine it requires a lot of hacking in deep, dark parts of Windows that nobody wants to touch.

That’s an interesting note about Chromium, I wonder if it will move at some point to Bazel given that Chromium's latest build script generator GN might be called Yet Another Build System That Looks Suspiciously Like Bazel. (Others: https://www.pantsbuild.org/ https://buckbuild.com/ https://please.build/)

Obviously this won’t happen unless Bazel adds, like, a million things that Chromium developers think are essential features but still, I wouldn’t be surprised. Bazel already has a ton of developer effort thrown at it to make it work with large code bases well, and recompile large, complicated projects quickly. The forking idea (or at least, the client/server model) is already used by Bazel, and Bazel is capable of managing subprocesses for compiling specific languages. This is how Bazel’s TypeScript support works, for example. When you run Bazel it spawns a server which hangs out in the background, and that server will run a TypeScript compiler in the background. Compiling a TypeScript project with Bazel is noticeably and significantly faster than compiling from the command line, without the hacky / fragile bits that tsc --watch gives you. And obviously none of this would be possible unless the TypeScript compiler were written to heavily cache things and run as a library (see https://youtu.be/f6TCB61fDwY).

I have successfully reinitialized enough Win32 in a forked process to be able to throw up a MessageBox, but without source I'd never be able to confidently state that every Win32 call would work correctly.

Cool, is your implementation public somewhere? I'd be interested to see it.

Back in the 1970's, I was fascinated by the (now famous) ADVENT game and how it worked, so I studied the Fortran source code.

When started up, it would do all kinds of slow (at the time) initialization of its data structures. Then the genius move kicked in - it saved the memory image to disk as the executable file! Next time it ran, it was already initialized and started up instantly. It blew my mind.

I liked the idea so much, that years later when I worked on the text editor I use (MicroEmacs), I did the user configuration the same way. Just change the global variables, then patch the executable file on the disk with the changed values. It was so, so much simpler than defining a configuration file, serializing the data, writing/reading the file, and deserializing it. It worked especially great on floppy disk systems, which were incredibly slow.

But then came viruses, malware, etc. Patching the exe file became a huge "malware alert", and besides, Microsoft would mark the executable as "read only" while it was running. Too bad.

GNU Emacs does a similar-sounding trick in its build process (called 'dumping' / 'unexec'), but not at runtime.


Come to think of it, I think there are applications to my own work... I'd have to be careful, though.

If you're looking at implementing something along those lines these days, take a look at CRIU: https://www.criu.org/Main_Page

It seems like someone should come up with an "executable+data storage" file format (where virus scanners know to only check changes in the executable section). Maybe they can make one that is supported cross-platform while they're at it...

Impressive demo and lovely writeup. That was fun to watch.

I've been working on building the Zig self-hosted compiler this way from the ground up, except with reference counting the cached stuff rather than using fork(). This lets me do the caching at the function level, even more fine grained than the file level. Here's a 1min demo: https://www.youtube.com/watch?v=b_Pm29crq6Q

Why are the results in a video instead of a table? I don't want to wait for dmd to know how much faster you made it.

> Why are the results in a video instead of a table?

Seeing is believing! :)

> I don't want to wait for dmd to know how much faster you made it.

Here are the timings from the video in table form:

  Normal compilation (without forking):         9.163 seconds
  Full compilation + creating forks:           10.464 seconds
  Final step only (code generation + linking):  2.847 seconds
  After editing entry point file only:          3.792 seconds
  After editing deep dependency (first build):  9.562 seconds
  After editing deep dependency (second build): 4.675 seconds

would have preferred this format as well. I hate when the goods are buried in a video.

7 seconds is slow? I was happy when I decreased a 32-hour build to 4 hours.

7 seconds for a rebuild after a small edit is already a little unergonomic when you are iterating heavily, but it all depends on the project. Simple C-style code should compile at ~1M lines of code per second on today's machines without heavy optimization, maybe compiling on 4 cores or so. (Not possible with common infrastructure: C with gcc/clang/msvc, large headers per compilation unit, many small files, object-file-style linking, and the awfully slow linkers - I wonder if they could be faster.)

I knew one guy who used C++ but hated the STL and most of the standard library. He usually implemented data structures in place as he needed them (linked list, growable array, etc.). While it seems strange to reimplement similar data structures over and over again, it worked for him. And one particularly wonderful thing was how fast his projects compiled. He had one project with a few libraries, around 100k lines, and it compiled very fast - something like a second from scratch, and that was 10 years ago.

That's the way I learned C++. While I mostly use the STL now as it's much more convenient to write, there is an elegance to writing one's own tools.

you'd also be shocked, SHOCKED at how inefficient some of the STL implementations of basic data structure algorithms are

I've almost purposely avoided studying what the STL does because of that. I have some of my own notions on how some of the data structures ought to be built in order to be efficient, and sometimes I implement them myself to be sure I get the behavior I want.

Of special note is the unordered_map/unordered_set implementations.


imagine all the CS research that relies on the STL expecting it to behave optimally

> ... should compile at ~1M lines of code per second ...

> Not possible with common infrastructure

If it is not possible with common infrastructure, what do you base your estimate on?

My own experiments. I have recently bought a new PC with an i5-7600, and my own Pascal parser, written in straightforward unoptimized C, parses about 2M lines of code per second on a single thread.

Also, I'm following Jonathan Blow's streams (Jai language). With his own trivial x64 backend he has/had a very real compilation speed of 80K lines/s, and only the parser was parallelized. I think he intends to improve that to 800K lines/s. Note that his language is quite a bit more work to compile than a basic C-like language (which is about as easy as you can get if you discount the parsing model).

The linkers could definitely be faster! I've seen reports of Rust debug builds going from ~6 seconds to ~2 seconds by moving from gold to lld

Seven seconds is already pretty fast. Might have warranted some profiling and a brief attempt to speed up the parts of the compiler that your usage hits, since that could potentially be useful to everybody and thus magnify the value of cutting one second off a build. But this? Was a cool learning experience, no more, no less.

7 seconds is plenty fast enough to make “watch” the cheapest and most reliable solution to the speed problem.

I'm not sure what you're talking about with respect to the C compiler. C compilers are the fastest around even before being paralllelized - and they can be.

C parsing is rather slow due to the preprocessor (also: huuuge include files) and due to context-dependent syntax (need symbol table). And compiling itself need not be slow, but the popular compilers are not exactly super fast even with -O0. LLVM in particular has a reputation of being somewhat bloated. And finally, due to common practice of using a standard object file format, which is slow to link (at least with common tools), the whole build experience is not exactly blazingly fast.

With clang/gcc/msvc, I think I'm more in the ballpark of 3K-30K lines/sec (-O0, 100-1000 lines/file, not counting basic std* includes).

gold is pretty fast. I don't have a source but those numbers seem very off to me. C compilers are fast. C++ much less so, but C++ is a different language.

I tried to follow gold's author's series on linkers once but gave up. Now that I've written an ELF-64 writer and know a little more about linking I should give it another shot. In any case, I think I've heard of still faster linkers than gold, and at the time I also found gold's object oriented architecture (as it was described in the series) questionable.

gold is sure fast compared to bfd.ld, but then you should try lld from the LLVM project which is much faster in general.

Do not measure other people's problems by your scale.

The first part of this reads like an intro to algorithms class where I think they cover all of these things. Directed graphs, back edges (cycles), convert to a DAG (no cycles), strongly connected components, topological sorting, etc. Nice to see it applied to solve the problem at hand.

I wish that Jupyter used some kind of similar trick, to allow quickly recalculating a notebook starting at some cell.

The Jupyter model is hard for this, but if you only allowed in-order execution (like a REPL - say IPython, whose kernel is used in Jupyter) you could do something like this.

We tried plugging in the rlundo tool https://github.com/thomasballinger/rlundo to IPython, it works ok, see http://ballingt.com/interactive-interpreter-undo/ for the long story.

I wanted something like this for live programming and felt like I needed to write my own interpreter because the behavior you need is so different: http://dalsegno.ballingt.com

If a notebook was written functionally without side effects, couldn't each function be explicitly or implicitly decorated with an @memoized ?
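In Python terms, something like functools.lru_cache could play that role; `load_and_clean` here is a made-up stand-in for a side-effect-free notebook step:

```python
from functools import lru_cache

@lru_cache(maxsize=None)         # "memoized": repeat calls are free
def load_and_clean(path):
    # stand-in for an expensive, pure notebook computation
    return path.upper()

load_and_clean("data.csv")       # computed
load_and_clean("data.csv")       # cache hit, nothing recomputed
print(load_and_clean.cache_info().hits)  # 1
```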

That's not really how notebooks are written, at least in my experience. Even if there aren't side effects, work is often done at the lowest level without defining functions at all. Also, the fact that function definitions can change makes it tricky, and I suspect you'd run into lots of subtle issues: e.g. if you modified a function foo and a subsequent function bar calls foo, then I think calls to bar would return stale memoized data.

Yes, agreed. I'm spitballing. I was thinking about how a hypothetical type of notebook could be structured in frames, where each frame can have side effects, but ultimately its input comes from the output of the previous frame. So each frame could be memoized. The UI could indicate what frames are being re-processed.

I'm basically thinking about ArcGIS Model Builder[1] that I used a lot in university. It was a great way to make complex process pipelines for GIS data, but only re-run the pieces that change. It allowed me to experiment at a very fast pace.

[1] http://pro.arcgis.com/en/pro-app/help/analysis/geoprocessing...

there’s a single algorithm (Tarjan’s) that actually solves all your graph needs. It returns a reverse topologically sorted DAG of strongly connected components
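A compact recursive sketch of Tarjan's algorithm (the dict-of-successor-lists graph encoding and the sample graph are made up for illustration; a production version would avoid deep recursion):

```python
def tarjan_scc(graph):
    """Strongly connected components of graph (node -> successor list),
    returned in reverse topological order of the condensation DAG."""
    index, low, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop(); on_stack.discard(w); comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

# A -> B -> C, plus a cycle B <-> D: sinks come out first.
print(tarjan_scc({"A": ["B"], "B": ["C", "D"], "C": [], "D": ["B"]}))
# [['C'], ['D', 'B'], ['A']]
```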

Yes, but I need to sort by two criteria, and Kosaraju's algorithm is simpler to implement :)

I used to use your actionscript3 decompiler, thank you cybershadow!!! It was a great tool :)

Ask Fabrice Bellard to write tdc after tcc [1] :-)

[1] https://bellard.org/tcc/

Nice work thank you for sharing.

Tangentially, it’s not clear to me what relevance D should have for any greenfield project where there is not already a heavy investment in a code base to consider.

First there’s the whole Rust comparison, and then the incredible impact of the comparative size and momentum of ecosystems for various languages.

This Reddit thread has a few interesting comments. It was quite surprising to see D’s architect/designer discount the value of memory management issues in a sub-link there. Anecdotally, what I’ve read and experienced is that it’s one of the most important sources of bugs when quantity and time to debug and fix are considered.


> Tangentially it’s not clear to me what relevance D should have for any greenfield projects where there is not already heavy investment in a code base to consider.

Another one of Cybershadow's articles on D that blew my mind is, IMO, one of the greatest arguments for why you should care about D. The beauty, conciseness and flexibility of the implementation here is extremely cool. And this is not a one-off case; there are many programmers in the D community that create awesome stuff like this all the time. Writing in D is like having superpowers, and going back to a less expressive language feels horribly constraining.

Perhaps are you referring to this article?


Vladimir has one hell of a blog.

Yes, thanks. Forgot to actually post the link.

This is all so complicated, and reminds me of the joy of doing incremental development in Common Lisp with Emacs and SLIME. Edit-Compile-Run-Repeat cycles feel archaic and slow compared to the much shorter and faster EditFunction-RunFunction cycles.

My only experience is with ELisp/Clojure, and you're right that the turnaround time is much faster with editing, but at least in ELisp I find "stepping" through code much more confusing. The call stack (or whatever the equivalent is called) has some weird characteristics:

1 - It doesn't match the code one-to-one and is in a funky non-S-expr form where the function name is outside the parens

2 - As expressions are evaluated and you pop up the stack, it will dynamically fill in parts of the higher expression, so the "stack" is morphing and changing. It's cool and convenient but also disorienting.

3 - Each line/frame can be horribly long (like a whole let or cond expression on one line), and it's not clear which section/term/subexpression is being called in the frames below

I'm using the normal debug-on-entry and I'm definitely no pro, so maybe there is a more ergonomic way to debug? (In the little Clojure I've done, it seems to be the same.)

I believe that this is an artifact of elisp being essentially interpreted, but I could be wrong. With Common Lisp and SBCL, you only see function calls in your stack trace.

Gotcha. Thanks. It's probably some issue with my workflow. I'll ask on /r/emacs and see if they give me any tips :)

this is a very interesting and nice post. thanks! i love this use of fork. it should be obvious from the description of the function that it can be used like this, but it really never even came to mind. cool!!

Did anyone else read this title as "D compilation is too slow and I am forking the compiler (in source control because I think I can do a better job writing it)"?

Everybody thought that until they read the article, which was clearly the intention of the author.

It's comparable to go, maybe a bit slower. In comparison to DMD or LDC, rustc feels like endlessly grinding. I haven't used GDC.

I wish somebody would fix GCC/G++ this way. I was stupid enough to try installing webkitgtk from the AUR (as I couldn't find which binary Arch package would make it work in Python); many hours have already passed, I have already written the program I wanted with Qt instead of Gtk while waiting (although the whole computer works annoyingly slowly during the process), and it is still compiling...

What would be involved to make ccache [1] support D?

[1] https://ccache.samba.org/

I understood it up to "Note how the compilation order is the reverse of the topological order (we’ll get back to this in a bit)." But then couldn't find where you got back to it, and can't quite understand why it's the reverse of the topological order or how that's possible?

>can't quite understand why it's the reverse of the topological order

For example: A depends on B, which depends on C, so the graph is A->B->C. To compile A, B must first be compiled, and for B, C must first be compiled. So the compilation order has to be the reverse of the topological order of the dependency graph.

>how that's possible?

The compiler first determines dependencies and then compiles. The compilation then happens per file, so that part is trivial.
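That reverse order falls out of a plain postorder depth-first traversal of the dependency graph; a minimal sketch (module names made up):

```python
def compile_order(deps):
    """deps maps a module to the modules it depends on. Postorder DFS
    emits dependencies before dependents, i.e. a valid compile order
    (the reverse of the usual topological order)."""
    order, seen = [], set()

    def visit(m):
        if m in seen:
            return
        seen.add(m)
        for d in deps.get(m, ()):
            visit(d)
        order.append(m)          # emitted only after all dependencies

    for m in deps:
        visit(m)
    return order

print(compile_order({"A": ["B"], "B": ["C"], "C": []}))  # ['C', 'B', 'A']
```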

What's the story over in gdc and ldc land? I presume since the plumbing for this sort of thing is already there (presumably), they might already be doing these sorts of AOT/incremental builds.

They share the same frontend as DMD, which doesn't have any built-in capacity to hibernate/serialize compilation results (parsed AST) to disk.

I haven't tried applying this to either compiler. Being able to perform code generation serially would benefit those the most, because their backends are considerably slower than DMD's, however the template instantiation problem currently prevents that. I don't think it's insurmountable, probably just needs someone very familiar with that part of the compiler to look at it.

> doesn't have any built-in capacity to hibernate/serialize compilation results (parsed AST) to disk

It turns out there isn't much purpose to that. DMD can build an AST from source so fast that building an AST by parsing a serialized version will hardly be any faster.

Vladimir sidestepped that by saving a memory image via fork.

The original author of D is surprisingly missing from the conversation.

Walter's D9000 computer is monitoring this thread in his stead.

Not polling, I trust.

Hacker News threads: like Hacker News fibers, but less cooperative.

LibreOffice Online also uses a warmed-up process that forks child processes to execute requests.

Note: in this case "fork" means fork() in the sense of creating a new process, not as in duplicating the source tree.

This discussion thread is hilarious (and adorable) to me. Makes me feel old (in a good way) that there are folks for whom the default meaning of "fork" is git(hub), rather than the unix fork syscall.

OHHH. Haha I had to read your comment before I understood the title. I knew they were talking about the fork syscall while reading the article, but I still assumed the title referred to "forking the compiler on GitHub, so that they could make it better and send a pull request."

It didn't click that the title was also talking about the fork() syscall. I think it's a pun though: "To try it yourself, check out the dmdforker branch in my dmd GitHub fork."

Double OHHH. Just realized upon reading your comment that I'd read the title/article the same way, and then forgotten. I guess “forking the D compiler” naturally leads to the source-forking context.

I guess I've learned to forget/ignore the titles on hackernews after using them to decide whether to click through, since they so often mess with them.

Hasn't the term 'fork' been overloaded long before GitHub (or even git)? Project forks have been a thing for as long as there have been open source projects...

Yep. Although it was github with the t-shirts… :-)

I think it is an intentional double entendre

my internal confusion wasn't intentional though

> duplicating the source tree

That reminds me that "fork" when used as a verb on GitHub's UI should more accurately be "cloud clone".

GitHub's terminology has muddied the waters between the concept of actual forks (longterm divergent development) and a particular implementation of a permission system for branches.

As soon as you create your own commits on a clone of the repository you then have a 'fork' in the chain of commits.

We used to just call it a 'branch' because it was just another branch in the distributed revision tree. There is nothing special about a branch you create and commit to, whether it's by clicking on a picture on the screen or typing a command in a terminal.

Calling a branch a fork is just a water-muddying term, probably coined by some suit in marketing who saw the buzzword somewhere in an article on the Internet Explorer and thought it could be repurposed as a product differentiator.

More likely the founders of GitHub were not expecting world domination and so did not consider every ramification of a word choice.

It's not the same thing though. A GitHub 'fork' has a different origin than a regular branch.

It's the third git repository, which is on your own hard drive, that has those.

That's a second-order observation. We could use any noun for that if and when it's relevant - I suggest alternadoodley.

Like I said, "fork" already has a useful meaning with regard to an OSS project, so there's no need to overload it. Is the following sentence not confusing? When you fork on GitHub you create a fork but not every fork is a fork.

Every fork is a fork. They're just not all long-lived or significant, which seems to be the distinction you're drawing here.

Put it this way: when the user presses the Fork button in the GitHub UI, is their intention to contribute to the project, or to create a competitor/alternative to the project?

I say the former situation is an obfuscated permission system and the latter is a fork.

[ I've gone rather off-topic from TFA so will stop commenting now! ]

Isn't that also true of a branch in the same repo?

Yes - branching is done in Git by forking the sequence of commits into two, giving you your own fork of the software.

I thought D was proud of compilation times and claimed to be useful as a scripting language?

Correct. The language is designed to allow code written in it to compile quickly, and the reference implementation is very fast (both the front-end, and back-end). Small, script-size programs compile quickly.

However, D's compile-time metaprogramming facilities allowed us to get ambitious in some places... for instance, std.uni precomputes some Unicode lookup tables during compilation, and std.regex makes heavy use of metaprogramming to compile regular expression strings to D code, again during compilation. As a result, making use of those features will result in heavier load on the compiler.

I don't like code that generates stuff at compile time precisely for this reason - I would much rather use some template language to pre-generate the source (I've even seen php used) and then compile it quickly. Of course that then needs support from the build system to correctly invoke the generator when the templates change and you run the risk of some developer changing the generated source instead of the template and wreaking havoc. I would very much like to have my cake and eat it - powerful code generation facilities in the language itself, but with a compiler that's as smart as the build system and can cache these for reuse when recompiling the same file. That would also allow easy integration with IDEs. For now, I've decided that the better balance is generating code using an external generator, checking everything in source control and requiring responsible people to do code review. YMMV of course, depending on what you're compiling and who you're working with.

I agree with you. It feels natural for lexer/parser generators for example (Yacc, Bison, AntLR, etc). Also for FFI bindings.

On the other hand, for the little things it would be weird. Would you do that for a single regular expression?

It's pretty damn fast every time I use it, but I don't do as much metaprogramming. It doesn't surprise me that a specific portion of D takes up time; it makes total sense to me that metaprogramming would, and throw in some regex and voila. But he's definitely a D wizard in my book; I've known his work for a while now. I'm laughing at the fact that 7 seconds is too long - I think it's a reasonable amount of time, but I'm usually compiling C#. The most time for me when building D is spent pulling in libraries using dub (which could potentially be improved, but I'm no expert in the matter), although if I enable parallel compilation it usually breezes right through my code.

D compilation: not only slow, but also high memory usage. E.g. a simple program with searching and regex eats 2-3 GB of memory. Sigh!

I haven't had this problem. I wrote a daemon to blend colours for my RGB keyboard in D and it stays below 8192 bytes.

See comment about regexes:

> However, D's compile-time metaprogramming facilities allowed us to get ambitious in some places... for instance, std.uni precomputes some Unicode lookup tables during compilation, and std.regex makes heavy use of metaprogramming to compile regular expression strings to D code, again during compilation. As a result, making use of those features will result in heavier load on the compiler.

