Forth compiler in one literate x86 assembly file (annexia.org)
78 points by mbrubeck on Nov 15, 2009 | 10 comments

This is a gem: I've read it before and I read it again.

This is what I long for: simple self-contained bootstrapping systems that can eventually be used to build anything. These days it seems that all we have is that "anything", and we just glue the pieces together without ever understanding how everything works.

That's what got me started in computers anyway: to understand how _everything_ works.

A wonderful thing about Forth-style languages is that they are amenable to many implementation strategies, and if you are looking for a quick-and-dirty DSL you can usually do it as a Forth.

Something I did recently is a source-to-source compiler that emits a series of function calls, e.g.:

1 2 +

may translate to

stack.push(1); stack.push(2); add();

in an Algol-type language. You don't get niceties like a "real" namespace or interactivity with this approach, but if the function calls are inlined, the only performance overhead comes from usage of the data stack - making it a good candidate for "glue code" that doesn't need to be dynamic. It's basically "Forth-like macro-assembler" at this level.
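
A minimal sketch of that translation step, written in Python for illustration. The `WORD_TO_CALL` table and the `translate` function are hypothetical names, not taken from the commenter's compiler; the only behavior assumed is what the comment describes: numbers become pushes and words become function calls.

```python
# Hypothetical mapping from Forth words to target-language calls.
# Only "+" -> "add();" comes from the comment above; the rest are assumptions.
WORD_TO_CALL = {
    "+": "add();",
    "-": "sub();",
    "*": "mul();",
    "dup": "dup();",
}


def translate(source: str) -> str:
    """Translate whitespace-separated Forth tokens into a line of calls."""
    out = []
    for token in source.split():
        if token in WORD_TO_CALL:
            out.append(WORD_TO_CALL[token])
        else:
            # Anything that is not a known word must be an integer literal;
            # int() raises ValueError for an unknown word.
            out.append(f"stack.push({int(token)});")
    return " ".join(out)


print(translate("1 2 +"))  # stack.push(1); stack.push(2); add();
```

Because the output is plain source text, the target compiler is free to inline `add()` and friends, which is where the "only the data stack costs anything" property would come from.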

For things like runtime mutability, interrupt-and-continue behavior, and interactivity, another strategy is to build an interpreter loop where each word is a list or array of data containing two types - values and function calls. Lots of features become possible - even easy - once you do that.
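
One way the interpreter-loop strategy could look, as a rough Python sketch. The names `run`, `square`, and `program` are made up for illustration, and treating "function calls" as Python callables is an assumption about what the comment means.

```python
stack = []


def add():
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def mul():
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


def dup():
    stack.append(stack[-1])


def run(word):
    """Interpreter loop: each item in a word is either a call or a value."""
    for item in word:
        if callable(item):
            item()              # function call: execute it
        else:
            stack.append(item)  # value: push it on the data stack


# A word is just a list, so it can be inspected or changed at runtime.
square = [dup, mul]
program = [3, lambda: run(square), 4, add]

run(program)
print(stack)  # [13]
```

Since `square` is an ordinary list that is looked up when the program runs, replacing its contents changes the behavior of every word that calls it, which is the kind of runtime mutability the comment mentions.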

Since I started experimenting with stack languages I've been trending towards this approach: build a Forth-type API on top of whatever language is best for the task. Then build the app in the Forth. On the one hand you can say "ah, but that's not necessary," but on the other, it focuses your energies around setting up the plumbing well, so that the final solution is small and easy to extend. And that's always a good outcome.

A book called "Threaded Interpretive Languages" by RG Loeliger is about Forth (not about multi-threading). The author is sufficiently excited about the topic that you can envision him falling out of the chair. It also covers the very basics of how to build Forth.

Does forth qualify for the same category as smalltalk and lisp?

Forth is an easy-to-bootstrap language that defines several primitive words.

You use the primitives to define other words in order to achieve your desired effects.

It is a stack-based language.
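
To make "use the primitives to define other words" concrete, here is a toy sketch in Python. It only mimics the `: name ... ;` definition syntax of Forth; the `primitives` and `definitions` tables and the `execute` function are illustrative assumptions, not how any particular Forth is implemented.

```python
stack = []


def binary(op):
    """Build a primitive that pops two values and pushes op(a, b)."""
    def prim():
        b = stack.pop()
        a = stack.pop()
        stack.append(op(a, b))
    return prim


# A handful of primitive words (an assumed, minimal set).
primitives = {
    "+": binary(lambda a, b: a + b),
    "*": binary(lambda a, b: a * b),
    "dup": lambda: stack.append(stack[-1]),
}

definitions = {}  # user-defined words: name -> list of tokens


def execute(tokens):
    tokens = iter(tokens)
    for tok in tokens:
        if tok == ":":
            # ": name body... ;" defines a new word from existing ones.
            name = next(tokens)
            body = []
            for t in tokens:
                if t == ";":
                    break
                body.append(t)
            definitions[name] = body
        elif tok in definitions:
            execute(definitions[tok])
        elif tok in primitives:
            primitives[tok]()
        else:
            stack.append(int(tok))  # anything else is an integer literal


execute(": square dup * ; 5 square 1 +".split())
print(stack)  # [26]
```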

Forth programs tend to be extremely small. Typical usage is in embedded control applications, though much larger programs have been written in Forth. Forth was invented by Chuck Moore, who is still active:


It definitely is a language that is worth learning, if only because it will subtly change your point-of-view.

If you're really interested you might want to read this book:


Forth has incredibly primitive memory management. Of course, you can add your own, but out of the box you don't have any GC. Heck, you don't even have reference types; it's like the hoary days of BCPL where "everything is an int" until you use ints as pointers. PostScript is a very cleaned-up forth, for example.

Bootstrapping a forth-like system on your own is a great exercise; it'll teach you a lot about writing minimal systems. The _Threaded Interpretive Languages_ book is a good read, too.

My personal belief (after seeing many forth-based projects fail) is that it's a terrible language for scaling to real systems and even medium-size teams, but it's got some great ideas. The trouble comes when you do stuff like trying to treat that toy driver ("hey, i just wrote a disk driver in eight lines of forth!") as something that is ready for production. Also, a sea of invented operators like "^!" does not help, though arguably this is more developer discipline than a fault of the language.

[[[pats forth on the head. nice language, now go play]]]

Forth and lisp are very closely related — conceptually, not historically.

If you wrote a lisp where evaluation went left-to-right instead of right-to-left and where the operators took fixed numbers of arguments (so you could drop the parentheses), you would have an unconventional but recognizable dialect of forth.
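
A small Python sketch of that correspondence, under the assumption that every operator has a fixed number of arguments. Nested tuples stand in for s-expressions, and `to_postfix` is a hypothetical helper: it emits each argument first and the operator last, at which point the parentheses carry no extra information.

```python
def to_postfix(expr):
    """Flatten a nested prefix expression into a list of postfix tokens."""
    if not isinstance(expr, tuple):
        return [str(expr)]           # a literal stands for itself
    op, *args = expr
    out = []
    for arg in args:                 # arguments first, in order
        out.extend(to_postfix(arg))
    out.append(op)                   # operator last
    return out


# (+ 1 (* 2 3)) written as nested tuples
print(" ".join(to_postfix(("+", 1, ("*", 2, 3)))))  # 1 2 3 * +
```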

http://en.wikipedia.org/wiki/SECD_machine is an interesting example of something between lisp, forth, and traditional assembly.

Like Lisp, it's a programmable programming language. Like Smalltalk, the original Forth was a self-contained programming environment with full source that you can hack on the fly. (This particular Forth system isn't the best example of that.) Unlike either, it's lower-level; except for things like embedded systems you're unlikely to prefer it to get stuff done. It's most notable for being super-simple to implement on bare metal.

Forth is like the complete opposite of Lisp. But after you reverse so much, you arrive at something similar.

Syntax is of course the most obvious `backward' thing. Conceptually, Lisp moves you away from the machine and closer to math; Forth stays much closer to the machine. Forth does not (normally) include a garbage collector. Naming variables is discouraged. Forth is almost untyped, while Lisp is strongly and dynamically typed.

Awesome. Forth is a great language!
