A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems by Richard W.M. Jones
"LISP is the ultimate high-level language, and features from LISP are being added every
decade to the more common languages. But FORTH is in some ways the ultimate in low level
programming. Out of the box it lacks features like dynamic memory management and even
strings. In fact, at its primitive level it lacks even basic concepts like IF-statements.
Why then would you want to learn FORTH? There are several very good reasons. First
and foremost, FORTH is minimal. You really can write a complete FORTH in, say, 2000
lines of code. I don't just mean a FORTH program, I mean a complete FORTH operating
system, environment and language. You could boot such a FORTH on a bare PC and it would
come up with a prompt where you could start doing useful work. (...)
Secondly FORTH has a peculiar bootstrapping property. By that I mean that after writing
a little bit of assembly to talk to the hardware and implement a few primitives, all the
rest of the language and compiler is written in FORTH itself. Remember I said before
that FORTH lacked IF-statements and loops? Well of course it doesn't really because
such a language would be useless, but my point was rather that IF-statements and loops are
written in FORTH itself."
I like the feel of concatenative languages. I especially seem to prefer postfix notation over prefix, and in my opinion that split is largely what creates the Lisp-vs-Forth paradigm divide at the first step.
I’ve recently dug into Factor and it’s quite an amazing piece of art that is highly usable and up to date despite such a small community. It almost feels like a secret weapon and too good to be true. I’m having a ton of fun hacking on it. It’s like Smalltalk and Forth had a baby.
I dream of making an OS with Forth/Factor similar to Plan9 and Oberon.
I get a kick out of Forth being made of words and Genesis from the Bible.
At first was the Word, and the Word was with Forth, and the Word was Forth.
Tidbit if anyone cares: I’ve been obsessing over returning to first principles of computing. I’m bored of the internet and browsers. I want to have fun with software AND hardware and Forth seems like such a perfect language to do exactly that with. Factor is a nice grown up example of that but it’s definitely a few steps removed from portability and self bootstrapping behavior like in CollapseOS.
Retro Forth is extremely impressive: http://forth.works
Oberon System 3 with its gadgets framework or the current AOS (still used as teaching tool at ETHZ) also have interesting ideas, and better support for systems programming.
The book "The Oberon Companion: A Guide to Using and Programming Oberon System 3" is available within the System 3 environment that ships with AOS.
You can find some old ISOs from here https://github.com/cubranic/oberon-a2
Also, following on from A2, Composita introduces a component architecture to allow managed memory without GC, and contextless switches:
IMHO better than Rust, and I'd love to see this style of memory management paired with a minimal Lisp, e.g. so we can build more deterministic high-level structures.
Also have a look at Mesa/Cedar, as it was the inspiration for Oberon ideas.
Comments include a link to one hour demo on YouTube with members of the team.
As an idea, many of these concepts can be done on modern OSes, on top of something like COM/D-Bus/gRPC/XPC, but none of them goes as far as those systems did.
The “hook” for me was seeing things like ifs and loops being written in Forth itself from simpler primitives. I’ve always been a fan of bootstrapping a general purpose computing environment from a small set of primitives, and Forth lets you do exactly that.
> I’m bored of the internet and browsers. I want to have fun with software AND hardware
I can 100% relate to that feeling. Discovering Forth has been therapeutic for me in a sense. I spend the day dealing with AWS and Java and logs and metrics for large distributed systems, and that brings me zero joy or motivation. I spend my work days looking forward to my time with Forth in the evening.
I think it was CollapseOS that sent me down this rabbit hole :)
"My minimal stack of layers is – problem, software, hardware. People working on the problem (algorithms, UI, whatever) can't do software, not really. People doing software can't do hardware, not really. And people doing hardware can't do software, etc.
The Forth way of focusing on just the problem you need to solve seems to more or less require that the same person or a very tightly united group focus on all three of these things, and pick the right algorithms, the right computer architecture, the right language, the right word size, etc. (...)
So you need at least 3 teams, or people, or hats, that are to an extent ignorant about each other's work. Even if you're doing everything in-house, which, according to Jeff Fox, was essentially a precondition to "doing Forth". So there's another precondition – having people being able to do what at least 3 people in their respective areas normally do, and concentrating on those 3 things at the same time. Doing the cross-layer global optimization.
It's not how I work."
That’s largely where we are today in many sectors of society: experts who understand only the nuances of one niche layer, patched into a long series of layers of indirection, because nobody takes the time to fully understand what they are working with from first principles or fundamentals. Today’s OSes are computation stacks that resemble band-aids and houses of cards, with people piling on intricate complexity without a conceptual understanding that can be effectively reasoned about.
I understand that simple is hard. However, when a system fits entirely in one's head, one can reason about it and therefore extend it completely. I believe all systems should be this way. Anything else risks corruption or assimilation, as a minority controls the majority through complexity over time.
Sorry I kind of went off on a tangent there.
Tl;dr: systems should fit within anyone’s ability to reason about them conceptually. Anything too complex is an attack vector and a disaster waiting to happen.
I don't remember who said that, but it's a really smart observation: Civilization is not just the state of society when there are lots of smart professionals, orchestrated together to provide infrastructure, and tools, and goods, and services, it's a state of society when almost everyone routinely uses the fruits of labour of those professionals without even being aware of that labour. That's the point: that it Just Works™. You don't have to think about it.
Obviously, most people can't do that, given the limits of skill and time available to them.
The thing is, "genius" doesn't scale. At some point, if you want to scale production up, you'd need more people. And most of them won't be geniuses, they will be normal people, in the very literal sense of the word "normal": they'll be somewhere in the middle of the bell curve on most of the parameters; and even the number of parameters in which they're outliers too will be in the middle of some bell curve. So either your tools and processes accommodate normal people, or your company won't make it. Heck, even if you somehow manage to gather together 10 geniuses, it's a (biased) coin toss whether they'll work together productively.
And that means you will at some point arrive at systems (and organizations!) that nobody comprehends in their entirety. At this stage you can only hope that those systems managed to grow up resilient, that is, that they can be maintained, repaired, and extended on a piece-by-piece basis. Sometimes they can't be (cf. most government departments and small-to-medium businesses at the latest stages of their life cycle), so they collapse and are (eventually) replaced by something else. Sometimes it turns out that the organization that collapsed was your whole civilization, and now you're in the Dark Ages. That sucks, but the only way out of the Dark Ages is gradually building up a new complex system, and while it will start as a small and seemingly perfect new way to run things, it will eventually grow large and incomprehensible in its entirety. If it doesn't, it will be supplanted by one that can grow so; there is really no escape from this.
Perhaps I’m optimistically doomed. However that’s the world I want to build towards.
Plus don't forget about such a thing as politics (in the broad sense of the word). Get 10 people together to work on anything, and there's already a good chance they'll form 2-3 subgroups. Get 100, and now you have to actually think about management and about persuading people to move things in roughly the same direction.
Outside of the truly mentally challenged, every individual has the capability to express genius. We are individually limited in today's society based on the accessibility and affordance of being curious.
If the entire human species was able to be curious and experiment on any idea they had then we'd see everyone was "true genius" in their own regard, in my optimistic opinion and theory of what the human animal can do.
Knowledge has to be accessible to all, resources to survive have to be abundant to all, and resources to experiment with ideas must be available to all. These are lofty requirements but it's possible and I want to do whatever I can to get there and I do not want to settle for the status quo of today.
The world is malleable today, and it rarely gets this malleable in history. When it's this hot, big changes can take place, for better or worse. The opportunity to do big things is exactly now, without having to settle for how "the world works". Lots of things can change now, and we have the technology to do some amazing things; old structures pin us down or erect a kind of "reality distortion field" that limits the degrees of freedom available to each of us.
Maybe I'm delusional, but, at least I'll have fun working on what I dream of.
Think of the biology of the human body or practically any other industry. I've been out of college for about a decade and have moved roles several times at my company, and I've only managed to learn a sliver of the complexity of my department. We have public documents that are thousands of pages long, and that doesn't even get into the software. It's true that a much simpler solution exists, like the industry had prior to computers, but it was probably an order of magnitude less efficient. Now we save large amounts of money at the expense of much higher complexity. You need a lot more people because each person can only specialize in a few things and has to abstract away the rest, trusting that co-workers are taking care of the parts you don't understand. It's frustrating, but I think most industries have this.

Think of a nuclear power plant. There are hundreds of systems and processes. Is it even possible to learn them all in a single career, past a bare minimum? What about outside the plant? The plant connects to the transmission grid, which is operated by one entity, and you have power marketers who sell your plant's output into a market operated by another organization running sophisticated software to optimize the region. There are several regulatory bodies involved, from the NRC to the state utility commissions to NERC and FERC, that all have some regulatory jurisdiction over you. Once your electricity gets on the distribution grid it is handled by utilities, and so forth. The world is just insanely complex.
Could it be simpler? Yes, but there is an efficiency cost to that as well. Chuck Moore's chips are cool, but do they solve any current problems? If they were significantly better, people would be jumping on them. The reality is they're probably better for some niche uses. Certainly not enough to leave the confines of Windows or Linux with a traditional toolchain. Even without all the inertia it would be a tough sell. I don't know anything about complexity theory, but I imagine computer science is under the same rules as the rest of the universe. This also might be where Chesterton's Fence is relevant. We shouldn't argue to take down the fence when we don't know why it was put up to begin with.
It also erodes the peak innovation potential of the entire species by increasing the distance to experiment and be curious while also creating greater need of dependence to survive.
I think a lot of your examples can be fixed.
Nuclear power plants are actually simple; they need to be simple. Complexity kills them, and humans are the weakest link. Process is introduced to tame complexity. Complex ideas introduce more complex processes to include more humans, which ends up creating less efficiency. There are many examples of that. A big reason startups can compete with corporations is that startups have less complexity, but then introduce more as they age, usually through poor problem-solving, and get slower. Companies like Apple have punched above the average weight to survive those chasms of complexity and still move quickly unburdened, but obviously they have their own class of warts.
More thoughts here: https://news.ycombinator.com/item?id=24479238
Agree with you but also I think we can do better is basically what I'm saying. I don't see the world today as "this is just how it is and that's that" - there's no physical rules why we have a society the way it is today - it's mostly imaginative rules. I want to change the game, the current board is lame.
A nuclear plant isn't simple. There's a reason why they require more than 1 person to maintain and that they have only existed for 0.00000000001% of human history. Conceptually they're simple if you black-box abstract the majority of the plant which is loaded with technology from the structure to the cooling structures, sensors, waste disposal, regulatory structures which prevent another Chernobyl...etc etc. You're abstracting all of that necessary complexity away (just like we do with modern tooling) and calling it simple when it isn't.
There's a lot to unpack there.
Another way to describe it is that software and hardware are only becoming increasingly complex. What is the total number of people who understand this complexity end to end? That is a very small elite. If anything goes wrong, like an infrastructure failure or some unknown-unknown event, then the entire idea of "Just Works" will surely break. This is a huge attack vector for humanity.
I'm super interested in the subject: https://github.com/akkartik/mu. In the framework of this thread I focus on 2 of the 3 layers: problem and software. Hardware is out of scope. For now. Recently I've started thinking about BIOS (inspired by https://caseymuratori.com/blog_0031). So perhaps it's only a matter of time.
Hence the title of probably the scariest thing ever broadcast on TV:
(Edited to add that while I haven't done any significant Forth work in probably 20 years, I've long been intrigued by Factor and I keep meaning to try 8th, as I've got some small simple program ideas that would be lovely to be able to run on Mac, iOS, and Android.)
Oberon inspired Acme, and pforth runs on Plan 9: https://github.com/Plan9-Archive/pforth9 (Factor can't be built on Plan 9 due to missing C++ features.)
I went through a Forth phase and realized that it's a good fit within the industrial control and motion segments. I started writing a motion control library I called forthaxis, which took coordinates and spat out trajectory commands for servo drives. I didn't get very far, but it was a fun exercise.
Edit: I never used one (or indeed ever saw one) but for some reason reading about Forth at the time gave me a lingering fascination with Forth type languages - which was useful when I worked on a project for a few years that used PostScript, C & Lisp ...
Forth is an extensible virtual machine
Most interpreted languages that are implemented with a virtual machine (sometimes known as a bytecode interpreter) have a fixed set of instructions. Modulo some implementation details, writing new Forth words is akin to extending the virtual machine with new instructions.
However, the article's Forth is a more extensible form of this. The entire evaluation of the virtual machine is split into an extensible set of states (starting with only interpret, compile, head, forth, lit), new types of instructions can be introduced by adding to a list of heads (starting with only DOCOL and EXIT), and then words are, specifically, lists of addresses that the forth state interprets, whose evaluation is triggered by a DOCOL seen in the head state.
This opens up possibilities to define other sorts of interpreters for specific purposes. The article gives the example of polynomial evaluation, but you could also do something like have a variant of the forth state for token threaded code in memory-constrained environments. Sort of like how ARM has an additional Thumb instruction set.
More detail for this last idea: you'd have an array of, say, 256 pointers, and in the bforth state you'd read the next byte of the thread, look up the address in the table, push it on the RS, then go into the head state. This gives you perfect interoperability. You'd just need a special bcompiler state that would look up the bytecode for each word.
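To make the byte-token idea concrete, here is a minimal sketch in Python. All the names (`TABLE`, `bforth`) are invented for illustration, and primitives are plain Python functions standing in for machine-code addresses; a real implementation would push the looked-up address on the return stack and re-enter the head state.

```python
# Sketch of token-threaded dispatch: each byte of the thread indexes a
# 256-entry table of primitives, trading a little speed for a very
# compact thread (one byte per word instead of one cell).

stack = []

def lit_one(): stack.append(1)
def dup():     stack.append(stack[-1])
def plus():    stack.append(stack.pop() + stack.pop())

TABLE = [None] * 256   # token -> primitive (or word entry point)
TABLE[0] = lit_one
TABLE[1] = dup
TABLE[2] = plus

def bforth(thread):
    """Interpret a bytes thread: each byte is an index into TABLE."""
    for token in thread:
        TABLE[token]()

bforth(bytes([0, 1, 2]))   # the thread for:  1 DUP +
print(stack)               # [2]
```

A `bcompiler` state would then be the inverse mapping, looking up each word's byte in a dictionary while compiling.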
I've read similar ideas on Lisp. I don't put much stock in it in either context, but I think Lisp has a better claim to extensibility. All programming languages enable decomposition of a program. Functions/methods/macros/words don't really 'extend' a language.
You can't easily add garbage collection to Forth. You can't easily add static type-checking to Forth. Why call it 'extensible'? It's no more extensible than C.
It may be easier to hack on a Forth implementation than a C compiler, sure.
The fact that this is exposed inspires a certain way of programming, and this seems to be essential to the Forth way.
This isn't saying too much about the language itself.
Extensibility refers to the ability to extend the compiler directly in the language. Lisp supports this with macros, which are not functions and are different from macros in C because they alter the behavior of the compiler. Forth supports this with defining words and compiling words, which behave differently from other Forth words because they, too, alter the behavior of the compiler. Compare this to a macro in C, which is pre-processed into more C before compilation rather than executed by the compiler at compile time. In this respect, both Forth and Lisp are more extensible than C because the facilities to extend the language exist within the language itself.
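As a toy illustration of compiling words, here is a Python sketch in which IF and THEN are "immediate" words that execute during compilation and back-patch a branch target, the same mechanism a real Forth uses. The token format and every name here are invented for the sketch, not taken from any actual Forth.

```python
# Toy model of Forth-style compile-time extensibility: IF and THEN run
# DURING compilation and patch branch targets into the code being
# compiled. As in real Forth, the data stack holds the address that is
# waiting to be patched.

stack = []
out = []    # collects values "printed" by '.'

def compile_if(code):
    code.append(('0branch', None))       # placeholder branch target
    stack.append(len(code) - 1)          # remember where to patch

def compile_then(code):
    addr = stack.pop()
    code[addr] = ('0branch', len(code))  # back-patch IF's branch to here

IMMEDIATE = {'if': compile_if, 'then': compile_then}

def compile_word(tokens):
    """Compile tokens into threaded code; immediate words run right now."""
    code = []
    for tok in tokens:
        if tok in IMMEDIATE:
            IMMEDIATE[tok](code)         # the compiler being extended
        elif tok.lstrip('-').isdigit():
            code.append(('lit', int(tok)))
        else:
            code.append(('call', tok))
    return code

def run(code):
    ip = 0
    while ip < len(code):
        op, arg = code[ip]
        ip += 1
        if op == 'lit':
            stack.append(arg)
        elif op == '0branch':
            if stack.pop() == 0:
                ip = arg                 # branch taken on false
        elif op == 'call':
            {'dup': lambda: stack.append(stack[-1]),
             '.':   lambda: out.append(stack.pop())}[arg]()

# Roughly  : demo DUP IF 42 . THEN ;  executed with 1 on the stack:
stack.append(1)
run(compile_word(['dup', 'if', '42', '.', 'then']))
print(out)   # [42]
```

The point of the sketch is that conditionals are not built into `run` at all; they arise from two ordinary routines that happen to execute at compile time.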
How about BOOST_SCOPE_EXIT in C++?  Is it extending the language, or just a trivial macro-based hack on RAII? What's the difference?
My other issue with it is that the major features of programming languages that people actually care about, aren't the kinds of things you can implement with clever macros. They tend to rely on serious engineering.
Even in a highly extensible language, you can't easily throw together features like industrial-strength garbage collection, or a type system, or borrow checking. (I considered listing formal verification, but that would be unfair; it can't really be added to an existing language.) These things are implemented by compiler engineers, and always will be. (If I'm mistaken about that, and am underestimating anything, I'd be interested to know, but I don't think I am.)
What's the 'killer example' of LISP macros? I can see a pretty neat example at .
I can't say anything about adding a type system, because it's unclear what sort of type system would even apply here. But at least in Common Lisp (which incidentally already has a type system), I think the way you'd do it is create your own front-end that interprets and checks all the typing/borrowing information and then passes this to the usual compiler. There's a long tradition in Lisps to define interpreters that extend the base language in some way.
SBCL even lets you hook into the compiler, and I saw some article about someone adding vectorization to SBCL using only user-level code.
Regarding BOOST_SCOPE_EXIT, it's yes or no depending on what extending the language means to you. I think it's more useful to use a definition where the answer is 'yes.' It's true that C macros are not a very powerful interface for language extension, though.
Also, it seems like you're saying that "extensible" means "easily extended" or "doesn't take serious engineering." It just means you can extend it (and, I'd add, in a principled way). Writing good Lisp macros can take serious engineering, but also there are other ways to extend Lisp.
In Lisp, you can (and do) define pattern match destructuring and lvalue assignments entirely using macros. These are features that require a new edition of a language standard, usually.
Sure, but to do this properly you'd need to get your hands dirty with the implementation. Garbage collection doesn't work well as a bolted-on library, it needs to be handled in the language implementation. Forth is no different from any other language in that regard.
> I can't say anything about adding a type system, because it's unclear what sort of type system would even apply here.
It's directly analogous to static type-checking in a 'normal' programming-language, except we use the stack for accepting parameters and for returning values.
It would ensure each word manipulates the stack in the expected way, ensuring that the appropriate number of elements are consumed and pushed, and that each element is treated as being the correct kind of data (type-checking should fail if you attempt to dereference an integer).
This has been done in the Kitten programming language.  Java bytecode verifiers do something similar.
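The depth half of such a check is small enough to sketch. The Python toy below verifies declared stack effects (cells popped and pushed per word) and rejects underflow, much as a bytecode verifier tracks operand-stack depth. Real checkers such as Kitten's also track the type of each element; the table and names here are invented.

```python
# Minimal static stack-effect checker: each word declares how many
# cells it pops and pushes; we verify a definition never underflows
# and compute its net effect, without ever running it.

EFFECTS = {          # word -> (pops, pushes)
    'dup':  (1, 2),
    'drop': (1, 0),
    'swap': (2, 2),
    '+':    (2, 1),
    'lit':  (0, 1),
}

def check(words, start_depth=0):
    """Return the stack depth after `words`, or raise on underflow."""
    depth = start_depth
    for w in words:
        pops, pushes = EFFECTS[w]
        if depth < pops:
            raise TypeError(f'stack underflow at {w!r} (depth {depth})')
        depth = depth - pops + pushes
    return depth

print(check(['lit', 'dup', '+']))   # 1, i.e. the net effect ( -- n )
# check(['+'])  would raise: '+' needs two cells on an empty stack
```

Extending this to full type-checking means replacing the integer depth with a list of types and unifying them at branches, which is exactly where it stops being a library and starts being a compiler.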
> SBCL even lets you hook into the compiler, and I saw some article about someone adding vectorization to SBCL using only user-level code.
Neat. I like the idea of advanced optimisations as libraries, sounds like a good research topic.
> it seems like you're saying that "extensible" means "easily extended" or "doesn't take serious engineering." It just means you can extend it (and, I'd add, in a principled way)
Perhaps I seem dismissive, but it just doesn't seem to me like many serious language features can be done properly in extensible languages.
The latest major features in mainstream programming languages are await/async and borrow checking. I imagine it might be possible to implement await/async using Lisp macros, but I really doubt the same goes for borrow checking.
> In Lisp, you can (and do) define pattern match destructuring and lvalue assignments entirely using macros.
That's pretty impressive, but it still seems that there are plenty of useful language-extensions that can't practically be implemented in extensible languages.
You can add garbage collection because the Forth program is a language implementation. It probably won't look exactly like ANSI Forth in the end, but that's (theoretically though potentially not in practice) ok.
> It's directly analogous to static type-checking in a 'normal' programming-language, except we use the stack for accepting parameters and for returning values.
I'm familiar with type systems for stack languages, but the issue is that this only really could apply to a standardized Forth language. Forth is only incidentally concatenative and stack-based --- this just happens to have both a simple implementation and has good compositional properties. There is nothing stopping you from introducing new programming models in a Forth, and the article gives a few examples. You can add arbitrary bytecode interpreters to a Forth, for example, and easily make it interoperable with the threaded interpreter in the base Forth. Nothing stopping you from adding words so your Forth feels like a register machine, either. It's because of this that any sort of type system seems doomed (other than one that describes state transitions of the CPU...).
> Perhaps I seem dismissive ...
I sort of don't see the point of saying "it's not really extensible if I can't extend it in all possible ways." In any case, Forth and Lisp are, for trivial reasons, languages that let you embed arbitrary other languages within them (it's about as exciting as how Turing completeness seems to be easy to meet -- which is to say, not very). Worst case, the cost is implementing a whole compiler or interpreter for said language, but, still, it's possible. Common Lisp gives you many hooks to change low level behaviors, so there is usually a better way than this worst case.
Something like borrow checking, though, is a pervasive new feature. The issue is that everything needs to know about how ownership is transferred. It's no different from, even in C, changing some basic struct the whole application uses and then having to update everything to account for it. You could add borrow checking to Common Lisp, but it would have to be in demarcated areas in which borrow checking is being done, and, like Rust, you'd have to figure out an 'unsafe' to be able to use all the Lisp (respectively, C and C++) that's already out there.
> formal methods
Being able to bolt on embedded formal methods to a programming language is active research. I think it's unreasonable to expect this of any language right now, other than ones specifically designed for it :-) (Speaking as a Lean user.)
Forth doesn't really seem to be the kind of thing where practitioners would care about formal methods... It's very defensible to say, then, that Forth is wrong because of this (we depend quite a lot on software being correct!). But, I don't see anything about Forth that prevents you from defining words that check formal specifications -- and I don't mean this in the trivial adding-a-formally-checked-language-into-Forth way.
Anyway, I don't think it's good or bad to be extensible. It's just a property, and Forth and Common Lisp happen to be examples of languages that are much more extensible than usual. There are certainly engineering challenges either way when it comes to extensibility. And, with an extensible language, while you might be able to make bespoke solutions, you now have a bespoke (hopefully small) language to maintain, too. Language design ability and programming ability don't necessarily go hand-in-hand, either...
Extensibility by itself doesn't solve the terrarium problem because it doesn't define any standard, so there isn't anything to build on. When presented with extensibility, you still build the terrarium, it's just a customized one, and if you do it ground-up like Forth, you can potentially build it smaller and simpler. But in the end you still have a terrarium with assumed boundaries.
This has led me away from Forth-the-language in the last few months to explore the terrarium problem further, and I hit on the idea of treating this as an organizational issue solved at the level of the core UX. If the problem is defining boundaries in software, we should have better ways of doing that.

At first I considered this in terms of selection - selection being one of the pillars of structured programming, and many of our improvements taking the form of easier selection metaphors. This gradually led me to explore the idea of document editing with a binder and sticky notes metaphor, with a supplemental compiler process taking the resulting complex, layered documents and processing them into a linear form for consumption.

The presumption is that if I present a rich set of tools for defining types of divisions and groupings defined skeumorphically - pages, bins, tape, wire, guides, stickies, overlays - and then add a bit of labelling and indirection on top, a powerful "thinking tool" should emerge where the organization is easy and the compilation system makes it easy to query and traverse it in a customized way.
If I can finish designing it.
These days, I wonder if it's possible to run Forth inside a VM as the operating system. How hard would it be to stand up a networking stack?
This would be a reasonable afternoon project (except for the optimization stuff).
None of the C code is complex or tricky or big. You could easily do it in assembly instead. That's why these kinds of languages are great for a lot of embedded applications. You only need to write a small assembly core to support your language, plus assembly functions to support your hardware, and then you can do everything else in your higher-level FORTH-like language.
Thanks again, four years later!
I'll get the compilation phase done tomorrow, barring surprises.
> loader [...] provides a scripting language that can be used to automate tasks, do pre-configuration or assist in recovery procedures. This scripting language is roughly divided in two main components. [...] The bigger component is an ANS Forth compatible Forth interpreter based on FICL, by John Sadler.
> BUILTINS AND FORTH
> All builtin words are state-smart, immediate words. If interpreted, they behave exactly as described previously. If they are compiled, though, they extract their arguments from the stack instead of the command line.
> If compiled, the builtin words expect to find, at execution time, the following parameters on the stack:
> addrN lenN ... addr2 len2 addr1 len1 N
> where addrX lenX are strings which will compose the command line that will be parsed into the builtin's arguments. Internally, these strings are concatenated in from 1 to N, with a space put between each one.
> If no arguments are passed, a 0 must be passed, even if the builtin accepts no arguments.
> While this behavior has benefits, it has its trade-offs. If the execution token of a builtin is acquired (through ' or [']), and then passed to catch or execute, the builtin behavior will depend on the system state at the time catch or execute is processed! This is particularly annoying for programs that want or need to handle exceptions. In this case, the use of a proxy is recommended. For example:
> : (boot) boot ;
> FICL is a Forth interpreter written in C, in the form of a forth virtual machine library that can be called by C functions and vice versa.
> In loader, each line read interactively is then fed to FICL, which may call loader back to execute the builtin words. The builtin include will also feed FICL, one line at a time.
> The words available to FICL can be classified into four groups. The ANS Forth standard words, extra FICL words, extra FreeBSD words, and the builtin commands; the latter were already described. The ANS Forth standard words are listed in the STANDARDS section. The words falling in the two other groups are described in the following subsections.
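To make the stack protocol quoted above concrete, here is a small Python sketch of how a compiled builtin might gather its arguments. Plain strings stand in for the (addrX, lenX) pairs, and the function name is invented; this models only the convention described in the manpage, not FICL itself.

```python
# Models the quoted loader(8) convention: a compiled builtin pops N,
# then N strings (string 1 on top of the stack), and joins them with
# spaces, in order 1..N, to rebuild the command line it will parse.

def pop_builtin_args(stack):
    """Pop N and the N argument strings; return the rebuilt command line."""
    n = stack.pop()
    parts = [stack.pop() for _ in range(n)]    # string 1 is popped first
    return ' '.join(parts)                     # concatenated from 1 to N

# Stack grows to the right: addr2/len2 ("-v"), addr1/len1 ("kernel"), N=2.
stack = ['-v', 'kernel', 2]
print(pop_builtin_args(stack))   # kernel -v
```

This also illustrates why the manpage warns about state-smart words: the same builtin reads the command line when interpreted but this stack layout when compiled, so its behavior depends on system state at execution time.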