
Yosefk - My history with Forth & stack machines - astrange
http://www.yosefk.com/blog/my-history-with-forth-stack-machines.html
======
impeachgod
I've always been fascinated with Chuck Moore's VLSI design tools. How does he
manage to design working VLSI chips with such limited functionality? Does he
do complex simulation by hand? Wouldn't it get hard laying out each individual
transistor?

~~~
thesz
I think he uses a combinatoric approach.

You can describe a gate using transistors. Then you can describe a circuit
using gates, and so on.

This is wildly popular in functional programming, and in Haskell in
particular.

Take a look at those two libraries: Lava:
<http://hackage.haskell.org/package/chalmers-lava2000> and
<http://hackage.haskell.org/package/york-lava>

and Wired: <http://hackage.haskell.org/package/Wired>

Wired is especially interesting because it lets you simultaneously describe a
circuit and its layout.
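
To make the layering concrete, here is a rough sketch in Python rather than Haskell. It is not the Lava or Wired API (all names below are made up for illustration), and it starts from a NAND gate instead of transistors. The point is only that each level is described purely in terms of the one below it.

```python
# Illustrative sketch only: plain functions standing in for circuit
# combinators. None of these names come from Lava or Wired.

def nand(a, b):
    """The primitive gate; everything below is composed from it."""
    return not (a and b)

def inv(a):
    return nand(a, a)

def and_(a, b):
    return inv(nand(a, b))

def or_(a, b):
    return nand(inv(a), inv(b))

def xor(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def half_adder(a, b):
    """Return (sum, carry)."""
    return xor(a, b), and_(a, b)

def full_adder(a, b, cin):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, or_(c1, c2)

def ripple_adder(xs, ys):
    """Add two equal-length little-endian bit lists; result includes carry-out."""
    carry = False
    out = []
    for a, b in zip(xs, ys):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]
```

For example, `ripple_adder([True, True, False], [True, False, False])` adds 3 and 1 bit by bit, using nothing but `nand` underneath.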

------
pauldirac137
I feel a rant coming on...

This is a _fantastic_ article which should be read by anyone interested in
stack languages.

Though it wasn't my first programming language, Forth was certainly the first
programming language I loved. I read Starting Forth and was blown away by the
elegance of the language and by how easily you could get the language to
modify itself. It also started a fascination with stack languages that
persists to this day.

However, when I tried to actually do anything useful with Forth, I kept
banging my head against a wall. The typical problem is that stack juggling is
just too hard for most problems, and even when it isn't, it's more work than
just writing down an algebraic expression would be. Stacks work well for some
problems, but evaluating arbitrary algebraic expressions isn't one. Why would
I want to do that? asks the Forth zealot. Because I do! Because I don't want
to have to worry that I won't be able to do that easily in the future when I
do need to. So even though I've designed my own Forths and other stack
languages since then, rule #1 is always: support local variables. Even if I
don't need them, I want them. If you don't believe me, translate this problem
into Forth or some other stack language in a clear, intuitive way without
using local variables:

    
    
        # python code to compute solutions of a*x**2 + b*x + c == 0:
        import math
        def solve_quadratic_equation(a, b, c):
            # assume a != 0, no complex roots
            d = math.sqrt(b * b - 4.0 * a * c)
            return ((-b - d) / (2.0 * a), (-b + d) / (2.0 * a))
    

Three lines of code, completely transparent. Now see this thread:
<http://osdir.com/ml/lang.concatenative/2005-01/msg00023.html>. You'll see how
difficult this is to do without local variables in a stack language. The
reason is easy to spot: you are using values in a pretty random-access way,
not in a streamed way. This is common in algebraic equations. And it's not
stack-friendly.
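
To give a feel for the juggling, here is a sketch of a tiny stack machine in Python with one way to write the same computation using only `dup`, `swap`, `over` and `rot`. The interpreter and the word sequence are illustrative assumptions, not taken from the linked thread, and a cleverer factoring may well exist.

```python
import math

def run(program, stack):
    """Run a whitespace-separated program on a copy of `stack` (top = last item)."""
    s = list(stack)
    binary = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    for word in program.split():
        if word in binary:
            y = s.pop()
            x = s.pop()
            s.append(binary[word](x, y))
        elif word == "dup":
            s.append(s[-1])
        elif word == "swap":
            s[-1], s[-2] = s[-2], s[-1]
        elif word == "over":
            s.append(s[-2])
        elif word == "rot":        # ( x1 x2 x3 -- x2 x3 x1 )
            s.append(s.pop(-3))
        elif word == "negate":
            s.append(-s.pop())
        elif word == "sqrt":
            s.append(math.sqrt(s.pop()))
        else:
            s.append(float(word))  # anything else is a numeric literal
    return s

# Stack pictures on the right show the state after each line (top is rightmost).
QUADRATIC = " ".join([
    "rot dup rot * 4 *",     # a b c  ->  b a 4ac
    "rot dup dup * rot -",   #        ->  a b (b*b - 4ac)
    "sqrt swap negate",      #        ->  a d -b
    "over over swap -",      #        ->  a d -b (-b-d)
    "rot rot +",             #        ->  a (-b-d) (-b+d)
    "rot 2 * dup",           #        ->  (-b-d) (-b+d) 2a 2a
    "rot swap /",            #        ->  (-b-d) 2a r2
    "rot rot / swap",        #        ->  r1 r2
])

print(run(QUADRATIC, [2, -2, -12]))
```

Without the stack pictures in the comments, the word sequence is close to unreadable, which is exactly the complaint.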

Another thing I hate about pure Forth is that floating-point values, if they
exist at all (they often don't), are put on a different stack, which means you
have to duplicate all the operators with different names (I call this
"operator underloading"), and this can lead to hilariously obscure bugs because
the two stacks are independent.

Basically, the attitude that the hard-core Forthers have reminds me of
analogous statements that have been made about communism, or moonieism or
whatever extreme -ism is promoted as the cure to all problems: if you are
having problems, it's because you haven't gone far enough down the road yet.
Only by committing 1000% to the approach will you ever get anything done. This
is pure cult behavior.

And yet: Forth does have a valid and useful niche, in embedded systems, and it
is a fascinating language, one well worth studying. But I don't think it's a
coincidence that none of the software that people use every day (other than
very low-level stuff like e.g. the boot PROMs) on their computers is written
in Forth.

Here's a challenge: I would really love to see someone write a full-featured
web browser (at the level of a Firefox, Chrome or Safari) in Forth. You can do
that in C, C++, Java, Python, and probably in dozens of other languages I
could name. The beauty of this problem is that you can't go and bitch that you
shouldn't have to support something like (say) CSS or embedded video because
you think it's unnecessary; it's part of the problem domain and you don't get
to complain about it. I think that any web browser that anyone attempted to
write in Forth (without local variables, naturally!) would never get finished.
But maybe I'm wrong, and I just haven't reached Forthlightenment yet. If so,
prove me wrong.

~~~
_delirium
> Here's a challenge: I would really love to see someone write a full-featured
> web browser (at the level of a Firefox, Chrome or Safari) in Forth. You can
> do that in C, C++, Java, Python, and probably in dozens of other languages I
> could name. [...] I think that any web browser that anyone attempted to
> write in Forth (without local variables, naturally!) would never get
> finished.

Do those languages actually pass the test, though? I see the point, but I
think this particular challenge problem is a bit too hard in terms of size of
teams and resources needed--- hard enough that as a descriptive matter, Java,
Python, and every other language that isn't C/C++ has actually failed the
test. Many people have tried, and all of those browsers have, well, never
gotten finished. People have been writing a pure-Java modern web browser since
at least 1997, when Netscape decided to rewrite Navigator in Java (in a
project, "Javagator", that was never finished). Along the way, there've been a
half-dozen other half-finished Java browsers (HotJava, Lobo, etc.). There's a
half-finished Common Lisp browser, too (Closure). But nothing outside of C/C++
competes with Webkit, Opera, IE, and Firefox.

I do agree that Forth isn't a great language for writing to complex predefined
specifications. I like its general ideas, though, and the principle of
deciding to solve a simpler problem rather than a harder one actually seems
fine to me, if you really are in a position where you have the choice. Not
that I'm big on Forth, personally, but it's somewhat philosophically consonant
with my own preferences, which are in the suckless.org vein.

~~~
pauldirac137
You're right, I'm being extremely charitable to Java and Python. As you say,
there was a Java browser (HotJava) in the early days of browsers, and there
was a Python browser once called Grail, but neither one was even close to
Firefox/Chrome/Safari. I think you _could_ write a Firefox clone in Java, and
in Python too (though a Python Firefox clone would be perhaps too slow to
use). My point was that I don't think you could even write such a beast in
Forth (or, more accurately, in Forth using the extreme-Forth approach
described in the article; maybe it would be possible in ANS Forth). You need
the abstraction layers or the problem is unmanageable.

------
zeteo
"Astronomically costly licenses. Geological run times."

Nice writing.

~~~
solutionyogi
You should definitely read his other articles. He is a talented writer who
knows his subject matter as well. I will never forget this paragraph,

"Me. I’m a professional programmer. By which I mean to say, I shovel through
piles of virtual shit for a living. And to shovel quickly and happily, I need
a big shovel. Python is one of my shovels. Core dumps are one of my piles of
shit. Python, meet the core dumps. Core dumps, meet the Python."

"A person who was exposed to machines and doesn’t hate them is either an idiot
or is completely devoid of soul! Step back, the child of Satan!"

From: <http://www.yosefk.com/blog/python-teaching-kids-and-biting-bits-dont-mix.html>

[I sometimes refer to myself as a janitor because I really do clean virtual
shit. I wish I had registered <http://janitorprogrammer.com/> ]

~~~
plinkplonk
"You should definitely read his other articles. He is a talented writer who
knows his subject matter as well."

His take on Extreme Programming is interesting.

<http://www.yosefk.com/blog/extreme-programming-explained.html>

~~~
primodemus
"Code is more like poetry: change this line, and now the next line doesn’t
rhyme, or you’ve broken the rhythm, or you’ve put angry words into a happy
poem, that sort of trouble. Which is one reason to like code ownership."
Amazing.

------
jacquesm
If you liked this, look into 'Factor'.

~~~
thristian
Yeah, I was (metaphorically) waving my hand in the air and yelling "Factor!
Factor!" for the first half of the essay, but when I got to the end I wasn't
so sure. Factor solves a lot of the problems the author describes - it has an
extensive standard library, and garbage collection, and all kinds of other
useful things - but I think the core complaint about writing Forth code still
stands: You still have a data stack, stack-shuffling words like 'dup' and
'over' and 'rot' still make for ugly, hard-to-read code, and re-thinking your
expression-graph to be expressible without such shuffling is still very hard.

I still have a local checkout of Factor in my home directory, and I really
would like to get around to playing with it some more, someday - but (at least
to begin with) playing with Factor sometimes feels very much like hard work.

~~~
gecko
I don't honestly think that Factor suffers from the same problems enumerated
here at the end of the day. I used to dally in Forth in the form of Mops and
PowerMops, and my problems were basically the same as those enumerated in this
essay. I've tried to get into Factor multiple times; this past time I
succeeded, and I have been writing increasingly large amounts of code in my
spare time that seem to flow okay. I might even start releasing some of it
soon. I think the differences can be chalked up to a few major things:

 _Real locals_. Factor locals are not penalized; they perform the same as a
data stack. And while another comment correctly notes that Factor programmers
prefer to avoid locals, I'd point out that locals are also used in many places
in the standard library. Factor's preference is not the same as Forth's near-
insistence.

 _Higher-level combinators via lambdas._ Part of why they have that attitude
is that Factor's higher-level combinators are very natural if you're coming
from a functional language. A lot of its combinators are things like "run
this pile of lambdas against this one object" (kind of a reverse map) or
"append all of these lambdas with this extra operation or datum" (think of
currying). These are VERY different in practice than dup/swap/rot. I still
have to pause and think a lot while I'm coding about how and which to use, but
it's getting better. The main hurdle is simply not forgetting anything.

 _Rich data types._ Being able to have a single element on the stack that is
an array, or a class, or an expandable vector, or a hashtable, or what have
you, GREATLY simplifies things compared to Forth, where I'd have to "just
know" that the two top things on the stack entering my function were a pointer
to an array and its length or some such.

 _Much better error reporting._ With richer data types come much better error
messages. You didn't just read random crap from memory because you read the
stack effect in the wrong order; you tried to call + on a hashtable. Combine
that with static stack effect checking, and I find that debugging my Factor
code is usually about logic, whereas at least half the time, debugging my Forth
code was about getting the stack effects right.
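
For anyone wondering what static stack effect checking buys you, here is a minimal sketch in Python. It is not how Factor's checker works (Factor also has to deal with quotations, branches and so on); it only handles straight-line code, and the `EFFECTS` table is a hand-written assumption for a few familiar words.

```python
# (items consumed, items produced) for a few words. Hypothetical table.
EFFECTS = {
    "dup": (1, 2), "drop": (1, 0), "swap": (2, 2),
    "over": (2, 3), "rot": (3, 3), "+": (2, 1), "*": (2, 1),
}

def infer_effect(words, effects=EFFECTS):
    """Return (inputs_needed, outputs_left) for a straight-line word sequence."""
    needed = 0   # items the sequence takes from the caller's stack
    depth = 0    # items currently available to the sequence
    for w in words:
        ins, outs = effects[w]
        if depth < ins:            # reaching below what we have so far:
            needed += ins - depth  # the caller must supply the difference
            depth = ins
        depth += outs - ins
    return needed, depth

def check(words, declared):
    """Compare the inferred effect with the declared one before running anything."""
    return infer_effect(words) == declared

print(infer_effect("dup *".split()))        # a "square" word: one in, one out
print(check("over over +".split(), (2, 1)))  # declared effect is wrong
```

The second call reports a mismatch without running the code: `over over +` leaves three items, not one. In plain Forth that mistake surfaces later as a corrupted stack.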

\----

I'm not saying Factor's perfect. It's not. I use more locals than the core
team does in my code, and I'm not currently convinced that's wrong in any
sense. But I also get a lot of mileage out of the higher-level combinators, to
the point that I can write concatenative code without feeling like I did in
Forth that I'm doing the processor's work for it. It feels a lot closer to any
high-level functional language, where I'm just composing functions, not
jiggling the stack.

~~~
Avshalom
_I use more locals than the core team does in my code, and I'm not currently
convinced that's wrong in any sense._

I think that the big "win" in Factor vs say CL is implicit argument passing.

from the article even:

 _In order to have really small definitions, you do need a stack, I guess - or
some other implicit way of passing parameters around;_

As far as I'm concerned as long as locals don't get in the way of that they're
groovy.

~~~
silentbicycle
You can get the same implicit argument passing in J (called "tacit
programming") and Haskell & ML (via currying, called "point-free style").
It's optional in those languages, though, which strikes me as a good idea:
sometimes it's a vast improvement, but sometimes it makes the code hard to
follow for no real gain.

------
kragen
Here's my own rant. (I think Paul Dirac 137's rant actually contains more
words than P. A. M. Dirac's textbook about quantum mechanics, which I spent
many evenings in high school trying unsuccessfully to get to the third page
of.)

I remember when I thought Forth was the next big language, too. And like you,
my experience trying to program in it was disillusioning.

A big part of the problem, I think, is that I tried to use the stack instead
of variables. Forth has always had variables (at one point in your post,
Yossi, you say it doesn't; but I think you mean local variables.) It doesn't
have to be any harder than C to write. You can translate directly from C to
Forth; C:

    
    
        static int mac80211_hwsim_start(struct ieee80211_hw *hw)
        {
                struct mac80211_hwsim_data *data = hw->priv;
                printk(KERN_DEBUG "%s:%s\n", wiphy_name(hw->wiphy), __func__);
                data->started = 1;
                return 0;
        }
    

Forth, just a straight translation of the C:

    
    
        variable hw  variable data
        : mac80211_hwsim_start  hw !
          hw @ priv @ data !
          KERN_DEBUG s" %s:%s"  hw @ wiphy @ wiphy_name  printk
          1 data @ started !
          0 ;
    

This function doesn't happen to be recursive. If it did happen to recurse,
we'd have to explicitly save the values of its "local" variables on the stack
before the call and restore them afterwards. On the other hand, most functions
aren't recursive.
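
A sketch of the recursion caveat in Python, with a module-level `n` playing the role of a Forth `variable` and a list standing in for the stack. The function names are made up for illustration.

```python
n = 0        # one global cell, shared by every call (like a Forth VARIABLE)
saved = []   # stands in for the stack used to save and restore it

def fact_clobbered(arg):
    global n
    n = arg
    if n <= 1:
        return 1
    sub = fact_clobbered(n - 1)   # the inner call overwrites n ...
    return n * sub                # ... so this multiplies by the wrong value

def fact_saved(arg):
    global n
    n = arg
    if n <= 1:
        return 1
    saved.append(n)               # save the "local" before the call
    sub = fact_saved(n - 1)
    n = saved.pop()               # restore it afterwards
    return n * sub

print(fact_clobbered(4), fact_saved(4))
```

The first version returns 1 for any argument, because by the time the multiplication runs, `n` has already been overwritten down to 1 by the innermost call.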

Now, maybe you can optimize that Forth a little bit; for example, you can
probably dispense with the variable "data" and just use the top-of-stack, and
actually that variable is only read once and surely the debug message line
doesn't modify hw->priv, etc. etc. Maybe I should have randomly picked a
different piece of C. But at some point along the path of "optimizing" and
"simplifying" the Forth code, you find yourself getting into the kind of
nightmarish code you demonstrated above, where you have four or five things on
the stack and they move all the time, and understanding it is just a
nightmare. But you don't have to do it that way. You can use variables, just
like in C, and the pain goes away. You never have to use any stack-
manipulation operations, not even once. They're always there, tempting you to
use them, taunting you; and you have to exercise a great deal of restraint to
avoid writing your code as cleverly as possible, because that will make it
impossible to debug.

I think in this case the painless-but-optimized version looks like this:

    
    
        : mac80211_hwsim_start
          dup wiphy @ wiphy_name  log   priv @ start  0 ;
    

Here I figure that "log" is locally defined as something that printks a
KERN_DEBUG message with the calling function name followed by a colon and then
the argument, and that : start started 1 swap ! ;.

But when I said "it doesn't have to be any harder than C to write", I lied a
little bit. It doesn't have to require detectably more code, but per line,
Forth is still a bit more error-prone than C, in my experience. In C you get
enough static type checking to immediately detect something like 90% of your
type errors; you get a warning if you forget to pass an argument to a
function, or pass too many; the data flow of any subroutine is easily
approximable without having to know the stack effect of every subroutine you
call; and so on. But maybe if I programmed enough in Forth, these errors would
become less significant.

(I suspect that global variables are less of a problem in Forth than in other
languages: you can have multiple global variables with the same name without
accidental sharing; you're very unlikely to have mutually recursive functions
without knowing it (and, in any language, recursive or mutually recursive
functions require special care to ensure termination anyway (that is,
recursion is error-prone)); etc.)

On the other hand, as you pointed out, Forth is easily extensible. I think, as
Eric Normand pointed out, that you want to use that extensibility to bootstrap
into a less error-prone, more problem-level language/system as quickly as
possible, and in fact, this is the standard approach using Forth promoted by
the likes of Chuck Moore and Elizabeth Rather, as I understand it. It's just
that the next layer up they're talking about is a more traditional domain-
specific language, something like FoxPro or Rexx or sh, rather than something
with garbage collection and data types.

In theory, at least, it seems like as you scale to a larger program, Forth's
advantages over C would become more significant, as your program looks more
like a tower of DSLs --- up to the point where you split your C program into
multiple programs that the OS protects from each other.

There's a quote from Jeff Fox in my quotes file:

    
    
        [In C under Unix] Bugs are planned, and the whole picture is all about
        the planning for bugs.
    
        Forth is about planning for good code where the bugs don't happen. If
        you say BEGIN AGAIN damn it, you mean BEGIN AGAIN not twenty other
        possible meanings based on C insisting that it is one of twenty
        different bugs that need extra hardware and software to be handled
        properly. 
    
    	-- Jeff Fox <fox@ultratechnology.com>, in a discussion on
    	   comp.lang.forth, inadvertently explaining why Forth is not
    	   widely used, 2006-05-20, in message-id
    	   <1148149942.763594.292230@u72g2000cwu.googlegroups.com>,
    	   subject "Re: hardware errors, do C and Forth need different
    	   things in hardware?"
    

My own take on Forth is that it's by far the simplest way to build a macro
assembler, and you can do anything with it that you can do with any other
macro assembler, perhaps a little bit more easily and with more portability,
and syntax that's not quite as nice. At some point you want a high-level
language on top of your macro assembler, though. The Forth theory is that
implementing a domain-specific high-level language has a better cost/benefit
ratio than implementing a general-purpose high-level language, and a macro
assembler is a perfectly adequate way of implementing a domain-specific high-
level language.

Where this approach falls down is that it's true that implementing a domain-
specific high-level language has a better cost/benefit ratio than implementing
a general-purpose high-level language if you're implementing it for a single
user, such as NRAO. But if your program memory isn't limited, and you can
share the language with all the computer users in the world, a general-purpose
high-level language like Lua or Python or Tcl or Ruby is a more economically
efficient tradeoff, because whenever you implement some optimization, they all
get the benefit.

(Also, I actually think it's easier for me to write bug-free assembly than
bug-free Forth, but that may be a matter of experience.)

With regard to 18-bit words, I guess the advantage over 16-bit words is that
you can fit four instructions per word instead of three. (The low-order bits
of the last instruction are necessarily 0; fortunately that is true of NOP.)
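
The arithmetic behind that guess, as a sketch. It assumes 5-bit opcodes, which is my assumption and not something stated above; `packing` is a made-up helper.

```python
OPCODE_BITS = 5  # assumption: 5-bit opcodes, i.e. 32 possible instructions

def packing(word_bits, opcode_bits=OPCODE_BITS):
    """Return (full_slots, spare_bits, forced_zero_bits, short_slot_opcodes)."""
    full_slots, spare_bits = divmod(word_bits, opcode_bits)
    # A final short slot keeps only the high `spare_bits` bits of an opcode;
    # the missing low-order bits are implicitly zero.
    forced_zero_bits = opcode_bits - spare_bits if spare_bits else 0
    short_slot_opcodes = 2 ** spare_bits if spare_bits else 0
    return full_slots, spare_bits, forced_zero_bits, short_slot_opcodes

print(packing(18))  # (3, 3, 2, 8)
print(packing(16))  # (3, 1, 4, 2)
```

Under that assumption an 18-bit word holds three full slots plus a 3-bit slot that can still select 8 of the 32 opcodes (those whose low two bits are 0). A 16-bit word has only 1 bit left over, which is too little to be a useful fourth slot.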

Once I asked a famous CPU designer (who shall remain anonymous, since this
wasn't a public conversation) what he thought about Chuck Moore. He said
something to the effect of, "Chuck Moore? I used to work under him at AMD.
He's great!"

"No," I said. "The Forth guy."

"Oh, HIM!" he said. "In my DREAMS I could do what he does."

I think the GreenArrays people are making a big mistake by marketing their
chip as competition for microprocessors and microcontrollers. What they're
really competing with is FPGAs and PALs. Unfortunately, they don't have a VHDL
or Verilog synthesis system for their chips, and I don't think they're going
to build one.

I think the reason that almost every bytecode system under the sun is stack-
based is that low-level bytecode is basically a bad idea these days, but it
made a lot of sense in days when code density was really important. (Bytecode
as a compact representation of an AST might still make sense.)

And stack-based bytecode is a lot denser than purely register-based bytecode.
Hybrid bytecodes like Smalltalk-80’s, where you have 8 or 16 or 32 bytecodes
that fetch and store to registers, are even denser.
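
A back-of-the-envelope sketch of the density claim, under encodings I made up for illustration (one byte per opcode and one byte per operand). They are not the format of any real VM, and the numbers shift with different assumptions.

```python
from itertools import count

# Expression tree: a variable name (str) or a tuple (op, left, right).
EXPR = ("*", ("+", "a", "b"), ("-", "c", "d"))

def stack_code(e):
    """Postorder walk: push the operands, then apply the operator."""
    if isinstance(e, str):
        return [("push", e)]
    op, left, right = e
    return stack_code(left) + stack_code(right) + [(op,)]

def register_code(e, code, temps):
    """Three-address code; returns the register that holds the result."""
    if isinstance(e, str):
        return e                      # assume variables already sit in registers
    op, left, right = e
    src1 = register_code(left, code, temps)
    src2 = register_code(right, code, temps)
    dst = f"t{next(temps)}"
    code.append((op, dst, src1, src2))
    return dst

stack_prog = stack_code(EXPR)
reg_prog = []
register_code(EXPR, reg_prog, count())

# One byte per tuple field: the opcode plus each operand.
stack_bytes = sum(len(ins) for ins in stack_prog)
register_bytes = sum(len(ins) for ins in reg_prog)
# Hybrid: "push variable N" is folded into a single opcode byte.
hybrid_bytes = len(stack_prog)

print(stack_bytes, hybrid_bytes, register_bytes)  # 11 7 12
```

For `(a + b) * (c - d)` this gives 11 bytes for the plain stack encoding, 12 for the three-address register encoding, and 7 for the hybrid with dedicated one-byte push opcodes.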

I can't claim to be a good or even an adequate Forth programmer. So take all
of this with a grain of salt.

