People interested in Forth might want to also check out PostScript (similar and stack-based). I recently wrote a blog post with a very brief PostScript tutorial and an example program translated line-by-line to Python:
https://kenschutte.com/postscript-vs-python/
Well, my favorite stack based language was RPL on the HP, simply because it fit well in the scope of the environment in which it was used. It was really easy to develop and debug on the calculator (I know many did very large projects working with editors on the PC, but I did all my work on the machine itself).
It lacked any kind of composite structure outside of lists and arrays. No structs or records. But that was never a real barrier.
It was just a nice, high level stack machine with a cool library and toolset.
My favorite stack based language is dc. It's also the oldest language still found on Unix systems, predating even C (the original version was written in B on a PDP-11 - even before it had an assembler).
> Almost everybody in the group has done languages. Thompson and Ritchie built assemblers from scratch. On the tiny PDP-7 the assembler was supplemented by tmg, Doug McIlroy’s version of Bob McClure’s compiler-compiler. Using it Thompson wrote B, the evolutionary link between Martin Richards’s (Cambridge University) BCPL and C. The PDP-11 assembler, a desk calculator dc, and B itself were written in B to bootstrap the system to the PDP-11. Because it could run before the disk had arrived, dc—not the assembler—became the first language to run on our PDP-11. Soon revised to handle arbitrary-precision numbers (v1), dc was taken over by Bob Morris and Lorinda Cherry. It now ranks as the senior language on UNIX systems.
As a big RPN and HP calc fan forever, and a recent Forth fan, I was a bit confused by my initial foray into RPL. It seemed like it allowed infix notation and that threw me off at first glance.
Did I bail too soon?
I am a big stack machine convert and have had an HP RPN calc since I was 12.
The first thing I ever wrote in Perl was a CGI script that printed out a form in Postscript. I wrote ps to draw boxes and fit the text the user had entered.
It wasn't pretty to mix perl and ps, and I don't think I have ever written anything as wacky as that code, but it was fun.
i don't think postscript is very similar to forth, although they are both imperative and stack-based, which gives their syntax a superficial similarity
what's more important is that (in the terminology of eopl) their expressed values and denoted values are completely different
Question to the more educated among us: I read that Apple's UI was based on Display PostScript... how similar is it to original PostScript, and why was it abandoned?
Apple's UI has been based on Display PDF since macOS 10 (Mac OS X, under the old branding). This is the royalty-free replacement for the Display PostScript of OPENSTEP and NeXTSTEP, which became the foundation for post-9 macOS after Apple absorbed NeXT.
@DonHopkins is the most knowledgeable person on this topic here since he worked on another PostScript-based graphics system NeWS for the competitor Sun Microsystems.
That’s interesting. I don’t know anything about it but maybe it means it was based on PS “drawing model” more than the actual PS language. (ie the basic primitives and how stroke, fill, etc work)
It was used as the display language for NeXT and some unix workstations. Just googling real quick it was a direct dialect of PostScript with the main changes being related to using efficient 32 bit integer ids in place of strings and similar performance optimizations.
There's an important distinction between the PostScript programming language, and the PostScript imaging model. You can have one without the other. There are a lot of problems using a full blown programming language as a graphics model.
PDF is like PostScript without the programming language, just the imaging model. (But with a lot of extra crap thrown in so haphazardly and indiscriminately that the Adobe PDF Viewer browser plugin traditionally requires downloading patches almost every day!)
But you can "distill" (partially evaluate) a PostScript program into a PDF document by executing it, and recording the calls to the PostScript imaging model that it makes (and you can also optimize them too -- see Glenn Reid's and Acrobat's "Distillery").
Apple made a library called Quartz, which was like PDF for the screen, without the programming language part that Display PostScript or NeWS supported. And Quartz evolved into Core Graphics in Cocoa, which is used today.
PostScript calls it the "Stencil / Paint Imaging Model". There are other libraries and APIs inspired by PostScript, like Quartz, Core Graphics, Keith Packard's Cairo graphics library, and the browser's Canvas API (often implemented on top of Cairo).
I wrote up a comparison of Forth and PostScript, and some history and links about PostScript and the languages that inspired it (Brian Reid's retrospective he posted to laser-lovers is especially interesting and first-hand) here:
>Kragen is right that PostScript is a lot more like Lisp or Smalltalk than Forth, especially when you use Owen Densmore's object oriented PostScript programming system (which NeWS was based on). PostScript is semantically very different and much higher level than Forth, and syntactically similar to Forth but uses totally different names (exch instead of swap, pop instead of drop, etc).
>But its most obvious similarities with Forth are based on generic Computer Science concepts like stacks, which is not something particularly unique to Forth itself, so saying PostScript was influenced by Forth ignores other much more influential languages like Interpress, JaM, and Design System, which directly influenced PostScript and were designed by the same people. Also, PostScript is not threaded like Forth. But it's homoiconic, unlike Forth and [like] Lisp.
[...]
>Brian Reid wrote about page independence, comparing Interpress' and PostScript's different approaches. Adobe's later voluntary Document Structuring Conventions actually used PostScript comments to make declarations and delimit different parts of the file -- it wasn't actually a part of the PostScript language, while Interpress defined pages as independent so they couldn't possibly affect each other:
>By now you can probably see the fundamental philosophical difference between PostScript and Interpress. Interpress takes the stance that the language system must guarantee certain useful properties, while PostScript takes the stance that the language system must provide the user with the means to achieve those properties if he wants them. With very few exceptions, both languages provide the same facilities, but in Interpress the protection mechanisms are mandatory and in PostScript they are optional. Debates over the relative merits of mandatory and optional protection systems have raged for years not only in the programming language community but also among owners of motorcycle helmets. While the Interpress language mandates a particular organization, the PostScript language provides the tools (structuring conventions and SAVE/RESTORE) to duplicate that organization exactly, with all of the attendant benefits. However, the PostScript user need not employ those tools.
Minor nit, it was called Distiller. I worked on it for about a year, not so much on the core processing code, but on the periphery - user interface, adding an API, etc.
When I first started work on it, one of my new colleagues walked by and asked what I'd be working on. I said "Distiller", and they gave me a sad look, like "you just got the worst assignment here."
> PDF is like PostScript without the programming language, just the imaging model.
Of course PDF does have a programming language, a very old version of JavaScript, with an API that Acrobat/Reader provides.
It's mostly used for simple stuff like calculating and updating one form field from your input in another.
You can tell when you look at the API that it was designed 20-25 years ago.
> (But with a lot of extra crap thrown in so haphazardly and indiscriminately that the Adobe PDF Viewer browser plugin traditionally requires downloading patches almost every day!)
Browsers don't use the Adobe plugin any more, thank goodness.
But on the "haphazard and indiscriminate" part, yes, guilty as charged.
One of my later projects at Adobe was to build a multimedia plugin that could embed any of the popular media players of the day: Real Player, QuickTime, Windows Media Player, one or two others I don't remember, and even Flash.
I built the main plugin which provided a JavaScript API you could access from your PDF and a way to add a "player plugin" for each media player, and teamed up with colleagues to build each of the individual player plugins.
No one at Adobe around 2002 thought Flash was important, so I kind of had to do a stealth minimal implementation.
It was definitely a fun project, and also a great example of the "haphazard and indiscriminate" additions you referred to.
"Distillery" was actually the name of the original PostScript version that Glenn Reid posted to usenet in the 1980's. Here is a copy of his "release 10 of the Distillery" from 10 Mar 1989:
>Glenn Reid wrote a PostScript partial evaluator in PostScript that optimized other PostScript drawing programs, called "The Distillery". You would send still.ps to your PostScript printer, and then send another PostScript file that drew something to the printer. The first PostScript Distillery program would then partially evaluate the second PostScript drawing program, and send back a third PostScript program, an optimized drawing program, with all the loops and conditionals unrolled, calculations and transformations pre-computed, all in the same coordinate system.
>It was originally John Warnock's idea, that Glenn implemented. And it led to Adobe Acrobat's "Distiller". Acrobat is basically PostScript without the programming language.
>Owen Densmore recounted John Warnock's idea that PostScript was actually a "linguistic motherboard". (This was part of a discussion with Owen about NeFS, which was a proposal for the next version of NFS to run a PostScript interpreter in the kernel. More about that here:)
>Glenn Reid described his classic "Grandfather Clock" metaphor on comp.lang.postscript, which really opened my eyes about writing code in interpreted languages like PostScript (and JavaScript: even though it's JIT'ed at runtime, calling native code is still expensive), and about balancing optimization with readability. With modern JavaScript/canvas, JavaScript code that calls other JavaScript code runs really fast, but calling between JavaScript and built-in code is still slow, so it's good to have the built-in code do as much as possible each time you call it (like rendering a long string of text at once, instead of rendering it word by word like many PostScript drivers would):
>Glenn's post in a comp.lang.postscript discussion about PostScript programming style and optimization:
From: Glenn Reid (Adobe Systems)
Newsgroup: comp.lang.postscript
Subject: Re: An Idea to Help Make Postscript Easier to Read (and Write)
Date: 10 Sep 88 17:26:24 GMT
You people tend to forget that the PostScript language is interpreted.
It is well and good to use tools to convert to and from PostScript,
but it is not quite as "transparent" as we all might think.
I like to think of a big grandfather clock, with the pendulum swinging.
Each time the pendulum swings, the PostScript interpreter gets to do one
operation. The "granularity" of the clock is nowhere near the speed
of a microprocessor instruction set, and any comparison with assembly
languages doesn't make sense.
The difference between:
0 0 moveto
and
0 0 /arg2 exch def /arg1 exch def arg1 arg2 moveto
can sort of be measured in "ticks" of the interpreter's clock. It's
not quite this simple, since simply pushing a literal is faster than
executing a real PostScript operator, but it is a rough rule of thumb.
It will take about three times as long to execute the second of these
in a tight loop, and about five times as long if it is transmitted and
scanned each time. My rule of thumb is that if you have roughly the
same number of tokens in your stack approach as you do with your 'exch
def' approach, the 'exch def' is likely to be much more readable and
better. Otherwise, I usually go with the stack approach.
One other thing of note is that if you have too much stack manipulation
going on, it may well be symptomatic of a problem in the original program
design.
Also, most procedures don't do any stack manipulation at all, they
simply use their arguments directly from the stack. In this situation,
it is especially wasteful (and confusing, I think) to declare
intermediate variables.
Compare:
% sample procedure call:
(Text) 100 100 12 /Times-Roman SETTEXT
% approach 1:
/SETTEXT { %def
findfont exch scalefont setfont moveto show
} def
% approach 2:
/SETTEXT { %def
/arg5 exch def
/arg4 exch def
/arg3 exch def
/arg2 exch def
/arg1 exch def
arg5 findfont arg4 scalefont setfont
arg2 arg3 moveto arg1 show
} def
Which of these is easier for you to understand?
Anyway, I think the discussion is a good one, but let's not forget
that PostScript is an interpreted language. And I don't think
it is terribly hard to use and understand, if it is written well.
Glenn Reid
Adobe Systems
i've been fiddling with a series of prototype interpreters for a simple stack machine over the last year or so, and i've come to a few tentative conclusions:
1. it's probably a win to implicitly stick the arguments to a subroutine into local variables on subroutine entry, as smalltalk-80 and elisp and java do, rather than letting the subroutine do that itself if it wants to, as forth and postscript and hp calculators do. this is mostly for convenience and readability, but it also affects efficiency and safety. in the case of a straightforward stack interpreter, each stack operation takes about the time of 10 machine instructions. for example, in http://canonical.org/~kragen/sw/dev3/hadixmv.S, the addition bytecode instruction is implemented by the following arm assembly code
ldmdb r4!, {r3} @ load the second stack item from stack r4 into temporary r3
add r5, r5, r3 @ add the loaded number to the first item on the stack
dispatch
here `dispatch` is an assembly macro that expands (in my un-checked-in version) to
ldrb r3, [r6], #1 @ load the next instruction, post-indexed
ldr pc, [r7, r3, lsl #2] @ index the bytecode table pointed to by r7
so this is four instructions, but three of them access memory, making them slower (on my target platform), and the last one usually causes a pipeline bubble, so this takes typically 10 clocks on my target chip
a few bytecode instructions are a bit faster than this, a few are much slower, and the average seems to come out to about 15 clocks, while an average instruction is maybe closer to 1.2 clocks, depending on your code. so there's a strong temptation to do anything that will allow you to waste less time in bytecode dispatch
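for comparison, here's roughly the same thing as a plain C switch interpreter, with the top of stack cached in a local variable the way r5 is used above. this is just a sketch with made-up opcode names and encoding, not the actual hadixmv code:

    #include <stdint.h>

    enum { OP_HALT, OP_LIT, OP_ADD };

    /* sketch of a switch-dispatched stack interpreter; tos caches the
       top of stack, sp points at the rest, ip is the bytecode pointer */
    int32_t run(const uint8_t *ip)
    {
        int32_t stack[64], *sp = stack, tos = 0;
        for (;;) {
            switch (*ip++) {
            case OP_LIT:            /* push the following byte as a literal */
                *sp++ = tos;
                tos = (int8_t)*ip++;
                break;
            case OP_ADD:            /* pop the second item and add it to the top */
                tos += *--sp;
                break;
            case OP_HALT:
                return tos;
            }
        }
    }

with gcc or clang the switch usually compiles to a jump table, so each bytecode still costs a couple of loads plus an indirect branch, which puts the dispatch overhead in the same ballpark as the assembly version above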
i think glenn is strawmanning a bit here with his arg5. i think you'd write something like
/settext [/text /x /y /size /font] {
font findfont size scalefont setfont
x y moveto text show
} fn
that's still 10 calls inside the subroutine instead of 6, but glenn only got down to 6 because he put the arguments in almost the most convenient order
2. probably a super dumb jit compiler, if well optimized, will never be much slower than a straightforward bytecode interpreter, and in common cases will be about three times faster, but the overall system performance will still be about three times slower than well-optimized assembly. and i can write the compiler in the language i'm implementing (though i can't optimize it that way) which is a lot more convenient than writing it in assembly. even arm assembly, which is super sweet. for this reason, for many years now, the jvm 'interpreter' (what it uses for interpretation of everything that hasn't yet been found to be a hotspot) is a super dumb jit compiler and not actually an interpreter
this tends to undermine the performance argument for automatically creating local variables like this, but i think not completely
3. stack machines have more compact code than register machines, but because you need about twice as many operations to do a given job, the interpreter dispatch overhead is twice as expensive. my primary objective is code density, and although i haven't compiled anything big yet, i think i can get about 2/3 the code size of thumb-2 or rv32c. but a register-based code is probably better when code density isn't the primary objective
Don, on any Lisp or Forth or other post on niche languages on this site, there is a long comment from you going into deep detail about its history and evolution.
Your wealth of knowledge is incredible, and it is thanks to hackers like you that I've been spending the past year rediscovering the roots of computing, the wonders of Lisp, Smalltalk and, this past week in particular, Forth ( HONK ) .
These days my ~/Desktop and printer spool are full of papers from the 70s and 80s that, compared to the clunky world of Javascript and UNIX clones, feel like science fiction.
So thank you for broadening our horizons, and teaching code monkeys like me the lost art of programming computers. Your contributions are invaluable.
Yeah, I've been working on Noticias, which is an implementation of NeWS in Typescript targeting the browser.
Initially, I used HTML5 Canvas to draw, but I'm now in the process of switching over to SVG. Once I have demos that I can show, I'll be making the repo public.
If you're interested in working on it, please reach out to me at duncanmak@gmail.com.
I've been reading into this today, coincidentally. It seems the main idea was that register based machines are faster, mostly because local variables in many languages map to registers, and because of memory latency when the stack is kept in memory. But that seems to be false based on more recent research.
> Also, it's not clear a super scalar stack machine would be possible.
The "Boost: Berkeley's Out-of-Order Stack Thingy" (Steve Sinha, Satrajit Chatterjee and Kaushik Ravindran) [1] effectively translates stack code into RISC as part of instruction decoding and issue. This is similar to how "Design of a Superscalar Processor Based on Queue Machine Computation Model" (Shusuke Okamoto, Hitoshi Suzuki, Atusi Maeda, Masahiro Sowa) [2] does the same thing but for a queue machine instead of a stack machine.
Can you give some insight on how the J1 fits in all this? The presentation explaining what drove its invention suggests it actually was faster than the register-based alternative, although I'm sure that is also partially because it only focused on the instructions needed for its task.
From how I understand the presentation, they ran into trouble because of code size. Basically stack code is much denser because you don't need register arguments, and given that the drop from 16 kbytes to 6 kbytes is more than you'd get just from going 32-bit to 16-bit, apparently it was a bit more efficient for implementing their program too, I guess.
The size was a concern because of the limited size (16 kbytes) of BRAM modules on an FPGA.
I can't say why it performs so much better though, maybe just because it's simpler & 16 bit so it's smaller and can be clocked faster on the fpga? That is just a guess though.
I'm not sure if it is fair to compare the J1 to the BOOM, the latter does a lot more I'd think. That being said, it shows that for such use cases a stack machine can be better suited and much less complex to design, maintain and work with.
the summary of laforest's dissertation says that a small stack machine is much faster (than a presumably in-order MIPS) "for deeply nested or recursive code", but "worse for iterative code"
quite aside from the new results it presents, it seems to be a much more comprehensive answer to the question of why stack machines fell out of favor than i've seen before
I've definitely seen that explanation before. I remain slightly skeptical, but that's mostly because I secretly hope stack machines could "win" some day.
Compared to a register machine a stack machine needs to execute extra instructions to shuffle the stack around. If you tried to do something clever to make this superscalar you'd end up with something very much like shadow registers anyhow.
I too found Charles Moore's CPU designs interesting. One possibility I thought about for a bit was a VLIW stack hybrid, where you'd have bundles of say 8 stack instructions each executing concurrently in their own lane, and some instructions for moving values between lanes. It'd be a weird thing to program for, but potentially quite fast for how minimal it'd be.
I was musing about this stuff back when reading Hennessy and Patterson in the early 00's. Moore's homepage was really interesting and inspiring, that someone could work idiosyncratically and largely individually and actually get chips to tape out. Cool to hear he's still going strong.
WASM is a stack based "CPU" and arguably the future of the Web. So is Python's VM. I think Java's and Erlang's are too, but don't quote me on that. A stack based CPU is very easy to implement (or rather, registers are very complex hardware-wise) and a stack-based VM is relatively easy to optimise for a register CPU.
Seems to me registers are a fancy hardware implementation detail, that software at higher abstraction can and should avoid unless maximum performance is required. It was surprising for me to learn recently that on modern x86 CPUs, named registers actually reference hidden, dynamically allocated physical registers, a bit like virtual memory is an indirection layer over physical memory.
>This is a great article! I love his description of Forth as "a weird backwards lisp with no parentheses".
>Reading the source code of "WAForth" (Forth for WebAssembly) really helped me learn about how WebAssembly works deep down, from the ground up.
>It demonstrates the first step of what the article says about bootstrapping a Forth system, and it has some beautiful hand written WebAssembly code implementing the primitives and even the compiler and JavaScript interop plumbing. We discussed the possibility of developing a metacompiler in the reddit discussion.
[...]
Check out this beautiful hand written wasm code, a dynamic wasm based Forth compiler:
It compiles new Forth words into teenie tiny little wasm modules, and hooks out through JavaScript in the browser to create and link them in to the main module, so it can call the code it just compiled. Wasm doesn't let you modify compiled code, but you can compile new code and link it in and execute it!
I’ve been wrestling with WASM bytecode and text format off and on for the last couple of years, out of a desire to do something assembly-ish without writing off doing it for the web.
Personally, I’d say it’s at least as much a register machine as a stack machine, and the ways in which it is one vs the other at times sure had me running in circles.
I’ve seen plenty of quotes about how it’s not supposed to be a real assembly, how you’re not supposed to deal with its text format or bytecode, and how it and its documentation are for compiler writers not typical programmers.
Which is exactly the take that I think is so wrong with our status quo of opaque, overly abstracted technologies that are highly optimized in their parts yet so disappointing in the whole.
As a Forth substitute, it sorely disappointed me. I did learn a lot in my wanderings, and what WebAssembly mostly is, is an implementation for running a C-style machine architecture. Which is pretty much what it came from. Not that I’m against that, but I sure ran in circles trying to make it more Forth-like.
And given its raison d’être is to be more performant than JavaScript, it seems to be faster in some benchmarks while slower in others. I think this is more a testament to how optimized V8 and its brethren are than a slam on WebAssembly.
I was hyped for WASM as well, but the more I read the spec and the standardisation process, the less bullish I am about it as a long-term piece of tech. I would be surprised if it is still around in 30 years (compared to stuff like UNIX, C or even Forth).
Its closeness to JavaScript, and the fact that it's basically managed by companies that want to keep the Web as complicated as possible so they can enjoy their monopoly on it, turns me off big time.
And lastly, the only thing it has in common with Forth is being stack-based. Computing has become an unsustainable pile of buggy abstractions on top of one another, and WASM is simply another layer on top of this pile of dung, rather than a return to the basics. I get why it wants to give programmers nothing more than a rounded pair of scissors, but then for a glorified VM it's incredibly over-engineered. Art has never been created by a committee of commercial entities.
I had in mind to write an OS that runs WASM natively, but these days I'd rather put a persistent, programmable Forth interpreter that gives you full access to the hardware at the lowest level, and build up from there.
JVM, Python and WASM are more like the SECD [0], not a plain stack machine, by use of a local environment. BEAM (for Erlang) is a register machine. And the JVM and WASM get compiled to code for register machines, with the VM stack not present (in a concrete form) at runtime.
> a stack-based VM is relatively easy to optimise for a register CPU.
You get SSA for almost free (just phi the stacks at control joins), but after making SSA everything is the same regardless of stack- or register- VM.
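As a concrete (if toy) illustration of that, here's a sketch in C: abstractly interpret straight-line stack code with a stack of SSA temporary names instead of values, emitting one assignment per operation. Phi nodes at control joins are omitted, and the opcodes and little driver program are made up:

    #include <stdio.h>

    enum { PUSH_A, PUSH_B, PUSH_C, ADD, MUL };   /* toy opcodes */

    /* the "stack" holds SSA temporary numbers; each op prints one assignment */
    int main(void)
    {
        int prog[] = { PUSH_A, PUSH_B, PUSH_C, MUL, ADD };
        int stack[16], sp = 0, next_tmp = 0;
        const char *vars = "abc";

        for (size_t i = 0; i < sizeof prog / sizeof prog[0]; i++) {
            int op = prog[i];
            if (op <= PUSH_C) {                  /* push: new temp named after the variable */
                printf("t%d = %c\n", next_tmp, vars[op]);
                stack[sp++] = next_tmp++;
            } else {                             /* binary op: pop two temps, emit one assignment */
                int rhs = stack[--sp], lhs = stack[--sp];
                printf("t%d = t%d %c t%d\n", next_tmp, lhs,
                       op == ADD ? '+' : '*', rhs);
                stack[sp++] = next_tmp++;
            }
        }
        return 0;
    }

For the stack program push a; push b; push c; mul; add it prints t0 = a, t1 = b, t2 = c, t3 = t1 * t2, t4 = t0 + t3, which is already the three-address form a register allocator wants.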
So the basic principle is that stack machines have smaller instructions, but require extra instructions to shuffle values around on the stack. This is why register machines won out with hardware, and stack machines are popular for virtual machine intermediate representations (that get compiled or JIT down to register code in the end on basically all machines today).
If you wanna learn about how register renaming works, look into Tomasulo's algorithm. State of the art chips are a lot more complex, but that's a good historic introduction to the fundamental ideas.
I really wanted one of those! I held one in my hands one time when I was visiting the MIT-AI Lab -- it was Norman Margolus's, who made the CAM-6 with Tommaso Toffoli, which was programmed in Forth. He wouldn't let me have it, but he let my friend and me borrow a CAM-6 for a weekend science fiction convention to trip out on.
Computer science is still so new that many of the people at the cutting edge have come from other fields. Though Toffoli holds degrees in physics and computer science, Bennett's Ph.D. is in physical chemistry. And twenty-nine year old Margolus is still a graduate student in physics, his dissertation delayed by the work of inventing, with Toffoli, the CAM-6 Cellular Automaton Machine.
After watching the CAM in operation at Margolus's office, I am sure the thing will be a hit. Just as the Moog synthesizer changed the sound of music, cellular automata will change the look of video.
I tell this to Toffoli and Margolus, and they look unconcerned. What they care most deeply about is science, about Edward Fredkin's vision of explaining the world in terms of cellular automata and information mechanics. Margolus talks about computer hackers, and how a successful program is called “a good hack.” As the unbelievably bizarre cellular automata images flash by on his screen, Margolus leans back in his chair and smiles slyly. And then he tells me his conception of the world we live in.
“The universe is a good hack.”
[...]
Margolus and Toffoli's CAM-6 board was finally coming into production around then, and I got the Department to order one. The company making the boards was Systems Concepts of San Francisco; I think they cost $1500. We put our order in, and I started phoning Systems Concepts up and asking them when I was going to get my board. By then I'd gotten a copy of Margolus and Toffoli's book, Cellular Automata Machines, and I was itching to start playing with the board. And still it didn't come. Finally I told System Concepts that SJSU was going to have to cancel the purchase order. The next week they sent the board. By now it was August, 1987.
The packaging of the board was kind of incredible. It came naked, all by itself, in a plastic bag in a small box of styrofoam peanuts. No cables, no software, no documentation. Just a three inch by twelve inch rectangle of plastic—actually two rectangles one on top of the other—completely covered with computer chips. There were two sockets at one end. I called Systems Concepts again, and they sent me a few pages of documentation. You were supposed to put a cable running your graphics card's output into the CAM-6 board, and then plug your monitor cable into the CAM-6's other socket. No, Systems Concepts didn't have any cables, they were waiting for a special kind of cable from Asia. So Steve Ware, one of the SJSU Math&CS Department techs, made me a cable. All I needed then was the software to drive the board, and as soon as I phoned Toffoli he sent me a copy.
Starting to write programs for the CAM-6 took a little bit of time because the language it uses is Forth. This is an offbeat computer language that uses reverse Polish notation. Once you get used to it, Forth is very clean and nice, but it makes you worry about things you shouldn't really have to worry about. But, hey, if I needed to know Forth to see cellular automata, then by God I'd know Forth. I picked it up fast and spent the next four or five months hacking the CAM-6.
The big turning point came in October, when I was invited to Hackers 3.0, the 1987 edition of the great annual Hackers' conference held at a camp near Saratoga, CA. I got invited thanks to James Blinn, a graphics wizard who also happens to be a fan of my science fiction books. As a relative novice to computing, I felt a little diffident showing up at Hackers, but everyone there was really nice. It was like, “Come on in! The more the merrier! We're having fun, yeeeeee-haw!”
I brought my AT along with the CAM-6 in it, and did demos all night long. People were blown away by the images, though not too many of them sounded like they were ready to a) cough up $1500, b) beg Systems Concepts for delivery, and c) learn Forth in order to use a CAM-6 themselves. A bunch of the hackers made me take the board out of my computer and let them look at it. Not knowing too much about hardware, I'd imagined all along that the CAM-6 had some special processors on it. But the hackers informed me that all it really had was a few latches and a lot of fast RAM memory chips.
Toffoli and Margolus wrote a fine book about things you could do with their machine. It was more fun to read than Wolfram's later "New Kind of Science".
This isn't a CAM simulator, but I once wrote a little hack of a CA playground inspired by it. Table-driven in a similar way, user-scriptable with JS, includes neighborhoods like the Margolus neighborhood. https://github.com/darius/js-playground/blob/master/ca.js and the associated ca.html.
I love his work, both SF and math and CA! I met Rudy at the Hackers conference decades ago and gave him a demo of an early version of my CAM6 simulator running on my SparcStation 2, and he showed me Cellab that he and John Walker developed at Autodesk, and we've been exchanging ideas about CA since then. A few years ago he pointed out a very subtle bug in my CAM6 heat diffusion code that I fixed.
>Flick is named after “flickercladding,” the CA skin which covers the robots in my books Software and Wetware. In Flick, we see an AutoShade®d office whose rug is made of flickercladding that runs the TimeTun rule. You can tell which parts of the picture are “rug” because these cells have their bit #7 set to 1.
Flickercladding Interior Decoration
Conceived by Rudy Rucker
Drawn by Gary Wells
Modeled with AutoCAD
Rendered by AutoShade
Perpetrated by Kelvin R. Throop.
In this rule, we only change the cells whose high
bits are on. These cells are updated according to
the TimeTun rule.
>Some looked humanoid, some looked like spiders, some looked like snakes ...All were covered with flickercladding, a microwired imipolex compound that could absorb and emit light.
>The plot goes disastrously awry, and a human corporation called ISDN retaliates against the boppers by infecting them with a genetically modified organism called chipmold. The artificial disease succeeds in killing off the boppers, but when it infects the boppers' outer coating, a kind of smart plastic known as flickercladding, it creates a new race of intelligent symbiotes known as moldies — thus fulfilling Berenice's dream of an organic/synthetic hybrid.
DonHopkins on Oct 25, 2017, on: Boustrophedon
The Floyd Steinberg error diffusion dithering algorithm can use a boustrophedonous scan order to eliminate the diagonal geometric artifacts you get by scanning each row the same direction.
"In some implementations, the horizontal direction of scan alternates between lines; this is called "serpentine scanning" or boustrophedon transform dithering."
I implemented some eight bit cellular automata heat diffusion rules with error diffusion, which accumulated an unfortunate drift up and to the right because of the scan order.
"Rudy Rucker: I feel like you might have some kind of bug in your update code, an off-by-one thing or a problem with the buffer flipping. My reason is that I see persistent upward drift in the action, like if I mouse drag a blob it generally moves up. Also the patterns appearing in the blob aren't uniform. I mean...this IS supposed to be the 2D Rug rule, isn't it?"
So instead of scanning back and forth boustrophedoniously (which wouldn't eliminate the vertical drift, just the horizontal drift), I rotated the direction of scanning 90 degrees each frame ("spinning scan") to spread the drift out evenly in all directions over time.
// Rotate the direction of scanning 90 degrees every step,
// to cancel out the dithering artifacts that would cause the
// heat to drift up and to the right.
That totally canceled out the unwanted drifting and geometric dithering artifacts! That made it possible to cultivate much more subtle (or not-so-subtle) effects, like dynamically switching per-cell between different anisotropic convolution kernels (see the "Twistier Marble" rule for an extreme example).
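For reference, plain serpentine-scan Floyd-Steinberg dithering (the "boustrophedon transform dithering" from the quote above) looks roughly like the following sketch in C. It's not the CAM6 simulator's actual code, just a minimal illustration with made-up names:

    /* Floyd-Steinberg error diffusion to 1-bit output with a serpentine
       (boustrophedon) scan: even rows run left to right, odd rows right to
       left, and the error kernel is mirrored to match the scan direction.
       img is a w*h grayscale buffer with values 0..255, modified in place. */
    void dither_serpentine(int *img, int w, int h)
    {
        for (int y = 0; y < h; y++) {
            int ltr = (y % 2 == 0);                 /* scan direction this row */
            for (int i = 0; i < w; i++) {
                int x = ltr ? i : w - 1 - i;
                int dx = ltr ? 1 : -1;              /* "ahead" in scan order */
                int old = img[y * w + x];
                int q = old < 128 ? 0 : 255;        /* quantize to black or white */
                int err = old - q;
                img[y * w + x] = q;
                /* push the quantization error onto not-yet-visited neighbors */
                if (x + dx >= 0 && x + dx < w)
                    img[y * w + x + dx] += err * 7 / 16;
                if (y + 1 < h) {
                    if (x - dx >= 0 && x - dx < w)
                        img[(y + 1) * w + x - dx] += err * 3 / 16;
                    img[(y + 1) * w + x] += err * 5 / 16;
                    if (x + dx >= 0 && x + dx < w)
                        img[(y + 1) * w + x + dx] += err * 1 / 16;
                }
            }
        }
    }

Serpentine scanning only cancels the horizontal drift; rotating the scan direction 90 degrees each frame, as described above, spreads the drift out evenly in all directions.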
// This code originally started life as a CAM6 simulator written in C
// and Forth, based on the original CAM6 hardware and compatible with
// the brilliant Forth software developed by Toffoli and Margolus. But
// then it took on a life of its own (not to mention a lot of other CA
// rules), and evolved into supporting many other cellular automata
// rules and image processing effects. Eventually it was translated to
// C++ and Python, and then more recently it has finally been
// rewritten from the ground up in JavaScript.
// The CAM6 hardware and Forth software for defining rules and
// orchestrating simulations is thoroughly described in this wonderful
// book by Tommaso Toffoli and Norman Margolus of MIT.
// Cellular Automata Machines: A New Environment for Modeling
// Published April 1987 by MIT Press. ISBN: 9780262200608.
// http://mitpress.mit.edu/9780262526319/
No I haven't heard of that -- I'll check it out! Thanks for the juicy tip.
I played with the CAM-6 in Norman Margolus's office at MIT, and a friend of mine who worked for him brought one to a science fiction convention where we tripped out on it all night in a hotel room! I saved a copy of the floppies full of Forth code. (linked above)
register machines need to run about half as many instructions on average to do the same thing, which means that at the same clock speed a 1-cycle-per-clock cpu will go about twice as fast with a register instruction set. countering that is that the stack machine doesn't need multiplexers on the inputs to the alu, so it may actually be able to run at a higher clock speed if those multiplexers were part of the critical path, and the stack machine will occupy less silicon area, so you may be able to have more of them (but usually ram takes a lot more area than your cpu)
for out-of-order and superscalar processing, register machines have an advantage in that you can more easily have successive instructions that don't interfere, simply by not using the same registers. generally in a stack machine instruction it's hard to avoid using the result of the previous instruction; you need an extra instruction to do it (`over` or `swap` or something)
but i suspect it's partly just path-dependence: koopman's new wave of stack machines were already struggling upwind against 30 years of ibm 360, intel 8080, dec pdp-11, motorola 68000, dg nova, sun/fujitsu sparc, etc., all register machines, and so a rich store of knowledge had built up about them in compiler backends, assembly programmers, and cpu designers. maybe if chuck had designed the rtx2010 in 01973 instead of 01988 the story would have been different
maybe worth noting that all of the jvm, microsoft's clone of it (the cil), and wasm (as sph pointed out) are stack-based, and arm for a while had a 'jazelle' instruction set which interpreted many jvm bytecodes directly (trapping to software for more complex ones), so in a sense stack-based programs are more widespread now than they ever were before; that's what you see inside a .class file. they just are usually translated into some other instruction set before execution
also several people have pointed out that the stack machines most commonly used in practice, including the ones i mentioned above, are hybrids; they have architectural registers (often called local variables) but you have to copy them onto the stack to operate on them
Netflix's Atlas, a telemetry service, uses a stack language for querying metrics. The language has a relatively steep learning curve, but once one groks it, it is surprisingly easy to use to compose complex OLAP queries.
One interesting insight I had with stack languages is that they might lend themselves to an AST-like visual approach. The tree <-> postfix duality was an interesting aha for me that I’d long forgotten since that sophomore data structures class. :)
I've started building a small desk clock project based on a Rockwell R65F11 embedded FORTH processor, in an effort to force myself to actually do a nontrivial project in FORTH. It's not progressing quickly, but it's also not a priority. Several folks whose opinions matter to me were big FORTH hackers in the late 70s thru the 90s, so I figure there must be something there worth hacking on!
No. This is a kind of Duff’s device[0], although it is hard to identify. In essence, it’s attempting to manually unroll the loop’s (the while(true)’s) switch cases, so that if you have a PLUS with a DROP right after it, execution can just run linearly (that is, fall through to case DROP) instead of having to go back to the top of the loop, re-enter the switch statement, and check the next opcode to get to case DROP, etc.
Mitch Bradley's CForth has an infinite while loop as the inner interpreter with a switch statement inside, that uses a macro "#define next continue" to jump to the next instruction in the infinite while loop, instead of break.
Mitch's CForth is about as efficient (and well designed and refined) a Forth interpreter as you can possibly implement in portable C.
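The shape of that pattern is roughly this (just a stripped-down sketch to illustrate it, not CForth's actual source; the token names and variables here are made up):

    #include <stdint.h>

    enum { T_EXIT, T_LIT, T_PLUS, T_DROP };   /* made-up token numbers */

    #define next continue   /* go back to the dispatch at the top of the loop */

    void inner_interpreter(const uint8_t *ip, intptr_t *sp)
    {
        while (1) {
            switch (*ip++) {
            case T_LIT:             /* push the next byte as a literal */
                *--sp = (int8_t)*ip++;
                next;
            case T_PLUS:            /* add the top two stack items */
                sp[1] += sp[0];
                sp++;
                next;
            case T_DROP:            /* discard the top of stack */
                sp++;
                next;
            case T_EXIT:            /* return from this word */
                return;
            }
        }
    }

Each primitive ends with next instead of break, so control jumps straight back to the dispatch at the top of the while loop.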
Mitch's OpenFirmware is a more traditional industrial strength Forth implementation, extremely advanced, feature rich, and well designed, that runs on many different processors and hardware, has assemblers for different CPUs, and a metacompiler that supports many different CPUs, word sizes, byte orders, threading techniques, top level interactive control structures, headerless code, machine independent bytecode, read-only firmware, portable device drivers, hardware and driver configuration, and many hardware platforms.
This is an implementation of Open Firmware, a processor-independent
firmware design. Open Firmware is specified by IEEE 1275-1994,
Standard for Boot (Initialization, Configuration) Firmware. The
IEEE standard designation lapsed in 1999, when the Open Firmware
working group declined to go through the rather time-consuming
IEEE reaffirmation process, but the design lives on.
This implementation was written primarily by Mitch Bradley, the
original author of Open Firmware. The bulk of the recent
development was done at Firmworks by Bradley and colleagues.
It traces its roots back to the original Open Boot firmware
implementation that Bradley developed at Sun Microsystems in the
late 80s and early 90s. That in turn had roots in Forthmacs,
a Forth programming language implementation developed and
marketed during the early 80s by Bradley Forthware. And
Forthmacs, in turn, owes a debt of gratitude to the public
domain Forth implementations F83, developed by Michael Perry,
and MVP Forth, by Glen Haydon. Lilian Walter wrote the USB stack.
DonHopkins on Nov 19, 2021, on: Forth vs Lisp
The inner interpreter of Mitch Bradley's portable and efficient CForth is simply implemented with a big switch statement, in the function called "inner_interpreter".
The C compiler typically compiles that code with the big switch statement into an efficient jump table, behind the scenes.
Properly speaking, that's a token threaded interpreter, since it switches on token numbers in the code field, instead of indirecting through function pointers to the C equivalent of "code" words.
I've previously posted some stuff about Mitch Bradley -- I have used various versions of his ForthMacs / CForth / OpenFirmware systems, and I was his summer intern at Sun in '87!
Mitch is an EXTREMELY productive FORTH programmer! He explains that FORTH is a "Glass Box": you just have to memorize its relatively simple set of standard words, and then you can have a complete understanding and full visibility into exactly how every part of the system works: there is no mysterious "magic", you can grok and extend every part of the system all the way down to the metal. It's especially nice when you have a good decompiler / dissassembler ("SEE") like ForthMacs, CForth, and OpenFirmware do.
They wanted to make it low friction to learn for their audience who are largely javascript programmers. I think it's a reasonable tradeoff even if I personally would prefer a more combinator style syntax.
Was that really the intent? I made a longer rant elsewhere in this thread, but I sure got the impression it was really for compiler implementers and not for anyone really writing in it at all. It’s sort of a false Lisp and a false Forth; I really should not have been afraid of the ARM vs x86 split and just brushed up on a real assembly. :)
WASM grew out of the earlier asm.js, which was explicitly designed as a subset of javascript: http://asmjs.org/spec/latest/ The core idea there was to identify a subset of JS that was high performance on browsers and could be used as a compile target. WASM appeared because once asm.js got some momentum, people naturally asked "hey if browsers could implement just a couple extra things how much better could we make this?"
what they actually came up with was just as strongly against the familiarity of the community momentum already built, wasn't it?
what features do you have in mind of the current wasm spec that are (1) in common with asm.js and (2) not in common with a hypothetical more-forth-like spec for efficient secure execution in browsers?
I guess my point is their minds are going to be melted anyway by 1. low level details, 2. a stack machine, and 3. s-expressions. Might as well use a text format that at least fits the domain.
As Mitch's summer intern at Sun in 1987, I used CForth to make an interactive extension language for CADroid (a printed circuit design program from Lucasfilms that Sun used internally).
all it really took was for me to stop using stack manipulation operators and treat forth as an interactive user interface rather than a programming language
That's an incredible blog! I have two wishes now: that you would continue to write your notes because I've skimmed a few and they were deeply interesting to me, and that you would add an RSS feed, so I can add it to Feedbin.
But in any case, thanks. Lot of interesting ideas in that article.
glad you like it! it's more a book than a blog; its contents are final and immutable, so an rss feed would be of very limited interest. the sequels so far are derctuo and dernocua