
Stack Computers: 4.4 Architecture of the Novix NC4016 (1989) - kristianp
https://users.ece.cmu.edu/~koopman/stack_computers/sec4_4.html
======
jacquesm
The Novix was crazy fast when it landed. I had access to one of the first
samples and we had to sign all kinds of stuff to get our hands on one. Rumor
had it that it was part of what went into the Tomahawk, but I've never been
able to substantiate that. I used it for image processing.

The later versions of the same software ran on regular x86 hardware, which
caught up pretty quickly with custom processors. I had a DSP032 which also
ended up in the drawer because of the speed with which regular processors
improved.

Interesting that now, a good 30 years later, we are again using co-processors.

~~~
kjs3
I think it was Joel McCormack who wrote a paper (context: X11 display servers)
that basically said that _in the long run_, dumb frame buffers were probably
better than dedicated graphics processors. He reasoned that the general-purpose
processor folks, with their greater scale and resources, were improving
performance faster than the dedicated GPU folks could. They would therefore
eventually be able to perform the same graphics primitives just as fast as the
GPUs, and they were using up all of the memory bandwidth anyway.

That doesn't _seem_ to be as true anymore, and there were always corner cases
(e.g. hardware-accelerated mouse sprites, I think, were always a win for
various reasons), but it was an interesting debate back in the day.

~~~
jacquesm
That all changed when high-level primitives and floating-point operations were
moved to the graphics co-processor. SIMD at that level was a game changer. I
wrote a graphics driver for the BBC Micro to use the Elektor 'GDP' as an
output device; it totally blew the 8 bit CPU of the day out of the water.

These things tend to oscillate; presumably one day there will be another
generation of CPUs much closer to our current CPU/GPU combo that will
outperform GPUs, and then the cycle will start over again.

~~~
kjs3
_it totally blew the 8 bit CPU of the day out of the water_

I think the counterargument in this context would be "...until we all went to
16-bit processors.". But I understand the point you're making.

_These things tend to oscillate_

I think you're of course correct, and I don't think the original paper
disputes that. What I think (without much research) has happened is that the
_period_ of that oscillation has dramatically expanded since the paper was
written.

------
mud_dauber
I got to build algorithms for Novix's successor, the RTX2xxx family, at Harris
Semiconductor. Holy cow, it was fun teaching customers how to wrap their heads
around a fully stack-based instruction set. :-)
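For readers who haven't met one: a fully stack-based (zero-operand)
instruction set works like postfix notation. Operands are pushed onto a data
stack and operators consume the top of it in place, so there are no register
fields to encode at all. A rough illustration (a toy Python interpreter of my
own devising, not actual Novix/RTX code):

```python
def run(program):
    """Interpret a whitespace-separated postfix program on a data stack."""
    stack = []
    binops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for word in program.split():
        if word == "dup":              # duplicate top of stack (Forth DUP)
            stack.append(stack[-1])
        elif word in binops:           # operator: pop two, push result
            b, a = stack.pop(), stack.pop()
            stack.append(binops[word](a, b))
        else:                          # literal: push onto the stack
            stack.append(int(word))
    return stack

print(run("3 4 5 + *"))   # 3 * (4 + 5) -> [27]
print(run("6 dup *"))     # 6 squared  -> [36]
```

The mental shift for customers was exactly this: expressions are rewritten so
intermediate results live implicitly on the stack rather than in named
registers.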

------
hapless
For those missing the context, this was a circa 1985 attempt to execute Forth
almost directly in hardware.

Very cool, and a bit weird by 1985 standards.

~~~
dbcurtis
Stack machines were never very popular outside of Burroughs, which built
several of them implementing a number of elegant ideas.

When Algol first came out, they put a grad student summer intern onto the
project of writing an Algol compiler, from scratch, for one of the stack
machines. The intern completed the entire compiler in one summer. So, the
question is, was the compiler a single summer intern project because stack
machines are so well suited to Algol, or was the compiler a single summer
intern project because the intern was Don Knuth?

~~~
mgsouth
The HP3000 [0] ("classic", pre-PA-RISC) was stack-based. It was a pretty
popular business machine; the Classic architecture was produced between the
early 70s and early 90s. It was 16-bit, segmented, Harvard, and competed
against the IBM S/32 - S/36, AS/400, PDP-11, and VAX. To most programmers
there wasn't anything too special about the architecture; almost all
development was in RPG, COBOL, or a really incredible (for CRUD) 4GL product
from Cognos called Powerhouse [1].

[0]
[https://en.m.wikipedia.org/wiki/HP_3000](https://en.m.wikipedia.org/wiki/HP_3000)

[1]
[https://en.m.wikipedia.org/wiki/PowerHouse_(programming_lang...](https://en.m.wikipedia.org/wiki/PowerHouse_\(programming_language\))

------
lebuffon
James Bowman used similar ideas to create J1 in 200 lines of Verilog.

[https://excamera.com/sphinx/fpga-j1.html](https://excamera.com/sphinx/fpga-j1.html)

------
RJhKTcQMgG
Made me dig up this old note from 2010
[http://www.ultratechnology.com/chips.htm](http://www.ultratechnology.com/chips.htm)

------
pinewurst
I had access to a Novix eval system (like a shoebox with a 5 1/4 floppy in the
side) at the time, and for a lot of things it was _very_ fast - on a par with
a VAX 8600, as I remember.

My memories of FORTH are really good too, especially after writing PostScript
(FORTH enough) printer drivers that were downloaded as printer programs. They
turned further-downloaded lists of nodes into properly placed topological
diagrams and printed them out.

