
Nintendo 64 Architecture – A Practical Analysis - bottle2
https://copetti.org/projects/consoles/nintendo-64/
======
monocasa
I gave a talk not too long ago about running Rust on a Nintendo 64, with the
slide deck written in Rust and running on an N64.

[https://twitter.com/DebugSteven/status/1054903603985559553](https://twitter.com/DebugSteven/status/1054903603985559553)

So I guess what I'm saying is that I have pretty hands-on knowledge of the
system and would be happy to answer any questions I can.

One thing I'll throw out there is that one of the biggest limitations of the
N64 (its 4KB texture memory) gets called a texture cache a lot, but that's a
misnomer. It's a manually managed piece of memory, and (IMO) the system would
have been much better off if it were actually a cache, rather than having to
load an entire texture in regardless of what was being sampled. Nowhere in
Nintendo's literature have I seen them call it a cache either. The crazy hacks
that Rare did to subdivide their geometry on texture boundaries wouldn't be
necessary, for instance. I'd maybe even take a 2KB cache over a 4KB chunk of
manually managed memory.

One other aside is that I think the system still has tons of unlocked
potential. So much of unlocking its power seems to be centered around memory
bank utilization. Switching which page of DRAM is open within a bank is
expensive in terms of latency, but it seems like if you allocate your memory
in 1MB bank chunks you can get around a lot of the limitations of the slow
memory that developers complained about at the time. I don't blame the
developers; they were coming from the SNES, where RAM was single-cycle
access, to the N64, which had a very deep, very modern memory hierarchy, with
all that that means for your code. The industry as a whole didn't really
catch on until about halfway through the PS2's life cycle. But applying some
of those PS2 techniques back, the system really purrs when you dedicate a 1MB
bank to each streaming source or destination. I can't wait to see what crazy
stuff happens when the demoscene folk really start to get their hands dirty
with it.
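
The bank-allocation idea above comes down to pointer arithmetic. Here's a
minimal sketch; the 1MB bank size matches what's described above, but the
function names and the idea of expressing it this way are my own illustration,
not anything from Nintendo's SDK:

```c
#include <stdint.h>

/* Illustrative sketch: treat RDRAM as a series of 1MB banks and place
 * each streaming source/destination in its own bank, so one stream's
 * accesses never force another stream's open DRAM page to close. */
#define BANK_SIZE (1u << 20) /* 1MB per bank (as described above) */

/* Which bank a physical address falls in. */
static inline uint32_t bank_of(uint32_t phys_addr) {
    return phys_addr / BANK_SIZE;
}

/* Round an address up to the start of the next bank, so a buffer
 * placed there never shares a bank with the previous allocation. */
static inline uint32_t align_to_bank(uint32_t phys_addr) {
    return (phys_addr + BANK_SIZE - 1) & ~(BANK_SIZE - 1);
}
```

With an allocator built on `align_to_bank`, two streaming buffers that would
otherwise interleave page-miss latency land in separate banks instead.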

~~~
zeta0134
> The crazy hacks that Rare did to subdivide their geometry on texture
> boundaries wouldn't be necessary for instance.

I would _love_ to know more about this. Is their texture format tomfoolery
written up somewhere?

~~~
monocasa
I don't think anything is written up, but you can see it if you put Project64
into wireframe mode. The clearest I've seen it is in the intro cutscene of
Conker's Bad Fur Day, where the camera slowly backs up from Conker's throne.

So TMEM is only 4K. If you want mipmapping, that eats half of it, so you're
down to 2K practically. That leaves you with enough room for one 32x32 16BPP
texture at most. So I think what they did in some cases was to take a mesh
with a larger texture and run it through a processor to tessellate the mesh
on smaller texture block boundaries in UV space, so they can render all the
geometry sharing one texture block at once, then swap to the next subtexture
and render all its geometry. That'd give you an apparently larger texture
than you could fit into TMEM, and is one reason (of many) why their games
look so good. They also might not have had tooling for that and just
brute-forced it by hand; I can't tell just from looking at the wireframe.
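
The bucketing step of that scheme can be sketched in a few lines. This is my
own guess at how such a tool might classify triangles, assuming 32x32 tiles as
described above; the names are illustrative, not Rare's actual pipeline:

```c
/* Illustrative sketch: a large texture is cut into 32x32 sub-textures
 * (the most that fits in 2K of TMEM at 16BPP with mipmapping), and
 * geometry is grouped by which sub-texture its UVs fall in, so each
 * group can be drawn after a single TMEM load. */
#define TILE_W 32
#define TILE_H 32

/* Index of the sub-texture containing texel (u, v) of a texture that
 * is tex_w texels wide. */
static inline int tile_index(int u, int v, int tex_w) {
    int tiles_per_row = tex_w / TILE_W;
    return (v / TILE_H) * tiles_per_row + (u / TILE_W);
}

/* A triangle can be drawn in one pass only if all three vertices map
 * to the same sub-texture; otherwise the offline tool would have to
 * split it on the tile boundary in UV space. */
static inline int tri_fits_one_tile(const int u[3], const int v[3], int tex_w) {
    int t0 = tile_index(u[0], v[0], tex_w);
    return t0 == tile_index(u[1], v[1], tex_w)
        && t0 == tile_index(u[2], v[2], tex_w);
}
```

Triangles failing `tri_fits_one_tile` are exactly the ones you'd see split in
the wireframe view.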

------
twic
One of my favourite little easter eggs in Goldeneye is that during the Silo
mission, two of the satellite components you have to steal are the N64's RSP
and RDP:

[https://twitter.com/007goldeneye25/status/109829415491907174...](https://twitter.com/007goldeneye25/status/1098294154919071745)

~~~
LaserDiscMan
Nintendo also included their own Easter eggs:

The rabbit you need to catch beneath the castle in Super Mario 64 is named
"MIPS". :)

[https://www.mariowiki.com/MIPS](https://www.mariowiki.com/MIPS)

~~~
saagarjha
> In the remake Super Mario 64 DS, MIPS does not make a reappearance, instead
> being replaced by the rabbits scattered throughout the castle for each
> character to find.

Aw, they should have brought it back and renamed it ARM…

------
r0bbbo
For someone just starting to learn about computer architecture at a low level
(rather embarrassing for someone who's been in the industry for over a
decade), this is a really interesting read, and it's helping concepts like
pipelining and caching to gel.

~~~
als0
Hardware used to be so exotic!

~~~
msla
> Hardware used to be so exotic!

It might circle back around to being exotic again, if FPGAs take off and more
work is done on highly-specialized task-specific hardware, as opposed to
building the fastest general-purpose chips you can and beating problems to
death with sheer speed.

~~~
rhlsthrm
Is this a thing that's happening? I remember how cool I thought FPGAs were
from my college CE classes. Seems intuitive to me that you would want to
specialize the hardware once you have a process figured out. The fact that
FPGAs are upgradable makes it even more of a no brainer to me.

~~~
to11mtm
Well, there's a cost-benefit tradeoff: cheaper FPGAs may not be as performant
as others for some tasks, there's still a gate budget to deal with, and I
personally am not sure whether the FPGA world is one where you have full
control over what you do with the chip when you sell it in a commercial
product.

I know Gigabyte, back in the mid-2000s, made a PCIe card that let you use DDR
as a disk drive; for the original they actually used a Xilinx Spartan FPGA
since it was a smaller run.

~~~
moftz
FPGAs can sometimes be made one-time-programmable using anti-fuses that
basically disable the JTAG interface, or can be set to disable the interface
if the FPGA detects attempted tampering. Most of the time, an FPGA is going
to be set to OTP to prevent competitors from stealing the design, in
applications where upgrading the firmware via JTAG is not necessary.

The FPGA also can have a massive unique key that allows the designer to create
a whitelist algorithm that only lets certain unique IDs run that firmware.
Other options involve setting a time limit for how long the firmware will run,
disabling certain features, or totally bricking that FPGA forever. Spartans
have this feature but it would still allow for someone to build a new design
that doesn't check the device ID.

Additionally, the bitstream can be encrypted so that if a field update is
necessary, or the firmware is stored in a separate flash chip, someone can't
reverse engineer it.

Overall, the more you pay, the more security features there are available. An
example secure design would disable JTAG pins permanently and have a
microprocessor inside that would handle new updates. The processor would
authenticate any new encrypted firmware before programming the internal flash.
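
The secure-update flow described above could be sketched like this. This is a
toy model: the rotating-checksum "tag" stands in for a real MAC/authenticated
cipher, and every name here is made up for illustration, not a vendor API:

```c
#include <stdint.h>
#include <stddef.h>

/* Toy sketch: an on-board microprocessor only programs the FPGA's
 * internal flash after the incoming (encrypted) bitstream
 * authenticates against an expected tag. */
#define TAG_SEED 0xA5A5u

static uint16_t compute_tag(const uint8_t *bitstream, size_t n) {
    uint16_t tag = TAG_SEED;
    for (size_t i = 0; i < n; i++)
        tag = (uint16_t)(((tag << 1) | (tag >> 15)) ^ bitstream[i]);
    return tag;
}

/* Returns 1 and "programs" flash only if the tag matches; otherwise
 * the update is rejected and flash is left untouched. */
static int authenticate_and_program(uint8_t *flash, const uint8_t *bitstream,
                                    size_t n, uint16_t expected_tag) {
    if (compute_tag(bitstream, n) != expected_tag)
        return 0; /* reject unauthenticated update */
    for (size_t i = 0; i < n; i++)
        flash[i] = bitstream[i]; /* stand-in for flash programming */
    return 1;
}
```

In a real design the comparison and key material would live inside the secure
boundary; the point is just that programming is gated on authentication.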

------
castratikron
Someone made a pin-compatible Controller Pak with FRAM that doesn't require a
battery. Has layout files too.

[http://www.qwertymodo.com/hardware-projects/n64/nonvolatile-...](http://www.qwertymodo.com/hardware-projects/n64/nonvolatile-nintendo-64-controller-pak)

------
rawoke083600
Brilliant article ! I love these in-depth "old-hardware" articles with my
morning coffee. (It's only 8am here in South Africa)

Lol, this stood out for me (although it's probably just semantics...):

"Reality Co-Processor running at 62.5 MHz." What big dreams we had back then,
to try and "simulate reality" with only 62.5 MHz :)

Well done author... well done Nintendo!

~~~
pjmlp
Amiga's Agnus and its AGA successor ran at around 7-35 MHz tops. :)

------
bambataa
On the cost-saving point, I always understood that the limited 4KB texture
memory led to lots of games having really muddy, blurry textures. How much
more would 8KB or 16KB have cost? It seems a small cost saving that had a
pretty large negative impact.

~~~
nwallin
Regarding simply having 8/16/32kB: the cache was integrated into the chip
itself; it wasn't RAM that lived on the motherboard. So adding more would
have required a larger chip.

It was a multifaceted problem, and was ultimately a design flaw/oversight
rather than someone saying "I think 4kB is enough memory to store all the
textures". The problem is less that the cache was small, and more that
Nintendo's plans for how awesome RDRAM and a unified memory architecture
would be didn't pan out.

Problem #1: There was no dedicated video memory. All RAM on the N64 was shared
RAM. So framerates tanked if you didn't have most of your stuff in cache. Keep
in mind the framebuffer also lived in this unified memory area, so the video
chip was already very noisy on the memory bus.

Problem #2: The unified shared system RAM was RDRAM, not SDRAM. And the
latency on RDRAM is absolutely terrible. So the already expensive cost of
using RAM was compounded.

If the N64 had done what the PlayStation and Saturn did and just had
dedicated video/system RAM, and made this RAM relatively low-latency SDRAM
instead of relatively high-latency RDRAM, this 4kB limitation wouldn't have
mattered.

~~~
joneholland
If I recall correctly, later games actually packed higher-throughput RAM into
the cartridges to work around the latency of the onboard RAM.

~~~
nwallin
They used uncompressed textures on the cart, in ROM (not on-cart RAM).
Normally a game would store compressed textures in ROM and decompress them
into RAM. It was a solution with significant tradeoffs though.

#1 It was still slower than the cache.

#2 You were still using the single shared bus. You would still be using cycles
which contribute to data stalls elsewhere in the system.

#3 ROM was expensive. N64 games were typically in the ballpark of $10 more
expensive than Playstation or Saturn games because of the manufacturing
expense.

#4 I don't fully understand why, but it was all or nothing. You couldn't have
uncompressed textures in ROM but also gain the benefit of the cache. Maybe the
cache invalidation was poor or something. I wish I knew more.

Later games were more likely to go this route because ROM was cheaper.
(Moore's Law and all that)

~~~
monocasa
So the TMEM wasn't a cache, but manually managed memory split into eight
512-byte banks that had to be loaded from the RDP's command list stream.
That's half the problem.

Additionally, the TMEM could only be loaded from RDRAM, not directly from the
cartridge. I think the RDP's DMA master is only connected to the RDRAM slave
port and not the main system's bus matrix.

So going back to it: a lot of the time games would store textures compressed
with a simple algorithm that could run out of the CPU's cache. Then the
scheme looks like:

* Cart->RDRAM DMA of compressed texture

* CPU decompresses the texture into another RDRAM bank, which can be considered an RDRAM->RDRAM transfer. Sometimes the RSP handles this instead. I'm not sure if you could load straight out of RSP DMEM to avoid another bounce to RDRAM; I don't think XBUS works that way, but I could be wrong.

* RDRAM->TMEM DMA of uncompressed texture
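
The three steps above can be sketched as plain C. Here `memcpy` stands in for
the real DMA engines, the byte-doubling "decompressor" is a stand-in for
whatever simple scheme a game used, and none of these function names are the
real libultra API:

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

enum { TMEM_SIZE = 4096 }; /* the RDP's 4K of texture memory */

/* Step 1: cart ROM -> RDRAM staging buffer (a PI DMA in reality). */
static void cart_to_rdram(uint8_t *rdram_dst, const uint8_t *rom_src,
                          size_t n) {
    memcpy(rdram_dst, rom_src, n);
}

/* Step 2: CPU (or RSP) decompresses into another RDRAM bank. The
 * stand-in "decompressor" just doubles each byte; returns output size. */
static size_t decompress(uint8_t *dst, const uint8_t *src, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        dst[out++] = src[i];
        dst[out++] = src[i];
    }
    return out;
}

/* Step 3: RDRAM -> TMEM load (an RDP load command in reality; TMEM
 * can only be fed from RDRAM, never straight from the cart). */
static void rdram_to_tmem(uint8_t tmem[TMEM_SIZE], const uint8_t *rdram_src,
                          size_t n) {
    memcpy(tmem, rdram_src, n < TMEM_SIZE ? n : TMEM_SIZE);
}
```

The point of the sketch is the extra RDRAM round trip in step 2, which the
uncompressed-texture games mentioned below skip entirely.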

Interestingly, games with more advanced texturing schemes like Indiana Jones
tended to use uncompressed textures. They did this to avoid the decompression
step and its bandwidth. At that point it's just staging the texture with the
cart's DMA, and slurping that into TMEM without any other processors eating
bandwidth in between.

------
Causality1
I don't think I would have gotten as interested in gaming if it weren't for
Nintendo's decisions with the N64. The native bilinear texture filtering,
Z-buffer, and subpixel model rendering make such an enormous difference to me
I'd have found the Playstation unplayable.

~~~
webwielder2
To me, N64 continued the video game "tradition" I knew from the 16-bit era.
Colorful, fast-paced, responsive. PSX games by contrast were dour, slow-paced,
and controlled poorly because of Sony's initial failure to consider that
digital control wouldn't work well in a 3D environment.

~~~
echelon
The N64's control sticks used digital rotary encoders.

~~~
duskwuff
The PSX only had D-pad style controls at release in 1995 -- no joysticks! The
Dual Analog controller wasn't released until 1997.

(I think you're getting distracted by the terms "digital" and "analog". It may
help to think about this in terms of discrete and continuous inputs instead.)

------
293984j29384
The design of this web page leaves a bit to be desired. The tabbed boxes with
the light grey background on the white background of the website itself are
pretty easy to miss as you scroll forever.

------
plerpin
Did Rambus have a dossier of compromising photos on industry executives in the
mid 90's? Why did Intel and Nintendo go all in on such an expensive and
technically inferior memory technology? The latency is such a killer,
especially if you're on an architecture with really deep pipelines (ahem, P4).

~~~
rasz
Rambus meant Nintendo shipped a fifth-gen console on a 2-layer PCB with only
4 big ICs. TWO layers! That was a huge cost saving. Compare to the Sega
Saturn, with a total of something like 144 bits of various memory buses
divided into multiple memory banks over multiple memory chips.

~~~
plerpin
How many layers did the Saturn's PCB have?

~~~
rasz
At least 4, like the PlayStation. Compare this nightmare
[https://mcretro.net/sega-saturn-photographing-has-begun/](https://mcretro.net/sega-saturn-photographing-has-begun/)

to [https://bitbuilt.net/forums/index.php?threads/trimming-your-...](https://bitbuilt.net/forums/index.php?threads/trimming-your-n64-revs-1-4.52/)
(reverse-engineered PCB layout here:
[https://gmanmodz.com/2020/01/30/2020-the-year-of-n64-again/](https://gmanmodz.com/2020/01/30/2020-the-year-of-n64-again/))

One was thrown together by committee with some functional goal in mind, the
other designed top to bottom with huge influence from process engineers. A
simplified layout and reduced component count/variety means less time in
pick & place, faster optical alignment, faster optical inspection, and less
opportunity for process flaws.

~~~
plerpin
Very interesting, thanks.

------
rasz
Some more hardware info, including die shots and development info (Verilog
simulation etc.), here:
[https://www.eevblog.com/forum/blog/eevblog-491-nintendo-64-g...](https://www.eevblog.com/forum/blog/eevblog-491-nintendo-64-game-console-teardown/)

------
littleweep
I get the nostalgia angle -- but why are we discussing a console that's >20
years old?

~~~
zapzupnz
Because it's interesting and technical. Despite the name 'Hacker News', I
don't know if you've noticed but not a lot of what's on here is necessarily
news — just plenty of food for thought for engineers, people in comp sci, etc.

