
Playstation Architecture: A Practical Analysis - plerpin
https://copetti.org/projects/consoles/playstation/
======
louthy
I remember developing for the PS back in 1996-1999. On the first title I
worked on I was building the graphics engine and animation systems. Originally
in C, then in MIPS assembler to get as much perf as possible: with a fixed
hardware target, the difference for your title would mostly come down to the
performance of the graphics engine.

I got to the point where I'd fitted the entire graphics engine and animation
system into 4K, so it would fit in the instruction cache, and moved as much
regularly used data into the 1K scratchpad as I could fit (yes, an L1 cache
where you manually decided what to put in it!). Access to the scratchpad took
1 cycle.
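
Pinning data in the scratchpad looked roughly like this (a minimal sketch:
0x1F800000 is the PS1's scratchpad address, but the names and layout here are
invented for illustration):

    /* Minimal sketch of manual scratchpad placement. 0x1F800000 is the
       PS1's 1 KB scratchpad (the data cache wired up as fast RAM); the
       struct layout and names are invented for illustration. */
    #include <stdint.h>

    #define SCRATCHPAD_BASE 0x1F800000u

    typedef struct { int16_t m[3][3]; } Matrix3;  /* 3x3 rotation matrix */

    /* Hot per-frame data pinned in the scratchpad: 1-cycle access, but
       you budget every byte of the 1 KB yourself. */
    static Matrix3 *const rot   = (Matrix3 *)SCRATCHPAD_BASE;
    static int16_t *const verts = (int16_t *)(SCRATCHPAD_BASE + sizeof(Matrix3));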

Then I'd _'hand interleave'_ asm operations. Reading from memory was slow: it
took 4 cycles, which the C compiler would normally fill with NOPs (1 load
instruction, 3 NOPs). So I'd use assembly instead of C and try to fill those
NOPs with other actual operations that didn't need the memory being requested,
essentially doing hand-crafted concurrency.

Because loading and storing from/to memory was such a common operation, this
would make the code very, very hard to maintain, and sent me slightly crazy
for a while! Often it meant doing x, y, z operations (for 3D processing, like
vector multiplication) concurrently, but wherever the NOPs could be
reduced, more could be done.
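
At the C level the trick looked roughly like this (a sketch, with the
approximate R3000 instruction schedule in the comments; the real code was
written directly in assembler, and the names are invented):

    /* Sketch of hand-interleaving. On the R3000 a loaded value isn't
       usable immediately, so naive code wastes a slot after every load;
       issuing the independent loads for x, y and z back to back lets the
       multiplies fill those slots instead of NOPs. */
    typedef struct { int x, y, z; } Vec3;

    void scale3(Vec3 *out, const Vec3 *v, int s) {
        int x = v->x;    /* lw $t0, 0($a0)                         */
        int y = v->y;    /* lw $t1, 4($a0) -- fills x's delay slot */
        int z = v->z;    /* lw $t2, 8($a0) -- fills y's delay slot */
        out->x = x * s;  /* x has arrived by the time it's needed  */
        out->y = y * s;
        out->z = z * s;
    }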

With various other bits of cunning I eventually got it to the point where I
broke the manufacturer's specs for whatever Sony said the PS could do per
second (my memory is a bit fuzzy about what those specs were, but I remember
myself and the team being pretty damn pleased at the time).

It was a fun machine to program for. The Saturn, which was out at the same
time, always struggled to keep up because it was so hard to develop for, even
though on paper it was better. I think that sounded the death knell for Sega.

~~~
pcwalton
> Reading from memory was slow: it took 4 cycles, which the C compiler would
> normally fill with NOPs (1 load instruction, 3 NOPs). So I'd use assembly
> instead of C and try to fill those NOPs with other actual operations that
> didn't need the memory being requested, essentially doing hand-crafted
> concurrency.

This isn't usual for the MIPS ISA, is it? MIPS has branch delay slots but not
memory delay slots, per my understanding. (Of course if there's a data
dependency on a pending load the CPU still has to stall.)

~~~
louthy
The R3000 suffered from the need for load delay slots too, unfortunately. I
may have forgotten the exact details, though. I am genuinely in awe of people
who seem to retain the details of this stuff 10-20 years later; my brain
certainly doesn't work like that!

------
dzeix35wc363CzC
For anyone interested in how it was to develop for the original PSX (and
therefore under hardware constraints):

 _How Crash Bandicoot Hacked The Original Playstation_

[https://www.youtube.com/watch?v=izxXGuVL21o](https://www.youtube.com/watch?v=izxXGuVL21o)

Immensely interesting video! "Old" hardware makes me feel so humble regarding
what we have now. Hardware limitations back then really pushed developers
towards novel approaches and solutions.

~~~
agavin
Thanks for the interest in our shared video gaming past. I had a lot of fun
making that video. The PS1 was a fun machine: it was capable, and complex
enough that you felt it had secrets, but not so bizarre or byzantine that you
felt learning them was a waste of time. And you were pretty much the only one
in there, as the libraries were just libraries, not really an OS. That was
still true of the PS2, although that was a complex beast; by the PS3 there was
more of a real OS presence. If you want some more, slightly different,
slightly overlapping info on the PS1 or making Crash, I have a mess of
articles on my blog on the topic:
[https://all-things-andy-gavin.com/video-games/making-crash/](https://all-things-andy-gavin.com/video-games/making-crash/)

~~~
2600
I really enjoyed your "Making Crash Bandicoot" blog posts and the "War
Stories" video. I would love to read about your work on Jak & Daxter and
working on the PS2.

~~~
kernelbugs
Me too, the Jak & Daxter games have a very special place in my heart as my
childhood introduction to gaming. I'd love to read about it!

------
flipacholas
Wow, I never imagined my article being on Hacker News. Thanks for sharing it!
If anyone has any comments or requests, or wants to report a mistake, please
drop me a message (my email address is on the website); I’m constantly adding
more material.

~~~
sgwil
I really enjoyed the special features (3D model viewers, etc.) you added to
this article. It's a great presentation and an interesting read.

------
fxtentacle
It makes me feel nostalgic to read about the times when people still tried to
understand the hardware and make things work amazingly well despite harsh
resource limits.

Nowadays, even simplistic programs take ages on a device 1000 times more
powerful. I had time to read this article because restarting our ruby
development server is so excruciatingly slow.

It seems the craftsmanship aspect of computer programming is getting lost, and
all that remains is the large leverage that one can use to drive profits with
software.

~~~
Razengan
> _It seems the craftsmanship aspect of computer programming is getting lost,
> and all that remains is the large leverage that one can use to drive profits
> with software._

Too many lazy/greedy developers fighting against users and operating systems
(coughelectroncough)

~~~
inform880
I understand that Electron apps tend to be resource hogs, but what other
options for software create as many cross-platform opportunities with so
little work? I think for every single greedy developer/company there's another
small software project only able to get off the ground because of those
maximized opportunities. This is coming from a relatively new full-time
developer who's trying to get a small side project off the ground.

~~~
AnIdiotOnTheNet
> I understand that Electron apps tend to be resource hogs, but what other
> options for software create as many cross-platform opportunities with so
> little work?

Is laziness a good excuse for poor software? For wasting the time and
resources of everyone who uses the application? For disregarding the
conventions of the host platform, including any user preferences and
accessibility features?

~~~
inform880
> For wasting the time and resources of everyone who uses the application?

I don't believe this argument for a second, given the ubiquity of successful
Electron apps.

~~~
AnIdiotOnTheNet
An application that takes longer to load, is slower, and consumes more CPU and
RAM is by definition wasting the user's time and resources. The slow and
bloated nature of Electron applications is griped about quite often. That
people are required to use Slack for work doesn't change that.

------
willis936
There’s a lot of awe in the comments about how limitation drove ingenuity.
It’s true of electronics throughout most of the second half of the 20th
century, not just the toys that came at the end. The Game Boy is a great
example. A Z80 is an incredibly simple thing. You have to sit there and hack
together games in assembly, performing all sorts of tricks to keep the memory
footprint down. The first Speak & Spell is another example. I wish I could
conjure up more, but basically any novel toy that involved electricity between
1950 and 1970 is a work of art. You don’t need a computer, or even a
transistor, to use electronics tools in a creative way to make an interactive
toy.

------
ddoolin
It's crazy to see just how far they've come from the first console to the
latest and all the lessons they've learned along the way.

[https://www.youtube.com/watch?v=ph8LyNIT9sg](https://www.youtube.com/watch?v=ph8LyNIT9sg)
(Some Playstation 5 architecture highlights).

~~~
Koshkin
Well, as much as I would love to share the excitement, all they've come to is
what essentially amounts to another gaming PC. Not sure how 'far' that is, but
clearly all the innovation now happens on the PC side. (It almost seems that
the custom architectures of the earlier PlayStation models and other consoles
had more to do with the need to provide enough power in a smaller package.)

~~~
mortenjorck
Did you watch Cerny’s presentation, specifically the part on the custom SSD
architecture? I came to the opposite conclusion: this is the most innovative
development from the console market in possibly decades. They’re throwing
everything behind optimizing the SSD-to-VRAM pipeline in a way that
component-built PCs won’t be able to do until several new industry standards
are developed, with potentially profound impacts on the way games are
designed.

It’s specifically an interesting contrast with the Playstation 3’s exotic Cell
processor architecture, which was probably driven more by a desire to _appear_
innovative than by practical applications. By moving to standard x86
architecture, Sony has counterintuitively allowed its system designers to
focus on areas that will actually give game developers some novel
possibilities.

~~~
deelowe
> optimizing the SSD-to-VRAM pipeline in a way that component-built PCs won’t
> be able to do until several new industry standards are developed, with
> potentially profound impacts on the way games are designed.

What are the innovations? I'm just seeing gen-4 PCIe paired with some sort of
NVRAM solution (details are sparse). Apologies if I'm missing something
spectacular here...

~~~
wtallis
The only really interesting thing is that the console SoCs have decompression
offload hardware, so data coming in from the SSD at 4-5GB/s can be unpacked to
a 9+GB/s stream of assets, straight into the RAM shared by the CPU and GPU. A
desktop PC can easily handle the decompression on the CPU with a combination
of higher core counts and higher clock speeds, but then the data still needs
to be moved across a PCIe link to the GPU.

The SSDs themselves are nothing special, and there's no clear sign of anything
else in the storage stack for either new console being novel. It looks like
they're improving storage APIs and maybe using an IOMMU, neither of which
requires new industry standards for the PC platform.

~~~
mrguyorama
When it comes to graphics assets, they've been stored compressed and
decompressed by the GPU on the fly pretty much since early versions of
DirectX. What is the innovative part here?

~~~
wtallis
They're using lossless compression algorithms that are a lot more complicated
than S3 Texture Compression and friends, and work on arbitrary data rather
than being appropriate only for textures.

------
kozak
What's amazing is that it had only 2 MB of RAM, that's at least 2000 times
less than a cheap phone has today. In fact, that's less than a moderate
quality mobile phone photo as stored on disk.

~~~
mikorym
I think those programmers were forced to understand their subject matter much
better than today's programmers are.

Say you want, for a (silly) example, the first 10,000 digits of Pi. It's
pretty easy to just store that today. But back then you didn't just have to
know what Pi _is_, you had to have the smallest program you could think of to
calculate it.
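
For instance, a spigot algorithm produces the digits from a few kilobytes of
integer state with no stored table at all. The sketch below is a reformatted
version of a well-known tiny C program (often attributed to Dik Winter) that
prints the first 800 digits:

    #include <stdio.h>

    /* Spigot-style computation of the first 800 digits of Pi using only a
       small integer array, no stored digit table. Reformatted from a
       well-known tiny C program, often attributed to Dik Winter. */
    int main(void) {
        static int f[2801];             /* zero-initialised working array */
        int a = 10000, b, c = 2800, d, e = 0, g;
        for (b = 0; b < c; b++)
            f[b] = a / 5;               /* seed every cell with 2000 */
        for (; c > 0; c -= 14) {
            d = 0;
            g = c * 2;
            for (b = c; ; ) {           /* mixed-radix carry pass */
                d += f[b] * a;
                g--; f[b] = d % g; d /= g;
                g--; b--;
                if (b == 0) break;
                d *= b;
            }
            printf("%.4d", e + d / a);  /* emit four digits at a time */
            e = d % a;
        }
        printf("\n");
        return 0;
    }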

Would be interesting to hear too how Tekken 3's coders managed to work with 2
MB of RAM.

~~~
pjmlp
Which is why I find it so ironic that many still think only languages like
Assembly, C or C++ have a place in IoT on devices like the ESP32.

Sure, we used Assembly when performance was the ultimate goal, but also plenty
of high-level languages, including stuff like Clipper for database front ends.

512 KB and a couple of MHz are already capable of doing a lot of stuff; one
just needs to actually think about how to implement it properly.

~~~
jstimpfle
> Which is why I find it so ironic that many still think only languages like
> Assembly, C or C++ have a place in IoT on devices like the ESP32.

Short answer, Parkinson's law.

When I browse the Web, open Windows File Explorer, open Photoshop, open
Visual Studio, open just about any graphical application that needs GPU
acceleration, or do some other thing that should not be an issue, I find that
Parkinson's law very much applies.

[https://youtu.be/GC-0tCy4P1U?t=1727](https://youtu.be/GC-0tCy4P1U?t=1727)

[https://youtu.be/GC-0tCy4P1U?t=2172](https://youtu.be/GC-0tCy4P1U?t=2172)

[https://twitter.com/rogerclark/status/1247299730314416128](https://twitter.com/rogerclark/status/1247299730314416128)

Wtf. Did you see how fast VS6 started on a machine from almost 20 years ago?
Today I get depressed whenever I need to open Visual Studio, and I don't even
know that I use any functionality that wasn't available 20 years ago.

Many, many programmers (in my perception at least) are extremely dissatisfied
with today's state of computing, and the reason we got here is the popular
opinion that we don't need to worry about performance.

Nobody should optimize representations of Pi or count CPU cycles by default.
That's not the point. But if you claim that Java is always fast enough, for
example, then I think we have a strong disagreement. It's partly confirmation
bias and being unaware of vast parts of the landscape, but it's also a fact
that I couldn't name you a complex GUI written in Java with satisfying
ergonomics.

~~~
pjc50
Java (the language) is not especially slow; back in the mid 2000s I was
optimising a Java GUI to display hundreds of thousands of nodes for chip
design clock trees.

Java (the culture) makes it hard to be performant. There's a great tendency to
use all sorts of frameworks and elaborate object systems where much simpler
code would give you at least 90% of the functionality for 10x the performance.

But if you get some programmers with experience beyond Java who care about
performance and are rewarded for performance, it's certainly possible.

I briefly wondered whether the demoscene ever had a go at Java, and indeed
they did:
[http://www.theparty.dk/wiki/Java_Demo_1999](http://www.theparty.dk/wiki/Java_Demo_1999)
/
[https://www.youtube.com/watch?v=91HzuGqpTHo](https://www.youtube.com/watch?v=91HzuGqpTHo)

~~~
jstimpfle
Yep, I agree. Java isn't slow if you build a trivial application. Or an HTTP
server. Or something for batch processing. Or if you're a masochist working
around performance issues until you reach roughly C-level performance and
performance "robustness".

The problem is what Java encourages you to do. And I didn't want to single out
a language; Java is just one that I revisit from time to time, and I'm always
astonished at how awkwardly hard it is to do the simplest things in a
straightforward (and CPU-efficient) way.

And yes, there are a lot of slow C++ programs, and even slow C programs.
There's just a clear tendency for C programs to be faster and more responsive.

~~~
commandlinefan
> tendency for C programs to be faster

I was involved in a massive rewrite of a website from C to Java a while back.
One coworker observed that, when they were coding in C, it took a lot longer
to get anything working, but once you did, it was pretty solid: C had a
tendency to just crash if anything was wrong, so you'd work for quite a while
before you got something that ran without crashing. Java, on the other hand,
let you get something working (that is, running without crashing) much
quicker, but the things that were out of place were still there; they just
caused harder-to-find problems that were much more likely to become
customer-facing before they were caught.

------
snvzz
Neat article, content-wise. I wouldn't call it in-depth, because it's very
high-level, but it is neat.

The presentation could be made much better by getting rid of those tabbed
sections and just making the article one long page.

~~~
flipacholas
Thanks. The reason for the tabs is that each article has many sections (CPU,
graphics, audio, etc.), so I imagined some people may be more interested in
particular sections than others. With the tabs, if you find something
interesting, you can read more about it; otherwise, you only have to scroll
down a little bit. This behaviour only appears when viewed on a desktop PC, by
the way; mobile users will see one long page.

Anyway, I’m not a professional web designer; the tabs are just an attempt to
keep the info concentrated.

~~~
pixelbath
I liked the tabbed sections, but I feel like they could use some
differentiation from the rest of the content. For example, in the
"limitations" tabbed section, I had to scroll down and flip through the tabs
to determine where the tabbed content ended. Separating this out a little more
would make it apparent that changing a tab only changes that tab's content,
something along these lines (rough in-page mockup):
[https://imgur.com/D2lhOwV](https://imgur.com/D2lhOwV)

~~~
flipacholas
That's a nice idea, I'll experiment with it. Cheers

------
YaarPodshipnik
Another great article about PSX architecture, with regards to the DOOM port
specifically:
[http://fabiensanglard.net/doom_psx/index.html](http://fabiensanglard.net/doom_psx/index.html)

I highly recommend the series about various ports of Another World on the same
website as well.

~~~
jasonwatkinspdx
Unfortunately there's no writeup about it, but I worked on a doomed (ha!)
project to port Unreal 1 to the PSX. I was doing level design at the time. The
programmers did manage to get a functioning renderer up. It was limited to
simpler geometry than the PC software renderer could handle at the time, but
it worked well enough that you could have made a game on it, had other things
not gone wrong with the project. If you search for "Unreal PSX" you'll find
some work by a couple of diehard retro Unreal fans trying to finish up some of
the partially completed content.

------
idoby
Is "page-flipping" basically double buffering or am I missing something?

~~~
0xcde4c3db
I hadn't heard the term in a while, but I believe page-flipping more
specifically means that the buffer roles can be switched in hardware (e.g. by
changing a "framebuffer start address" register in the CRTC or RAMDAC) instead
of requiring a buffer to be copied (e.g. via a DMA or blitter operation). Both
schemes are double-buffered in the sense that you're never actively rendering
to the same buffer that's being scanned out, but page-flipping has
significantly less overhead.
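
In code, the difference is roughly this (a sketch: the register address and
layout are invented, standing in for a real CRTC/RAMDAC "framebuffer start"
register):

    #include <stdint.h>
    #include <string.h>

    #define W 320
    #define H 240
    static uint16_t page[2][W * H];   /* two buffers in video memory */

    /* Hypothetical memory-mapped "framebuffer start" register. */
    #define FB_START_REG (*(volatile uint32_t *)0x1F801000u)

    /* Page flipping: retarget scan-out with a single register write
       (address truncation assumes a 32-bit target, as on such hardware). */
    void flip_to(int i) {
        FB_START_REG = (uint32_t)(uintptr_t)page[i];
    }

    /* Copy-based double buffering: blit the whole back buffer into the
       fixed front buffer; same correctness, far more memory traffic. */
    void present_by_copy(void) {
        memcpy(page[0], page[1], sizeof page[0]);
    }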

~~~
wtallis
When's the last time anyone produced a hardware platform where double-
buffering required a full copy rather than just updating a pointer/register?
PCs moved past that in the '90s at the latest, and I'd expect most other
platforms that supported 32+ bit addressing on both the CPU and graphics
processor were similarly capable of relocating the front buffer at will.

------
smaili
It blows my mind that the little mobile device in my hand is able to not only
render models that were once only possible on a major home console, but let
me interact with them at the touch of my fingertips. How far we’ve come :)

------
simias
>1024×512 pixels with 16-bit colours or a realistic one of 960×512 pixels with
24-bit colours allowing to draw the best frames any game has ever shown…

One small detail that I don't see mentioned in the article is that the GPU
cannot actually rasterize at 24bpp, only 15-bit RGB555. 24bpp is mostly used
only to display pre-rendered static images or video decoded by the MDEC. I
seem to recall one 2D game that managed to have 24bpp gameplay, but it was a
clever hack more than anything else. Internally, however, the GPU always works
at 24bpp; it just dithers and truncates to RGB555, so an emulator can actually
remove the truncation and run at 24bpp "natively".
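
The truncation step itself is simple; here's a sketch of the idea (the 4x4
offset table is the one commonly documented for the PSX GPU, but treat the
exact values as an assumption):

    #include <stdint.h>

    /* Sketch of PS1-style RGB888 -> RGB555 conversion with 4x4 ordered
       dithering. The offset table is the one commonly documented for the
       PSX GPU; treat the exact values as approximate. */
    static const int dither[4][4] = {
        { -4,  0, -3,  1 },
        {  2, -2,  3, -1 },
        { -3,  1, -4,  0 },
        {  3, -1,  2, -2 },
    };

    static int dither_chan(int c, int d) {
        c += d;                /* apply the screen-position offset */
        if (c < 0)   c = 0;    /* clamp back into 8-bit range      */
        if (c > 255) c = 255;
        return c >> 3;         /* truncate to 5 bits               */
    }

    uint16_t rgb555(int x, int y, int r, int g, int b) {
        int d = dither[y & 3][x & 3];
        /* PSX layout: 0BBBBBGGGGGRRRRR (bit 15 is the mask bit). */
        return (uint16_t)((dither_chan(b, d) << 10) |
                          (dither_chan(g, d) <<  5) |
                           dither_chan(r, d));
    }

An emulator running at 24bpp "natively" essentially skips that
offset-and-truncate step.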

Beyond that some of the GPU's limitations can be improved in emulators with
more or less complicated hacks. In particular a modification called PGXP can
be used to side-channel the depth and sub-pixel precision data to the GPU
implementation to allow perspective correct and more precise rendering:
[https://www.youtube.com/watch?v=-SXT-y0vKv4](https://www.youtube.com/watch?v=-SXT-y0vKv4)

It doesn't work perfectly with all games and it's fairly CPU-intensive but it
looks pretty decent when it works well.

>MIDI sequencing: Apart from playing samples, this chip will also synthesise
MIDI-encoded music.

I don't know what that means. I implemented the SPU in my emulator a couple of
weeks ago and I'm not really sure what that refers to.

>The port of the controller and the Memory Card are electrically identical so
the address of each one is hardcoded, Sony altered the physical shape of the
ports to avoid accidents.

To expand on that: the interface always talks to both the controller and the
memory card within the same slot, so when you talk to memory card 1 you also
talk to whatever is plugged into controller port 1. Then, in the serial
protocol, the first byte says which device you're addressing (0x01 for the
pad, 0x81 for the memory card), and the other device is supposed to see that
and remain in high-Z.

So actually plugging a memory card into a controller port (or vice versa)
would work; the problem would be plugging two memory cards or two controllers
into the same port, in which case they'd speak on top of each other.
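
In code, the addressing step is conceptually just this (a sketch: the
0x01/0x81 address bytes and the 0x42 pad poll command are real protocol
values, while the functions are invented stand-ins):

    #include <stdint.h>

    /* Conceptual sketch of the shared-slot addressing described above. */
    #define ADDR_PAD     0x01
    #define ADDR_MEMCARD 0x81

    /* Stand-in for a full-duplex one-byte exchange on the slot's serial
       bus; a real implementation would drive the hardware port. */
    static uint8_t exchange_byte(uint8_t out) { (void)out; return 0xFF; }

    /* Both devices on the slot see the first byte; the one whose address
       doesn't match keeps its output in high-Z for the whole transaction. */
    static void poll_pad(void) {
        exchange_byte(ADDR_PAD);  /* memory card sees 0x01 and stays quiet */
        exchange_byte(0x42);      /* standard "poll" command for pads      */
        /* ... further bytes return the pad ID and button state ... */
    }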

Beyond that, the protocol for talking to the memory card, and especially the
gamepad, is, in my opinion, absolutely insane. It's over-complicated and
under-featured. It's also incredibly slow (especially for memory card access).

Regarding copy protection:

>On the other side, this check is only executed once at the start, so manually
swapping the disc just after passing the check can defeat this protection...

That works with most games, but later games were more clever: you could relock
the drive and restart the init sequence early on to see if the drive really
recognizes the disc.

It was also used as a protection against early modchips: since those would
constantly stream the SCEx magic string to unlock the drive (instead of just
during the first sectors, like a real disc would), you could lock the drive,
read some sectors that shouldn't be able to unlock it, then re-check. If the
drive is unlocked, you know there's a modchip, and you display a spooky
message about piracy. Note that this technique would detect the modchip even
when playing with an authentic disc, so you'd effectively be unable to play
the game at all on modded hardware.
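
The shape of that check, with entirely invented function names (the stubs
stand in for the real CD controller commands), is something like:

    /* Hypothetical sketch of the anti-modchip check described above.
       All names are invented; the stubs stand in for the CD controller. */
    static int drive_unlocked;                       /* stand-in drive state */

    static void cd_lock_drive(void) { drive_unlocked = 0; }
    static void cd_read_sectors(int lba, int n) {
        (void)lba; (void)n;
        /* On a real disc these sectors carry no SCEx string, so the drive
           stays locked. A modchip streams SCEx constantly, re-unlocking it. */
    }
    static int cd_drive_is_unlocked(void) { return drive_unlocked; }

    static int modchip_detected(void) {
        cd_lock_drive();
        cd_read_sectors(5000, 16);      /* arbitrary data sectors, made-up LBA */
        return cd_drive_is_unlocked();  /* unlocked again => modchip present   */
    }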

~~~
flipacholas
Thank you for helping me improve the article. I'll take a closer look at your
comments tonight.

------
travbrack
Spyro looked a lot better without the textures IMO

~~~
serf
Spyro without textures and only Gouraud shading looks like every mobile game
from 2009-2019.

From the examples, it looks like the textures in Spyro were mostly about
hiding all the ugliness around the edges of the straight Gouraud-shaded
output, aside from the characters.

~~~
MegaLeon
I think Spyro is one of the best examples of Gouraud-shading mastery on the
system. The textures are there to add detail; in fact, they had a basic LOD
system for the environment where, for objects far away, they swapped textured
models for Gouraud-shaded models using the same tint.
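
Conceptually, that swap is as simple as this sketch (all names and the
threshold are invented):

    /* Sketch of the distance-based swap described above. Far objects drop
       their textured mesh for an untextured, vertex-coloured one tinted
       to match; all names and the threshold are invented. */
    typedef struct Mesh Mesh;
    typedef struct {
        Mesh *textured;  /* full-detail textured model            */
        Mesh *tinted;    /* untextured model, vertex colours only */
    } LodPair;

    #define LOD_SWAP_DIST 4096  /* made-up threshold, in world units */

    static const Mesh *pick_lod(const LodPair *obj, int distance) {
        return (distance > LOD_SWAP_DIST) ? obj->tinted : obj->textured;
    }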

The skyboxes were also rendered as meshes and then shaded, and they still
hold up today from an artistic point of view:
[https://imgur.com/gallery/vocZw](https://imgur.com/gallery/vocZw)

