
Coding for the World's Trickiest Chip – SEGA's Saturn DSP [video] - mises
https://www.youtube.com/watch?v=n8plen8cLro
======
DrPhish
Back in the day, I managed to figure out a way to get into SEGA's dev site by
guessing a URL beyond the login page. It wasn't secured in any way. The
username and password would just pass you into a hidden URL, so guessing a URL
was functionally identical to logging in with proper credentials. When I
figured that out, I downloaded the entire contents of the site. Lots of PDFs
with "SEGA Confidential" watermarks, header files and small executable
utilities.

Luckily I was working at an ISP and had lots of bandwidth to get it all before
they closed the hole a few days later.

I tried to write an emulator, and wrote a CD reader/debugger that worked
pretty slick, but never got anything to a place where I could release
anything.

I've always kind of wondered if I have anything that hasn't made its way into
the general SS emulation community...

~~~
drx
I am involved with the community, and I would be very happy to help you figure out
what is or is not out there. My email is in my profile.

~~~
DrPhish
Contacted

------
lostgame
The Sega Saturn is my go-to favourite console to write homebrew for.

I spend some free time reading and re-reading technical documents regarding
the infrastructure of the system, and the functions and features each chip
has.

With its vast complexity but incredible feature set, it’s no wonder that
average console programmers at the time could not wrap their heads
around making the most of the console, and yet dedicated teams like
Traveller’s Tales, with Sonic R, and Sonic Team, with NiGHTS and Burning
Rangers, were able to get some unbelievably beautiful effects and performance
from the system.

Jo-Engine ([https://jo-engine.org](https://jo-engine.org)) is a fantastic,
open-source homebrew development kit that enables an incredibly simple and
effective starting point for anyone wishing to code for this fantastic and
overlooked system.

XL2’s fantastic Sonic Z-Treme is an example of the insane levels of creativity
and performance this engine is capable of.

~~~
TapamN
I do Dreamcast homebrew
([https://www.youtube.com/watch?v=2uZP9iOQc6E](https://www.youtube.com/watch?v=2uZP9iOQc6E))
([https://www.youtube.com/watch?v=MlFu-y1LDbs](https://www.youtube.com/watch?v=MlFu-y1LDbs))
and I've thought about trying Saturn homebrew, but getting code to run on a
Saturn has been more difficult than running it on a Dreamcast.

With a DC, I just connect it to a computer through serial or ethernet, pop in
a CD-R with a loader program, and I can easily upload and run code on it.

On the Saturn, getting a loader to run is much more difficult since it
requires some kind of console modification (like disabling the disc-door state
detection in order to allow disc swapping, or getting hold of a sold-out disc
drive replacement) or an ISA-only PC Comms Link card and a Pro Action Replay.
(I do happen to own a PAR, but not a Comms Link card and would rather avoid
setting up ISA compatible hardware just for one purpose.) Testing on an
emulator is a terrible idea, because you have no idea if it will work on real
hardware, and I like trying to come up with ways to break emulators anyways,
so that option's no good.

And... I was going to ask if you had any ideas for how to upload code in my
situation, but I decided to search for "pc comms link saturn" while typing
this and found a Bluetooth adapter for the PAR's comms port, so... I guess I
just ordered one.

Actually, I've been working on an SH-2/3 assembly blitting library to run on
the HP Jornada 690, a palmtop computer which has a 133 MHz SH3-DSP as the CPU.
One of my goals for the library is for it to run on a 32X (which I actually
don't own), and a Saturn is much closer to a 32X than the Jornada, so I'll be
able to get a better idea of its actual 32X performance this way.

And the Jornada really does use the DSP variant of the SH-3. I was
disassembling the hardware initialization code of the boot ROM to figure out
what its memory timings were (to compare them to the 32X) and found it
executing DSP instructions. I wrote a test program to double check, and it
passed. The Jornada has a software modem, so the DSP capabilities are probably
used for that.

~~~
lostgame
Oh, for me, I absolutely use emulation, debugging and RAM/VRAM
viewing/mapping being the obvious reasons. I then use a tool provided
by jo-engine that does all the tough work of creating an ISO file that I
simply burn and run on my drive-door modified Model 1 NTSC Saturn when I want.

It's not ideal. Nor, however, for me, is the Dreamcast's need for either an
outdated serial port, or an expensive and rare broadband adapter.

Furthermore, the Dreamcast seems like one of the first systems that was
extremely close to a PC. I love that I can still fiddle with assembly
in very logical ways with the Saturn. I'm not sure if that's the case with the
Dreamcast. (The DC is, incidentally, one of my favourite consoles ever in
terms of software!)

For the Saturn, there is a new piece of software that can be uploaded to
certain types of common/inexpensive Action Replay/Backup RAM carts that allows
you to run standard burned CDs. I've meant to buy one for a while now, but
sadly I haven't had the time I've wanted to prioritize Saturn dev in the last
while due to adulting. :3

~~~
TapamN
Well, not many DC emulators will run my code anymore, and eventually none
will. :P On the DC, I used to use a serial port with a USB adapter that manages
384k baud, but upgraded to a BBA several years ago.

I really, really, don't like the idea of doing a bunch of development on an
emulator, then finding out it doesn't work right on real hardware (either not
at all or with unexpectedly terrible performance) and having to figure out
what all is going wrong. Doing it all on a real console means you find out
immediately where any problems arise.

The Dreamcast tries to pretend to be like a PC, but treating it like one will
seriously limit performance. The SH-4's terrible cache needs to be babied to
get the most out of it, and there are a lot of unusual things you can do with
the 3D hardware since it works very differently than other GPUs.

You can definitely do assembly on the DC, and it's really important for
performance critical code. GCC can't use the SH-4's SIMD instructions
effectively, so you have to use at least inline asm or true assembly. I'm not
totally sure what you mean by "in very logical ways". I guess optimal SH-4
code doesn't look logical! Rendering-type code is harder on the SH-4 than the
SH-2/3 because it's superscalar and FPU instructions have longer latency than
ALU instructions. You have to spend more time figuring out how to hide
latencies, and software pipelining becomes more important for performance.
Optimal SH-4 assembler code ends up much messier looking than optimal SH-2/3
code.

For example, normalizing an array of 3D vectors in C, using inline assembly to
use the inner product (FIPR) and square root reciprocal (FSRRA) instructions,
you'd be lucky to get around 24-28 cycles per vector (it'd be even worse if
you used pure vanilla C). But with software pipelined assembly, I can manage 8
cycles per vector.

But it's much more complex, not just because it's in assembler. The optimized
main loop juggles normalizing 4 vectors at the same time (The loop is 32
cycles, with 4 vectors it averages 8 cycles per vector), and there's code to
prepare the loop and exit the loop, and special case code for if there are
fewer than 4 vectors to transform. The resulting machine code for the entire
asm version is probably around 10 times the size of the simple version.

Another example would be transforming a bunch of 4D vectors by a 4x4 matrix. C
with inline asm will take about 16 cycles per vector, but optimized pure asm
can do 4 cycles.
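
In plain C, the operation being timed is something like the sketch below. The `vec4`/`mat4` types, the row-major `m[row][col]` layout, and the function name are assumptions for illustration; the SH-4 has a dedicated matrix-vector instruction (FTRV) for this, which a portable loop like this one does not use.

```c
#include <stddef.h>

/* Hypothetical types; m[row][col] layout is an assumption. */
typedef struct { float x, y, z, w; } vec4;
typedef struct { float m[4][4]; } mat4;

/* out[i] = mat * in[i] for each vector. in and out may be the same array. */
void transform_vec4_array(const mat4 *mat, const vec4 *in, vec4 *out,
                          size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Copy first so that in-place use (in == out) still works. */
        const vec4 v = in[i];
        out[i].x = mat->m[0][0] * v.x + mat->m[0][1] * v.y
                 + mat->m[0][2] * v.z + mat->m[0][3] * v.w;
        out[i].y = mat->m[1][0] * v.x + mat->m[1][1] * v.y
                 + mat->m[1][2] * v.z + mat->m[1][3] * v.w;
        out[i].z = mat->m[2][0] * v.x + mat->m[2][1] * v.y
                 + mat->m[2][2] * v.z + mat->m[2][3] * v.w;
        out[i].w = mat->m[3][0] * v.x + mat->m[3][1] * v.y
                 + mat->m[3][2] * v.z + mat->m[3][3] * v.w;
    }
}
```

That is 16 multiplies and 12 adds per vector, so getting down to 4 cycles per vector depends on the hardware doing the whole product in one instruction and on keeping loads and stores overlapped with it.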

A 3D rendering engine I'm working on for the DC does software vertex shaders
by basically copy & pasting optimized assembly loops into a single function
and adding a bit of glue code between them.

~~~
mkesper
Is your code inspected by emulator writers? I guess there would be a lot to
learn.

------
stagger87
Although the video is short and may not reveal everything about the chip, the
chip appears anything but tricky or complex. Seems like it would be fun to
optimize for.

~~~
Izmaki
Did you skip the part where he said that 6 instructions are being executed in
parallel per cycle, or is this not tricky in your opinion?

~~~
nielsbot
Is this really under control of the programmer? Sounds like a "we need a
smarter compiler" problem...

~~~
carlmr
Writing a smart compiler for such a special application would have probably
cost more time. This unit was specifically used (from what I can gather) for
matrix multiplications, so the average Sega programmer didn't actually touch
the code it was running.

~~~
nielsbot
Thanks. Not sure why I was downvoted. What is this, Stack Overflow?

~~~
carlmr
I don't know either, people are judgmental I guess?

------
z3phyr
A little off topic: I do not have any old consoles, yet I would love to
program for them, especially 3DS, PS3, Saturn or Dreamcast. Is using an
emulator the right option?

PS: I also live in a place where it is very hard to find and buy them.

~~~
code_duck
An emulator would be the right option anyway, as that’s how development was
originally done for most of them. It’s too difficult to write to physical
media every time you change your program.

~~~
jowsie
I don't believe this is actually true for many consoles. Game Dev Kits were
normally a more open version of the hardware that was either hard coupled with
a PC, or able to connect to one and have game data essentially streamed to
it.

See this video for an example:
[https://www.youtube.com/watch?v=GH94fKtGr0M](https://www.youtube.com/watch?v=GH94fKtGr0M)

Not that I'm trying to discourage developing for emulators. Just see how
machine accurate the emulator you're using is first :)

------
fulafel
There were also DSPs in some computers in those days (eg NeXT hardware). Using
DSPs to accelerate GP computation tasks seemed to have much potential, which
was never realised.

~~~
sannee
These days, we accelerate DSP tasks using GPUs :)

------
mothsonasloth
As a Java developer, that video reminded me of the luxury of high level
languages.

I don't think I could have managed a professional career working with assembly
day to day.

