How Naughty Dog Fit Crash Bandicoot into 2MB of RAM on the PS1 (quora.com)
934 points by ekianjo on June 18, 2015 | 247 comments

If you haven't read the "Making Crash Bandicoot" blogposts, you're in for a treat.


I don't know if Dave Baggett's heroic essay (well, as heroic as a game debugging essay can be), "My Hardest Bug Ever", is part of those blogposts, but it's one of the things I now mentally associate with the legacy of Crash Bandicoot (which apparently involved all manner of engineering feats):


HN discussion here: https://news.ycombinator.com/item?id=6654905 (with participation from the author, dmbaggett https://news.ycombinator.com/user?id=dmbaggett)

Why don't you actually link to the original Quora answer where the Gamasutra post was lifted from?

Just to be clear...the "Dave Baggett" who authored the Quora post seems to be the same "Dave Baggett" who is listed as the author of the Gamasutra post. Usually the phrase "lifted from" implies an act of plagiarism, but I think in this case, the two authors are one and the same.

As to why I didn't link to the Quora answer? Because not everyone on HN has a Quora account. And in the past, Quora has sometimes erected a login-wall for non-users trying to view a post. Gamasutra does not.

Nice read. I really like this part: "But we worried about the camera, dizziness, and the player’s ability to judge depth – more on that later."

It's interesting that they were concerned with dizziness and the camera, concerns which seem to have unfortunately evaporated in most 3d games made since, to their detriment.

I never had issues with dizziness ... until games started using an over-the-shoulder camera. Basically, if the character is too far removed to the side, I stop connecting my input with their movement. It feels like I'm here and there's some random guy right next to my left who mimics my movements and also I'm incorporeal. It's a sort of uncanny valley territory - if the camera is just a bit to the side - no problem; if it's a far-away but centered behind the back - I'm fine; if it sort of tries to hover to the right, but swings around (Brutal Legend does it) - it doesn't bother me. But Gears of War is practically on the edge and I can't play it for long.

These concerns are back for VR. Also, remember that Crash Bandicoot was made at a time when devs weren't sure how mouse-look should work.

Well, mouse look was already in use with shooters at that point; it was well known. Maybe you mean behind the character third-person camera? Or just the concept on consoles?

Quake came out a few months earlier and had mouselook turned off by default (and I (somehow) played the whole game with only the keyboard). So I would agree that the concept wasn't well-known or established yet.

I think they're referring to different games using the Y axis differently.

ND still seems to be concerned with it. I did not personally play The Last Of Us, but its cinematic nature (I did watch a movie recut) makes me feel like they put a lot of work into those aspects.

Actually, "The Last of Us" is a pretty bad offender as far as camera and motion sickness go. For first person games, there's probably no solution, but third person games like "The Last of Us" could use some camera management that would eliminate or greatly reduce motion sickness. It doesn't. It's all manual and needs readjusting constantly. Which is better than the camera rotating constantly, but only slightly as the end effect while playing is almost the same.

EDIT: Crash Bandicoot on the other hand managed the camera extremely well without creating any motion sickness.

That's because Crash Bandicoot is on-rails.

If Crash Bandicoot was so hard to squeeze in 2MB, I imagine other guys like Solid Snake (can't remember its name) would be incredibly hard.

Metal Gear Solid

I worked on the port of MGS to PC back in 1999-2000. Here is what I've learned:

- Models were not "skinned", as was popular in the day. Some textures covered only the front part of the body, others the arms, etc. As such it was possible to use very few colors per texture (16) and use palettes (a palette is a very small "texture" in the graphics memory). If models were skinned they would've required all the colors used anywhere on the body, and would produce other unpleasant effects (different sampling frequency, especially on the shoulders, etc.) Konami's character modeling is top-notch.

- Music/Sound - this was an enigma for us. We were never given their internal sound mixer, but the popular Metal Gear tune was "mod"-like with very short samples - all of this + game effects fit in a 512KB audio buffer (ADPCM).

- The game used overlays for the executable part. About 600KB were a main/shared/common part, and if I'm not mistaken 100KB or a bit more were swapped (the overlay). The main part would declare entry points to be reached, and the "swapped" overlays were like many .so/.dylib/.dll files that knew about the main part.

- A TCL-like language was used to script the game, the radio, traps/objects in the game, etc. Each character would have a "main"-like function that accepted (int argc, const char *argv[]) and handled the arguments from there (these came directly from the TCL scripts). Ah, the whole thing used "C" only.

- So 600KB + 100KB leaves you about 1.0MB for objects, "scenario" files to be loaded, etc. Since our port more or less "wrapped" the PSX libs for PC, we didn't have to change too much, just on the surface - a bit of patching here and there.

- The game used a tricky pointer hack: basically, on the PSX, accessing a pointer with the highest-24-bit set means read it from the CPU cache, otherwise not (or maybe the other way around). This was used, for example, to indicate whether the C4 bomb was planted on the ground or on the wall, instead of keeping a boolean/bit flag for it. It was possibly used for some more things. Since it was 24-bit, that meant 16MB.

To work on Windows we had to ensure that we didn't go above the 16MB (and the exe starts from 4MB). We also had all overlays for the game compiled in rather than doing the swapping as the game did, but we had plenty of space even then to fit. It's possible that we might've messed up some of the AI tweaks, but no one complained, and we were young and did not care. Then I had something to find all places where these pointers were used and mask them out when they had to be read, but kept the 24-bit highest bit in there (okay, it's a bit like the tagging I learned about much later when I did some Common Lisp).
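
The pointer-tagging idea described above can be sketched roughly as follows. This is only an illustration using plain integers as stand-in addresses, assuming a 24-bit address range with the flag kept in the bit just above it; the names ADDR_MASK, TAG_ON_WALL, tag and untag are hypothetical and not taken from the MGS source.

```python
ADDR_BITS = 24
ADDR_MASK = (1 << ADDR_BITS) - 1   # low 24 bits: 16 MB of addressable space
TAG_ON_WALL = 1 << ADDR_BITS       # hypothetical flag bit just above the address

def tag(addr, on_wall):
    """Store a one-bit flag in an otherwise unused high bit of the address."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in 24 bits")
    return addr | TAG_ON_WALL if on_wall else addr

def untag(ptr):
    """Split a tagged value back into (real address, flag)."""
    return ptr & ADDR_MASK, bool(ptr & TAG_ON_WALL)

c4 = tag(0x0012F4A0, on_wall=True)
addr, on_wall = untag(c4)
print(hex(c4), hex(addr), on_wall)
```

On a platform where the high bits are part of the real address, every read has to go through something like untag() first, which is why a port would need all tagged data to live below the 16MB mark.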

- As we couldn't do shit about getting the original mod-music working, we relied on a couple of then-popular MGS web-sites and "stole" the whole music piece from them, along with other things which came in a "pre-rendered" audio form, and then played them directly from our game. Ah... So embarrassed!

- On my part I'm really proud that I was able to do a global hack where I kept the fixed-point coordinates' sub-pixel precision, so our PC port did not "tremble" or "shake" like others to come. Basically, on the PSX when you draw a triangle, the "chip" turns all numbers into integer pixels, and each vertex "sticks" to a concrete pixel - this creates a "shimmering"-like effect, and I was able to get around it.

- The other team-mates were able to get software/hardware rendering working (DirectX, I think 3, or was it 5?...). Konami used quite a lot of rendering tricks that were not available back then. For example, the camo-suit basically used the framebuffer as a texture, from the location where the character was rendered - so it looked a bit like shimmering!

- Two lessons learned from it - We've put much better high-res textures for the eyes (hired someone from Texas to do it for us), when we got the idea rejected by Hideo himself (by the phone), he told us (through the interpreter) that the game during normal game-play did not have any eye-movement, so higher-res textures would look like crap, while with a blurry texture your own eyes won't see it as a problem - it's really sometimes LESS is better.

- Another was from my boss back then. We had to have a very strict frame-rate - I think 30fps - otherwise some things were not properly emulated. On some older machines we had the fps going below 15fps, due to the actual rendering, not game code - and since he had experience, he simply said - we'll just skip the drawing of a frame then to gain some time. Now that seemed like a thing that should not work, but it did - and saved us from trying to do non-constant frame-rate hacks.

- Another minor tidbit. The game referred to its files/chunks/etc. by using a 16-bit CRC. Since there were quite a lot of objects - almost 32000 overall - there would be collisions, but the way Konami solved it was by simply renaming an object, rather than changing the hash function or something else. It puzzled us why some soldiers were called charaB or chara4 without other numbers, but when I got afraid of hash collisions and saw that there were none (for all objects in the game), it kind of explained it.

- Who knows how many other treasures we did not discover. All in all, working on it made me love the "C" language more than "C++" then. The code had only Japanese comments, and early on I wrote a very simple translator - using the offline Star Dictionary (which I'd downloaded from somewhere). While it was not much use, except for code or algorithms that were really hard to understand at first, it also uncovered things like "CONTINEKU" "METARU GIRU SORIDU" (Continue, Metal Gear Solid), and at first I was like... are these folks writing english with japanese symbols?

- They had a dedicated "optimization" programmer - he basically went through the code, found the hot-spots and turned them into assembly (mainly model t-junction extrapolations, splitting the model in pieces to fit in the small 1KB fast-cache, and a few others). Hopefully he kept the original "C" code, and it was easier for us to choose the right part here and there.
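
The 16-bit CRC naming scheme from the list above (rename the asset on a collision instead of changing the hash) could look roughly like this. It is a sketch under assumptions: the CRC variant is CRC-CCITT via Python's binascii.crc_hqx, which may not be what the game used, and the asset names and the hex-suffix renaming rule are made up for illustration.

```python
import binascii

def name_hash(name):
    """16-bit CRC of an asset name. CRC-CCITT is an assumption; the game's variant is unknown."""
    return binascii.crc_hqx(name.encode("ascii"), 0)

def assign_unique_names(names):
    """Give every asset a distinct 16-bit id by renaming on collision."""
    table = {}  # 16-bit hash -> final asset name
    for name in names:
        candidate, suffix = name, 0
        while name_hash(candidate) in table:
            suffix += 1
            candidate = f"{name}{suffix:X}"  # chara -> chara1 ... chara9, charaA, charaB
        table[name_hash(candidate)] = candidate
    return table

# Hypothetical asset names, roughly as many as the comment mentions.
assets = [f"obj{i:05d}" for i in range(32000)]
table = assign_unique_names(assets)
originals = set(assets)
renamed = sum(1 for final in table.values() if final not in originals)
print(len(table), "unique ids,", renamed, "assets renamed")
```

With about 32000 names in a space of 65536 values, the birthday bound makes collisions between arbitrary names very likely, so some renaming pass of this kind is hard to avoid if the hash itself stays fixed.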

Fascinating insights.

Lots of music on the PSX used a system like that, because that's a very natural fit for the PSX SPU. Tracker "modules" combine the sample data and tabulated sequencing data, but what you found more often on the PSX was separate sample wavetables and sequencing data closer to (i.e. literally converted from, and convertible back to) a MIDI format: it's smaller, timing-based, without all those pesky 00s wasting space. (It sounds better with the PSX reverb unit/buffer on top, of course.) It's actually very similar to what Minoru Akao did for the AKAO sound engine for the PSX Final Fantasy games, for example.

What did you think of the multi-tasking kernel/DMA bit in the "main" binary? (Or did you just remove that?)

By the way, the VR missions mentioned above were released as a separate add-on disc in many regions (rather than the later release Integral which the PC port was). If you do happen to have an original and can't play it on a PS2/PS3 because it doesn't recognise that the 'lid' is open (because it's a tray/slot-loader)... try launching the other executable, it runs fine :)

I don't remember a lot of it.

Konami were very late with delivering their (MTS?) system that was their audio/tasking thing, we were not given anything in advance. As such we've just found in the code where the samples and music had to be played, and as I've said above we "stole" (downloaded) the music data from the web-sites that had them (not sure how they've got all the effects, or it could be that we also found some waveforms in the source package).

I think at some point the "radio-codec" (okay, that's the actual Radio that Snake talks to the others) used this system - maybe a bit like fibers/threads, when a message comes, then switches. I'm not sure what exactly I did, and how much I understood it well (threads/fibers were not my thing back then... that much) but we've got it working.

I do remember disassembling it and looking at it from the other way! (I was curious.) Pre-emptive multi-tasking kernels at that level weren't things you saw frequently on consoles then. Many people just hung things off the vertical blank instead.

You can decode the audio bits - it's just Sony's special version of ADPCM - or use a cartridge (for those PlayStations old enough to have a cartridge port) and read out the SPU memory over X-Link. You didn't even need a debug model to do it (although you did need a handy parallel port and the ability to bit-bang, or run a DOS program).

The CODEC used CD-XA Mode 2 Form 2 (2532-byte sector, with less error-correction layers) ADPCM-compressed streaming audio, 1 of 8 channels, at a relatively low sample rate - which works fine for speech. Lots of PlayStation games used the same basic technique for music and voices (as well as FMVs, although the bulk of that data would have been MDEC-compressed video).

I got more familiar with the PS2 later, as I moved to another studio (Treyarch) and was an audio programmer there, and had to do a lot of "PS1" programming on the SPU. Later I was reading an article on the TRON operating systems, and found that a lot of the primitives on the PS2 were based on it, even the scheduler to a point - https://en.wikipedia.org/wiki/TRON_project - but never got much of it.

Oh, these were some exciting times! - the whole system was there, open for you to see (at least at the software level, and to some point HW).

>- Two lessons learned from it - We've put much better high-res textures for the eyes (hired someone from Texas to do it for us), when we got the idea rejected by Hideo himself (by the phone), he told us (through the interpretter) that the game during normal game-play did not have any eye-movement, so higher-res textures would look like crap, while with a blurry texture your own eyes won't see it as a problem - it's really sometimes LESS is better.

Could this be part of the reason why I didn't like the look of the GameCube port as much as the PS version?

That's one of the appeals of classic 8-bit-style artwork: characters' features are so poorly defined that it's easy for the player to mentally substitute their own perspective for the character's. (The character is essentially imagined to feel how you feel.) This can give the character a kind of 'charm' that hi res doesn't really do.

The art in the games of Bitmap Brothers, and many other games - Star Control I/II, Heroes of Might and Magic I, II & III (all with different styles), the grotesque gothic view of Disciples ][ (beautiful!) and plenty of other 2D games.

For anyone that enjoys it, here is some great art done with 8-bit palette cycling (cycling a dozen or more colors to achieve animation) - http://www.effectgames.com/demos/canvascycle/ - (select other bitmaps too)
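
For readers who have not seen palette cycling before, here is a minimal sketch of the idea: the indexed pixel data never changes, and animation comes from rotating a range of palette entries each frame. The palette contents and the function name cycle_palette are illustrative assumptions, not taken from the linked demo.

```python
def cycle_palette(palette, start, end, step=1):
    """Return a new palette with entries start..end (inclusive) rotated by `step`."""
    span = palette[start:end + 1]
    step %= len(span)
    rotated = span[-step:] + span[:-step] if step else span
    return palette[:start] + rotated + palette[end + 1:]

# A tiny 8-entry palette; indices 2..5 are a hypothetical "water" gradient.
palette = ["black", "sand", "blue0", "blue1", "blue2", "blue3", "green", "white"]
pixels = [2, 3, 4, 5, 2, 3]  # indexed image data, identical in every frame

for frame in range(3):
    shown = [palette[i] for i in pixels]
    print(frame, shown)
    palette = cycle_palette(palette, 2, 5)
```

Only a handful of palette entries are rewritten per frame, which is why the technique was so cheap on hardware with indexed color modes.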

That's a great link, thanks! Something subtly 'exciting' about those images, probably the idea of adventure buried in there somewhere :)

We were somehow lucky, I guess, to discover and talk about such issues in advance. But it could be that a lot of people might not be bothered by them. Even today's games lack fine touches and hover in the Uncanny Valley, but it seems people are used to it... The same way I still can't get used to watching Lord Of The Rings in 48fps :)

Hehe I always get a kick out of showing my wife Uncanny Valley examples. The other day she was really creeped out by the E3 FIFA footage. I on the other hand must have developed a way of adapting, or teaching my brain what to enjoy, because I loved 48fps Hobbit despite noticing the artificialness that everyone else complains about.

> it also uncovered things like "CONTINEKU" "METARU GIRU SORIDU" (Continue, Metal Gear Solid), and at first I was like... are these folks writing english with japanese symbols?

It's pretty normal for Japanese people to write English words in katakana, especially in things like games. Many program menus are perfectly readable by English speakers if you can read katakana. It's something taught in every Japanese school, so being skilled in it makes you look intelligent.

It was probably closer to KONEKUTTO and METARU GIRU SORIDO though.

Close, it's "soriddo" (ソリッド).

English loan words in Japanese are so fascinating to me. Here's an example: "limited slip differential" -> リミテッド・スリップ・デフ (rimiteddo surippu defu)

(The ・ is used to separate foreign words/names when a Japanese speaker would not be able to figure it out)

This must be how Romance-language speakers feel when they see their words modified and incorporated into English.

If you want to learn the Katakana syllabary, try this website I found recently: http://katakana.training

There's also http://hiragana.training for the other syllabary.

Yes, it's surprising how many things you can figure out just by being able to read hiragana and katakana. Though there are a lot of things that tend to be anomalous, like the insertion of small tsu characters in places an English speaker would not imagine a glottal stop, even assuming an English speaker who even knows what that is.

Sometimes it really takes imagination. I have a family member who has an arcade game labeled "Hangly Man" (a Pac-Man clone). It took quite a while for it to dawn on me to reverse that back to kana (HANGURI) and figure out that it was meant to be "Hungry Man."

>"Hangly Man"

That is quite amusing! I think the hardest word I've found for Koreans and Japanese to say is "parallel".

That is some amazing info!

I found this video where you can see the effects of the sub-pixel vertex precision issue: https://www.youtube.com/watch?v=HrFcYbwz_ws

Thank you for sharing this. This brings back so many memories, as it was my first professional gig! I put all my time into it. We were so fast back then: we got the first level working on PC in two or three weeks. It took much longer (7-8 months) to finish. At some point there was a question of whether we would allow load/save from any point in the game, but we proposed including the VR Missions instead.

Early on, in my porting libraries, I introduced a severe bug where the internal timer was 10x (or 100x?) faster, causing issues for loading/saving. This was resolved just weeks before shipping, thanks to the other awesome Ukrainian programmer. It was also a lesson for me to be less cocky, and take the blame sometimes.

Thank you so much for this. MGS is my all time favourite series and I spent countless hours on the PS1 version and also the Integral version on PC.

Two things to take away for coding in general: clever CRC trick; dedicated optimization programmer. These might benefit other projects in the future.

The compilers back then were horrible (gcc2 something?). Our next project was the other way around - porting IHRA Drag Racing from PC to Playstation... now that was a bitch, and showed how inexperienced we were :) (it's much easier to port to something where you have better hardware and software all along).

The IHRA Drag Racing game (PC) version had a full simulation of the engine (valves, torque) - e.g. for any configuration it'll calculate right away the torque/gear ratio tables (okay, I was never a car buff, and my memory is very short here), but essentially there was an algorithm, which I later understood certain car-tuning services were using to adjust real cars! - I mean there were DOUBLES and lots of fortran-looking code written in "C".

Where we failed was trying to reuse this code on the PSX. First, there was no hardware floating point, and what took 10 seconds of calculation on PC took 45 minutes on PS1 - unacceptable.

Again our boss had a crazy idea - why don't we precalculate some values and store them on the CD - way more limited than the PC, but still something. Not sure how the values were chosen, but overall the game was not a success - one magazine rated it as one of the worst PSX games ever... I left the project early on, as I felt it wasn't doing well (and have felt really bad about it since, as I felt like I was deserting the person that took care of me and brought me to the US) - http://www.mobygames.com/game/playstation/ihra-motorsports-d...

I'm guessing the compilers got better with the modern game consoles? They should be great for latest since it's just multi-core x86.

Regarding engine simulation, that's a trip that it was useful enough that mechanics were using it. Kind of disturbing they were using it though... Yeah, precalculation is kind of the go-to way to deal with this sort of thing on many resource- or performance-constrained systems. Always worth remembering.
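
As a rough illustration of the precalculation approach discussed above: sample the expensive floating-point model offline, store the results as fixed-point integers, and do only integer lookups and interpolation at runtime. Everything specific here is an assumption for the sketch (the torque formula, the 12-bit fixed-point scale, the 500 rpm step); none of it comes from the actual game.

```python
import math

FP_SHIFT = 12                 # 12 fractional bits, an arbitrary choice for this sketch
FP_ONE = 1 << FP_SHIFT

def expensive_torque(rpm):
    """Hypothetical stand-in for the slow floating-point engine model."""
    return 300.0 * math.sin(math.pi * rpm / 9000.0) + 50.0

# Offline step (fast machine): sample the model and keep integers only.
RPM_STEP = 500
TABLE = [round(expensive_torque(rpm) * FP_ONE) for rpm in range(0, 9001, RPM_STEP)]

def torque_fixed(rpm):
    """Runtime step: integer-only table lookup with linear interpolation."""
    idx, frac = divmod(rpm, RPM_STEP)
    if idx >= len(TABLE) - 1:
        return TABLE[-1]
    lo, hi = TABLE[idx], TABLE[idx + 1]
    return lo + (hi - lo) * frac // RPM_STEP

rpm = 4250
approx = torque_fixed(rpm) / FP_ONE   # converted to float only for display
print(f"table: {approx:.2f}  model: {expensive_torque(rpm):.2f}")
```

The trade-off is the one the comment hints at: the table only covers the configurations someone chose to precompute, so accuracy and flexibility depend on how the sample points are picked.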

I looked up the game on eBay. I can't get a consistent price because everyone starts at $10 and works down from there. Still worth somewhere from $1-9 plus shipping. Your worst project is at least helping people with bill money. Not the worst outcome. ;)

> Hopefully he kept the original "C" code

You mean "Fortunately he kept the original "C" code"?

Great read anyway!

Thank you! And thanks for the correction, but it looks like I can't correct it anymore (no more editing allowed).

You're right about being in for a treat. Thanks for the link.

> Ultimately Crash fit into the PS1's memory with 4 bytes to spare. Yes, 4 bytes out of 2097152. Good times

Wow. Just wow. One can only imagine the amount of hard work and sweat that was put into making this possible. And the pride of developers when it actually worked and the game has become a success. Great story.

4 bytes to spare isn't surprising. When you're crunching things down, you stop once it fits. I once worked on a project that used 4x its memory space when the project was half done. When I was done, I had 13 free bits of space. Yes, bits; I was doing a lot of bit picking to get it to work.

What was surprising is the lengths they went to to make things fit. A solver? Wow. My problem was relatively straight forward in comparison: just bit packing and silly amounts of code reuse: Hey, these two completely unrelated routines have the same 7 byte sequence; can I make it common?
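
A small sketch of the kind of bit packing mentioned above: several small fields share one integer instead of each taking a byte or more. The field names and widths are hypothetical, chosen only to add up to 13 bits.

```python
# Hypothetical record layout: (field name, width in bits). 13 bits in total.
FIELDS = [("kind", 4), ("x", 5), ("state", 3), ("active", 1)]

def pack(record):
    """Pack small integer fields into one integer, first field in the high bits."""
    word = 0
    for name, width in FIELDS:
        value = record[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def unpack(word):
    """Inverse of pack(): peel fields off from the low bits upward."""
    record = {}
    for name, width in reversed(FIELDS):
        record[name] = word & ((1 << width) - 1)
        word >>= width
    return record

rec = {"kind": 9, "x": 17, "state": 5, "active": 1}
word = pack(rec)
print(f"{word:013b}", unpack(word) == rec)
```

Stored naively as four bytes this record would take 32 bits; packed, it takes 13, at the cost of shifts and masks on every access.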

Fun times, I miss projects like that.

Yes, this is kind of like "of course the remote is in the last place you look, why would you keep looking once you found it?"

A little more than a year ago I was working on a very space-constrained device: only 2kb of program flash (an attiny23 for those curious). I had to use libusb with this, which ate up a huge portion of that space. My first shot at the main program put me over the limit by almost 500 bytes. By the time I was done, I had packed the program + the usb lib into the program flash with 4 bytes to spare.

Man, was that fun.

I wonder why they didn't use an off-the-shelf bin packing solver. But I guess open source solvers weren't a thing back then, and the commercial ones were way too expensive. (Not to mention that the devs might not have heard of these beasts.)
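
For context on what a bin packing heuristic does, here is a sketch of first-fit decreasing, one of the classic textbook approaches. It is not the method Naughty Dog used, and the asset sizes and the 64 KB page size below are made-up numbers for illustration.

```python
def first_fit_decreasing(sizes, capacity):
    """Greedy bin packing: place each item (largest first) in the first bin with room."""
    bins = []  # each bin is a list of item sizes
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            bins.append([size])  # no existing bin had room, open a new one
    return bins

# Hypothetical asset sizes in KB, packed into 64 KB pages.
assets = [40, 33, 30, 24, 20, 17, 12, 9, 7]
pages = first_fit_decreasing(assets, 64)
for i, page in enumerate(pages):
    print(i, page, sum(page))
```

Heuristics like this run quickly but give no guarantee of the minimum number of bins, which is why tight cases tend to push people toward longer-running search or solver-style approaches.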

The story of how Mew (Pokemon) came to be is also pretty interesting. From memory: when they were getting close to shipping, they had a completely full cart (only one or two bytes to spare). They removed all of the debugging stuff, which freed up enough space for a new Pokemon to be defined. A programmer secretly added Mew right before they shipped, and Nintendo didn't find out until a couple of weeks later.

That sounds like a rumor. Mew already existed in the storyline unless that programmer then added the entire Mew storyline (inc. Mewtwo) - otherwise there was no reason for Mewtwo to exist

maccard posted a source a moment ago: http://www.nintendo.co.uk/NOE/en_GB/news/iwata/iwata_asks_-_...

The character of Mew was meant to be referenced through the story, but not actually catchable. Considering that you can only get Mew by exploiting bugs in the game, it's pretty believable that it wasn't added above the radar.

You could obtain Mew in completely legitimate ways, it was given through promotional events in Japan:


Long after the game shipped and Nintendo realized that the Pokemon had been included, though. Required external hardware to get as well.

Some of the developers nowadays need to take cues from these guys.

Care to elaborate?

Presumably it's another complaint about developers these days being lazy, taking up too much memory/CPU cycles, blah blah...

The basic clock app on my Android phone takes 33MB of memory. That is completely and utterly insane. Except for playing MP3 alarms, everything on it could be done on a 64K Commodore 64 without breaking a sweat. And of course, the actual code for playing an MP3 is baked into the Android operating system, so that's not what's taking up all the space.

The basic clock app on your Android phone has up to 4k assets, shadows, faux-3d layering, tons of animations, multi-touch support, gesture support, includes a timer, a countdown timer, an alarm system, every timezone imaginable, a separate "night mode", automatic "home time" for when you are traveling, multiple clock styles (digital vs analog), and all of the code runs in a VM.

Besides some of the features, there is no way in hell a Commodore 64 would be able to handle the textures, input, and UI of an app like that.

And while you may think that those things are useless, to the overwhelming majority it's the reason why they use that app at all.

Besides, why not use 33MB of memory? I think the number of people that would actually benefit from being able to have 30MB more memory (while using the clock app) on their Android devices is literally 0. Plus being able to be somewhat wasteful with memory provides tons of benefits. It speeds development time, it reduces CPU usage via caching (which reduces battery consumption), it allows higher res textures and a nicer UI, it allows easier-to-maintain code, and tons of other little benefits.


Multi-touch is in the system libraries, shadows and faux-3d layering and animations, too, the alarm system and every timezone imaginable is in the system, automatic home time is in the system.

The app uses 33MB of storage on disk. Only the app, none of the above mentioned libs.

The Facebook app is nowadays 159MB. That’s 120 floppies. For a single app.

I thought you were talking about RAM, not storage.

Also, the Facebook app (on my Android device at least) is 40.36MB...

But accounting for assets, multi-language translations, functionality, and the fact that it contains a bunch of multi-platform code (multi-platform meaning Android with Google services, Android with Amazon's stuff, Android with none of that, etc... which means it can't depend on a lot of "system" libraries that might not be there), more than 40MB isn't even that bad. That being said, the Facebook app is a bit of an outlier, with most apps being in the 5MB range.

I am working on an open source project, we have a simple IRC client for android (it uses a server for the translation between IRC and our protocol, the server also allows the app to show messages from times when the device was off).

This app already has over 8MB storage space and 34MB RAM. And the app has only 5 images overall in it, uses no Google services, and has only about 25k LOC.

Is it actually taking up an additional 33 MB, or is it just using libraries that are probably already loaded into physical memory, and it just happens to map those into its virtual address space?

Pretty amazing what kind of skills game development required back then. One side of me is happy that we have all these great tools today, the other is sad because you hardly use this low level stuff in today's software development world.

We are solving different problems today, but the level of software development skills required for a game that today could be done by a single person in Unity in a few weeks is quite impressive.

A good portion of this stuff does still go on in Games Development.

I don't work in games any more, but on the last title I worked on (Forza Horizon, Xbox 360), one of my colleagues engaged in a very similar exercise in order to allow data for the open world to be streamed in quickly enough for a car (travelling at potentially 150+mph) to drive through the world without having to wait for loads, whilst streaming direct from the DVD (we weren't allowed to install to the HDD).

Given that the world was open and you could drive in pretty much any direction, trying to pre-load the right tiles of the world was difficult, and seek times made it tough to bring stuff in if it wasn't all packed together. However we were almost at the limit of DVD capacity so we couldn't do much duplication of assets to lower the amount of seeking required.

My colleague wrote a system that took about 8 hours overnight (running on a grid) to compute a best attempt at some kind of optimized packing. It did work though!

I was amazed myself that you can drive around at high speed through a busy city in GTA 3 on the PS3 without any noticeable stutter and hardly any popups - good work there, too.

Very creative. I love this kind of story. The first interactive display, made by Sutherland IIRC, had no framebuffer (unrealistic price at that time) so the team had to stream everything to fit into the small buffer they could get. As I understand it, everything was lazy and in sync with the screen 'machinery' so there was no delay even with limited memory capacity. It felt like hardware Haskell.

Some day I will write a blog post about the tricks we have to pull off to fit all code & assets - or get animations running at 30FPS - on Pebble.

Trust me when I say this: low level development is alive and well at hardware companies.

Why wait?

Blog posts serve as great marketing these days, from showing off your engineering team's technical prowess, to recruiting others who hope to participate in some really advanced problem solving.

The press also helps maintain your existence in people's minds. With Google entering the market with Android Wear, and of course Apple's entry; a couple of 'this is how cool a Pebble watch is inside' would do well to not let us forget you exist!

Yup, you see quite a few people from game dev make the transition to the embedded space when they decide that working 100hr weeks isn't their idea of fun.

I talked to a Pebble firmware engineer at Bitcamp who told me about the hoops they had to go through. Really neat stuff.

Yeah sure, not saying it's non existent, just a lot less common. A blog post about that would certainly be very interesting :)

Please write this post. I love my pebble time (but really you guys need to fix some stuff in the new OS)

Subscribed! :)

We also have pretty amazing hardware today that is surprisingly underused most of the time.

I've seen many Unity projects with the worst code you could imagine still run close to 60 FPS when shipped.

Having an easy entry point means you're also going to get a lot of mediocre programmers using it. They seem productive in the first few weeks of the project but then quickly slow to a halt once they start changing the design, and end up with massive overtime hours while trying to debug and optimize the resulting mess.

It's made even worse with managers trying pseudo-agile-learned-in-a-two-day-masterclass, adding even more procedures and overhead.

So yeah, you can make games today with less development talent than yesterday, it's still going to cost you more than having skilled software engineers and the resulting product will be a fraction of its potential.

> I've seen many Unity projects with the worst code you could imagine still run close to 60 FPS when shipped.

On the other hand, a lot of the worst Unity dreck is horribly unoptimized and runs incredibly poorly, despite graphical simplicity and no real visual effects. If you tool around on Steam or YouTube you can find tons of examples - look up, for instance, Jim Sterling's "Squirty Play" series [1]. Not every bad game on there has issues, but many do.

[1] https://www.youtube.com/playlist?list=PLlRceUcRZcK0zAt8sV33Z...

There are some abstractions, like TCP sockets, that I don't think anyone would leave behind. But language abstractions like "classes" or even "functions" are certainly worth digging into. (The new lambda feature in Java 8 is an interesting case - you don't have to get into C to see the complexity there, but rather the bytecode, which is a kind of abstract sort of assembly.)

I'm not terribly familiar with the scene, but there are a variety of competitions for fun and art that operate within highly constrained environments. The demoscene[0], and computer golf[1], come to mind (although ironically computer golf in some sense has to be very high level). There's also the security scene, which is quite bit-fiddly.

It's also the case that Go and particularly Rust are quite low-level system languages at their heart, so are presumably amenable to running in constrained environments.

[0] https://en.wikipedia.org/wiki/Demoscene

[1] E.g. Rhoscript http://rhoscript.com/

Can you suggest some explanation of the lambda feature in byte code?

Sure - read this gist from the bottom up: https://gist.github.com/javajosh/55339742aad2f1a1b881

Be sure also to check out Brian Goetz's excellent "Lambdas under the covers" talk, linked in the gist.

" the other is sad because you hardly use this low level stuff in todays software development world."

I don't think there is a dichotomy. It's always about smartly leveraging available resources. The problem with modern development perhaps is then that there are these tempting high level orthodoxies that often obscure the core matter at hand. I.e. focusing on some pointless candied abstract object interface rather than focusing design efforts on data flow and the data structures and algorithms used.

The need for low level optimization has most definitely not vanished. When program data fits into local RAM the bottleneck moves into cache optimization.

Please write this post. I love my pebble time (but really you guys need to fix some stuff in the new OS)

Uh, sorry, did you accidentally reply to wrong message?

appears so. sorry!

Just try to code for mobile or embedded platforms.

Yeah they might get bigger storage every year, but the reason why Google, Apple and Microsoft do talks about package size at their developer conferences is that size is the number one reason for people to avoid installing apps, or to choose which one to remove.

Also given how the app life cycle works on mobile platforms, big apps are killed all the time when they go into the background.

That's not really the same though. You are not really limited by RAM in most cases, also you are not limited by slow storage media like CD/DVDs as it was/is with most of the consoles. The reality is, hardly anyone on mobile writes their own engines (apart from some of the big players) because it's just too much of an investment and Unity (or one of the other popular engines) is well optimized and allows much faster development.

It's the same.

All iOS devices except the iPad Air 2 have less than 2GB RAM (most 512 MB). Android 1-4 devices often have less than 1GB RAM. It's common that only 1-3 apps can stay in RAM depending on the platform and the app's memory usage (foreground apps, not background services).

Applications/games in the Win95/PS1/N64 era were coded a lot more efficiently. Back then, a common Win95a PC had 4-8MB RAM (high end was 32MB).

Win95 machines with 4 MB RAM were exceptions, not the rule. It was very painful to use such a machine, as it was swapping all the time, otherwise doing nothing.

Any realistic setup had 8 MB RAM or more.

Applications at that time also didn't support i18n, didn't anti-alias fonts, and had low-res, low-color assets that were enough at 320x200(240)/640x480 resolution.

Windows 95 was painful to use even with 8 MB. 12 MB was minimum amount that didn't cause it to swap all the time when actually doing something. 16 MB was nice.

A Pentium 133 with 8MB and 800x600 (32bit colors) ran fine with Win95a and several open applications. Try that with Android, even with 2GB RAM and quad core CPU - the Java based system on top of Linux is quite resource hungry. Flagship Android 5 phones have at least twice the hardware spec (twice as much RAM and CPU) of the iPhone 6 and are comparable in performance and user experience (latency) and not faster. That's the difference of Objective-C vs. Java. And old applications like Microsoft Office are all coded in C/C++ and parts of older applications in Assembler.

One icon in Win95 had 512 bytes (32x32, 16 color, 4 bitplanes). One icon in Android has 256 kB (256x256, true color). The 800x600, 16-bit hicolor (that's what I used at the time) framebuffer had a bit under 940 kB. The 1920x1200 truecolor has 8,8 MB, not counting the texture backing stores used by modern display servers.
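For anyone who wants to check those figures, here is a quick Python sketch of the arithmetic. The helper name `bitmap_bytes` is made up for illustration, and it assumes uncompressed bitmaps with no row padding, with "truecolor" stored at 32 bits per pixel.

```python
def bitmap_bytes(width, height, bits_per_pixel):
    """Size in bytes of an uncompressed bitmap (assumes no row padding)."""
    return width * height * bits_per_pixel // 8

KIB = 1024
MIB = 1024 * 1024

win95_icon = bitmap_bytes(32, 32, 4)         # 16 colors = 4 bits per pixel
android_icon = bitmap_bytes(256, 256, 32)    # truecolor, assumed 32 bpp
fb_800x600 = bitmap_bytes(800, 600, 16)      # 16-bit hicolor
fb_1920x1200 = bitmap_bytes(1920, 1200, 32)  # truecolor, assumed 32 bpp

print(win95_icon)                     # 512
print(android_icon // KIB)            # 256
print(fb_800x600 / KIB)               # 937.5
print(round(fb_1920x1200 / MIB, 2))   # 8.79
```

So the numbers hold up if you count in binary kB/MB; texture backing stores and similar would come on top of that.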

The amount of RAM needed has to do with assets used by the code, not the code itself. The code itself is minuscule.

And no, Android phones do not have 4 GB RAM. Low end has 512 MB, with many phones in the 1-1,5 GB range, and the 2015 flagships have 3 GB. (Nexus 5 and 7 have 2 GB. Nexus 6 has 3 GB). All that without swap (where would you like to swap? To flash?). While most modern 32-bit ARM CPUs do come with LPAE, Android does not support that, so going above 4 GB will have to wait for ARMv8.

Android doesn't support LPAE? That's pretty surprising, any sources about that? LPAE doesn't need any usermode support to function. What specifically does Android do to prevent using LPAE in underlying Linux kernel?

Well, one thing is what the Linux kernel supports by itself, the other is what the board support package for your chipset supports. So maybe there is an LPAE Android device somewhere, where the SoC provider did bother, but in general, nobody does.

my android phone does :-) asus ze551ml

Your phone is also Intel based, not ARM. That opens another question - would Intel be able to make phone SoCs if the Android SDK compiled to native ARM code, as some advocates prefer?

Java is not the problem, rather Google's shitty compilers.

Lots of industrial applications run embedded Java with a few KBs and acceptable performance for their use cases.

Android used the Oracle javac compiler until this year, so blaming anything on Google's new compilers is a little strange.

That comment shows how much you understand about compilers and Android.

javac has nothing to do with Dalvik or ART.

That shows how little you understand about the Android build tools. Dalvik and ART are not compilers.

Prior to this year, javac compiled the Java code to .class files and then dx translated the Java bytecode in the .class files into Dalvik bytecode in a .dex file, with some simple dedupe optimizations.

Only this year did the Android build system switch to Google's own compiler.

Go take a degree in computer science, learn about intermediate code representation, compiler frontend, compiler backend, CPU instructions, JIT compiler, AOT compiler, register selection.

Then make little drawings about which piece of Android is converting intermediate code representation into native CPU instructions.

For brownie points compare the quality of generated Assembly code between Hotspot, Dalvik and ART for the same unmodified jar file.

Already done and wrote a non-optimizing lisp compiler and an optimizing toy compiler with common subexpression elimination and fancy register allocation.

I gather from your response that you've realized you were wrong about Android not using javac but were too proud to admit it. Don't worry, we can fix your pride problem with these tasks below:

1. Dalvik and ART don't take jar files as input, so it is impossible to get your brownie points. Learn why.

2. Oracle's Hotspot targets x86 and x86-64, and Dalvik and ART are mostly focused on ARM. Learn the difference between ISAs.

3. Hotspot and Dalvik make different tradeoffs between CPU and memory both in their choices of garbage collectors and in their JIT strategies. Think about why that would be.

4. The word "compiler" by itself refers to a program that translates source code into object code. Notably, an assembler is not usually considered to be a compiler, and JIT "compilers" were originally called dynamic translators for three decades, with JIT compiler only appearing in the 90s. Given that terminology background, figure out why most people would call javac a compiler but not Hotspot or Apple's Rosetta.

> Already done and wrote a non-optimizing lisp compiler and an optimizing toy compiler with common subexpression elimination and fancy register allocation.

And yet failed to grasp the difference between frontend, backend and intermediate execution format.

> I gather from your response that you've realized you were wrong about Android not using javac but were too proud to admit it. Don't worry, we can fix your pride problem with these tasks below:

I don't have to acknowledge anything. Anyone knows that javac does not execute code on the Android platform. As such talking about whatever influence it might have on runtime performance, besides peephole optimizations, constant folding and similar AOT optimizations only reveals ignorance about the Android stack.

> 1. Dalvik and ART don't take jar files as input, so it is impossible to get your brownie points. Learn why.

Yes they do. Jar files get converted into dex files, which means the same file can be used as canonical input for both platforms.

Then again we are learning about Android aren't we?

> 2. Oracle's Hotspot targets x86 and x86-64, and Dalvik and ART are mostly focused on ARM. Learn the difference between ISAs.

Maybe you are the one that should inform yourself about the Java JIT and AOT compilers for ARM platforms from Oracle and its certified partners.

Learn about the Java eco-system.

> 3. Hotspot and Dalvik make different tradeoffs between CPU and memory both in their choices of garbage collectors and in their JIT strategies. Think about why that would be.

Of course they make different tradeoffs. The ones made by Dalvik and ART are worse than approaches taken by other Java vendors, which is why they generate worse code, which leads to bad performance.

Learn about commercial embedded JVMs.

>4. The word "compiler" by itself refers to a program that translates source code into object code. Notably, an assembler is not usually considered to be a compiler, and JIT "compilers" were originally called dynamic translators for three decades, with JIT compiler only appearing in the 90s. Given that terminology background, figure out why most people would call javac a compiler but not Hotspot or Apple's Rosetta.

Learn about Xerox PARC documentation and its references to JIT compilers.

Or better yet feel free to dive into OS/400 documentation about its kernel level JIT compiler.

All of which go back a little earlier than the 90's

Interesting... A fast 486 (120 MHz) with 12 MB at 640x480 was swapping so much that I could watch the logo screen rendering slowly down the screen.

I remember 386DX (40 MHz) with 4 MB to be unusable at all (yes, it was possible to install Win95, but that's all) and Pentium 120 with 16 MB and S3 card running 800x600 hicolor to be great. In 1996.

I actually worked on a game that was running on the iPad2 (some gameplay https://www.youtube.com/watch?v=uaq0Sfp3_5Q ) and while you had to optimize quite a bit to achieve a certain number of drawcalls and memory usage it was still developed in Unity and far from what devs did in the ps1/2 era.

Except unlike smartphones, the PS1/N64 did not support VM or even have any storage to page to. So no, it's not the same.

iOS documentation:

"Instead, if the amount of free memory drops below a certain threshold, the system asks the running applications to free up memory voluntarily to make room for new data. Applications that fail to free up enough memory are terminated."

That has nothing to do with what op is referring to. Even on iOS you have a virtual memory system, like a normal PC does. If your app runs out of RAM it will page to storage and the OS handles this for you. The PS1/N64 did not have any storage beyond RAM and also no virtual memory management so you had to write all the paging from CD/Cartridge to RAM yourself. Quite a difference.

Sort of. iOS does have a virtual memory system, but it's not as forgiving as a full OS would be; applications that fail to free up memory when asked to (i.e. in a low memory situation) are killed by the OS. See also: https://developer.apple.com/library/mac/documentation/Perfor...

That should be much easier to optimize against vs having no paging at all though. Are apps running in the background asked to free up memory or be killed before the active app needs to do the same, or are they referring to the active app?

Background apps are even worse, as by default they are suspended.

If the system requires memory the ones with higher memory footprint are the first ones to go. They aren't asked nicely, just killed.

This might have changed on newer versions though. I am typing this from memory.

On the Watch there are also time constraints on how quickly an app is allowed to execute.

Windows Phone also has similar constraints.

Embedded yes, but modern mobile is miles away from this. ~1GHz dual core, triple issue out-of-order CPU with 512MB-1GB RAM and maybe 10-20GB storage? That was a reasonable desktop PC not that many years ago.

Yet we have basic apps like the phone app stuttering on such "high-end" hardware.

Is your point something interesting or are you just snarkily pointing out that perfection has not yet been attained?

I guess the OP is referring to the code quality of said apps.

Not so sure about this. Most mobile games are developed with Unity. And what appear to be the cutting edge titles are usually using the Unreal engine. There's not much "to the metal" coding going on.

Besides, most mobile games are being played by the casual crowd. Games don't need graphics that push hardware limits to sell.

> Most mobile games are developed with Unity. And what appear to be the cutting edge titles are usually using the Unreal engine.

Any source by mobile OS that you can point to?

I am quite sure there are other contenders like home grown engines, LibGDX, Marmalade, Cocos (all variants), SDL, MonoGame, DirectXTK, Project Anarchy, Apple's own SceneKit and SpriteKit,...

> Besides, most mobile games are being played by the casual crowd. Games don't need graphics that push hardware limits to sell.

Why do you think then all major OS vendors are teaching the devs how to reduce their package sizes? I can happily post the links to such presentations, just need to hunt them down again.

Game logic + Assets + Engine

Something's got to give if one is required to push the size down.

> I am quite sure there are other contenders like home grown engines, LibGDX, Marmalade, Cocos (all variants), SDL, MonoGame, DirectXTK, Project Anarchy, Apple's own SceneKit and SpriteKit,...

It doesn't matter how many frameworks are out there. If you hang around the game dev scene long enough, you'll see that most small devs are using Unity, and if not that, Cocos2DX. Just head over to Gamasutra, /r/gamedev, or browse Steam & itch.io and see the # of cross platform mobile ports. Talk to devs, they are using Unity.

> Why do you think then all major OS vendors are teaching the devs how to reduce their package sizes?

Reducing package size isn't really comparable to writing portions of your game in assembler. I don't consider that low level coding or pushing hardware limits.

> It doesn't matter how many frameworks are out there. ....

Former IGDA member, Gamasutra subscriber and GDCE attendee here, hence why I asked for numbers.

> Reducing package size isn't really comparable to writing portions of your game in assembler. I don't consider that low level coding or pushing hardware limits.

It is not, but the goal of fitting as much code as possible in small packages is.

> Former IGDA member, Gamasutra subscriber and GDCE attendee here, hence why I asked for numbers.

Maybe "former" is why. There are no numbers published to confirm or deny. It's apparent if you keep up with the community and ask developers what they use.

> It is not, but the goal of fitting as much code as possible in small packages is.

You're really talking about reducing sizes of assets & included libs. That has more to do with optimizing DL time than hardware performance, and nothing to do with low level coding where you're writing machine instructions without touching a higher level of abstraction. Not the same at all.

> There are no numbers published to confirm or deny. It's apparent if you keep up with the community and ask developers what they use.

So just an anecdote, kind of "in my neighbourhood...".

Yes, the PS1's 2MB of RAM are pretty generous compared to some embedded platforms.

A typical engine controller ECU in a car might have 256KB of RAM (and maybe 2-4MB of flash).

but you usually don't stuff 3d models and textures in an engine controller ECU ;)

Not with that attitude you don't :)

There is a tremendous number of hardware products and industrial systems where the processing is performed on small and cheap components (microcontrollers, digital signal processors).

Of course there exist very complex components in the category of microcontrollers, some of them even offer enough resources to run Linux, but if you stick to the $1-$5 range the specs are very limited.

Here are two examples, the first one costs around $3 and the second one is less than $1.

http://www.ti.com/product/tms320f28027 http://www.atmel.com/devices/attiny85.aspx

I develop on such platforms and even though there is an interesting challenge in programming these tiny processors and optimizing CPU cycles and memory usage all the time, in the long run it becomes quite strenuous because there is only low-level stuff and I miss the expressiveness and flexibility of more abstract languages.

I feel a little sad every time someone says "We have much higher CPU power/memory now, those things are unnecessary". Maybe they are right but I feel like this is not the path we should take

100% agree. And this goes for everything from server side, to games, to mobile to client side web. Some of my favorite web sites have gone from a 1-2 second load time to well over 6 seconds. And then they're sluggish after they load. It's sad.

You even see this with Google.

A typical Google search from 2009:

"Meaning of Life: Approximately 72,000,000 Results (0,00000042 Seconds)"

A typical Google search today:

"Meaning of Life: Approximately 364,000,000 Results (0,62 Seconds)"

It's not, but what is true is that people can focus more on features than on hacks to squeeze the maximum performance out of their system of choice.

If we're free to not waste time on tons of little performance hacks anymore, why do them anyway?

Making a (state-of-the-art) engine is no easier. Making a game is only easier because the engines are more available now than they were in the past.

>>the other is sad because you hardly use this low level stuff in todays software development world.

During my engineering days(Circa 2005) I programmed in 8085 and our professor would give us all kinds of crazy assignments and small projects. That was the first taste of any genuine programming challenge I faced in my life. Immediately post that, programming in C felt a little boring.

Recently I worked on an embedded systems project for which I relived those days. I had to invent all kinds of crazy tricks to work around resource constraints.

Your true creative skills are kindled when your resources are constrained in all manners, time or otherwise. Unfortunately you can't academically recreate the same experience.

The amount of unoptimized crap they generate in a few weeks (in some cases) is quite depressing too I'm afraid. Game developers are wasting more and more CPU cycles; this of course reduces development cost. But it would be nice if they put some effort into making things run fast. And this doesn't only count for games!

Making things fast is more important than making them run fast. And it's not about the cost — it's about iterations for the sake of game design.

> and this had to be paged in and out dynamically, without any "hitches"—loading lags where the frame rate would drop below 30 Hz.

This is what gets me. Modern game development seems to say "eh, a little hitching won't hurt anyone", and then we wind up with games that run like shit. Even on consoles.

I work as a programmer in the games industry, and I feel like the problem is made worse by artists and level designers who add more stuff without worrying about performance. I can make a super efficient physics system or model loader, but that only means that someone somewhere is going to add more particle effects or lights or whatever, or maybe place too many props in the scene so PS4/X1 can't handle it. In fact, the separation is huge nowadays - I know our engine inside out, but I personally wouldn't really know how to use the editor to remove things from the scene. Likewise, a level designer will know how to put props in the scene, but they will have no idea how underlying things are wired together or what the performance cost of doing so is.

It's a complicated problem, which might have been made worse by the fact that games are simply easier to make nowadays than ever before.

Back when I was somewhat involved in the gaming industry a very, very smart programmer told me the reason his game with his new fancy, innovative, advanced engine didn't work out.

If you make it possible for level designers to make six square mile levels, they all make nothing but six square mile levels.

The internal Commandos Level Editor had a bug where it computed twice the memory footprint for the level. When the boss found out, he ordered the tool programmer to NOT fix that bug or tell anyone about it.

This is eerily similar to the problems we face in the browser space. Make CSS styling faster and developers will just write more complex CSS to take advantage of it. Then people blame the browser (and the Web in general) for being slow.

The difference is, sadly, that we don't control the assets at all. :(

I am not a game developer, just a developer and a gamer, so please forgive what may be a goofy question: have you seen or engineered systems that capped the level designers' resources? For a super simplified example, I think of Forge in the Halo series, and I'm pretty sure Forge had a certain amount of monopoly money that gamer-designers ran out of eventually. Would such systems be infeasible more for political reasons than technical?

The resources are usually capped in a "soft" way. So if the scene uses too much RAM for a console to handle, it will trigger a chain of emails that is sent to everyone involved, and well... either I will be given a new task to somehow fit all of it in memory as a programmer, or a designer will be given a task to remove some stuff to go below the limit. But framerate can be really tricky to handle, because ultimately it's not up to me whether fps drops are acceptable or not.
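To make the "soft cap" idea concrete, here is a rough Python sketch of the kind of check that could kick off that email chain. Everything in it is hypothetical: the function name, the asset names and the byte sizes are made up, and a real engine would pull these numbers from its own asset pipeline.

```python
def check_scene_budget(assets, budget_bytes):
    """Return (total_bytes, over_by_bytes, biggest_first) for a scene.

    `assets` maps a hypothetical asset name to its size in bytes.
    `over_by_bytes` is 0 when the scene fits in the budget.
    """
    total = sum(assets.values())
    over_by = max(0, total - budget_bytes)
    biggest_first = sorted(assets, key=assets.get, reverse=True)
    return total, over_by, biggest_first

# Made-up scene with a made-up 100 MB budget.
MB = 1024 * 1024
scene = {"terrain": 60 * MB, "props": 30 * MB, "particles": 25 * MB}
total, over_by, biggest_first = check_scene_budget(scene, 100 * MB)

if over_by:
    # Soft cap: nothing is blocked, somebody just gets told about it.
    print(f"Scene is {over_by // MB} MB over budget; largest assets: {biggest_first}")
```

The cap stays "soft" because the check only reports; whether the programmer or the designer ends up fixing it is still a people decision.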

Is there no standard? Like, if fps drops below 30 on a play through - red light and fix.
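A "red light" check like that is easy enough to sketch. This is only an illustration in Python with a made-up function name and made-up frame times, assuming you can record per-frame times in milliseconds during a play-through.

```python
def find_hitches(frame_times_ms, target_fps=30.0):
    """Return (index, ms) for every frame slower than the target frame budget."""
    budget_ms = 1000.0 / target_fps
    return [(i, t) for i, t in enumerate(frame_times_ms) if t > budget_ms]

# Hypothetical capture: mostly 60 fps frames with a couple of slow ones.
capture = [16.7, 16.6, 41.0, 16.7, 33.0, 35.2]
hitches = find_hitches(capture)
print(hitches)  # [(2, 41.0), (5, 35.2)]
```

At 30 fps the budget is about 33.3 ms per frame, so the 33.0 ms frame passes and the 41.0 ms and 35.2 ms frames are flagged.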

The problem is that fps stays below 30 for 90% of the development cycle. Only in the last few months of a project does everyone rip out useless code, debug code, debug overlays, loggers, extra network connections for statistics and performance servers, and finally you can produce a "nearly" final version of the game that doesn't have any debug code in it - and only in that state can you see how it will run on actual hardware and start optimizing from there. So basically you arrive at a situation where you have 3-6 months before release, and you have this game running at 20fps on a console, and you have to somehow make it run at 30 or 60fps. Obviously profilers help with that a lot, but it's rarely a process which you can afford to be doing during most of development.

This is why we see all these Early Access games that run like garbage. They are still in the develop new features and fix bugs phase, and have not yet gotten to the optimization/strip out debug code phase.

And yet, I wonder if they couldn't release a production build without all that (maybe crash handling/reporting). Early access games are what agile development is to software - or should be. Going to production several times a day without giving up on the application's performance, and such.

Early access game devs pushing unoptimized releases should set up their release system betterer.

This exactly - no reason why you couldn't ifdef everything that is debug only, if your team is consistent with it from the very start.

Backfilling existing code with ifdefs and dealing with compile breaks and other more weird things can be intimidating, time consuming, with hard to define ROI, so I can empathize if someone doesn't do it.

That sounds like a very, very risky software development process. I think best practice is to make sure release builds are done and are used for testing from day one. QA doesn't test with all this debug code and debug overlays do they?

#ifdef DEBUG

They do. The project I'm on has been in development for 5 years, has hundreds of people working on it globally, and not a single #ifdef DEBUG in it. It's currently my job to add that in; it's going to take weeks to finish.

I remember an interview with some guys from Criterion who said they always made sure to maintain 60fps through the development of their game.

At least in my own experience, this is not standard throughout the industry.

Oh yeah, I know it isn't. But it seemed to work for them back when they made 60 fps games.

This is also why improving fuel/power efficiency in industry usually doesn't help the environment.

I was recently playing Axiom Verge...

Great game, except it has NES graphics and stutters like hell.

Wasteland 2 on my machine also ran really, really badly, it was unplayable (it was the first time I got pissed for kickstarting something).

Kerbal Space Program also has some performance issues, but not as bad as the previous two.

Then I go play some graphics heavy game made by some studio that likes to make good tech, or play emulated Wii or PS2 games, and there are no issues and games look awesome.

Two of those games use the Unity engine, and Axiom Verge apparently uses MonoGame.

Garbage collection is often a big problem for real-time performance, especially on the old version of Mono that Unity has.

The worst thing that ever happened to game development was on-demand updates. There is nothing worse than buying a game on release day only to wait for it to download a patch.

It is both good and bad. It allows delivery of critical fixes and new content, but it also decreases the demand for code quality from the start as well as increasing DLC. The bigger issue though is for the people who can't get it, like those whose only options for internet are dial-up or satellite (with a 5GB per month limit). Sadly the market doesn't care about this small group enough to matter and we get left behind.

But, even with having a worse experience than a day 1 patch (that being unable to get the day 1 patch because it is 10GB and you only have 5GB for everything for a month), I wouldn't call it the worst thing ever to happen to game development.

Console games have survived for many, many years without the ability to issue critical fixes and they seemed to do ok. I've played console games since 1991 and I've never run into a game that had bugs that made it unplayable. I think the QA cycle would be more complete if developers didn't rely on over-the-air patches.

It shouldn't really become acceptable to ship a knowingly subpar product with the attitude "we can always issue an update later."

There is an art in exploiting bugs in old games and working to glitch your way to worlds you aren't supposed to enter at that time. The kind of time and effort in finding these is really amazing. Finding and exploiting those bugs is an art form in itself. Take a look at this - https://www.youtube.com/watch?v=aq6pGJbd6Iw Skip to 12:15 for the real insanity.

Yup, yup. Running into a game breaking bug, bringing the game to the store and being told that you've got a "damaged" disk that they will replace (with a patched game, if it's been already "reved", or with the exact same disk otherwise) was so much more fun. Miss those days too.

When we submit games to Sony/MS/Nintendo for publishing, we usually have to do it 2-3 months in advance. What are programmers supposed to do in that time between "end" of development and release date? Of course everyone works on little things that were left, you might as well release them as a patch!

I get the sentiment (they should have tested more thoroughly), but I for one appreciate the on-demand update mechanism. Would you rather play a game with previously unknown bugs, or have them smashed on launch day and get a patch to make your experience more stable?

I'd rather play a game that, if I should want to play it 20 years from now and there are no update servers - which I do still do with my old consoles - I can pop it in and not worry about bugs I have to figure out how to get patched. Games will inevitably have bugs, but developers have become too reliant on the update mechanisms and games have shipped completely busted.

That's an interesting point. I hadn't considered the impact of on-demand updates on future abandonware - looks like another case where pirated illegal versions might have better preservation than official ones.

Some fans even made patches for bugs in Master of Orion, in the binary.

Command&Conquer: Red Alert 2 and its expansion pack also received community patches (I contributed to them a lot) to its binary. When EA later released The First Decade bundle with all the old games in it, they removed the old copy-protection. Most games in the bundle were simply recompiled to remove it, but for the expansion pack they hex-edited the copy-protection to keep it community patch compatible.

Games were stable enough before. And the stability of an average game in the first few months definitely went down with the introduction of on-demand online updates. Not only that, it's not entirely uncommon to see half-gigabyte patches.

Is that so much when the games themselves have ballooned into the double-digits of gigabytes? A .5 gig patch isn't a whole lot when Dragon Age: Origins takes up 20GB in the first place (and that's a pretty old game)

A lot of that is sounds, textures, models, animation info.

> Modern game development seems to say "eh, a little hitching won't hurt anyone"

Or just like in the PS1 days, game developers still have to make trade-offs to meet drop-dead dates set by publishers.

Online enabled patches can allow developers to be a little more cavalier with the quality, as they push to build more features closer to ship date.

"Online enabled patches can allow developers to be a little more cavalier ..."

That just means that we get to deal with buggy crap while they tell themselves it's OK because they can ship another update.


I'm going to give game developers the benefit of the doubt by believing they're not OK with shipping bugs.

... yet the bug count keeps rising.

Most Nintendo games still seem to be locked at 60Hz (sometimes 30 maybe?). Look at the fuss when people realised that Mario Kart 8 dropped to 59Hz sometimes ;-)

Nintendo generally picks a framerate and sticks to it. The N64 Zelda games were actually locked at 20hz.

(It is possible to do 60hz on the N64, but it's really hard)

With internet based games I wonder if that loading screen/hitch is essentially finding you a server as you move from one zone to another if the server you were on was too crowded.

I also wonder if they are doing the equivalent in modern consoles - pushing the limits with graphics, etc. where you simply can't avoid the hitch.

It would be good to hear someone's perspective on this that works on these types of games.

It shouldn't be. Networking should always be in the background, and for console games there's no reason why clients should move to another zone - load balancers direct new clients to available servers, they don't or shouldn't move existing clients to other servers to make room for new clients.

Chances are they're not doing anywhere near the same thing with modern consoles; maybe in off-the-shelf engine code to get maximum FPS, but the games themselves, not likely. Also because modern video games are millions of lines of code - you don't want to duplicate those tenfold by squeezing every bit of performance out of it. Maybe only in the most frequently accessed codepaths.

Moving from one server to another can be instantaneous. Just use the cellphone model where they do proper handoffs between base stations. As to loading assets, plenty of games used a streaming model where you can explore huge worlds without issue.

PS: Skyrim is an interesting case where there is a player-made patch that makes the cities part of the open world, where the original game has a loading screen. http://www.nexusmods.com/skyrim/mods/8058

> As to loading assets, plenty of games used a streaming model where you can explore huge worlds without issue.

The streaming model isn't simple though. You need to decide which assets to load, when to load them, and when to discard them. You also have to consider how your level is designed. For example, say you've got three tiles, and a player travels from tile a to b, and then b to c. When he moves from b to c, you can remove a from memory, but what if b + c together are too big to fit, whereas a + b is okay?
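The three-tile problem above can be checked mechanically at build time. Here is a minimal Python sketch, assuming a linear path, made-up tile sizes, and a made-up memory budget (none of these numbers come from a real game):

```python
# Hypothetical tile sizes (in KB) along a linear path a -> b -> c.
TILE_SIZES_KB = {"a": 700, "b": 900, "c": 1300}
BUDGET_KB = 2048  # assumed memory budget for streamed assets


def oversized_transitions(path, sizes, budget):
    """Return adjacent tile pairs that cannot be resident at the same time."""
    bad = []
    for current, nxt in zip(path, path[1:]):
        # While the player crosses a boundary, both tiles must be in memory.
        if sizes[current] + sizes[nxt] > budget:
            bad.append((current, nxt))
    return bad


problems = oversized_transitions(["a", "b", "c"], TILE_SIZES_KB, BUDGET_KB)
print(problems)
```

With these numbers a + b fits in the budget but b + c does not, so the level design (or the tile split) would have to change before streaming could work.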

A really interesting presentation was given at GDC this year from Insomniac games on streaming http://s3.crashworks.org/gdc15/ElanRuskin_SunsetOverdrive_St...

I agree that it's harder and limits the graphics somewhat. But, a lot of things are hard and we don't give companies a free pass when they mess up pathing.

Also, when you get close there are many ways to hide the loading going on so it seems cleaner. Like a player going from planet A to planet B through a more limited spaceship. On arrival they see a larger but still limited spaceport, giving the game time to load the new planet. Or even just boosting the glare when someone steps outside.

However, IMO these things can easily be overdone.

Compare this to the Super Nintendo's 128KB of working memory.

It's hard to tell which games used more or less of that memory; the big thing about game complexity in that era was always ROM size limiting asset complexity, rather than RAM size limiting computational complexity, so the games released toward the end of the console's lifecycle were just ones with the biggest ROMs and therefore most assets, rather than games that used the available CPU+RAM more efficiently.[1]

Now I'm considering writing a memory profiler patch for a SNES emulator, to see how much of the 128KB is "hot" for any given game. I would bet the hardest-to-fit game would be something like SimCity or SimAnt or Populous.

On the other hand, the SNES also had "OAM" memory—effectively the 2D-sprite equivalent to GPU mesh handles. And those were a very conserved resource—I think there was space to have 128 sprites active in total? Developers definitely had problems fitting enough live sprites into a level. Super Mario World's naive answer was to basically do aggressive OAM garbage-collection, for example: any sprite scrolled far-enough offscreen ceases to exist, and must be spawned again by code. Later games got more clever about it, but it was always a worry in some form or another.


[1] There were also those that used expansion chips to effectively displace the SNES with something like an ARM SoC in the cartridge, once that became affordable. It's somewhat nonsensical to talk about how much SNES system memory some games used, because they came with their own.

Pretty awesome. Goes to show the great lengths taken even on state of the art hardware at the time.

And the PS1 wasn't even the worst of it. The Sega Saturn and N64 were both considerably more difficult to develop for. And the PC market was terribly fragmented and had a high rate of obsolescence.

This stuff can give you nightmares: http://koti.kapsi.fi/~antime/sega/docs.html

I sometimes wonder what could have been had Sega released proper documentation and a decent dev kit earlier for their difficult-to-program Saturn. Only later in its life did Sega start to utilise the second processor. Shenmue was originally being worked on for the Saturn. How's that for ambitious? It was rumoured to require the 4 meg RAM cart. That I believe 100%.


Absolutely. I've seen the Shenmue video and it's stunning. It's more impressive than anything that launched on the PS1 or N64.

>I sometimes wonder what could have been had Sega released proper documentation and a decent dev kit earlier

That is a complaint I have heard a lot, and it is valid. But ultimately I think Sega shot itself in the face by even having a second CPU. Concurrency is a hard problem, and certainly game devs back in the mid 90s were not up to the task of utilizing a second CPU. They had enough on their hands with transitioning from 2D to 3D already. I have read that most games developed on the Saturn only used one CPU.

Choosing quads over triangles was also a major blunder of the Saturn's design. The list of Sega's mistakes with the Saturn is so lengthy that it's impossible to think it had any chance of succeeding.

But I still play mine. :)

The sad thing is while Sega did everything wrong with the Saturn, they did everything /right/ with the Dreamcast and it still failed miserably. I kind of think of them as the Commodore of the games industry. Technology that was ahead of the curve, awesome products, but ruined by terrible management and stomped out by juggernauts (MS Windows, Sony Playstation).

Resources blowing the memory was a common problem on the PS1. A common trick was to bake the resources into the exe. I worked on a game where the final disc was a directory of 25 exe's. Each exe was a level, even the front end was a separate exe. You could see that the size of each file was under 2MB, so you knew it would work. You would never/hardly ever dynamically allocate memory on the PS1.

There was a lot of duplication, but the CD was a huge resource and memory was thin on the ground. Also meant that we could use the CD for audio during the game.

As mentioned in the answer, there were levels that were a lot bigger than 2 megabytes. Their goal was to ensure that these bigger levels would seem to be seamlessly in memory at all times.

That was the "easy" way of handling it and most devs resorted to it, but then you couldn't load things dynamically and seamlessly in the middle of a "level".

I am waiting for Mr. Baggett's book on the Lisp internals of Crash Bandicoot.

I don't care about the game, I never liked it, but the Lisp code of this man should be on par with PG's.

That would have to be Andy. I'm really kind of a hack when it comes to lisp programming. :)

But yes, Andy's lisp code is certainly great -- all the more so because he also wrote the lisp compilers that compiled it. :)

He didn't use a commercial lisp? I thought he used Allegro?

He did use Allegro, but only to host his own compilers.

Dave Baggett could write a book on the making of Crash and I'd buy it in a second. There was something very special that went into making that game and I was just the right age to appreciate it.


It's on my bucket list to write that book -- also including many humorous tales from my 10+ years at ITA Software, and anecdotes from my current startup (http://inky.com).

If I live long enough, that is. :)

"It's behind you" - http://bizzley.imbahost.com/download.html

Biography of the guy who did the R-Type port for the ZX Spectrum, it's a riot of a read and has a lot of the old info.

Sounds impressive.

Might also be interested to know that the PS1 version of the first Ridge Racer ran completely from RAM (aside from music tracks): https://en.wikipedia.org/wiki/PlayStation_models#Net_Yaroze

Ridge Racer was such a great game and always felt so smooth to me and now I know why. Thanks for this fact! :)

You're welcome. :-)

Old console games are great examples of how creativity benefits from constraints.

Or that survivorship bias is a real thing. People only remember the good games that survived obscurity for tens of years, while ignoring the majority of terrible games that didn't.

A lot of people like to claim "old games used to be better!" but pick any month of the 1990s and look at the newest releases for that month, I bet for the average month there might be one title you've even heard of.

(Incidentally, this problem—producing the ideal packing into fixed-sized pages of a set of arbitrarily-sized objects—is NP-complete, and therefore likely impossible to solve optimally in polynomial—i.e., reasonable—time.)

Aren't there polytime algorithms that approximate to a certain percentage of the optimum?

Yes[0], the author mentions "first fit", which presumably is "first fit decreasing", which is one of those. His approach was, from what I understand, to try a few approximate techniques and choose the best result without trying to run an exact algorithm with unbounded time.

[0] https://en.wikipedia.org/wiki/Bin_packing_problem#Analysis_o...
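For readers who haven't seen it, here is a minimal Python sketch of first-fit decreasing. The item sizes and the 64KB page capacity are illustrative assumptions, and this is not the packer described in the article, only the textbook heuristic:

```python
import math


def first_fit_decreasing(sizes, capacity):
    """Pack items into as few fixed-size bins as the greedy heuristic finds."""
    bins = []  # each bin is a list of item sizes
    for size in sorted(sizes, reverse=True):  # largest items first
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity")
        for contents in bins:
            # Place the item in the first bin that still has room for it.
            if sum(contents) + size <= capacity:
                contents.append(size)
                break
        else:
            bins.append([size])  # no existing bin fits, so open a new one
    return bins


# Hypothetical asset sizes in KB, packed into 64KB pages.
items = [40, 30, 30, 24, 20, 16, 16, 10, 6]
pages = first_fit_decreasing(items, 64)
lower_bound = math.ceil(sum(items) / 64)  # no packing can use fewer pages
print(len(pages), lower_bound)
```

Comparing the page count against the simple lower bound (total size divided by capacity, rounded up) gives a quick sense of how much room for improvement is left, without running an exact solver.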

Exactly. You said it better than I. :)

Yes, but approximate algorithms for NP-hard problems hadn't been as thoroughly studied back then -- this was 1996, after all -- and even if I'd had the appropriate papers at hand, I probably wouldn't have invested the effort to implement anything complicated. There was just simply too much to do in too little time. The simple approximation using greedy packing did pretty well in practice; I'm sure how close that comes to optimal is well understood by now.

I wonder if anyone controls the physical layout of bytes on the disk, at least for things like installing large software packages on an HDD, or a major release of an operating system on a DVD. It doesn't look likely: every time I install Ubuntu, I feel the process could be made much faster.

We do for PS4 disc images. Basically you want to lay out the data on the disc in such a way that the game can still be played while the system copies everything to the HDD. There are pages upon pages of manuals in PS4 SDK how to do it efficiently.

At least for Windows there's now a way to install the OS so that it only occupies one contiguous large filesystem image with the compressed install files, and only stores the changes to this frozen filesystem image in the "traditional" way. It's called "wimboot".


That way installing the OS mainly consists of copying this huge image file, hence the "physical layout of bytes on the disk" will mostly be fixed.

I'd guess that this could also be replicated in Linux, but personally I don't know if this is being done. I usually "debootstrap" or "pacstrap" my installs from a bootable USB stick :-).

Sure. On one project we built a custom Linux distribution by mounting a file as a loopback file system and copying the right files in. We installed a grub boot sector at the start. Then we copied it as a file onto a simple live USB. The live USB also contained a script that used dd to copy the data from that file over the start of the first disk of a system you plugged it into. Nice quick install USB, minimal work.

We were installing into standard hardware so it was pretty good. But if your hardware varies this approach isn't great.

We could also have created a fresh filesystem on the device and untarred into it, but we wanted to keep SELinux attributes.

dd'ing to the target device, and then growing the fs to the full disk extent seems to be a splendid idea. Especially as Linux tends to replace quite a lot of code over the usual course of updates...

For a cluster system I had built a netbootable system for reinstall that would restore a dump of the reference system's fs on a node and adjust hostnames, ssh host keys etc... Also was quite fast at that time..

Yes, you could replicate that in Linux using overlayfs, with a read-only mount of the static image as the lower layer and a writeable upper layer using an ordinary filesystem. It would be interesting to try it.

That's effectively how CoreOS works.

Ubuntu targets a LOT of different hardware, it would be much harder to optimize the layout on their ISO image than it is to optimize for a fixed hardware platform like consoles.

Do you still use a CD/DVD to install your OS? That's probably the problem right there; CDs/DVDs are relatively slow. Go for an install off a flash disk (where physical layout shouldn't matter), and/or a web install where only the relevant software compiled for your hardware is downloaded.

CDs/DVDs don't have to be slow (or at least not as slow as they typically are). Most speed problems are the same speed problems one encounters with spinny / non-SSD hard drives: namely, that data isn't sequential, requiring a lot of skipping around the disk in order to piece together files.

Web installs are also much slower unless you have a reliable internet connection (which is not a given). Installing from a USB stick (preferably of the 3.0 variety in a 3.0 port) is indeed the best option for all but the oldest of machines (in which case you'll need a boot floppy to start the machine up with a bootloader capable of initializing the boot medium (USB, CD/DVD/BD, network) and booting from it).

One really nice benefit of SSDs is that there is no needle or laser to seek. :)

Looks like the system RAM on PlayStations increased by a factor of 2^4 (16x) with every generation: 2MB to 32MB (PS2) to 512MB (PS3) to 8GB (PS4).

Imagine the uproar among developers when the 360 was going to have 256MB (from the Xbox's 64MB).


Unfortunately, I can't find a reference (probably in a video), but I heard that Bethesda Softworks had a party when Microsoft announced that the 360 would have 512MB RAM.

I suspect such a party was figurative.

Then they found out where that extra memory had come from.

Following the same trend, the PS5 having 128GB of RAM seems a little ridiculous now, but consider that the PS4 was released in 2013, and the PS3 in 2006, so perhaps in 2020 it may really have that much.

Yeah, I thought about that too. My PC already had 8GB of RAM in 2009 and even today that is still totally fine for most games. However, consoles usually share RAM between GPU and CPU, and high-end PC GPUs today have up to 8GB of VRAM, which will only increase once 4K content becomes more commonplace. So something like 64GB of shared RAM for a supposed PS5 does not sound as outlandish as it may seem at first glance.

I think that by then we'll have solid-state memory or something like that.

> The PS1 had 2MB of RAM, and we had to do crazy things to get the game to fit.

That's what? About $80-100 worth of RAM back then? On a product that sold at $299 (July 1995 price drop) that's incredible. Nearly a third of your cost was RAM alone.

Crash Bandicoot was quite a technical feat, but also (perhaps less talked about) a marketing feat. The series went on to sell ~50 million copies IIRC and was also the best selling PlayStation game of all time.

Games that are well engineered tend to be good games for some reason :-)

See: Doom :)

That was a fun read. An insane solution I probably wouldn't have attempted. Impressed that they threw together a C++ parser and translator that worked that well in such a limited time.

I still play Crash (2/3, not 1) quite a lot on the OpenPandora & on my PS2 when I'm in my wood cabin. I still like it, and that's partly because of how many constraints there were on the PS1 to get something 'big' like this out there.

I keep hoping for a book with annotated (including lisp) source... Please please!!

Crash on OP is a real treat - it's also one of my favourite wastes of time from the Pandora repo .. truly wonderful to have it so easily portable!

Yup, did it on OP not too long ago as well :)

Back in the days of the 8-bit micros there were some games that were absolutely incredible for what they could pack into a tiny machine. Elite on the BBC micro (32k) was one, but being poor I had a Spectrum 128. There was a helicopter simulator by Digital Integration for the 48k Spectrum called Tomahawk which was particularly good, and another 3D game called Starglider, which was originally for the Atari ST but there were ports to the Spectrum 128 and 48k Spectrum! I seem to remember an article in Sinclair User where they interviewed the developers who explained how they got it all to fit into such a small machine. Self-modifying code and using parts of the frame buffer to store code and data IIRC...

Time + Internet = A smaller world that doesn't forget

I recall reading that at one point game programmers were using portions of their code as textures. Which reminds me of the story of Mel. http://www.catb.org/jargon/html/story-of-mel.html

Indeed, the old FTL game, Dungeon Master (an old RPG, its most direct modern successor-by-inspiration being Legend of Grimrock) put some copy-protection code in one of its sprites, where it hoped you wouldn't notice.

One part of it would repeatedly read one "weak" sector that had been incorrectly encoded on the disk in such a way that it would usually read inconsistently, and would crash the game with an obscure system error if it didn't read differently.

Cute, but kind of a problem when you have a much better quality floppy drive which read the weak sector consistently.

Now here's a crazy idea:

Steganography via rearranging executable code within a binary file so that the binary file looks most like a particular bitmap.

confirmed: I'm a shitty developer

Found while looking for other articles about this: http://web.stanford.edu/group/htgg/sts145papers/jdelahunt_20...

I could read stuff like this 24/7 ... something visceral about plowing through some ridiculous constraints and finally making history really gets to me ...

Great post! Unfortunately it's on quora

I don't mean this sarcastically but what's wrong with quora? I've always found it to have pretty good content but your comment seems to be implying it's assumed to be a bad place to read things. Why?

To read the rest of this comment, please log in or append a secret URL parameter to the URL.

Or disable JavaScript :D (e.g. NoScript/ScriptSafe)

What's the secret URL parameter?

?share=1, though it's not a secret, it was posted about when this change went in: https://blog.quora.com/Making-Sharing-Better

Then again, who's going to find that old blog post :P

Thank you!!

(I figured out you don't have to pass anything specific into the share parameter, you can do "?share=poop" or even just "?share".)

Hah, nice :)

Also, this parameter used to (maybe still does) set a cookie so that you can browse around the rest of your session without worrying about a sign up modal.

Gotta love Naughty Dog. Still leading the industry! Can't wait for Uncharted 4

Glorious tales of a forgotten past.

For those of you who don't want to endure Quora's forced account setup:

Here's a related anecdote from the late 1990s. I was one of the two programmers (along with Andy Gavin) who wrote Crash Bandicoot for the PlayStation 1.

RAM was still a major issue even then. The PS1 had 2MB of RAM, and we had to do crazy things to get the game to fit. We had levels with over 10MB of data in them, and this had to be paged in and out dynamically, without any "hitches"—loading lags where the frame rate would drop below 30 Hz.

It mainly worked because Andy wrote an incredible paging system that would swap in and out 64K data pages as Crash traversed the level. This was a "full stack" tour de force, in that it ran the gamut from high-level memory management to opcode-level DMA coding. Andy even controlled the physical layout of bytes on the CD-ROM disk so that—even at 300KB/sec—the PS1 could load the data for each piece of a given level by the time Crash ended up there.
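To put the numbers in the paragraph above in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes 1KB = 1024 bytes and ignores CD seek time entirely, so the real budget would have been tighter:

```python
PAGE_BYTES = 64 * 1024            # one 64K data page
CD_BYTES_PER_SEC = 300 * 1024     # 300KB/sec, assuming KB means 1024 bytes
FRAME_HZ = 30                     # target frame rate from the text
RAM_BYTES = 2 * 1024 * 1024       # total PS1 main RAM
LEVEL_BYTES = 10 * 1024 * 1024    # a level "with over 10MB of data"

seconds_per_page = PAGE_BYTES / CD_BYTES_PER_SEC  # raw transfer time only
frames_per_page = seconds_per_page * FRAME_HZ     # frames elapsed per page load
pages_in_ram = RAM_BYTES // PAGE_BYTES            # if ALL of RAM held pages
pages_in_level = LEVEL_BYTES // PAGE_BYTES        # pages in a 10MB level

print(f"{seconds_per_page:.3f} s per page, about {frames_per_page:.1f} frames")
print(f"{pages_in_ram} pages would fill all of RAM; a 10MB level is {pages_in_level} pages")
```

Under these assumptions a single page takes several frames to arrive, which is why the data had to be requested well before Crash reached the part of the level that needed it.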

I wrote the packer tool that took the resources—sounds, art, lisp control code for critters, etc.—and packed them into 64K pages for Andy's system. (Incidentally, this problem—producing the ideal packing into fixed-sized pages of a set of arbitrarily-sized objects—is NP-complete, and therefore likely impossible to solve optimally in polynomial—i.e., reasonable—time.)

Some levels barely fit, and my packer used a variety of algorithms (first-fit, best-fit, etc.) to try to find the best packing, including a stochastic search akin to the gradient descent process used in Simulated annealing. Basically, I had a whole bunch of different packing strategies, and would try them all and use the best result.
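The "try them all and use the best result" idea can be sketched in a few lines of Python. This is only an illustration with made-up sizes: plain random shuffles stand in for the stochastic search, and none of the function names come from the original tool.

```python
import random


def pack(sizes, capacity, choose_bin):
    """Greedy packer: choose_bin picks among the pages that still have room."""
    free = []  # remaining free space in each page
    for size in sizes:
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds page capacity")
        candidates = [i for i, space in enumerate(free) if space >= size]
        if candidates:
            free[choose_bin(candidates, free)] -= size
        else:
            free.append(capacity - size)  # open a new page
    return len(free)


def first_fit(candidates, free):
    return candidates[0]  # earliest page with enough room


def best_fit(candidates, free):
    return min(candidates, key=lambda i: free[i])  # tightest page with room


def best_packing(sizes, capacity, tries=200, seed=0):
    """Run several strategies plus random orderings; keep the smallest count."""
    rng = random.Random(seed)  # fixed seed so a given input repeats exactly
    ordered = sorted(sizes, reverse=True)
    best = min(pack(order, capacity, rule)
               for order in (sizes, ordered)
               for rule in (first_fit, best_fit))
    for _ in range(tries):
        shuffled = list(sizes)
        rng.shuffle(shuffled)
        best = min(best, pack(shuffled, capacity, first_fit))
    return best


sizes_kb = [40, 30, 30, 24, 20, 16, 16, 10, 6]  # hypothetical asset sizes
print(best_packing(sizes_kb, 64))
```

Seeding the random generator makes the result repeatable for an identical input, but changing the size of a single asset still changes every shuffled ordering after it, so a lucky packing can disappear after a small edit.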

The problem with using a random guided search like that, though, is that you never know if you're going to get the same result again. Some Crash levels fit into the maximum allowed number of pages (I think it was 21) only by virtue of the stochastic packer "getting lucky". This meant that once you had the level packed, you might change the code for a turtle and never be able to find a 21-page packing again. There were times when one of the artists would want to change something, and it would blow out the page count, and we'd have to change other stuff semi-randomly until the packer again found a packing that worked. Try explaining this to a crabby artist at 3 in the morning. :)

By far the best part in retrospect—and the worst part at the time—was getting the core C/assembly code to fit. We were literally days away from the drop-dead date for the "gold master"—our last chance to make the holiday season before we lost the entire year—and we were randomly permuting C code into semantically identical but syntactically different manifestations to get the compiler to produce code that was 200, 125, 50, then 8 bytes smaller. Permuting as in, "for (i=0; i < x; i++)"—what happens if we rewrite that as a while loop using a variable we already used above for something else? This was after we'd already exhausted the usual tricks of, e.g., stuffing data into the lower two bits of pointers (which only works because all addresses on the R3000 were 4-byte aligned).
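The pointer trick mentioned at the end of that paragraph relies on aligned addresses always having their two lowest bits equal to zero. Here is a small Python model of the idea, using plain integers in place of real pointers; the function names and the example address are made up for illustration:

```python
TAG_BITS = 2
TAG_MASK = (1 << TAG_BITS) - 1  # 0b11: the two bits that alignment leaves free


def pack_ptr(address, tag):
    """Hide a 2-bit tag in the low bits of a 4-byte-aligned address."""
    if address & TAG_MASK:
        raise ValueError("address must be 4-byte aligned")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag must fit in two bits")
    return address | tag


def unpack_ptr(word):
    """Split a tagged word back into (address, tag)."""
    return word & ~TAG_MASK, word & TAG_MASK


word = pack_ptr(0x80010000, 3)  # hypothetical address, tag value 3
address, tag = unpack_ptr(word)
print(hex(address), tag)
```

The cost is that the tag has to be masked off before every dereference, which is the kind of trade one only makes when every byte counts.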

Ultimately Crash fit into the PS1's memory with 4 bytes to spare. Yes, 4 bytes out of 2097152. Good times.

very good read.

Multiple orders of magnitude less artwork. Multiple orders of magnitude fewer polygons in models. This would explain a lot of the size difference.

But it also had multiple orders of magnitude less memory. So they had little or no room for excess.
