Super Mario Bros. game was just 31 Kilobytes. How's that possible? (freecodecamp.org)
116 points by rahuldottech on Oct 10, 2019 | 48 comments

No, no, no.

1. Super Mario Bros. is 40KB (PRG ROM is 32KB, and CHR ROM is 8KB). That said, I suppose if you were to compress the ROM, you would probably end up with a value like 31KB.

2. Levels don't use RLE or LZ77; that would use too much space. Rather, the game uses commands like "from now on, use this ground fill pattern", "draw pipe here", "put a staircase", "put a goal". Super Mario Bros. uses two bytes per object, representing the following information: XXXXYYYY PSSSVVVV, where X is horizontal position within the page, Y is vertical position within the page, P moves to the next page, S is a command, and V is its parameter (length or number of an extended object). Y positions bigger than 0b1011 are used for special commands (as such parameters would be rendered out of bounds).
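As a rough illustration of the two-byte layout described above (XXXXYYYY PSSSVVVV), here is a minimal Python sketch of unpacking one object. The names `LevelObject`, `decode_object`, and `is_special` are made up for this example, and the sample bytes are arbitrary rather than taken from the actual ROM.

```python
from typing import NamedTuple


class LevelObject(NamedTuple):
    x: int          # XXXX: horizontal position within the page (0-15)
    y: int          # YYYY: vertical position within the page (0-15)
    new_page: bool  # P: move to the next page
    command: int    # SSS: 3-bit command number
    param: int      # VVVV: 4-bit parameter (length or extended object number)


def decode_object(byte0: int, byte1: int) -> LevelObject:
    """Unpack one object stored as XXXXYYYY PSSSVVVV."""
    return LevelObject(
        x=(byte0 >> 4) & 0x0F,
        y=byte0 & 0x0F,
        new_page=bool(byte1 & 0x80),
        command=(byte1 >> 4) & 0x07,
        param=byte1 & 0x0F,
    )


def is_special(obj: LevelObject) -> bool:
    """Y positions bigger than 0b1011 are used for special commands."""
    return obj.y > 0b1011


# Arbitrary example bytes, not real level data.
obj = decode_object(0x3A, 0xA5)
print(obj)              # x=3, y=10, new_page=True, command=2, param=5
print(is_special(obj))  # False
```

At two bytes per object, a level with a few dozen objects fits in well under 100 bytes, which is the point being made about space.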

3. Graphics aren't compressed, as they are stored in CHR ROM and are accessed directly by the PPU (the PPU cannot access CPU RAM). However, as the graphics are 2bpp, they don't take up much space. Palettes keep the entire game from being restricted to the same four colors: note that 16x16 blocks have different color palettes. Some tricks are used to reduce the number of necessary tiles. For instance, clouds are the same thing as bushes, just with a different palette applied.
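To make the 2bpp point concrete, here is a small Python sketch of decoding one 8x8 tile, assuming the usual NES pattern-table layout of 16 bytes per tile (8 bytes for the low bitplane followed by 8 bytes for the high bitplane). The function name `decode_tile` and the sample tile are invented for illustration.

```python
from typing import List


def decode_tile(tile: bytes) -> List[List[int]]:
    """Decode one 16-byte 8x8 tile into rows of 2-bit color indices (0-3)."""
    if len(tile) != 16:
        raise ValueError("a 2bpp 8x8 tile is exactly 16 bytes")
    rows = []
    for y in range(8):
        low = tile[y]        # bitplane 0: low bit of each pixel in this row
        high = tile[y + 8]   # bitplane 1: high bit of each pixel in this row
        row = []
        for x in range(8):
            shift = 7 - x    # the leftmost pixel is the most significant bit
            row.append(((low >> shift) & 1) | (((high >> shift) & 1) << 1))
        rows.append(row)
    return rows


# 8KB of CHR ROM at 16 bytes per tile holds 512 tiles.
TILES_IN_8KB = 8 * 1024 // 16

# Made-up tile: only the top row has pixels set.
sample = bytes([0xFF] + [0] * 7 + [0x0F] + [0] * 7)
print(decode_tile(sample)[0])  # [1, 1, 1, 1, 3, 3, 3, 3]
print(TILES_IN_8KB)            # 512
```

The indices 0-3 are not colors themselves; the palette assigned to the tile's area decides what they look like on screen, which is how the same tile can be drawn as a cloud or a bush.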

4. The demoscene comparison is misleading. Demoscene productions mostly rely on procedural generation, whereas the levels in Super Mario Bros. were actually designed and the graphics were drawn by an artist; they don't come from some arbitrarily picked mathematical function chosen because it looked cool. The game has to be playable, after all.

Indeed, nothing in this article has anything to do with how the NES worked. If you want a real introduction to the hardware, "I Am Error" by Nathan Altice is a great book about how the NES worked. It includes a chapter about how SMB1 in particular stored its level layouts.

I once did a project scripting an AI for Mario in SMB1 on an emulator, and I watched the game's memory pages as I went so I could learn more. To my surprise, I saw pages of arrays of hex bytes in memory mirroring what I saw in the level; the layout was transferred very literally. When Mario moved, the current screen and the next one were clearly visible scrolling by in memory, with different hex bytes representing different tile types.

You get used to it, though. Your brain does the translating. I don't even see the code. All I see is blonde, brunette, redhead.

They used that tiling system all the way up to the NDS, IIRC. I remember tinkering with Game Boy Advance homebrew programming back then, trying to implement infinite scrolling, and it works just like you described.

Thanks for the reference. Hadn't heard of this book and just now ordered it.

"I Am Error" is not the best name for that book. I wouldn't have thought it was an in-depth technical book for the NES based on that name.

The whole "Platform Studies" series is a guarantee of in-depth technical work. While all have deep knowledge, "Racing the Beam" (Atari 2600) is amazing in its simplicity, while "The Future Was Here" (Amiga) is borderline snobbish. YMMV.

> Rather, the game uses commands like "from now on, use this ground fill pattern"

That's RLE, isn't it? https://en.wikipedia.org/wiki/Run-length_encoding

Or are you saying that they didn't even include the "count" for the ground fill?


I'm confused about the thesis of the article. It seems like the article's purpose isn't to describe NES graphics, but instead to talk about PNG vs JPG files?

The article starts as "Great question Dion!", but it never actually says what the original question was. (Or if the question was "Where do all the pixels come from", then that's an inadequate question and doesn't seem to serve as a thesis for the article)

EDIT 2: It seems like the HN title "Super Mario Bros. game was just 31 Kilobytes. How's that possible?" is misrepresentative.

“From now on, use this ground fill pattern” is a type of run length encoding scheme, but implemented as a tile-drawing command in the game engine, not as an image file format.
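A hypothetical sketch of what such a modal fill command could look like in Python: each command changes the current ground height, and that height stays in effect for every later column until another command replaces it. The command format (column, height) and the name `render_columns` are assumptions for illustration, not the game's actual encoding.

```python
def render_columns(width, commands):
    """Expand modal 'set ground fill' commands into per-column ground heights.

    commands: (column, height) pairs sorted by column. Each one means
    "from this column on, draw ground this many tiles high".
    """
    pending = list(commands)
    current = 0
    heights = []
    for col in range(width):
        # Apply every command that takes effect at or before this column.
        while pending and pending[0][0] <= col:
            current = pending.pop(0)[1]
        heights.append(current)
    return heights


# Ground 2 tiles high, a pit from column 3, ground again from column 5.
print(render_columns(8, [(0, 2), (3, 0), (5, 2)]))
# [2, 2, 2, 0, 0, 2, 2, 2]
```

Three commands describe eight columns here, and asking for more columns costs nothing extra because the last mode simply keeps applying.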

I’d speculate that the use of this run-length tiling command predates the RLE image format, since the Mario Bros arcade game that predates SMB1 (with some of the same programmers) and the RLE format patent both happened at the same time (1983) and run-length encoding schemes were already common before that.

IIRC Mario 1 works by drawing "sprites" of tiles. The level data says "pipe at 73,30" where pipe is a 2x3 array of the tile ids that make a pipe or "cloud at 34,3" where cloud is a 3x2 array of tile ids. I don't know what percentage of NES games used a technique like that. AFAIK the majority of NES games with scrolling levels used a simple 2D tile map but Mario 1 does not. I shipped a NES game, ours had an MMC3 chip which if I recall gave us +8k ram for a total of 10k. We decompressed the current level's tile map into a portion of that.
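A minimal Python sketch of the "sprites of tiles" idea described above, where an object is a small 2D array of tile ids stamped into a tile map at a position. The tile ids, the map size, and the `stamp` helper are all hypothetical.

```python
# Hypothetical tile ids: a pipe that is 2 tiles wide and 3 tiles tall.
PIPE = [
    [0x60, 0x61],  # pipe top
    [0x62, 0x63],  # pipe body
    [0x62, 0x63],  # pipe body
]


def stamp(tile_map, obj, x, y):
    """Copy an object's tile ids into tile_map with its top-left corner at (x, y)."""
    for dy, row in enumerate(obj):
        for dx, tile_id in enumerate(row):
            tile_map[y + dy][x + dx] = tile_id


# A blank 4x4 map (0 = empty), with the pipe placed at column 1, row 1.
tile_map = [[0] * 4 for _ in range(4)]
stamp(tile_map, PIPE, 1, 1)
for row in tile_map:
    print(row)
```

The level data then only needs an object id and a position per pipe, while the 2x3 array itself is stored once.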


> I shipped a NES game

One of my favorite games of all time. You signed a copy of it via mail for me around 2006/2007 ;) Cheers.

> We decompressed the current level's tile map into a portion of that.

I reverse engineered it to dump PNG format maps of both it and the leaked prototype. Pretty straightforward RLE but it was fun learning about 6502, NES programming, and bank switching.

Level dumper source: https://gitlab.com/mcmapper/mcmapper/blob/master/main.c

Generated images: https://tcrf.net/Proto:M.C._Kids

Very different from how SMB1's level construction works. Like the OP comment said, it basically has a modal command structure. Something like "start drawing the ground two tiles up from the bottom. At x=18, draw a pipe." etc. This is why the level scrolls infinitely after the flagpole, it's just running the default "draw ground" command forever. M.C. Kids on the other hand actually stored the whole level tile grid, using RLE for compression.
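For contrast with the command-based approach, here is a generic run-length encoding sketch in Python for a flat tile grid. It uses simple (count, tile) byte pairs, which is an assumption chosen for clarity and not the actual M.C. Kids format.

```python
def rle_encode(tiles: bytes) -> bytes:
    """Encode tiles as (count, value) byte pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(tiles):
        run = 1
        while i + run < len(tiles) and tiles[i + run] == tiles[i] and run < 255:
            run += 1
        out += bytes([run, tiles[i]])
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the flat tile grid."""
    if len(data) % 2:
        raise ValueError("RLE data must be a sequence of (count, value) pairs")
    out = bytearray()
    for i in range(0, len(data), 2):
        count, value = data[i], data[i + 1]
        out.extend([value] * count)
    return bytes(out)


row = bytes([0] * 5 + [7] * 2)       # five empty tiles, then two of tile 7
packed = rle_encode(row)
print(list(packed))                  # [5, 0, 2, 7]
print(rle_decode(packed) == row)     # True
```

This stores the whole grid and shrinks it, whereas the command approach never stores a grid at all.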

Very much agree with the parent.

If you want a better idea of how you can cram everything into 40KB, using every trick in the book, including ones not considered 30+ years ago, take a look at how a recently released NES game (Micro Mages) managed it.

Their primary design constraint was that they restricted themselves to 40KB, despite wanting to do far more than the games that were done in 40KB back in the 80s.


This video is so well done! @dang this would make a much better top link than the nonsensical article posted originally

> the game uses commands like "from now on, use this ground fill pattern"

Code is often a good form of data compression.

The first sentence of this post strikes me as needlessly combative and the post would be better without it.

Wow as someone who knows the NES well this article contains an error in almost every paragraph. They really got all the technical stuff wrong! So many made-up numbers everywhere!

To explain Super Mario a bit better, it's really not using many tricks. The graphics are uncompressed and take up 8KB. The code is uncompressed. The only thing really compressed is the levels. The explosion in video game size from then to now is because asset detail grows O(N^2): if you double the resolution of a texture along each axis, you end up quadrupling its size.
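A quick Python check of the O(N^2) growth, assuming a square, uncompressed texture at 4 bytes per pixel (the pixel format is an assumption for the example): doubling the side length multiplies the byte count by four.

```python
def texture_bytes(side: int, bytes_per_pixel: int = 4) -> int:
    """Size in bytes of an uncompressed square texture."""
    return side * side * bytes_per_pixel


small = texture_bytes(256)   # 262144 bytes (256 KiB)
large = texture_bytes(512)   # 1048576 bytes (1 MiB)
print(large // small)        # 4
```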

Developers are also just lazy or plain believe in snake oil. On PC, Titanfall 1 had uncompressed wav files for audio which was 35GB of its 48GB file size, the reason being 'decoding audio incurs a performance cost and we want the maximum performance for our players'.

Not only is using lossless audio unnecessary (no one in the world is gonna hear the difference between a wav/flac and a 320kbps mp3 gunshot), the decoding cost for mp3 or ogg is minuscule.

That is just brutally asinine. Probably one guy in the whole dev team believed this load of crock and the rest of the team had to follow him off the cliff because he had a senior title.

I guess reading and processing 10x the data had better performance.

> They can cram 30minutes of an entire 3d shooter into 64kb because they understand and know so much more about their data.

Erm, not really. Most of the time it's because they spend like 3 minutes of loading time generating textures and all in memory before displaying anything on screen. And they rely on the giganormous graphic libraries of Windows (for Windows demos) + the huge driver stack that would never fit in 64kb in the first place.
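As a toy illustration of trading load time for file size, here is the classic XOR pattern in Python: a one-line function produces a 256x256 grayscale texture in memory that would take 64KB to store as raw bytes. This is only a sketch of the general idea, not what any particular demo does.

```python
def xor_texture(size: int = 256):
    """Generate a size x size grayscale texture where pixel (x, y) = x XOR y."""
    return [[(x ^ y) & 0xFF for x in range(size)] for y in range(size)]


texture = xor_texture()
stored_bytes = sum(len(row) for row in texture)
print(stored_bytes)     # 65536 bytes if stored raw, one byte per pixel
print(texture[1][2])    # 3
```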

I am much more impressed by demos built to run on "the hardware" back in C64 and Amiga days.

About the driver stack that would never fit:

When you target a single hardware platform (C64, Amiga...), addressing it directly is no more complex than going through drivers. Peeking/poking ports and directly addressing video memory doesn't cost many bytes.

I think, though, that this is why, back in the '90s, there was so much hand-wringing about recent developments (at the time) in computing leading to the spiritual death of the demoscene. All that hardware heterogeneity meant that democoders end up far removed from the machines they work on.

It's turned out to be at least somewhat true, insofar as oldschool demos still carry a lot of cachet despite the hardware not being anywhere near as capable. I don't know which one took more effort or skill to build, but, on a gut level, 8088MPH just feels cooler than kkrieger.

Maybe that's just me showing my age, though.

Considering they had to write custom tools for pretty much everything, including a visual language and editor to design all art assets almost two decades before that became a thing in AAA games, and create custom compression schemes down to even (IIRC) a custom floating point representation that would compress better, I'd say that .kkrieger was the one that took more effort and skill.

(though a lot of that work was largely based on their initial work for .the .product).

8088MPH is certainly very cool, but a large reason is that nobody had bothered with the original IBM PC much before.

> they spend like 3 minutes of loading time generating textures

this implies that they do 'know so much more about their data' than a simple bitmap

Would you rather have to know about your data in such detail that you can define a general function for generating it... or just define it?

let me show you my one byte compressor, it only relies on a 1TB data set

My reading enjoyment was really harmed by the indiscriminate, occasional lower-casing of the units. According to the notation I was taught, 1MB is 1 megabyte, while 1mb is 1 millibit, a fractional unit (used really rarely in information theory) that is 8 billion times smaller.
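The "8 billion" figure can be checked in a few lines of Python, assuming SI prefixes (1 MB = 10^6 bytes, 1 mb = 10^-3 bit):

```python
from fractions import Fraction

BITS_PER_MEGABYTE = 8 * 10**6           # 10^6 bytes, 8 bits each
BITS_PER_MILLIBIT = Fraction(1, 1000)   # one thousandth of a bit

ratio = BITS_PER_MEGABYTE / BITS_PER_MILLIBIT
print(ratio)  # 8000000000
```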

Note: I think you can actually talk about fractional bits in information theory.

Super off topic but....

> Which, for the next 2 billion folks in emerging markets, on 2G connections, is like the worst idea ever.

Which emerging markets still use 2G? In almost all of the world 2G is being phased out and it’s in developed economies where it is scheduled to take the longest. Many emerging markets had their infrastructure created after 2G was obsolete and are already on better technology.


I know of some hubs/stations, through which remote sensors that monitor cows' temperature talk to a web server, that use 2G technology because they were designed many years ago.

And it's an issue as operators phase out 2G...

This article gets so many things wrong it’s painful.

Just clever use of tiles, proper encoding of assets, and making your code lean.

Back in the '00s, I made a J2ME mobile realtime multiplayer online strategy game, which included animations, music and sfx. It was 60KB, because I wanted to target as many devices as possible.

Later it grew to about 200KB.

Lots of space and performance optimizations.

What game was this? (Can I find JARs anywhere?)

> "Basically, we’re sending more data to smaller devices simply due to screen resolution. Which, for the next 2 billion folks in emerging markets, on 2G connections, is like the worst idea ever. But I digress. That’s a different post."

I'll take the devil's advocate position on this tangent. Mobile data infrastructure is remarkably reliable and cost-effective in emerging markets.

Anecdotally, I live in Canada. We have some of the most expensive and restrictive mobile phone service in the world. 4G coverage is still spotty, even in major cities. On a recent trip to Morocco it was really astounding to me how much cheaper, faster and more ubiquitous 4G service was.

Here is some actual data:

[0] Global pricing of 1GB of data: https://www.cable.co.uk/mobiles/worldwide-data-pricing/

[1] Mobile data speeds by country: https://www.opensignal.com/reports/2018/11/global-state-of-t...

This was also on reddit today: https://pay.reddit.com/r/interestingasfuck/comments/dfv6n3/l...

Interesting that a single screenshot of the game weighs more than the entire game.

That is because they didn't try, this screenshot[0] is 1980 bytes which is much smaller than the linked image (and something like OptiPNG might make it even smaller) :-P

[0] https://i.imgur.com/ABFJbYW.png

I prefer the inverse of this question. Something like... Why is the iOS Uber app taking up 250MB on my phone?

Terrible developers, poor framework and design choices.

Wow. Really? That's, like, really bloated.

Yep. To be fair, it’s 211MB for the app and 39MB for documents and data, which I assume is a map cache.

Facebook's app size is 235MB. These are both basically just clients for a backend REST API, so this size seems crazy.

This article shows that the author treats developing nations still on 2G and NES-to-web game development as almost the same thing. Back then people didn't even talk about RAM but ROM, so it's a totally different scenario. As for 2G, maybe you think South Korea is a developing country too; well, they're the first in the world to deploy 5G at commercial scale.

Very good talk by David Braben about the original Elite where he describes tricks they used to cut the size: https://www.gdcvault.com/play/1014628/Classic-Game-Postmorte...

It's unlikely that the final image displayed is 180KB for a 256 * 240 resolution; that would imply that each pixel uses 3 bytes (RGB). I think it's coded in one byte (256 colors).

So I'd say the final image would be 60KB. Can someone confirm?
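The arithmetic behind both figures, as a short Python check (the 3-bytes and 1-byte per pixel cases are the two assumptions being compared):

```python
WIDTH, HEIGHT = 256, 240


def frame_bytes(bytes_per_pixel: int) -> int:
    """Size in bytes of one uncompressed frame at the given pixel depth."""
    return WIDTH * HEIGHT * bytes_per_pixel


rgb = frame_bytes(3)       # 24-bit RGB
indexed = frame_bytes(1)   # one byte per pixel (up to 256 colors)
print(rgb, rgb // 1024)          # 184320 180
print(indexed, indexed // 1024)  # 61440 60
```

So 180KB matches 3 bytes per pixel and 60KB matches 1 byte per pixel.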

Ask HN: I wonder what the fewest lines of code that have earned the most revenue is...

This article is from 2015. The title should reflect that.

.kkrieger wasn't 64K, it was 96K.

Great write-up! Pondering this question is the thing that brought me into software development in the first place. It was fascinating back then how my big, expensive PC/XT would struggle to make anything run half as smoothly as a Game Boy. PC games had contents spread over several floppy disks, taking MEGABYTES of space and slowly crunch-crunch-crunching whenever I'd go to a different screen in Quest for Glory, but you could play the immersive game of Zelda seamlessly.

It's good to look back and appreciate the fundamentals, since there was a lot of clever thinking in this that we can learn from today. There are often times I'll come up with an overly hefty solution in a generic case, instead of thinking like Nintendo did about focusing on making something cool to look at with a fun experience, though with a much narrower scope. NLP is one place I see this a lot: it's often very easy to make AWS happy when I go through GBs of data, for something where 95% of the value to the end user could be reached with a 50kB file on S3. Dapps are another one; it helps a lot to think like a Nintendo dev in the 80s about what the reusable parts of your data are.

> Great write-up!

It's an awful post, it would take less time to point out what is correct than highlight all the mistakes.
