1. Super Mario Bros. is 40KB (PRG ROM is 32KB, and CHR ROM is 8KB). That said, I suppose if you were to compress the ROM, you would probably end up with a value like 31KB.
2. Levels don't use RLE or LZ77; that would use too much space. Rather, the game uses commands like "from now on, use this ground fill pattern", "draw a pipe here", "put a staircase here", "put a goal here". Super Mario Bros. uses two bytes per object, laid out as XXXXYYYY PSSSVVVV, where X is the horizontal position within the page, Y is the vertical position within the page, P means "move to the next page", S is a command, and V is its parameter (a length, or the number of an extended object). Y positions bigger than 0b1011 are used for special commands (as such positions would be rendered out of bounds).
3. Graphics aren't compressed, as they are stored in CHR ROM and accessed directly by the PPU (the PPU cannot access CPU RAM). However, as the graphics are 2bpp, they don't use much space. Palettes keep the entire game from being limited to four colors - note that 16x16 blocks can have different color palettes. Some tricks are used to reduce the number of necessary tiles - for instance, clouds are the same tiles as bushes, just with a different palette applied.
4. The demoscene comparison is misleading. Most demoscene productions rely on procedural generation, whereas the levels in Super Mario Bros. were actually designed and the graphics were drawn by an artist; they aren't generated from some arbitrarily picked mathematical function chosen because it looked cool. The game has to be playable, after all.
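A rough sketch (in Python, with illustrative names - the actual 6502 routine looks nothing like this) of unpacking the two-byte XXXXYYYY PSSSVVVV object described in point 2:

```python
# Hypothetical decoder for one SMB1-style level object, based on the
# bit layout described above. Field names are my own, not from any
# disassembly.

def decode_object(b1, b2):
    x = (b1 >> 4) & 0x0F        # X: horizontal position within the page
    y = b1 & 0x0F               # Y: vertical position within the page
    new_page = (b2 >> 7) & 1    # P: advance to the next page before placing
    command = (b2 >> 4) & 0x07  # S: which object/command to draw
    value = b2 & 0x0F           # V: length, or ID of an extended object
    special = y > 0b1011        # Y rows 12-15 are repurposed for special commands
    return {"x": x, "y": y, "new_page": bool(new_page),
            "command": command, "value": value, "special": special}

obj = decode_object(0x57, 0x92)
# x=5, y=7, new_page=True, command=1, value=2, special=False
```

Two bytes per object is why whole levels fit in a few hundred bytes.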
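For point 3, the CHR tile format itself is well documented: each 8x8 tile is 16 bytes, stored as two bitplanes, and each pixel is a 2-bit value that a palette maps to an actual color. A minimal sketch:

```python
# Decode one 8x8 NES CHR tile (2 bits per pixel, 16 bytes).
# The first 8 bytes are the low bitplane, the next 8 the high bitplane.

def decode_tile(tile_bytes):
    assert len(tile_bytes) == 16
    pixels = []
    for row in range(8):
        lo, hi = tile_bytes[row], tile_bytes[row + 8]
        pixels.append([((lo >> (7 - col)) & 1) | (((hi >> (7 - col)) & 1) << 1)
                       for col in range(8)])
    return pixels  # 8x8 grid of values 0-3

# 8KB of CHR ROM holds 8192 / 16 = 512 such tiles.
```

At 16 bytes per tile, the cloud/bush trick costs nothing: the same 512-tile budget is reused under different palettes.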
That's RLE, isn't it? https://en.wikipedia.org/wiki/Run-length_encoding
Or are you saying that they didn't even include the "count" for the ground fill?
I'm confused about the thesis of the article. It seems like the article's purpose isn't to describe NES graphics, but instead to talk about PNG vs JPG files?
The article starts as "Great question Dion!", but it never actually says what the original question was. (Or if the question was "Where do all the pixels come from", then that's an inadequate question and doesn't seem to serve as a thesis for the article)
EDIT 2: It seems like the HN title "Super Mario Bros. game was just 31 Kilobytes. How's that possible?" is misrepresentative.
I’d speculate that the use of this run-length tiling command predates the RLE image format: the Mario Bros. arcade game that preceded SMB1 (with some of the same programmers) and the RLE format patent both appeared in the same year (1983), and run-length encoding schemes were already common before that.
One of my favorite games of all time. You signed a copy of it via mail for me around 2006/2007 ;) Cheers.
> We decompressed the current level's tile map into a portion of that.
I reverse engineered it to dump PNG format maps of both it and the leaked prototype. Pretty straightforward RLE but it was fun learning about 6502, NES programming, and bank switching.
Level dumper source: https://gitlab.com/mcmapper/mcmapper/blob/master/main.c
Generated images: https://tcrf.net/Proto:M.C._Kids
Very different from how SMB1's level construction works. Like the OP comment said, it basically has a modal command structure. Something like "start drawing the ground two tiles up from the bottom. At x=18, draw a pipe." etc. This is why the level scrolls infinitely after the flagpole, it's just running the default "draw ground" command forever. M.C. Kids on the other hand actually stored the whole level tile grid, using RLE for compression.
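For contrast with SMB1's command stream, a byte-oriented RLE decoder of the kind M.C. Kids uses for its tile grids can be sketched in a few lines (the actual on-disk format in the linked main.c may differ; this assumes simple (count, tile) pairs):

```python
# Minimal run-length decoder: each pair of bytes is (count, tile),
# expanded into `count` copies of `tile`.

def rle_decode(data):
    out = []
    for i in range(0, len(data), 2):
        count, tile = data[i], data[i + 1]
        out.extend([tile] * count)
    return out

print(rle_decode([3, 0x40, 1, 0x7F, 2, 0x40]))
# [64, 64, 64, 127, 64, 64]
```

This stores the whole grid explicitly, which is why M.C. Kids levels end where their data ends, while SMB1 just keeps executing its "draw ground" default forever.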
If you want a better idea of how you can cram everything into 40KB — using every trick in the book, including ones not considered 30+ years ago — take a look at how a recently released NES game (Micro Mages) managed it.
Their primary design constraint was that they restricted themselves to 40KB, despite wanting to do far more than the 40KB games of the 80s did.
Code is often a good form of data compression.
To explain Super Mario a bit better, it's really not using many tricks. The graphics are uncompressed and take up 8KB. The code is uncompressed. The only thing really compressed is the levels. The explosion in video game size from then to now is because asset size grows as O(N^2) in linear detail - double a texture's resolution and you quadruple its size.
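The quadratic growth is just arithmetic (sizes below assume uncompressed RGBA at 4 bytes per pixel):

```python
# Raw size of a square texture: side * side * bytes_per_pixel.
def raw_size_bytes(side, bytes_per_pixel=4):
    return side * side * bytes_per_pixel

for side in (256, 512, 1024, 2048):
    print(side, raw_size_bytes(side))
# 256 -> 262144 (256KB) ... 2048 -> 16777216 (16MB)
```

Each doubling of resolution multiplies the size by four, so a handful of doublings takes you from kilobytes to tens of megabytes per texture.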
Not only is using lossless audio unnecessary (no one in the world is gonna hear the difference between a wav/flac gunshot and a 320kbps mp3 one), the decoding cost for mp3 or ogg is minuscule.
Erm, not really. Most of the time it's because they spend like 3 minutes of loading time generating textures and everything else in memory before displaying anything on screen. And they rely on the giganormous graphics libraries of Windows (for Windows demos) + the huge driver stack that would never fit in 64kb in the first place.
I am much more impressed by demos built to run on "the hardware" back in C64 and Amiga days.
When you have a single hardware target (C64, Amiga...), it is no more complex to address than going through drivers. Peeking/poking ports and directly addressing video memory doesn't cost many bytes.
It's turned out to be at least somewhat true, insofar as oldschool demos still carry a lot of cachet despite the hardware not being anywhere near as capable. I don't know which one took more effort or skill to build, but, on a gut level, 8088MPH just feels cooler than kkrieger.
Maybe that's just me showing my age, though.
(though a lot of that work was largely based on their initial work for .the .product).
8088MPH is certainly very cool, but a large reason is that nobody had bothered with the original IBM PC much before.
this implies that they do 'know so much more about their data' than a simple bitmap
Note: I think you can actually talk about fractional bits in information theory.
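Right - a quick illustration: the Shannon entropy of a biased coin is a non-integer number of bits per flip, and arithmetic coders can actually get that close in practice.

```python
# Shannon entropy of a binary source with P(heads) = p:
# H(p) = -p*log2(p) - (1-p)*log2(1-p), measured in bits per symbol.
import math

def entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(round(entropy(0.5), 4))  # 1.0 bit per flip for a fair coin
print(round(entropy(0.9), 4))  # ~0.469 bits: a predictable coin carries less info
```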
> Which, for the next 2 billion folks in emerging markets, on 2G connections, is like the worst idea ever.
Which emerging markets still use 2G? In almost all of the world 2G is being phased out and it’s in developed economies where it is scheduled to take the longest. Many emerging markets had their infrastructure created after 2G was obsolete and are already on better technology.
And it's an issue as operators phase out 2G...
Back in the '00s, I made a J2ME mobile realtime multiplayer online strategy game, which included animations, music and sfx. It was 60KB, because I wanted to target as many devices as possible.
Later it grew to about 200KB.
Lots of space and performance optimizations.
I'll take the devil's advocate position on this tangent. Mobile data infrastructure is remarkably reliable and cost-effective in emerging markets.
Anecdotally, I live in Canada. We have some of the most expensive and restrictive mobile phone service in the world. 4G coverage is still spotty, even in major cities.
On a recent trip to Morocco it was really astounding to me how much cheaper, faster and more ubiquitous 4G service was.
Here is some actual data:
 Global pricing of 1GB of data:
 Mobile data speeds by country:
Interesting that a single screenshot of the game weighs more than the entire game.
Facebook's app size is 235MB. These are both basically just clients for a REST backend; it seems crazy at this size.
So I'd say the final image would be 60 kbytes. Can someone confirm?
It's good to look back and appreciate the fundamentals, since there was a lot of clever thinking in this we can learn from today. There are often times I'll come up with an overly hefty solution for a generic case, instead of thinking like Nintendo did: focus on making something cool to look at with a fun experience, but with a much narrower scope. NLP is one place I see this a lot - it's often very easy to make AWS happy churning through GBs of data, for something where 95% of the value to the end user could be reached with a 50KB file on S3. Dapps are another one - it helps a lot to think like a Nintendo dev in the '80s about which parts of your data are reusable.
It's an awful post; it would take less time to point out what is correct than to highlight all the mistakes.