
Memories – 256 bytes demo winner of Revision 2020 - guiambros
http://www.sizecoding.org/wiki/Memories
======
clan
I am always very impressed when I see these demos and how much can be done
with so little. If you are like me you just jumped to Youtube[1] to see it in
action.

When trying to get my significant other to understand what was happening, I
wanted to run it myself. I was amazed how simple that was!

- Install the assembler[2]

- Install dosbox[3]

- Get the source[4] and put it into c:\temp\demo\memories.asm

- Start nasm and enter:

        cd c:\temp\demo
        nasm.exe memories.asm -fbin -o memories.com

- Start dosbox and enter:

        mount d c:\temp\demo
        d:
        dir
        memories

- Press [ALT][ENTER] for fullscreen

The dosbox config is not optimized but it runs with sound with the default
settings!

For me this is somehow much more impressive than simply watching the video.

[1]
[https://www.youtube.com/watch?v=Imquk_3oFf4](https://www.youtube.com/watch?v=Imquk_3oFf4)

[2] [https://nasm.us/](https://nasm.us/)

[3] [https://www.dosbox.com/](https://www.dosbox.com/)

[4]
[http://www.sizecoding.org/wiki/Memories#Original_release_cod...](http://www.sizecoding.org/wiki/Memories#Original_release_code_.28complete.29)

~~~
jeffhuys
For macOS users (with brew):

    
    
      brew install dosbox nasm
      nasm memories.asm -fbin -o memories.com
      dosbox
      mount D ~/Development/memories (or whatever)
      D:
      memories
    

So, almost the same!

Hit Fn+Ctrl+F12 to speed it up (the demo is time-independent, so for smoother
animation, hit that combination quite a few times).

It didn't output any audio for me, but that's probably fixable.

~~~
kawzeg
Using `dosbox .` skips two steps by mounting the current directory (at least
on Linux, where I tried it).

------
MrGilbert
On a related note: 2019, "Dope on Wax" was 1st in the PC 64k section.

There is a breakdown of this demo on YouTube; it's roughly 2 hours long. They
explain how they made this demo. Really interesting to watch.

Demo:
[https://www.youtube.com/watch?v=QhqT0DhV9yE](https://www.youtube.com/watch?v=QhqT0DhV9yE)

Breakdown:
[https://www.youtube.com/watch?v=hFIyj5Yv440](https://www.youtube.com/watch?v=hFIyj5Yv440)

~~~
antirez
It's wonderful. More impressed with this being 64k than the originally posted
256 bytes.

~~~
TorKlingberg
PC64k is the main size constrained demo format, where people do seriously
impressive things. 256b is the masochists category, where doing anything at
all is hard. 4k intros are in between.

It's interesting that the demo scene is very Windows/DOS focused, unlike other
hacker scenes. Linux or Mac demos are basically not a thing. You're far more
likely to see C64 or Amiga demos.

~~~
antirez
Well I guess it makes sense after all, because in DOS you kinda have an "API",
for instance using the default interrupts you can select video modes, and the
video memory is mapped at a fixed offset and so forth. In Linux due to API
fragmentation it would be hard to agree on something that works in the future,
and even now more setup boilerplate code is likely needed.
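
The "fixed offset" point can be made concrete with a little address arithmetic. This is a minimal Python sketch, assuming the 320x200 one-byte-per-pixel mode with its framebuffer at segment 0xA000; the function names are invented for illustration and are not part of any DOS API.

```python
WIDTH, HEIGHT = 320, 200      # resolution of the 256-color mode assumed here
VIDEO_SEGMENT = 0xA000        # assumption: framebuffer segment for that mode


def linear_address(segment, offset):
    """Real-mode x86: physical address = segment * 16 + offset."""
    return (segment << 4) + offset


def pixel_offset(x, y):
    """Offset of pixel (x, y) inside the video segment, one byte per pixel."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel outside the 320x200 screen")
    return y * WIDTH + x


if __name__ == "__main__":
    # Start of video memory as a physical address.
    print(hex(linear_address(VIDEO_SEGMENT, 0)))
    # Offset of the bottom-right pixel.
    print(pixel_offset(319, 199))
```

Drawing a pixel then amounts to storing one byte at `VIDEO_SEGMENT:pixel_offset(x, y)`, with no library call in between.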

~~~
HellMood
Agreed. Just as a heads up regarding "future safety", the guys at NVIDIA - for
now - seem to keep the door open for DOS/BIOS with very high possible
resolutions
[https://www.pouet.net/prod.php?which=63522#c858522](https://www.pouet.net/prod.php?which=63522#c858522)
and even without the need for going the VESA way. Nothing I would really rely
on in a business case ;) but neat anyway. (Mode List for several current GPUs
:
[https://www.pouet.net/topic.php?which=11672&page=1](https://www.pouet.net/topic.php?which=11672&page=1)
)

------
yread
> In 320x200 mode, instead of constructing X and Y from the screen pointer DI
> with DIV, you can get a decent estimation with multiplying the screen
> pointer with 0xCCCD and read X and Y from the 8bit registers DH (+DL as
> 16bit value) and DL (+AH as 16bit value). The idea is to interpret DI as a
> kind of 16 bit float in the range [0,1], from start to end. Multiplying this
> number in [0,1] with 65536 / 320 = 204,8 results in the row before the
> comma, and again as a kind of a float, the column after the comma. The
> representation 0xCCCD is the nearest rounding of 204,8 * 256 ( = 52428,8 ~
> 52429 = 0xCCCD). As long as the 16 bit representations are used, there is no
> precision loss.

Řrřola's trick. A bit like
[https://en.wikipedia.org/wiki/Fast_inverse_square_root](https://en.wikipedia.org/wiki/Fast_inverse_square_root)

~~~
pjc50
This explanation was so confusing that I had to write a program to get it
clear in my head.
[https://gistpreview.github.io/?9b252f267cd1fdf9754059bb73a18...](https://gistpreview.github.io/?9b252f267cd1fdf9754059bb73a18487)

More clearly: DI = (y * 320) + x

Multiply by 0xCCCD => (y * 0x1000040) + (x * 0xcccd)

Taking the top byte is equivalent to dividing by 0x1000000, so that gives you Y.
The next lower (third) byte is then (x * 0xcccd / 0x10000) == (x * 52429 / 65536)
=~ (x * 256/320). And the lower two bytes are noise.
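
The same check can be run exhaustively. Below is a small Python sketch, assuming the multiply behaves like a 16x16-bit `mul` whose 32-bit result lands in DX:AX; the helper names are made up for this example.

```python
WIDTH, HEIGHT = 320, 200
MAGIC = 0xCCCD  # 52429, the nearest integer to 65536 * 256 / 320 = 52428.8


def approx_coords(di):
    """Mimic multiplying DI by 0xCCCD and return (DH, DL) of the product."""
    product = (di * MAGIC) & 0xFFFFFFFF   # 32-bit result, as in DX:AX
    dh = (product >> 24) & 0xFF           # top byte: the row
    dl = (product >> 16) & 0xFF           # next byte: the column scaled to 0..255
    return dh, dl


def check_all_pixels():
    """Count pixels where the trick disagrees with exact integer division."""
    mismatches = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            dh, dl = approx_coords(y * WIDTH + x)
            if dh != y or dl != (x * 256) // WIDTH:
                mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(check_all_pixels())
```

For the 64000 visible pixels, `check_all_pixels()` should return 0: the extra `y * 0x40` term and the rounding of 0xCCCD never add up to enough to carry into the next byte.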

~~~
HellMood
(author here) you're right (about confusing), i wasn't expecting more than a
few people to actually read this ;) at least i quickly repaired the
float/fixed thing.

------
guiambros
The source code really looks like black magic. It's incredible they were able
to cram the tunnel effect into 64 bytes [1].

The entire video is here [2], if you just want to watch the final product.

EDIT: link fixed.

[1]
[https://www.pouet.net/prod.php?which=85227](https://www.pouet.net/prod.php?which=85227)

[2]
[https://www.youtube.com/watch?v=Imquk_3oFf4](https://www.youtube.com/watch?v=Imquk_3oFf4)

~~~
lvturner
There's something beautiful about the fact that the video is FAR larger in
size than the program that initially generated the output. Almost worth
watching for that fact alone.

~~~
kubanczyk
Your comment's html source code (the entire <div>) is also larger with its 380
bytes.

~~~
Cthulhu_
It always stings when I make a website/app pulled through all the optimizers
and compression algorithms, and the content people fuck it all up by adding
10MB of images :/.

~~~
tiborsaas
You are building a passenger airliner optimized not to crash and burn; don't
worry about the cargo :)

------
d_silin
A couple of other famous short demos:

[https://www.youtube.com/watch?v=_YWMGuh15nE](https://www.youtube.com/watch?v=_YWMGuh15nE)
("Elevated", 4k)

[https://www.youtube.com/watch?v=fp0t2jCMGZE](https://www.youtube.com/watch?v=fp0t2jCMGZE)
("One of those days", 8k)

------
wgx
Not MS-DOS, but my favourite 256 byte demo is for the C64: "A Mind Is Born",
check it out:
[https://www.youtube.com/watch?v=sWblpsLZ-O8](https://www.youtube.com/watch?v=sWblpsLZ-O8)
that music is astonishing.

~~~
jcims
If I had experienced this coming out of my dad's humble little C64 back in the
'80s, I think I would have passed out. That music is incredible, especially
considering how concisely it is stored.

------
rsiqueira
There is a JavaScript implementation of this parallax checkerboards effect
with just 140 characters of code, including 3D animated perspective:
[https://www.dwitter.net/top/all](https://www.dwitter.net/top/all) In this
page you can also find an implementation of Pouet's tunnel effect.

~~~
poutrathor
I immediately thought of the dwitter crowd when seeing the video. I think I
recognized several patterns. It seems plausible that when the goal is the
smallest possible size, everyone ends up using the same classes of functions
to generate maximal impact with minimal bytes.

------
aaronbwebber
Can anyone describe at a high level for a complete noob how this kind of thing
works? Someone who is not going to be able to read a bunch of ASM and
interpret it? I'm guessing that it is something along the lines of:

- the graphics "driver" reads values out of certain registers (AL and AH?) at
a set interrupt (maybe every X clock cycles?) and writes one pixel to the
screen of whatever color those registers had in them

- by writing values into those registers and aligning the number of
operations the program does with the frequency of the interrupts, you can get
animation?

Even achieving any sort of flow control so you can switch between the effects
is mind-boggling to me.

~~~
pjc50
It gets much simpler when you realise that in the original PC there's no
"driver" in the way but bits of hardware are wired directly to various
processor buses.

This is sixteen-bit assembly, so you have the famous 640 KB of RAM available to
the user and a 64 KB block of memory beyond that (see "0xa000" in the program).
The graphics hardware is continuously rendering frames out of there at 320x200,
one pixel per byte, using the default system palette.

The rendering is rather like a pixel shader. There is a big for loop over all
the pixels, and at each point it computes a pixel value. First it decides
which frame number it is on (stored in BP register I think), then calls an
"effect" for that pixel.

It then jumps three pixels. This gives that nice "dissolve" transition between
effects.

Keyboard controller is wired directly to the bus, so you can read the keyboard
with a single instruction.

A MIDI controller is wired directly to address 0x330 (not standard equipment,
back in the day this required a Roland card or SoundBlaster 32?), so you can
just write MIDI to that.

There is a system timer interrupt configured for the music. The graphics
appear to run continuously; I can't see a link to the timer or vertical sync in
the graphics code.

~~~
HellMood
(author here) The "three pixel jump" is just for the looks, and it smooths
the animation for more calculation-heavy effects (e.g. the raycast tunnel). The
transition effect is not bound to this; it rather uses the "noise" (as you
described it) from the coordinate calculation to offset the time (described in
the writeup). The graphic output is linked to the timer via register BP, which
is modified in the interrupt routine.
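
A rough Python sketch of that "pixel shader" loop follows, for readers who don't speak assembly. It assumes the screen pointer is a 16-bit value that wraps around and advances by three per pixel; `example_effect` is a hypothetical stand-in pattern, not one of the demo's effects, and `t` plays the role of the timer-driven counter.

```python
WIDTH, HEIGHT = 320, 200
SEGMENT_SIZE = 1 << 16   # a 16-bit pointer wraps after 65536 bytes
STEP = 3                 # the "three pixel jump" discussed above


def example_effect(x, y, t):
    """Hypothetical stand-in effect: a scrolling XOR pattern."""
    return ((x ^ y) + t) & 0xFF


def render_pass(framebuffer, t):
    """Visit every offset once, stepping by STEP with 16-bit wraparound."""
    di = 0
    writes = 0
    while True:
        y, x = divmod(di, WIDTH)
        framebuffer[di] = example_effect(x, y, t)
        writes += 1
        di = (di + STEP) & (SEGMENT_SIZE - 1)
        if di == 0:          # pointer wrapped back to the start
            return writes


if __name__ == "__main__":
    screen = bytearray(SEGMENT_SIZE)
    for tick in range(3):    # stand-in for the counter changed by the timer
        render_pass(screen, tick)
```

Because 3 and 65536 share no common factor, stepping by three still reaches every offset before the pointer returns to zero; the pixels are simply written in an interleaved order rather than left to right.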

------
raverbashing
This is so cool

MS-DOS programming was overall a pain in the... byte but what I miss most
about it was the simplicity of graphics.

Wanna draw? Just write to memory. Setting a mode was one instruction

(Wanna play sound? Fumble with 2 levels of IRQ controllers one DMA controller
then sob uncontrollably. Or use Allegro. Wanna do multithreading? What's that?
)

------
nchelluri
Here's a size comparison with a "hello world" program in my favorite language,
Go.

    
    
      nchelluri@grugbarn:~/dev/hello $ cat > hello.go
      package main
    
      import "fmt"
    
      func main() {
          fmt.Println("hello world")
      }
      nchelluri@grugbarn:~/dev/hello $ go build
      nchelluri@grugbarn:~/dev/hello $ strip hello
      nchelluri@grugbarn:~/dev/hello $ ./hello 
      hello world
      nchelluri@grugbarn:~/dev/hello $ du -h
      1.4M .

------
ajxs
This is amazing!

Here's another awesome 256b demo that I love:
[http://www.pouet.net/prod.php?which=66372](http://www.pouet.net/prod.php?which=66372)

~~~
HellMood
That one is really amazing! I still don't understand how this didn't win the
"Meteoriks" award (my "hypnoteye" did
[https://www.pouet.net/awards.php#2015tiny-intro](https://www.pouet.net/awards.php#2015tiny-intro) )
Sadly, Baudsurfer has not been "around" for quite a while now ...

~~~
z0r
Memories is... going into the memory bank for greatest 256 byte demos for me,
right above the one you linked and immediate railways. Dang dude

------
rwmj
The best demo I've found, also 256 bytes, is Pyrit by Řrřola (Jan Kadlec, a
Czech developer). It's frankly incredible, something I wouldn't have believed
was possible:

[https://www.pouet.net/prod.php?which=78045](https://www.pouet.net/prod.php?which=78045)

I ported it to a boot sector so you can run it with a single (rather long!)
Linux command line in qemu:

[https://rwmj.wordpress.com/2019/12/08/pyrit-by-rrrola-incred...](https://rwmj.wordpress.com/2019/12/08/pyrit-by-rrrola-incredible-raytracing-demo-as-a-qemu-bootable-disk-image/)

The source code for Pyrit is worth reading too (see first link). It's very
clever and quite readable.

~~~
mateuszf
Video on youtube: [https://youtu.be/eYNoaVERfR4](https://youtu.be/eYNoaVERfR4)

------
haberman
How do you handle time with such small code size? I see a timer interrupt for
the music, but what about the animation? Is it dependent on the speed of the
underlying CPU?

~~~
pvg
It depends on the speed of the CPU - if you look at the archive at
[https://www.pouet.net/prod.php?which=85227](https://www.pouet.net/prod.php?which=85227),
you'll find a DOSBox config specifically for this demo. If you run it in
DOSBox you can fiddle with the emulation speed by pressing Ctrl-F11 and
Ctrl-F12 and you'll notice the speed of the animation change.

 _Later:_ Your question made me wonder what the performance of virtual 'target
CPU' is - the 'cycles' setting in the config is 20000 and there's a rough
estimate of what these numbers translate to here

[https://www.dosbox.com/wiki/Performance](https://www.dosbox.com/wiki/Performance)

So it looks like it's something along the lines of 'a 486 in the prime of its
life'.

~~~
HellMood
(author here) "you'll notice the speed of the animation change": that might be,
but the demo is designed to run at equal speed on all systems (it hooks into
the timer). If you experience animation speed differences, that means your
system cannot handle what DOSBox (on high cycles) demands. It should be noted
that DOSBox is far slower than people expect it to be, and also that in
actual demoscene competitions, real modern hardware is booted into
FreeDOS, but has no sound. So if you want sound (with MIDI) in a competition,
you have to stick to the rather slow DOSBox, and even optimize against an
emulator, which can be really, really weird.
([https://www.pouet.net/topic.php?which=11881](https://www.pouet.net/topic.php?which=11881))
I wouldn't claim the demo runs fine on a real 486, but a Pentium should do, as
a variation of the raycast tunnel part indicates
([https://www.youtube.com/watch?v=5_3CU6shKlY](https://www.youtube.com/watch?v=5_3CU6shKlY))
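
The difference between "runs at equal speed" and "runs at equal frame rate" can be shown with a toy simulation. This is only a sketch under the assumption that each frame reads a counter advanced by the timer (as described for register BP); the function and its parameters are hypothetical.

```python
def frames_shown(ticks_per_frame, total_ticks):
    """Return the animation time used for each frame a machine manages to draw."""
    shown = []
    tick = 0
    while tick < total_ticks:
        shown.append(tick)          # the effect reads the timer-driven counter
        tick += ticks_per_frame     # drawing this frame took that many ticks
    return shown


if __name__ == "__main__":
    fast = frames_shown(1, 100)     # one frame per timer tick
    slow = frames_shown(4, 100)     # four ticks to draw each frame
    print(len(fast), fast[-1])
    print(len(slow), slow[-1])
```

Both machines reach the same point in the animation after the same number of ticks; the slower one just draws fewer, choppier frames along the way.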

~~~
pvg
Interesting, thanks! I _thought_ I saw it get faster when I turned up the
cycles but maybe I'm misremembering or it's some other effect/artifact.

------
ddrdrck_
I didn't know about sizecoding.org. It seems to be a very valuable and
interesting resource explaining the "black art" of tiny demos, thanks! I have
not checked all the pages yet, but the "Memories" entry in particular is very
well written and explained.

------
HellMood
The final FreeDOS version is available. It includes the Amiga Ball as an extra
effect. The filesize is still 256 bytes.

[https://www.youtube.com/watch?v=wlW84fEHngM](https://www.youtube.com/watch?v=wlW84fEHngM)

------
imtringued
Since these are so small I don't see why we couldn't have a "demoscene
launcher" with a "mailto:" style protocol handler and just let people click on
base64 encoded links to start the demo.
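
For a sense of scale, here is a minimal Python sketch of what such a link could look like. The `demo:` scheme name is purely hypothetical (no such protocol handler exists as far as this thread establishes), and the payload is placeholder bytes rather than a real executable.

```python
import base64

SCHEME = "demo:"  # hypothetical scheme name, not a registered protocol


def make_link(binary):
    """Pack a tiny executable into a URL-safe base64 link."""
    return SCHEME + base64.urlsafe_b64encode(binary).decode("ascii")


def parse_link(link):
    """Recover the bytes; a real handler should only run them in a sandbox."""
    if not link.startswith(SCHEME):
        raise ValueError("not a demo link")
    return base64.urlsafe_b64decode(link[len(SCHEME):])


if __name__ == "__main__":
    payload = bytes(range(256))          # placeholder for a 256-byte .com file
    link = make_link(payload)
    print(len(link) - len(SCHEME))       # length of the base64 part
```

A 256-byte program encodes to 344 base64 characters, so the whole demo fits comfortably in a single link.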

~~~
7777fps
A handler for executing arbitrary code. What could possibly go wrong?

~~~
KMnO4
We already allow arbitrary code to execute by clicking a link, in the form of
JavaScript.

You may argue that JS is sandboxed, but so is DOSBox. At least DOSBox can’t
easily connect to remote servers over the internet.

~~~
7777fps
Correct me if I'm wrong, I haven't used DOSBOX for a decade but doesn't it
have the ability to access hard drives and mount them?

Given that, it's not much of a sandbox.

Or does that require intervention from the host system rather than auto-
mounting home and similar?

~~~
HellMood
I would clearly prefer a web browser with a dosbox to the "real" dosbox when
it comes to safety...

~~~
7777fps
Then I agree completely, and must have misunderstood what was proposed by a
handler. Typically a handler will launch an external application such as
mailto, ftp, magnet, etc.

If we want to run code in browser there is WASM.

So is the proposal that it would be beneficial to have a DOS-like OS or x86
emulator in WASM for running COM files?

Yes, that would be better and more sandboxed than dosbox running outside the
browser.

------
ChrisArchitect
Fun to watch this. I like seeing the odd demo pop up on HN once in a while, and
that code/techniques breakdown is incredible. Back in the day we rarely had
that kind of insight into the mastery that went into a demo unless we really
got into a discussion with the creator about the code.

Another reminder of the 'art' of the demoscene and its recent acceptance onto
Finland's national list of intangible cultural heritage, which I thought was
pretty cool:
[http://demoscene-the-art-of-coding.net/2020/04/15/breakthrou...](http://demoscene-the-art-of-coding.net/2020/04/15/breakthrough-finland-accepts-demoscene-on-their-national-list-of-intangible-cultural-heritage-of-humanity/)
(HN discussion:
[https://news.ycombinator.com/item?id=22876961](https://news.ycombinator.com/item?id=22876961))

------
smabie
I've always thought the demo scene looked cool. Problem is, I don't really
care about graphics and sound and am not an especially creative person. Are
there competitions that are purely objective? As in, the judging criteria are
quantitative?

~~~
encom
The International Obfuscated C Code Contest

[https://www.ioccc.org/](https://www.ioccc.org/)

~~~
smabie
But people vote right? What quantitative metric could be possibly used to
determine the most obfuscated C code?

------
mellow2020
note the "type" dropdown at the top, and enjoy.

[https://www.pouet.net/toplist.php?type=32b&platform=&limit=5...](https://www.pouet.net/toplist.php?type=32b&platform=&limit=50&days=0)

------
azuwey
Wow, this is pretty cool. I didn't know this site, thanks.

------
chrisweekly
That is pretty amazing.

------
starpilot
Pretty leet.

------
1cvmask
The site has a login and lacks SSL. Hopefully this will be remedied soon.

------
spiritplumber
256 bytes is in the "let's try every combination" range, I think. So, write a
program that tries all of them and determines if any do something interesting
enough to forward to a human for review.

~~~
LocalH
Wouldn’t that be 256^256? That’s certainly a hell of a haystack

~~~
exikyut
Unreasonably impossible as of yet, yes.

For handy reference:

8^8: 16,777,216

8^16: 281,474,976,710,656

8^32: 79,228,162,514,264,337,593,543,950,336

8^64:
6,277,101,735,386,680,763,835,789,423,207,666,416,102,355,444,464,034,512,896

8^128: ...

8^256: ?

8^512: hello from the other side of the quantum dimension

8^8 sounds interesting. 16 million reboots of a real
{PC,C64,ST,Amiga,Mac,Z80,...} sounds like a collectively highly entertaining
kind of hilarious. The issues only begin when you start wondering if any of
the programs wedges the hardware into "interesting" states that are preserved
across reboots - or at least the _what if_ of that dimension of entropy...
then the problem space becomes 8^8^8...

[I decided to compute 8^8^8. The result is apparently 15 million digits long.
(`echo 8^8^8 | bc -ql | wc` -> `222814 222814 15596963`)]

~~~
zokier
That's a weird "reference". Why 8 as the base number? Nobody works with 3-bit
bytes. 8^8 == 2^24 == 2^(8 * 3) == 256^3, i.e. the combination space of _three
bytes_.

The smallest category in pouet for reference is 32b (or 256 bits), so 2^256
combinations to brute force. For comparison usually 128bit encryption is
considered "safe" and infeasible to brute force.

You might be able to constrain the search space to only valid IA32
instructions, but realistically I don't see it helping that much
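
The sizes being thrown around here are easy to check with Python's arbitrary-precision integers. A short sketch (the helper name is made up for this example):

```python
import math


def decimal_digits_of_power(base, exponent):
    """Number of decimal digits of base**exponent, without building the number."""
    return math.floor(exponent * math.log10(base)) + 1


if __name__ == "__main__":
    assert 8 ** 8 == 2 ** 24 == 256 ** 3        # three bytes, as noted above
    print(len(str(256 ** 256)))                  # digits in the 256-byte space
    print(len(str(2 ** 256)))                    # digits in the 32-byte space
    print(decimal_digits_of_power(8, 8 ** 8))    # digits of 8^8^8
```

The 256-byte search space (256^256 = 2^2048) has 617 decimal digits, against 78 digits for the 32-byte category. The `wc` character count quoted upthread for 8^8^8 also includes the backslash and newline that `bc` adds to each wrapped output line, so the number itself has somewhat fewer digits than that count (15,151,336).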

~~~
vidarh
You could further constrain it to exclude a lot of instructions and
instruction pairs that make no sense given the context. E.g. any instruction
pair where the second one makes the first one redundant, such as the second
instruction clobbering the same register the first one modified. Or a "ret" in
the first few instructions...

It'd probably not constrain the search space nearly enough though.

But even if it did and you'd somehow manage to even generate every
combination, you'd still face the second problem of how to evaluate if they do
something "interesting enough" to be worthwhile reviewing.

------
iszomer
Tiny binaries probably relied heavily on the native OS's system libraries.

~~~
GuB-42
A common complaint on tiny demos. Here the OS is only used for setting
graphics mode and setting up a timer. Plus all the boot code of course. Not
much really, with 512 bytes you can probably do it on the bare metal, if
someone didn't already.

There is even more debate in 4k. After all, most rely on graphics drivers that
take hundreds of megabytes. But the thing to understand is that in any case,
the intro ships all the code that produces the sound and image. The OS is just
an abstraction layer. The exception would be fonts and MIDI instruments, that
can be stored in the hardware or OS.

But not all intros have text, "Memories" doesn't. And many intros do their own
sound synthesis, though in PC 256 bytes you are usually limited to MIDI or to
that horrible buzzer.

~~~
HellMood
(author here) Not quite, a 256 bytes PC intro CAN have decent non MIDI music
as showcased here
[https://www.pouet.net/prod.php?which=79281](https://www.pouet.net/prod.php?which=79281)
(won the "outstanding technical achievement" award) I did some intros in 32
bytes and 16 bytes having that "horrible buzzer", looks like some ASCII
effects and a Dutch gabber bassline is the maximum you can get in this
category :D (
[https://www.pouet.net/prod.php?which=76093](https://www.pouet.net/prod.php?which=76093))

~~~
GuB-42
That's why I said "usually", the moment someone says something is impossible,
someone does it ;)

Anyways, great job. I was there during the compo, it was epic, with everyone
double checking the executable size, even the old guys who have seen it all.
You got my vote BTW.

~~~
HellMood
#truckbreaker ;) yes, the overall reception was overwhelming, i didn't expect
that. i completely agree with your post before, just wanted to point to
"ikubun" =)

------
londons_explore
Normally these demos are filled with all kinds of 'tricks' to make things
smaller.

Things like self modifying code, using bits of the bios or video ROM in ways
they weren't intended by jumping into the middle of them, saving space by
using code as data or vice versa, tiny packers which compress or uncompress
the code, massive pregenerated buffers to do runtime lookups to generate data
in one order but use it in another, etc.

This seems very vanilla in comparison...

~~~
IvyMike
Note your comment about how unimpressed you are is 214 bytes longer than the
demo.

~~~
nullc
Obviously he didn't do any 'tricks' to make it smaller.

~~~
tripzilch
There are quite a few 'tricks' in the small bits of asm in the article.

Also, the tiny unpackers are generally used from 4096b and upwards. The size
of the unpacker takes too much space and doesn't make up for the compression
ratio at 256b.

