
BOX-256: a tiny game about writing assembly code to pass the graphics tests - ingve
http://juhakiili.com/box256/
======
zokier
Love it, very Zachtronicesque. If the author is reading, here are a couple of
improvement ideas:

* Show the level number somewhere so it's easier to discuss with others

* Add some way to know how well I did on the level, either by having some leaderboards or rating system/target cycle count

* Allow pausing the execution when running

* Not sure if it would make sense to have a full 8-bit color palette instead of 4 bits. It probably does not matter too much to the game design

I think it might be a neat idea to implement this on real hardware with these
sort of RGB LED modules
[http://i.imgur.com/4YnDwWu.jpg](http://i.imgur.com/4YnDwWu.jpg) and some way
to input code, maybe a row of toggle switches Altair/PDP style or a small
keypad like on the KIM-1

~~~
tehbeard
8 bit palette means trying to match the colour against 256 possibilities
instead of 16.

~~~
zokier
That reminds me of one additional improvement suggestion: allow viewing the
color values of the target pixels somehow.

------
keely
Hi, I'm the author.

Looks like the game went a bit viral and my personal website bandwidth limits
were exceeded.

You can now play here instead: [http://bit.ly/1V1PiHt](http://bit.ly/1V1PiHt)

~~~
golergka
Could you build an OS X version too, please?

~~~
keely
I will try to build an OSX version tomorrow(ish), when I get my hands on an
OSX machine.

------
azeirah
Made a subreddit for this game:
[https://www.reddit.com/r/box256](https://www.reddit.com/r/box256)

------
tromp
Visiting that webpage causes my SUSE Firefox 45.0 browser to say "You need a
browser which supports WebGL to run this content. Try installing Firefox."

~~~
fpgaminer
Probably a driver issue. Firefox 45.0 supports WebGL, but it may be disabled
on your machine if it can't find a suitable OpenGL driver. For reference, the
site worked fine on my Ubuntu 15.10 machine, Firefox 45.0, Intel GPU.

------
impomatic
Four squares solved in 7 cycles, 16 threads and overlapping code :-)
[https://twitter.com/john_metcalf/status/716901521036861440/](https://twitter.com/john_metcalf/status/716901521036861440/)

------
kencausey
I looked at this a couple of times today before finally digging in and as I
completed each level I looked forward more and more to the next level. And
then there were no more levels, far sooner than I expected. I hope the author
adds more in time.

Edit: Oh and earlier today after having looked at it many times I bought
[http://www.lexaloffle.com](http://www.lexaloffle.com) 's combination of
Voxatron and Pico-8. I've spent some time today learning about Pico-8 (to which
I'm going to limit my attention for now) and I think anyone interested in the
linked game might be interested in these as well.

~~~
keely
There will be more levels in a few days.

~~~
kencausey
Great, thanks!

------
SilasX
Neat! Would like my own editor though. This doesn't have a lot of text editing
capabilities and is fixed size.

How well does this map to real x86 assembly programming?

~~~
zokier
Real x86 is very different: you have limited registers (especially on 32-bit
x86), a very weird variable-length instruction encoding, a gazillion different
instructions, and memory access that takes a significant amount of time. Of
course, having a near-infinite amount of memory compared to the 256 bytes of
BOX-256 changes the way you write your code too.

------
kencausey
This has moved to [http://box-256.com/](http://box-256.com/)

------
azeirah
Copying and pasting is not working for me :(

~~~
jsnell
The manual implies that's a limitation of the Unity WebGL player, but it works
in the standalone version.

~~~
azeirah
Oh I was confused for a moment, thought that only counted for the "external"
clipboard. Obviously not :(

------
amstocker
Very cool, reminds me a lot of TIS-100

------
johnlinvc
It's HNed, I got 509 Bandwidth Limit Exceeded. Looking forward to playing it.

------
OMGWTF
My current scores:

    Square:       0x07 cycles
    Checkerboard: 0x4C cycles
    4 Squares:    0x09 cycles

I will post my solutions in 24-48 hours.

~~~
ekimekim
Checkerboard: 0x24 cycles.
[http://imgur.com/C1UeEdn](http://imgur.com/C1UeEdn) (EDIT: 0x16, see below)

My key insight was twofold:

1. If you start every thread at 0x00 and make the first N commands "THR @00",
you get 2^(N+1)-1 threads running by the time your first thread exits the
chain of THR commands.

2. You can use a single array MOV instruction to set a range of program
counters, effectively causing every thread to jump to a set position.

So in the first 5 cycles I start 31 threads, then my main thread drops into a
loop where all it does is set every thread's position (including its own) to a
start point every 2 cycles. Every single thread is now a 2-cycle loop with no
JMP cycle spent to return to the start each time. Then I simply make 16 loops
of:

    PIX counter color
    ADD counter 1 counter

with color being one of two memory positions, which the main thread swaps out
each loop. I initialize counter such that each thread is doing its own row of
the output (0, 10, 20, etc). The leftover threads I don't actually want (but
it's hard to fine tune the thread count, so I left them in), so they share the
first thread's loop. The repeated actions are effectively a noop.
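
As a rough illustration of the row-per-thread loop above, here is a toy Python model. It is not the game's VM. The 16x16 screen size, the rule that even rows read one color slot and odd rows the other, the one-pixel cell size, and the names `WIDTH` and `draw_checkerboard` are all assumptions made for this sketch.

```python
# Toy model of the row-per-thread loop; hypothetical, not BOX-256's real VM.
WIDTH = 16  # assumed 16x16 screen, so row r starts at address r * 0x10


def draw_checkerboard(color_a=1, color_b=2):
    screen = [0] * (WIDTH * WIDTH)
    # One counter per row thread: 0x00, 0x10, 0x20, ...
    counters = [row * WIDTH for row in range(WIDTH)]
    # The two shared color slots that the main thread keeps swapping.
    colors = [color_a, color_b]
    for _ in range(WIDTH):  # one pass of every row thread per column
        for row in range(WIDTH):
            # PIX counter color (assumption: even rows use slot 0, odd rows slot 1)
            screen[counters[row]] = colors[row % 2]
            # ADD counter 1 counter
            counters[row] += 1
        # Main thread swaps the two color slots after each pass.
        colors[0], colors[1] = colors[1], colors[0]
    return screen
```

Under these assumptions each row thread writes one pixel per pass, so the whole picture is done after 16 passes regardless of how many rows there are.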

Note that, due to space constraints, I use the unused 4th byte of each PIX
instruction as the counter for that loop.

Doing this got me to 0x24 cycles. I shaved a further 4 cycles by using the
leftover space to add 4 more loops. These ones run vertically and operate on
the last two columns (two threads doing one column together). By the time the
row-moving threads have reached the 2nd last column, the column-moving threads
have finished the last two columns, and so the program is finished 4 cycles
earlier.

I think I could get another 2-cycle saving if I fine-tuned the number of
threads started, compressed everything down again and got another two column-
moving threads.

The bigger improvement is converting the whole thing to 1-cycle loops instead
of 2-cycle, and having half the loops incrementing the counters and the other
half doing PIX instructions. This might just be possible in the available
space if I've run the numbers correctly, and would result in a total time of
0x14.

EDIT: Did the latter: [http://imgur.com/zD1Gw5H](http://imgur.com/zD1Gw5H)

It came out to 0x16, as I failed to account for the extra time to spin up more
threads.

~~~
keely
Awwww this is great stuff. I would love to hear your thoughts on how the
language could be changed/improved to make puzzles and optimizing even more
fun.

~~~
ekimekim
The biggest thing for me is being able to save/load solutions. I ended up
keeping multiple browser tabs open as I tried to tweak solutions / attempt
different puzzles.

As for the language itself, a MOD operation to go with the DIV operation would
be useful. Bitwise AND / OR / NOT / XOR could be interesting.

Something that would make threads a lot more versatile would be some way to
have them behave differently from one another even when executing the same
code.

Perhaps a new instruction like JTI X Y: Jump if Thread Id (defined as 0xFF -
location of program counter) is equal to Y

Or much more versatile but harder to account for in opcodes: a new prefix #XX
which behaves like @(XX + thread id). So you could, for example, write 0x42 to
a per-thread slot in an array starting at 0xA0 by doing "MOV 042 #A0"
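
To make the `#XX` idea concrete, here is a minimal Python sketch of how such an operand might resolve to an address. It only restates the proposal: the function names, the `pc_slot` parameter (the address holding a thread's program counter) and the wrap to 8 bits are assumptions, not anything the game implements.

```python
def thread_id(pc_slot: int) -> int:
    """Thread id as proposed: 0xFF minus the address of the thread's program counter."""
    return 0xFF - pc_slot


def resolve_hash_operand(xx: int, pc_slot: int) -> int:
    """Address a hypothetical '#XX' operand would refer to, i.e. @(XX + thread id).

    Wrapping the sum to 8 bits is an assumption.
    """
    return (xx + thread_id(pc_slot)) & 0xFF


def mov_const_to_hash(memory: list, value: int, xx: int, pc_slot: int) -> None:
    """Model of 'MOV 0vv #XX': store a constant in this thread's slot of the array."""
    memory[resolve_hash_operand(xx, pc_slot)] = value
```

For example, three threads whose program counters sit at 0xFF, 0xFE and 0xFD would each run the same "MOV 042 #A0" and end up writing to 0xA0, 0xA1 and 0xA2 respectively.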

For making the puzzles interesting: The problem is most shapes and images can
simply be described in memory and subsequently brute-force PIXed in sequence.
After the simplest puzzles, the trick is having images that do not fit easily
in memory and require procedural description. Checkerboard is a good example
here. What about something like fizzbuzz but described in colors (like this:
[http://imgur.com/RMEaKAg](http://imgur.com/RMEaKAg))? Or tricky questions
where a seemingly random image actually has a simple pattern (like this:
[http://imgur.com/TbTPZnh](http://imgur.com/TbTPZnh) (spoilers here:
[http://imgur.com/73De5QF](http://imgur.com/73De5QF) ))? Unfortunately you're
very limited as there's no actual input; the program's exact output is always
entirely known.

Several people here have mentioned TIS-100: It's not the exact same thing as
what you're doing, but it could be an interesting thing to compare with if you
haven't played it yet.

EDIT: Just for fun, my best time for that second puzzle I posted is 0x103.

~~~
keely
Interesting thoughts. Thank you for the feedback.

Copy-pasting to the clipboard doesn't work in the WebGL build for browser
security reasons, but you can download a Windows build that has working copy
to clipboard. I'll make an OSX build later also. See the link here:
[http://bit.ly/1V1PiHt](http://bit.ly/1V1PiHt)

How to make threads operate differently on the same code is interesting. I
don't really want to do another prefix, because I want to keep the realism of
255 opcodes and I've already used 128. The permutations of a new prefix would
not fit anymore. I also like the JTI suggestion, but perhaps it doesn't go as
far as I would like.

Currently I'm thinking that maybe, in addition to the program counter, the
thread would have another slot where it can define a "memory offset". You
could state the offset when you start the thread, but it is also modifiable
later on, as it's in memory. You would call for example "THR @023 010 000",
which would essentially mean that for ALL operations for that thread, ALL
memory addresses get an offset of +010. I think this is very close to what you
suggested with threadID. I think the offset would live next to the program
counter (FE & FF). Do you think that would work?
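
A small Python sketch of what that per-thread offset might mean in practice. This is only a model of the idea as described here: the function names are made up, values are read as hex (so 010 is 0x10), and wrapping at 256 bytes is an assumption.

```python
def effective_address(addr: int, offset: int) -> int:
    """Address actually touched by a thread whose memory offset is `offset`.

    Wrapping to 8 bits is an assumption.
    """
    return (addr + offset) & 0xFF


def thread_write(memory: list, offset: int, addr: int, value: int) -> None:
    """Model of a write done by a thread: every address gets the thread's offset added."""
    memory[effective_address(addr, offset)] = value
```

With an offset of 0x10, a thread that writes to @A0 would actually write to 0xB0, while a thread with offset 0 running the same instruction would write to 0xA0.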

I've played TIS-100 and named this game similarly BOX-256 as a homage, since
TIS-100 pretty much inspired the whole thing. It's a wonderful game.

Great ideas on the new levels. I will shortly update with more levels and I'll
be sure to add your suggestions into the mix :)

~~~
ekimekim
You can't access localstorage or anything for persistence? Can't create a
popup with copy-pastable text? Any means of input or output from the game
short of screenshots and manual typing at all?

The "all operations offset" sounds tricky but interesting. It'd be problematic
in a typical usage scenario, but probably work well with my horrible "every
line of code runs in a 1-cycle loop on its own thread" style :P

~~~
keely
I will go for the tricks you mentioned for the input/output in the browser,
but it will take some time to implement.

I'll take a step back and think about the threading some more.

------
azeirah
The `thr` statement is not very useful at the moment.

It's impossible to pass parameters to a piece of code using threads :(

I have to duplicate my code in order to use the threads

~~~
chipsy
It's really ungainly to work with -- but it does drop the cycle counts to
solve the problems by running more threads.

~~~
ekimekim
What's really funny is when the program's trivially parallelizable (any of
them where the entire thing to draw fits in memory) and the main thing your
threads are doing is to start more threads, then finally all the threads do 2
draw instructions each.

------
doomrobo
FYI this gives a JavaScript error in Safari 9.1 private browsing mode.
Disabling private browsing fixes the problem.

------
danjayh
Wish it had a 'mod' instruction :(

------
achikin
Will there be a standalone version for Mac?

~~~
keely
Yes, in a few days hopefully

