
Design and Implementation of a 256-Core BrainFuck Computer [pdf] - bryanrasmussen
http://sigtbd.csail.mit.edu/pubs/veryconference-paper2.pdf
======
wybiral
> The BrainFuck computer is an attractive solution for servicing high
> throughput BrainFuck cloud services, both in terms of performance and cost.

BrainFuck as a Service?

------
zokier
> Considering each BrainFuck command on average takes 5 or more assembly
> instructions to implement, even assuming a perfect 1 instructions per second
> on a 3GHz processor, it would require almost one hundred cores to compete
> with this performance

I assume the authors meant "1 instruction per cycle" here, but even with that
amendment, isn't that pretty poor performance for a modern CPU? I was under
the impression that modern CPUs have peak performance way above 1 IPC,
although whether that is realizable with a BF interpreter is another question.
It would have been nice to see a comparison to some reasonably
high-performance BF compiler.
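
For reference, the arithmetic behind the quoted claim can be worked through in a few lines. This is only a rough sketch: the clock rate, the 5-instructions-per-command figure and the "almost one hundred cores" count are taken from the quote, while the function name and the IPC of 3 used for comparison are assumptions made for illustration.

```python
# Back-of-the-envelope check of the quoted claim.
CLOCK_HZ = 3e9            # 3 GHz processor, from the quote
IPC = 1.0                 # "perfect" 1 instruction per cycle, as amended above
INSTR_PER_BF_COMMAND = 5  # the paper's estimate, from the quote


def bf_commands_per_second(clock_hz, ipc, instr_per_command):
    """BF commands completed per second by one conventional core."""
    return clock_hz * ipc / instr_per_command


per_core = bf_commands_per_second(CLOCK_HZ, IPC, INSTR_PER_BF_COMMAND)
print(f"{per_core:.1e} BF commands/s per core")             # 6.0e+08

# Throughput that one hundred such cores would deliver in aggregate.
print(f"{100 * per_core:.1e} BF commands/s for 100 cores")  # 6.0e+10

# Hypothetical comparison: the same core sustaining an IPC of 3.
per_core_ipc3 = bf_commands_per_second(CLOCK_HZ, 3.0, INSTR_PER_BF_COMMAND)
print(f"{per_core_ipc3:.1e} BF commands/s per core at IPC 3")  # 1.8e+09
```

Under these assumptions, tripling the IPC would cut the number of conventional cores needed to about a third.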

~~~
DSingularity
Not way above, but yes. We use simultaneous multithreading and out-of-order
execution to enable superscalar execution, i.e. >1 instruction per cycle. But
it’s not likely to be more than 2-3 for a variety of reasons.

~~~
monocasa
Just adding that you can be superscalar, while not being out of order or
multithreaded.

The Xbox 360's cores fit that model, as well as the original Pentium. They'll
execute multiple instructions, but will serialize if there are dependencies
(or other constraints that are uarch specific).

------
elchief
I'm a simple man. I see an article about Brainfuck and I upvote it

------
notananthem
[https://en.wikipedia.org/wiki/Brainfuck](https://en.wikipedia.org/wiki/Brainfuck)

For reference. This would basically let you almost use BrainFuck. If you
wanted to.

------
gorpomon
Would a mind greater than I please weigh in on a potential path to using
BrainFuck to do some type of meaningful task (simple server, cli tool, etc)?

From what I can tell, the best bet is using it to write the source code for
another language and run that code, since most examples seem to just print
strings or increment values.

Is there a meaningful set of primitives one can incrementally build on the
core language to make usable code?

~~~
shakna
Thanks to BFBASIC there's a small digital jewel safe running brainfuck as its
core in a couple of casinos.

It was doable, and matched security expectations, but it feels like twenty
years ago.

The tooling isn't quite there, so you end up working around the compiler and
injecting hand-written BF, like we used to with assembly.

~~~
aeontech
That’s kind of amazing... could you share why? Just to prove it could be done?

~~~
shakna
Security requirements by the hotel. They wanted everything to be running
obfuscated code.

They gave me a choice of Malbolge or Brainfuck, neither of which I knew before
the contract.

~~~
dkersten
That seems like a ridiculous security requirement, but also like a rather
interesting project to work on! The ultimate security through obscurity.

~~~
shakna
It was certainly fun at the beginning, and I think they were aiming for crazy
levels of security, and obscurity sounded nice to someone.

But the tooling really isn't there.

Ended up using 'make' and 'm4' as preprocessors to work around things.

------
tomsmeding
Come on, 5 instructions per BF operation? They're on x86 right? Let's assume
your memory pointer is in ebx. (Substitute rbx for x64 code) (nasm syntax)

    
    
        +: inc byte [ebx]
        -: dec byte [ebx]
        >: inc ebx
        <: dec ebx
        [:   cmp byte [ebx], 0
             jz endlabel
           startlabel:
        ]:   cmp byte [ebx], 0
             jnz startlabel
           endlabel:
    

That's ignoring . and , which I expect do not occur very often. (If they did,
their compute-focused architecture wouldn't be a good choice anyway.)

This is more like 1.3 instructions per command. How did they get their "5"?
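
A quick way to check that ratio is to count instructions under the naive mapping above. The following is a minimal sketch, not a real compiler: the table mirrors the listing in this comment, and the names `NAIVE_X86` and `instructions_per_command` are made up for illustration.

```python
# Naive BF -> x86 (nasm-style) translation table, following the mapping above.
# Label names are placeholders; '.' and ',' are ignored, as in the comment.
NAIVE_X86 = {
    "+": ["inc byte [ebx]"],
    "-": ["dec byte [ebx]"],
    ">": ["inc ebx"],
    "<": ["dec ebx"],
    "[": ["cmp byte [ebx], 0", "jz endlabel"],
    "]": ["cmp byte [ebx], 0", "jnz startlabel"],
}


def instructions_per_command(program: str) -> float:
    """Average x86 instructions per BF command under the naive mapping."""
    commands = [c for c in program if c in NAIVE_X86]
    if not commands:
        raise ValueError("no countable BF commands in program")
    total = sum(len(NAIVE_X86[c]) for c in commands)
    return total / len(commands)


# One of each command: 8 instructions for 6 commands.
print(round(instructions_per_command("+-><[]"), 2))  # 1.33
# A program made only of brackets is the worst case for this mapping.
print(instructions_per_command("[[]]"))              # 2.0
```

The 1.3 figure corresponds to an even mix of the six commands; the actual ratio depends on the program, but with this table it cannot exceed 2.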

~~~
imtringued
Even basic optimizing compilers collapse consecutive + and - operations into a
single constant-size instruction and do a dozen other optimizations, so in
practice the instruction density is crazy high compared to naively executing
every BF operation individually. It could have been interesting if they used
an optimized ISA inspired by [1].

There's a reason why CPUs that can execute JVM bytecode directly never caught
on: They cannot apply any of the optimizations that a JIT or compiler can.

[1] [http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck....](http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html)
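
The run-collapsing step mentioned above can be sketched as follows. This is an illustrative pass only, not the code from [1]: the function name `collapse_runs` and the `("add", n)` / `("move", n)` tuple format are assumptions, and none of the other optimizations are attempted.

```python
def collapse_runs(program: str):
    """Fold runs of + / - and > / < into single (op, amount) pairs.

    Returns a list of tuples: ("add", n) for a net cell change, ("move", n)
    for a net pointer change, and (c, 1) for each of '[', ']', '.', ','.
    Any other character is treated as a BF comment and skipped.
    """
    ops = []
    for c in program:
        if c in "+-":
            kind, delta = "add", (1 if c == "+" else -1)
        elif c in "<>":
            kind, delta = "move", (1 if c == ">" else -1)
        elif c in "[].,":
            ops.append((c, 1))
            continue
        else:
            continue  # comment character
        if ops and ops[-1][0] == kind:
            total = ops[-1][1] + delta
            if total == 0:
                ops.pop()  # the run cancelled out, e.g. "+-"
            else:
                ops[-1] = (kind, total)
        else:
            ops.append((kind, delta))
    return ops


# Ten BF commands shrink to six operations.
print(collapse_runs("+++>>--[-]"))
# [('add', 3), ('move', 2), ('add', -2), ('[', 1), ('add', -1), (']', 1)]
```

Each `("add", n)` or `("move", n)` pair can then be emitted as one instruction, for example `add byte [ebx], n` or `add ebx, n` on x86, instead of n separate `inc`/`dec` instructions.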

~~~
tomsmeding
Of course you collapse sequences of BF instructions, but then you get an even
lower cpu-instructions / bf-command ratio. The paper gives 5/1, I claim it's
at MOST 1.3/1, and with a basic optimising compiler you can get far below that
of course.

------
peterwwillis
Tuition well spent.

------
evadne
See
[https://news.ycombinator.com/item?id=12714846](https://news.ycombinator.com/item?id=12714846)

------
monocasa
The architecture really reminds me of TIS-100.

~~~
yvdriess
Zach, if you're reading this, please make a Brainfuck expansion for TIS-100
[1] and Shenzhen I/O [2], your games are clearly not hard enough already ;)

[1] [http://www.zachtronics.com/tis-100/](http://www.zachtronics.com/tis-100/)
[2] [http://www.zachtronics.com/shenzhen-io/](http://www.zachtronics.com/shenzhen-io/)

------
inteleng
Can't believe they misspelled Virtex.

------
kruhft
Nice.

