Design and Implementation of a 256-Core BrainFuck Computer [pdf] (mit.edu)
108 points by bryanrasmussen 9 months ago | 33 comments

> The BrainFuck computer is an attractive solution for servicing high throughput BrainFuck cloud services, both in terms of performance and cost.

BrainFuck as a Service?

> Considering each BrainFuck command on average takes 5 or more assembly instructions to implement, even assuming a perfect 1 instructions per second on a 3GHz processor, it would require almost one hundred cores to compete with this performance

I assume the authors intended to mean "1 instruction per cycle" here, but even with that amendment isn't that pretty poor performance for a modern CPU? I was under the impression that modern CPUs have peak performance way above 1 IPC, although whether that is realizable with a BF interpreter is another question. It would have been nice to see a comparison to some reasonably high-performance BF compiler.
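For a rough sense of the baseline such a comparison would start from, a naive interpreter is only a few dozen lines. Here's a minimal Python sketch (mine, not from the paper): one dispatch per command, brackets pre-matched so `[` and `]` are O(1) jumps, no optimizations at all.

```python
def run_bf(code, inp=b""):
    """Naive Brainfuck interpreter: one dispatch per command, no optimizations."""
    tape = [0] * 30000
    out = bytearray()
    # Pre-match brackets so [ and ] jump in O(1).
    jump, stack = {}, []
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jump[i], jump[j] = j, i
    pc = ptr = ip = 0
    while pc < len(code):
        c = code[pc]
        if c == '+': tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '>': ptr += 1
        elif c == '<': ptr -= 1
        elif c == '.': out.append(tape[ptr])
        elif c == ',':
            tape[ptr] = inp[ip] if ip < len(inp) else 0
            ip += 1
        elif c == '[' and tape[ptr] == 0: pc = jump[pc]
        elif c == ']' and tape[ptr] != 0: pc = jump[pc]
        pc += 1
    return bytes(out)
```

Every real BF implementation I've seen beats this handily, which is why the paper's 5-instructions-per-command figure deserves scrutiny.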

Not way above, but yes. We use simultaneous multithreading and out-of-order execution to enable superscalar execution, i.e. >1 instruction per cycle. But it's not likely to be more than 2-3 for a variety of reasons.

Just adding that you can be superscalar, while not being out of order or multithreaded.

The Xbox 360's cores fit that model, as well as the original Pentium. They'll execute multiple instructions, but will serialize if there are dependencies (or other constraints that are uarch specific).

Regardless of the IPC level, 5 CISC instructions per BF instruction sounds exceptionally low; it's something that a 200-byte naive compiler would produce.

An optimizing compiler capable of generating the full gamut of x86 instructions would increase throughput by at least one order of magnitude: the majority of BF code is boilerplate, structures that set up loops, move data around, repeated increments that generate constants, etc. Being a very simple language with limited and well-defined side effects, it lends itself exceptionally well to reordering.

Because of a number of factors, mostly to do with memory latency, modern superscalar OoO cores achieve ~2, sometimes 3 IPC, so not "way above".

I'm a simple man. I see an article about Brainfuck and I upvote it


For reference. This would basically let you almost use BrainFuck. If you wanted to.

Would a mind greater than I please weigh in on a potential path to using BrainFuck to do some type of meaningful task (simple server, cli tool, etc)?

From what I can tell, the best bet is using it to write the source code for another language and run that code, since most examples seem to just print strings or increment values.

Is there a meaningful set of primitives one can incrementally build on the core language to make usable code?

Your best bet is to transpile a language you’re more productive in to it. See https://esolangs.org/wiki/Brainfuck_code_generation
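As a toy illustration of that code-generation idea (this is my own sketch, not one of the tools on that page): anything that can count can emit BF. Here's a Python function that compiles a string literal into BF that prints it, using a single cell and adjusting it by the delta between consecutive characters.

```python
def emit_print(s):
    """Compile a string into Brainfuck that prints it, using one cell.
    Naive: adjusts the cell by the delta between consecutive characters."""
    bf, cur = [], 0
    for ch in s:
        d = ord(ch) - cur
        bf.append(('+' if d > 0 else '-') * abs(d) + '.')
        cur = ord(ch)
    return ''.join(bf)
```

The real generators linked above go much further (cell allocation, arithmetic, control flow), but the principle is the same: you never write the BF by hand.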

Thanks to BFBASIC there's a small digital jewel safe running brainfuck as its core in a couple of casinos.

It was doable, and matched security expectations, but it feels like twenty years ago.

The tooling isn't quite there, so you end up working around the compiler and injecting hand-written BF, like we used to with assembly.

That’s kind of amazing... could you share why? Just to prove it could be done?

Security requirements by the hotel. They wanted everything to be running obfuscated code.

They gave me a choice of Malboge or Brainfuck, neither of which I knew before the contract.

That seems like a ridiculous security requirement, but also like a rather interesting project to work on! The ultimate security through obscurity.

It was certainly fun at the beginning, and I think they were aiming for crazy levels of security, and obscurity sounded nice to someone.

But the tooling really isn't there.

Ended up using 'make' and 'm4' as preprocessors to work around things.


Did they give you access to a super computer?

This is a joke right? That's a crazy requirement

Not a joke. They weren't a great client. I think some of the managers had watched too many heist movies or something.

I offered an equally crazy budget because of the time and effort for something this crazy, and they said yes on the spot.

I think the likely path would be a compiler that compiles some other language to BrainFuck. As BrainFuck is Turing-complete, such a compiler can be constructed.

An example of a compiler for an even more restricted environment might be


Edit: in the same vein there's also


Further edit: The user Someone posted a link in this thread to several existing compilers that already target Brainfuck (!). That's awesome.

There is also some ability to explore optimization strategies. http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.... As it’s such a limited language, it is something that may be more accessible (compare optimizing MIPS).

It would be an interesting exercise to design a lambda calculus evaluation strategy and then a compiler to convert an arbitrary lambda expression into this dialect of concurrent brainfuck. This would let you access the massive parallelism already apparent in lambda calculus expressions 'for free'.

The issue of course is finding an evaluation strategy that doesn't explode the memory complexity as well.

Or skip the brainfuck and start from the simplest possible lambda calculus machine



I read a while back that it's a good language for generating code using genetic algorithms: http://lapinozz.github.io/project/2016/03/27/generating-brai...
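The core of that approach is the fitness function: run a candidate program with a step cap (so non-terminating mutants still get scored) and measure how far its output is from the target. A minimal sketch, with names and the scoring scheme being my own assumptions rather than anything from the linked post:

```python
def bf_fitness(code, target, max_steps=10000):
    """Score a candidate BF program against a target string: 0 is perfect.
    Runs with a step cap so non-terminating candidates still get scored."""
    tape, out = [0] * 1000, []
    jump, stack = {}, []
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            if not stack:
                return float('inf')  # unbalanced brackets: cull immediately
            j = stack.pop()
            jump[i], jump[j] = j, i
    if stack:
        return float('inf')
    pc = ptr = steps = 0
    while pc < len(code) and steps < max_steps:
        c = code[pc]
        steps += 1
        if c == '+': tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '>': ptr = (ptr + 1) % len(tape)
        elif c == '<': ptr = (ptr - 1) % len(tape)
        elif c == '.': out.append(tape[ptr])
        elif c == '[' and tape[ptr] == 0: pc = jump[pc]
        elif c == ']' and tape[ptr] != 0: pc = jump[pc]
        pc += 1
    # Distance: per-byte difference plus a penalty for length mismatch.
    score = sum(abs(a - ord(b)) for a, b in zip(out, target))
    return score + 256 * abs(len(out) - len(target))
```

The GA then just mutates/crosses over strings of BF commands and keeps the lowest-scoring ones; BF's eight-symbol alphabet is what makes random mutation viable at all.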

Come on, 5 instructions per BF operation? They're on x86 right? Let's assume your memory pointer is in ebx. (Substitute rbx for x64 code) (nasm syntax)

    +: inc byte [ebx]
    -: dec byte [ebx]
    >: inc ebx
    <: dec ebx
    [:   cmp byte [ebx], 0
         jz endlabel
    ]:   cmp byte [ebx], 0
         jnz startlabel
That's ignoring . and , which I expect do not occur very often. (If they did, their compute-focused architecture wouldn't be a good choice anyway.)

This is more like 1.3 instructions per command. How did they get their "5"?
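The parent's arithmetic checks out if you tabulate it. Using the instruction costs of the naive mapping above (one instruction each for `+ - > <`, a cmp plus a conditional jump for `[` and `]`), and ignoring `.` and `,` as the parent does:

```python
# Instruction cost of the naive x86 mapping from the comment above:
# +,-,>,< are one instruction each; [ and ] are cmp + jcc, i.e. two each.
COST = {'+': 1, '-': 1, '>': 1, '<': 1, '[': 2, ']': 2}

def x86_instructions(bf):
    """Count x86 instructions a naive per-command compiler would emit."""
    return sum(COST[c] for c in bf if c in COST)

# Averaged over the six compute commands: (4*1 + 2*2) / 6 = 8/6 ~ 1.33
ratio = x86_instructions("+-><[]") / 6
```

Real programs skew even lower since `+ - > <` dominate typical BF code, so 1.3 is closer to a worst case for this mapping than an average.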

Even basic optimizing compilers collapse consecutive + and - operations into a single constant-size instruction and do a dozen other optimizations, so in practice the instruction density is crazy high compared to naively executing every BF operation individually. It could have been interesting if they used an optimized ISA inspired by [1].
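That collapsing step is about ten lines. A minimal sketch (mine, not the scheme from [1]): fold runs of `+ - > <` into (op, count) pairs, each of which can become a single add/sub instruction, while loop brackets and I/O stay one-per-command.

```python
from itertools import groupby

def collapse_runs(bf):
    """Fold runs of +/-/>/< into (op, count) pairs; each pair can become
    a single add/sub instruction, as a basic optimizing BF compiler would."""
    ops = []
    for c, grp in groupby(ch for ch in bf if ch in "+-><[].,"):
        n = len(list(grp))
        if c in "+-><":
            ops.append((c, n))       # one fused instruction per run
        else:
            ops.extend([(c, 1)] * n)  # brackets and I/O stay one-per-command
    return ops
```

On constant-heavy BF (long `+++...` runs) this alone cuts the emitted instruction count by an order of magnitude, which is the parent's point about density.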

There's a reason why CPUs that can execute JVM bytecode directly never caught on: They cannot apply any of the optimizations that a JIT or compiler can.

[1] http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck....

Of course you collapse sequences of BF instructions, but then you get an even lower cpu-instructions / bf-command ratio. The paper gives 5/1, I claim it's at MOST 1.3/1, and with a basic optimising compiler you can get far below that of course.

Tuition well spent.

The architecture really reminds me of TIS-100.

Zach, if you're reading this, please make a Brainfuck expansion for TIS-100 [1] and Shenzhen I/O [2], your games are clearly not hard enough already ;)

[1] http://www.zachtronics.com/tis-100/ [2] http://www.zachtronics.com/shenzhen-io/

Can't believe they misspelled Virtex.

