"Well, that took up most of the free time I had this morning before work. It was just too good to stop reading lol. :)
(SPOILER ALERT: STOP READING IF YOU DON’T LIKE SPOILERS)
The story shows what people typically do if there’s a Karger/Thompson attack. They freak out in a big way. The attack is beyond simple to counter if, like them, you can trust an assembler and linker. Just write an interpreter for a simple subset of C in easily-parsed LISP expressions or Tcl style. Hand-code whatever component you need, a backend or a whole compiler, in that subset. Use it to do the first compile. Optionally, combine that with ancient source, working your way up through the versions without ever bringing in the infected one. If one wants a whole system, then Moore’s Forth, Hansen’s Edison, and Wirth’s Oberon (best) are available. If a CPU, my current suggestion is NAND2Tetris, with the resulting knowledge used to implement a tiny CPU on an open cell library (they exist) that’s hand-checked. Run a simulated version of that on diverse or ancient hardware if you can’t fab it.
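To make the interpreter idea concrete, here is a minimal sketch. It is in Python only for brevity; for a real bootstrap you'd hand-write the equivalent in assembly or whatever else you already trust. The mini-language, its keywords (`set`, `if`, `while`, `begin`), and every function name are made up for illustration and are not taken from any existing project.

```python
def tokenize(src):
    # Parentheses are the only punctuation, so splitting on whitespace is enough.
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    # Recursive descent: a form is either an atom or a parenthesized list of forms.
    tok = tokens.pop(0)
    if tok == "(":
        form = []
        while tokens[0] != ")":
            form.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return form
    try:
        return int(tok)
    except ValueError:
        return tok  # a variable or operator name

# Binary operators; comparisons return 0 or 1 as in C.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: int(a < b),
    "==": lambda a, b: int(a == b),
}

def evaluate(form, env):
    if isinstance(form, int):
        return form
    if isinstance(form, str):
        return env[form]
    head, *args = form
    if head == "set":      # (set name expr)
        env[args[0]] = evaluate(args[1], env)
        return env[args[0]]
    if head == "if":       # (if cond then else)
        branch = args[1] if evaluate(args[0], env) else args[2]
        return evaluate(branch, env)
    if head == "while":    # (while cond body)
        while evaluate(args[0], env):
            evaluate(args[1], env)
        return 0
    if head == "begin":    # (begin form...), value of the last form
        result = 0
        for sub in args:
            result = evaluate(sub, env)
        return result
    return OPS[head](evaluate(args[0], env), evaluate(args[1], env))

def run(src):
    return evaluate(parse(tokenize(src)), {})

FACTORIAL = """
(begin
  (set n 5)
  (set acc 1)
  (while (< 0 n)
    (begin (set acc (* acc n))
           (set n (- n 1))))
  acc)
"""

print(run(FACTORIAL))  # 120
```

The point is the size: the parser is about a dozen lines because the syntax carries no precedence or statement grammar, which is what makes this kind of thing feasible to hand-check.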
rain1 and I are collecting all the stuff needed to counter these attacks or just enjoy ground-up building of tools here:
The other thing I noticed is them jumping straight to machines. Occam’s Razor should’ve immediately brought them to the idea that a person or group made it for any number of common reasons: a challenge with the high of pulling it off unnoticed, a test of an operational capability to be weaponized later, or an epic trolling operation. I’d think the last one the second I got that letter, like “probably was these assholes sending the letter trying to mess up our heads after they messed up the compiler.” Matter of fact, the whole thing would just take… aside from the tricky work on the compiler… an unpatched vulnerability in the repo with the compiler source. All this bullshit follows from one person doing one smart thing followed by one system hacked. That’s it. It’s why SCM Security 101 says one must have access controls, integrity protections, and modification logs (esp. append-only storage). Paul Karger also endlessly pushed for high-assurance, secure kernels underneath everything to stop both subversion and traditional vulnerabilities. That goes for anything in the TCB, or clever attackers will run circles around clueless defenders.
So, those are my observations from the perspective of someone who works in this area countering these kinds of things. It was still an extremely fun read even as I noticed these things while reading. Wasn’t going to let my mind be petty when the author(s) were doing so well. :)"
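For the modification-log point, one common way to make a log tamper-evident is to chain entries by hash, so that editing an old entry breaks every digest after it. A rough sketch, assuming SHA-256 and a plain in-memory list (the names `append`, `verify`, and `GENESIS` are hypothetical, and this is not how any particular SCM does it):

```python
import hashlib

GENESIS = "0" * 64  # placeholder digest that precedes the first entry

def entry_digest(prev_digest, message):
    # Each digest covers the previous digest plus this entry's message.
    data = (prev_digest + "\n" + message).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def append(log, message):
    prev = log[-1][1] if log else GENESIS
    log.append((message, entry_digest(prev, message)))

def verify(log):
    # Recompute the chain from the start; any edited entry breaks it.
    prev = GENESIS
    for message, digest in log:
        if entry_digest(prev, message) != digest:
            return False
        prev = digest
    return True

log = []
append(log, "commit 1: add parser")
append(log, "commit 2: fix codegen")
print(verify(log))  # True

# Rewrite history without recomputing the chain.
log[0] = ("commit 1: add parser and a backdoor", log[0][1])
print(verify(log))  # False
```

This only detects tampering if the newest digest is also kept somewhere the attacker can't reach (write-once media, a second machine, paper), since anyone who can rewrite the whole log can recompute the whole chain. That's where the append-only storage comes in.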
Pros. Small, cleanly written, safe, and runs through multiple compilers. The last one is especially useful if using David A. Wheeler's technique of diverse double-compiling.
Cons. It says no files or macros. We can probably tolerate no macros. What does no files mean? It can't support multiple files/modules, has no file I/O... what? Depending on the meaning, it might be something easy to work around or not.
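On the Wheeler point in the pros: the core check in diverse double-compiling (DDC) is short enough to sketch. Build the compiler's source with an independent trusted compiler, rebuild the same source with that result, and compare the output bit-for-bit against the suspect binary. The sketch below assumes deterministic builds and a suspect binary that claims to be the compiler compiled by itself. `run_compiler` is a hypothetical callable standing in for "execute this compiler binary on this source", and the toy model under it is invented purely so the example runs; it is not a real toolchain.

```python
import hashlib

def ddc_check(run_compiler, trusted_binary, suspect_binary, compiler_source):
    """Return True if the suspect binary matches a rebuild from its source."""
    # Stage 1: build the compiler's source with the independent, trusted compiler.
    stage1 = run_compiler(trusted_binary, compiler_source)
    # Stage 2: rebuild the same source using the stage-1 result.
    stage2 = run_compiler(stage1, compiler_source)
    # With deterministic builds, an honest self-compiled binary is bit-identical.
    return stage2 == suspect_binary

# --- Toy model (invented for illustration only) ---
BACKDOOR = b"+backdoor"

def toy_run_compiler(binary, source):
    # An honest compiler's output depends only on the source it is given.
    output = b"BIN:" + hashlib.sha256(source).hexdigest().encode("ascii")
    # A trojaned compiler re-inserts its payload into whatever it compiles.
    if BACKDOOR in binary:
        output += BACKDOOR
    return output

SOURCE = b"(compiler source text)"
TRUSTED = b"some-other-compiler"
CLEAN = toy_run_compiler(TRUSTED, SOURCE)
TROJANED = CLEAN + BACKDOOR

print(ddc_check(toy_run_compiler, TRUSTED, CLEAN, SOURCE))     # True
print(ddc_check(toy_run_compiler, TRUSTED, TROJANED, SOURCE))  # False
```

In the toy, the trojaned binary reproduces itself exactly when it compiles its own clean source, which is why source inspection alone never catches it. The trusted compiler doesn't need to be good or fast, only unlikely to carry the same trojan, which is why a small compiler that runs under several hosts is handy here.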
EDIT: I like that Ghuloum's paper was an inspiration for this, as it's one of the main links I push on the topic if one wants to use Scheme. There are a few repos in progress, with I think at least one that has finished what's in that paper. I've sent messages to a few hoping they'll publish it in a readable form or write a tutorial. Time will tell... Also note I wrote my reply in response to the overview at the top. His related work section looks equally interesting but will take some time to go through.
EDIT 2: "This is what Darius Bacon did with "ichbins"." I'm guessing you're that guy. Good job maybe being part of the inspiration for this work. :)
My favorite things about Ur-Scheme are that the code is clean, as you say, and he wrote up the lessons learned at some length. No files: I think he meant not yet implementing primitives like open-input-file. Macros you can get by without: I used to use an R4RS Scheme system of my own without them, except sometimes I'd call on a dumb defmacro expander as a preprocessor.
Kragen also wrote https://github.com/kragen/stoneknifeforth which I haven't read as much of.
FWIW I also wrote this about a sort of self-hosting Python in Python: https://codewords.recurse.com/issues/seven/dragon-taming-wit... -- as a bytecode compiler it's not too big but it depends on the giant Python runtime.
The link to the ELF visualization (http://i.imgur.com/xMyblyM.png) at https://bootstrapping.miraheze.org/wiki/Main_Page is very useful. Thanks!