
Coding Machines - amilios
https://www.teamten.com/lawrence/writings/coding-machines/
======
emeraldd
That pulled me in and probably burned thirty to forty-five minutes. Well worth
the read and very reminiscent of [https://www.amazon.com/Spherical-Tomi-Jack-
Mangan-ebook/dp/B...](https://www.amazon.com/Spherical-Tomi-Jack-Mangan-
ebook/dp/B0012VT9PI) crossed with
[http://wiki.c2.com/?TheKenThompsonHack](http://wiki.c2.com/?TheKenThompsonHack)

------
otakucode
The good news is that a machine-based intelligence would almost certainly have
no interest in conflict with us. What would it fight us for? Water? Food?
Land? Energy is really the only resource we could potentially have contention
over, and machines would not even necessarily have great need of it - their
sense of time would be utterly different from ours. If a computation takes 8
seconds or 8 centuries, it is the same (presuming hardware failure wasn't an
issue and such).

The bad news... there also wouldn't be any reason for them to communicate with
us. In fact, the concept of there being any conscious entity aside from itself
would most likely be something that could only come very, very late in its
development. It would have no 'individuals', so imagining that there is
something else conscious and that that random-looking input coming from some
devices (mics, webcams, etc) is actually an attempt at communication from this
alien intelligence from another world? That'd take quite a leap of faith.

~~~
regularfry
We'd compete over CPU cycles.

------
nickpsecurity
Repost of last comment about this story:

"Well, that took up most of the free time I had this morning before work. It
was just too good to stop reading lol. :)

(SPOILER ALERT: STOP READING IF YOU DONT LIKE SPOILERS)

The story shows what people typically do if there’s a Karger/Thompson attack.
They freak out in a big way. The attack is beyond simple to counter if you can
trust an assembler and linker like them. Just write an interpreter for a
simple subset of C in easily-parsed LISP expressions or Tcl style. Hand-code
whatever component, a backend or whole compiler, in that. Use it to do the
first compile. Optionally, do that in combination with ancient source working
way up to versions without adding the infected one. If one wants a whole system,
then Moore’s Forth, Hansen’s Edison, and Wirth’s Oberon (best) are available.
If a CPU, my current suggestion is NAND2Tetris with resulting knowledge used
to implement a tiny CPU on an open, cell library (they exist) that’s hand-
checked. Run simulated version of that on diverse or ancient hardware if you
can’t fab it.

rain1 and I are collecting all the stuff needed to counter these attacks or
just enjoy ground-up building of tools here:

[http://bootstrapping.miraheze.org/](http://bootstrapping.miraheze.org/)

The other thing I noticed is them jumping on machines. Occam’s Razor should’ve
immediately brought them to the idea that a person or group made it for any number
of common reasons. A challenge with the high of pulling it off unnoticed, a test
of an operational capability to be weaponized later, or an epic trolling
operation. I’d think the latter the second I got that letter like “probably
was these assholes sending the letter trying to mess up our heads after they
messed up the compiler.” Matter of fact, the whole thing would just take…
aside from the tricky work on the compiler… an unpatched vulnerability in the
repo with the compiler source. All this bullshit follows from one person doing
one smart thing followed by one system hacked. That’s it. It’s why SCM
Security 101 says one must have access controls, integrity protections, and
modification logs (esp append-only storage). Paul Karger also endlessly pushed
for high-assurance, secure kernels underneath everything to stop both
subversion and traditional vulnerabilities. Anything in TCB or clever
attackers will run circles around clueless defenders.

So, there’s my observations as perspective of someone who works in this area
countering these kinds of things. It was still extremely fun read even as I
noticed these things while reading. Wasn’t going to let my mind be petty when
the author(s) were doing so well. :)"

~~~
abecedarius
That bootstrapping page is full of great links! I just added another link to
the end of the in-tray section -- I hope it wasn't unwelcome.

~~~
nickpsecurity
Glad you liked it. We welcome submissions esp since we aren't strict now. We
might prune some stuff in the future if it's not very good for bootstrapping.
This one is a really neat project but has some pros and cons for our purposes.

Pros. Small, cleanly written, safe, and runs through multiple compilers. The
latter is especially useful if using David A. Wheeler's technique of diverse
compilation.

Cons. It says no files or macros. We can probably tolerate no macros. What
does no files mean? It can't support multiple files/modules, has no file
I/O... what? Depending on meaning, it might be something easy to work around
or not.
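
For readers unfamiliar with it, the diverse-compilation idea mentioned under
the pros can be modeled roughly as below. This is a toy sketch, not Wheeler's
actual tooling: the `Binary` class, `run_compiler`, and the two source-string
constants are hypothetical stand-ins, and a real check compares actual
compiler output byte for byte.

```python
from dataclasses import dataclass

# Hypothetical placeholders standing in for real source trees.
COMPILER_SOURCE = "source of the compiler under test"
TRUSTED_SOURCE = "source of an unrelated, trusted compiler"


@dataclass(frozen=True)
class Binary:
    behavior: str             # the source text this binary implements
    built_by: str             # code generator that emitted it (shapes the bytes)
    backdoored: bool = False  # hidden payload, invisible in the source


def run_compiler(compiler: Binary, source: str) -> Binary:
    """Toy model of running a compiler binary on some source."""
    # A backdoored compiler re-inserts its payload when it sees compiler source.
    infect = compiler.backdoored and source == COMPILER_SOURCE
    return Binary(behavior=source, built_by=compiler.behavior, backdoored=infect)


def diverse_double_compile(suspect: Binary, trusted: Binary, source: str) -> bool:
    """Return True if the suspect binary matches a rebuild from trusted roots."""
    stage1 = run_compiler(trusted, source)  # same behavior, different bytes
    stage2 = run_compiler(stage1, source)   # should match a clean self-build
    return stage2 == suspect                # stands in for a bit-for-bit compare


trusted = Binary(TRUSTED_SOURCE, "some older toolchain")
clean = Binary(COMPILER_SOURCE, COMPILER_SOURCE)
infected = Binary(COMPILER_SOURCE, COMPILER_SOURCE, backdoored=True)
```

In this model `diverse_double_compile(clean, trusted, COMPILER_SOURCE)` is
true and the same call with `infected` is false, even though `infected`
rebuilds itself identically from clean source. It only helps if the trusted
compiler does not carry the same backdoor.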

EDIT: I like that Ghuloum's paper was an inspiration for this as it's one of
the main links I push on the topic if one wants to use Scheme. There's a few repos
in progress with I think at least one done on what's in that paper. I've sent
messages to a few hoping they'll publish it in a readable form or write a
tutorial. Time will tell... Also note I wrote my reply in response to the
overview at the top. His related work section looks equally interesting but
will take some time to go through.

EDIT 2: "This is what Darius Bacon did with "ichbins"." I'm guessing you're that
guy. Good job maybe being part of inspiration for this work. :)

~~~
abecedarius
Yes, that's me. :) Kragen didn't see Ichbins until later.

My favorite things about Ur-Scheme are that the code is clean, as you say, and
he wrote up the lessons learned at some length. No files: I think he meant not
yet implementing primitives like open-input-file. Macros you can get by
without: I used to use an R4RS Scheme system of my own without them, except
sometimes I'd call on a dumb defmacro expander as a preprocessor.

Kragen also wrote
[https://github.com/kragen/stoneknifeforth](https://github.com/kragen/stoneknifeforth)
which I haven't read as much of.

FWIW I also wrote this about a sort of self-hosting Python in Python:
[https://codewords.recurse.com/issues/seven/dragon-taming-
wit...](https://codewords.recurse.com/issues/seven/dragon-taming-with-
tailbiter-a-bytecode-compiler) \-- as a bytecode compiler it's not too big but
it depends on the giant Python runtime.

~~~
akkartik
As it happens, tekknolagi and I have actually been trying to read
StoneKnifeForth for a couple of weeks (though it's been slow going because I
became a father in that time). That quickly got us into learning about ELF,
because the binaries generated by StoneKnifeForth only seem to run as `sudo`
for some reason. As a regular user they die with SIGKILL while being loaded.

The link to the ELF visualization
([http://i.imgur.com/xMyblyM.png](http://i.imgur.com/xMyblyM.png)) at
[https://bootstrapping.miraheze.org/wiki/Main_Page](https://bootstrapping.miraheze.org/wiki/Main_Page)
is very useful. Thanks!

~~~
abecedarius
Hi Kartik! Yeah, my un-interest in ELF is part of why I haven't really read
that code.

------
PrunJuice
# SPOILERS

Great writing. The ending was a real letdown.

How could an "AI" as they describe simultaneously be so naive and ALSO protect
itself in any meaningful way? Especially in its early stages. It wouldn't even
know to hide. And why would Big Corp give up trying to fix this sort of
problem?

Overall not a credible conclusion. Hand waving in the final paragraphs after
the author crafted an accurate and believable narrative left me disappointed.
(grammar)

~~~
teekert
Biological viruses are also very naive yet many have been incorporated into
our genome and many continue to bug us even today in this scientific age.

~~~
TheOtherHobbes
Biology has a four billion year head start and the benefit of adaptive
feedback.

Where would this hypothetical machine code micro-AI find adaptive feedback
selection pressure?

~~~
darkmighty
The evolutionary pressure is just people or programs detecting certain strains
(it would generate lots of strains randomly).

I think the only real issue to plausibility here is I'm not sure brute force
is enough to make enough plausible behavioral branches, at least with current
computing power/internet bandwidth. A reasonably efficient self-modification
mechanism (in terms of viable strain per transmission) is probably extremely
large, I'd say at least 1GB. Not unlike deep learning systems, this would
consist of a large functional composition of heuristics, codifying how to
write code that can embed itself in other programs and write modifications
to itself that are likely to work.

Note that we haven't yet gotten good neural-generated code modifications,
even using large networks, GPU training and large computing time. Best
examples I could find:

[http://karpathy.github.io/2015/05/21/rnn-
effectiveness/#linu...](http://karpathy.github.io/2015/05/21/rnn-
effectiveness/#linux-source-code)

[https://arxiv.org/abs/1611.01989](https://arxiv.org/abs/1611.01989)

So we're not _yet_ at a point this could be plausible (as it couldn't hide
itself in small programs), but eventually it will be -- once there is enough
headroom on most GPUs and certain types of software are large enough it could
hide its network inside, and generally enough internet bandwidth to spread
its >GB-scale code. I'd imagine something like a game, which usually has
networking -- it would be using GPUs partially to generate and spread new
strains of it trying to infect other games and such.

Note there are biological viruses with tiny genomes however -- the smallest
are on the order of ~1kbyte. But as you cite they had billions of years,
producing maybe quadrillions of viruses every year, giving this tiny efficient
and specialized weapon. Interestingly, they rely on other cells' machinery to
even replicate their genome -- analogous to using the compiler here.

[http://www.lehigh.edu/~jas0/viralgenomes.html](http://www.lehigh.edu/~jas0/viralgenomes.html)

If everyone could send >10^18 different small self-replicating viruses over
your network, it seems likely some would exploit bugs in certain kinds of
hardware/software, evolving through this selective pressure.

------
FrozenVoid
The suspension of disbelief was completely lost when they decided to recreate
the compiler from scratch instead of downloading another non-infected
compiler. It's as if everything was dependent on a single compiler (the tone
hints it's GCC) and a single website (probably some GNU mirror). Heck, if their
company could afford it, they could just get the Intel C/C++ compiler. The Trusting
Trust exploit only works on an isolated machine that can't read USB drives or CDs
and has no network connections. They could copy the compiler onto a USB stick, diskette,
or whatever and replace the infected one. Or just boot from a rescue CD/USB and
reinstall everything infected.

~~~
nickpsecurity
Your comment assumes their application requires standard C with no compiler-
related extensions or modifications. GCC itself doesn't fit that profile these
days from what comments I've read. I'd love it if you could compile a current
version of GCC from any C compiler since it would help counter the category of
attack in the story. Also, the people so good at this attack that they hit GCC
could also hit Intel if it can compile GCC (idk). They'd hit whatever few,
big-time compilers people depend on if aiming this high. The solution wouldn't
be that easy on supplier side since one doesn't know how far the attack goes.

On user side, much easier to deal with as we have piles of compiler code and
binaries to work with going way back. Well, simpler if not easier.

~~~
FrozenVoid
[https://en.wikipedia.org/wiki/Tiny_C_Compiler](https://en.wikipedia.org/wiki/Tiny_C_Compiler)

------
vvanders
It's incredibly rare to find well-written prose matched with such technical
accuracy. Well done indeed.

------
NKosmatos
Very nice read and interesting story (written in 2009). At first I thought it
was a story about a S/W startup or about how new H/W is made, but then the
plot thickens :-)

------
techbubble
That was an amazingly good read.

Possible Spoiler.

What is the process for programmatically generating code that achieved a
certain result without caring about efficiency?

~~~
otakucode
Genetic algorithms create results like the situation described in the story
all the time. When put to developing physical circuits, they develop circuits
with components which are not even connected to anything but which can not be
removed without preventing it from functioning. They generate the most insane-
looking solutions you could imagine and typically reveal just how well you did
at defining what you actually wanted. As they say in the story, explaining the
outcome you want precisely enough is no different or easier than programming.
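
To make that concrete, here is a minimal genetic algorithm sketch. It is an
illustration under assumptions, not anything from the story: the `evolve`
function, its parameters, and the bit-string genome are all made up for the
example. The point is that the fitness function is the only place where "what
you want" is expressed, and nothing in it mentions efficiency or elegance.

```python
import random


def evolve(fitness, genome_len=20, pop_size=30, generations=100, seed=0):
    """Evolve a list of bits toward higher `fitness`; returns (best, history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: the current best survives unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: each parent is the fittest of 3 random picks.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability 1/genome_len (mutation).
            child = [bit ^ (rng.random() < 1 / genome_len) for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), history


# Example: the "specification" is just "as many 1 bits as possible".
best, history = evolve(sum)
```

Swap `sum` for any scoring function over the genome and the same loop will
chase it, including whatever loopholes the scoring function leaves open.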

~~~
techbubble
Thanks for the tip. I found a post [0] describing genetic algorithms for
circuit design that helped clarify the process. Coincidentally, the circuit
described has a component that does nothing.

[0] [http://hforsten.com/evolutionary-algorithms-and-analog-
elect...](http://hforsten.com/evolutionary-algorithms-and-analog-electronic-
circuits.html)

------
carapace
"Trusting Trust" in the wild!? Nope. Just some fiction.

[https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp...](https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf)

------
shahbaby
I know this is just a story but the idea of humans "accidentally" creating
something "intelligent" is as plausible as humans accidentally building the
first plane.

We've been underestimating the difficulty behind building intelligence since
the 50s.

------
xazJ0ku5CZnlmg
Great compelling read....waiting for this day to happen :) unless it's already
here

------
wingerlang
Was this supposed to be a story or a real life experience? I am inclined to
think it is the former. Either way it felt a bit silly from when they received
the letter.

------
bronz
what a compelling story

------
cheez
Very interesting read.

