A Magnetized Needle and a Steady Hand (nullprogram.com)
182 points by omnibrain on Nov 18, 2016 | hide | past | favorite | 62 comments



It's interesting to compare with DOS, where a do-nothing program is a single byte:

    C3
A Hello World is roughly 20 bytes, the bulk of it being the string itself:

    95 BA 07 01 CD 21 C3 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 24
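A sketch of producing such a file from a shell, in the spirit of the article (octal escapes are used because `\xHH` support in printf varies across implementations):

```shell
# Sketch: emit the one-byte DOS do-nothing .COM (C3 = near RET),
# byte-at-a-time with printf, much as the article accretes its ELF.
printf '\303' > true.com      # \303 octal == 0xC3
wc -c < true.com              # 1 byte
```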
The ELF in the article, which comes to 130 bytes by my count, could be reduced to 45 with some tricks:

http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...

Also worth looking at is what the demoscene has done with binaries of 128 bytes or less:

http://www.pouet.net/prodlist.php?type%5B%5D=32b&type%5B%5D=...


When working on this[1], I spent three sleepless days shaving bytes from the included Brainf*ck interpreter. The last byte probably took about 5 hours.

Then I showed it to a security researcher I knew and he immediately replied "oh, this is so cool! also, you can replace the last four bytes with C3". Oh man, that hurt :-)

[1] https://news.ycombinator.com/item?id=7943514


On Unix, an empty file is a working implementation of the "true" command (provided it is in the PATH and marked executable). This is because it is interpreted as a shell script, and an empty script of course exits successfully.

I'm pretty sure this was actually used in some Unix/Linux version, and they got bug reports due to the poor performance (executing a shell to do nothing), which makes for a lot of bugs per line of code. Unfortunately I can't find a reference. Instead I found that AT&T Unix implemented "true" as an empty file... with a copyright notice! See http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html


    joey@darkstar:~>touch true
    joey@darkstar:~>chmod +x true
    joey@darkstar:~>ls -l true
    -rwxr-xr-x 1 joey joey 0 Nov 18 10:53 true*
    joey@darkstar:~>if ./true; then echo yay; fi
    yay
Just saying.


Also, you don't actually need a "true" binary for a shell of any reasonable vintage:

  # which true
  /usr/bin/which: no true in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)

  # true; echo $?
  0


>demoscene

The smallest prod on Pouet is a zero-byte .COM file. Of course, it errors. With C3 you have already skewed the definition of a program anyhow. Edit: Cue the story of the empty DOS program that was nevertheless useful, because it prompted the OS's program loader to do some memory management with useful side effects.


That's a little misleading because what you described is a .COM file. They're limited to 64K, non-relocatable, and a holdover from the CP/M days. An .EXE is closer to an ELF in format. It's relocatable and necessarily larger in size.


This article uses the ingenious technique of accreting a minimal executable file via the Unix echo command, which can indeed write binary output using C-style escape codes. A must-read for anyone interested in compiler writing, because it does a decent job of explaining the ELF format in terms of the data structures required to hold one.
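A sketch of the idea (the bytes here are just the standard first seven bytes of an ELF identification field, not the article's complete header; octal escapes are used for portability):

```shell
# Sketch: accreting binary output from the shell, as the article does.
# These bytes are the ELF magic plus the class/endianness/version fields.
printf '\177ELF' > header.bin          # 0x7f 'E' 'L' 'F'
printf '\002\001\001' >> header.bin    # ELFCLASS64, little-endian, EV_CURRENT
od -A n -t x1 header.bin               # 7f 45 4c 46 02 01 01
```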


From a modern perspective it's fun to think about the layers of sometimes-trivial complexity that we fly over with modern development tools, but when I wrote Apple Writer in 1978, I first had to write an assembler, and to do that, I first had to enter seemingly endless amounts of object code to get to the tipping point where I could enter comparatively efficient 6502 assembly-language mnemonics and actually get something accomplished.

https://en.wikipedia.org/wiki/Apple_Writer


Very nice; tried it and got a 129 byte working true command. Now I have to understand what's in the 28896 bytes of the true command that comes with the distro I have here.



The real kicker is this comment in GNU true(1).

    /* Note true(1) will return EXIT_FAILURE in the edge case where writes fail with GNU specific options.  */
If you can't program a bug-free true(1), please stop whatever you think you are doing.
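That edge case can be triggered on Linux (a sketch; it assumes GNU coreutils' /bin/true and the /dev/full device, on which every write fails with ENOSPC):

```shell
# Sketch: with a GNU-specific option (--version), true must write output;
# sending that output to /dev/full makes the write fail, so true exits 1.
/bin/true --version > /dev/full 2>/dev/null
echo $?    # 1 on GNU coreutils; a plain `true` with no options still exits 0
```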


So it's basically

    #include <stdlib.h>

    int
    main (int argc, char **argv)
    {
      return EXIT_SUCCESS;
    }
plus all the stuff required to handle --version and --help.

But even compiling just that results in 60 kB on my system. What's in those other kilobytes, indeed!
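One way to start answering that (a sketch; it assumes gcc and binutils are installed):

```shell
# Sketch: build the trivial program and ask where the bytes went.
cat > tiny.c <<'EOF'
#include <stdlib.h>
int main(void) { return EXIT_SUCCESS; }
EOF
gcc tiny.c -o tiny
./tiny && echo "exits 0"
size tiny         # the text/data/bss actually used is tiny...
readelf -S tiny   # ...but many sections get linked in by default
```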


I don't know how you compiled it, but you can often cut down by using the '-O3' command line option when you are compiling. Then you can use the 'strip a.out' (or whatever your executable name is) to cut it down further.

Then if you open it in a hex editor (or use od -x), I'll bet you'll see pages and pages of zeroes, which are used for aligning it appropriately in RAM when it loads.

At least, those are all things that work on some systems, can't be sure about yours.


There is also an sstrip tool [1], written by the same Brian Raiter already mentioned here. It strips a bit more eagerly:

  $ gcc tiny.c -O3 -o tiny; wc -c tiny
  8456 tiny

  $ strip tiny; wc -c tiny
  6224 tiny

  $ sstrip tiny; wc -c tiny
  4140 tiny

[1] http://www.muppetlabs.com/~breadbox/software/elfkickers.html


I'd guess most of the code weight will be setting up and tearing down glibc, initializing stdio, etc. Then there is all sorts of metadata that gcc (and ld, etc.) includes by default. There are plenty of resources on how to create minimal C executables, but gcc's `-nostdlib` option would be a good starting point as low-hanging fruit.
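A sketch of that starting point (assumes x86-64 Linux and gcc; exit status 0 is returned through a raw exit_group syscall, since with no libc there is nothing to call exit for us):

```shell
# Sketch: a freestanding "true" with no C runtime at all.
cat > tiny.c <<'EOF'
void _start(void) {
    /* exit_group(0): syscall number 231 on x86-64 Linux */
    __asm__ volatile ("mov $231, %rax\n\t"
                      "xor %edi, %edi\n\t"
                      "syscall");
}
EOF
gcc -nostdlib -static -s -o tiny tiny.c
./tiny && echo "still true"
wc -c tiny    # compare with the size of the default glibc build
```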


People like to dig at the size of GNU's /bin/true and /bin/false, but the reality is - who cares? Does anyone actually ever use it?

    $ type true
    true is a shell builtin


I've frequently bypassed tests/checks by replacing them with a symlink to /bin/true. So, yes. But no I don't care how big it is.


/bin/false is commonly specified instead of a shell in /etc/passwd for accounts that are not supposed to have interactive logins.


A long, long time ago I actually did this, in order to sneak a small exploit in a COM file onto the school's pre-internet Novell Netware system.

I also had to rebuild a partition table by hand once with only a copy of MSDOS debug on a boot floppy, and Peter Norton's excellent book on the PC.


I don't remember enough of the details to find a link, but I recall reading a story where some system error destroyed the binaries for most of the usual command-line tools like cat, and the author couldn't restart or rebuild the server, so they hand-entered the hex (or something) for enough tools to bootstrap themselves back into a running system.


http://www.ee.ryerson.ca/~elf/hack/recovery.html

(Note that 'gnu' in the story is GNU Emacs!)


Thank you, this is such a great story.


The background story is a fun thought experiment and obviously it requires some suspension of disbelief...

...but in the best HN tradition, let me poke some holes into it ;).

C-based technology we are all used to has a particular set of paradigms, one of which is that programs are "write-only": you code, compile, and ship the binary. It wasn't the case with many pre-C technologies, though, and it isn't the case now.

So, looking just at the software I have on my PC right now, I wonder what this potent computer virus would do to all the programs I wrote in Common Lisp - they all ship with a native compiler that's accessible at runtime. Moreover, that compiler is often essential at runtime for the program to work.

So I guess, if that virus came and wiped all the dev tools, the C world would die, while we'd be looking for any compiled Lisp application and working to break into the REPL ;).


It strikes me that a native compiler, accessible at runtime, would fit the definition of "software development tool", so either the entire program, or the portion that contains the compiler, would be deleted.

Another challenging aspect is that just about every command shell can execute a scripting language, so they'd be included too. I mean, it's just a repl with easy access to the filesystem, right?


Ah, but what if the man page for elf has been sneakily changed by your superhuman-intelligent adversary?

(cf. A Fire Upon the Deep, Vernor Vinge)


I think if there were a massive attack that planted a trusting-trust attack on every computer on the planet, we'd just have to start from scratch.


Super-scratch. We'd have to trash every processor manufacturing plant on the planet; every single one almost certainly uses computer-controlled machinery working from computer-stored plans.

This would be a fun novel to read. An alien intelligence that can only act at a distance infects our computer systems, and we have to recover. A sort of post-apocalyptic scenario, except without the magic "engines don't work, but matches do" flavor. I hope Neal Stephenson is bored and looking for something to write.


> post-apocalyptic scenario, except without the magic "engines don't work, but matches do" flavor

I don't know if you're referring to the "Revolution" TV series here, but I've seen exactly those accusations of "magic" leveled against the show on HN in the past. I decided to watch it anyway, and...

(spoiler alert)

...by the middle of the first season it's slowly revealed that there's no magic that happens to disable only electronics and combustion engines - it's omnipresent advanced nanotech explicitly designed to do just that.


Actually, I instantly thought of Dies The Fire, a novel with the same concept.


That was probably what I was thinking about, though I didn't remember the name. And we all know what Clarke said about advanced tech...


"Three-body problem" has an interesting variant on this: an alien intelligence which can sabotage physics experiments, thereby preventing humanity from exceeding a particular level of development.


I was definitely thinking of that when typing that out, although I think that the particular way the antagonists did this in Three Body Problem was like using a jet fighter to shield your infantry from the rain.


Indeed they could have done much more "fun" stuff with the tool they used. It seems that they lacked the mentality of human trolls.

(BTW. I just binge-read the entire trilogy within the past two weeks, thanks to HN recommendations of the first book. And I have to say, it's great sci-fi, totally worth the time it takes to read it.)


I read the first one, and it was incredibly interesting, but mostly from the cultural and historical perspectives. The "magic" bit I referenced seemed ... overpowered and yet underused. It's incredibly difficult to imagine an intelligence that can come up with that, but not extrapolate any further.

Potential spoiler alert: Using that particular plot device to fool scientists into believing the laws of physics weren't constant struck me as ludicrous. The part where physicists start committing suicide instead of seeing this as the start of a revolution in their understanding of the universe was almost plausible by comparison.

I need to read the second two, but I heard that the translations aren't as good :/


My current "head canon" theory explaining the underutilization of sophons is like this: at first, they weren't meant to be discovered, so they were used primarily for covert communications and for locking down particle physics research. Later, the aliens may have decided not to abuse them to wreak havoc with advanced technology, so as not to incentivize humans to invest too heavily in sophon-proofing.

I don't know Chinese so I can't evaluate the accuracy of the translations of the other two books, but taken at face value, IMO they're pretty good pieces of writing. I didn't notice any obvious problems, and on the other hand I very much appreciated frequent inclusion of translator's notes that explained things like untranslatable humour, and provided the cultural background for things that may be obscure to Western readers.


Taking the plot device entirely literally misses the point of the book somewhat; it's so highly metaphorical and so much of it is told through the "VR" dream sequence exposition.

It's really a book about the question: how do we know what we know? Specifically, how do we find truth when powerful forces have devoted themselves to obfuscating it for us? In other words, the "cultural revolution" in China. And by extension, the present day wǔmáo dǎng era.


I haven't read the sequels, but the Sophon bit seemed pretty explicitly literal. How can I not read that part literally? Either the Sophons have the power ascribed, or the plot is nonsense.


Obviously it is literal, it has to be - it's relevant to the object-level plot both in the first book and in the latter two. Whether or not it has layers of additional "meta" meaning associated with it, that's another topic.

Though I'm not sure about the indirect references to various political events that people seek in the story - the author himself wrote explicitly in the afterword for American readers that he is not writing sci-fi to explore contemporary issues in different settings.


> using a jet fighter to shield your infantry from the rain

Great imagery, and similar to the thoughts I had when reading that series. They can imprint arbitrary images on human retinas; you'd think they could at least disable computers, if not suborn our networks completely. I suppose that their... mental handicap could prevent them from coming up with the idea that our computers could "lie" to us, but that seems a little flimsy.


I must be forgetting something, or something is in the sequels - what mental handicap? And I thought the entire point of the Sophons was to have the computers lie to us.

(Versus, say, blocking out all light from the sun and freezing us to death. Or reflecting more and burning us alive. Or slicing through every human's spinal cord. Or detonating nuclear devices. Or whatever.)


What man page? What 'man'? Isn't the premise of the article, that all software's gone? Good luck reading anything digital. Even better if you only have PDF files..


once again, kragen marathon comes to mind https://www.reddit.com/r/programming/comments/9x15g/programm...


This is really cool. Has anybody gamified this?


There are analogous discussions about open-source hardware. How far back must one go to be assured that there are no backdoors? It's a hard problem.


This reminds me a bit of a thought I had about how long it would take civilisation to return to its current level of technology if we were all magically transported back to the Stone Age, retaining only our memories.


Generation one would be almost completely occupied finding the most efficient way to archive as much of that knowledge as possible, IMO... I'd imagine it might take a dozen generations or so, or more, depending on how much could be archived before that first generation passed away and took that experience with it. I see apprenticeships rampant!


This is basically the plot of the Foundation series.


Wouldn't it mostly depend on how many people died?

For example, after the destruction of world wars, most industries bounced back quickly because the knowledge was still there. Of course, that wasn't complete stone age.


Indeed it would.

Our civilization depends on complex supply chains - you need people mining materials, people refining materials, people quality-checking those materials, people providing various precise tools, people providing the precise tools for making those precise tools, people providing food for everyone, people moving everything around, people making the stuff that moves everything around, etc., etc.

So I think the first generations would have to spend their time slowly building up all the components of the supply chain, while furiously breeding to return the population to levels that allow the degree of professional specialization we have today.


There is an episode of Freakonomics [1] that talks about this exact thing. The supply chain to make something as simple as a pencil is complex enough that no single person can make a pencil truly from scratch.

[1] http://freakonomics.com/podcast/i-pencil/


$ touch true; chmod 777 true; ./true && echo Success

Success


Congrats: Your shell just spawned another shell to run an empty script.

Also:

    # rm /bin/true && true && echo success
    success


The storyline backing this article is not just distracting, it is misinformative: it teaches people the wrong indirect lessons about software and engineering :/. If you don't have an assembler, you write an assembler; you don't just make do without one and try to bootstrap directly from a C compiler. And if you have the luxury of extra hands and need people to work on stuff in parallel - stuff you can't run yet anyway, since if you don't have really simple things like true coded, I bet you don't have a shell (and there are only a scant couple of reasons true would even exist without a shell that exposes branch comparisons as shell commands) - you don't set them to work building binaries by hand. You give them a rough outline of a programming language you will have available at the right point in the development cycle (maybe you throw them a C89 manual, as manuals still seem to exist in this world?) and tell them to write in that, on the promise that it will be useful later.


it's obviously just a fun thought experiment.

and we'd probably dust off the old PDP7s and stuff that are more suited to this kind of bootstrapping, instead of jumping straight to ELF and x86


>it teaches people the wrong indirect lessons about software and engineering

I am sure most readers can figure that out from the elaborate setup. Reverse-engineering shellcode, or decompiling programs, would be a more realistic application of the conveyed skill set.


`echo` is a shell builtin on my machine. In that case, we certainly have access to an interpreter (the shell), which means we should have access to other interpreters. Why not define `true` as

   #!/bin/sh
   exit 0
? Similarly, why not use `sh` or one of those other interpreters to write the rest of the utilities? `cc` probably isn't that important after all. `chmod` and `sh` are!


This article isn't really about solving a silly problem with hands, arms & tongue tied behind your back. It's about learning a little bit more about the layers which put our systems together.

I don't think the author was seriously suggesting you use butterflies or a steady hand either.


I would alias it as

    echo ''
Since echo always returns true anyway :) But that's just sidestepping the problem, in any case.


I don't understand when Chris Wellons finds the time to write all those fine blog posts.


Ye gods. GitHub was intermittently inaccessible the other day and ruined about half an hour of my time. A world-wide EMP leaving all HDDs in ruins? Well, at least it would keep us occupied for a decade rebuilding everything...


tl;dr: inspired in part by http://xkcd.com/378/ on writing the "true" program in machine code (that is, hex AKA binary bits). Using the echo program....



