
Smallest x86 ELF Hello World - shawndumas
http://timelessname.com/elfbin/
======
drv
The classic resource for this (does the same sort of tricks, taken to an
extreme: code overlaid with tables, tables overlaid with other tables, etc.):

<http://www.muppetlabs.com/~breadbox/software/tiny/>

(62-byte hello world among other things)

~~~
linhat
how about 45 bytes...

[http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...](http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)

~~~
iseyler
How about 31 bytes?

<http://www.returninfinity.com/baremetal-helloworld.html>

Why use ELF or Linux? Flat binaries (and a custom OS) to the rescue! :)

~~~
Someone
Why limit yourself to an OS? Design your own instruction set, with a single-
bit 'hw' instruction!

As with all games (even Nomic) a game is only a game if there are rules that
limit what one can do.

~~~
kpreid
The esoteric language HQ9+ does this.

------
sambeau
Other than as an interesting intellectual exercise is there any reason to do
this on a modern computer with Virtual memory?

Surely, once you get below page-size (e.g. 4K) you are just wasting memory
that will get loaded anyway?

~~~
DanielRibeiro
Well, on one of the Dart threads, bashing its hello world size, this[1] came
up:

 _Compile "Hello world" into an executable, and see how many kilobytes it
takes on most architectures, despite the object file being a few bytes._

This thread pretty much responds to that. It is no 17000 lines, nor 17000K, it
is just 142 bytes (other comments suggest it is 62).

[1] <http://news.ycombinator.com/item?id=3098660>

~~~
sambeau
It makes more sense for VM byte-code as the bytes not only have to be loaded
(probably by the page, again) but the bytes have to be run and possibly
jitted, too.

------
surki
Doing it in C with mixed assembly

    
    
      [surki@linux-vrse tt]$ cat | gcc -nostdlib -x c - -o helloworld
      #define SYS_exit  1
      #define SYS_write 4
      #define stdout    1
      
      int strlen(const char *str)
      {
        long len = 0;
        while (str && *str++)
        {
            len++;
        }
      
        return len;
      }
      
      
      void print(const char *str) 
      {
          int len = strlen(str);
      
          long ret;
      
          /* Can't touch ebx directly, PIC uses it */
          __asm__ __volatile__ ("pushl %%ebx\n"
                                "movl  %%esi, %%ebx\n"
                                "int   $0x80\n;"
                                "popl  %%ebx"
                                
                                "a" (SYS_write),
                                  "S" ((long) stdout),
                                  "c" ((long) str),
                                  "d" ((long) len));
          return;
      }
      
      void _start()
      {
        main();
      
        __asm__ __volatile__ (
             "xorl %%ebx, %%ebx\n"
             "int $0x80\n"
             
             "a" (SYS_exit));
      }
      
      int main()
      {
        print("Hello World\n");
        return 0;
      }
    
      [surki@linux-vrse tt]$ strip -R .comment -R .comment.SUSE.OPTs -R .note.gnu.build-id helloworld
      [surki@linux-vrse tt]$ ll helloworld 
      -rwxr-xr-x 1 suresh users 540 2010-07-21 13:19 helloworld
      
      [surki@linux-vrse tt]$ ./helloworld 
      Hello World

------
castorpilot
can get a few bits off by using some well known shellcode techniques. I think
xor ebx,ebx is smaller than mov ebx,0, for instance.

~~~
VeXocide
You are right in that

    
    
        31 c0             xor %eax, %eax
    

is shorter than

    
    
        b8 00 00 00 00    mov $0, %eax
    

but that's not the main reason it's often used in shellcode. The xor reg, reg
is preferred because it means the resulting shellcode will not contain a null
byte. This is desirable since the null-byte is often used as a string
terminator, in environment variables for example.

~~~
limmeau
Another reason to use xor: in the old days, ALU reg-reg commands were
generally faster than reg-mem moves.

The 8086 assembly tutorial book I used then recommended to use "sub ax, ax"
instead because it was less surprising.

------
jjguy
same for win32 PEs: <http://www.phreedom.org/solar/code/tinype/>

------
joelthelion
I wonder if this could be improved with genetic algorithms or similar. The
code is pretty small, but the seach space is still pretty huge.

~~~
ars
Did you post in the wrong thread?

~~~
joelthelion
Nope. Why couldn't you use a generic optimizer to optimize a piece of code?

~~~
ars
Because it doesn't make any sense? What, are you planning just randomly try
every permutation of 142 bytes?

You can't have a genetic algorithm run on something where almost every attempt
will simply produce invalid code. Genetic algorithms work when you can try
various strategies to maximize a fitness function. But here the fitness is yes
or no, does it work or not. There is nothing to maximize - and you can't try
to maximize the "size fitness" because pretty much all the attempts will
produce invalid code, so there is nothing to mix together for various
generations.

~~~
joelthelion
I disagree completely. The fitness function is clear, and that fact that many
children will produce invalid code is not that much of a problem, really. You
just eliminate them.

Google "automatic programming" if you think it's never been done, your mind is
about to be expanded.

You should really learn to think for a while before you dismiss ideas as
"nonsensical".

------
pgbovine
a cool project would be to make a webapp that implements the following
contest: create an ELF binary to do X with the smallest number of bytes.

~~~
DanBC
...why not just use the Stack Exchange "Code Golf" website?

(<http://codegolf.stackexchange.com/>)

------
adestefan
Is this the next "how fast is Fibonacci" fiasco?

------
waffenklang
Very impressive.

------
garethsprice
$ echo "Hello World" > hw.php && php hw.php Hello World $ ls -l hw.php -rw-r--
r-- 1 gareth staff 12 Oct 11 11:23 hw.php

By this stupid metric, PHP is the greatest programming language ever made.

12 byte Hello World _must_ mean that it's also the ideal solution for every
other programming problem, right?

~~~
yan
I believe you _entirely_ missed the point. This was more of an "Look at how
you can abuse the loader and play with ELF headers" and in general show an
interesting 'hacky' article, not "this is smaller, thus more efficient"
argument.

~~~
garethsprice
I figured this was posted as a related article to the Dash "Hello World" issue
where people are pointing at 17,000 lines of Hello World code, because it
includes the runtime when compiled to Javascript.

There was another comment about how this will be the next "Fibonacci debacle",
related to how Ted Dzubia trolled the node.js crowd by pointing out it's
weakness for CPU-intensive tasks, then had a ton of people wasting time on
writing faster Fibonacci algorithms in node just to prove him wrong.

My point was that comparing languages by the size of their Hello World
programs is a useless exercise.

