Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Self-Printing Machine Code (2005) (susam.net)
55 points by susam on March 16, 2024 | hide | past | favorite | 15 comments



The author points out that for binary machine code, the only notion of a quine that makes sense is for the executed code to output its own binary.

The 12 byte version exploits the fact that it has access to its own copy in memory, which as the author admits is kind of cheating, making it an improper quine.

They then show a 40 byte proper version that uses embedded data to reproduce both the executing code and the data.

Another example of a binary code quine, not corresponding to any existing machine code, is the quine for binary lambda calculus [1].

[1] https://tromp.github.io/cl/Binary_lambda_calculus.html#a_qui...


> The author points out that for binary machine code, the only notion of a quine that makes sense is for the executed code to output its own binary.

That’s not really true, though. A quine is a program that outputs its own source code, not a program that outputs itself in binary form. I’d therefore argue though that a “proper” quine should in this case output not the actual ASCII bytes, but the hex code for the bytes. After all, that is how they were typed in—the source code, so to speak. We wouldn’t expect a C quine to output anything but the actual C source code that was typed in, so why is this any different?

The normal recursive flow for a quine is source code -> (interpreter | compiler) -> execution -> source code. A program that literally outputs an executable version of itself doesn’t really meet that definition. I’m not sure what it is, but I don’t think it’s accurate to call it a quine.


It depends on whether you consider the binary to be the Programm, the ELF file, the hexdump, the assrmbly code, etc. All of these are a kind of programming language.


Not that I disagree with you, but…

> A quine is a program that outputs its own source code, not a program that outputs itself in binary form. I’d therefore argue though that a “proper” quine should in this case output not the actual ASCII bytes, but the hex code for the bytes.

Punch cards could legitimately be called source code. Those weren’t in hex, but rather in binary (at least the ones I’m remembering).


I think a punch card program that output to the console (or whatever punch card systems used for output, a teletype?) instructions for creating a new set of punch cards containing itself would meet the definition of a quine. The source code is the program itself, the “compilation” is the process of punching new cards, the execution is feeding the cards into the machine, and the output is the source code again.

What is described in the original article is more like feeding a program into the machine that does nothing and declares the fact that the original deck is now in the output bin to be a quine.


About the proper quine the author says

> While both these programs take care not to read the same memory region that is being executed by the CPU, the data bytes they read look exactly like the executable bytes. This is what I meant when I mentioned earlier that the lines between code and data are blurred in an exercise like this. This is why I don't really see a point in keeping the executable bytes separate from the data bytes while writing machine code quines.

Which I though was an interesting philosophical point. There is no difference at all between the embedded data and the code. They are the same bytes in the same memory. What is the point of duplicating the code? Why not just read the first copy as the first quine does.


> They are the same bytes in the same memory

The difference is whether you rely on some form of introspection to get read access to all of the executing code, or whether you use embedded data like regular string literals.


I think the point is that you are reiterating the same distinction without a difference.


You all may collapse the difference, but I agree with John that it is meaningful.


It’s definitely meaningful in that we can talk about it! But what if I put the code in the data section, but left the data section executable and jumped there? The very idea of a data section in some sense presupposes a higher level understanding of the program.

In any event quines are only interesting because of the rules we place around them so I agree in general with arguing the rules!


> It’s definitely meaningful in that we can talk about it!

No, it's much more meaningful. Different languages or computation models allow different levels of introspection. In some models quines are trivial: just ask for your source code and print it out. In some models it's non trivial that quines exist at all (but it turns out they exist).


> But what if I put the code in the data section, but left the data section executable and jumped there?

Data sections are usually not executable. Even if it were, that still leaves some code (the jump) outside of the data section.


I made something like this a few weeks ago and I looked everywhere to find out more about this is was a quine or not. Thank you so much for sharing this.


Are there any non-determenistic compilers? I mean if we freeze source code and compiler flags, will binary result (compiled program) always be the same?


dc(1) has a compact quine: "6581840dnP"

https://literateprograms.org/quine__dc_.html




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: