
Code that is valid in both PHP and Java, and produces the same output in both - adamnemecek
https://gist.github.com/forairan/b1143f42883b3b0ee1237bc9bd0b7b2c
======
alephnil
When reading this, I remember that while I was studying computer science back
in the early 90s, such programs were known as "polyglot" programs. One such
example that circulated on email and usenet groups back then was valid in at
least 6 programming languages. After a bit of searching I actually found that
program again:

[http://www.csd.uwo.ca/~magi/personal/humour/Computer_Audienc...](http://www.csd.uwo.ca/~magi/personal/humour/Computer_Audience/polyglot.\[cob|pas|f77|c|ps|sh|com\].html)

It is valid C, Pascal, COBOL, FORTRAN, PostScript, Bash, ksh and apparently
also 8086 machine code (.com).

~~~
piaste
Pretty disappointing that it relies so much on comments.

You can write an arbitrary 'polyglot' program in any N languages that have
incompatible comment delimiters: just write the same program in each language
and wrap it in the comment delimiters of the other N-1 programs. That's not
interesting, though.

~~~
Retric
That's not enough. If you start with language A's escape sequence /*, then
starting the file with /* must not break languages B, C, D, E or F. Next, B's
escape sequence must not cause problems with C, D, E, or F. So, they must be
different but still compatible.

------
IANAD
Phava is great!

But, quine-relay still wins: [https://github.com/mame/quine-relay](https://github.com/mame/quine-relay)

~~~
terminado
That was pretty much THE thing that won my respect for the Ruby community.

[https://github.com/mame/quine-relay/blob/master/thumbnail.pn...](https://github.com/mame/quine-relay/blob/master/thumbnail.png)

That and the radiation-hardened quine:

[https://github.com/mame/radiation-hardened-quine](https://github.com/mame/radiation-hardened-quine)

Prior to seeing those, I was kind of dismissive of Ruby, but seeing those
totally changed my mindset about code, and my prevailing opinion of execution
environments and interpreted languages and scripting in general.

~~~
skrebbel
Great share, but I'd like to kindly suggest that you think again about your
conclusions there. You were dismissive of an entire programming language and
its community (we've all been there), and then 1 member of that community did
something impressive, and you changed your entire opinion of said language and
community (I guess most of us have been there too).

Wouldn't it be a much more interesting conclusion that there are fantastic
people and not-so-fantastic people everywhere and it mostly just depends on
where you look?

I mean, HN likes to be very dismissive of PHP but at the same time Composer (the
de-facto PHP package manager) avoids dependency conflicts entirely and
provably because it contains a home-cooked SAT solver[1]. In PHP. Tackling an
NP-complete problem without breaking a sweat. To me that's both super-awesome
and sensible at the same time.
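
To make the "dependency resolution as SAT" idea concrete, here is a rough sketch. It is not Composer's actual algorithm or code: the package names, the clause encoding and the brute-force search are all assumptions made for illustration. Each package version becomes a boolean variable ("is it installed?") and each requirement or conflict becomes a clause that must hold.

```java
import java.util.Arrays;

class TinyDependencySat {
    // Each clause is a list of literals: +n means variable n is true,
    // -n means variable n is false. A clause holds if any literal holds.
    static boolean satisfies(int[][] clauses, boolean[] assignment) {
        for (int[] clause : clauses) {
            boolean clauseHolds = false;
            for (int literal : clause) {
                boolean value = assignment[Math.abs(literal) - 1];
                if ((literal > 0) == value) {
                    clauseHolds = true;
                    break;
                }
            }
            if (!clauseHolds) {
                return false;
            }
        }
        return true;
    }

    // Exhaustive search over all 2^n assignments (fine for a toy example,
    // hopeless for real package sets). Returns null if nothing satisfies.
    static boolean[] solve(int variableCount, int[][] clauses) {
        for (int mask = 0; mask < (1 << variableCount); mask++) {
            boolean[] assignment = new boolean[variableCount];
            for (int i = 0; i < variableCount; i++) {
                assignment[i] = (mask & (1 << i)) != 0;
            }
            if (satisfies(clauses, assignment)) {
                return assignment;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical packages: 1 = app-1.0, 2 = lib-1.0, 3 = lib-2.0
        int[][] clauses = {
            {1},       // app-1.0 must be installed
            {-1, 3},   // if app-1.0 is installed, lib-2.0 must be too
            {-2, -3},  // lib-1.0 and lib-2.0 cannot both be installed
        };
        System.out.println(Arrays.toString(solve(3, clauses)));
        // prints [true, false, true]
    }
}
```

A real solver replaces the exhaustive loop with something much smarter, but the encoding step (requirements and conflicts turned into clauses) is the part the SAT framing buys you.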

Maybe stereotypes about programming communities have more to do with marketing
and accidental who-happens-to-be-most-prolific and less with the language or
community just being good or bad.

[1]
[http://www.naderman.de/slippy/src/?file=2012-06-07-Composers...](http://www.naderman.de/slippy/src/?file=2012-06-07-Composers-SAT-Solver.html#1)

~~~
splintercell
> I mean, HN likes to be very dismissive of PHP but at the same time Composer
> (the de-facto PHP package manager) avoids dependency conflicts entirely and
> provably because it contains a home-cooked SAT solver[1]. In PHP. Tackling
> an NP-complete problem without breaking a sweat.

Any idea where I can find more details on that? The slideshow you linked does
not provide enough information.

~~~
skrebbel
I wish I did! I can't remember how I learned that it's SAT based and google
isn't very forthcoming :-(

~~~
terminado
Probably because most Google searches for " _SAT_ " involve prep courses that
help you achieve a score of 1600... [._.]

On the other hand, jargon searching wikipedia helps narrow results:

[https://en.wikipedia.org/wiki/Boolean_satisfiability_problem](https://en.wikipedia.org/wiki/Boolean_satisfiability_problem)

------
david-given
A bit back I wrote a Forth interpreter that's a single file containing shell
(a tiny bit), awk, C, and (a limited dialect of) Forth.

[https://github.com/EtchedPixels/FUZIX/blob/master/Applicatio...](https://github.com/EtchedPixels/FUZIX/blob/master/Applications/util/fforth.c)

The awk byte-compiles the initial Forth dictionary into the C file; if you
edit the C file, you then run it (as a shell script) and it updates the byte-
compiled word definitions. Then you can compile it as a standalone C program.

Amazingly, this was actually the _simplest_ way to solve the problem...

------
0xmohit
Java is fun. The following is valid Java code:

    
    
      \u0070\u0075\u0062\u006c\u0069\u0063\u0020\u0020\u0020\u0020
      \u0063\u006c\u0061\u0073\u0073\u0020\u0055\u0067\u006c\u0079
      \u007b\u0070\u0075\u0062\u006c\u0069\u0063\u0020\u0020\u0020
      \u0020\u0020\u0020\u0020\u0073\u0074\u0061\u0074\u0069\u0063
      \u0076\u006f\u0069\u0064\u0020\u006d\u0061\u0069\u006e\u0028
      \u0053\u0074\u0072\u0069\u006e\u0067\u005b\u005d\u0020\u0020
      \u0020\u0020\u0020\u0020\u0061\u0072\u0067\u0073\u0029\u007b
      \u0053\u0079\u0073\u0074\u0065\u006d\u002e\u006f\u0075\u0074
      \u002e\u0070\u0072\u0069\u006e\u0074\u006c\u006e\u0028\u0020
      \u0022\u0048\u0065\u006c\u006c\u006f\u0020\u0077\u0022\u002b
      \u0022\u006f\u0072\u006c\u0064\u0022\u0029\u003b\u007d\u007d
    

[Hint: Hello world]

~~~
danieldk
But that's not really surprising, it's each character as a unicode escape (see
section 3.2 of the Java spec). It's possible in any language that allows
escaped code points.

~~~
0xmohit
Not surprising, yes.

But what is silly is that the Unicode sequences are also interpreted in
comments.

    
    
      /* Comment region begins. \u002a\u002f
      String s = "Comment was closed in the previous statement";
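
A complete, compilable version of the same trick, as a minimal sketch (the class and method names are made up for the example). The compiler translates Unicode escapes before it looks for comment boundaries, so the escape pair on the first comment line closes the comment and the assignment after it is live code.

```java
class UnicodeEscapeInComment {
    static String message() {
        String s = "comment still open";
        /* This looks like the start of a long block comment. \u002a\u002f
        s = "comment was closed by the escape";
        /* An ordinary comment, so the rest of the method reads normally. */
        return s;
    }

    public static void main(String[] args) {
        // prints "comment was closed by the escape"
        System.out.println(message());
    }
}
```

Most syntax highlighters will colour the assignment as comment text, which is exactly what makes this confusing.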

~~~
flukus
That's a recipe for pure evil.

------
dalke
Those interested in this topic may enjoy Ange Albertini's presentation at the
31st Chaos Communication Congress, on "Funky File Formats",
[https://events.ccc.de/congress/2014/Fahrplan/events/5930.htm...](https://events.ccc.de/congress/2014/Fahrplan/events/5930.html)
.

> Binary tricks to evade identification, detection, to exploit encryption and
> hash collisions.
>
> * artistic binaries - why they are possible, how they work
> * quines
> * polyglots & chimeras
> * schizophrenic
> * AngeCryption
> * hash collisions

The presentation video and slides are available. It has examples of mixing
multiple binary formats together, including valid files for one format which,
when decrypted, produce another valid format.

------
chias
What's the purpose of the:

\u000A\u002F\u002A

bytes? This sequence appears multiple times, but is commented out in both
languages' interpretations.

~~~
fnovd
\u000A\u002F\u002A is the Unicode escape sequence for a newline followed by
/*, the beginning of a multi-line comment (\u000A\u002A\u002F, a newline
followed by */, being the end of one). The Java compiler translates the
escapes first and so treats the result as a comment delimiter, while the PHP
interpreter will ignore it (see
[https://www.reddit.com/r/ProgrammerHumor/comments/50guhc/thi...](https://www.reddit.com/r/ProgrammerHumor/comments/50guhc/this_snippet_of_code_is_syntactically_valid_in/d73x8gs)
for more info).
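
To see what those escapes turn into, here is a small sketch of a decoder. It is a simplification and not what javac actually runs: it assumes well-formed four-digit hex escapes and ignores the rule about runs of preceding backslashes. The class and method names are invented for the example.

```java
class EscapeDecoder {
    // Replaces each backslash-u-XXXX sequence with the character it names.
    // Throws NumberFormatException if the four digits are not valid hex.
    static String decode(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("\\u", i) && i + 6 <= src.length()) {
                int codeUnit = Integer.parseInt(src.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(src.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes as literal text in these strings.
        String opener = decode("\\u000A\\u002F\\u002A");
        String closer = decode("\\u000A\\u002A\\u002F");
        System.out.println(opener.equals("\n/*")); // prints true
        System.out.println(closer.equals("\n*/")); // prints true
    }
}
```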

~~~
mistercow
Wow, that seems really dangerous. Apparently, this also applies _inside string
literals_ : [http://javajee.com/unicode-escapes-in-java](http://javajee.com/unicode-escapes-in-java)
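
A minimal sketch of the string-literal case (the class and method names are made up for the example). The two escapes below are the double quote character, U+0022, so what looks like one long string is really two one-character strings with code between them.

```java
class EscapeInStringLiteral {
    static int surprising() {
        // After escape translation the compiler sees:
        //   "a".length() + "b".length()
        return "a\u0022.length() + \u0022b".length();
    }

    public static void main(String[] args) {
        System.out.println(surprising()); // prints 2
    }
}
```

A reader counting characters in the apparent literal would expect a much larger number.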

So if you want to sneak in a back door, all you need is the right excuse to
put some unicode escapes in a block comment, and you can hide your code in
plain sight.

Something like:

[https://gist.github.com/anonymous/f33c1e392193a9ec30dcc5b31d...](https://gist.github.com/anonymous/f33c1e392193a9ec30dcc5b31dd324be)

I tried this in Eclipse, and the one saving grace is that while the basic
syntax highlighting doesn't pick up on most of the de-commented part, the code
intel seems to do an actual parse, and does a very subtle highlighting of
"someService". (Edit: After reopening the file, that highlight is gone and it
looks like a normal comment).

But if I saw that, I think I'd just assume the syntax coloring was messing up.
And of course, if I'm looking at the code in GitHub, it will look just like a
normal comment.

~~~
yorwba
I just tried this in Android Studio, Eclipse, Emacs, Gedit, Nano, Vim and
_all_ of them get this wrong.

I feel the sudden urge to test every Java syntax highlighter in use and file
lots of issues.

~~~
i336_
GitHub definitely needs to know about this due to their popularity and
subsequent lowest-common-denominator status (which isn't a bad thing, just the
truth); this sort of attack only requires a PhD in how to use the clipboard,
and not any other particular knowledge or skillset.

~~~
yorwba
GitHub indirectly uses the Java bundle for TextMate, where I filed this issue:
[https://github.com/textmate/java.tmbundle/issues/45](https://github.com/textmate/java.tmbundle/issues/45)

Since the escape sequences have to be handled everywhere, it seems unlikely
that this will ever be fixed completely, but I hope that _something_ will be
done about it.

~~~
yorwba
This issue has also been raised on the Eclipse bug tracker:
[https://bugs.eclipse.org/bugs/show_bug.cgi?id=3533](https://bugs.eclipse.org/bugs/show_bug.cgi?id=3533)

In 2001.

~~~
i336_
That is insane.

I added a pingback to this thread to the bug, along with some general
encouragement that fixing this is a wise idea.

------
setra
"But now we'll never know if Schrodinger's computer is running php or java..."

"I hope you die in a fire of a 1000 java compilers."

------
spullara
The Jurassic Park quote in the comments is perfect.

~~~
mixermf
"You were so preoccupied with whether or not you could, you didn't stop to
think if you should"

------
simcop2387
Here's a polyglot that's valid in a large number of languages:
[https://github.com/mauke/poly.poly](https://github.com/mauke/poly.poly)

At least the following:

    
    
        * C (89)
        * C (99)
        * C++
        * Haskell (multiple extensions, not sure how that works)
        * Bash
        * ZSH
        * Posix SH
        * Make
        * Perl 5
        * Perl 6
        * Ruby
        * Python
        * Brainfuck
        * HTML
    

edit: formatting

------
re
This is often called a "polyglot" program, if you're interested in finding
more examples.

[https://en.wikipedia.org/wiki/Polyglot_(computing)](https://en.wikipedia.org/wiki/Polyglot_\(computing\))

------
0xmohit
Also interesting are document formats wherein one would embed other document
formats. Examples would include embedding malicious files within a PNG; it'd
typically start with distributing a PDF file:

[https://securelist.com/blog/virus-watch/74297/png-embedded-m...](https://securelist.com/blog/virus-watch/74297/png-embedded-malicious-payload-hidden-in-a-png-file/)

------
jmspring
Sigh.

I recall the Quercus project from Caucho. It translated PHP into JVM bytecode.
One way to truly accelerate PHP many years back.

------
junke
Basic, Python 2, Perl

    
    
        print "Hello World"

------
yeowMeng
Similar: [http://codegolf.stackexchange.com/questions/55960/im-not-the...](http://codegolf.stackexchange.com/questions/55960/im-not-the-language-youre-looking-for)

------
bentona
Now how about a quine that's valid in PHP and Java?

~~~
taneq
[https://github.com/mame/quine-relay/](https://github.com/mame/quine-relay/)

This 100-language quine relay passes through both on its way around the clock.
:)

~~~
spullara
Just posted this link to HN because that is insane. Like clinically, and much
more insane than this one.

~~~
terminado
Original thread from 2013:

[https://news.ycombinator.com/item?id=6048761](https://news.ycombinator.com/item?id=6048761)

------
netgusto
From the gist comments:

> Your scientists were so preoccupied with whether or not they could, they
> didn’t stop to think if they should. - Ian Malcolm

------
gurgus
Here's an example of the same sort of thing but with C++/Java:
[https://blog.goeswhere.com/2010/04/java-cpp-polyglot/](https://blog.goeswhere.com/2010/04/java-cpp-polyglot/)

------
vs4vijay
This is a polyglot. Read this:
[http://codegolf.stackexchange.com/questions/7261/write-a-pol...](http://codegolf.stackexchange.com/questions/7261/write-a-polyglot-that-prints-the-languages-name)

------
kleer001
Yup, you can do that with just about any set of languages, right?

~~~
ygra
The trick here relies mostly on comments that are valid for one language, but
not the other (abusing Java's Unicode preprocessing). There are language pairs
that offer no provision at all for embedding parts of code that another
language won't see or misinterpret. E.g., I haven't been able to write a batch
file that also works as a PowerShell script yet, although you can write a
batch file that doubles as a VB script through clever use of a conditional
jump.

~~~
kleer001
Oooh, right. So, I guess there's some pairs that are incompatible. Hmm :(

Maybe it would also work for computationally equivalent (simulation/emulation)
math where you're decompiling the machine language into a stream of logical
NORs or such. That might connect more languages.

------
znpy
It comes back to my mind that I ran into some source code that was both valid
Java and valid C++... Can't remember what it was though.

------
moonshinefe
Amusing and fairly cool. Calling a non-static function in a static context in
the PHP file bothers the heck out of me for some reason, though (maybe since
my coworkers always mess it up).

------
andersonmvd
Security guys would love this. Multilanguage payloads.

------
Tan__
Polyglot programming. That's wild.

------
83457
Wasabi 2.0

------
deusofnull
Is this just a novelty or does it mean I can hack Enterprise Java apps with
ratchet PHP code?

~~~
spullara
You could always use the 100% Java implementation of PHP in your app server:

[http://quercus.caucho.com](http://quercus.caucho.com)

------
senectus1
hey cool, a Perthian :-)

~~~
hhandoko
I have the same response every time someone from Perth pops up on HN :)

Quite a recent picture as well... As I recall, the Rio Tinto logo on the
Central Park building was only up a couple years ago.

------
smegel
If by code you mean comments...

~~~
yolesaber
Comments are code

