
Our Ambitious Plan to Make Insecure PHP Software a Thing of the Past - CiPHPerCoder
https://paragonie.com/blog/2018/01/our-ambitious-plan-make-insecure-php-software-thing-past
======
laurencei
Can you get StackOverflow to "sponsor" this idea? Not just for crypto (which
you've done) - but for all PHP related answers?

Because the problem is there are many "old" accepted answers, with high
upvotes, that will always come up as number 1 or 2 in Google searches.

Given PHP has changed so much, many of those answers are outdated, use
incorrect and insecure methods, and some are now just wrong. This is not just
security - but a whole host of answers.

The "meta" StackOverflow rules will tell you to downvote the old answer and
post a new one - but that is not practical - and will take years to take
effect. Plus, many people simply read the first large upvoted answer, copy +
paste the code, and move on.

edit: I guess it would be nice to be able to "flag" an accepted answer (not a
question) as outdated, get 5x people with gold badges for that tag to accept
it - and then the answer is highlighted as wrong/out of date (or even deleted).
Something like that.

~~~
CiPHPerCoder
> Can you get StackOverflow to "sponsor" this idea?

Possibly. I wouldn't even know where to begin.

> Because the problem is there are many "old" accepted answers, with high
> upvotes, that will always come up as number 1 or 2 in Google searches.

Yeah, that's my concern. I can definitely edit anything tagged [php], due to
having a gold [php] badge, and I think I can edit anything because I have a
reputation higher than 10,000.

The hardest problem for me, here, is _identifying_ these old accepted answers
with high upvotes.

My general approach is something like this:

[https://stackoverflow.com/questions/tagged/php+encryption?so...](https://stackoverflow.com/questions/tagged/php+encryption?sort=votes&pageSize=50)

[https://www.google.com/search?q=site%3Astackoverflow.com+php...](https://www.google.com/search?q=site%3Astackoverflow.com+php+encryption)

~~~
laurencei
IMO (as someone with 45k rep on StackOverflow, largely in PHP) - the best way
is to modify the "flag" option on answers - and have an extra option called
"out of date". If someone flags an answer as out of date, then anyone with a
gold badge in that tag (i.e. PHP) - can review and accept or decline the flag.

If 5 gold people accept it - then the answer is formally highlighted as out of
date, or even deleted.

You could raise this on Meta Stackoverflow as one possible way to start:
[https://meta.stackoverflow.com/](https://meta.stackoverflow.com/)

edit: There probably should be another flag option - called "security" -
where even if an answer "works" and is "in date" - people can flag it as
insecure due to a better option. Think of all the stupid SQL injection
answers. You can downvote it to hell - but sometimes they should just be
flagged/deleted.

~~~
voltagex_
Please don't delete out of date information.

Signed, someone who's done a lot of legacy code maintenance (have you tried to
find VB6 or .NET 1 doco these days?)

~~~
nickpsecurity
Ok, so it's not deleted but maybe it gets a banner with links on up to date
stuff like in OP. Then, if they can't use that due to legacy software, they go
forward with whatever the answer contains for what they're stuck with. That
might help both types of users.

------
mywittyname
I think it would be better to write a plugin for various PHP IDEs that contain
fingerprints of bad code which is used to yell at the programmer if they are
using these old, insecure code snippets. Heck, you could add a feature or note
to the developer to down-vote the source of said code.

It would be a lot of work, but I think it's easier than trying to get authors
of defunct blogs to take down 10 year old answers.

As a side-effect, this increases awareness among developers that you can't
blindly accept what's on SO. If developers begin to lose trust in SO, then SO
is going to be incentivized to do something about it, or go the way of
expertsexchange.

~~~
DCoder
There's a plugin for IntelliJ IDEs called "PHP Inspections (EA Extended)" [0]
that detects some of these compatibility and security problems.

[0]: [https://plugins.jetbrains.com/plugin/7622-php-inspections-
ea...](https://plugins.jetbrains.com/plugin/7622-php-inspections-ea-extended-)

~~~
mywittyname
This could certainly be a starting off point for what I was imagining.

I was thinking that they could do analysis on code snippets that are
copy-pasted into the editor. It should be straightforward to compare the
contents to a fingerprint database and issue a warning when they find a match
to online code examples that are marked as vulnerable.
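
A minimal sketch of that fingerprint comparison, in PHP. Everything here is an assumption made for illustration: the function names are hypothetical, the "database" is an in-memory array, and the normalization (dropping all whitespace before hashing) is just one simple choice. A real plugin would more likely compare token streams.

```php
<?php
declare(strict_types=1);

/**
 * Reduce a snippet to a fingerprint that survives re-indentation.
 * All whitespace is removed before hashing (a deliberately crude choice).
 */
function snippetFingerprint(string $code): string
{
    $normalized = preg_replace('/\s+/', '', $code);
    return hash('sha256', (string) $normalized);
}

/**
 * Look up pasted code in a hypothetical map of fingerprint => advisory text.
 * Returns the advisory, or null when the snippet is not known to be bad.
 */
function checkPastedCode(string $pasted, array $knownBad): ?string
{
    return $knownBad[snippetFingerprint($pasted)] ?? null;
}

// Hypothetical database with a single known-bad snippet.
$knownBad = [
    snippetFingerprint('$hash = md5($password);')
        => 'md5() is not a password hash; use password_hash() instead.',
];

// The same snippet, pasted with different spacing, still matches.
$warning = checkPastedCode("\$hash  =  md5( \$password );\n", $knownBad);
```

Exact-hash matching only catches verbatim copies, so renamed variables would slip through. That is the main limitation of this sketch.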

------
tyingq
Reputation management companies struggle to wipe "bad content" off of 3rd
party sites.

Your goal is laudable, just not sure it's practical. Did you consider focusing
on publishing and promoting net new content instead? Get traction that way,
and that content will move up the Google serps, effectively moving the old/bad
content down.

~~~
CiPHPerCoder
> Did you consider focusing on publishing and promoting net new content
> instead?

We've been doing that for years. We've hit diminishing returns because:

1. There is a lot of incumbent material in the same genre, most of which is
10+ years old, but people still link to it in droves.

2. We don't support the at-best-sketchy SEO industry that's sometimes
arm-in-arm with adware.

If you read the 2018 guide this post mentions a few times, you'll notice that
it, in turn, links to relevant blog posts (and open source libraries) spanning
back to early 2015.

~~~
tyingq
I don't think that asking people to link to content that's clearly superior is
"SEO".

Asking a 3rd party site to change a link seems easier than asking a different
3rd party site to rewrite content.

~~~
CiPHPerCoder
That's sort of what I'm asking people for, except presented as a choice.

I'll update the article to try to make it clear that "link to clearly superior
content" is a viable alternative to rewriting said content, as long as the
same goal is achieved.

------
kbenson
Well, PHP has finally come full circle in the lifecycle. It's fully following
in the footsteps of Perl. There was a big push and a lot of talk in the Perl
community around 2010-2012 to find and remove old and poor quality Perl
tutorials so people searching didn't end up with bad information. Part of this
push included writing new and better tutorials and trying to get them placed
higher in results for old tutorials that could not or would not be removed.

Since we all saw and remember how Perl regained its top spot as most popular
programming language, I'm sure this will work out wonderfully. :/

~~~
lkrubner
I almost wrote a similar comment, but I figured I would be downvoted. I
realize it sounds cynical, but why should this effort be made for PHP? I loved
PHP back in the year 2000, and at that time it had advantages that no other
language had, but those days are long gone. There are many better languages
now. I wrote about this in my essay "PHP is obsolete" and there was a fairly
good conversation about that essay on Hacker News:

[https://news.ycombinator.com/item?id=9598309](https://news.ycombinator.com/item?id=9598309)

~~~
tyingq
It is still popular with newcomers, people that don't come from a structured
CS background, etc. Because the barrier to entry is just so low. Meaning, it's
a built in, working option on hosting providers, deceptively simple in the
beginning, etc. Also, PHP still dominates the "host it yourself e-commerce"
space, because of Magento, Opencart, and PrestaShop. Oh, and WordPress...it's
some insane percentage of all sites. And lousy plugins are everywhere.

We could ignore that crowd, but guiding them down the right path is better for
everyone.

Edit: I'm also not convinced that node.js, which is gaining ground with the
same crowd, doesn't have similar issues. Footguns aren't unique to PHP.

~~~
duskwuff
The availability on hosting providers is a HUGE deal. Sure, you can write an
application in Python, or Node, or Java, or whatever else is theoretically
better... but good luck finding somewhere to host it that's cheap and doesn't
require you to do all the sysadmin work yourself (i.e., not a VPS).

PHP may not be the best of all possible languages, but it's by far the most
widely available.

~~~
tyingq
I agree. And, as mentioned, these hosting providers will likely make node.js
just as simple soon, and node will replace php as the whipping boy for
flippant remarks. What's old is new again. This is more about providing
newbies with good advice than it is about PHP.

~~~
duskwuff
Honestly, I don't see that happening soon, if ever. PHP was easy for hosting
providers because it was easy to plug in to existing virtual hosting support
in Apache and FTP servers. Node is more complicated; there's no obvious
"right" way to handle many Node apps running on a single shared server.

~~~
tyingq
There were several ways for PHP too, and still are, FWIW. The shared hosting
providers eventually settled on the best compromise of
price/performance/security.

What they provide now is better and more scalable than old school CGI. Most of
the shared hosts are using LiteSpeed[1] as it squeezes as much as possible
out of PHP in a shared environment. That vendor, and their low end shared host
customers, aren't dumb. They will respond to market changes and make node.js a
1st class support item when it is clear that's where the money is.

[1] [https://www.litespeedtech.com](https://www.litespeedtech.com)

------
fernly
> we can raze the mountains of collective technical debt that have accumulated
> over the past decade.

Not likely when professionally written code is full of errors, many security-
related (e.g. attempting to load and execute .gifs). I recently presented a
compilation[1] of the errors produced by one widely-used commercial ad
function. If this is how professionals write commercial code, I don't want
to imagine what amateurs have been doing.

[1]
[https://www.reddit.com/r/web_programming/comments/7o8dzf/cop...](https://www.reddit.com/r/web_programming/comments/7o8dzf/copious_and_varied_errors_from/)

~~~
avenius
Sounds like an intentionally malicious function to me - just waiting for a
proper payload.

------
jjeaff
An admirable goal. One of my pet peeves with lots of PHP code and extensions
is the unending desire to make every bit of code backwards compatible to some
of the oldest versions of PHP.

There are some great, new cryptographic functions that have been implemented
in PHP >= 5.5

password_hash() and password_verify() are so simple, it's hard to mess up
password hashing now. When I upgraded my projects to PHP 7, I replaced dozens
of lines of code with those 2 functions alone.

But I have seen plenty of implementations of them that still fall back to old
more convoluted and error prone methods when you are using some old version of
PHP.
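
For reference, a minimal sketch of the two functions mentioned above, plus `password_needs_rehash()`. The password literal is a placeholder. A real application would read it from the login form and persist `$hash` in its user table.

```php
<?php
// At registration: store only the hash. The algorithm, cost, and salt are
// embedded in the returned string, so no separate salt column is needed.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login: compare the submitted password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

// Optional: if the defaults have changed since the hash was created,
// re-hash while the plaintext is still available and store the new value.
if ($ok && password_needs_rehash($hash, PASSWORD_DEFAULT)) {
    $hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);
}
```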

~~~
jccc
_> When I upgraded my projects to PHP 7 ..._

The universe of developers with PHP projects to maintain is filled with people
who do not share the same goals and constraints that you might enjoy.

There _is_ an enormous amount of lousy PHP code, and more being made by
clueless developers. But please do not dismiss the need to support old PHPs as
being driven primarily by those reasons.

1. The PHP project itself has EOLed 5.3.3, however distros continue to
support it with backported security and bug patches of their own.

2. PHP 5.3.3 (with backported security and bug patches) remains the default
in CentOS 6.9, which is supported until 2020. More recent versions are not
available via their repositories. Hosts would have to upgrade PHP outside of
the CentOS packages and assume the maintenance burden from then on.

3. About 50% of sites running PHP are at versions less than 5.5:

[https://w3techs.com/technologies/details/pl-
php/5/all](https://w3techs.com/technologies/details/pl-php/5/all)

[https://w3techs.com/technologies/details/pl-
php/all/all](https://w3techs.com/technologies/details/pl-php/all/all)

4. Updating PHP is not like simply updating my Web browser. Real-world
production hosts like mine are filled with various work by various developers
over years. Bumping PHP further than a maintenance release would almost
certainly mean unnecessarily breaking things that are hard to find, probably
tricky to fix and written by people I’ve never met who are long gone.

~~~
astrodust
What's the friction for upgrading from 5.3 to _at least_ 5.5?

~~~
fizdoonk
register_globals

~~~
CiPHPerCoder
extract($_REQUEST);

Look, one-line register_globals polyfill.

~~~
jccc
This kind of breezy dismissal is really frustrating. Developers and admins in
the field are dealing with this situation for the reasons I gave above, and
not out of simple ignorance.

------
EGreg
I want to recommend another technique these days:

1) Sign (HMAC) all the session ids that your server issues. This allows
requests with bogus session ids to be rejected at the network borders without
doing any I/O or hitting the database.

2) Use web crypto (now supported by all major browsers) to have clients
generate a private key with which to sign all requests. Using session keys as
bearer tokens opens users up to attacks.

3) Do NOT send passwords to the server! Use passwords to decrypt the private
key, and do not expose the decrypted key to external Javascript. And make sure
the key can't be exported.

4) Clients authenticate new devices using two-factor authentication. If the
person is using a previously authorized device and knows their password this
may be considered two factors already. Unless the password was saved, in which
case they better have a password on their device. Ultimately you gotta trust
the OS.

5) Authentication and authorization for data may be done automatically by a
side-channel (QR code via camera, or bluetooth) with the proof submitted to
the server by either device. Revocation ultimately needs a blockchain.

6) If you lost all your authorized devices, the backup should be: M of N
public keys, plus a passphrase you know. This is only for rare cases and the
passphrase can be weak.

No more passwords except for the above!!
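
Point 1 above could be sketched roughly as follows in PHP. The function names, the `id.mac` token layout, and the way the server key is obtained are all assumptions for illustration. This only shows how a forged ID can be rejected without a session-store lookup, not the rest of the scheme.

```php
<?php
declare(strict_types=1);

/** Issue a random session ID with an HMAC-SHA256 tag appended. */
function issueSessionId(string $serverKey): string
{
    $id  = bin2hex(random_bytes(32));
    $mac = hash_hmac('sha256', $id, $serverKey);
    return $id . '.' . $mac;
}

/**
 * Return the bare session ID if the tag is valid, or null otherwise.
 * No I/O happens here, so this check can run before any database access.
 */
function verifySessionId(string $token, string $serverKey): ?string
{
    $parts = explode('.', $token);
    if (count($parts) !== 2) {
        return null;
    }
    [$id, $mac] = $parts;
    $expected = hash_hmac('sha256', $id, $serverKey);
    // hash_equals() compares in constant time.
    return hash_equals($expected, $mac) ? $id : null;
}

// Placeholder key; a real deployment would load a long-lived secret.
$key   = random_bytes(32);
$token = issueSessionId($key);
```

A token that fails `verifySessionId()` can be dropped immediately. A token that passes still has to be looked up in the session store as usual.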

~~~
CiPHPerCoder
Please tell me you've seen this before? [https://www.nccgroup.trust/us/about-
us/newsroom-and-events/b...](https://www.nccgroup.trust/us/about-us/newsroom-
and-events/blog/2011/august/javascript-cryptography-considered-harmful/)

~~~
EGreg
I haven't seen _this_ particular article, but having read it, I can tell you:
there are different threat models. You can't be secure against them all on the
Web. Of _course_ we have to trust the server to deliver the initial code. The
same is true with apps delivered via the appstore etc. But that doesn't mean
you should be sending and storing password hashes to the server, even if
generated by PHP. It doesn't mean you should be using the session cookie alone
as a bearer token to access a session.

First of all, you have to agree that a security requirement in _addition_ to a
cookie doesn't make things _less_ secure.

Secondly, with Web Crypto the Web has a way to mark keys "non exportable". If
the website is sending you the wrong resources then of course anything can be
sent, and web-based code isn't the ultimate way to protect the user. The same
is true of other approaches. However _if_ the initial code download wasn't
tampered with, then you are far more protected. Because the secret private key
won't be exported from the browser website. And it won't be accessible to
anyone outside the JS environment that asks for your password or finger to
derive a key to decrypt the master key from the local database. And in that JS
environment, you can make sure (via closures) that no one gets access to it in
"userland".

OH AND YOU SHOULD ALSO BE USING A PRIVATE KEY PER USER TO ENCRYPT DATA AT REST
ON YOUR DATABASE, AND STORE THIS KEY IN THE DB MULTIPLE TIMES - EACH ONE
ENCRYPTED BY THE USER'S DEVICE KEY. You don't store these device keys.
Successfully authenticated requests from the device send this key. So once
again you need to obtain this key in order to unlock user's info needed for
the request. And users can send permissions to unlock their information to
each other in sidechannels. You can take this security VERY far...

So it's strictly more secure than the server side database for passwords, even
hashed and salted with key strengthening. BUT, don't advertise it because then
it introduces security attacks where people over-rely on this to the detriment
of the vectors mentioned in the article.

PS: Oh. This was written in 2011, before the Web Crypto standard _I am
referring to_ was published and adopted by all web browsers. I do NOT
recommend doing the crypto methods in JS! And yes it has a secure RNG now.

[https://developer.mozilla.org/en-
US/docs/Web/API/Web_Crypto_...](https://developer.mozilla.org/en-
US/docs/Web/API/Web_Crypto_API)

Also _Object.freeze()_ is a thing now.

~~~
CiPHPerCoder
> But that doesn't mean you should be sending and storing password hashes to the
> server, even if generated by PHP.

There is prior work in this genre. Something like SPAKE2-EE might be worth
looking into here.
[https://github.com/jedisct1/spake2-ee](https://github.com/jedisct1/spake2-ee)

> It doesn't mean you should be using the session cookie alone as a bearer
> token to access a session.

Well, maybe.

A random ID that just tells PHP where to look for the session data, with all
the data persisted server-side, is secure as long as it's transferred over
HTTPS. Most frameworks/libraries abstract the implementation details away, but
generally:

    
    
      <?php
      use ParagonIE\ConstantTime\Base32;
      // 32 bytes from the CSPRNG, encoded for use as a session identifier
      $random = Base32::encode(random_bytes(32));
    

This value will be unpredictable and doesn't require an HMAC to ensure this
property. The only time the HMAC adds value is if you're using the user's
computer as a data mule for the entirety of session state rather than "look up
this identifier in a database".

In those use-cases, you enter the usual JWT abuse territory, as outlined here:
[http://cryto.net/~joepie91/blog/2016/06/19/stop-using-jwt-
fo...](http://cryto.net/~joepie91/blog/2016/06/19/stop-using-jwt-for-sessions-
part-2-why-your-solution-doesnt-work/)

In this genre, we're working on PAST (although this is probably going to be
renamed before it's finalized) to solve the cryptography flaws baked into the
JWT standards (collectively, JOSE):
[https://github.com/paragonie/past](https://github.com/paragonie/past)

That doesn't solve the "replay attack" issue (which may be what you were
referring to with bearer tokens).

> OH AND YOU SHOULD ALSO BE USING A PRIVATE KEY PER USER TO ENCRYPT DATA AT
> REST ON YOUR DATABASE, AND STORE THIS KEY IN THE DB MULTIPLE TIMES - EACH
> ONE ENCRYPTED BY THE USER'S DEVICE KEY.

I'm not entirely sure what you're getting at here. If I needed to share a key
across devices, I'd either use Diffie-Hellman or Shamir Secret Sharing to
accomplish the task (depending on use-case and threat model).

Perhaps there's a lot of implementation detail that's not being discussed here
that I'm missing and what you're saying is a conservative local maximum, but
it stuck out a tad bit.

~~~
EGreg
Perhaps. What I mean is, the session id as a bearer token for example can
still be insecure even if sent via https. For example, if PHP scripts use
$_REQUEST and session_start() uses the id sent in $_GET from GPC. So you can
have session fixation attacks. However, if you additionally require devices to
sign all requests with their private key (stored in IndexedDB for the domain
and not exportable) then Web Crypto can help mitigate many classes of attacks,
including session fixation and CSRF. (It can even increase security over http
without https, even though it's an academic point.)

Today we wrote up an article about this, actually, referencing your guide:

[https://qbix.com/blog](https://qbix.com/blog)

It goes into more detail about all this.
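
Separately from the request-signing idea, the specific fixation scenario described above (a session ID accepted from `$_GET`) can also be narrowed with PHP's own session settings. This is a sketch under assumptions: the function names are hypothetical, and it does not implement the Web Crypto scheme. The option keys are standard `session.*` ini settings passed to `session_start()`, which accepts them since PHP 7.0.

```php
<?php
declare(strict_types=1);

/** Session options intended to make fixation via URL parameters harder. */
function hardenedSessionOptions(): array
{
    return [
        'use_strict_mode'  => 1, // refuse session IDs the server did not issue
        'use_only_cookies' => 1, // never read the session ID from the URL
        'use_trans_sid'    => 0, // never append the session ID to links
        'cookie_httponly'  => 1, // hide the cookie from page JavaScript
        'cookie_secure'    => 1, // send the cookie over HTTPS only
    ];
}

/** Start the session with the options above. */
function startHardenedSession(): void
{
    session_start(hardenedSessionOptions());
}

/** Call after a successful login so any pre-login ID becomes useless. */
function onSuccessfulLogin(): void
{
    session_regenerate_id(true);
}
```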

~~~
CiPHPerCoder
> Today we wrote up an article about this, actually, referencing your guide:

Neat. I'll have to give that a read in the morning.

Although, it looks like one of your links in the opening paragraph under the
"Web Security in 2018" header is broken, and presumably that was the one meant
to link to our guide.

~~~
EGreg
We fixed it :) Lmk what you think.

------
ewams
I disagree with their suggestion to comment out or remove old PHP code. People
need to have a date and timestamp at the top of their articles AND state what
version of PHP they are coding for.

Information on older versions should not be pruned as there are legitimate
reasons to keep them. Not everyone is using PHP 7; what do you do when you
inherit an older code base; what do you do when you just need to modify a few
things on an older code base; what if you are trying to learn security based
programming; what if you are trying to learn how to break into systems (to
then learn how to protect them); what if PHP7 is not available or feasible for
your project; etc.

------
deanclatworthy
Hi Scott, thanks for all the great efforts. This is a good start and baseline,
but security is so much harder once you get past these basics. Once you get
into the realm of timing attacks and all the ways to fuck up password recovery
for example, it becomes a mine field.

What can we do about that, short of “use framework X or Y” as they are the
only ones peer reviewed?

------
leetbulb
I've been a PHP developer for over ten years. Paragon Initiative is amazing.
Their Github repos are full of useful things:
[https://github.com/paragonie](https://github.com/paragonie)

------
graphememes
It will never be secure unless you do it at the system / language level.

~~~
CiPHPerCoder
And we've been working on that.

* [https://wiki.php.net/rfc/libsodium](https://wiki.php.net/rfc/libsodium) (PHP 7.2)

* [https://wiki.php.net/rfc/mcrypt-viking-funeral](https://wiki.php.net/rfc/mcrypt-viking-funeral) (PHP 7.1)

* [https://wiki.php.net/rfc/random-function-exceptions](https://wiki.php.net/rfc/random-function-exceptions) (PHP 7.0, largely the result of a mailing list discussion critiquing the original implementation of the new CSPRNG functions)

There is plenty more we want to work on in the coming months.
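
As a rough illustration of what those language-level changes look like in user code, here is a sketch of authenticated encryption with the sodium extension. It assumes ext/sodium is loaded (bundled with PHP since 7.2, per the RFC above). The message and variable names are placeholders.

```php
<?php
declare(strict_types=1);

// Assumes ext/sodium is available (bundled with PHP since 7.2).
$key = sodium_crypto_secretbox_keygen();

// random_bytes() throws rather than returning weak output if no secure
// randomness source is available.
$nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);

// Encrypt and authenticate in one call, then decrypt.
$ciphertext = sodium_crypto_secretbox('attack at dawn', $nonce, $key);
$plaintext  = sodium_crypto_secretbox_open($ciphertext, $nonce, $key);

// Flip one bit: decryption returns false instead of garbage.
$tampered    = $ciphertext;
$tampered[0] = chr(ord($tampered[0]) ^ 1);
$rejected    = sodium_crypto_secretbox_open($tampered, $nonce, $key);
```

A nonce must never be reused with the same key. Generating a fresh random one per message, as here, is the usual approach for this construction.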

~~~
mtgx
How about Noise?

[http://noiseprotocol.org/](http://noiseprotocol.org/)

~~~
CiPHPerCoder
That's more of an "open source library" goal of mine:
[https://github.com/paragonie-scott/public-
projects/issues/6](https://github.com/paragonie-scott/public-
projects/issues/6)

------
stcredzero
Golang community, please take the history of the PHP community as a series of
models for 1) what to do and 2) what not to do! Also do this for Java, Ruby,
Javascript, LISP, C++, and Smalltalk!

And if you think the history of PHP isn't applicable, then go and find the
Golang library authors who are advocating _filtering_ as a defense against SQL
injection!
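
For contrast with filtering, this is what the parameterized alternative looks like, shown in PHP since that is the language under discussion. It is a sketch that assumes the `pdo_sqlite` driver is installed. The table and data are made up for the example.

```php
<?php
declare(strict_types=1);

// Assumes the pdo_sqlite driver is available; uses a throwaway in-memory DB.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Classic injection attempt. No escaping or filtering is applied to it.
$hostile = "alice' OR '1'='1";

// The query text and the data travel separately, so the input is only ever
// treated as a value, never as SQL.
$stmt = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$stmt->execute([':name' => $hostile]);
$rows = $stmt->fetchAll(PDO::FETCH_COLUMN);

// The same prepared statement with an ordinary value.
$stmt->execute([':name' => 'alice']);
$match = $stmt->fetchAll(PDO::FETCH_COLUMN);
```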

------
smacktoward
What's needed here is a "naming and shaming" effort. Make a public directory
of bad PHP tutorials/references/etc., with the names of the people and
companies who wrote and host them prominently attached. Maybe even give them a
score, based on how frequently cited/linked to their bad advice has become, or
how many pieces of it they've proffered. Then only take them off the list when
the documents are cleaned up or removed.

If you're a professional PHP developer or a company that builds on PHP, it
would be very embarrassing to find yourself prominently featured on such a
list. Which would create an incentive for those people to clean up their work
so they can get off it.

As things stand currently, publishing outdated and dangerous information costs
the publisher nothing, so they see no reason to stop doing it. Create a cost
by attaching reputational damage to the act, and you create a reason for them
to stop.

~~~
rypskar
If that list becomes popular, search engines will use it as a validation that
the linked articles are good, so the bad tutorials will get better SEO and
even more newcomers will find and use them. The list can even be an incentive
to write bad articles since you get free references from this site, which give
scammy authors more visitors and more income from ads

~~~
Can_Not
Actually I think there are SEO ways to link but also to tell search engines
that you're not endorsing the link, probably called no refer or something.
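
The mechanism being described is most likely the `rel="nofollow"` link attribute. A small PHP sketch of how a "name and shame" list might render its links that way follows. The helper name and the example URL are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Render a link that asks search engines not to treat it as an endorsement.
 * Both the URL and the label are HTML-escaped.
 */
function nofollowLink(string $url, string $label): string
{
    return sprintf(
        '<a href="%s" rel="nofollow">%s</a>',
        htmlspecialchars($url, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($label, ENT_QUOTES, 'UTF-8')
    );
}

$html = nofollowLink('https://example.com/old-tutorial?a=1&b=2', 'Outdated tutorial');
```

Whether search engines fully honor the hint is up to them, so this reduces the endorsement risk raised above but may not remove it.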

