First, why on earth would you still be using non-parameterized queries? Hell, I even remember that back in PHP 5 the docs explicitly said to please use parameterized queries. To think that after all that evangelizing, mysql_query et al. would still be used is really saddening.
Second, huge fucking kudos to them for the incredible response time, transparency, and great logging (looking at you, Ubiquiti).
All in all, PHP was my first real programming experience and will always remain my sweetheart. I hope its community grows even stronger.
Legacy software no one wants to maintain, and no one is paid to maintain.
Unfortunately, some of the core developers were in that camp so PHP 4 came without improvements, PDO avoided the opportunity to be safer, etc.
register_globals had a similar arc: people knew it was a risky feature before the turn of the century but turning it off would inevitably get someone whining about how hard it was to explicitly import their variables or check both GET and POST, as if this wasn’t trivially abstracted.
A lot of this goes back to 90s C / Unix culture. PHP was a product of that world and had the same “Real Programmers™ check their inputs & return codes. If your code breaks, it’s your fault for not being a Real Programmer™ and I don’t want to be slowed down by safety checks intended for you.” attitude which has taken decades to stamp out.
MySQL didn't introduce prepared queries until 4.1, which was released in 2004.
Poking around https://metacpan.org/pod/DBI I see support was added at some point in 1997, possibly as early as 1996. Python’s DB-API spec had it no later than 2001.
The other thing to remember is that it wasn’t uncommon to have drivers emulate this behavior on databases which didn’t have protocol level support for it. That didn’t help performance but it did accomplish the goal of making sure that data wasn’t confused with code.
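To make the "data isn't confused with code" point concrete, here's a minimal sketch using Python's standard-library sqlite3 module. The table, column names, and input string are made up for illustration; they are not the php.net schema.

```python
import sqlite3

# In-memory database with one hypothetical user row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, pass TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

malicious = "nobody' OR '1'='1"

# String concatenation: the input becomes part of the SQL text,
# so the injected OR clause matches every row.
unsafe_sql = "SELECT username FROM users WHERE username = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the value is bound separately from the statement,
# so it is only ever compared as a literal string.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious,)
).fetchall()

print(unsafe_rows)  # [('alice',)]
print(safe_rows)    # []
```

The same shape applies whether the driver sends the parameters over the wire or emulates the binding client-side, as long as the escaping is done by the driver and not by hand.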
To be clear, I’m basing my comments on having used PHP professionally starting with PHP 3 for many projects, including some household name companies. I trained a fair number of people, some of my earliest open source work was in PHP, etc. so I don’t hate the language but I definitely think there are cautionary lessons to learn about the value of defaults and how languages are taught. As we’ve seen with C, telling people to be more diligent is less effective than making safe behavior the default that you have to opt out of, rather than the reverse.
You optimist. I think we're still trying to stamp it out :-)
WordPress is still using non-parameterized queries; it's assumed they have been battle-tested over the years. Given that PHP is 26 years old, it's no surprise that legacy software still exists.
Attackers got the passwords but not the usernames. Turns out the stored input had a character limit and the project stored fields as 'username' and 'pass'. Since 'pass' was shorter the attackers were able to squeeze it in a short query, but couldn't get usernames.
Looking at https://main.php.net/login.php it's 'user' and 'pw' in the form, assuming the db fields match.
Totally not insane if you ask me. Security incidents can happen to anyone. The incident response matters and I think in this case those unpaid volunteers did an excellent job.
"at PhpStorm, we are fans of Nikita’s work. We always supported the Open Source, and this felt like a new opportunity – so here we are! Nikita will continue contributing awesome features to PHP, and together we will experiment on what is possible in the realm of language tooling."
I am personally not qualified to do infrastructure maintenance, knowing next to nothing on the topic.
It's also often the case that those unpaid volunteers have a day job related to the projects they volunteer in, so there is some direct or indirect financial support. It seems reasonable to assume a large part of open-source software and infrastructure is maintained by people having vested interests in it.
And the idea that there is some mischief- or accident-proof alternative is, I'd argue, equally nutty.
- md5 passwords more or less
- “...running very old code on a very old operating system...”
- no parameterized queries
This is pretty much... all large scale enterprise IT: Everyone's good with security improvements until it interrupts a business unit, and then someone up the chain decides it isn't worth the disruption to deal with a hypothetical security risk.
We think of Python as severely under-resourced (2 FTE devs on the language IIRC?) but PHP has less. PHP's ethos means there's no FAANG sponsorship; companies like JetBrains do sponsor core developers but the developers of the de facto debugger, unit testing framework and package manager/repository all fundraise for their own work. It's nice in the 00's-era hacker ethos, but for a language so widely used it's deeply under-resourced.
And that's one reason why the infrastructure is old and creaking.
As recently as 2020 there was pushback by php-internals' members to using GitHub because it wasn't open/their own infrastructure. This incident has been a catalyst for change, and there is much more change needed.
The parameterized queries could have been fixed, though, right? And you can progressively update people's passwords to a different algorithm as they log back in, and warn them that in six months their password will need to be reset if they haven't logged in.
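For what it's worth, the upgrade-on-login idea can be sketched roughly like this. It's a minimal sketch, not how php.net did it: the record layout, the `login` helper, and the iteration count are all assumptions, and PBKDF2 from Python's hashlib stands in for whatever slow password hash you would actually pick.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your hardware

def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()

def slow_hash(password: str, salt: bytes) -> str:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS).hex()

def login(record: dict, password: str) -> bool:
    """Verify a password; upgrade a legacy MD5 record in place on success."""
    if record["scheme"] == "md5":
        if not hmac.compare_digest(record["hash"], md5_hex(password)):
            return False
        # We have the plaintext right now, so re-hash with the slow scheme.
        salt = os.urandom(16)
        record.update(scheme="pbkdf2", salt=salt, hash=slow_hash(password, salt))
        return True
    return hmac.compare_digest(record["hash"], slow_hash(password, record["salt"]))

# Hypothetical legacy row.
user = {"scheme": "md5", "hash": md5_hex("hunter2")}
print(login(user, "hunter2"), user["scheme"])  # True pbkdf2
print(login(user, "wrong"))                    # False
```

Accounts that never log in keep their MD5 hash, which is why a forced reset after some deadline is still needed.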
But, yeah, the comment above is right about enterprise IT. I’ve sat in on meetings where 2FA was ruled out as being too unusable by the marketing team, who said quarterly password rotations were a “good compromise”. They just want to be seen to be doing something about security, and pick fixes that don’t involve much cost to anybody, but also tend to provide little value.
Can’t you have a scheme that is like bcrypt(md5(password)) and since you already have md5(password) in the database you can migrate to bcrypt?
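In principle yes. A rough sketch of that wrapping scheme, with the caveat that bcrypt isn't in Python's standard library, so PBKDF2 is swapped in here as the outer slow hash; the function names and the iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your hardware

def wrap_md5(md5_hex: str, salt: bytes) -> str:
    """Slow-hash an existing MD5 hex digest; no plaintext password needed."""
    return hashlib.pbkdf2_hmac("sha256", md5_hex.encode(), salt, ITERATIONS).hex()

def verify(password: str, salt: bytes, stored: str) -> bool:
    """Recompute slow(md5(password)) and compare in constant time."""
    inner = hashlib.md5(password.encode()).hexdigest()
    return hmac.compare_digest(stored, wrap_md5(inner, salt))

# Offline migration of one legacy row (hypothetical data).
legacy_md5 = hashlib.md5(b"hunter2").hexdigest()
salt = os.urandom(16)
migrated = wrap_md5(legacy_md5, salt)

print(verify("hunter2", salt, migrated))  # True
print(verify("wrong", salt, migrated))    # False
```

The upside is that every row can be migrated in one batch without waiting for logins; the downside is that the MD5 step stays in the verification path until each user's hash is replaced.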
Kerberos or even basic auth (because the password can be checked on the server against a real algorithm) would've made this impossible.
Yeah, that's not great.
which is why every two years, I switch the servers completely to fresh new machines, installing everything anew. Let those old things that we installed for who-knows-why die off.
Doing what I can, I make a point of having a calendar reminder to update my software's dependencies every month. Doing it frequently reduces the amount of work needed and keeps things up to date.
Moreover, we template system installation further, so I can install the same server with the same config, with the latest packages with just three commands.
Can't that also suggest that the user database is fine (otherwise they would have guessed correct usernames in the first place), but when a user was found they could find their password in some other leak?
At least to me, this seems like the most obvious vector for a leaked password, but not username.
> the new system supports TLS 1.2, which means you should no longer see TLS version warnings
> The implementation has been moved towards using parameterized queries, to be more confident that SQL injections cannot occur.
> plain md5 hash
That's not fantastic. I get that people often don't touch working legacy systems for fear of breaking them, but this sounds rather avoidable. Does the PHP project go through any audits or risk assessments? It's rather surprising for, as others have mentioned, the backbone of the majority of ecommerce websites.
PHP only has 2 full time engineers that work on the core. Everyone else is a volunteer, what do you expect? Companies to actually contribute towards employing more active contributors? HAH
Why not consider Sourcehut, or even GitLab, which are both hosted and they don't need to deal with handling their own infra?
It is notable that the attacker only makes a few guesses at usernames, and
From the logs, the hacker seems to be guessing the password too. This is not consistent with a database leak. It is consistent with teamwork where a separate reconnaissance job was done beforehand and the hacker had access to profiles of the developers, with separate lists of known usernames and passwords.
Now md5 is outdated, bad and was never made for passwords, but still, there's no easy way to simply invert md5. So for this to have happened, the password must've been bruteforced. Which is only practically possible if it was a weak password.
Which makes me think maybe that's not what happened and maybe the real culprit is credential stuffing. If someone had a password weak enough that brute-forcing is plausible then maybe that person also used the password for another service.
Back in 2016, hashcat could achieve 200 billion hash crack attempts per second on commodity hardware: https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...
That can crack all alphanumeric 8-character passwords in 18 minutes, and all 9-character alphanumeric+symbol passwords in just 10 days!
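The 8-character figure is easy to check: 62 possible characters, 8 positions, divided by the guess rate. A quick sketch, taking the 200 billion guesses per second quoted above as given (the function name is made up for illustration):

```python
RATE = 200e9  # MD5 guesses per second, the figure quoted above

def seconds_to_exhaust(charset_size: int, length: int, rate: float = RATE) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return charset_size ** length / rate

# 26 lowercase + 26 uppercase + 10 digits = 62 characters.
alnum8 = seconds_to_exhaust(62, 8)
print(f"{alnum8 / 60:.1f} minutes")  # 18.2 minutes
```

The 9-character number depends on how many symbols you count in the character set, so plug in your own `charset_size` for that one.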
Then consider that there are precomputed rainbow tables that have had far more computing power thrown at them, and suddenly MD5 starts looking more like a "light obfuscation" than a true cryptographic hash...
...because that's not what a cryptographic hash means. SHA256, a "secure" hash is vulnerable to the same things you mentioned. The term you're looking for is a password hash, or a key stretching function, which is intentionally slow to be resistant to brute-forcing. As for rainbow tables, that's solved at the application level with salts.
git.php.net supported pushing changes not only via SSH, but also via HTTPS
The master.php.net system, which is used for authentication and various management tasks, was running very old code on a very old operating system / PHP version.
Are you ok to sometimes wait up to an hour to log in?
It just wasn't worth the trouble and headaches.
Do not reinvent the wheel. Passwords work. Just store them somewhat sanely with an acceptable degree of security.
Sounds like progress to me.