The discussion on race conditions at the end is an important one, and IMO the bugfix is a bandage at best: the notion of anything accessing the “current” object after any kind of delay, especially in an event handler, when there is any chance the thing is not a singleton, is a recipe for disaster. In this case, dismissing the “current” security code screen was a supported API surface and that should set off all the red flags.
Of course it’s annoying to have to track the identity of “our screen” and bind that identity to event handlers, or make it accessible with context etc. But it’s necessary in anything remotely security-adjacent.
(And never assume anything is a singleton unless you add a breadcrumb comment for someone who might change that assumption on a different team!)
Agreed. The fixed logic, at least judging by the commit message, still feels very shaky on correctness grounds ("if we are dismissing something that doesn't seem to be right, ignore it").
Since they're rewriting code and changing method signatures anyway, I would prefer they got rid of the notion of a "currently visible screen" and made sure that every dismiss() call carries a unique pointer or token identifying exactly what is being dismissed. If this were my codebase, their approach would give me all sorts of bad vibes about additional problems lurking deeper.
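To make that concrete, here's a minimal Kotlin sketch of the token idea (all names here are hypothetical, not AOSP APIs): dismissal only succeeds when the caller presents the token that was issued for the screen actually on display, so a stale reference held by a delayed event handler becomes a harmless no-op.

```kotlin
// Each shown screen gets its own token; identity comparison (===) is all
// the host needs to tell instances apart.
class DismissToken

class SecurityScreenHost {
    private var current: Pair<DismissToken, String>? = null

    /** Show a screen and hand back a token naming exactly this instance. */
    fun show(screenName: String): DismissToken {
        val token = DismissToken()
        current = token to screenName
        return token
    }

    /** Dismiss only if the token matches what is actually on screen. */
    fun dismiss(token: DismissToken): Boolean {
        val (activeToken, _) = current ?: return false
        if (activeToken !== token) return false // stale token: no-op
        current = null
        return true
    }
}

fun main() {
    val host = SecurityScreenHost()
    val simToken = host.show("SIM PIN")
    host.show("Device PIN")         // SIM screen replaced by a new screen
    println(host.dismiss(simToken)) // false: the stale token is ignored
}
```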
The whole process and the nature of the fix doesn't inspire a lot of confidence in the security of Pixel/Android in general.
So you'd go out and refactor a major security-sensitive component (which most likely predates your career) in the span of a single month, to hit an emergency security patch deadline?
That doesn't inspire a lot of confidence in your risk assessment and decision making.
I'd do what Google did: roll out a patch that addresses the immediate danger, and then backlog proper refactors over time.
Their fix included a similarly large refactor, they just used the "security screen type" as a newly introduced parameter instead of something unique to the screen instance.
I do agree that in the real world, sometimes you have to settle for a less-than-ideal solution. I hope my post reads less like "those people are idiots", which was not my intent, but more like: this specific fix isn't ideal, and knowing this type of code is live in a device doesn't fill me with confidence, even if I can understand reasons for why it was done that way.
Right? This was absolutely the "right" level of refactor for a hotfix, as the full refactor would introduce much more state management that could itself introduce bugs. And especially if behind the scenes there was a detailed audit of what things can currently access the current security screen, it would be fine for now.
But I sincerely hope that in the postmortem, there would be a larger meta-discussion around code review practices and how something like this "global dismiss" became part of the API surface to begin with, and a sincere prioritization of a larger review within the backlog. Though with everyone on edge at this time in big tech, I doubt that ends up happening :(
Their change is hardly a big refactor:

> 12 changed files with 102 additions and 26 deletions. [1]

That includes all the new code, all the parameter changes everywhere the function is used, and two additional test cases. It's a tiny change.
I don't think that is as much of an issue as the ridiculous process he had to go through.
Think about that first security researcher. You literally found a screen unlock bypass (should be Priority #1, right?) - and Google just went and put fixing it on the back burner.
If they will put something like that on the back burner, what else are they ignoring? It isn't confidence-inspiring.
Edit: Also, knowing Google, what are the odds of your full refactor ever happening? "Temporary" fixes become permanent fixes quickly.
Maybe it was an already well-known exploit. After all, this was a duplicate, and Google was sitting on it. Two people found it and reported it to Google. Why couldn't a third have found it and sold it instead?
> The fixed logic, at least judging by the commit message, still feels very shaky on correctness grounds
This was my experience as a dev on a team at Google for a few years. I saw a LOT of exceedingly lax treatment of correctness in the face of concurrency. More than once I saw the decision made to guess at a fix for a concurrency bug and ship it with a "well, looks good to me, let's see if it does anything."
It's par for the course, and folks get (got? =P) paid handsomely for doing it.
They already have multiple security screens, and a demonstrated critical bug with security screen confusion. Not sure how this is premature optimisation.
> the number of screens is small and there are few tiers (only 2)
Making this kind of assumption, when there are no such guards in the system itself, is exactly what leads to security issues.
If the system enforced two named singletons as security screens, so it was impossible to .dismiss() the wrong thing, then sure. But that's not how the system is built, and assuming that "the number of screens is small" and "there are only 2 tiers" without enforcing that assumption in code is pretty much how the original bug was introduced.
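Here's a hypothetical Kotlin sketch (mine, not AOSP code) of what "enforce the assumption in code" could look like: a sealed hierarchy turns "there are exactly two tiers" into a fact the compiler checks rather than a comment someone can silently invalidate.

```kotlin
// Exactly two tiers exist; adding a third requires touching this file.
sealed class SecurityTier {
    object SimLock : SecurityTier()     // PIN/PUK for the SIM
    object DeviceLock : SecurityTier()  // PIN/pattern/password for the device
}

fun onDismissed(tier: SecurityTier): Unit = when (tier) {
    SecurityTier.SimLock -> println("SIM usable; device stays locked")
    SecurityTier.DeviceLock -> println("full unlock")
    // No else branch: adding a third tier refuses to compile until every
    // dismiss path explicitly handles it.
}
```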
Since they are dismissing the Lock Screen _type_ (SIM, PUK, Pin) and not the instance, a logical example of where this might go wrong is dual SIM. Then again, worst case you dismiss the incorrect SIM Lock Screen; that won't give you a full unlock, and the 'wrong' SIM still won't work.
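A toy model of that ambiguity (illustrative names only, nothing from AOSP): with two SIM PIN screens on the stack, a type-keyed dismiss cannot say which slot it means.

```kotlin
// Both SIM PIN screens share the same type, so a check keyed on the type
// cannot tell the slots apart.
enum class SecurityMode { SimPin, SimPuk, Pin }

data class SecurityScreen(val type: SecurityMode, val simSlot: Int? = null)

val stack = ArrayDeque<SecurityScreen>()

fun dismissByType(expected: SecurityMode) {
    val top = stack.firstOrNull() ?: return
    if (top.type == expected) stack.removeFirst() // matches either slot!
}

fun main() {
    stack.addFirst(SecurityScreen(SecurityMode.SimPin, simSlot = 0))
    stack.addFirst(SecurityScreen(SecurityMode.SimPin, simSlot = 1))
    // A delayed callback meant for slot 0 dismisses slot 1's screen instead:
    dismissByType(SecurityMode.SimPin)
    println(stack.first()) // SecurityScreen(type=SimPin, simSlot=0)
}
```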
Yeah, an attacker may be able to use their own dual-SIM Pixel phone to bypass the SIM lock screen for a stolen SIM card whose PIN or PUK code they don't know, using a similar technique. But like you said, I'm almost certain it wouldn't actually let them send and receive texts (and if it did, that would really be an issue in the SIM card's OS, considering anyone could just modify AOSP to ignore the SIM lock screen and put that on their own phone).
Even still, being able to bypass the SIM lock screen would be a bug, just not a vulnerability. Google doesn't pay bounties for non-security bugs to the best of my knowledge, but I can't help but feel this is still not an ideal way to design the system. It is likely fine today, but as strix_varius said, these kinds of assumptions are what led to this vulnerability in the first place. Vulnerabilities do pop up from time to time in otherwise well-designed systems, but this lock screen bypass never would have been present had better design practices been followed. As krajzeg said [1]: "The whole process and the nature of the fix doesn't inspire a lot of confidence in the security of Pixel/Android in general."
I would indeed expect something more robust like a security state machine where not all states can transition to any other state freely. The UI shouldn't even have a say in what transitions are allowed.
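Something like this minimal sketch (my own shape, not Android's): the legal transitions live in one table, and the UI can only request a transition, never force one.

```kotlin
enum class LockState { Locked, SimVerified, Unlocked }

// The single source of truth for which transitions are legal.
private val allowed: Map<LockState, Set<LockState>> = mapOf(
    LockState.Locked to setOf(LockState.SimVerified, LockState.Unlocked),
    LockState.SimVerified to setOf(LockState.Unlocked),
    LockState.Unlocked to setOf(LockState.Locked), // relock only
)

class LockStateMachine {
    var state: LockState = LockState.Locked
        private set

    /** Returns false (and changes nothing) for an illegal transition. */
    fun request(target: LockState): Boolean {
        if (target !in allowed.getValue(state)) return false
        state = target
        return true
    }
}
```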
The Rx model works nicely. Rather than trying to model state transitions, you model "scopes of work." So if you have something like an event listener, you would tie it to the active unlock session. When that session ends, you dispose of that "scope of work" and all work associated with it would immediately stop.
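Roughly, with RxJava 3 (a sketch of the pattern, not Android's actual code):

```kotlin
import io.reactivex.rxjava3.disposables.CompositeDisposable
import io.reactivex.rxjava3.subjects.PublishSubject

// Every listener is registered against the session's CompositeDisposable,
// so ending the session stops all of them at once.
class UnlockSession {
    private val work = CompositeDisposable()
    private val events = PublishSubject.create<String>()

    /** Listeners live only as long as this session. */
    fun onEvent(handler: (String) -> Unit) {
        work.add(events.subscribe(handler))
    }

    fun emit(event: String) = events.onNext(event)

    /** Every subscription above is disposed together, so a stale handler
     *  can never fire against a newer session's screen. */
    fun end() = work.dispose()
}
```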
No, not exactly, but Android is old and gnarly enough that a lot of components don't have a clear Model/View separation you'd expect in modern codebases.
Well, the UI element's dismissal is what signals the system that holds the unlock state. And the problem was that multiple other systems would automate the dismissal of the current UI element… without checking whether it was the UI element they expected to be there!
Google invests a great amount of money in their Project Zero team, but does anyone know if they have a specific red team dedicated to Android?
As a former Google Information Security Engineer I can say: I don't know, because the Android security organization is completely separate from Google's security organization. It's something I always found odd, as it created some redundancy between the two organizations, a lack of homogeneous processes, etc.