
Maximizing password manager attack surface - robin_reala
https://palant.de/2018/11/30/maximizing-password-manager-attack-surface-leaning-from-kaspersky
======
pstch
I don't find it weird for a password manager webextension to use a native
binary, as password managers may be used in contexts other than browsers, and
as it allows the use of well-established cryptographic tools.

Of course, serializing to HTML and parsing it outside of the browser seems
like a very bad idea. I suppose it was done this way because their neural
network was written for HTML.

I wonder what made Kaspersky use a neural network to recognize password
fields. It seems like overkill to me, but I may be underestimating the
complexity of properly handling password fields.

~~~
tgtweak
As someone who built a machine learning field detector that works only on the
source html, there are a few challenges to detecting these in the browser:

1\. JavaScript is not a great language to build high speed low resource
inference engines.

2\. Sending the user's HTML to your cloud to do so is a privacy, security and
latency nightmare.

3\. Regex matching doesn't get you anywhere near 95% field accuracy (not even
talking about form accuracy, identifying all fields on a page correctly) on
the web. You'd be amazed (or maybe not if you're a web dev) at the lack of
consistency in field names - from machine generated field names to missing
names to duplicate names... It's magical. Even to a human observer looking at
a rendered page, it can be confusing which field is which. Just look at Chrome
trying to fill out a complicated address form for you using a saved persona.

4\. Even if achieved in JavaScript, the model would be simple to pull out and
reuse elsewhere, possibly to learn how to game it.

5\. Good models will be built by good data scientists, whose favorite tools
likely don't produce models that can be serialized for use in a JavaScript
inference engine.

This comes down to the approach taken here - put the inference in the code
where it runs (and that was likely the same that trained the model) and
interface with the page via extension to ship it there and back. It's local so
it never leaves the machine.

It's very high stakes to take user controlled HTML into unsafe memory. Large
surface area doesn't automatically mean an insecure implementation.
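
To make point 3 above concrete, here is a minimal sketch of name-based regex
matching and how easily it goes wrong. The patterns and sample field names are
made up for illustration; they are not taken from any real product.

```python
import re

# Hypothetical patterns, checked in insertion order.
PATTERNS = {
    "username": re.compile(r"user(name)?|login|e-?mail", re.I),
    "password": re.compile(r"pass(word|wd)?|pwd", re.I),
}

def classify_field(name):
    """Guess a field kind from its name attribute, or return None."""
    if not name:
        return None  # missing names give the matcher nothing to work with
    for kind, pattern in PATTERNS.items():
        if pattern.search(name):
            return kind
    return None

# Made-up names of the kind found in the wild.
samples = ["username", "user_pass", "field_7f3a", "", "pwd_confirm"]
guesses = [classify_field(n) for n in samples]
```

Here `user_pass` is classified as a username because the first pattern matches
"user", the machine-generated `field_7f3a` and the empty name are not
classified at all, and `pwd_confirm` is indistinguishable from the actual
password field.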

~~~
evrydayhustling
These are good points but:

> JavaScript is not a great language to build high speed low resource
> inference engines.

This is a browser plugin analyzing html downloaded at human browsing speed. I
doubt performance is a primary requirement.

> Even if achieved in JavaScript, the model would be simple to pull out and
> reuse elsewhere, possibly to learn how to game it.

The binary model is obfuscated but still distributed. I expect that the added
difficulty of working with the binary model is small compared to the overall
challenge of gaming it.

> Good models will be built by good data scientists, whose favorite tools
> likely don't produce models that can be serialized for use in a JavaScript
> inference engine.

Models built in research-optimized environments can be translated after the
fact to match production needs. (It is getting easier with e.g. standard
interchange formats for neural models.) Kaspersky is a resource-rich org
working on security software -- exactly the folks who, if diligent and well
intentioned, should invest in such hardening.

~~~
tgtweak
Getting a sparkML serialized model to drop into a JavaScript interpreter is
not possible today from what I can see (and certainly wasn't the case 2 years
ago). There does seem to be some good progress in the field with tensorflow.js
and ml.js, but nothing I'd put in production with a few million users. In
native Scala the inference engine with a loaded model takes a few hundred MB
of memory. I'd imagine that transpiling it to JavaScript with emscripten or
similar would balloon that quite a bit.

I'd be really glad if there was a viable method to do this without murdering
the end user device.

------
blattimwind
"Let's unpack malware in the kernel by supporting 127389213 arcane archive
formats in kernel code" — Symantec

~~~
the8472
Microsoft's Defender was also guilty of doing various kinds of parsing in
kernel mode or at high privilege.

~~~
blattimwind
Few security software vendors have shown a particular aptitude for actually
designing secure software, let alone implementing secure software.

~~~
krylon
I find this extremely strange. I cannot even come up with a good metaphor to
explain to non-technical people how untrustworthy this makes most AV software
look to me.

I mean, we all make mistakes, but the kind of bugs that have been found in
various AV products are so obviously stupid one can easily get a concussion
from face-palming. If I were to roll my own crypto, there are all kinds of
subtle mistakes that I could, nay: _would_ , make that I might not even
understand if someone explained them to me very patiently. Like, redirecting
system calls from a "sandboxed" process, but then executing them as SYSTEM?
How can I trust a company that allows such dumb mistakes in their code to
build software that is supposed to make a system more secure? It is like
finding out that the surgeon who is supposed to operate on you lacks basic
knowledge of human anatomy.

~~~
blattimwind
> I find this extremely strange.

I don't. AV software is virtually impossible to vet for a customer; it is the
literal stone to keep tigers away (do you see any? It must therefore work). So
100 % of selling AV software is marketing, _not_ functionality.

"Independent tests", e.g. of how much slower the computer is made by AV
software, disfavour solid design: all else being equal, AV software where the
signature engine runs directly in kernel code intercepting FS calls will
always be faster than properly designed software where the signature engine
runs at lower privileges (requiring task switching) and the FS filter driver
has to delegate to it.

Tests where well-known malware is fed into engines and the red flag comes up
are also pretty meaningless; signature scanning _works_ , that's not the
problem of the product.

The problem of the product is that the only things a customer _can_ measure
disfavour good execution of it, _and_ that signature detection is
fundamentally inept at countering relevant attacks and therefore offers no
real protection. Heuristics (or ML or whatever) don't really work well, either.

------
sgc
I have always somewhat naively assumed it is best to never use a password
manager plugin but instead use a separate binary with less convenient
integration to reduce the possibility of browser exploits exposing my entire
password db. Is there any truth to that?

~~~
chowells
It's a tradeoff. You get that advantage, but then you open yourself up to
phishing attacks that browser plugins avoid by checking the domain. My ideal
would be a browser plugin that sends the domain, and nothing more, to a
standalone process.
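
As a rough sketch of the domain check described above: the function name and
the subdomain policy below are assumptions for illustration, and real
password managers typically consult the Public Suffix List instead of a plain
suffix comparison.

```python
from urllib.parse import urlsplit

def host_matches(page_url, saved_host):
    """Return True only for an HTTPS page whose host equals saved_host
    or is a subdomain of it (an assumed, simplified policy)."""
    parts = urlsplit(page_url)
    host = (parts.hostname or "").lower()
    saved = saved_host.lower()
    if parts.scheme != "https" or not host:
        return False
    # Compare whole labels so "example.com.evil.test" does not match.
    return host == saved or host.endswith("." + saved)
```

A human copying from a standalone app has to do this comparison by eye, which
is exactly where lookalike domains such as `examp1e.com` slip through.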

~~~
PurpleRamen
A bit more must still be done. The browser plugin must find the fields to fill
with the data. It must also allow adding new entries or changing existing
entries. Actually, finding the correct username field seems to be slightly
more complex on a badly designed page, because it has no special type like the
password field.
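
One common workaround, sketched here under the assumption that the username
field is the last text or email input before the first password input. The
class name is hypothetical, and Python's `html.parser` is used only to keep
the example self-contained; an extension would apply the same idea to the
live DOM inside the browser.

```python
from html.parser import HTMLParser

class LoginFieldFinder(HTMLParser):
    """Pair the first password input with the text input preceding it."""

    def __init__(self):
        super().__init__()
        self.last_text_input = None
        self.username = None
        self.password = None
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.found:
            return
        attributes = dict(attrs)
        # An input without a type attribute defaults to "text".
        kind = (attributes.get("type") or "text").lower()
        if kind == "password":
            self.password = attributes.get("name")
            self.username = self.last_text_input
            self.found = True
        elif kind in ("text", "email"):
            self.last_text_input = attributes.get("name")

finder = LoginFieldFinder()
finder.feed(
    '<form><input type="search" name="q"><input name="f1">'
    '<input type="password" name="f2"></form>'
)
```

The heuristic breaks as soon as a page puts another text field between the
two, or splits the login across several steps.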

Oh, do we know whether it's really just a simple password manager, and not
also some autofill manager, session restorer, or other security/comfort
snake oil?

------
mosselman
There is an air of arrogance to the article that isn’t explained. Why is the
implementation so weak and how should it have been done? Or rather: are
passwords at risk? Is there an actual possible attack? Is this explained
already and am I missing it?

~~~
palant
As mentioned in the article, I'm bad at reverse engineering. I merely point
out the massive attack surface, and I'm pretty certain that this can be
abused to infect users with malware. But that's up to other people to prove;
I'm not going to sink more time into something that nobody is going to pay
for.

How it should have been done: stay in the JavaScript sandbox for all the
logic, rely on the browser's existing functionality, use an absolutely minimal
communication interface to the application.
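
A sketch of what such a minimal interface could look like on the application
side, assuming the browsers' native messaging framing (a 32-bit length in
native byte order followed by UTF-8 JSON). The single-key message schema and
the size cap are assumptions for illustration, not Kaspersky's protocol.

```python
import io
import json
import struct

ALLOWED_KEYS = {"origin"}   # hypothetical schema: the page origin only
MAX_MESSAGE_BYTES = 4096    # hypothetical cap on message size

def encode_message(message):
    """Frame a message as a native-endian 32-bit length plus JSON."""
    body = json.dumps(message).encode("utf-8")
    return struct.pack("=I", len(body)) + body

def read_message(stream):
    """Read one framed message and reject anything outside the schema."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated header")
    (length,) = struct.unpack("=I", header)
    if length > MAX_MESSAGE_BYTES:
        raise ValueError("message too large")
    body = stream.read(length)
    if len(body) != length:
        raise ValueError("truncated body")
    message = json.loads(body.decode("utf-8"))
    if not isinstance(message, dict) or set(message) != ALLOWED_KEYS:
        raise ValueError("unexpected message shape")
    if not isinstance(message["origin"], str):
        raise ValueError("origin must be a string")
    return message

# Round trip through an in-memory stream standing in for stdin.
demo = read_message(io.BytesIO(encode_message({"origin": "https://example.com"})))
```

The point of the strict schema is that page-controlled content such as
serialized HTML never reaches the native code at all.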

------
viach
When one submits a similar bug on h1, he gets a standard answer - please
demonstrate a practical attack vector. Just saying.

~~~
palant
Yes, bug bounty programs usually dislike being told about inherent
architectural issues. So far I've only come across one where all security-
relevant feedback is welcome: Mozilla. Which is exactly why I'm not submitting
this via the bug bounty program. This way it's far more likely to reach
somebody who is responsible for the code.

------
nytesky
Honest question: how secure would an encrypted excel spreadsheet in Dropbox be
as password manager?

~~~
byproxy
In essence that's what I do, except with a Keepass file in OneDrive. It's been
working pretty well for me the last few years. Not sure how insecure it is,
though. I imagine it'd take someone cracking my OneDrive password to gain
access to the Keepass file, which they'd subsequently have to crack. Assuming
the passwords are of high enough entropy (mine are, I hope), you'd probably be
pretty safe.
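
For a rough sense of "high enough", the entropy of a uniformly random
password is its length times log2 of the alphabet size. The helper below is
only a sketch, and the estimate does not hold for passwords a human made up.

```python
import math

def random_password_entropy_bits(length, alphabet_size):
    """Entropy in bits of a password drawn uniformly at random."""
    return length * math.log2(alphabet_size)

# 20 characters from a-z, A-Z, 0-9 (62 symbols).
alphanumeric_bits = random_password_entropy_bits(20, 62)
# 4 words from a 7776-word list.
four_word_bits = random_password_entropy_bits(4, 7776)
```

The first comes out at about 119 bits and the second at about 52 bits.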

~~~
h1d
It's funny that they tell people to use a different password for every site,
when the presence of a master password makes the whole thing far more
vulnerable, especially since people may pick a weak one because they don't
want to forget it.

~~~
martamoreno2
If you are using a weak password, that's always a problem...

But a master password for offline storage is a completely different story,
because even if the world knew your master password, they still couldn't do
anything with it, because they need to get to your Dropbox file first.

If you use similar passwords everywhere, any site can read your password and
use it to login somewhere else... You are supposed to use long random
passwords for websites because you have to assume that every single password
gets compromised and the website you are on tries to hack all other accounts.

------
farazzz
A neuronal network refers to a network of real neurons. I think the article
meant to say neural network

https://en.m.wikipedia.org/wiki/Cultured_neuronal_network

------
DubiousPusher
Is Dashlane guilty of the same? I'm curious as they require you to install a
native desktop app before the plug-in can work, which presumably is farming
out most work to binaries installed with the app.

------
blfr
LastPass also pushes you to install some binary component for their extension.

~~~
Fabricio20
LastPass does NOT require you to install a binary component. That's an
option you have for accessing the vault on your desktop.

1Password, on the other hand, strongly encourages people to download the
binary because their standard extension connects to it.

~~~
danieldk
They also have an extension (which is the default version on
Linux/Chromebooks) that does not require the 1Password application. I have
used it since the betas in Linux and it works nicely.

Also, this thread seems to imply that having a binary component is always bad.
It can also increase security. If the extension is rate-limited in retrieving
passwords from a vault, then a compromised browser can only extract a limited
number of passwords, whereas with a browser-native extension the attacker
could extract all passwords.
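
A minimal sketch of that idea, assuming a sliding-window limit enforced by
the native process. The class name, the limits and the in-memory dict are
all hypothetical; a real vault would be encrypted and would likely also ask
for user confirmation.

```python
import time
from collections import deque

class RateLimitedVault:
    """Hand out at most max_requests secrets per window_seconds."""

    def __init__(self, secrets, max_requests=5, window_seconds=60.0,
                 clock=time.monotonic):
        self._secrets = dict(secrets)
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock      # injectable so the limit can be tested
        self._recent = deque()   # timestamps of recent requests

    def get(self, site):
        now = self._clock()
        # Forget requests that have fallen out of the window.
        while self._recent and now - self._recent[0] >= self._window:
            self._recent.popleft()
        if len(self._recent) >= self._max:
            raise PermissionError("rate limit exceeded")
        self._recent.append(now)
        return self._secrets[site]
```

With the limit living outside the browser, a compromised extension can ask as
often as it likes but only gets a handful of answers per minute.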

------
bitdeep
Man... nowadays, it's impossible for a modern web app to not use JSON
parsing. I can't see this as a problem for a password manager. If you can't
trust the browser, you are dead. OK, Kaspersky wants to sell their fish, but
we need good arguments.

~~~
wtracy
Most web apps do this in a JavaScript context inside the browser's security
sandbox, not as native C++ code with unknown privileges.

------
make3
Installing a security suite that is likely subject to the control of the
Russian government has always seemed like a funny proposition to me to begin
with.

~~~
gdy
The Russian government is less likely to have any interest in you than your
own government is.

~~~
make3
I'm Canadian so no

~~~
gdy
Then the Russian government's interest in you is probably zero.

