
DOMPurify, Security in the DOM, and Why We Really Need Both [pdf] - jorangreef
https://www.usenix.org/sites/default/files/conference/protected-files/enigma16_slides_heiderich.pdf
======
spankalee
Trusted Types are the solution for this:
[https://developers.google.com/web/updates/2019/02/trusted-types](https://developers.google.com/web/updates/2019/02/trusted-types)

With Trusted Types on, unsafe strings are disallowed directly at the unsafe
sink level, i.e. innerHTML no longer accepts strings, only instances of
TrustedHTML. TrustedHTML can only be created by a Trusted Types policy, and by
isolating policies from user-generated and other untrusted content you
guarantee that you can't have XSS holes.

* Note for the curious: This is how we're locking down lit-html so that it's completely safe from XSS. We have a simple policy that's only accessible to the template strings processor, so that the only strings trusted in an application are the template literals written by developers. All other strings will not be allowed at unsafe sinks. We don't even trust the other internals of lit-html. See [https://github.com/Polymer/lit-html/blob/ceed9edc0aecdf82588...](https://github.com/Polymer/lit-html/blob/ceed9edc0aecdf82588d58840f08082b45f1fb86/src/lib/template-result.ts#L31)
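The pattern can be sketched roughly like this (a minimal illustration, not lit-html's actual implementation; the fallback shim and the policy name `templates` are assumptions so the snippet also runs outside a browser, where `trustedTypes` doesn't exist):

```javascript
// Minimal sketch of the Trusted Types pattern. In a browser with enforcement
// on (CSP: require-trusted-types-for 'script'), assigning a raw string to
// innerHTML throws; only TrustedHTML minted by a registered policy is accepted.
// The fallback shim below exists purely so this sketch runs outside a browser.
const tt = globalThis.trustedTypes ?? {
  createPolicy: (name, rules) => ({
    name,
    createHTML: (s) => ({ toString: () => rules.createHTML(s) }),
  }),
};

// One audited policy, kept private to the templating layer: the only place
// in the application that can mint TrustedHTML.
const policy = tt.createPolicy('templates', {
  createHTML: (s) => s, // e.g. only developer-written template literals reach here
});

const html = policy.createHTML('<b>hello</b>');
// `html` is a TrustedHTML-like value, not a plain string, so it passes the sink.
console.log(String(html));
```

With enforcement on, any `el.innerHTML = someString` elsewhere in the app fails at runtime, which is what reduces the audit surface to the policy bodies.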

~~~
rictic
Trusted Types are good for most cases, but the case from the PDF is one where
you're given a blob of untrusted HTML that you do still want to render (an
HTML formatted email).

Trusted Types will prevent a dependency or careless developer from setting
innerHTML without going through a policy you've evaluated and decided to
trust, but it doesn't have an HTML sanitizer, so for those cases a library
like DOMPurify is still necessary.
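The two compose naturally: run DOMPurify inside a Trusted Types policy so untrusted blobs are sanitized on their way to the sink. A rough sketch (`DOMPurify.sanitize` is the library's real entry point; the inline fallback is a crude stand-in, NOT a real sanitizer, included only so the sketch runs without the library, and the policy name `email-html` is made up):

```javascript
// Sketch: route untrusted HTML (e.g. an HTML email) through DOMPurify inside
// a Trusted Types policy, so sanitization happens on the way to the sink.
const purifier = globalThis.DOMPurify ?? {
  // Stand-in ONLY so this runs without the library; never sanitize with regex.
  sanitize: (html) => html.replace(/<script[\s\S]*?<\/script\s*>/gi, ''),
};

const tt = globalThis.trustedTypes ?? {
  createPolicy: (_name, rules) => rules, // shim for non-browser environments
};

const emailPolicy = tt.createPolicy('email-html', {
  createHTML: (dirty) => purifier.sanitize(dirty),
});

// In a page: messageEl.innerHTML = emailPolicy.createHTML(untrustedEmailBody);
const clean = emailPolicy.createHTML('<p>hi</p><script>track()</script>');
```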

------
zawerf
I am always irrationally(?) scared of using these sanitizers despite their
successful history. As soon as new html/js/css syntax/features are introduced,
won't your security model need to be reevaluated? Which seems like a lost
cause at the rate new capabilities are introduced to the web. E.g., when CSS
Shaders lands, you might be able to execute arbitrary gpu code with just css
(hypothetically speaking, I don't actually know how it will work. I am sure
it'll be sandboxed pretty well. But the problem remains that there are too
many new possibilities to keep up with!).

~~~
nullandvoid
Isn't that like saying there's no point in using an antivirus because viruses
are always evolving?

You're still catching entire classes of existing issues.

~~~
__s
Bad example. Antivirus software is a scam. It just adds another attack vector:
when the antivirus software has a bug in its file parsing, you can be
compromised just by downloading a malicious file.

Windows Defender is sufficient & bundled with Windows

~~~
nullandvoid
I never said anything about buying one; you just assumed that. I also just use
Windows Defender, part of which is an antivirus.

------
ShaneCurran
In modern browsers that support the Shadow DOM[1] standard, this is a somewhat
solved problem with one caveat: it wasn't built for this use case.

Architecturally, however, it does the job, but the challenge is integration
with older browsers. Polyfills for Shadow DOM inherently break the security
features it provides.

Better cross-browser Shadow DOM support would be a step in the right direction
to making things like DOMPurify safer, but unfortunately it seems like we're a
while away from that according to Can I Use[2].

[1]: [https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM](https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_shadow_DOM)

[2]: [https://caniuse.com/#feat=shadowdomv1](https://caniuse.com/#feat=shadowdomv1)

~~~
nerdkid93
I don't think Shadow DOM can be used for security purposes...
[https://blog.revillweb.com/open-vs-closed-shadow-dom-9f3d7427d1af](https://blog.revillweb.com/open-vs-closed-shadow-dom-9f3d7427d1af)
makes it seem trivial to access closed shadow roots via side channels like
prototype manipulation.
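The side channel is roughly this: patch `attachShadow` before the component code runs, and every "closed" root passes through attacker code on its way back. A stand-in element class is used below (an assumption, so the sketch runs outside a browser); in a real page the attacker would patch `Element.prototype.attachShadow` the same way:

```javascript
// StandInElement mimics the relevant bit of the DOM: attachShadow returns the
// root, and a closed root is not exposed via a public property afterwards.
class StandInElement {
  #closedRoot = null;
  attachShadow({ mode }) {
    const root = { mode, host: this };
    if (mode === 'closed') this.#closedRoot = root; // hidden from callers
    return root;
  }
}

// Attacker code that runs first: wrap the prototype method and keep a
// reference to every root it ever returns, closed or not.
const captured = [];
const original = StandInElement.prototype.attachShadow;
StandInElement.prototype.attachShadow = function (init) {
  const root = original.call(this, init); // behave normally...
  captured.push(root);                    // ...but leak the reference
  return root;
};

// The component believes its closed root is private; the attacker holds it.
new StandInElement().attachShadow({ mode: 'closed' });
```

Which is why `mode: 'closed'` is an encapsulation boundary, not a security one, unless the page can guarantee no untrusted script ran first.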

------
floatboth
Wait, how exactly does iframe sandbox not solve everything? Emails should
definitely be shown in them; even with client-side decryption, you can create
an iframe from a data: URI. iframe sandbox is the strongest sandbox possible.
Unique origin, no JS execution…

~~~
jorangreef
I used to think the same, except iframe sandboxes:

1. Don't resize dynamically to fit the email content, not unless you enable
unique-origin JS execution and do message passing to the parent window. But if
you do that then you open the door to crypto-mining, tracking, Spectre
variants, and browser zero-days.

2. Don't play well with keyboard shortcuts since they steal keyboard events
from the parent window when focused. Proxying keyboard events to the parent is
even more dangerous since an attacker could then spoof keyboard events to
control the parent.

3. Don't let you whitelist allowed HTML tags, attributes and CSS properties,
which means there's no way to block email tracking.

And that's just for viewing email content. How would you sanitize and
whitelist unsafe email content when replying/forwarding?

DOMPurify combined with CSP is safer and stricter. And if you wanted to,
there's nothing to prevent you from putting the result in a sandboxed iframe
once sanitized anyway. But it needs to be sanitized.
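For the CSP half, an illustrative policy for an email-viewing context might look like this (an example, not a prescribed configuration): it blocks all network fetches, which is what stops tracking pixels from phoning home, while still allowing inline styles and images embedded as data URIs.

```http
Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; img-src data:
```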

------
GlitchMr
What does this have to do with E2E? I don't see how filtering HTML is harder
to do - even if somehow server-side algorithms are better (which this
presentation seems to imply), cannot the same algorithm be used client-side?

In a way, the situation is better client-side, because when running code on
the client's side, you can check how exactly the browser parses the HTML code.

~~~
bugmen0t
It's on page 18. If you have end-to-end encryption you can't sanitize in the
client.

I mean, you're really just summarizing the presentation. It should be an API
that's in the browser. It isn't. So people need to use a library. That's OK.
But not great.

~~~
jorangreef
"If you have end-to-end encryption you can't sanitize in the client."

I think you meant to type that you can't sanitize in the "server"? Because
with end-to-end encryption the server has no access to the plaintext to be
sanitized. Only the client can sanitize, only the client has the plaintext.

~~~
bugmen0t
oops yes.

------
will_hoskings
This looks like a good idea, but what happens when the user disables this by
inspecting the code (or something)?

------
jgoldshlag
DOMPurify is a great library, BTW. Super small, super safe by default, no
security holes found in months/years

~~~
bugmen0t
They just did a security update last week.

------
austincheney
I am hesitant of articles that bash the DOM for only stylistic concerns.

~~~
jorangreef
I don't see any evidence of alleged DOM-bashing in Mario's slides?

In fact, rather than bash the DOM, Mario wants the DOM to subsume his own
DOMPurify project, rather than have users trust him as a third-party module
developer. That paints the DOM in a favorable light, if you ask me.

~~~
austincheney
It's slide 27.

~~~
jorangreef
That's not referring to stylistic concerns, and it's not bashing the DOM per
se.

The context of "The DOM is a mess!" on slide 27 is specifically in terms of
security, namely "DOM Clobbering" where an attacker can rewrite DOM methods
from underneath you, and impedance mismatch owing to parser differences and
bugs ("HTML elements implemented in completely different ways, different
attribute handling" in the context of defending against XSS).

It's an honest assessment that's more a statement of fact than anything
intended to be hurtful. It's not even a harsh statement of truth at that. I
find it hard to believe that Chrome or Firefox engineers would find that
offensive. I think they would well agree.

DOMPurify is really fantastic security work. It would make for a brilliant
contribution to the DOM.

~~~
austincheney
> namely "DOM Clobbering" where an attacker can rewrite DOM methods from
> underneath you

I don't see that as a valid security concern in this case. Yes, it will break
your code or do unintended things. But for this to happen an attacker must
have access to the page in your user's security context, which means some
other preventable security violation has already transpired. This applies
equally to any application/language. Even if you could freeze the DOM so that
nothing can be assigned to object properties and thereby ward off DOM
clobbering, there is still a malicious user in your security context reading
all your secure and private details. If you prevent the malicious agent from
gaining access, this security concern with the DOM is eliminated.

In other words, whether or not DOM clobbering occurs, a prerequisite security
violation is necessary, and hardening the DOM won't provide the necessary
solution.

Aside from malicious third parties intentionally overwriting event handler
assignments, DOM clobbering really comes down to poor code management, which
is the real security problem here. That makes this a stylistic concern.
Additional layers of concern aren't going to make people instantly less lazy.
There are better ways to solve this.

> HTML elements implemented in completely different ways

HTML is not the DOM. These are separate and unrelated technologies that are
maintained in very different specifications. This separation is not an
accident. It is by design. I know the point about HTML and the DOM being far
separated is contentious.

~~~
jorangreef
"I don't see [DOM Clobbering] as a valid security concern in this case"

It is for sure a valid security concern when doing client-side XSS filtering,
which is what the presentation is about. And no, DOM Clobbering does not
require an attacker to "have access to the page in your user's security
context". Fastmail have an introduction here:
[https://fastmail.blog/2015/12/20/sanitising-html-the-dom-clobbering-issue/](https://fastmail.blog/2015/12/20/sanitising-html-the-dom-clobbering-issue/).
Simply put, there's no way to do safe client-side XSS filtering without
addressing DOM Clobbering as a valid security concern.

"hardening the DOM won't provide the necessary solution."

And the author is not suggesting or waiting for that. On the contrary, the
premise is that XSS sanitizers need to be client-side exactly because the DOM
is not hardened and has so many different implementations (even across browser
versions). It's counter-intuitive I know, but server-side XSS sanitizers
really can't address cross-browser parser differences safely. So again, it's
not a question of "stylistic concerns" or "code management" but of doing
secure XSS filtering wherever it is best done.

"There are better ways to solve for this."

And if you go on to the next slide, 28, the point is that despite the
difficulties, this has been solved in DOMPurify, which should be added to the
DOM so that developers can finally have a first-class client-side XSS
sanitizer, without having to trust DOMPurify as third-party code.

There are not many people who know more about client-side XSS filtering than
Mario Heiderich. And I know of no better client-side solution than DOMPurify.

~~~
austincheney
How does the security aspect of DOM clobbering occur without injecting
malicious code into a page?

~~~
jorangreef
Again, see Fastmail's introduction:
[https://fastmail.blog/2015/12/20/sanitising-html-the-dom-clobbering-issue/](https://fastmail.blog/2015/12/20/sanitising-html-the-dom-clobbering-issue/)

No code injection is required. DOM Clobbering simply presents an ambiguous
view of the content being sanitized.

~~~
austincheney
The article makes some false assumptions. My first job out of college was
writing HTML in email. HTML embedded in email and presented in webmail was the
toughest. That was learning CSS through the school of hard knocks,
particularly when IE7 was released with a different box model.

Again, the problem here is injection, specifically HTTP injection. Email
doesn't have an injection problem because it has a more robust protocol: RFC
2821, 2822 and their descendants. To make emails pretty somebody had the
really bad idea of embedding HTML in email messaging. HTML is reliant upon the
simplified architecture of the HTTP protocol. When you want that pretty
content in email you make an HTTP request and some server issues a response.

If they simply took the HTML out of email, this security problem would be
instantly solved for email. So this isn't an email problem. It isn't even an
HTML problem. It's a problem of unregulated HTTP requests.

> HTML is essentially just a serialisation format for the Document Object
> Model (DOM)

They are separate things.

I can speak to all of this with confidence. I passed the Security +, CASP, and
CISSP exams on the first try just from reading a book. I did security for the
military for 10 years, have been developing web technologies for 20 years, and
have been writing JavaScript/TypeScript for more than a decade.

The real problem is that lazy developers are punishing their users under
pressure from business marketing leaders. There are two simple solutions to
this problem:

1. Don't do stupid things that punish your users.

2. Create a web standard ACL that limits all HTTP traffic to/from a browser.

These are both sane and simple solutions. Nobody wants them because bad
developers don't want to own the liability for implementing somebody else's
(probably a marketing executive's) bad decisions. Also, an ACL standard in the
browser would kill the web media business.

