
Duo finds SAML vulnerabilities affecting multiple implementations - lvh
https://duo.com/blog/duo-finds-saml-vulnerabilities-affecting-multiple-implementations
======
lvh
This is a great example of a hairy canonicalization bug!

SAML has a process called canonicalization. Why? There are a bunch of XML
documents that "mean the same thing", especially if you extend the meaning of
"mean the same thing" to "how parsing libraries expose the tree and how
consuming applications use it" instead of just the narrow definition of
"exactly what XML says it means". But signature algorithms don't sign semantic
XML trees: they sign bytes. So it's important that for an XML document you
have exactly 1 byte representation. (Well, I say that, and I don't really
believe it is important for SAML. But W3C believed that, and that's why XML
canonicalization is in the spec and we get to deal with that now so :shrug:)

So, let's say I (legitimately!) own attacker.com, so I (legitimately!) own
victim.com.attacker.com, and I (legitimately!) get someone to sign
me@victim.com.attacker.com. All is well: the signature is only valid for that
boring domain.

Here's the problem: the canonicalization algorithm and the parsing algorithm
disagree on how to deal with a particular case. Specifically: the parsing
algorithm doesn't throw away comments, and the canonicalization algorithm
does. When you parse:

    
    
        me@victim.com<!-- what -->.attacker.com
    

The canonicalization algorithm sees "me@victim.com.attacker.com" because it
stripped comments, and the parsing algorithm sees some text (lvh@lvh.com)
followed by a comment followed by ".attacker.com". If you tell your parser to
give you the first bit of text, you end up with "me@victim.com". The attacker
adds the comment, but because the signature validation uses the a
canonicalization method that strips the comment first, it gives the SAML token
the all-clear: the comment does not affect the validity of the signature, but
it does affect the final parse tree.

One of the (other) ways you can prevent this (which I heartily recommend to
clients who can afford to do it) is "cryptographic audience restrictions".
Traditional audience restrictions exist in a variety of standards including
SAML and JWT and basically just say "this token is for xyzzy.com". This isn't
great because tons of applications just ignore those. The SAML messages in
question in this bug don't even contain them. They also wouldn't have helped
much here, because the attacker potentially produces a valid SAML document for
the right RP anyway.

Cryptographic audience validation is about having a key pair per (IdP, RP)
pair, making it mandatory! If you as the victim.com IdP have a unique key
between you and the RP, a so-forged SAML message wouldn't work: not because
the message is wrong (it isn't), but because it'll be signed with the wrong
key. Even if you can get the IdP to sign something for
me@victim.com.attacker.com, and even if the RP's parser would parse the forged
message as being for me@victim.com, it wouldn't matter because the signature
wouldn't validate because it's signed with attacker.com. That's the case even
if you do the sorta-skeevy thing where you get the parser to extract the
alleged principal the message is for (in order to select the key to validate
with).

(If you go through my comments you'll find recent examples of me arguing with
people that they probably don't want public key crypto for these sorts of
tokens anyway; the counterargument is "but it simplifies key distribution",
and my counterargument is: "it really doesn't because if you can you want a
key-per-pair, not a key-per-principal".)

~~~
ak217
This is actually a violation of the "see what is signed" principle.

When verifying an XML signature, you need to only look at the data that was
signed, and disregard everything else. The XML Signature standard defines a
number of "transforms", one of which is canonicalization, which may or may not
strip comments. Only data that has been transformed is signed. This means an
application relying on signature verification should never look at
untransformed data! Comments are not the only attack source - the XML
Signature standard allows "custom transforms" which can do crazy stuff like
evaluation of xpath-based filters.

The "see what is signed" principle is simple and has little to do with SAML.
Canonicalized data that passed signature verification is the only data that
the relying application should look at. The signature verification library
should not allow access to anything else.

If I understand everything correctly, my implementation
[https://github.com/XML-Security/signxml](https://github.com/XML-
Security/signxml) is not vulnerable because it obeys the rule above.

