

Ask HN: open source Posterous-style email validation?  - davi

Does anyone know of an open source library for validating email headers a la Posterous? I think their model strikes a great balance between usability and security, and wonder if there's anything out there that would facilitate building a similar feature into a homebrew web app.
======
patio11
pyspf (Google it) will do SPF checking for you. If you'd rather do it
yourself, SPF is really, really simple to validate in your language of choice.
However, not everybody uses SPF.
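
To give a sense of how small the core check is, here is a minimal sketch that
evaluates an SPF record string against a connecting IP address. It assumes the
TXT record has already been fetched from DNS and only understands the ip4, ip6
and all mechanisms; everything else in RFC 7208 (a, mx, include, redirect,
macros) is skipped. The function name `check_spf_record` is made up for this
example and is not part of pyspf.

```python
import ipaddress

# Qualifier prefixes and the SPF results they map to.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf_record(record: str, client_ip: str) -> str:
    """Evaluate a small subset of SPF: ip4, ip6 and all mechanisms only."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return QUALIFIERS[qualifier]
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip in network:  # False when IP versions differ
                return QUALIFIERS[qualifier]
        # Assumption: mechanisms that need DNS lookups are ignored here.
    return "neutral"  # no mechanism matched


print(check_spf_record("v=spf1 ip4:192.0.2.0/24 -all", "192.0.2.10"))
```

A real deployment would still want a full implementation such as pyspf, since
most published records rely on include and mx.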

As for "validating" the rest of the email headers, well... I want to strike a
balance between "sure you can do that, good luck!" and "the entire anti-spam
community has tried this and it is basically impossible, which is why we rely
heavily on IP reputation and Bayes-based approaches which do not treat the
contents of the headers as semantically meaningful, since they are in the
hands of the enemy".

~~~
davi
Thanks very much, that's helpful. Maybe good enough for a small, experimental
project (i.e. one step beyond 'nothing'). An open source effort to take a
crack at the larger scope you lay out would be a good thing.

------
frognibble
Here's a sketch for checking the validity of the sender. It does not handle
all cases and I am sure it has some holes. I am interested in feedback on
this. Are there other things to check? Are these checks "safe" for some
definition of safe?

Step 1: If DKIM header present, then use result of DKIM validation.

Step 2: If sending domain has SPF record, then use result of SPF validation.

Step 3: If message passes SPF check using a conservatively guessed SPF record,
then treat the message as valid.

Step 4: If message came from same IP address as other messages for user and
some headers match headers from previous messages (fuzzy match on message
id?), then treat the message as valid.

Step 5: What next? Messages will make it past the previous steps.
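
The steps above can be sketched as a simple fallback chain. This is only an
illustration: it assumes the DKIM and SPF checks have already been run
elsewhere and are passed in as result strings (None meaning "not applicable"),
and it assumes that any result other than "pass" counts as invalid in steps 1
and 2. The function name and return values are hypothetical.

```python
from typing import Optional


def classify_sender(dkim_result: Optional[str],
                    spf_result: Optional[str],
                    guessed_spf_result: Optional[str],
                    matches_history: bool) -> str:
    """Apply the checks in order and stop at the first one that decides."""
    # Step 1: a DKIM header is present, so its verification result decides.
    if dkim_result is not None:
        return "valid" if dkim_result == "pass" else "invalid"
    # Step 2: the sending domain publishes SPF, so that result decides.
    if spf_result is not None:
        return "valid" if spf_result == "pass" else "invalid"
    # Step 3: no published record; try a conservatively guessed one.
    if guessed_spf_result == "pass":
        return "valid"
    # Step 4: same IP and similar headers as earlier mail from this user.
    if matches_history:
        return "valid"
    # Step 5: nothing matched; the caller has to decide what to do.
    return "unverified"


print(classify_sender(None, None, "pass", False))
```

Returning a distinct "unverified" value keeps step 5 open: the application can
hold the post, ask for confirmation, or reject it.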

~~~
JoachimSchipper
Well, each of these has problems.

DKIM, which is not widely deployed, typically protects the message body, the
From: and To: headers, and other headers. If this is actually used, you only have to
worry about replayed messages (a hacker sends 1,000,000 copies of a legitimate
blog post), which is doable. Unfortunately, you can't do anything if this
header is not present - even if I have a Yahoo/GMail/... address, which would
otherwise be DKIM'ed, I may have sent this message via another mail server.

SPF, which checks that the server sending the mail is authorized to do so,
would work reasonably well, or at least hand off the issue to the
administrator of the sending mail server. Unfortunately, there are quite a few
domains without SPF or which SOFTFAIL all; worse, prank-loving coworkers may
have access to the same mailserver.

"Same IP address" falls afoul of the pranking coworkers again, and is a very
weak heuristic anyway.

There are at least two solutions that work. The actually secure one is
requiring the user to PGP- or S/MIME-sign all mail; the other one is to send
back a challenge. Mailing list managers typically do this - send a message
with "Subject: 23dsaf2: please confirm post" and accept any response that
contains 23dsaf2 in the subject.
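
A minimal sketch of that challenge flow, assuming an in-memory store and
leaving out the actual sending and receiving of mail. The names `pending`,
`issue_challenge` and `confirm` are invented for this example.

```python
import secrets

# Assumption: held messages live in memory, keyed by their challenge token.
pending = {}


def issue_challenge(message: str) -> str:
    """Hold the message and return the subject line of the challenge mail."""
    token = secrets.token_hex(4)  # 8 random hex characters
    pending[token] = message
    return f"Subject: {token}: please confirm post"


def confirm(reply_subject: str):
    """Release the held message if the reply subject contains its token."""
    for token in list(pending):
        if token in reply_subject:
            return pending.pop(token)  # each token works only once
    return None


subject = issue_challenge("My first post")
token = subject.split()[1].rstrip(":")
print(confirm(f"Re: {token}: please confirm post"))
```

The token only proves that whoever replied can read mail sent to the claimed
address, which is the same guarantee mailing list managers settle for.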

~~~
frognibble
The context of this thread is creating a Posterous-style email validation.
Posterous does not use either of the two solutions that you suggest.

It's OK that DKIM is not widely deployed because the logic falls back to other
mechanisms when the DKIM header is not present. DKIM is deployed on GMail and
Yahoo Mail, so it is worth doing. Replay attacks are easy to defeat by not
posting duplicate content. It's probably a good idea to do dup detection to
handle the case where the user accidentally sends the message twice.
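
One way to sketch the duplicate check: hash the sender together with a
normalized copy of the body and refuse anything already seen. The
normalization rule (collapse whitespace, lowercase) and the in-memory `seen`
set are assumptions for illustration only.

```python
import hashlib

# Assumption: digests of already-posted messages are kept in memory.
seen = set()


def is_duplicate(sender: str, body: str) -> bool:
    """Return True if this sender has already posted the same content."""
    normalized = " ".join(body.split()).lower()
    key = f"{sender.lower()}\n{normalized}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    if digest in seen:
        return True
    seen.add(digest)
    return False


print(is_duplicate("davi@example.org", "Hello  world"))
print(is_duplicate("davi@example.org", "hello world"))
```

This blocks a verbatim replay of a signed message, though a replayed message
with headers outside the DKIM signature changed would still hash the same and
be caught, while an attacker cannot alter the signed body without breaking the
signature.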

~~~
JoachimSchipper
Hmm, yes, I was just pointing out that there are other solutions.

Yes, I agree that DKIM+duplicate detection is fairly good; you just can't rely
on it being present, and if it isn't you have to fall back to much less
reliable stuff.

------
japherwocky
Zed Shaw's Lamson project (<http://lamsonproject.com>) has some solid code for
handling most of the messiest parts of dealing with email - bounces, unicode,
etc.

It's structured in a way that makes it very easy to snip out the parts you
want to use without necessarily using all the rest.

------
phreeza
Wasn't it shown yesterday that Posterous has basically no security at all?

<http://news.ycombinator.com/item?id=1441997>

~~~
convel
_This security hole is now fixed. We had a specific problem with the way we
dealt with SPF records. Dustin didn't set any up, and there was a specific way
that Robin Duckett's email server responded that caused us to flag it as a
false negative for spoofing._

<http://news.ycombinator.com/item?id=1443143>

~~~
MichaelApproved
What keeps someone else behind the same SMTP server from spoofing an email?

~~~
_delirium
A lot of SMTP servers implementing SMTP AUTH will add an annotation
"(Authenticated sender: localusername)" or similar, which will let you
distinguish between different users of the same mail server, even when they
spoof the From: header. Not sure if that's the solution Posterous is using, or
how widespread it is, though.
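
A rough sketch of pulling that annotation out of a message with Python's
standard email package. The exact wording of the annotation varies by mail
server, so the regular expression is an assumption, and only Received headers
added by servers you trust mean anything, since the sender can forge the rest.

```python
import re
from email import message_from_string

# Assumption: the annotation looks like "(Authenticated sender: name)".
AUTH_RE = re.compile(r"\(Authenticated sender:\s*([^)\s]+)\)", re.IGNORECASE)


def authenticated_senders(raw_message: str) -> list:
    """Collect authenticated-sender annotations from all Received headers."""
    msg = message_from_string(raw_message)
    found = []
    for header in msg.get_all("Received", []):
        match = AUTH_RE.search(header)
        if match:
            found.append(match.group(1))
    return found


raw = (
    "Received: from laptop (unknown [192.0.2.7])\n"
    "\t(Authenticated sender: alice)\n"
    "\tby mail.example.org (Postfix) with ESMTPSA id 1234\n"
    "From: bob@example.org\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
print(authenticated_senders(raw))
```

In this made-up message the From: header claims bob while the server recorded
alice, which is exactly the mismatch the annotation would expose.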

------
karimyaghmour
I'm still wondering what Posterous plans to do when they reach enough of a
critical mass that spammers will actively try to impersonate existing
accounts. Generalized, non-sender-server-enforced sender authentication does
not exist. That's why SPF and DKIM came along ... I'm sure they've had to pore
over this. Anyone have a link on design/discussion?

~~~
pyre
They could always go with GPG/PGP.

------
MichaelApproved
Validating with headers is like securing a webpage by keeping the URL a
secret, or by checking the browser user agent and IP address. It gives a false
sense of security and is very vulnerable to cracks.

If you're going to validate with headers then feel free to call it usable but
don't call it security.

------
kljensen
Is there any degree of "free" validation if you route all the emails through
another service that probably does some of this? E.g. a Gmail account that
forwards all incoming mail on to your servers.

------
quadhome
Am I missing something obvious that argues against tracing the headers back to
the server immediately before your own? If the IP of that machine differs in
future emails, ask for confirmation?
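
As a sketch of that idea: remember which relay IPs each sender has used and ask
for confirmation when a new one shows up. It assumes the relay IP comes from
your own server's view of the connection (not from a header the sender could
write), and that the first message from a sender is trusted. All names here
are hypothetical.

```python
# Assumption: known relay IPs per sender address, kept in memory.
known_relays = {}


def check_relay(sender: str, relay_ip: str) -> str:
    """Return "accept" for a familiar relay, "confirm" for a new one."""
    key = sender.lower()
    if key not in known_relays:
        known_relays[key] = {relay_ip}  # trust on first use
        return "accept"
    if relay_ip in known_relays[key]:
        return "accept"
    return "confirm"


def record_confirmation(sender: str, relay_ip: str) -> None:
    """Remember a relay once the user has confirmed a message from it."""
    known_relays.setdefault(sender.lower(), set()).add(relay_ip)


print(check_relay("quad@example.org", "192.0.2.1"))
print(check_relay("quad@example.org", "198.51.100.9"))
```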

~~~
MichaelApproved
Since you can't validate beyond the SMTP server you're going to have trouble
when two or more people are behind the same server.

~~~
quadhome
Presumably it's their server's issue at that point.

If a server allows multiple people to send from the same address, then you
can't validate further than that.

