
MDN Database Disclosure - diegocr
https://blog.mozilla.org/security/2014/08/01/mdn-database-disclosure/
======
pllbnk
I have been wondering who leaked my address after I started getting the
"E.N.L.A.R.G.E...Y.O.U.R....." spam exactly about a month ago.

Initially I thought that it might have been my fault for entering the email
address where I shouldn't have. I am disappointed that such processes are even
architecturally possible at Mozilla where internal data is exposed externally.

Also, this has raised a question. Almost everybody knows that passwords must
be hashed and salted. But I haven't seen encrypted email addresses anywhere.
Are there any strongly negative consequences to encrypting sensitive personal
data in databases?

~~~
conradk
I don't think encrypting an email address would create any issues. In fact, if
I were to provide a service, I'd save a hash of the email address for easy
authentication (i.e. hash the email given during a login and compare with the
hash you have) and one encrypted version of the email address so I can use it
if needed (to inform the user or whatever).
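
A minimal sketch of the lookup-hash half of that idea, using only the Python standard library. The function names, the key, and the normalization rule (trim and lowercase) are all assumptions for illustration. A keyed hash (HMAC) is used instead of a bare hash because email addresses are guessable, so an unkeyed hash could be reversed by hashing candidate addresses. The encrypted copy is not shown, since it would need an authenticated encryption library and a key stored outside the database.

```python
import hashlib
import hmac


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent spellings hash identically."""
    return email.strip().lower()


def email_lookup_hash(email: str, key: bytes) -> str:
    """Keyed hash (HMAC-SHA256) of the normalized address, used only as a lookup index."""
    normalized = normalize_email(email).encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would live outside the database.
LOOKUP_KEY = b"example-key-not-for-production"

# Value stored at signup.
stored = email_lookup_hash("Alice@Example.com", LOOKUP_KEY)

# At login, hash the submitted address and compare in constant time.
submitted = email_lookup_hash("  alice@example.com ", LOOKUP_KEY)
matches = hmac.compare_digest(stored, submitted)
```

If the database leaks without the key, the stored values cannot be checked against guessed addresses, which is the property a plain hash lacks.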

I started getting a lot of spam about one month ago too and even emailed
LastPass a bit angry. But this Mozilla incident could well be the cause of the
spam...

~~~
nitrogen
The encrypted e-mail address has to be read somehow, so it's just as likely
that an attacker gets the decryption key as the database itself (unless you
use e.g. a hardware security module). That's probably good enough for e-mail
addresses, but as you likely know, not acceptable for passwords.

------
mp4box
Can someone explain the meaning of "data sanitization process of the site
database had been failing"?

Isn't that another way of saying SQL injection?

~~~
jzwinck
I wondered about this too. At a guess, perhaps they were doing a straight
database dump from a production system which had sensitive information as well
as public data. They would then run a script to delete the sensitive columns
before posting the dump.

This seems likely to have been broken at the design stage: systems should fail
safe. The first-order fix might be to check the return value of the sanitizer
script and refuse to upload if it failed. But a better solution would be to
write a system which makes it much less likely to leak private data. For
example by copying only whitelisted columns (so if new sensitive columns are
added to the system they are not dumped by default). Or storing sensitive data
in separate tables or even a separate database (this will take more work if
levels of sensitivity change over time).
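
The whitelisted-columns idea could look roughly like the sketch below. It uses in-memory SQLite databases as stand-ins, and the table names, column names, and `export_public_copy` helper are all hypothetical, not anything from MDN's actual setup.

```python
import sqlite3

# Hypothetical allowlist: only these tables and columns are ever exported.
# A newly added sensitive column is excluded until someone lists it here.
PUBLIC_COLUMNS = {
    "users": ["id", "username"],
    "documents": ["id", "title", "body"],
}


def export_public_copy(src: sqlite3.Connection, dst: sqlite3.Connection) -> None:
    """Copy only allowlisted columns from src into dst; everything else stays behind."""
    for table, columns in PUBLIC_COLUMNS.items():
        # Names come from the trusted constant above, never from user input.
        col_list = ", ".join(columns)
        placeholders = ", ".join("?" for _ in columns)
        dst.execute(f"CREATE TABLE {table} ({col_list})")
        rows = src.execute(f"SELECT {col_list} FROM {table}")
        dst.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    dst.commit()


# Demo: "production" holds sensitive columns, the export does not.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER, username TEXT, email TEXT, password TEXT)")
src.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com', 'salted-hash')")
src.execute("CREATE TABLE documents (id INTEGER, title TEXT, body TEXT)")
src.execute("INSERT INTO documents VALUES (1, 'Home', 'Welcome')")

dst = sqlite3.connect(":memory:")
export_public_copy(src, dst)
exported_columns = [row[1] for row in dst.execute("PRAGMA table_info(users)")]
```

The difference from a delete-the-sensitive-columns script is the failure mode: if this step is skipped or crashes, nothing is exported, rather than everything.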

I've speculated here about the details to illustrate the point about systems
design. Unfortunately, too often the glue code for these sorts of things is
written with little or no error checking, so when something is wrong the
system just proceeds through unknown or unvalidated states as we see here. It
doesn't help that the default language for cron and a lot of "supervisory"
jobs tends to be Bash (or Dash) these days, where error checking is turned off
by default.
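
For illustration, here is a small sketch of glue code that does stop at the first failing step. The commands are stand-ins (a "sanitizer" that exits non-zero and a "publish" step that only prints), and `run_pipeline` is a hypothetical helper, not a description of Mozilla's tooling.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each command in order and stop at the first failure.

    Returns True only if every step exited with status 0.
    """
    for cmd in steps:
        try:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True)
        except (subprocess.CalledProcessError, OSError) as exc:
            print(f"step failed, aborting before publish: {exc}", file=sys.stderr)
            return False
    return True


# Stand-in commands: the sanitizer fails, so the publish step must never run.
failing_sanitizer = [sys.executable, "-c", "import sys; sys.exit(1)"]
publish = [sys.executable, "-c", "print('published')"]

ok = run_pipeline([failing_sanitizer, publish])
```

In a shell script, `set -e` together with `set -o pipefail` gets part of the way to the same behavior, but it has to be turned on explicitly.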

~~~
Rapzid
Yep, naive bash scripts don't stop on failure and your non-sanitized file will
happily get uploaded.

Does anyone know why Mozilla was posting database dumps (sanitized or
otherwise) onto public servers?

~~~
jeffbryner
It was requested in
[https://bugzilla.mozilla.org/show_bug.cgi?id=932869](https://bugzilla.mozilla.org/show_bug.cgi?id=932869)

------
matugm
I don't understand why they had to do this. Couldn't they just use a schema
dump with random data? They are already setting the passwords to null and
names to a random number in their sanitization script...
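
A rough sketch of what "schema plus random data" might look like, with a made-up `users` table and a hypothetical `make_synthetic_db` helper. No production rows are read at all, so there is nothing to sanitize. The addresses use the reserved `.invalid` top-level domain so they can never reach a real mailbox.

```python
import random
import sqlite3


def make_synthetic_db(num_users: int, seed: int = 0) -> sqlite3.Connection:
    """Create the schema and fill it with generated rows instead of real ones."""
    rng = random.Random(seed)  # seeded so the development fixture is reproducible
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, username TEXT, email TEXT, password TEXT)"
    )
    for user_id in range(1, num_users + 1):
        name = f"user{rng.randrange(10**8):08d}"
        # Password is NULL, as in the sanitization script described above.
        db.execute(
            "INSERT INTO users VALUES (?, ?, ?, NULL)",
            (user_id, name, f"{name}@example.invalid"),
        )
    db.commit()
    return db


db = make_synthetic_db(5)
count = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The trade-off is that generated data may not reproduce the odd real-world content that developers sometimes need in order to debug rendering or migration problems.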

------
simonsarris
Emails were just sent out to users, full text:
[https://gist.github.com/simonsarris/829ba1c0669c404f0da5](https://gist.github.com/simonsarris/829ba1c0669c404f0da5)

------
frik
In December 2013, Mozilla MDN switched to their self-developed Kuma wiki
software (from a hosted wiki solution). It is open source wiki software written
in Python and using the Django framework.
[https://developer.mozilla.org/en-US/docs/MDN/Kuma](https://developer.mozilla.org/en-US/docs/MDN/Kuma) ,
[https://news.ycombinator.com/item?id=6876636](https://news.ycombinator.com/item?id=6876636)

~~~
groovecoder
We launched Kuma on August 3, 2012. That post is about the MDN redesign we
launched in 2013.

------
sheetjs
The email that was just sent out to MDN users seems to differ from this post.
The email says:

> Your email address (but not password) was posted on that server for that 30
> day time period.

There is no other mention of the word password or hash (encrypted or
otherwise). However, the post says

> in the accidental disclosure of MDN email addresses of about 76,000 users
> and encrypted passwords of about 4,000 users on a publicly accessible
> server.

~~~
groovecoder
The emails are customized for each type of affected user.

------
jvehent
tldr: an automated data sanitization process failed, emails and salted hashed
passwords were disclosed. no server was hacked.

------
billyhoffman
There is much that could be done to improve this announcement:

1- What does "encrypted, salted passwords" mean? MD5 with a static salt? Holy
shit, that's a problem. bcrypt? Less so. I have no context to know how
concerned I should be, or any indication of how incompetent, or awesome,
Mozilla's existing processes and defenses are. Fail.

2- They talk about a "data sanitization process" failing, but then talk about
a "database dump file" being publicly accessible. Say what? This could mean
anything from "an input validation error allows wrong passwords to work" to "we
do a regular database dump, and store that on a public HTTP directory for some
cron job to grab." Without explanation, I assume the worst. Fail.

3- "While we have not been able to detect malicious activity on that
server..." Again, without the context of what happens, this statement is
worthless. If you leaked the database of your users, I won't expect any
malicious activity. An adversary wouldn't attack Mozilla. They would crack the
passwords of the users and attempt to hijack their accounts on other sites
that matter, like, banking or ecommerce sites. At best Mozilla knows this and
just wanted to include some proof-point that at least they have logs/basic
monitoring of stuff in place, and wanted to save face. At worst, Mozilla truly
believes that someone not actively attacking them somehow means that nothing
bad will happen from this loss, which is stupid. And Mozilla's Security
usually isn't stupid. Fail.

4- "In addition to notifying users and recommending short term fixes, we’re
also taking a look at the processes and principles that are in place that may
be made better to reduce the likelihood of something like this happening
again." This is a completely unsatisfactory statement. If you just discovered
the problem this afternoon, like, "oh shit, why is there a .sql dump in our HTTP
readable /backups/ folder?" then saying "hey, we discovered a problem, we
think we have stopped it, and we are looking into our processes" is a
reasonable response. However when you have "just concluded an investigation"
you should, I don't know, tell us your conclusions maybe? What happened? Why
did it happen? What changed in your existing system that allowed it to happen?
Or has this shortcoming always existed? If so, who is defining/vetting your
processes? What are you doing so this issue doesn't happen again? What other
things are you doing to watch the thing that's going to make sure it doesn't
happen again? Instead, we get a generic statement. Fail.

While not as completely opaque as some "oh no, we got pwn3d" posts, this blog
post has completely failed to do the 3 things any post of this kind should do:
1) educate me about what happened 2) help me understand the risk Mozilla's
actions have exposed me to, and 3) give me confidence by demonstrating clear
actions you are taking so this won't happen again.

Yes attacks happen, but when a company or organization is up front, honest,
and over communicates, it does wonders to calm the situation.

Mozilla, I expect more from you.

~~~
jvehent
A process failed, and the DB dump that is published to help contributors
improve the MDN site got out unsanitized. The sanitization/publication process
will be redesigned to include stricter controls. For now, it is shut down.

MDN has been using Persona for a while now, meaning that most accounts don't
have passwords in the database. But older accounts still had the SHA256 salted
hash that Django creates.
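
To give a sense of why the hashing scheme matters, here is an illustrative comparison of a single-round salted SHA-256 with an iterated PBKDF2-HMAC-SHA256, using only Python's `hashlib`. This is not Django's code. The storage format and any iteration count used on MDN are not stated in this thread, and the function names and the 100,000 figure below are assumptions.

```python
import hashlib
import os


def salted_sha256(password: str, salt: bytes) -> str:
    """One round of SHA-256 over salt + password: cheap to compute, so cheap to guess against."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def pbkdf2_sha256(password: str, salt: bytes, iterations: int = 100_000) -> str:
    """Iterated key derivation: each guess costs the attacker `iterations` HMAC rounds."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return derived.hex()


salt = os.urandom(16)  # per-user random salt
fast = salted_sha256("hunter2", salt)
slow = pbkdf2_sha256("hunter2", salt)

# A different salt gives a different hash for the same password,
# which defeats precomputed tables but not per-account guessing.
same_password_other_salt = salted_sha256("hunter2", os.urandom(16))
```

A per-user salt stops one precomputed table from cracking every account, but with a fast hash each leaked entry can still be attacked by brute force, so reused passwords remain the main risk.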

We traced back as much as we could. Access logs, netflow data, etc... We found
that the tar.gz containing the DB dump had been downloaded only a small number
of times. Mostly by known contributors. But we can't rule out that someone
with malicious intentions got access to it.

~~~
Pacabel
Who exactly are these "known contributors", and why did they have access to
this data? Why did they not report the problem earlier?

And if it was downloaded "mostly" by "known contributors", who was involved
with the rest of the detected downloads?

~~~
jeffbryner
[https://bugzilla.mozilla.org/show_bug.cgi?id=932869](https://bugzilla.mozilla.org/show_bug.cgi?id=932869)
was the request for a sanitized DB for folks wanting to develop MDN itself. We
could trace most of the handful of IPs that downloaded the file during the
period when it was unsanitized to individuals (i.e. IPs inside Mozilla
offices, etc.). However, because some IPs were unknown, public, or potential
NAT addresses, Mozilla decided it was best to disclose the issue.

~~~
Pacabel
If some of the accesses were by people or systems within Mozilla, can you
please address why a month went by before the problem was noticed?

If there was enough need to justify putting forth the effort required to
export a sanitized version of these data for developers to use, then why
didn't these users notice that something was wrong much sooner? And if they
did notice, why weren't the appropriate parties within Mozilla notified
sooner?

Could you please provide more specific details about these IP addresses that
couldn't be accounted for, too? Perhaps a list of them, for instance? At least
then affected users will be able to make their own call regarding their level
of risk due to this incident.

~~~
jeffbryner
Sorry, I can't provide a list.

~~~
Pacabel
Why not?

~~~
ygjb
Because our privacy policies state that we won't disclose personally
identifiable information about users, and IP addresses can be personally
identifiable.

Unfortunately security incidents happen, but we won't violate the commitments
we have made to our users; in this case, if we revealed the IP addresses we
would have another, deliberate information leak on our hands.

------
tomjen3
Can we please start making people go to jail when this happens? I am so tired
of having personal information leaked so often.

