Marketing Firm Exactis Leaked a Personal Info Database with 340M Records (wired.com)
429 points by georgecmu on June 29, 2018 | 294 comments

Oh, man, another missed opportunity to make the average Joe Six-Pack become aware of data aggregation and privacy violations. If the researcher had downloaded the 2TB of data and published it as a torrent, then laymen might care. When someone can query the list and see his own personal information being broadcast, they will understand. When they realize that anyone can look up the address, phone, and all sorts of other info about their wife, husband, girlfriend, boyfriend, boss, children, or neighbor, they might get an inkling that privacy isn't such a stupid thing to worry about.

I realize that we all suffer if it gets made into a torrent, but sometimes pain is necessary to get action.

Within a week, this whole thing will be forgotten and nothing will have changed because privacy is too abstract for most people -- they need to see the personal information that's being collected. The researcher acted properly, but going full Snowden would have had much greater impact on getting better privacy-preserving laws and technology.

"Missed opportunity" ?

People can be stabbed in the back if they go into dark alleys without watching behind them. Let's stab a few people who go into these alleys so that everyone will be afraid to do so and we have an opportunity to prevent people being stabbed in future by making them aware.

Why would you possibly think this is a good idea? The idea is to prevent pain, not cause more pain in some bizarre attempt at making people afraid. There's enough privacy violations - we don't need to be making more of them ourselves.

I actually agree with the parent's perspective. As I see it, there are three potential states for sensitive data:

1. Secured and private. This is data not exposed in any breach.

2. Unsecured and private. This is data which has been exposed in a breach, and which must be sought out by the reasonably tech savvy.

3. Unsecured and public. This is data which has been exposed and can be easily used by anyone.

We want all sensitive personal data to be in state 1. But because of the taboo of state 3, we end up in a situation where we're hostage to state 2, because everyone wants to treat published sensitive data as if it were still private. That takes power away from the non-tech savvy victims of breaches but doesn't diminish the power of tech-savvy criminals who want to use the data.

In my opinion, forcing all sensitive data to be considered either secure or insecure (instead of the weird, quasi-private state 2) would take power away from people who want to use it. Every time a new breach happens there is a race to use it before it's not useful anymore. I believe we could meaningfully defang these breaches by completely leaning in and demonstrating how public the data is. If there were a party truly committed to that and they couldn't be stopped, my hypothesis is that things would actually change.

I think this should be called the 'haveibeenpwned' philosophy or the 'Troy Hunt paradigm'

No, because Troy Hunt and HIBP will not allow you to search the contents of the breaches. He is explicitly against this philosophy.

Your analogy misrepresents the grandfather's point. A closer analogy for his argument might be:

- Some high number X of dark alley stabbings occur each year.

- But alleys still "feel" safe to people, because the stabbings aren't well-publicized. So people don't know to avoid them and the rate X remains the same.

- Let's publicize alley stabbings in an emotionally impactful way, so people know to avoid alleys and we can bring X down.

In the actual case at hand, the argument is that you break a few eggs so people understand the issue viscerally, and hope to achieve massive regulatory change because people now actually care. I don't know if it would work, but it's a more reasonable idea than you're making it out to be.

Solving the root problem here is orders of magnitude more important than any single data breach today is.

I don't think this is correct. For all the people who would have their data exposed in a public torrent, their data is likely safe at present and just needs to be removed from that website. If you put it in a torrent, you're hurting all of those people in a very direct way - you're the one stabbing them in the back.

What the authors here did is correct - they've publicized the issue. Releasing this data as a torrent is not 'publicizing' anything - it is stabbing millions of people in the back, and then waiting for the crowds to come and gape at the dead bodies.

> Let's publicize alley stabbings in an emotionally impactful way

The top post doesn't promote publicizing data breaches that already happened. It promotes obtaining and publishing data that wasn't published before. These are completely different things. It's like making a TV series about alley stabbings, and stabbing actual people in the alley to get better scenes for it. The former is great; the latter is a heinous crime which can ruin the whole cause.

It says that the tech savvy bad actors may already have it

Doesn't matter. It's like justifying mugging by saying "well, criminals might have mugged you anyway, if not me then somebody else". The fact that somebody else might have committed the crime does not justify committing it again.

There is logic at play here, even if you disagree with the approach behind executing it. It's pretty simple psychology that when your neighbor gets robbed it "hits home" with you more than hearing about nameless people on the news suffering the same fate.

Doesn't mean you go rob people's houses.

>People can be stabbed in the back if they go into dark alleys without watching behind them.

Only in certain countries...

For many people, the benefit of being able to look up information is greater than the cost of letting other people have this ability - most people still won't care too much even if they know their data is published in this way. (For example, most people were willing to have their home telephone number published in a phone book)

For some people, the cost of letting other people look up your information is overwhelmingly huge - this is why privacy should be regulated.

We don't really "all suffer" the same - some people suffer disproportionately (stalking, harassment, abuse).

Publishing the data as a torrent is unlikely to change people's opinion, but will almost certainly harm people.

Don't take this approach.

Everyone has secrets. Even if you're not harmed by this data leak, illustrating how harmful one could be would move people, I think.

Vaccines can give you a fever but we still take them because the short term side effects are worth the long-term benefit. Leaking this type of information to the public operates under the same principle.

Giving personal information to stalkers may kill a few people, but at least the public would be more aware.

You can already make the sort of nuanced queries you're talking about for any of the hundreds of millions of Americans whose records have been leaked in one of the state voter or B2C lead-gen databases. The dumps are all freely available on databases.today and various forums. Phone numbers, addresses, names, email addresses, relationships, members of household...it's all in there, even if paid searches like Intelius don't have it.

Unfortunately, anyone who tries to normalize the data and release a public frontend for querying it will probably be dropped by their hosting provider and ostracized by the security community. People don't tend to like the idea of what you're talking about and will blame the person hosting the information as much as the people who leak it; much like how Troy Hunt will never release the HIBP corpus of normalized password dumps, he'll only allow you to see if you're in it.

The impact of searching your personal data with that kind of granularity would probably be more dramatic than seeing your compromised passwords online, but I bet it would be even more vilified.

> If the researcher had downloaded the 2TB of data and published it as a torrent, then laymen might care

Nope. In fact, you couldn't be more wrong. The outcome Joe Six-Pack would get from it is not that "data aggregators are dangerous" but that "security researchers, privacy advocates and cyber-criminals are pretty much the same, they are doing the same thing - stealing your data from honest, hard-working marketers - and then hiding behind 'privacy' and 'research' when they get caught". And most of the press will run with it gladly, it's an entertaining story.

You can't do your cause - whatever it is - a worse disservice than to commit a crime "to show them". That makes you a criminal - whose argument will be ignored because nobody wants to agree with a criminal - and your cause the one which is promoted by criminals. It's very hard to argue from this position. Sometimes there's no choice - i.e. if the whole enterprise is criminalized in advance, as is criticism of the power in totalitarian states. But nobody smart should put oneself in this position voluntarily.

> going full Snowden would have had much greater impact

Snowden revealed secrets of the NSA that did not hurt the average citizen - on the contrary, in many cases they were deployed against the average citizen. In this case, you would be the one directly hurting the average citizen. You wouldn't get the Snowden cape.

While I think it's the most effective to leak the data of the persons that are able to change the rules and pursue justice in this case, I do think that these exact people will try everything in their power to make an example out of you - the leaker - and you'll end up as a second Aaron Swartz.

Also I think it's interesting that people say it is "leaked" while what actually happened is that the price of this data got lowered to zero for a few lucky souls.

That’s career suicide, and it would likely come with "let’s make an example out of you" sentencing (depending on where you live).

The researcher was using Shodan to probe the entire range of IP addresses allocated in the USA. He found an unprotected site and queried it knowing it should not have been accessible. He queried the personal info on specific people that WIRED asked him about. He revealed data to a third party ("a sample of the data Troia shared [with WIRED]").

An argument could be made that every step above was illegal. I don't agree with that argument, but surely you've heard of (many) cases where people have been prosecuted for things like that.

My point is that he's already taking risks.

There's still a difference between "probing a couple of data points to establish a breach" and "downloading the thing wholesale and spilling it out". The first can be reasonably argued as legit research - not always successfully, true, there is a lot of overzealous prosecution, but at least there's a case. The second is clearly either premeditated malice or reckless disregard for the harm to others - you'd have no defense for it.

That argument you propose isn't popular opinion though. He is taking risks if people in power choose to push an agenda but it's totally different in risk if he goes the other path of illustrating something to average joe.

Then he should have used Tor.

In light of my siblings' comments perhaps only lawmakers and their companions ought be exposed in this manner. I'm reminded of the swift enactment of restrictions on videotape rental records sparked by release of Judge Bork's rental records.

Heh. This is a copy/paste comment from the last dozen leaks.

It's interesting that we consider this a leak only when the marketing firm loses the data. If we lived in a just society we would consider it a leak once the marketing firm got the data.

It's a leak because there wasn't an invoice attached to what would otherwise be business as usual: the data being obtained by sketchy third parties.

Marketing companies are sketchy third parties.

That was a significant component of my point.

more sad than interesting because we normally don't know when and which firm(s) got our data. I never heard of this company until today.

I've been a proponent of this idea:

Make companies "super-liable" for any data beyond the data they (actually) need for the functioning of the service that is stolen in a data breach from their servers.

This would hopefully not just encourage more companies to believe that data is "toxic" [1] and treat it as a liability, not as an asset, but it would also encourage them to adopt end-to-end encryption in as many types of services as possible (and eventually stuff like homomorphic encryption or any form of encryption that doesn't give the company itself and hackers direct access to the data).

[1] - https://www.schneier.com/blog/archives/2016/03/data_is_a_tox...

The data in this case is their only asset. They are a data broker, and their entire existence is predicated on the idea that this data is valuable to them.

I think we need to teach people that their data is valuable, likely dangerous in the hands of others, and not to spew it all over the web. Kinda like we did before FB convinced everybody to use their real names.

>The data in this case is their only asset. They are a data broker, and their entire existence is predicated on the idea that this data is valuable to them.

Other than them, who cares? If you want to put people in harm's way, you should accept the consequences when harm occurs.

>I think we need to teach people that their data is valuable, likely dangerous in the hands of others, and not to spew it all over the web. Kinda like we did before FB convinced everybody to use their real names.

No, it's much easier to hold the companies accountable, and they should be held accountable. No company that suffers a "data breach" should have the resources to exist after the breach. Society should punish them out of existence, because they are known cancers.

Considering that their entire business model is "selling this data" then this data is actually needed for the functioning of the service.

I don't know that this invalidates mtgx's general point. Right now, data brokers have effectively zero liability, but we don't treat other companies dealing with dangerous or toxic materials the same way. If a company handling money or munitions left their doors wide open, we wouldn't defend their gross negligence, we'd hold them accountable.

> It's interesting that we consider this a leak only when the marketing firm loses the data.

We don't consider this a leak when the marketing firm loses its data. It's only a leak when we find out that the marketing firm has lost control of its data.

And what really changed? They used to sell the data, right? Now everybody has it (instead of only a percent). Is that worse?

This is pretty much where I’m at as well, on this issue. I only wish there were social security numbers contained in the breach so that we could stop considering them as personally identifiable info.

I agree with your meaning, and I think it goes beyond that. SSNs really are PII by definition; what we need to do is stop pretending that we can use any kind of PII in general as a form of authentication. Whether it's SSN, mother's maiden name, or any of these inane "security questions" that (thank goodness) finally seem to be receding from their peak, we need to move away from the fundamentally broken "tell me something about yourself that only you would know" model of authentication.

Agreed. Thank you for formalizing that ideal more coherently.

With reasonable verification, anyone confirmed to be a part of this breach should be given access to the data, if only for good will. It's a sad state to see that the recklessness (or incompetence) of one entity, and at that a private one, can quickly become a domino in a chain that ends in toppling a person's privacy.

They advertise themselves as having the most accurate data (why wouldn't they advertise themselves this way?) If so, the people it affects have a right to know, and it seems that they have the means to contact them and let them know.

With GDPR that would be your legal right.

... if you are a European resident.

Correction: if you are physically present in the EU

Just as medical tourism is a thing, are we going to see privacy tourism emerge as an option? Tour operators can start offering packages...

"The sights of Paris and a personal information purge from the 100 largest US collectors"

Is physical presence the only requirement to be considered a resident? I thought you had to be a resident in the EU.

It's not considering you a resident. The GDPR reads much closer to a declaration of a human right. For example:

Recital 14 - "The processing of personal data is designed to serve man; the principles and rules on the protection of individuals with regard to the processing of their personal data should, whatever the nationality or residence of natural persons, respect their fundamental rights and freedoms, notably their right to the protection of personal data"

Article 3 (2) - "This Regulation applies to the processing of personal data of data subjects who are in the Union"

This hasn't been tested, and each member state could prosecute differently, but it was certainly discussed and then structured in such a way to be a fundamental truth, and in my non-legal opinion (based mainly just on having read the majority of it) it would be interpreted as such by EU courts (ie, not member state courts)

"anyone confirmed to be a part of this breach should be given access to the data"

but how would they ever get the contact information for all of those people? surely that's private information....

oh... right ಠ_ಠ

Much more than just personal privacy. When CEOs, politicians, judges and generals use the internet too, do you really want to be the guy/a company that gives them that call? The incentives are all messed up.

The only real strategy is to totally pollute the information with false and erroneous information, while also setting up ways to prevent tracking and fingerprinting and associating. I am somewhat surprised that someone has not yet really emerged as having developed a business model around assuring privacy. It could be dedicated routers with firewalls and built in VPN that also mask device names, combined with browsers and extensions that intentionally pollute browsing history and fingerprinting data, and sends bogus queries and also allows you to set policies for cookies in a little more user friendly manner to only retain specific cookies of specific domains, etc.

From their privacy policy:

“In order to be in line with Fair Information Practices we will take the following responsive action, should a data breach occur: We will notify you via email • Within 7 business days We will notify the users via in-site notification • Within 7 business days We also agree to the Individual Redress Principle which requires that individuals have the right to legally pursue enforceable rights against data collectors and processors who fail to adhere to the law. This principle requires not only that individuals have enforceable rights against data users, but also that individuals have recourse to courts or government agencies to investigate and/or prosecute non-compliance by data processors.”

Information inequity. Whoever has access to this data has an advantage over 340M people, and an opportunity to understand and influence them.

I think the antithesis of that would be information redistribution. Everybody should be entitled to access all of this information if anyone has it. Just for fun, let's say the only caveat is that all information access is also public and linked to each identity.

Do you think it's better off in the hands of the highest bidders?

Companies are in some cases (in many cases actually) perfectly allowed to collect user information and from a business perspective would be stupid not to.

Every time you use a loyalty card that information is collected and yes it's used to understand you and perhaps even influence you, to buy certain products. Buying diapers? Have a look at these baby toys. Most people will throw their personal information out there for a price reduction.

The problem here is the leak, not the fact that it exists.

When did I buy diapers from Exactis? I’ve never even heard of them.

I know exactly what my Safeway card is used for. I also deliberately do not register my phone number or other information to it. Of course they can probably associate it with my credit card but these things are easy to reason about.

The real problem is combining all these datasets in one place for the purpose of perpetuating information asymmetry as a product.

So actually yes, the problem is that this dataset of every single American exists.

You seem to believe you can ward off information collection efforts by controlling yourself what you do and do not communicate to the rest of the world.

While I sincerely admire the quixotic effort, I suspect you are fighting a losing battle.

There are countless situations in daily life where you have no choice but to leak some tiny bit of information about yourself to an external database, and from there on, it's just a matter of cobbling the bits back together.

> and from there on, it's just a matter of cobbling the bits back together

Maybe it shouldn't be. The bit of info I gave about myself I gave (even if implicitly) to a specific entity for a specific purpose. To sell or give that bit to another unrelated, unknown to me entity for an entirely different purpose is a violation.

I agree.

But there's unfortunately no regulation in place to ensure that.

And if there were, GDPR style, there would still be the matter of:

    - enforceability

    - exceptions for e.g. authorities

You have misunderstood my comment. I realize I have no control over what Safeway does with my information or if they or someone else correlates it with other datasets. The data Safeway has is fine by me until it is correlated with other datasets.

That leakage may be inevitable but the correlation is not. We just allow it today. GP claimed that the existence of the Exactis dataset was not a problem. I disagree. That dataset exists only because many disparate sets were linked with that inevitable leakage.

>The real problem is combining all these datasets in one place for the purpose of perpetuating information asymmetry as a product.

Rephrased, the real problem is (currently) what happens once it gets combined with that wealth of other data (that's been purchased, shared, snooped and swindled) belonging to our data overlords like Google.

>Of course they can probably associate it with my credit card

Or if you've ever furnished an ID for some age restricted purchase while also using the loyalty card. Then of course there's location data from Android/smartphone, vehicle telemetry (mfg, finance company, mobile data service, OnStar, anti-theft service, insurance co 'safe-driver' tracking device), members of the Telco mafia (VZW, AT&T, etc.), video surveillance providers running facial & license plate recognition, et cetera.


Safeway may outsource the collection and management of this data to a third party and that company may have a lot of clients and hence records of a lot of people.

I have no idea if that's what Exactis is / does and people may not be aware of this, but it's the reality.

EDIT looks like Exactis gets information on users through cookies, which is not the scenario I wanted to highlight.

In your example does Safeway need to do all the data analysis on their own? Why can’t they contract out to others to analyze the rewards card data. Rarely do I know every subcontractor a business I interact with is using at the time.

I think people are worried about their data being sold, not some limited chain of custody where the data is only being used for the original merchant's analytics.

The issue is not that a third party is involved, the issue is what that third party does with the information.

Just like with weapons of a more physical nature, the problem is that they exist. The fact that they exist means that over time they/it will fall into the wrong hands. The obvious and natural solution to this is to not have them exist, or in this case to not have this database exist.

Yes, I understand companies are allowed to do it. That's beside the point - just because they are allowed to right now doesn't make it right.

> The problem here is the leak, not the fact that it exists.

The fact that the information exists guarantees that it will leak. If not from one company, from the next.

Personal information is like hazardous chemicals.

>Personal information is like hazardous chemicals.

Big data is munitions, ammunition that is loaded into algorithms which are like machineguns that can fire at the speed of light millions of times across the world in under a minute.

There is a difference between a company collecting information related to its transactions and a data aggregator.

There is currently 904.8 TB of data available on Internet-exposed Elastic clusters. Here is an overview of where these servers are located:


2.6MB per person on average? That's a lot of personal data...

An exposed elasticsearch server does not equal personal data though. It can be used for anything really. I have two systems that use ES and neither of them holds personal data.
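For what it's worth, the 2.6MB figure looks like the 904.8 TB total divided by the 340M records in the headline. A rough sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes) and, for comparison, the 2TB size quoted for the Exactis data at the top of the thread. The variable names are made up for illustration:

```python
# Back-of-the-envelope sizes, assuming decimal units (1 TB = 10**12 bytes).
TB = 10**12
MB = 10**6
KB = 10**3

records = 340_000_000            # records in the Exactis database (headline)
all_exposed_bytes = 904.8 * TB   # ALL Internet-exposed Elastic clusters
exactis_bytes = 2 * TB           # size quoted for the Exactis data itself


def bytes_per_record(total_bytes, n_records):
    """Average number of bytes per record."""
    return total_bytes / n_records


# Only meaningful if every exposed cluster held Exactis data (it doesn't).
mb_if_all_exposed = bytes_per_record(all_exposed_bytes, records) / MB
# Using the 2TB figure for the Exactis database alone.
kb_exactis_only = bytes_per_record(exactis_bytes, records) / KB

print(f"{mb_if_all_exposed:.2f} MB per record across all exposed clusters")
print(f"{kb_exactis_only:.1f} KB per record using the 2TB figure")
```

So the per-person number only holds if all 904.8 TB were this one dataset; with the 2TB figure it comes out to a few kilobytes per record.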

A lot of people complained that GDPR was too onerous on small firms and that they should be exempt. According to LinkedIn https://ie.linkedin.com/company/exactis-llc Exactis has just 10 employees (obviously some error possible. Call it 15-20?)

Now do you think small firms can’t hold large quantities of damaging data?

notably in the UK the size of a company is determined not just on employee numbers, but on turnover as well. For example in UK Government guidance on lodging company accounts with Companies House[0] it says:

"There are thresholds for turnover, balance sheet total (meaning the total of the fixed and current assets) and the average number of employees, which determine whether your company is a micro-entity, small or medium-sized."

And there are different requirements for each

[0] https://www.gov.uk/government/publications/life-of-a-company...

That sounds a lot like "I told you so" tone when I still disagree with you. But in case you're here to talk about it and not just to assert your version of the truth, no, I don't think anyone ever claimed that small corps are a loophole. Then big corps would just delegate it to a shell company and be done with it. European law is, to the best of my knowledge, fairly reasonable: if you do something wrong regarding privacy either because you didn't know (like, you tried to follow GDPR but missed something) or do a small thing, you won't get ridiculous fines. But if you're a 10 person company working with huge amounts of personal data and you were grossly negligent, then of course they'll look at that differently from a 10 man company that produces pencils for retailers and incorrectly stored customer's delivery addresses.

What I'd love to know is how much of that is codified law (as in in the actual act) as opposed to just expected to come from reasonable courts.

Courts will always base their decisions on case law, and I suspect that you can reasonably expect a certain kind of GDPR case law to arise, given what the standing case law is already.

The EU has a civil law system where the US has a common law system.

Common law gives judges an active role in developing rules; civil law is based on fixed codes and statutes.

Case law is not binding in the EU.

> Common law gives judges an active role in developing rules; civil law is based on fixed codes and statutes.

This is a dramatic and misleading oversimplification. Under civil law systems, judges still do have great leeway with interpreting and applying regulations. And under common law, it's not really true that judges have an active role in developing rules - they have the ability to interpret them in the contexts of cases which come up, but they don't legislate. The closest thing that they can do (aside from overturning provisions) is to introduce limitations or tests on existing law that is challenged, but even then they're mostly only allowed to do that to the extent that they are using the tests to connect the law back to the Constitution or other existing legislation.

Case law is not binding in civil law (at least not to the same degree as it is under common law), but does definitely play a significant role.

Furthermore, it's flat-out wrong to say that "case law is not binding in the EU". The Republic of Ireland and the UK both use common law, under which case law is binding. Not only are UK court decisions enforceable across the entire EU, but UK law is actually the jurisdiction for a lot of contracts and agreements within the EU, similar to how New York is the chosen jurisdiction for a lot of contracts or even international treaties that are enforced worldwide, whether or not the parties are based in New York.

Even if you're referring specifically to legislation passed by the European Parliament itself, it's still not really correct to say that case law isn't binding. The European Parliament is an international body held together by international treaties, and while EU courts might have decided to use civil law in interpreting legislation passed by the European Parliament itself, that doesn't mean that case law does not come into play, either in countries with common law systems or even in countries with civil law systems. It's way more complicated than that.

This is, incidentally, one of the problems that Brexit is currently introducing: it's unclear whether parties that have elected to govern their contracts under UK law will continue to be able to do so with the expectation of enforceability.

Wow. Thank you for explaining that. I've never fully understood the distinction between the two.

There is no doctrine of stare decisis in EU courts. Case law is not binding. Further complicated by the huge number of courts that might hear a case, dependent on the DPA.

The French CNIL just fined an association for 75,000 € for a leak in their data.

It was a 2017 case, but I guess it will reflect what can happen ?

Can you link to this? Searching for "CNIL", "75,000" and "2017" doesn't turn up anything useful.


tl;dr: a non-profit got fined 75K€ because their website leaked 42,562 private documents from their users. Anyone could modify numbers in the URL and read other users' documents. The documents included passports, tax information, identity documents, and more.

EDIT: better source: https://www.cnil.fr/fr/sanction-de-75-000-euros-pour-une-att...
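The class of bug in that tl;dr (change a number in the URL, get someone else's document) is usually called an insecure direct object reference. I have no idea how the actual site was built, so this is only a minimal sketch of the general pattern and its fix; the store, the user names and the function names are all hypothetical:

```python
# Hypothetical in-memory store: document id -> (owner id, contents).
DOCUMENTS = {
    1001: ("alice", "passport scan"),
    1002: ("bob", "tax return"),
}


class Forbidden(Exception):
    """Raised when a user requests a document they may not see."""


def get_document_insecure(doc_id):
    # The flawed pattern: anyone who guesses an id gets the document.
    return DOCUMENTS[doc_id][1]


def get_document(doc_id, current_user):
    # The fix: check ownership on every request before returning anything.
    entry = DOCUMENTS.get(doc_id)
    if entry is None or entry[0] != current_user:
        # Same error for "missing" and "not yours", so ids can't be probed.
        raise Forbidden(f"document {doc_id} is not available")
    return entry[1]
```

With the first function, "alice" can read document 1002 just by incrementing the id. With the second, the same request raises `Forbidden` while her own document 1001 is still returned.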

Oof, I can see why then. On the other hand, if you're not storing people's passports... is this really something you should be worried about? And shouldn't somebody who's intentionally storing thousands of passports be required to implement basic security practices?

On HN, it's people associated with businesses in the latter category that seem to be complaining the most.

What if gross negligence is the industry standard?

That's when you introduce laws (GDPR) to try and change course.

The "you won't get big fines if you try your best" thing isn't in the law. I believe you that it is probably true, but it relies on the reasonableness of all current and future regulators. I don't like that.

It is in the law. It’s one of the basic principles of law.

By its very nature, however, you cannot nail such a thing down and define it precisely beforehand.

The law only says regulators should think about your intentions when assessing penalties (among many other factors).

Is there anything stopping a regulator from deciding an unintentional violation is "only" a company-destroying 5M euro fine instead of the full 10M? In fact, couldn't it still be a 10M fine? Or should I expect to be let off with a warning? Seems like I'm depending on the good will of the regulators of every single EU member state...

I do not think it's impossible to write a law that says fines for minor and unintentional violations are limited by statute.

That's what makes me nervous about interpretation of GDPR. The EU has 28 member states. Let's say each one of them has a 90% probability of their regulators being reasonable at any given time. Does that mean the chances of the regulators on the whole being reasonable are 0.9^28? (In other words, about 5%?)

As an outsider, I would love to hear that that's not how it works. Do the member states have any checks on each other's enforcement?
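
The arithmetic above checks out, but only under a strong assumption. A quick sketch, taking the comment's numbers as given (28 regulators, each "reasonable" with probability 0.9) and assuming they behave independently, which is an assumption of the sketch rather than a fact about how enforcement works:

```python
# Numbers from the comment above; independence is an assumption.
p_reasonable = 0.9
n_states = 28

# Probability that every regulator is reasonable at the same time.
p_all_reasonable = p_reasonable ** n_states
# Probability that at least one regulator is not.
p_at_least_one_unreasonable = 1 - p_all_reasonable

print(f"all {n_states} reasonable: {p_all_reasonable:.3f}")
print(f"at least one unreasonable: {p_at_least_one_unreasonable:.3f}")
```

This gives roughly 0.052 for "all reasonable", so "about 5%" is right for the model as stated. The figure only matters to a given company if all 28 regulators can actually act against it, which the model takes for granted.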

I think the root of the argument about small firms was not about employee count, but that small firms typically do not have the resources to comply. But what is Exactis’ annual profit? Maybe they did have the financial resources.

If you don’t have the resources to be a good steward of a dataset, you don’t have the resources to gather and store that data in the first place, even if it may seem easy to do so.

I would agree.

Is there a torrent yet? I want to look up my own data.

GDPR would have allowed you to ask the company for it. And to some extent remove it.

It was discovered by a white hat; he didn't publicise a data dump.

Let's hope they were the first to discover it.

I hope they were not and this data ends up public.

I’d love to search this database for the details of top people at privacy-violating companies and publish them.

> I’d love to search this database for the details of top people at privacy-violating companies and publish them.

Who defines "privacy-violating"? Jumping into the mud because you feel aggrieved just makes you look like a pig.

Facebook? Equifax? Ad networks?

Basically anyone who profits off user data and makes it difficult/impossible to opt-out.

This is pretty silly, you're going to publish something everyone already knows? What do you think that's going to accomplish? Most of these companies are publicly traded and finding top people in the private companies is just a Google search away. This business is all done out in the open.

You can even force the companies subjected to the FCRA[1] to give you a report on exactly what they have on you.

[1] They are subject to the FCRA if the data is sold to companies who make use of it in credit, employment, and housing decisions.

It's not all that silly actually. Politicians and corporate CEOs make the decision but rarely are on the receiving end of the fall-out. By concentrating on them and by distilling out that information from a much larger body of data enough of an embarrassment could be put together that they might start to pay attention.

As long as all those needles are safe in the haystack they can be ignored, a stack of needles on the other hand is not so easily ignored.

I wasn't talking about business-related data - that is indeed public already.

I'm talking about revealing the same kind of data their companies collect on us, which is way more personal and could contain embarrassing stuff.

It's not mud if you send them an email from a donotreply address saying "my privacy policy has changed, and now I will freely publish your previously private data"


Could you please not post unsubstantive comments here?

I was calling attention to the vast difference between what the status quo security industry considers white hat and what free individuals should consider white hat - low corporate pain versus actually closing the hole.

As I and others have said elsewhere, the data was leaked the moment it was collected and priced for selling to attackers. Forgoing full disclosure is really just blunting the truth, giving corporate whitewash a leg up, and delaying society learning the lesson of what we're up against. As (presumably) individuals and not owners of surveillance companies, we shouldn't bless this behavior as being in the public interest.

When will this stop? When's the last straw? If I gave a bank 100 dollars, and they lost it, I'd have avenues with which to pursue some sort of justice. If I give a company my data, and they lose it, oh well. I wish all personal data was treated like HIPAA, at a minimum.

> When will this stop? When's the last straw?

When the top folks in the US government are personally affected. Until then, "congressional hearings" and presidential ambivalence are the most action we'll get out of them. Most people don't really understand what the significance of these events is.

I hate the fact that I feel so small and insignificant because of the futility of speaking to a representative. I did try to contact Rand Paul last year. He never bothered to reply, but did add me to his mailing list. Great.

On the bright side, we do not know that he has leaked it yet :)

Since some percentage of Congressmen and Senators were almost certainly part of the breach, all we need to do is search for them once the data becomes available and post hand-curated lists of Senator McConnell's Shopping Habits.

Remember, we only got the Video Privacy Protection Act after someone published Bork's rental history during his Supreme Court nomination[0]. I hate to say that public shaming works, but public shaming works.


> When the top folks in the US government are personally affected

The OPM breach covered a lot of powerful senior people.

Not only that, but the Russian site exposed[dot]su has published personal information, including SSNs, about a lot of powerful people, including Michelle Obama, Robert Mueller, Eric Holder, and Hillary Clinton.


Indeed, however, the impact of such breaches has yet to be made explicit. It seems only a massive cyber attack would show the public what is possible with such information

I think it can be explained with game theory. Right now customer data is massively valuable to a company for all sorts of reasons. A data breach is unlikely to occur, and if it does occur then the financial losses to the company are much smaller (on a risk-adjusted basis) than the benefits. Every company has every incentive to collect as much consumer data as possible.

It will only change when consumers demand change or outright refuse to give their personal information away. I think everyone should adopt pseudonyms for everything and change them regularly.

>If I give a company my data,

In some cases you're not knowingly giving them your data either.

I just signed a rental agreement for an apartment in the US and in the fine print it says that they can share your data with whoever they want. You can't even opt out. Pretty fucked up.

I bought a car earlier this year. Exciting purchase. We had got to the final bit before they hand over the keys and there was some paperwork to sign. On page 3 was the small print about us agreeing to give our data to everyone.

So I refused and made it clear I would walk away. The sales guy went through the whole ‘it’s not a problem, I’ve bought cars from here and haven’t got spammed’. In the end he had to get a manager and it turned out that the option could be removed from the contract, three menus down in the system.

Sounds like no one had ever asked before. I imagine GDPR will have changed this to opt-in.

> You can't even opt out

I scratched that, and other lines, out of my rental agreement when I rented my New York apartment. The landlord agreed.

I wonder, do they then just enter your data into their database and sell it all anyway? Is there some way for you to ensure it's not included in all the other tenant data they share?

That's the depressing part. I usually shop at Meijer because they were the last grocery left without annoying loyalty cards. As of this year, I've begun receiving coupons in the mail for specific items I'd bought there. So either my credit card company has sold my data, or it was 'stolen' when they scanned my license to buy beer at some point (they require scanning the license, not DOB entry). I'm tired of this.

The credit card info is called "level 3 data" and they in some cases have line item by line item detail. Not just "spent $24.89 at Meijer store #349" but each individual thing, e.g. you bought 2 avocados.

Is this data available to mortals? I don't even have digital itemized receipts for credit card purchases, and it's my purchase!

Mastercard and Visa [1] sell this data in aggregate to firms via brokers like Bluekai, to allow for ad-targeting.

I don't believe it'll be feasible to purchase just one person's purchase data [easily], but if you knew who you wanted to get to, it should be possible to narrow the targeting to get to them.

[1] http://www.oracle.com/us/solutions/cloud/data-directory-2810... [ctrl+F + mastercard]

Wouldn't a GDPR request to Visa or MasterCard in the EU get me this data?

That's what I was thinking. Would be awesome for personal budgeting...

Well, they have to provide it digitally if they already have it in digital form. So yeah. Anyone wanna integrate the format they can deliver with e.g. GnuCash or so?

How does this work with Apple Pay, which doesn't tell the retailer who the consumer is? Does the retailer sell the info, or the credit card company?

Wondered about that too. Apparently Apple Pay does use the same artificial CC number with each payment (maybe only with the same merchant?) so it's still possible to have your purchases tracked over time, even if they don't know who you are.

I wonder if the credit card companies have access to line-item level 3 data if we use Apple Pay?

usually there is only space for group ids of items rather than individual item details.

but i guess it might be different for different acquirers.

the purpose of loyalty cards is that the messages are usually acquired or processed on non-bank systems so they can go into much greater detail and include individual sale item details

Is that recent? I just started getting it this year. Is there a way to opt out?

From what I know, it started a few years ago; but not all big stores had the equipment in place to send it (the cc processors give them a discount for sending the line-item level 3 data).

With the advent of the chip and pin cards in the USA, it seems logical that just about everyone upgraded to equipment that does support it; which might explain why you are only seeing this in the past year.

So, this seems a little opposite of what I meant. Naive me always assumed I pay with a card, the store gets my cc info to charge and we part ways. I'm getting in the mail ads from Meijer, for items I buy frequently. This tells me they were able to extract my home address and name from my credit card. Is that accurate?

I worked for the largest merchant acquirer ~11 years ago, and we were collecting SKU-level data for Wal-mart at that time.

Yes, these data are all available, if you are willing to pay to get those.

I noticed that now when my employees buy from staples.com, my AmEx statement will show everything they bought. From a manager's perspective, it's kind of awesome because you don't need to keep track of little receipts.

I have no problem with loyalty cards. It helps the store work more efficiently and sell me more relevant products. I have a problem when the loyalty card is tied to my identity and any data from anywhere else.

> I usually shop at Meijer because they were the last grocery left without annoying loyalty cards.

(quizzical look)

Are you aware of their MPerks program? Tied to your phone number and an email address, electronic receipts, tracking of your savings, online/in-app clipping of coupons auto-applied at checkout time, automatic "rewards" of $2-3 for every $150 you spend.

The only part of a traditional loyalty card program it doesn't have is making their sale prices apply only with card, but it definitely gives you measurable (and measured) discounts both passively through those "rewards" and actively via the in-app coupons.

Yeah I'm aware. I am not signed up for them. That's the reason I don't like Kroger, their stuff is way marked up without a card. That said, I figured out a 'trick' of just asking for a card and saying you'll fill out the application at home. Doesn't seem like such a trick now since they're all just tracking me by my payment methods. Oh well.

You could do business with places in the EU, where you are covered by the EU's stronger data protection law. If an EU-based company does stuff with the personal data of US citizens in the US, then the GDPR (etc.) applies to that EU company.

It will not stop, simply because there is a ton of money in not having it stop.

The problem is that you can use data an infinite amount of times, but spend money only once.

Data are facts, money is a repository of value. With a bank, you are the customer. With marketing, you are the product.

> my data

Data about you is not (necessarily) data you own.

I'm not saying it's right, but any reasonable discussion has to take this legal landscape into account.

That is not the European view. Data about you is owned by you, not by the company that collected or processed it.

I don't think that's exactly right. You certainly have some say in how it's used, but I don't think it's built on a notion of "ownership".

It won't.

Let's discuss how we can fix this. I'm actually considering leaving my job of 8 years for a probably to be doomed privacy startup. Either way, I'm interested in solutions and more importantly working towards them, even for free.

Legislation. We need legal repercussions for people who wantonly mishandle our personal information. Sorry, but the market can’t help us. This kind of shit needs to be illegal yesterday. It’s just impossible because we have a Congress that is so remarkably out of touch that they won’t do anything about it.

How so many people are convinced the market will save them from the market, given the last 300 years of history, is beyond me.

In America, they simply do not know the last 300 years of history.

Practically everyone knows about the second world war, but nothing is being done about rising fascism. It has nothing to do with not knowing, it's just stupidity.

And how many know about the Ludlow Massacre?

There is no rising fascism in America. If people really knew anything about the Weimar Republic, Republican Spain, or the March on Rome, they would know that the left always loses when it tries to be humane and gain power. The only time the left gains power is when 'tankies' are the ones leading the resistance.

I do not comprehend the actual point you intend to make.

But, remember the left chooses the hard road, not because they can but because it is moral.

Regardless, Trump is absolutely a fascist. It seems a bit futile to disagree.

What do you suppose can be done about it?

If you need a good example, look at Germany. They have very strict laws about it and while there are still fascists there (as there are everywhere), they do not run the country.

Edit: On a more practical note, it's always baffled me how normalised it is. People who defend fascists in the US media are still respected and hired. They might be prominent political figures. Somehow calling attention to someone who is fascist is confused with anyone doing something as simple as saying "liberals are evil!", as if the two party system there has anything to do with racial supremacy. As an outsider, you look at it and think "This is the country that helped liberate Europe from the Nazis. How are they not ashamed?". Maybe I think the first step is allowing shame to enter into things when you think about your nation, instead of a quasi-religious patriotism.

Uhmm, actually Congress is very much in touch...

With corporate interests.

Government funded elections might solve most of our society’s spiraling issues but we can’t have any clue what the value would be until we try it.

And of course we can’t make honest claims to democracy until then either.

As an uninformed observer, one of the biggest issues I see is the US two party system. Do you think government funding would help with that? I've always seen it more as a weakness of the general US electoral FPTP system.

Congress grilled Equifax, and nothing happened. The average person saw an Equifax commercial on “hey be smart we will keep your info safe with alerts” and thought “wow this company cares about my data” when it's precisely the opposite.

If Congress is unable or doesn't want to draft a bill to stop predators from milking money off of your data, then that money probably ends up in their pocket some way. Or at least some of it.

Please don't quit your job for the sake of your security - food, shelter, etc. Move these assholes out with the next election. Vote in young people that are probably as angry about this shit as you are, in hope they won’t sell out their soul.

In my ideal mind, I'd keep my job, it's pretty lax as long as I deliver. But I want to fix privacy. I'm working with a guy I respect, who is also all about privacy, but I just don't think his model works. Ideally I'll donate some dev time for free, but if I saw even a glimmer of hope, I'd go all in. We can't trust congress, we need a private sector movement. I'm offering my services for free to anyone in the space, generally speaking.

> We need a private sector movement.

You should reach out to a venture called Equifax. They are providing customer alerts for data breaches and a premiere data protection service.

Your imagination is working against you here. The obvious and well-known reason that congress is ineffective is they work for the private sector, who is the disease; not the cure.

Do you have any data to support your claim about what the average person thought about Equifax and Congress? I have a lot of strong opinions about privacy, but I'm also resigned to the fact that most people don't care about it as much as I do. So I don't guess what they're thinking.

> Do you have any data to support your claim about what the average person thought about Equifax

Here you go, first result in Google:


The claim was:

> The average person saw an Equifax commercial on “hey be smart we will keep your info safe with alerts” and thought “wow this company cares about my data” when it's precisely the opposite.

The result you cite is a general consumer favorability rating for Equifax, which is different than this particular claim. People who didn't hear anything about Equifax and Congress are included in the general consumer poll. I'm not trying to nitpick, I'm just pointing at a lack of data for this particular claim.

Figure out a way to make the data collected useless. You can't hide and block every attempt to track, collect and generally have your privacy invaded. But it may be possible to throw a wrench in the gears by making so much noise the data is low quality.

Or increasing costs of holding, exchanging, or using it.

That is what would be needed in the United States of Anarchy, but everyone is so busy trying to pull off their own scandal that I don’t think this could be done at an effective scale.

> If I give a company my data, and they lose it, oh well. I wish all personal data was treated like HIPAA, at a minimum.

And yet, when GDPR tries to address the issue, HN is full of "blocking the damned EU users completely" and "stop stifling honest companies".

It's almost as if the issue is more complicated than the solution represented by the GDPR.

I mean, I'm a "tin hat" privacy nut in the USA, but that doesn't mean that I'm a fan of 100% of the GDPR. It has plusses and minuses. It'd be nice to have a conversation about them.

The question though is what's the alternative? The IT industry has failed spectacularly in protecting citizens' personal data. I'm not a fan of EU bureaucracy, but it looks as if they are on the right side of history on this one.

> what's the alternative?

Absolute liability for data losses. Exactis lost 340 million people's data. Those people should be able to (a) form a class and (b) extract money damages from Exactis without having to prove specific harm, which is difficult to do with data loss.

A good model is Illinois' Biometric Information Privacy Act [1]. Broaden the definition from "biometric identifier" to a longer--but still specific--list. If you want to get fancy, create a regulator who can add things to the list after a public hearing. (The specificity avoids GDPR's "what's personal data?" mess. The public input mitigates the risk of unintended consequences and corruption.)

[1] http://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID=3004

That's basically what GDPR does. It broadens the scope of what is considered sensitive info and slaps a fine on people PRIOR to a breach. If a breach is found, then any breach of GDPR means EU can come after that company and hurt them seriously.

A fine and a lawsuit are very different things, especially with 340M people involved [even a $340M fine would only be $1/person]; fines don't usually go to the people injured by it, which would make sense with personal data being leaked. A fine also misses companies who are "doing what the law says" but still have some horrible flaw anyways. If you are _genuinely_ responsible for the data, meaning if something happens to it you are liable for it, then you often take more care of it above and beyond, than for simply complying with rules.

> fines don't usually go to the people injured by it,

Well they do. Our government takes money through fines and taxes and uses it to build infrastructure and provide services.

> which would make sense with personal data being leaked.

My preference would be that personal data not be leaked at all. Ideally the warnings and fines kick in long before that happens.

> A fine also misses companies who are "doing what the law says" but still have some horrible flaw anyways.

What example are you thinking of?

The GDPR is quite broad and open to interpretation by both sides.

> If you are _genuinely_ responsible for the data, meaning if something happens to it you are liable for it, then you often take more care of it above and beyond, than for simply complying with rules.

That's what the GDPR does.

Requiring people to lawyer up to make the company responsible is far weaker.

>Our government takes money through fines and taxes and uses it to build infrastructure and provide services.

Using fines to fund public services creates perverse incentives though, especially where the fines go directly to the agency that brings the case.

Well, this certainly happens at small scales: Police who issue speeding tickets in the US to fund the police force; but it's not clear it happens at large scales.

> > fines don't usually go to the people injured by it,

> Well they do. Our government takes money through fines and taxes and uses it to build infrastructure and provide services.

That's only true if you consider the public at large to be equivalent to any individual member of the public, or if you believe only the government is "injured" by a data breach.

If I stole all your money and repaid it in fines to the government instead of directly to you, would you consider the matter settled?

> That's only true...


> If I stole all your money and repaid it in fines to the government instead of directly to you, would you consider the matter settled?

I mean, that's typically how it works.

People get robbed by people who don't have the ability to directly restore what they've taken, so the state takes them into custody and makes them a productive member of society.

Would I consider the matter settled? Gee, I've seen some really stupid arguments on the Internet that have made me wish I could punch someone over TCP/IP, but while I'm wishing I'm not going to wish for that either.

> Nonsense.

Can you please detail why paying a fine to the government is the same thing as paying a fine to the people injured by a crime?

> My preference would be that personal data not be leaked at all

Me too! But not at any cost. This discussion involves thinking about scope (both in who and what is regulated), penalties (both in frequency and magnitude) and pre-emptive enforcement, if any. The trade-offs are far-reaching. A conservative approach is prudent. (It's also politically resilient.)

> GDPR is quite broad and open to interpretation by both sides

That's a sin and a virtue.

> Requiring people to lawyer up to make the company responsible is far weaker

This, too, is a sin and a virtue. The sin is it may allow bad deeds to go unpunished. But presently, everything is going unpunished. The virtue is in its prudence. It's unlikely to cause systemic harm, and we can observe its case law to more-precisely draft the next wave of rules.

It's only been law for almost thirty years in some places, but pray Europe be prudent!

And what is this "any cost" rubbish?

Many European countries (including Germany) have no class action lawsuits, so unless you want 340 million separate lawsuits, you’ll have to either use fines, or accept that this will go unpunished.

Just because there are no class action lawsuits doesn't mean you can't take collective legal action. It's been a long time since I lived in Germany, so I can't cite any recent examples, but 20 years ago there were lawsuits in Germany of broad groups.

> any breach of GDPR means EU

Replace EU by "the data/privacy regulator of the country in question"

How is a regular person supposed to even know if their data was in this particular leak?

> How is a regular person supposed to even know if their data was in this particular leak?

Reporting requirements. If you find out you're breached, you have to notify everyone involved--plus their states' attorneys general--within N days. If you find out you're breached and fail to notify at least one attorney general, that becomes a criminal liability for those who knew but didn't act.

I was working with Albany on a law in this form (notice only) after the Equifax breach. It was tabled due to lack of Equifax-related outreach from voters.

> It was tabled due to lack of Equifax-related outreach from voters.

I suspect this is because the very people who would be the most vocal about this issue are also the most politically cynical, and would never think to reach out to their representatives. That's a damn shame, if true.

A responsible company might be expected to put up some sort of portal that allows a user to check if they've been compromised.

Of course, "responsible" and "data aggregation company" rarely belong in the same sentence...

I'm suggesting that the alternative is a modification of the GDPR. It has a lot of great aspects, and some aspects that are kinda terrible.

It seems like the biggest issue with GDPR is that it comes from Europe and not the US? Historically speaking, Europe has on many issues come to agreement on technical solutions and industrial standards many years ahead of the US.

For example, Europe was first on texting on the mobile network while the US (single country) took years to come to a standard.

I think it will be the same with regards to GDPR. You (US) will discuss this for years and come up with a different law.

That's not the biggest issue that I have with the GDPR. In fact, I'm totally OK with someone doing a better job than the US at regulating privacy. However, I have some complaints about the GDPR, and there isn't very much discussion about the details; most people appear to think about it as "all or nothing" or "Europe good, USA bad" or "everything looks clear to me so what's the problem?" instead of discussing the details. You can see all of these opinions in this very discussion.

You're creating or contributing to the problem you say exists, parent above was not refusing discussion and literally asked you what alternative you have in mind, so he was open to them.

But after going several answers deep you still haven't listed any specific complaints and instead complain that nobody discusses them. This really makes no sense.

So, feel free to explain what specific things you dislike, why, and how else you would have done it, and then people would be able to discuss them with you and exchange opinions.

Saying "it's not possible to talk about x" when you don't even try to really isn't the way.

The issue isn't complicated at all. Regardless of any country's laws. Don't use my personal information for anything other than verifying my identity or record keeping. Don't give it to anyone, don't sell it to anyone, don't use it for marketing bullshit, don't analyze it to find out how to sell me things, or how to trap me in targeted advertising (which I block anyway because you have no right to spam me with them or waste my bandwidth) or filter bubbles. If you can't do that then fuck off, there's no reason I should do business with you.

As someone who owns services which had to provide our data for customers as part of GDPR I was super happy to oblige.

The work sucked, but I was more than happy to help our customers get their data from us.

Same flawed logic as in online piracy.

No one lost your data, they still have it, but someone else made a copy.

A rather irrelevant nitpick to this discussion. Let's not pretend we didn't know what lost meant in this context. And of course it's the process of making a copy that's the issue.

Actually, I think it's worth picking that nit because it emphasises that the company who leaked the data, allowed its exfiltration, still have the data.

That emphasis is twofold: 1) they can do it again, because 2) they didn't lose anything.

The corollaries being that their incentives aren't aligned with the people whose data is leaked, the company don't need to spend on avoiding leaks because they're not harmed beyond a little (bad) PR.

What was lost wasn't the data, but its confidentiality and control over it.

The pedantry is unhelpful.

Liability piercing the corporate veil to executives, shareholders, creditors, vendors, and clients/customers.

Why do you think this can be stopped? Think it through.

You can't stop data loss until you can guarantee platform security. You can't do that until you prevent developers from creating bugs and security flaws in the first place. You can only do that if you have either perfect tools to catch all the issues or a perfect testing regime.

It's basically an unsolvable problem.

That's a defeatist attitude. If you make companies liable for this, they'll start paying more attention to security. I'm not a trained security expert, but did have to explain this year why not to store plain text passwords in a database. Security is seen as secondary to product across the board. We need penalties to change this.
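
On the plain text passwords point, the alternative is only a few lines. A minimal sketch using just the Python standard library; the function names and the `ITERATIONS` value are illustrative assumptions, not a recommendation for any particular system:

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative work factor; choose one from current guidance and your hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); store these instead of the password."""
    salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the attempt and compare it to the stored one."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(key, expected_key)
```

With this, a leaked table contains salts and derived keys rather than passwords, so an attacker has to brute-force each entry separately.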

It's definitely solvable. You pass laws that make it very costly for businesses to expose personal data. Businesses are rational actors (for the most part) and will adjust accordingly, for example by not collecting the data in the first place.

We can't prevent builders and architects from making mistakes either, but they are required to comply with regulations, and take out insurance to cover their customers in the event of human failure.

If the problem is inevitable on some level, then why isn't insurance to cover that eventuality required?

I still can't understand why a leaked SSN should do me harm. SSNs are primary keys, not credentials, but everybody treats them as credentials.

Tell me about it. We need a government account that grants access to banks and utilities via oauth or some other cryptographic protocol that allows revocation at will.
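
To make the identifier-versus-credential distinction concrete, here is a toy sketch. It is not OAuth and not any real government system; `TokenRegistry` and its methods are hypothetical. The point is only that the identifier stays fixed while the secret proving you control it can be issued per service and revoked:

```python
import secrets
from typing import Dict, Optional


class TokenRegistry:
    """Hypothetical issuer mapping revocable tokens to a fixed identifier."""

    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}  # token -> person_id

    def issue(self, person_id: str) -> str:
        # A new random credential per service; person_id is not a secret.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = person_id
        return token

    def revoke(self, token: str) -> None:
        # Revoking one token leaves the identifier and other tokens intact.
        self._tokens.pop(token, None)

    def verify(self, token: str) -> Optional[str]:
        # Returns the identifier if the token is still valid, else None.
        return self._tokens.get(token)
```

If the token given to a bank leaks, you revoke that one token. If an SSN leaks, there is nothing to revoke, which is why using it as a credential is the problem.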

Except that's a political nonstarter in the US. The military has a 2FA smartcard authentication system that works really well, so it's not like it's infeasible.

A number of very different groups are very opposed to the very idea: libertarians, (some) Christians, and (many) civil rights activists being the most vocal.

The reasons for this isn't your privacy. But still it can match you as a person even if your other data is defect or incomplete. A good primary key.

It is obvious that selling customers' data gives more profit than not selling it. No wonder that in countries with little regulation personal data are collected and sold en masse. It is the most profitable strategy for companies that have those data.

As a US citizen traveling in the EU, what rights do I have under GDPR? Can I request data and erasure from Exactis while abroad?

No, as a US citizen you have no such GDPR protection. If Exactis also operates in the EU, EU citizens may request their data or erasure of their data from Exactis.

> "I don’t know where the data is coming from, but it’s one of the most comprehensive collections I’ve ever seen"

> Each record contains entries that go far beyond contact information and public records to include more than 400 variables on a vast range of specific characteristics: whether the person smokes, their religion, whether they have dogs or cats, and interests as varied as scuba diving and plus-size apparel.

It might be "comprehensive" but is it comprehensive in a scary way? It's probably just 400 machine learning features that are estimating what people might like, so not necessarily super accurate?

> so not necessarily super accurate.

Even worse. Many people say they “don’t have anything to hide” because they haven’t considered the consequences that follow whether or not they have something to hide. For starters, when the data is inaccurate, you might have something to hide that even you didn’t know about, and it could be responsible for all sorts of events and opportunities in your life, both public and private, without you even knowing. Things that give you a different life experience than your friends, to an unknown degree. This lack of knowledge and control, and the deprivation of explanation or closure, would be the lived experience of chaos, and it’s one of the most frightening parts.

False or misleading information can also be harmful, if widely disseminated.

Birth certificates. Creditworthiness.

At the age of 54, Sigmund Arywitz was a healthy American success story. He was making $30,000 a year as executive secretary and treasurer of the Los Angeles County Federation of Labor, AFL-CIO, his family was sound, his reputation high on all counts, and he had just finished eight prestigious years in Sacramento as state labor commissioner under Gov. Edmund G. (Pat) Brown. But something was awry. In the space of one year, five Los Angeles department stores refused Sig Arywitz charge accounts, and a major car-leasing company turned him down for credit -- even though he had a walletful of oil-company and other credit cards and had always paid his bills on time....



See also: Cardinal Richelieu.

> See also: Cardinal Richelieu.

Context on the reference: https://history.stackexchange.com/questions/23785/what-did-r...

Accuracy doesn't matter.

Imagine you're a pastor at a church and a datadump claims you're an atheist. Maybe you can convince people it's a mistake, maybe you can't.

More and more this just feels like the modern crisis of capitalism. The declining rate of profit is so extreme that we have to institute a corporate marketing panopticon designed to sell you shit you don't need, to the extent we're willing to risk that panopticon leaking dangerous information to non-state actors that could lead to theft, extortion, or worse.

And we're not even beginning to think about what this can be used for by authoritarian regimes (cf. https://www.madamasr.com/en/2014/09/29/opinion/u/you-are-bei...)

>And we're not even beginning to think about what this can be used for by authoritarian regimes

"Big data" was crucial to the operational efficiency of the Holocaust


I think that the right title should be "Marketing Firm Exactis Exposed a Personal Info Database with 340M Records on the Internet". This is not a leak, at least there is no evidence of it yet. While this does not downplay this security "mishap", there is still a big difference between "someone robbed a bank" and "the bank left its vaults open".

OTOH, it would be interesting to know how they got hold of such data.

Unusually for me I find your pedantry here too quibbling - even if the bank is left open taking the money is still theft (robbery is with threats/force in my jurisdiction, UK).

The point is there's no evidence a robbery occurred. Someone phoned the bank and told them they saw the vault was left open and that they had better count the money. Nobody knows whether anything was taken yet.

What is messed up is that if the firm doesn't have sufficient visibility/logging, they can claim "no evidence" the data was accessed (purely because they had their eyes closed). IMHO that is negligent - but sadly the various laws tend to support it.

The problem here is it may be impossible to know if anything was taken.

If theft occurs at a bank, the money is gone. If I steal information, the information is still there. Knowledge of theft is completely dependent on a logging system capturing the correct information.
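To make the logging point concrete, here is a toy sketch of a record store that writes an audit entry on every read. The class, field names, and requester identifier are all hypothetical; nothing here reflects how Exactis or any real system is built:

```python
from datetime import datetime, timezone


class AuditedStore:
    """Toy record store that logs every read (hypothetical, for illustration)."""

    def __init__(self, records):
        self._records = dict(records)
        # In practice this would be append-only storage kept off the host.
        self.access_log = []

    def get(self, record_id, requester):
        # Log before returning, so even a failed lookup leaves a trace.
        self.access_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "requester": requester,
            "record_id": record_id,
            "found": record_id in self._records,
        })
        return self._records.get(record_id)

    def reads_by(self, requester):
        """Return every logged access made by one requester."""
        return [e for e in self.access_log if e["requester"] == requester]
```

Without something like `access_log`, a copied record looks exactly like an untouched one, which is why "no evidence of access" says little when nothing was being recorded.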

Except what happened here was someone left the vault open and a passerby called the bank and the cops to let them know.

This will continue to happen until the laws change such that holding personal information is a liability, not an asset.

Where is the American version of GDPR when we need it? This is arguably worse than the Equifax one.

edit: nope, this is infinitely worse.

It's coming to California in 2020:


What is the source of this data?

Without more information I can only assume they are scraping public records just like sites like Spokeo etc. Perhaps with some data analysis thrown in.

So I don't see much of a personal concern; especially since their business model appears to be selling this very data!

I think you're a bit confused by what data Spokeo has. Most of it is generated on the fly when you do a query, by scraping other sources.

That's what I mean though.

If this data comes from just scraping other sources that are freely and publicly available and applying some shitty data analysis on it, why should I be particularly concerned? The data itself is already out there for someone to find if they wanted to or even buy it from this company if they are too lazy to scrape themselves.

However, the source of the data wasn't in the article.

I think a lot of these incentives could be resolved by just treating data as a liability.

William Pearson


Will is a highly accomplished IT Executive designing and developing self-service software applications built on BIG Data, running in Cloud Infrastructure in highly secure environments, leveraging analytics and yielding high profits and rapid growth.

He is responsible for technology strategy which includes highly accurate and automated data processing, cloud infrastructure, MS Azure platform-as-a-service, Cloudera / Hadoop Data Management Platform, APIs, Marketing Automation Platform, Analytics, and Digital Marketing.

( http://www.exactis.com/about-us/ )

highly ironic

I wonder what’ll happen if they’ve sucked up a bunch of European data too

Has anyone mapped out all of the data brokers that are active in the US, what information they collect, what their sources are?

I imagine a lot of that info is proprietary, but I'd really like to understand this industry better. It's probably a foolish hope, but I really hope there are a few main choke points that one could opt-out of. If that's not possible, I could always try to inject bad data into the system, if I know what their inputs are.

Yeah. I’m not a god or smth, but I’m ashamed of “experts” that work(ed?) there.

At the very least it’s a pity that even good people make _mistakes_ like that.

Until that day, whatever I build is at least robust and doesn’t require maintenance every 2-3 months. Not even fast, just decent.

In short, while our industry features stuff like that I will be angry, sad, etc. But boy, I will always have money, and I might have a lot of fun, at least for some definition of fun!

Don't you love it when a company you've never heard of turns out to have a ton of personal info on you, and then just gives it away?

Fortunately, at this point, with all the leaked data, it's nothing really new that others don't already have.

Seriously, though, this is just getting out of control. I'm almost to the point of writing my representatives. I don't think the industry can adequately self-police.

[Edit] By "industry" I mean any company who handles my personal data.

"Marketing Firm doxes 340M victims"

This always makes me think that if we had a semantic web that this would be unnecessary. I know, I know, that's unrealistic but hear me out. It means that you wouldn't have to hoard data such as in this case. Yeah I know, terrible idea but I think about this logic.

Why exactly are people not paying more attention to projects like MaidSAFE which are working diligently to solve these problems once and for all?

Why do we assume we have to make an arbitrary choice of landlord to trust just so we can get basic things done on the Internet?

Don't forget that you leaked all that info in the first place.

Do you use an ad blocker, a VPN, private browsing, and the same on your phone too? All the privacy settings in Facebook, avoiding Gmail and Google? No?

At least teach your kids to.

You go through all that trouble, and yet ... you have a Facebook account?

Anyway, the info being leaked here isn't dependent on browsing history. Companies have been gathering these sorts of profiles far longer than most people have been using the Internet. The only way to avoid it is not just to never have Internet access at all, but never have a credit card or a bank account, never own a house or sign a lease, never drive a car, never register to vote, etc. If you do any of those things, even temporarily, you could be leaking information that can't be unleaked.

True, it's impossible to stop banking leaks. I recently found that Visa and MCard sell not only transaction totals, but itemised data, like eggs, 2x milks, chocolate brand. I guess I'll fill my wallet with cash for shopping.

Many people cannot jump off Facebook. I just use Messenger for comms.

And it's no trouble: after setting it all up, in a few months you'll hardly notice it. It will be the new normal.

Forgot to add: use several emails, one for government, one for Facebook. For accounts you will not use, like shop and blog accounts, create a few ProtonMail ones; it will be harder to put them all together by data mining.

If a registration page asks, give fake names, DOBs, and phone numbers. Make it a habit to ask yourself: does this shop really need this data to be real?

And, they made fun of RMS... He was telling you what the future holds. This is just a trailer of what is to come.

I agree, it is really unfortunate that even people within the software development profession take these issues so lightly.

I don't think it's so much that software devs take it "lightly". In my experience as an infosec consultant, the bigger problem is that most software devs are too cocky when it comes to security. Most think that security is just a subdomain of computer science (it is not!), and that because they took a crypto class in college, they are 100% qualified to handle the security themselves. They think they are taking it seriously, but they don't understand that knowing how to write software does not make you an expert in securing software.

Most devs don't seem to acknowledge that good security requires having a separate, dedicated person/team to handle it, just like how you would hire a lawyer rather than having your software devs handle legal issues.

I once posted on HN that every company that deals with sensitive data, big or small, must have a dedicated security person/team. My comment was downvoted/flagged, and I was bombarded with responses like "why would we waste the money on a security person? my dev team already knows to encrypt passwords".

This. I worked with an end-to-end encrypted communications company for 5 years, and learned a vast amount more about crypto, attack vectors, and security holes than I did in the previous decade or two, but I would never claim to be a security or crypto expert, or even competent at it.

In fact, I almost certainly know only a tiny fraction of what the actual experts in that company knew, but a number of people have told me that I know a lot more about it than the average developer.

That scares me, and if people flame someone for recommending that a dedicated security expert be hired by companies that handle sensitive data, I can only conclude it is out of ignorance - of what's out there, and what's possible.

On the other hand, there are economic realities to consider, especially in early-stage, underfunded startups. What do they do about this?

Where can mere mortals get an overview of just what you know? A lay of the land, scope, just to frame up what these problems really look like.

It's hard to even think about these things for those of us working at low levels, firmware, embedded, etc...

Your comment got me to thinking about what I don't know. Which is a whole lot.

I think you're missing the point. It's not about security, it's about data.

Ah, you're right. I skimmed over the top level comment and missed that. My bad.

Information wants to be free. That includes information you don't want to be free.

    Information wants to be free
    Unless it is about me

Where do I sign?

And? ...
