The City of Seattle Accidentally Gave Me 32M Emails for $40 (mchap.io)
516 points by bpchaps 4 months ago | 233 comments

I'm very surprised they gave out this information. I'm not talking about the mistake, I mean the actual request. In the UK I don't think you could even get a production order for this. Like, it's effectively getting Communications Data simultaneously against thousands of people not suspected of any crimes??

Like, do people know that by emailing their local government their email address is now free for scammers to request under FOI? Could I request this data myself, then start emailing them scam emails "I know you contacted us in June, could you call me on 555-1223 etc"

This seems totally against the spirit of FOI

The Washington State Public Records Act, which this request was made under, states its spirit very unambiguously:

  The people of this state do not yield their sovereignty to the agencies that
  serve them. The people, in delegating authority, do not give their public
  servants the right to decide what is good for the people to know and what is
  not good for them to know. The people insist on remaining informed so that
  they may maintain control over the instruments that they have created. This
  chapter shall be liberally construed and its exemptions narrowly construed
  to promote this public policy and to assure that the public interest will be
  fully protected. In the event of conflict between the provisions of this
  chapter and any other act, the provisions of this chapter shall govern.

Beautifully put. This information is _there_ whether we like it or not. I’d rather have as much access to it as a government employee than none at all.

I have never assumed that an email address I gave the government would be protected. I would also not assume that the contents of any email I sent would be in any way protected either. The government is collectively owned. Your police record, where you live, who you're married to, and whether or not you voted last election are publicly available. I would rather all of that be protected in some way, but I think it's common knowledge that a lot about you is made public to anyone who wants to walk down to the courthouse. In fact, if you want to take a trip to Hawaii, you can drop in and see a copy of Obama's famously "missing" birth certificate. I am rather shocked that credit card numbers are being emailed about.

> Your police record, where you live, who you're married to, and whether or not you voted last election are publicly available.

This is highly country specific. For the marriage record, I checked the laws in Germany, and (except for your own records) you have to present a "legal interest", which seems to be stricter than a "legitimate interest" (i.e. probably you need the information to enforce your rights, not just because you want to do genealogy). I'm pretty sure the others would count as particularly sensitive personal data too.

In the UK, as far as I know, of those only marriage records are open. Police records can be obtained by your employer, but even then most minor (sentence less than 4 years) offenses are eventually considered spent and not disclosed to most roles. The electoral roll (addresses) is open by default, but you can opt out (though it can still be used for certain narrow purposes), and as far as I know there is no way to check if someone voted (I've never heard of it happening, and searching doesn't give any information about it).

You can opt out of allowing use of the electoral roll for marketing - you can't opt out of political use.

You can also not be on the roll, but you can't vote in that case.

Maybe you need the GDPR.

When governments in the EU started digitizing their data like 20 years ago, it used to be that lots of personal data would end up published on official websites, either in the form of scanned PDFs that Google would gladly OCR and index, or in directly readable formats. Since then, the EU has cracked down on all of that, and you can no longer search for someone's phone number in order to get their full name, address, date of birth and ID. Even data that used to be available just 10 years ago, now has been removed.

GDPR largely does not regulate what governments do with data.

Except the GP is talking about explicitly public records, not private records that were inadvertently published.

(Ignoring that the GDPR doesn't even apply to governments)

> I have never assumed that an email address I gave the government would be protected. I would also not assume that the contents of any email I sent would be in any way protected either.

I have similar assumptions, but what about the less technically inclined citizens?

Moreover, I wouldn't be surprised if the ploy described by the OP:

> Could I request this data myself, then start emailing them scam emails "I know you contacted us in June, could you call me on 555-1223 etc"

would work on me. Really, it would take me checking DKIM and/or SPF to notice such an email. And from this story it seems likely the city of Seattle doesn't actually implement DKIM or SPF.
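
For anyone wondering what "checking DKIM and/or SPF" looks like in practice: the receiving mail server usually records its verdicts in an `Authentication-Results` header, so a rough check is just reading that header. Below is a minimal sketch using only the Python standard library. The sample message, the `mx.example.net` server name, and the `seattle.example` domain are made up for illustration, and the header is only trustworthy when it was added by your own mail provider, since a sender can forge one.

```python
import re
from email import message_from_string


def auth_results(raw_message: str) -> dict:
    """Collect the spf/dkim/dmarc verdicts recorded in Authentication-Results headers."""
    msg = message_from_string(raw_message)
    verdicts = {}
    # A message may carry several Authentication-Results headers; keep the first verdict per method.
    for header in msg.get_all("Authentication-Results", []):
        for method, result in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header, re.IGNORECASE):
            verdicts.setdefault(method.lower(), result.lower())
    return verdicts


# Hypothetical message: the domain and server names are placeholders, not real ones.
raw = (
    "Authentication-Results: mx.example.net;\n"
    " spf=fail smtp.mailfrom=seattle.example;\n"
    " dkim=none\n"
    "From: clerk@seattle.example\n"
    "Subject: About your June request\n"
    "\n"
    "Please call 555-1223.\n"
)

print(auth_results(raw))
```

A `fail` or `none` here is a hint, not proof of a scam, and a `pass` only says the mail came from the domain it claims, not that the domain is the one you think it is.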

DKIM and/or SPF can be a liability if you want to accept everyone's email. Only use if filtering is more important.

> I have never assumed that an email address I gave the government would be protected. I would also not assume that the contents of any email I sent would be in any way protected either.

That's because you're not corresponding with case officers and police officers.

Note that human services & the police dept. did not respond. Likely because they are exempt from FOIA requests.

Similar story.

I worked at a polling company out of college owned by a Standford professor. My first task: After a poll is finished online, match that with voter records (using emails and addresses).

My first question was: "Well, that is a cool idea, but there is no way the government would release a huge database of every California voter and their party affiliation. Let alone that the users entering online poll information would extend that database to include their actual vote. There is no way this is possible.... right?"

Standford professor's response: "Do you want it in CSV?"

Voter registration is considered public information in many states. Some states even provide the entire database on their website to download. However, voter registration does not include who a person voted for in an election. You are free to augment the database with your own data, of course.
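
To make "augment the database with your own data" concrete, here is a minimal sketch of the kind of join described above, matching poll responses to a voter file by email address. The column names (`name`, `email`, `party`, `answer`) and the sample rows are hypothetical; real voter files differ by state, and matching on street addresses would need far more normalization than this.

```python
import csv
import io


def normalize_email(address: str) -> str:
    """Lowercase and trim an address so trivially different spellings still match."""
    return address.strip().lower()


def augment(voter_rows, poll_rows):
    """Return voter rows that have a matching poll response, with the answer attached."""
    poll_by_email = {normalize_email(r["email"]): r for r in poll_rows}
    merged = []
    for voter in voter_rows:
        poll = poll_by_email.get(normalize_email(voter["email"]))
        if poll is not None:
            merged.append({**voter, "poll_answer": poll["answer"]})
    return merged


# Made-up sample data standing in for a state voter file and an online poll export.
voter_csv = "name,email,party\nA. Voter,A.Voter@example.com,DEM\nB. Voter,b.voter@example.com,REP\n"
poll_csv = "email,answer\na.voter@example.com ,yes\n"

voters = list(csv.DictReader(io.StringIO(voter_csv)))
polls = list(csv.DictReader(io.StringIO(poll_csv)))
print(augment(voters, polls))
```

The point is mostly how little work it takes once both sides share an identifier.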

I don't even understand why party affiliation is tracked by the state. What's that good for other than entrenching the two party state? Parties should have their own member lists.

There are states with ‘closed primaries’, where certain elections are only open to registered party voters - ie a Democrat would not be allowed to vote in the Republican primary and vice-versa.

This, and ‘because that’s how we’ve always done it’ are probably the main reasons party affiliation is part of voter registration.

If a party wants to hold a closed primary, can't they do that themselves without help from the government? Why would the government be involved in a party-internal election?

It always interests me how much of the US electoral system is just obviously completely broken from the perspective of outsiders, and it seems strange that people within the US see procedures like this and view them as normal and legitimate.

I'm curious what you see as "obviously completely broken" about the current primary process. Previously, party candidates were chosen via convention, which effectively left the selection process to party elites.

Deleted my previous comment as it didn't directly address your question. I don't know all the history behind it, but in terms of "can", I'm assuming that no, parties were unable to effectively hold a closed primary in a way that was well-run and accessible. And a poorly run primary vote defeats the purpose of having any primary in the first place. Using the state infrastructure and schedule makes the voting process much easier for the average voter, without being a burden to the state political parties.

As to "why" the government should feel obligated to subsidize the process -- because the government has the ostensible goal of facilitating fair and proper elections, and presumably the primary process -- which is not Constitutionally-enshrined -- is a net benefit to the general election, at least compared to selection-by-party-convention. In the future, political parties may decide that it's better to have open primaries, but that's orthogonal to the government providing the voting infrastructure and logistics.

Mostly for primary eligibility. It also allows for things like ensuring that poll watchers etc are available in an equitable way.

Why is the state getting involved in primaries? That should be the parties' business. I think there should be no public record of party affiliation.

Primaries aren't enshrined in the Constitution, but they became state business in the 1970s because there was a desire to let the average voter have more say in the selection process, which had previously been done via party conventions.


I would prefer going back to party conventions instead of the endless campaigning :-)

Either way it boils down to $ spent to “win” the nomination. Whether it’s spent on advertising or back room wheeling n’ dealing it’s just money spent.

My (very uninformed) guess is that this is cultural. When democracy spread, Europe already had a good system of tracking people. The church was keeping records of every family for hundreds of years already. In contrast, the USA is a country of immigrants where new people without a history came in all the time. Tracking who votes where could not rely on an established system.

Voter registration is fine but why track party affiliation?

>However, voter registration does not include who a person voted for in an election.

Many include

1) Last time you voted


2) what your party affliction is

I'm personally afflicted by both parties.

Yes, though neither of those include who you actually voted for.

Reflect on the fact that 'didn't vote recently' is being used as we speak to suppress voting in the upcoming election. For non-US readers, every state has a Secretary of State who, rather than doing any kind of foreign affairs work like the federal office of that name, primarily oversees paperwork and particularly elections. These Secretaries of State are elected offices and highly politicized, since officeholders can heavily impact the conduct of elections. In Georgia, for example, one of the candidates for Governor is currently Secretary of State, and has put the eligibility of tens of thousands of voters into question in a way that just happens to massively impact likely voters for his opponent.

I agree, and in that context, having name, party affiliation, and last_voted_at be public record would be the only way for an independent organization to gauge the impact per party of the disenfranchisement.


Great name for a scam for-profit, online institution, don't you think?

Yes, you are as far as I can see correct. The request should have been rejected as overbroad and against data privacy laws (in so far as they exist), or the purpose of the request could have been verified and then they might have seen whether or not there was another way to let the requester do their work without giving them the data they requested (see another comment of mine for one suggestion).

That's not how FOIA works. It's a good thing too. Government employees almost always fight FOIA requests. There aren't many subjective tools (e.g. overbroad) and you're certainly not required to say why you're making the request.

Data privacy laws in the US are unfortunately minimal. The bigger problem comes from imbalance -- if the government and corporations have lists of names, people need them too in order to be able to work together and organize.

If you don't think this information should go out in FOIA requests, the tool to accomplish that is data destruction. Government could wipe old emails once no longer relevant.

Email is really difficult because retention law and regulation is based on a topic.

To meet federal requirements, information about procuring equipment with certain grants must be maintained for 10 years. Caseworker notes for a minor who is a ward of the state may be required to be kept for 20 years after the 18th birthday.

If a record is deemed in scope and topical, an employee could be committing a crime by deleting that email. As a result, the easy answer is retain.

Commercial organizations at least are known to implement maximum allowed retention strategies, such as having their staff not keep archived email beyond three months, presumably so it doesn't embarrassingly show up when it's legally unfavorable. Not quite the same, but along the same lines.

No, the Washington State law does not allow for agencies to reject requests on the basis of being overly broad or against data privacy laws. There are specific exemptions (e.g. library records), and for records that may contain personal info (like someone emailing the mayor and including their own credit card number), it is up to the agency to redact such info. However, the agency can charge the requester for that work.

Moreover, the requester is not required to give a reason for the request.



Ok, so in that case redaction would have been the way to go here. But the request as it is actually harms the privacy of large numbers of individuals which is not what the FOI laws are supposed to be used for.

Also, of course Seattle could reject the request, they could simply say: "Without an explicit court order to release this information we will not do so", and that would be that. It would then be upon the petitioner to ask the courts to force the release of the information requested, if the petitioner felt his rights had been violated. In the present situation the city is opening itself up to liability because of the privacy of all the people they have exposed (and more so because of the mistake). FOI does not mean 'every piece of data the government has should be released to the requester', the goal is increased transparency of government, not privacy violation of citizens using the FOI requests as an end-run around any kind of privacy law.

There is a tension between those two and typically the legal branch will determine where exactly the line is, when in doubt: go to court.

But Seattle cannot summarily reject the request -- they have to follow the law, and the law does not require FOI requesters to get an explicit court order, e.g. a subpoena, for this information or for any other valid request. I mean, yes, the city of Seattle could try to reject the request, and the requester could sue and win in court after the judge finds that the city acted illegally. But that's like saying Seattle police "can" just arbitrarily arrest and imprison people, and fight the subsequent lawsuits.

Because the FOI law exists, the city does not open itself to liability in releasing records, except when it accidentally releases records that are mandated to be private, which I'm not even sure is the situation here.

Increased transparency is almost always a tradeoff with privacy. I don't disagree with you that the law may be abused for commercial or malicious intent, but it is up to the legislature to propose a bill that curbs FOI. Until then, the government cannot just deny valid requests because they don't approve of the requester or the requester's purported motives.

> which I'm not even sure is the situation here

That's the key bit right there. So, if you are not sure - and they are also not sure - then they could ask for a ruling before releasing. Err on the side of caution is good practice when it comes to releasing data.

I just looked at the dataset and it is full of information that I would normally consider to be private. Which private citizens contact which government officials, and when, is in principle not something that should be disclosed to all callers in a format of their choosing.

What's to stop you from asking for stuff that infringes other people's privacy? I'm all for a more open government but 'anything goes' FOI requests are only a little bit less dangerous than non-transparency.

There is some middle ground to be found here.

Sorry, what I'm not sure about is whether an agency is liable if it releases exempt information. Exemptions allow an agency to deny a request, but the agency still has discretion whether or not to follow the exemption.

> So, if you are not sure - and they are also not sure - then they could ask for a ruling before releasing. Err on the side of caution is good practice when it comes to releasing data.

Again, that is simply not how the law works. Some years ago, elected Washington state legislators and the governor decided the law should make these tradeoffs between transparency and privacy. And until subsequent legislators get together and decide otherwise, that is the law of the land. Washington government agencies do not have discretion to reject requests based on requester identity or motivation, period, nor can they make up their own reasons for exemptions.

The "middle ground" has already been decided -- that's ostensibly how the law got written and signed in the first place. Your line of argument would allow literally any government employee to make arbitrary rejections -- the law was codified to prevent exactly that situation.

Your concerns are no different than concerns raised about freedom of speech and the press (and of course, the right to bear arms, but let's not follow that tangent for now) -- e.g. "I'm all for people being able to express themselves, but what if those people say incredibly hurtful and damaging things?". The legislature can pass laws that limit those rights (e.g. defamation laws), and courts interpret whether those laws follow the Constitution, but it is not up to the executive branch (i.e. government agencies) to ignore the law because they disagree with it.

> based on requester identity or motivation

No, but they should decide based on the data requested. And in this case the data requested is none of the requesters business since it involves the privacy of other citizens.

Which definitely could be in contravention of other laws and in cases like that judges usually get to decide which weighs heavier. If I were a civil servant faced with a request that releases information that I felt would infringe on some other law I would definitely not decide to be the one to make the call and release it without a sign-off.

There isn't just one law at work here.

> they should decide based on the data requested

I agree with this -- of course a request can be rejected if it requests something that is explicitly exempted in the law. The metadata of emails to public agencies is currently not exempt from Washington state law.

> And in this case the data requested is none of the requesters business since it involves the privacy of other citizens.

OK, but that is not your or the state government's decision to make. The law does not allow for the government to make a unilateral judgment on whether something is "none of the requesters business" -- isn't it patently obvious how this could be abused?

> If I were a civil servant faced with a request that releases information that I felt would infringe on some other law I would definitely not decide to be the one to make the call and release it without a sign-off.

Sure, if you don't know the law exactly (most employees don't), then you consult your agency's FOI officer, who would then tell you whether the request is valid. If it is valid, and you decide to reject it anyway, you'd probably be fired (I don't think most state FOI laws provide criminal penalties for violating FOI).

I'm not a Washington historian, but I'm assuming the FOI law was passed because legislators had actual scenarios and use-cases in mind. For example, being able to request the emails sent and received by a government employee is useful if you want to know who contacted that employee about an issue, such as a regulatory enforcement action. Maybe there are clear-cut cases where a received email is obviously not work or issue-related, such as emails from that employee's mom. But what if the mom is herself a lobbyist or other influential official? Or how about an email from a guy talking about going golfing and getting a few beers? Is that just personal? What if the guy emails every week about going golfing on his dime, and the guy happens to be a businessperson waiting for regulatory approval on some project?

Apparently, the ways for unwanted behavior to be expressed via email are so plentiful that legislators decided to err on the side of transparency, because it would be too easy for officials to shut down requests and deny transparency to all but those with the means to sue (usually, corporations and journalists). It is not up to a civil servant to decide otherwise; likewise, the law protects the civil servant from liability for following a lawful request.

Say an ex finds the new address of a former partner then goes around and shoots them dead.

BTW I am not being hyperbolic here there was a case where this happened when I worked for BT - someone as a favor looked someone's new address up for a friend which resulted in a murder.

The scenario you propose is possible through using Google, and/or many other services that collate public records of people's names and addresses. Try Googling your name and one of your cities of residence some time.

> someone as a favor looked someone's new address up for a friend which resulted in a murder.

A government employee who looks up someone's address for a friend is not covered by FOI laws. Just as FOI doesn't protect cops who use the DMV database to look up other cops they like/hate:

- https://www.wired.com/2012/11/payout-for-cop-database-abuse/

- https://www.sun-sentinel.com/local/broward/fl-pines-police-o...

And if the state government employee freely hands over massive amounts of personally identifiable information due to an FOI request, this is somehow better?

FOI is a law mandating these public records be provided on request -- with various exemptions and allowances for redacting information that could be reasonably seen as a violation of privacy. It's not about being "somehow better", because legislators have deemed that bureaucrats cannot be trusted to decide whether transparency is a good thing.

Consider the example you brought up -- it is against the law for a state employee to send a friend that kind of information, and I imagine that that law exists because politicians feared that kind of murderer scenario. How exactly does that murderer use a cache of email metadata and redacted messages to go after his victim?

I mean, real estate records are public information and real estate transactions are even republished in the newspaper on a weekly basis...

I have filed both US and UK FOI requests. (But I am not a lawyer.)

I think you are right in the UK. The US law is different and seems to allow this sort of broad request. I have been told before when filing in the US that others may request my contact information, and I have seen lists of FOIA requests received via the FOIA including contact information for requesters.

In the UK, requests need to be fairly narrow as I recall. And the time frames in which the request will be processed also are narrow: 20 business days as I recall. If it would take longer than that you probably would be asked to narrow the request. This is good for me as I typically request individual documents, not huge swaths of data. I requested a classified UK technical report and received a redacted copy within a month as I recall. Much faster than in the US.

> do people know that by emailing their local government their email address is now free for scammers to request under FOI?

Florida, which has fairly broad open records laws, at least makes this extremely clear:


> Any agency, as defined in s. 119.011, or legislative entity that operates a website and uses electronic mail shall post the following statement in a conspicuous location on its website:

> Under Florida law, e-mail addresses are public records. If you do not want your e-mail address released in response to a public records request, do not send electronic mail to this entity. Instead, contact this office by phone or in writing.

Why is it like that? Do they give out your phone number or postal address if you contact them via those channels?

I suppose they treat e-mail addresses as street addresses. Public and not tied to anything else by default (you know it exists but not who is assigned to)

The answer is that no, that request should not be filled, and certainly not in that way.

From a FOI POV, email is tough because it straddles a line between “record”, deliberative material and conversation. Everyone hates sharing email because it is always trouble.

In this case, they didn’t have a good process in place and nobody did a privilege or other review. The hint there is the police material; most police records are trivially made exempt from FOI in most places.

But to your point, there are many categories of communication with government where there is literally no expectation of privacy. If you email the zoning board something, you should fully expect to see the entire email in a public record somewhere.

The part I found even more strange is that people are sending their credit card numbers and other personal information through e-mail...

In practice, the bad guys are not getting credit card numbers by monitoring unencrypted plain text email connections. They're doing it by getting the information in bulk straight out of databases at the destination (either by hacking or by getting unscrupulous employees to sell them data dumps). My opinion is that the risk factor in sending a CC number over email is much smaller than the risk of giving it to the destination organisation in the first place. And even if you do get unlucky, as a consumer you don't usually end up out of pocket for fraudulent transactions on your card.

I think it's very hard to anticipate how every given person in a large population uses communication tools. When former Florida governor Jeb Bush began his campaign for presidency, he released all the emails he sent/received as governor. This is something mandated by Florida's very open public records laws:


As it turned out, the cache of thousands of emails contained things that are hard to anticipate. Including people emailing their SSN, or talking about their employment/medical issues; the former is possible to filter out computationally, the latter would require manual review and judgment. Bush was criticized, and in response, he took down the emails temporarily until employees could clean them up. But AFAIK, he didn't do anything illegal, because he mirrored exactly what was available from the state official archive which, again AFAIK, did not alter its copy.
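
As a rough illustration of the "filter out computationally" part: SSNs written in the usual `AAA-GG-SSSS` form can be caught with a regular expression. This is a minimal sketch, not what any archive actually runs. The pattern skips combinations that are never issued (area 000, 666, or 900-999; group 00; serial 0000), and it will miss SSNs written without dashes or spelled out in words.

```python
import re

# Dashed SSN format, excluding area/group/serial values that are never issued.
SSN_PATTERN = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")


def redact_ssns(text: str) -> str:
    """Replace anything that looks like a dashed SSN with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED SSN]", text)


print(redact_ssns("My SSN is 123-45-6789, call 555-1223."))
```

Nothing comparable exists for someone describing a medical or employment problem in free text, which is why that part needs a human reviewer.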

This is similar to AOL's release of its search logs. It thought anonymizing user identities would provide anonymity, but they did not realize that some users write very personal things into the search box: https://en.wikipedia.org/wiki/AOL_search_data_leak

Are you really shocked by this? I guess you have never worked on a corporate email system! People do this all of the time.

1) They don't realise email is not secure

2) When you explain point 1, all of the other solutions seem like too much hassle so they email anyway.

3) You can tell your customers not to email you CC numbers, you can even refuse them, but they will keep sending them

...and if you provide online chat with customer support or sales, people will send credit cards in that too. If you have customer accessible support ticket submission system, that will end up with credit cards too.

The Payment Card Industry Data Security Standard (PCI DSS) that you have to agree to follow in order to be allowed to process credit cards requires secure storage of all the cards you store--not just the cards that you intended to store or just cards that come in through channels you intended for receiving cards.

This is a serious issue to take into account when choosing your chat system, ticketing system, and email system because you can't just ignore those wayward credit cards. If you choose systems whose developers did not consider this and failed to provide good tools for finding and redacting unrequested, unwanted sensitive information you will end up having to hack such tools into the system yourself.
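
If you do end up hacking such a tool in yourself, the usual starting point is a digit-run pattern plus the Luhn checksum that card numbers satisfy, which weeds out most order numbers and phone numbers. The sketch below is an assumption about how one might do it, not a PCI-approved scanner: the 13 to 19 digit range and the `[REDACTED CARD]` placeholder are illustrative choices, and `4111 1111 1111 1111` is a widely used test number, not a real card.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace digit runs that pass the Luhn check with a placeholder."""
    def replace(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED CARD]" if luhn_ok(digits) else match.group()
    return CANDIDATE.sub(replace, text)


print(redact_cards("Please charge 4111 1111 1111 1111 for order 1234567890123."))
```

Roughly one in ten random digit runs passes Luhn by chance, so expect some false positives in ticket or chat logs that are full of long reference numbers.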

That is a very good point

Or they think it's an acceptable risk... $50 liability limit.

That's the thing with credit cards. The merchant always ends up paying.

People laugh at most DLP solutions. I know I used to. But then, I used to think the target was smart people trying to exfiltrate data.

But then I implemented one on a busy mail system, and I started seeing credit card numbers very regularly. I mean, consistently. Even after people have been told why their emails are being bounced and that no, I can't give an "ETA on when it will be fixed".

It's not strange. I too send my cc number and personal information through email because I consider the risk sufficiently small. Likewise, I tell people who want to send me this information to just use email.

Imagine you are a secretary working at a company where HR is physically located across the country. HR needs you to fill out "Government Form 27" which requires personal information.

You fill out the form, how do you transmit it back to HR? You're probably going to email it. Very few companies have a "secure document transfer system" and very few people understand the risks of emailing personal information.

> Could I request this data myself, then start emailing them scam emails

Yes, bad actors can find malicious uses for a dataset. Not sure what that has to do with FOI. Are you suggesting that people who email the government expect their email to remain private, even as that email may be forwarded to a number of agencies and employees?

Using FOIA data for marketing is not legal in USA.

It is done but not legal and hard to prove.

Some state laws do not restrict commercial usage [0] and AFAIK, there is nothing in the federal FOIA law that prevents using information for marketing or other commercial purposes. That wasn't my point, though. The parent commenter talks about the potential danger of phishers using public data as being "against the spirit of FOI". Yes, I agree that using FOI to find info that facilitates a crime is not FOI's intended use case. But the potential drawback has not been determined by legislators to outweigh all the potential benefits of transparency.

[0] https://www.mcall.com/news/watchdog/mc-nws-open-records-busi...

A few years ago I found a random SSD on the ground while on a walk with my son. The drive contained unencrypted records which squarely fall under HIPPA. I also did the right thing and returned it to the proper owner and told them about how their mdb files were readable by anyone.

The same exact thing happened. They thanked me and then their lawyers nicely asked me to clone my hard drive and sign a bunch of shit.

It was not fun at all. A lot of them thought that I hacked something.

The type of organization that would store HIPPA encumbered data unencrypted, which based on my brief reading is not legal anymore, is not one that would operate in a reasonable (or legal) manner. Sadly, that seems to be most organizations that fall under HIPPA, compliance is a box to be checked while expending as little resources and effort as possible.

How they reacted to your kind action is sad, and depressingly common. I hope you told them to pound sand, and contacted whoever the data protection authorities were in your state. There needs to be much more aggressive enforcement of HIPPA and similar data protection laws, CYA bull like you encountered should not be happening.

Article I ran across: https://info.townsendsecurity.com/bid/74330/Does-HIPAA-Requi...

Guys, it's HIPAA not HIPPA.

This is an interesting case. I always pronounced HIPAA as "hee-pah". That has the advantage of approximating the spelling, but the disadvantage that it's not really a natural way for an English word to be pronounced.

People in the medical field, who deal with HIPAA all the time, pronounce it as if it was spelled HIPPA. It's a short step from there to actually spelling it HIPPA.

Did you actually give them that clone and sign the documents? Or did you push back like in the article?

It feels to me like they shouldn't have much of a leg to stand on.

I did consider fighting them for about 7 minutes. Then I remembered that they had a legal team paid for by taxpayers, while I had a toddler and a pretty decent, mostly stress free life.

The laptop that I connected their SSD to was my coding box, so I deleted all the code and secure erased the free space before they cloned it. They gave me shit about it too, because their forensic people saw days without any file activity. When I told them that it was because I removed my IP they responded with "why do you think you need to do that? We can keep your data from falling into the wrong hands".

Maybe if I was some kind of an activist I would have tried to fight it.

“We can keep your data from falling into the wrong hands...” just like they did for that drive you found.

I don’t understand...

You found a drive, tried to return it and then were subject to an illegal search? Or you consented to a search that would have otherwise been illegal?

I consented to a search

I personally don’t consent to any searches. But I’m sorry you tried to help out and then were coerced into having your liberty taken.

Yeah it sucked. Lesson learned :P

Also, googled you. Here is a quote from your website: "I'm an american entrepreneur, inventor and __activist__" :)

Handling these kinds of things anonymously is the only reasonable way to protect yourself.

Is there a service that supports this? Like, get it to a security researcher, wipe it from your drive, and have the security researcher handle returning the data and providing a basic education (or reporting the data loss to the appropriate authorities.)

Just turn it in to the police anonymously, let them look at the drive.

Why would you ever reach out to the owner? Is patting yourself on the back really worth the risk?

To tell them about their huge security hole.

It will fall on deaf ears.

Interesting dataset. Data like this can be used to identify strong links between contractors and government officials.

One problem is that the metadata should have only contained anonymized entries for the email addresses of the counterparties of the Seattle.gov addresses; the article leaves this unclear.

Another potential problem: if a case of corruption or nepotism is identified that has not been passed to the authorities for review, the author suddenly finds himself in possession of data that can be used to blackmail some fairly powerful people. In fact, there might be fish at a higher than city level of government in the trawl, because there have to be links between Seattle officials and state officials.

Yet another problem is that the addresses most likely contain the names of private individuals (including employees) as well, and I am not quite sure what to think of that but feel that the city has no business releasing that in cleartext.

A better way for amateur sleuths and the city government to work together to battle corruption would be to release only anonymized data to protect the identities of the people working for the city, for instance by releasing only hashes of the email addresses in a hash@hash format, where the hash for all Seattle domains is released to the requester. All the relevant analysis could still be done, and if something interesting was found it could be passed to law enforcement, who in turn should then ask a judge to order de-anonymization of the entries they are interested in.

I'm not sure I could disagree with you more. At least in the US there is a very strong expectation that communications between governmental employees are non-private except in very special circumstances. You'll note Matt says that the Police and Human Services departments have not responded. I'd guess that's not an accident, because police records and personnel/medical records are largely exempt from FOIA requests.

Further, the idea that sleuths (amateur seems pejorative in this case) are working with city governments to battle corruption does not hold up to Matt's (or other journalists') experience. By and large the governments only provide the data because they are required to, and we have made sure they are required to by representative legislative action.

Had the IT dept. in Seattle not made an obvious mistake this would likely have not been a story at all and the data would have been an interesting data set for informed democratic functions.

> Further, the idea that sleuths (amateur seems pejorative in this case) are working with city governments to battle corruption does not hold up to Matt's (or other journalists) experience.

Yes, exactly this. History has shown that government employees and officials, being human, are reluctant to dig into, never mind snitch about, matters that may impugn their own colleagues. That a random citizen, or even a journalist, could convince law enforcement to go to a judge based on trends found via analysis of anonymized hashes is unrealistic. Never mind that it effectively denies the benefits of transparency to anyone who isn't trained in data science.

> between governmental employees

The request contains the names of private individuals through BCC and CC headers requested and exactly when they communicated with which government officials.

Yes. When I say “between governmental employees” I mean between them & anyone.

This is a normal expectation to the point where there are lots of rules about what public servants can even use private email addresses for. This was a big deal of course in the 2016 campaign.

This is likely a cultural difference, I view the data that Matt requested as mine because that governmental agent is acting in my name.

Fine by me.

> release only anonymized data to protect the identities of the people working for the city

No thanks, that's an invitation to further corruption. Let me put that in context; recently a group of people were pushing a local authority to adjust its policy on the public release of arrestees' photos (aka mugshots). The group felt it was being selectively used for political purposes, which the police and city council denied.

FOIA requests revealed that the authorities were unhappy with their public image as it related to law enforcement and decided to use mugshots as part of a social media campaign to change public perception. Can't have a social media campaign without any content, so people were arrested on nonsense charges that were subsequently dropped, so as to generate mugshots that could be publicized.

Thanks to the FOIA requests members of the public were able to confront the city council with specific names and dates of the people who drafted, approved, and implemented this policy, which information will also be central to future litigation on the topic.

The linked Kaggle dataset https://www.kaggle.com/foiachap/seattle-email-metadata/ shows that the final returned data are Excel tables with content which looks like this:

  Sender or Created by: "Herring, Kaya" <Kaya.Herring@seattle.gov>
  Recipients in To line: Ortiz, Piper; Jones, Raphael
  Recipients in Cc line: Valdez, Khloe
  Recipients in Bcc line: 
  Sent: 3/23/17 18:08
(I changed the names to random ones)
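
For anyone wanting to look for the "strong links" mentioned upthread, here is a minimal sketch of how records in this shape could be tallied into sender/recipient pairs. It assumes the field labels shown in the sample above and that recipients are separated by semicolons; the function names are made up for illustration, and the real Excel export would first need to be converted to text in this layout.

```python
from collections import Counter

# Field labels as they appear in the sample record quoted above.
RECIPIENT_FIELDS = (
    "Recipients in To line",
    "Recipients in Cc line",
    "Recipients in Bcc line",
)

def parse_record(text):
    """Parse one block of 'Field: value' lines into a dict."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so timestamps like 18:08 survive.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    return record

def split_names(value):
    """Split a semicolon-separated recipient list, dropping empty entries."""
    return [name.strip() for name in value.split(";") if name.strip()]

def count_links(records):
    """Count how often each (sender, recipient) pair occurs."""
    links = Counter()
    for record in records:
        sender = record.get("Sender or Created by", "")
        for field in RECIPIENT_FIELDS:
            for recipient in split_names(record.get(field, "")):
                links[(sender, recipient)] += 1
    return links

# The (already randomized) sample from the comment above.
SAMPLE = """
Sender or Created by: "Herring, Kaya" <Kaya.Herring@seattle.gov>
Recipients in To line: Ortiz, Piper; Jones, Raphael
Recipients in Cc line: Valdez, Khloe
Recipients in Bcc line:
Sent: 3/23/17 18:08
"""

record = parse_record(SAMPLE)
links = count_links([record, record])  # pretend the same email was sent twice
for pair, count in links.most_common():
    print(count, pair)
```

Pairs with unusually high counts would be the obvious place to start looking, which is also why the names being in cleartext matters.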

I suspected as much. So the names are out there. Pretty sloppy.

It's not sloppy, it's what the law allows for. You seem to have a misunderstanding in stating "the metadata should have only contained anonymized entries for the email addresses of the counterparties." That's simply incorrect in terms of what is allowed under law and for the request, though there is some variation state-to-state as to their FOIA.

But why would we need to protect identities of public servants? I disagree with the entire presumption that the information requested is private. It falls squarely within the principles of the laws on the books and is consistent with the ideals of open government that led to those laws being instituted in the first place.

More importantly, information that can be used to blackmail a public servant is IMO information that should be kept public. Blackmail is only useful if that information is kept private between a blackmailer and victim. Put it all on WikiLeaks and suddenly blackmail holds no weight because the blackmailer lost his leverage. If the information is of public interest and is severe enough that it is blackmail worthy, the entire public deserves to know it.

hash@hash alone isn't enough. Keyed hashes with a secret key might work.

The issue with hash@hash is that it is still possible to see whether a given person sent an email. Moreover, there are probably similar issues as with hashed_known_hosts as described in [1]. In short, the space of possible emails might be small enough to just brute-force search for all e-mails.

[1] https://news.ycombinator.com/item?id=18082033
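
To make the brute-force concern concrete, here is a rough sketch, assuming the city published plain SHA-256 digests in a hash@hash layout (the truncation to 16 hex characters, the function names, and the example.gov addresses are all hypothetical). Anyone with a list of candidate addresses can hash them the same way and match them against the release.

```python
import hashlib

def plain_hash(text):
    """Unkeyed hash, truncated for readability (an assumption, not a known format)."""
    return hashlib.sha256(text.lower().encode("utf-8")).hexdigest()[:16]

def pseudonymize(address):
    """Turn local@domain into hash(local)@hash(domain)."""
    local, _, domain = address.partition("@")
    return f"{plain_hash(local)}@{plain_hash(domain)}"

def reverse_lookup(published, candidates):
    """Recover published pseudonyms by hashing guessed addresses."""
    table = {pseudonymize(candidate): candidate for candidate in candidates}
    return {p: table[p] for p in published if p in table}

# What the city might release (hypothetical addresses).
published = [pseudonymize("jane.doe@example.gov"), pseudonymize("sam.roe@example.gov")]

# What an attacker might guess from a staff directory or a name list.
guesses = ["jane.doe@example.gov", "john.smith@example.gov"]

recovered = reverse_lookup(published, guesses)
print(recovered)
```

No key has to be recovered for this to work; the attacker only needs the hash function and a plausible list of names.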

> The issue with hash@hash is that it is still possible to see whether a given person sent an email.

In that case you already have their email, and you know what hash and salt were used. It's game over at that point afaic, nothing will stop you from reversing all of the email addresses.

Even just seeing the graph laid out would allow you to infer who some of the players are. In general, to release such information on the assumption that it will be impossible to reverse it is irresponsible, and I would have loved for the city to recognize this and to get a judge to sign off on the release.

> In that case you already have their email, and you know what hash and salt were used. It's game over at that point afaic, nothing will stop you from reversing all of the email addresses.

Agreed, hence the need for more than a plain hash. Note that technically, a 'salt' is unique per user and generally doesn't need to be kept secret. It really only applies to storing passwords.

What I suggested is more like a pepper [1], but in this use-case, you could use the same pepper for every address. Alternatively, you could just generate UUIDs for each address and publish those, but that requires a lookup in the UUID table for every e-mail. (Just like salted hashes would require a lookup to the salt for every e-mail).

[1] https://en.wikipedia.org/wiki/Pepper_(cryptography)

I don't think you could use the same 'pepper' for every address. After all, if you know at least one address in the database (for instance, your own) and what time you sent the email (which you do) then you could use that to recover the pepper that was used for the hashing. So I really do believe the salt should be unique per address used.

As per the wikipedia article:

" Where the salt only has to be long enough to be unique, a pepper has to be secure to remain secret (at least 112 bits is recommended by NIST), otherwise an attacker only needs one known entry to crack the pepper. "

If you use e.g. a 128 bit pepper, anyone trying to brute-force that based on a known email-hash combination would need to brute force 128 bits.
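
A minimal sketch of the keyed variant, assuming HMAC-SHA256 with a single 128-bit secret key that the city keeps to itself (the function names and the example.gov addresses are hypothetical). The same address always maps to the same pseudonym, so the communication graph can still be analyzed, but without the key a known address cannot simply be hashed and compared.

```python
import hashlib
import hmac
import secrets

def make_key():
    """Generate a random 128-bit secret key (the 'pepper')."""
    return secrets.token_bytes(16)

def keyed_pseudonym(address, key):
    """Turn local@domain into HMAC(key, local)@HMAC(key, domain)."""
    local, _, domain = address.lower().partition("@")

    def tag(part):
        digest = hmac.new(key, part.encode("utf-8"), hashlib.sha256).hexdigest()
        return digest[:16]  # truncated for readability only

    return f"{tag(local)}@{tag(domain)}"

key = make_key()
a = keyed_pseudonym("jane.doe@example.gov", key)
b = keyed_pseudonym("sam.roe@example.gov", key)
print(a)
print(b)
```

Both pseudonyms share the same domain part, which is what lets an analyst tell city addresses from outside ones without learning either name. As noted upthread, the graph structure itself can still give people away, so this only addresses the brute-force problem.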

> After that call, I asked my lawyer to reach out to their lawyer and was pretty much told that Seattle was approaching the problem as if they were pursuing Computer Fraud And Abuse (CFAA) charges. For information that they sent. Jiminey Cricket..

Somewhat related, I'm constantly shocked (maybe I shouldn't be anymore) at the tech ineptitude of cities that are supposed to be big tech hubs. I live in Seattle, and my regular tech complaint is we can't get the buses connected to an app that is accurate within +-10 minutes. I know it doesn't sound like much, but how much tech brainpower is here, and why isn't that tech shining more clearly?

If you are an IT professional in a tech-hub, do you want to work for a big IT company or for the municipality?

In a similar vein, do you think the municipality wages are competitive with those from the tech companies?

I could see how being a tech-hub would function to draw a lot of talent out of a municipality.

Because the problem isn't tech, but business models. No amount of tech brainpower can help when bus operators think of their schedules as data to be sold instead of given away for free.

At least in Puget Sound, every transit operator (except for Community Transit) makes their real-time bus arrival data publicly available for free via Sound Transit's Open Transit Data service, OneBusAway: https://www.soundtransit.org/Open-Transit-Data

(OneBusAway is the name of the app that shows information to end users and the name of the API service that developers can query.)

I know that this service is widely available and free because I've had an API key for years and have (accidentally) pummeled the service with requests and have heard nary a peep from Sound Transit asking me for money.

Good for you and your city. My hometown apparently made an exclusive deal with a startup, which I hear is why we still can't see bus schedules on Google Maps.

Because private tech companies fight like mad to avoid paying taxes to public city operations? Citizens too, probably. Nobody likes taxes.

And the best/brightest tech workers gravitate to the higher private salaries.

It’s not a technical problem - those are easier.

I live in DC, land of the professional Fed. At the absolute highest level and after adjusting for location, the most a DC Fed could earn is $164,200. No surprise that anyone with serious technical talent--and by extension, market value--doesn't want such a job.

True, but consider some folks are content with a 38 hour week, may have automated large chunks of their job, and find the demands of working for a big public sector organization far less than that of a similar role in a private sector tech company.

This is something I argue with myself a lot about. I work as a software engineer for a large non-tech company. They pay well, but I could easily go to a tech company and make 20% more. Except...well, in 18 months I've never had to work more than a 40 hour week. I can totally see the appeal of a government job.

If you can manage that, sure, sounds pretty good. But I doubt that's the case most of the time. None of the Feds I've had the chance to talk to sound like they're slacking off.

Have you used the OneBusAway app? It's accurate for the buses I take. Is it inaccurate for your routes?

OBA, Transit, Apple Maps, and King County Trip Planner all use the same back end data sources (though Transit adds users who are volunteering their location with the Go feature in its app) so they should all be similar to each other.

Ever since the major overhaul of data in mid-2015 that tried to fix the "ghost bus" problem (a bus would show as being on the wrong trip pair) and the "never-ending terminal" problem (arrival times for stops very near a route's terminal would show as many minutes delayed or early until the bus actually started moving), I've been quite pleased with the quality of real-time data.

There are always going to be inaccuracies, though. At peak, Sound Transit / OneBusAway are trying to track a couple thousand vehicles across many miles in unpredictable traffic and weather. That it works at all is a minor miracle in my eyes.

The writer has fessed up to reading a lot of the emails, as evidenced by his summary of the content (e.g. cheating spouses, zabbix etc.). Wouldn't the responsible thing be to stop reading the emails once you realise what is going on?

I probably only spent 30 minutes looking at it and used a few regular expressions to look for anything interesting. The point there was to understand the extent of the leak so that I could raise it with the intent of being taken seriously.

A search for "(Fuck|Shit|Bitch)" can go a long way.
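
For readers who haven't done this kind of triage, here is a rough sketch of what such a scan could look like. It is not the author's actual tooling; the function name, the sample lines, and the search terms are made up for illustration.

```python
import re

def grep(lines, pattern):
    """Return (line_number, line) pairs that match the pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]

# Hypothetical snippets standing in for the first characters of each email.
lines = [
    "Meeting moved to 3pm, see agenda attached",
    "Here is my Credit Card number for the permit fee",
    "Reminder: zabbix alert on the mail server",
]

hits = grep(lines, r"(credit card|password)")
for number, text in hits:
    print(number, text)
```

A handful of alternations like this is enough to gauge how sensitive a dump is without reading it line by line.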

For what it's worth, I used to work at an investment bank spending 30hr/week diving through logs with unix tools, so finding interesting information quickly is something I've learned to do.

FWIW I think you should not have done that, though I understand the temptation.

At the first indication that the data was not what you requested and contained more than you - or they - bargained for, you should have stopped looking at it and alerted both the sender and the relevant data protection authorities, in so far as those are a functioning entity where you live, to tell them they have an 'accidental disclosure' on their hands. Essentially your blog post documents something that is pretty strong proof you are not able to deal with confidential information properly.

Fair. Though, in my defense, the last time I tried to report a similarly large leak (completely different circumstances), it wasn't until I was very explicit in what I found that the report was taken seriously and sent to the appropriate place. Otherwise, I think reports like this are just added to a pile of other complaints.

In this case, it took about a week for them to take it seriously. Like last time, it wasn't until I was explicit that they took it seriously.

If someone accidentally sends me information I owe them no duty of confidence. I'm under no obligation to notify them. It is entirely their problem.

The idea that the OP is at fault for looking at data which the city had already published has no basis in law.

>If someone accidentally sends me information I owe them no duty of confidence

This simply isn't true. If someone accidentally sends you information that you know you shouldn't be privy to, you should delete it. Unless perhaps you are Nelson Muntz.

Ethics aside, ideally you're right. In reality, though, you're wrong, because you just described a key function of the CFAA.

The city sent him a link with the intention he would download it, and he did. There's no crime there.

Having legally downloaded the data he looked at it. There's no crime there either.

Just curious, are there established guidelines that are broadly accepted and that lay out how to proceed? Or is it really just down to the "I think you should" on Hacker News? (No snark intended here.)

Accidental disclosure is a gray area, and once you are aware of it being an accidental disclosure you will want to make sure that you do not make things worse through your actions.

There are a number of moving parts here:

- The disclosure was clearly not the intended result

- The recipient could - and in fact did - realize this

- The recipient was in contact with the sender

- The recipient had some easy means to redress the situation

Given all of the above, if you then dig in and start looking at the data I think you are crossing a line. At a minimum a legal professional should have been consulted before further examination of the data, once it became obvious something was wrong.

In the end it would have been down to a judge to decide whether that crosses the line in a criminal sense but I would be loath to find out the hard way. Pick your battles and all that.

If all that information is available in plaintext to the relevant IT department, does it really matter that one other person is party to it?

I guess if all that is in those records I’m going to commend the Seattle IT department for their ethics at least.

It’s amazing what people think they can put in emails/messages and have stay secure...

There is a huge difference between releasing the information to the public and having IT workers under NDA with the ability to access it. And depending on their set-up that could be a lot harder than it might seem from the outside. Though, in truth I doubt it was set up properly, but absent evidence we can't make assumptions either way.

One citizen's communications with the city should not automatically result in disclosure of the fact that that person communicated with the city to other citizens.

The fact that a communication took place is in itself information, and, correlated with things like timestamps and who in the city was contacted, a large amount of sensitive information will leak out.

I had the same thought - he seems to have indulged in the data quite a bit to come up with such a thorough analysis.

What was thorough about his analysis? He mentions a few things that could easily be detected with grep within a corpus of millions of messages.

I would love to hear his explanation for why he thought reading through them all was necessary. I'm not surprised the city requested 3rd party proof he had deleted them; he clearly demonstrated a fascination with their contents.

> The passive aggression is thick.

From this I can accurately deduce that you really were talking to a person in Seattle. /local-in-joke.

Pretty amazing story; Seattle, collectively, always tends to mean well, but so often they stumble.

> Seattle was approaching the problem as if they were pursuing Computer Fraud And Abuse (CFAA) charges. For information that they sent. Jiminey Cricket..

> So, I deleted the files.

Isn't it great to live in a country where we have generic felonies that governments can apply to just about anything involving a computer and ruin your life?

Land of the "free" and the home of the 'fraid.

Where police treat you as fully guilty if you're a suspect despite you having to be thought of as innocent until proven guilty. A number of people have gone to jail because they did not know to keep their mouths shut and told cops too much, which made them sound like guilty criminals. Most people don't know where they were 2 hours ago, let alone the night of December 5th, 1957 at 2:05 AM, like SERIOUSLY?

I can't tell whether this is a testament to the incompetence of public IT operations or an indictment of public records keeping practice. Maybe both?

Also has a good example of hostile FOIA officers. I have filed about two dozen FOIA requests, and the vast majority were fine, though usually slow. Earlier this year one longstanding request of mine was rejected because they claimed the document I wanted was export controlled. Two months later I sent in an appeal where I showed that the document in question was not export controlled (I filed another FOIA with a separate agency to check), and I suggested that they lied to me. Never got an apology, though they seem to be processing the request for real now.

I've requested thousands from all over the United States, big and small municipalities. The responses and individuals ranged from a small contingent of extremely professional, organized, helpful, and all around wonderful people to many and more abysmal, unorganized, uneducated, tech-illiterate, and downright incompetent folks.

I can't remember how many times I pointed the person I was requesting the information from directly to their own sample version or official page, where they said they were the holders of the data I was looking for, and which also gave instructions to call them specifically to request that data. I know that from my POV (as a tech worker) it may seem silly to expect anything like that from them, but what I was asking for was akin to an Excel spreadsheet full of information they absolutely do have, and it required no legwork, no generation of new materials, no gathering of data from multiple ancient sources.

It was all data that each individual municipality was making money from, hand over fist, every month of every year -- the sample data most municipalities had was evidence of that. I was basically asking them for the entire collection of data, publicly traded info, and most were convinced they didn't have it. I'd chalk it up to laziness, because the individuals and municipalities that were awesome to work with and more than helpful seemed to take pride in their work, and getting said data from them was easier than most interactions with the government, at any level, usually wind up being.

Never attribute to malice that which is adequately explained by stupidity.

I used to abide by this until it dawned on me that malicious people often use stupid people as a cat's paw to conceal liability. A careless disregard for the truth is itself a form of dishonesty. This argument is laid out very effectively in philosopher Harry Frankfurt's delightful short book On Bullshit.

I agree, but in this case stupidity cannot explain their actions.

The agency processing the FOIA request would get the export control status from the third party I contacted. The third party's software highlights export controlled documents with a red and highly noticeable statement. It would be difficult to believe they thought this was present when it was not.

The request also had actually been transferred several times because no one believes they have the authority to release the document I requested. The other agencies had ample opportunity to reject the request for being export controlled, but none did.

There are some other reasons that I will omit for brevity.

This makes me think that the export controlled claim was a lie meant to kill the request. Most people would stop at what they told me, but I thought it was worth verifying.

Well, like most of Seattle's efforts, it gets fucked up, mostly because it is from Seattle.

Note: I am a Seattle resident and I expect nothing less.

Have you lived anywhere else, like, say, the northeast? Seattle is a shockingly well run municipality among American cities of size.

You haven't met the city council of the last several years, then...

Maybe I should run. I wonder if I'd stand a chance against Sawant.

Probably both. Or maybe just a story about a thing that happened.

This whole exercise seems more damaging than constructive, and I don't really like the author's smug tone, as if he deserves praise.

Disagree. Obviously there was a bug in the system, the author simply uncovered it. I'd rather have a smug white/grey hat than a malicious black hat. Now the system is all the more secure thanks to his actions.

The smug tone of the paid (with his taxes) government employees combined with their utter incompetence seems more inappropriate.

I find it constructive in that once again it is demonstrated that the narrative that we're governed by rational, competent people and that we should trust and respect our government is very much a mirage, and the 180 turn in the tone of discussions suggests managerial malice running on top of the front line bureaucratic incompetence.

So they accidentally included the first 256 characters for all the emails? yikes.

I was quoted almost $200k for a similar request for emails. I was trying to investigate a shady real estate deal, and they made it as difficult as possible. I was never actually able to get the information I requested.

I'm completely disgusted and fed up with corruption.

Perhaps try to sell the story to some of the investigative podcasts / blogs? An apparent cover-up gets as much attention as uncovered corruption. (As well it should).

I'm now convinced that there is no amount of evidence of wrongdoing that will actually harm crooked politicians. They control the narrative, and the courts.

I don't want to end up bone saw murdered, and it feels more likely every day.

In case Matt Chapman is reading this -- the contact email at the bottom of the page (matt@mchap.com) is probably not correct, given that the domain mchap.com redirects to an Australian photographer.

The alternative is that the email address is correct and Matt is redirecting his domain to another Matt Chapman, which would be totally hilarious.

Ah! Thanks!

To me, the most interesting thing in this entire post is the following:

> Funny enough, in the middle of that question, my internet died and interrupted the call for the first time in the six months I lived in that house. Odd. It came back ten minutes later, and I dialed back into the conference line, but the mood of the call pretty much 180’d.

I find that when strange things happen like this, they’re hardly coincidence. Did you run a traceroute after the disconnect anywhere? Did you see an IP address change? If so, was it a significant change in the CIDR block it was within?

That's speculative without any proof and I personally think it is a weak point in the article. I do conference calls several times per week and I've been accidentally booted out of the room numerous times.

Also, once they realized they had left the room of course they would continue to discuss the case and it is obvious they had to consider all possibilities, including the recipient releasing the information to others, hence the 180.

> Also, once they realized they had left the room of course they would continue to discuss the case and it is obvious they had to consider all possibilities, including the recipient releasing the information to others, hence the 180.

You're saying the natural default behavior is to assume the worst about someone and draw a conclusion in their absence, as opposed to suspending discussion briefly while trying to get the person back on the phone? That seems like a very bad-faith approach to negotiation or discussion, given that the legal liabilities are something that were so easy to identify in advance.

Er, why the doubt? My internet completely died. My roommates were also affected. Not sure how I can give proof.

Because it is just the timing that makes you say this, and I highly doubt the city of Seattle can - on a moment's notice, no less - pull the plug on any residential internet connection.

If true, that would be a far bigger news item than the rest of your story.

Who knows. It was strange for me, too.

Think about it: you see your internet connection dying as proof when they could have just as easily booted you from the conference call, raising much less suspicion. I see it as proof of the opposite: they had a far easier and more direct means at their disposal to achieve the effect you say they desired. So I really do not believe that it was anything other than bad timing; all that it would take for this to happen is for your provider to reset a router somewhere.

My residential connection here is pretty good, even so it goes up and down at least once every week or so whenever some firmware update is pushed to the router.

My phone was dead at the time, so I was using my desktop with google hangouts for the call. I was not booted from the call. My internet died. End of story.

I work from home, so my internet going down is a big deal for my livelihood and all that. I'm not saying that something suspicious happened, but I figured it was an interesting thing to happen. You're frankly thinking into it too much.

> You're frankly thinking into it too much.

I think that was my line.

Pot kettle black, I guess.

I'm simultaneously impressed and saddened by how fast the responses for these FOIA requests were processed by the government.

And here in my country, I needed a court order to get at least an acknowledgement of my FOI request.

And now I'm petitioning for court intervention to get the FOI processed in accordance with the law.

This could have been an extremely valuable dataset for the legal community. The Enron data is currently guiding much of our machine learning validation, simply because it's available.

As a general point, I totally agree with this. The Enron dataset, released ~15 years ago, is still used by eDiscovery and other legal vendors along with other researchers.

There have been a huge number of papers using this dataset. There are not many other datasets of its type or size available, and despite its age it is one of the best we have. If people are aware of legally released datasets with a similar size and content I would be interested to hear about them.

I'm confused, why was he looking for this information?

My guess? To look for government employees whose networks are tightly coupled to a special interest or significant person/party.

He wasn’t, that’s the kicker. They included it “because we have no way to filter the output”. It’s a good read top to bottom :)

He mentions that he had previously requested this info in Chicago. This is the relevant previous blog post:



>Seattle's first response included a bit of gobsmackery that I’ve almost become used to

Brit here. I'm always amused that 'gobsmack' and its derived words are still used these days, more so across the Atlantic.

Roughly translated: lost for words, typically for a short time.

> Especially with the use of Excel, which would be useful for removing duplicates, etc.

Excel can only handle about 1 million rows, right?

Nah, it's not limited except by memory anymore, AFAIK.

Ah, spinning beachball time then.

Mental note to self.

Instead of reporting data breaches turn it into a torrent and make it public.

The dump includes email addresses, both government and private. That does not seem good.

There is surprisingly little spam. Either that is about to change, or spam didn't get included in the FOIA.

Interesting read to learn of challenges cities have and what mistakes they make along the way, but the author comes across as defensive and quite arrogant.

Awesome work! I'm glad we have people willing to go through all the hassle, so we can keep our governments responsible.

I find the writer to be a bit of a dick in his responses. Yes, the city IT may not be at the same level as Google engineers, but there’s no need to mock their ballpark estimates, and after the mistake there’s no need to be a jerk about it. Be forthright about the error.

Consider being on the other side of this: due to a careless mistake, the data of many people is exposed on a random stranger's hard drive. Asking for an independent third party verification is reasonable.

Bringing lawyers into the mix was also unnecessary. And if more people follow the author's example, then the state level FOIA laws may be put at risk over the long term.

Your point would have been made better without name-calling.

You are asking for leniency on the side of the officials, yet do not seem to be willing to apply the same standard to the requester. Yes, it could have been handled better, but that applies equally to both sides. Anybody that does FOI requests that have the potential to retrieve a lot of sensitive data due to mis-understandings or mistakes (which is pretty much all of them) should handle the data carefully until they have verified upon receipt that it is what it should be and that the data is not somehow more sensitive than intended.

The author did ok in that respect, though he could still have done better. The city would have been better served by refusing the request as stated until ordered by a judge to release it, on the grounds that it is an overbroad request that will result in the release of privacy sensitive information if fulfilled.

> Asking for an independent third party verification is reasonable.

Not really, for the same reason they never should have sent the excess data in the first place... Why should he give up his privacy to some 3rd party company to help cover up their mistake?

Plus it sets a dangerous precedent. "Oops, here's some confidential data you never asked for, let me send a couple guys to scan your hard drives".

Corrupt cops do sometimes plant evidence. This isn't much different.

For the ballpark estimate, I think they just wanted the request to go away by quoting an insanely high price.

Long long term resident with a connection to local government

They don't have a choice. Seattle IT is so underfunded that hands are tied because there isn't any resourcing.

On one hand, you have to respond to all of these requests (and rightfully so, as it's the law.) On the other, you have no money for your department because it has no funding because the citizens didn't want to spend the money.

The person who did this isn't malicious. Just very overworked and did a data pull wrong. They probably didn't give a shit to check because their job kinda sucks and they have too much to do already.

A good chunk of this is caused by our city repeatedly choosing awful vendors that bilk the city for crazy amounts of money, and provide trash as the final product.

City Light and the new meters/new billing system are great examples, all the new power meters have no encryption, and use FSK for modulation. Asking City Light about this got me a response that FSK was the encryption, and the gal was dumbfounded when I pointed her to the Wikipedia article on FSK.

On the billing side, an Oracle salesman ran off with over $100 million in city funds for what is essentially a CRUD app, and the worst part is they didn't bother to customize this system, just forklifting this in place and letting the chips fall where they may. The prior billing system had quite a bit of data validation and business logic that has yet to be implemented or replicated on this new system. The same actions when you call customer service now take significantly longer.

Both these vendors fleeced the city for broken, insecure systems, and neither is having to face the music for it. Worst part is, eventually someone may attempt a fairly trivial exploit of either system, which could wreak havoc in our city.

The vendors may be horrible, but it's still on the shoulders of the city for choosing the vendors. This is really down to the fact that most decision makers have no idea how to understand or differentiate between options. It's been like this for decades. When a secretary of state doesn't know the risks in running a private independent email server and how to ensure those risks don't become issues, how do you expect much lower level city governments to make any better decisions? Honestly, the IT workers are often underappreciated, underpaid, and underskilled. I usually don't blame the IT people. If they could work at FANG, they would. If they get into a rut where they end up not caring anymore, it sucks. But let's put blame where it belongs: it's on leadership to develop a culture where people care and on leadership to invest money and resources accordingly.

Edit: though I shall add that it unfortunately is sometimes not possible to build a new culture without firing incumbents. That one is a really unfortunate situation. It's not the fault of low-skilled IT workers if they are enabled and rewarded for poor skills and attitudes. But if an organization wants to make good cultures, it sometimes requires hard decisions. Still don't blame the workers, just like I don't blame factory workers that get automated. It's just an unfortunate thing that can happen that most people don't deserve.

I don't think it's good enough to say « it's still on the shoulders of the city for choosing the vendors ».

If I write a piece of software which is technically capable of meeting its requirements if you read the manual carefully enough, but in practice the intended users can't figure out how to do so, that piece of software is no good.

Similarly if the market is in principle providing IT vendors who are capable of providing a decent service, but in practice the purchasers can't figure out which ones they are or how to make them do so, the market has failed.

Do you have a viable method to fix the problems you describe? If not, then we're still left with it being on the shoulders of the city. Ultimately, it's really hard to police that the vendors don't make crap solutions. If the market fails to figure out which are crap, the market may have failed, but I don't have any ideas that would succeed better.

Edit: and let me say that I posited that the issue isn't market forces. The issue is lack of expertise at decision making levels. Even if there were zero market, it wouldn't stop people from doing it wrong.

Principally: start punishing the corrupt vendors; don't assume that reputation mechanisms will ensure that non-corrupt ones will eventually outcompete the corrupt ones.

Just as you have to build software for the users you have not the users you feel you deserve, we need a service industry that works for the service-commissioning agents we have.

Again... how would you do it though? Who would do the punishing? How would you enforce the punishment? Would clients build the punishment into the contract? Even if you could do that, the vendors have the advantage of how to construct the contract in their best interests and avoid punishment. Most ideas will go straight back to market forces. Again, the issue is expertise to choose good options, not the existence of market forces.

> Wikipedia article on FSK


> The demodulation of a binary FSK signal can be done using the Goertzel algorithm very efficiently, even on low-power microcontrollers.

The source links to the TI page for the MSP430, named because it originally sold for $4.30. While the original link is dead, an internet search reveals many side and college projects using this technique.
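For anyone curious what that looks like, here is a minimal sketch of binary FSK demodulation with the Goertzel algorithm. It is only an illustration of the technique the Wikipedia quote describes, not anything known about City Light's meters: the function names, tone frequencies, and bit timing are all assumptions made up for the example, and it assumes the receiver already knows where each bit starts.

```python
import math

def goertzel_power(samples, sample_rate, target_freq):
    """Return the squared magnitude of the DFT bin closest to target_freq."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        # Second-order recurrence: one multiply and two adds per sample.
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def demodulate_bfsk(samples, sample_rate, samples_per_bit, mark_freq, space_freq):
    """Decide each bit by comparing tone power at the two (assumed) frequencies."""
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        mark = goertzel_power(window, sample_rate, mark_freq)
        space = goertzel_power(window, sample_rate, space_freq)
        bits.append(1 if mark > space else 0)
    return bits
```

The point being that nothing here needs a key: whoever can receive the signal and guess the two tones can read the bits, which is why FSK is modulation and not encryption.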

>A good chunk of this is caused by our city repeatedly choosing awful vendors that bilk the city for crazy amounts of money, and provide trash as the final product.

How much of this is voters voting for politicians who are good at what is essentially a popularity contest instead of politicians who are capable of signing decent contracts? As long as voters don't care enough to change their voting habits this will continue to happen.

I heard from my city solicitor (whose office handles these FOIA requests) that one issue they have is that when people make requests of other city departments the default response is “go make a FOIA request” even in straightforward situations where they could easily give out the info directly (and used to, before the FOIA was passed). He said it’s annoying because it ends up putting a lot more workload on everyone.

It was a reasonable ballpark. Let's say a city prosecutor was working on an organized crime case that involves the FBI and other people.

Based on timing of emails this would leak the list of people working on the case, maybe informants and put them at risk.

We all lost our collective shit when NSA said they're only collecting metadata. Metadata is Data.

Since they revised the cost for the data they actually sent him down from $33M to $56 (90 days of data at $1.25 for 2 days data), was a ballpark estimate that's over 500,000 times higher than the actual cost really reasonable?
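A quick check of the arithmetic in the comment above, using only the figures quoted there ($33M original estimate, 90 days of data, $1.25 per 2 days of data). The variable names are just for illustration.

```python
# Figures as quoted in the comment; nothing here is independently sourced.
original_estimate = 33_000_000   # dollars
days_of_data = 90
price_per_two_days = 1.25        # dollars

# 90 days is 45 two-day blocks at $1.25 each.
revised_cost = days_of_data / 2 * price_per_two_days

# How many times larger the original ballpark was.
ratio = original_estimate / revised_cost

print(f"revised cost: ${revised_cost:.2f}")
print(f"original / revised: {ratio:,.0f}x")
```

So the revised figure works out to $56.25 and the original estimate is a bit under 590,000 times that, consistent with "over 500,000 times higher".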

$32M is not a cost the requestor bears, but an estimate of the cost the city bears.

IT estimated the requestor fees to be $21k/year assuming 10TB of data.

Clearly, IT were estimating the costs of releasing all email text, because FOIA mistakenly changed the words "please provide the following information:" to "including metadata:"

Once they figured out the request was for header information only, the city came back with an estimate of under $60.

The actual cost of servicing this request was much greater than the $56 estimate.

Especially after the city made a mistake!

The ballpark estimate was for checking the full text of each and every email (which was a misunderstanding -- the author wanted metadata only).

> Metadata is Data

And in many cases metadata is just as useful as the payload, in some cases even more useful.

How was a lawyer unnecessary? He was threatened after doing the right thing..!

Absolutely. If anyone reading this ends up in a similar situation: please involve lawyers. It's the single best move the author did in this whole mess.

> Asking for an independent third party verification is reasonable.

Asking for it may be reasonable, demanding it certainly isn't. You can't demand that somebody gives a third party access to their hard drive.

First, it's completely unreasonable from a privacy aspect. Given the level of personal data most people store or process on their computers, it is even less reasonable than asking someone to have a third party dig through their house to "verify" that they don't have something.

Second, it's completely unreasonable because it's pointless against a malicious actor - they could have cloned the disk before deleting the data and there would be no way to detect it.

It does make sense only to confirm that the data wasn't accidentally left on the disk due to an insecure erasure method. When dealing with someone competent with computers, the proper solution (which they ultimately arrived at) is to have him describe the method he used to delete it, possibly ask him to verify it (e.g. via some form of "dd | grep"), and that's it.
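As a rough idea of what "some form of dd | grep" amounts to, here is a minimal sketch that scans a raw disk image (or device file) for a known byte string. The function name, the marker string, and the path are hypothetical, and this is an assumption about how such a check could be done, not the procedure the author and the city actually used. As noted above, it says nothing about copies made elsewhere.

```python
def find_pattern(path, pattern, chunk_size=1 << 20):
    """Return byte offsets where `pattern` occurs in the file or device at `path`."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    overlap = len(pattern) - 1   # bytes carried over so boundary matches are not missed
    tail = b""
    position = 0                 # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            base = position - len(tail)   # file offset of buf[0]
            start = 0
            while True:
                idx = buf.find(pattern, start)
                if idx == -1:
                    break
                offsets.append(base + idx)
                start = idx + 1
            tail = buf[-overlap:] if overlap else b""
            position += len(chunk)
    return offsets
```

Usage would be something like `find_pattern("/path/to/disk.img", b"some string known to be in the data")`; an empty result only shows that the string is not sitting in plain form on that particular disk.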

Assuming the author's version of events to be true, I think he was being as reasonable as possible when confronted with obvious incompetence and hostility from public servants. This probably happens a lot in the world of FOIA.

I do want to recognize that not all local governments behave the way described. When I was recently called for jury duty, I discovered a vulnerability in the city's jury duty online portal that would've let anyone get the PII of anyone ever called for jury duty via that system. I immediately called the county IT department, and they took me very seriously and thanked me for the report. They later emailed me back to tell me they worked with the vendor to close the vulnerability. I was extremely impressed with their professionalism and wish that all local governments could be so responsive.

I'm curious what his actual legal exposure would have been if he hadn't folded.

I feel like they should have offered to compensate the author for his time in their initial request - if someone wanted to perform forensic scans on my hard drives it would be a huge inconvenience.

It is also a legal issue, depending on what else is on the drives. Imagine having to call your employers or clients and tell them that data under NDA is being turned over to Seattle's auditors.

Worst case, they'd want to know exactly what data was at risk, which would trigger more demands for forensic audits and more duty to notify.

I'd be curious what his legal exposure would be if he were a person off the street, who didn't have a lawyer handy.

Pretty massive. An accidental release where the party released to is known and unwilling to perform a remedial action can go anywhere from a slap on the wrist to a criminal investigation. That doesn't mean there will be a conviction, but de-escalation would seem to be a wise course of action in such cases.

De-escalation? Sure.

Allow a third-party forensics company hired by and beholden to a presumed-hostile counterparty unfettered access to your hard drives because of their own lack of care or incompetence? Hell no!

These kinds of actions are why the bill mentioned at the end of the article was put forth in WA (vetoed by the Governor).

TIL there is a genre of people who call themselves FOIA nerds and who appear to be unpleasant, doing this sort of fishing for fun.

More than a dollar per email address doesn't sound like a very good deal to me.

It's clear they didn't have expertise to do it, and I'm tired of reading people that know way more looking down at others over it and assuming they don't want to comply. If they hiding something and malicious, the end result wouldn't have been to send way too much, but I don't see the author realizing this fast enough.

I'm not sure why you think the only malicious response would be to send over too little. It's a common enough scene in TV shows and the like where a malicious actor attempts to hide incriminating information in a sea of irrelevant information.

> I'm tired of reading people that know way more looking down at others over it

What do you mean "people that know way more"? He made a simple request for email metadata, spelled out each field he was interested in. He didn't tell the city how to do it.

Are you saying that the author knew more about how to retrieve email metadata than the actual Seattle IT staff that administer the mail system? And what bothers you most is that the author knew more about how to fulfill his request than the people that run the mail system?

Agreed. You're putting an overworked, underpaid public servant in a "damned if you do, damned if you don't" scenario. They complied with a far reaching request and got told their response was too far reaching? I'd quit my job if faced with a legal minefield like that, especially one not actually related to the job itself

Why do you think complying with these requests is not part of the job for Seattle IT?

I would think it’s pretty cool to retrieve this massive amount of data that I wouldn’t otherwise get to play with.

Presumably the overreach should be considered important, though: if there’s an authorization process in play, then more information has been given out than the public servant was actually authorized to hand out. If I as a citizen start receiving sensitive information despite only being authorized to receive non-sensitive info, that could easily become a significant security breach.

In fact, a known security check was actually bypassed in this case: the email review, reserved for the content of the email, causing the whole problem in the first place.

It seems to me imperative that they actually deliver up to the amount authorized. Ideally exactly the amount, but never more.

>They complied with a far reaching request and got told their response was too far reaching?

He requested metadata and they sent actual email content, kind of a big difference there.

> If they hiding something and malicious, the end result wouldn't have been to send way too much

Wrong. Dumping a vast pile of irrelevant information at the last possible moment to obscure something embarrassing is a very common tactic in commercial litigation.

Author of article has no background/understanding of the "sunshine" laws in effect in WA. Those laws may (do) explain a lot of why things go this way with any/all FOIA in WA.

Source: 100s of FOIA requests to various WA government agencies.

Can you provide more details? What is it about WA sunshine laws that make the government misunderstand a request, overestimate the cost of providing the requested data, and then provide data that was not requested resulting in a breach of disclosure laws?

Remember when Shoreline had to pay out ~$500k because of mistakes they made on a FOIA request?


It's because Washington agencies are required to cover reasonable attorneys fees for their opponents after losing open records lawsuits (one of the factors in our FOIA laws)

So when the author sent the request to Seattle, they had the example cited above (and 100s of others across the state) where a mistake could create a lawsuit that costs loads of money.

Did you know that the burden is on the agency to establish that its denial of inspection is proper?

Did you know that the court could award you an amount between $5 and $100 a day for each day that access to the records was denied?

So, if they don't give you everything you ask for you can sue. And it's easy (relatively) to win in WA for that because of our FOIA laws, then the agency has to pay for the lawyers and a penalty for delay of the records. For emails that means $5 * (Days of Delay) * (Number of Records).

In short, if Seattle fucked up this FOIA request, denied or delayed -- that could have cost them millions of dollars.

The author didn't understand that and (like a fool) blames the city and city-workers.
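To put numbers on the formula in the comment above ($5 * days of delay * number of records), here is a small sketch. The function name and the example figures (30 days, 1,000 records) are hypothetical, and the per-record multiplication is the commenter's reading of the law, taken as given here and not as a statement of how a court would actually calculate a penalty.

```python
def penalty_estimate(days_delayed, record_count, daily_rate=5):
    """Commenter's formula: daily rate x days of delay x number of records."""
    # The comment cites a range of $5 to $100 per day.
    if not 5 <= daily_rate <= 100:
        raise ValueError("daily_rate is expected to be between 5 and 100 dollars")
    return daily_rate * days_delayed * record_count

# Hypothetical example: 1,000 emails withheld for 30 days.
low = penalty_estimate(30, 1000)                    # at $5 per day
high = penalty_estimate(30, 1000, daily_rate=100)   # at $100 per day
print(f"${low:,} to ${high:,}")
```

Under that reading, even a modest delay on a request covering many records multiplies out quickly, which is the fear the comment is describing.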

> In short, if Seattle fucked up this FOIA request, denied or delayed -- that could have cost them millions of dollars.

Sorry, but that sounds like bullshit. The Washington law provides for agencies to take reasonable time on a request, especially a request that is complicated and broad. In fact, unlike the federal FOI law and those of other states, the Washington law does not prescribe the number of days within which an agency must respond, only that responses be made "promptly":


The city of Shoreline did not have to pay out $500K "because of mistakes they made on a FOIA request", not according to what you posted:

> The City of Shoreline will have to reimburse $438,555 to cover the plaintiffs' costs as Washington agencies are required to cover reasonable attorneys fees for their opponents after losing open records lawsuits. Shoreline also agreed last year to pay a $100,000 statutory penalty after the court found that the city violated the state public records act.

They paid $438K for fighting the request for seven years. They paid an additional $100K penalty because they were found to have violated the law. They did not pay for "mistakes", at least not mistakes in good faith.

SEATTLE – The Washington Supreme Court has upheld a $502,000 penalty for Public Records Act violations by the state Department of Labor and Industries, in a ruling which affirms that judges can calculate such fines based on each page of a withheld record.

Another example in WA, the kind of thing that scares public sector employees.

I have direct personal experience with this, and direct knowledge of others receiving payout from government (in WA) for similar violations (almost had to start another suit this month).

Your narrow interpretation is splitting hairs. The danger is real to the government workers executing these requests

Edit: two more easy to find examples



Sorry, I don't understand what your argument is. Both articles you link to refer to large penalties imposed against government agencies for withholding the requested records, over a long period of time.

In your original comment [0], you suggested that employees feared making innocent mistakes that would lead to open records lawsuits. None of the examples you've provided describe that situation. Instead, they involve agencies (and their lawyers) who have decided to refuse a request and fight it out in the courts. What does that have to do with being a danger to employees who handle these requests?

[0] https://news.ycombinator.com/item?id=18267039

Well, spend some time with those employees, they know about these cases.

This means that these employees frequently send more, faster to avoid a big public issue.

The point is that individual people feel pressure and make decisions based on these articles (and many not so public cases) that in retrospect are not that great.

And then some blogger makes foolish claims, as if it was incompetence rather than fear.

I'll not comment further

I've sent quite a few FOI requests myself when I was a reporter, and my experience biases me against thinking there's a significant problem of state employees sending out FOIAs quickly/prematurely out of fear. In fact, I've never dealt with an employee who broke protocol -- for non-routine requests, the vast majority of them consult with their FOI officer/legal counsel. And they have no incentive to rush things because most FOI laws allow for a delay in response time -- Washington's law doesn't even have a set time limit in which the government has to respond.

> The point is that individual people feel pressure and make decisions based on these articles

That is literally the situation of every public servant -- as just about any police officer will tell you. The difference with FOI is that the law provides ample protection for government employees to take their time to get it right, and every investigative journalist I've ever worked with puts up with those delays -- it's only when the delay goes into months/years such that it's tantamount to a rejection that legal action is threatened, because the lawsuit itself takes months to resolve.

The only example lawsuits you've found were ones in which the agencies refused to fulfill the request. Until you can show a single instance in which a state employee, or even an agency, was punished because they were late while trying to respond in good faith, I don't think we should assume you know what you're talking about when you claim the author has "no background/understanding" of WA's public records laws.

It's especially absurd that you're trying to argue that the mistake the city of Seattle made in his case was done out of hurried fear, when the author provides correspondence that shows he and the city emailed back-and-forth from April to August before they sent him the data. A technical screwup (via internal miscommunication) is the most plausible explanation by far, as no one in the IT or FOI office had any reason to rush this request.

Could you be a little more precise? As is, this is just a vacuous statement about how you know more than the author.
