Congress votes to make open government data the default in the United States (e-pluribusunum.org)
1130 points by danso on Dec 23, 2018 | 154 comments

Making public data open by default can arguably be an important step towards fostering societal equity. However, it needs to be not only "open", which typically means stashed away in some corner as a spreadsheet or database file, but accessible and useful to people. The UK has been pushing open data for years now and more and more institutions are now realizing this. Shameless plug for a research project that is aiming to make open data more accessible and to democratize data-science: https://data-in.place/ ...

In case you haven't seen it, I think https://www.data.gov/ is an attempt to answer your point about making it "accessible and useful to people." There's room for improvement, but it's a start.

I thought there would be a flood of projects analysing the data when it came out, but it seems like an idea that everyone applauded and not much came out of. Steve Ballmer's http://usafacts.org seems like the first real attempt though.

Is there a HN thread discussing these reports? I'd love to hear folks' opinions about them... Interesting facts to me in the latest report [1]:

- Page 44: "Our economy has grown at a steady rate despite changes in economic policy". I expected to see a lot more fluctuation on this data.

- Page 30: "There have been more suicide gun deaths than homicide gun deaths every year since 1981". This is crazy to me given how much we hear about gun homicide being a problem in this country.

- Page 29: Crime rate has declined but "The number of incarcerated persons has increased by 330% since 1980".

[1] https://static.usafacts.org/public/resources/USAFactsReport2...

>Page 44: "Our economy has grown at a steady rate despite changes in economic policy". I expected to see a lot more fluctuation on this data.

Barring some calamity, this is pretty much as expected. Tariffs have a large impact on a very small percentage of the economy and add a small overhead to the rest of it, much like a Fed interest rate hike. Even with this hawkish Fed, there hasn't been anything overly harmful to the economy from a policy perspective.

> This is crazy to me given how much we hear about gun homicide being a problem in this country.

Anything that's politicized gets this special treatment. Kid kills brother with car, blurb in local newspaper. Kid kills brother with gun, national news and Tweets from Presidential candidates.

Texting and driving is as bad as drinking and driving when it comes to number of deaths (dwarfing gun deaths as well), yet people do it like it's no big deal. Most cities still only levy small fines (in comparison to DUIs) for doing it. Not all accidental death is created equal.

>Page 29: Crime rate has declined but "The number of incarcerated persons has increased by 330% since 1980"

Welcome to the US where the war on drugs gives us authoritarian level incarceration rates.

Every election cycle I look for the candidate with the balls to say the War on Drugs is over, let’s wind this crap down, change our laws to reflect this fact, change some sentences ex post facto to reflect this and get on with our lives.

Every election cycle, I continue to be disappointed. Even if they were crazy in every other regard, I would probably still vote for them. It’s like, Step 1 towards doing anything meaningful in regards to poverty, education, criminal justice reform, et cetera.

+1 for this. This site is beautifully designed and loads very fast in mobile.

That's pretty interesting, thanks for mentioning it.

https://public.enigma.com is another good way to browse public government data.

data.gov is great but it seems like the most recent activities/updates were prior to 2016...

USDS/18F have done such incredible work and I'm glad the Trump admin hasn't killed them off completely. Given Trump's total-war attitude toward all Obama initiatives I'm somewhat surprised they still exist and haven't been gutted like CFPB.

honestly, with this administration's turnover and record number of still-unfilled positions, I'd bet there's a very real chance that it still exists because it hasn't been noticed yet. because it sounds exactly like the kind of thing the Trump admin would hate, seeing as how they've already pulled a lot of data out of the public eye like climate change reports, white house visitor logs, etc.

nih.gov is also quite good for health data.

As long as those spreadsheets/database files are accessible to someone with technical skill, people can pull in the data and use tools to make it more accessible and useful. Ideally, yes, the data is useful to begin with, but as long as it's available, there's nothing stopping individuals with the skills from making it useful.

Of course, there are exceptions: the PDFs that are often provided by the prosecution as part of the discovery process are prohibitively difficult to deal with, and should be considered a violation of Brady vs. Maryland, IMO.

I've spent a great deal of time parsing data out of government PDFs that isn't attainable by any other means as a part of my job. In the process I've learned how difficult this information can be to access even for people who don't require it to be in a machine readable format. It certainly has been an interesting exercise in how far simple web scraping tools can be pushed, though.

Amazon Textract was recently announced, sounds like it might be good for that. Haven't tried it myself.


I applied to the beta, but they never got back to me :\

Have you tried Apache's tika? It's pretty decent.

Nope, I'll have to give it a spin. Thanks for the recommendation!

Do you have any tool suggestions or general advice for someone trying to do this? A while back I was trying to extract text from some government PDFs in order to make the information more accessible for others, but I became a bit overwhelmed when I started reading up on PDFs.

Sure! In terms of raw text extraction (for documents that don't require OCR), the most useful tools I've worked with have been pdftotext [0] and PyMuPDF [1]. For extracting useful details, really, my best advice is to make sure that your regex skills are sharp. I've been meaning to explore the possibility of using NLP tools for named entity recognition, but unfortunately I don't have much of a background there.

The rest of it kind of just comes down to using good software engineering practices to help keep yourself sane. Find useful abstractions for common tasks you need to perform and build a library around them, make sure that your data processing pipeline is designed with enough flexibility to handle inputs in different formats so that adding or modifying parsing logic becomes trivial, etc.

[0] https://www.xpdfreader.com/pdftotext-man.html [1] https://pymupdf.readthedocs.io/en/latest/
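
To make the "extract text, then regex" workflow concrete, here is a minimal Python sketch. It assumes the `pdftotext` command-line tool is installed and that the PDF has a text layer (no OCR). The row layout (date, agency, dollar amount) and the names `ROW`, `extract_text`, `parse_rows`, and `SAMPLE` are hypothetical, made up purely for illustration; real government PDFs will need their own patterns.

```python
import re
import subprocess
from decimal import Decimal

# Hypothetical row layout: ISO date, agency name, dollar amount, separated by
# runs of two or more spaces (roughly what `pdftotext -layout` emits for tables).
ROW = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s{2,}"
    r"(?P<agency>.+?)\s{2,}"
    r"\$(?P<amount>[\d,]+\.\d{2})\s*$",
    re.MULTILINE,
)


def extract_text(pdf_path):
    """Return the text layer of a PDF via the pdftotext CLI (no OCR)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" sends output to stdout
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_rows(text):
    """Pull date/agency/amount records out of already-extracted text."""
    return [
        {
            "date": m["date"],
            "agency": m["agency"].strip(),
            "amount": Decimal(m["amount"].replace(",", "")),
        }
        for m in ROW.finditer(text)
    ]


# Made-up sample standing in for the output of extract_text().
SAMPLE = """\
Quarterly Disbursements            Page 1
2018-10-01  Department of Energy     $1,250.00
2018-10-02  National Park Service    $310,000.50
"""

if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row)
```

Keeping extraction and parsing as separate functions is one way to get the flexibility described above: a new input format only needs a new pattern, and the parsing step can be tested on plain strings without any PDF on hand.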

pdfminer is another good library (Python).

Exactly. Accessible and machine readable are necessary but not sufficient. Thankfully, civil society can reasonably pick up the slack.

In regards to modern day transparency requirements, it seems like laws should include a reasonableness clause.

Making records available to the public but requiring them to be hand photocopied vs. making them available in electronic form in a custom format.

Both open. But two very different magnitudes of effort.

>Making public data open by default can arguably be an important step towards fostering societal equity

I think this was one of my biggest shocks doing work for the government, collecting public data, paid for by tax-funded grants. Public data isn't for the public.

We went into this project with all these starry eyed dreams of making a public online database and freely posting everything we collected, with maps and interactive tools, status reports. It was part of our grant proposal.

Then reality came and we found out public data meant a government password-protected database with access fees, where our data would be available to people willing to pay for it, or we'd lose our funding. The data were for companies or individuals willing to pay the government, not for the public.

This still doesn't sit well with me nearly 6 years later. That was never what we wanted out of that project and it wasn't what was planned or accepted when we wrote our proposal.

How much did access cost?

Ideally, I'd prefer the data be free, but if the fee was (mostly) nominal, I'd consider that almost as good...

Simply put, this government isn't ours. It belongs to the corporate heads and the monied elite and the lobbyists who write the laws that are usually summarily passed by Congress.

The FAIR principles make a lot of sense to me: Findable, Accessible, Interoperable and Reusable.


I was kind of concerned right off the bat with that numbered list:

1. public information should be open by default to the public in a machine-readable format, where such publication doesn’t harm privacy or security

I'm sure literally everything that they wish to keep opaque will be declared to be covered under one or both of these incredibly vague categories and nothing will significantly change. Is there any elaboration in the bill that defines what they can call a matter of privacy or security? Even if there is, it wouldn't matter much because how are people going to tell if they keep it locked down in the first place? And they would not risk any sort of real blowback for abusing this and getting caught. Tell me there have not been far, far worse scandals that resulted in no consequences for the perps and cowed silence from the public. I don't think they're hiding the X-Files in there or anything, but this won't magically cause a more transparent, just, or equitable government unless it has serious teeth and tight language.

And 2. federal agencies should use evidence when they make public policy

Somehow I wonder if the data from the Kansas experiment will be taken into consideration and turned into public policy by this current administration, or if they will cherry-pick evidence selectively to justify only wildly unpopular legislation because someone (possibly an industry with a conflict of interest?) contrives some p-hacked research to back it up. Just because something is scientific doesn't necessarily mean it's good government. It is often so, but I'm always very wary when they trot out a bill with lots of bold language touting justice and democracy, truth, stuff like that. If the US legislature passes a bill called "protect innocent puppies from being kicked in the name of god and freedom" you can be 95% sure that this bill will enable a great wave of puppy-kicking despite its holy name.

This is a cool project. I was trying to find the source for the project on the page. There’s an oss page [0] but that’s about the software used.

Is this an open source project? Or what’s the way for licensing to use with US data?

[0] https://data-in.place/open-source

>However, it needs to be not only "open", which typically means stashed away in some corner as a spreadsheet or database file, but accessible and useful to people

True, I guess HGTG applies well here:

“But the plans were on display…” “On display? I eventually had to go down to the cellar to find them.” “That’s the display department.” “With a flashlight.” “Ah, well, the lights had probably gone.” “So had the stairs.” “But look, you found the notice, didn’t you?” “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”

It would be extremely helpful if the government data used the same Solid protocol (Linked Data Platform) which will likely be getting widespread use starting next year. (See Tim Berners-Lee's Inrupt.com Solid startup)

What are the signs that make you think Solid will see widespread use, and by whom? I'm genuinely curious.

I agree that the government will need to work on a transparent front-end to make this data universally accessible. Nonprofits without the budgets for advanced tech workers and without volunteers will need clearly organized links to download. They may not know how to do shell scripting.

More data that everyone can use, rather than data that can be owned and commoditized or utilized only by specialists, is a good thing.

I would also mention Data USA by the MIT Media Lab's Collective Learning group, an effort to combine multiple sources into geographic profiles rich with visualizations and data sources: https://datausa.io

>Making public data open by default can arguably be an important step towards fostering societal equity.

Define equity here?

Not the parent commenter, but open data policies could contribute to leveling the playing field for society. The vast majority of records requests are currently made by corporations and other business interests.

Having accessible data isn't enough. The average person also needs the tools to process it in a meaningful way. This is where corporations are ahead - with relatively unlimited resources to turn that government data into insights that influence business decisions.

I'd actually rather have the raw data accessible instead of a doctored version.

Why not both?

Make the raw data available for those of us who want to write machine-parsing algorithms, and also make it available in human-readable and easily digestible form for the broader public.

> Making public data open by default can arguably be an important step towards fostering societal equity.

Out of curiosity, what’s the argument here for how public data being open by default is an important step toward fostering societal equity?

Here's a paper by the Sunlight Foundation that goes into this a bit:


I disagree - that's not a link to a paper by the Sunlight Foundation.

It's a link to a site that wants you to agree to try a monthly subscription before you can download anything at all.

Especially in the context of a discussion about public data, that's an important distinction.

If you can, please provide a URL to the actual content?

The article linked above, hosted on Scribd, is actually one of the official distributions provided by the Sunlight Foundation. The Scribd user who uploaded it was one of the paper's authors.

Having said that, here's a link to the version hosted on the Sunlight Foundation's domain: http://assets.sunlightfoundation.com.s3.amazonaws.com/policy...

You can also verify what I said about the Scribd version by checking out the original press release/announcement here: https://sunlightfoundation.com/2015/05/05/a-new-approach-to-...


What do you mean by "democratize data-science"?

I prefer when it is available as a spreadsheet or database file. The crucial element is my having access to the data. After that, any differential amount of information I have over everyone else is an improvement.

For instance, consider the statistics on accidents between cars and bikes in California. You can get the numbers yourself from the government. When someone says that more accidents are adjudged to be because the bicyclist is at fault, you can reference the truth and ruin the credibility of the person making the assertion, thereby allowing political advancement of your own cause. No one can use the technique against you because you are capable of acquiring the knowledge and won't make wrong assertions. Only other people will make them.

Having verifiable true information over someone is power. It's better non-democratized so long as I fall within the circle of power.

I think it needs to be both. Accessible in its raw format as well as approachable for non-technical users to interact with public information.

All I want for this Christmas is an authoritative list of all US federal agencies. I kid you not, there is no such list and the number of federal agencies is uncertain. From Wikipedia:

>Legislative definitions of a federal agency are varied, and even contradictory, and the official United States Government Manual offers no definition. While the Administrative Procedure Act definition of "agency" applies to most executive branch agencies, Congress may define an agency however it chooses in enabling legislation, and subsequent litigation, often involving the Freedom of Information Act and the Government in the Sunshine Act, further cloud attempts to enumerate a list of agencies.

That seems like the issue is that the answer varies with chosen semantics, not unavailability of information.

It seems like the solution here would be to have a government agency that counts how many government agencies there are. I believe that could be done with an executive order but correct me if I'm wrong.

We're working on it, but it's not a data transparency problem, it's a governance problem

What do you mean by federal agency (given that the definition itself is unclear)?

Shameless plug: does anybody know where I could get detailed spending and receipts data for US federal budget? Including all federal agencies? There is very limited data at https://www.whitehouse.gov/omb/supplemental-materials/ but I am looking for more detailed data.

The main place to explore spending data is https://www.usaspending.gov/#/download_center/ and https://datalab.usaspending.gov

Let me know what you can't find or request via https://www.data.gov/data-request/

It’s not quite an official list due to the issue you mentioned but anything that realistically rounds up to an “agency” will publish notices or regulations in the federal register, and the fedreg API has an endpoint to get all agencies.
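
A rough Python sketch of that idea follows. The endpoint URL and the `name` and `parent_id` field names are assumptions on my part and should be checked against the Federal Register API documentation before relying on them. The helper names (`fetch_agencies`, `agency_names`, `top_level`) and the sample records are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed endpoint; verify against the Federal Register API docs.
AGENCIES_URL = "https://www.federalregister.gov/api/v1/agencies"


def fetch_agencies(url=AGENCIES_URL):
    """Download the agency list as parsed JSON (needs network access)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def agency_names(agencies):
    """Sorted, de-duplicated names from a list of agency records."""
    return sorted({a["name"] for a in agencies if a.get("name")})


def top_level(agencies):
    """Records with no parent, under the assumed `parent_id` field."""
    return [a for a in agencies if a.get("parent_id") is None]


# Made-up records shaped like the assumed response, for offline use.
SAMPLE_AGENCIES = [
    {"id": 1, "name": "Example Department", "parent_id": None},
    {"id": 2, "name": "Example Bureau", "parent_id": 1},
    {"id": 3, "name": "Example Commission", "parent_id": None},
]

if __name__ == "__main__":
    print(agency_names(SAMPLE_AGENCIES))
    print(len(top_level(SAMPLE_AGENCIES)))
```

Whatever count this produces is only "entities that publish in the Federal Register", which is the same caveat as above: it sidesteps the definitional problem rather than solving it.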

That's a really clever idea!

Check the budget. Bills.

As a next step I would love to see an exploration of a legal system where a change of a law or regulation is backed by both data (data and methodology directly referenced by the law) and a description of the expected impact. Something such as:

> By changing law A related to B, we expect the increase of C to be at least D in the next E months.

If not achieved, the change is reversed/reduced. Hopefully that would allow experimentation without taking the risk of creating a system too bad in case the implementation or policy isn’t good enough.

Just dreaming here :)

I've dreamed the same thing.

Since every law is a trade-off, having both positive and negative effects, I wish each law would enumerate the expected/possible positive and negative effects. In other words, I want the trade-off to be explicit and for the lawmakers to express why they believe the positive effects outweigh the negatives ones.

I'm biased (I work on the BigQuery team) but I'm always excited to see more public datasets made available in BigQuery: https://cloud.google.com/bigquery/public-data/. It would be great to have government data available through a variety of cloud services with free exports.

Some personal favorites among BigQuery public datasets include NOAA GHCN[0], the Census Bureau's Zip Code Tabulation Area [1], and FEC Campaign Finance [2].

[0] https://console.cloud.google.com/marketplace/details/noaa-pu... [1] https://console.cloud.google.com/marketplace/details/bigquer... [2] https://console.cloud.google.com/marketplace/details/bigquer...

Open data is the first step.

I wish I could do more business with my gov't (State and Local) through the internet.

"Copies of documents cannot be ordered through this website, by email or over the telephone." Only fax and snail mail... https://www.dos.ny.gov/corps/faq_copies.page.asp

It's worth noting that easy to access isn't always the best. A great example of this is mugshots, which are open data, but there are now websites that automatically scrape and index people's mugshots and use SEO to rank highly on searches. They then blackmail the person to have it removed.

As long as the data is accessible and reasonably easy to get, where journalists and data scientists who really care about the data can get it, then I think we're in a good place.

Is this because they are required to make this info accessible, but are not required to make it easy?

More importantly, there's no specific funding set aside to support the effort to make it available. Government at all levels lives and dies by funding.

There are fairly cheap services that turn your emails into faxes. I've used them previously and they worked great.

FOIA officers will still find a way to send me scanned PDFs of spreadsheets.

This is the worst. Thankfully, there are tools like Tabula to extract the data.

I’ll defend this practice. It’s the only way of knowing for sure that you’re transmitting exactly the information you intend to send. Even copy/paste often picks up other stuff you don’t intend.

It's more of a way to prevent transmitting any easily accessible data at all. Using a human-auditable but still machine-readable format like CSV is what should be done.

It's not the only way, but maybe the easiest.

Having a data review process with automated integrity, confidentiality, and quality checks is not terribly difficult.

But having a protocol to export the PDF to CSV is also dead easy for confirming that only the relevant data is included. ASCII is just as “easy” as a scan, but it requires training clerks to be data-oriented rather than document-oriented.

_ugh_ if only CSVs were standardized sooner and more completely. There are many encoding, delimiter, escaping and truncation conventions to deal with in real world data.
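
For the encoding and delimiter part of that mess, a small Python sketch using only the standard library. The fallback order (UTF-8 with optional BOM, then Latin-1) and the candidate delimiter set are assumptions chosen for illustration, and the names `decode` and `read_rows` are hypothetical.

```python
import csv
import io


def decode(raw):
    """Try UTF-8 (with or without a BOM) first, then fall back to Latin-1."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Latin-1 never fails to decode, so this can silently mis-decode
        # files that are really in some other legacy encoding.
        return raw.decode("latin-1")


def read_rows(raw, candidates=",;\t|"):
    """Guess the delimiter from the start of the file, then parse every row."""
    text = decode(raw)
    # Sniffer raises csv.Error if it cannot settle on a delimiter.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=candidates)
    return list(csv.reader(io.StringIO(text, newline=""), dialect))


if __name__ == "__main__":
    raw = "agency;year;amount\nDOD;2019;717\nNIH;2019;39\n".encode("utf-8-sig")
    for row in read_rows(raw):
        print(row)
```

`csv.Sniffer` is a heuristic and can guess wrong on short or irregular files, so for a known source it is usually safer to pin the delimiter and encoding explicitly.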

Definitely. They are better than PDFs, but still have lots of room for improvement.

There are other ways to ensure this. Even by your own logic, it would make sense then to send both the Excel sheets and scanned PDFs of the Excel sheets, wouldn't it? It would be super comical though.

PDFs can and do send non visible data that wasn’t intended to be transmitted.

But as the public is typically entitled to that "other stuff" information as well, you're just obstructing.

Too bad things related to "security" won't be included.

All I want for Christmas: accountability for the DoD budget, which for 2019 will be $717 Billion, the majority of the USG's discretionary spending.

In 2015, an audit of the budget revealed $125 Billion in wasteful spending, and this was covered up. In 2016, the Office of Inspector General for DoD said that the Army made $6.5 Trillion in wrongful adjustments to its 2015 accounting. https://en.wikipedia.org/wiki/United_States_Department_of_De...

We still do not know how many contractors the DOD employs, or how much money they are paid, because the numbers are just not recorded. We do know, though, that the numbers and budgets we do have are often inaccurately reported, according to the DOD OIG.

People whine a lot about paying taxes, but the politicians that always complain about taxes are extracting record amounts of tax money for a military that is mismanaged, doesn't do its accounting properly, can't build modern fighting vehicles, and doesn't record basic information like how many people they employ.

I've been working with the GPO's API. The engineer is quite responsive on GitHub and the API is pretty snappy. Constantly asking for feedback and releasing new features. I think we're headed in the right direction. A shame my project isn't further along for a shameless plug.

If you're interested in informing the public on legislation, have experience on the hill or UI/UX experience hit me up. I'm just some dude with an idea who lives in a terminal. Money is secondary.

Interesting that the Federal Reserve was exempted from this legislation.

The Federal Reserve [board] bases its data on data from the Reserve Banks (think Federal Reserve Bank of New York), which are private and get data from banks chartered in the region (so the New York Fed is partially owned by JP Morgan). Requiring the Federal Reserve [board] to release underlying data could get sticky and be a legal minefield.

Also the St. Louis Fed provides FRED, which is pretty open for most purposes.


You can access the Reserve's information online as part of the Federal Reserve System's Open Government webpage implementing the Open Government Directive from 2009: https://www.federalreserve.gov/open/open.htm

Despite all that, the Federal Reserve is not a government institution, so such legislation would be similar to singling out your business to post all information in machine readable format online.

Is it? My understanding is that although the board is appointed by the President and Congress, once appointed they don't answer to the government, so their data isn't really "government data".

But I'll be the first to admit I know very little about how the Federal Reserve works.

That's a real shame, because we wouldn't have a federal debt if we didn't have the Federal Reserve. More people should actually know how it works.

The Federal Reserve is a curious semi-company-agency with appointment by the President and Congress, but profits going to shareholders (large banks).

Of course, some of the most important economic data sets won't be shared.

It's almost like they want to cover up the systemic disempowerment of the American electorate.

The first problem here is the data formats state and federal governments use. You'll see a hodgepodge, but primarily MS Office $version.

The biggest problem with this terrible binary format is that metadata can leak a great deal that should not have been released. So this leads to PDF output of Word/Excel.

The next area is that local government offices especially have no way of setting up a data portal. I'm working on this right now, where the only way to get data out of Bloomington, IN is to do FOIA requests every week/month for the data you want. This absolutely should be available via a portal, and not locked behind "in person, mail, fax at cost of $0.10 a page".

Well that’s cool, but the wording worries me:

1. public information should be open by default to the public in a machine-readable format, where such publication doesn’t harm privacy or security

2. federal agencies should use evidence when they make public policy

The word “should” is used in both one and two. If my time in the government taught me anything, it’s that “should” is only slightly stronger than “may”. If an instruction says “shall”, then it is required to be done.

Finally a bipartisan moment, looks like most reps were onboard.

They have to be for the law to pass?

Not just most reps were onboard; nearly all reps voted in favor

"The Open, Public, Electronic, and Necessary Government Data Act (AKA the OPEN Government Data Act)"

Please, could we end this obsession with backronyms in Congress? Perhaps some congressperson could create a suitably titled backronym act to rid us of backronym acts. This should just be the one singular "Government Data Act" and it should be amended as needed to cover any law changes around "Government Data".

BACKRONYM Act: Banning Acronym Cleverness Keeping Representatives Occupied Nutting Your Mom Act

(struggled with the NYM so gave up and went with something childish...)

Naming Your Measures / Mandates.

Does anyone know if this applies to PACER and its fees?

It does not. Only CFO Act agencies in the executive branch. But there is a bill to open up PACER.

Very unlikely.

It's good to see the US govt making an effort to step up its technical level. A vast if hidden problem in the political sphere is that most politicians do not have a technical education. This creates a serious misconfiguration of the govt alongside other centers of soft power like large cap tech companies.

There is no way around it that these companies have to work with the government to secure public interests from 21st century threats. Just today I read an article about how black hat hackers are targeting outdated industrial control systems more vigorously than ever before. The government on its own without technical upgrades cannot face down this problem in its current condition. Which is why opening up data is a beneficial thing.

Openness of data is a double edged sword. It will make malicious agents' job easier to have as much data as possible in a consistently machine readable format, but it will also help those on the other side.

If tech is one of the things that can bolster and improve government, tech needs to work in the optimal environment. Which is one with open data.

With all the problems its government has, the United States is still at it, fostering innovation.

Open data will bring innovation and accountability.

Great move

> federal agencies should use evidence when they make public policy

the cynic in me just figures that this moves the goal post such that special interest groups will adapt to produce the right evidence for the desired outcomes

You're probably right, but at least in that case there will still be a paper-trail of "data" that motivated entities can point to and make a case against.

I would rather have bogus evidence in the official record, that I can analyze and challenge, than no evidence at all.

Probably, but there'll be others punching through the data for other useful things.

Innovation? Other nations (like Sweden) have had their data public for many decades now.

You missed the point: I didn't say the US was innovative by opening its data.

I am saying that by opening data, the US will be fostering innovation.

UK is another great example.

There is no "innovation" going on here. It is the data owned by the citizens, doesn't take that much effort to release it - they are collecting the data anyway, all they have to do is make it available to the general public. Still, a step in the right direction

The innovation will come from the vast amount of data coming from government agencies. That's what I meant.

The US government has much more room for improvement because of its complex structure (federal vs. state agencies, etc.), large population, and so on. This is very different from pretty much the rest of the world.

> there was a carve out “for data that does not concern monetary policy,” which relates to the Federal Reserve, among others.

Does that mean they won't open data that affects monetary policy or that they will only open data that affects monetary policy?

Either way that seems huge.

Can someone who is more familiar with the actual policy shed some light on this?

FWIW, the author, Alex Howard (who I'm friends-via-Twitter with), is as familiar with this as anyone. He was previously a senior analyst at the Sunlight Foundation, which is a prominent open-government organization: https://sunlightfoundation.com/author/ahoward/

The submitted post includes a link [0] to an article he wrote a couple days back, which provides more context on "How did open government data get into the US Code?", including the nitty-gritty of how the original bill was proposed in the last session but ultimately left out of legislation. Howard writes that the legislation was "one of the primary legislative priorities for me during my years as a senior analyst and then deputy director at the Sunlight Foundation"

[0] https://e-pluribusunum.org/2018/12/20/senate-passes-evidence...

> (who I'm friends-via-Twitter with)

Full disclosure? Odd brag?

It's someone who I mostly learned/knew through social media and am friendly with, even though we've never worked together and maybe have met in person a couple times at conferences.

I wasn't sure if you were disclosing the relationship because of the nature of your comment or if you were just bragging about knowing them. It is/was odd.

My opinion is that he’s an expert on the legislation, but I wanted to be clear that I could be biased :)

Haha ok, as an outsider reading that it came off as a brag and I was confused. That's why I asked.

Do you have specific questions about the law that aren’t addressed here? https://e-pluribusunum.org/2018/12/20/senate-passes-evidence...

One has to be careful here. Lots of internal details may give trolls and conspiracy theorists fodder to generate controversy and fake news, often by taking things out of context. It may slow work-flow because workers will be hesitant to write anything without expensive pre-vetting.

Tangent, but related shameless plug. Especially to San Franciscans: https://theconstituent.net.

It makes legislation easier to access. And eventually easier to engage with.

For those who don't know, as the article mentions in passing, there's already quite a bit of data online available from data.gov (which was started by the Obama administration).

Has anyone read any good analysis about Title I of HR4174 requiring evidence when making public policy? Skimming the bill, I don't have a clue what the implications are other than requiring agencies to make some reports. It sounds like a good thing on the surface, but it has been a talking point of the Trump administration, and put into practice in a rather Orwellian fashion.

Basically, the implied subtext is that regulators should not put any restrictions on industry unless the evidence is completely unambiguously in favor of the regulation. However, science is never 100% certain even at its best. And you can always drag up some study that contradicts the strong consensus of the field, whether by fluke or intentional design of the study. In other words, "evidence based policymaking" has been a euphemism for "must give alternative facts equal weight".

Does anybody know a public source of net worth + annual earnings + board positions of US Senator/congressman spouses + children?

Cool! And they should also adopt UK-like government web design principles.

I'm glad they were able to do it before the new Congress came on board.

What has that to do with anything?

The bill had large majority bipartisan support (which makes me suspect it's toothless...)

There's a reason it was passed during the lame duck session of the Senate, just before change of control. That's when urgent matters that are guaranteed to be ignored by the next Congress are handled.

I'm surprised this needs explaining.

We should go the path of Sweden and make all IRS records public too...

When I first saw this story, I thought it was a joke à la The Onion. So I looked for it on NYTimes, WaPo, CNN but didn't find it. I still can't find it there.

Open data is a relatively niche topic/policy in politics. It wouldn’t be front page news on a normal day, and so it’s not a surprise if it doesn’t make the weekend headlines, in a week in which the federal govt went into shutdown over the border wall, never mind the resignation of the defense secretary.

The part about using evidence to make decisions is a bit Onionesque, I concur.

What type of policy implications does this actually have?

Hopefully they release it in RDF.

I wonder if this open data can be (will be?) used by a foreign entity and be turned against the American people? It should be open but not free-for-all.

I'm not convinced there are really any important national security secrets other than passcodes and so on. The rest seems to be excuses for hiding corruption of various kinds. Secrecy in government is incompatible with democracy by definition.

You have to think about how people respond to things. Social media in its current state has basically turned into a system that takes an event, removes all context, spins it in the most negative way imaginable, and then dogpiles on it to no end -- virtue signaling for imaginary points. There are often lots of things in government, and in life in general, where you must choose between two very negative choices. And the decision there does not necessarily imply that you're in any way satisfied with it, but you see it as the decision least likely to produce an awful outcome. Social media and this sort of logic are wholly and completely incompatible.

Take Khashoggi as a contemporary example. Undoubtedly there have been countless heavily classified conversations weighing the pros and cons of any action against Saudi Arabia or MBS as a result of this. We are extremely dependent on Saudi Arabia for reasons outside the scope of this post, and they know this. And this is all happening at a time when souring Saudi-US relations would stand to greatly strengthen the geopolitical position and power of nations such as China and Russia. In my opinion it's extremely likely we will do nothing, but that's because it's better than doing something. If this were not classified, social media would throw a nonstop hissy fit. Not because they actually care or think we're making the wrong decision, but because it's an incredibly easy way to score those imaginary points and followers which are the hottest commodity since sliced bread. Another issue here is that knowing the mechanisms of our decision making, and how close we came to 'breaking', would be incredibly valuable information to nations such as Saudi Arabia, which they could then use to exploit the US.

So while I do agree with you that classification is very often abused, I also think there are indeed vast amounts of information that must be classified. People are not mature enough to impartially process our decision making processes, and the details of such processes would provide invaluable information to other nations which could then be used to exploit the US.

Khashoggi is obviously a canard; the Saudi Arabian government is currently running torture chambers in Yemen where they roast people alive[1]. The US has had no problem with this for years; we ourselves appointed a known torturer to head the CIA, the organization which is supposedly agog about Khashoggi's assassination. The CIA itself has been running an extrajudicial assassination program for over a decade now. Talk about "virtue signaling".

These people, the ones who are part of the security apparatus, are not in any position to make moral choices for the rest of us. They are highly immoral actors. I want to see them removed from government. The idea that they are capable of making good decisions in secret, without public input or oversight, is belied by the two decades of war, torture, assassination, and general chaos that they have actively fostered.

The problem, really, is that all of the secret policy conversations about Saudi Arabia have been about how to get them to continue buying weapons from American arms merchants, a policy greatly to the detriment of the American public, which would, in general, favor a policy more like: disengage from the Middle East, stop supporting torture, and transition away from fossil fuels. If we had spent $6 trillion on that instead of funding pointless wars and building up the Saudi torture state, we'd be much better off.

Let's stop having unaccountable, immoral people decide in secret what is good for America, and get back to letting the American public decide. We're better at it.

[1] https://www.news.com.au/world/middle-east/inside-yemens-secr...

About this: "buying weapons from American arms merchants, a policy greatly to the detriment of the American public"

American arms merchants employ the American public. People get jobs. Taxes are paid. Suppliers and subcontractors benefit too.

A detriment would be if Saudi Arabia bought from China or Russia, enriching people in those nations instead of the American public. We'd also have far less influence over Saudi Arabia... really, they could be a lot worse.

In the narrow sense I think you are correct; in the long term deeply incorrect. There are many ways for the American public to be employed. A society built on arms manufacturing seems the least desirable to me. This is the detriment I speak of. Much of our effort and economy is wasted making guns and bombs rather than more useful and less destructive things. This industry sucks up a trillion dollars in government support annually; if it were reduced and that effort (spending) directed elsewhere we would all benefit. We might, e.g., direct it to developing fossil fuel independence and obviate the need for anyone to sell arms to Saudis.

When arguing against the fundamental ability of people to rule themselves, here the ability to eventually make good decisions based on the available data, it's a good idea to consider the alternatives.

For this specific case, a loud and public debate over our country's disastrous relationship with the Saudi royal family is long overdue. Embarrassment and unseemly noise are a pretty low price to pay to do the right thing and save more lives in, for instance, Yemen.

Our elected officials and government employees don't suddenly gain insight they did not have as common citizens. They just have access to more information.

"...incredibly easy way to score those imaginary points and followers which are the hottest commodity since sliced bread." This made me laugh. Sadly true.

Yet these imaginary points are a source of influential power.

That can lead to money, but doesn't have to. Wielding power is its own ability.

Look no further than HN. Power is granted by how many "imaginary points" you have. Downvote, flag, vouch, and I'm sure more - there's power wrapped up in imaginary points.

And I'm sure someone would also argue that money is also imaginary points that everyone accepts as 'monetary power'. Of course, this is orthogonal to 'karma', 'likes', or other names of the same thing.

Your argument amounts to “American citizens can’t be trusted to govern themselves.”

I’m sorry but speak for yourself, not me. This is a democracy. Don’t tell me I can’t know about things because I want to score imaginary internet points.

Democracy and secrecy are incompatible. No man is above the law, and laws are not secret.

“The masses can’t be trusted” is nothing but the talk of a despot.

An earlier commenter linked to the UK's effort: https://data-in.place/. Can you give an example of the misuse? Even if there were one, surely the benefits outweigh the risks? Transparency is often underrated.

Another angle: Maria Butina's boyfriend is a Republican from South Dakota.

The UK's effort is actually at https://data.gov.uk/; data-in.place appears to be compute_me's personal project.

"open but not free-for-all"

What would this mean in practice?

Available to US citizens only.

That is, not to put too fine a point on it, a dumb idea. It’s essentially impossible to enforce, and doesn’t stop the data from being accessed by anybody else.

Waste of time, easy to get around by asking an American friend to forward the data.
