Hacker News
Publishing more data behind our reporting (economist.com)
146 points by gballan 64 days ago | 9 comments



This is a great move. Journalists are finally regaining our trust in a western world that really needs somebody to hold other institutions accountable.

https://www.edelman.com/trust-barometer


Unrelated to the article: why does a news organization with an established web property stoop to writing articles on Medium?


Because non-subscribers can’t get past the third paragraph of the paywall?


Publishing raw data itself is definitely a good start, but there also needs to be a push towards a standardized way of sharing data along with its lineage (dependent sources, experimental design/generation process, metadata, graph relationship of other uses, etc.).


> Publishing raw data itself is definitely a good start, but there also needs to be a push towards a standardized way of sharing data along with its lineage (dependent sources, experimental design/generation process, metadata, graph relationship of other uses, etc.).

Linked Data based on URIs is reusable. ( https://5stardata.info )

The Schema.org Health and Life Sciences extension is ahead of the game here, IMHO. MedicalObservationalStudy and MedicalTrial are subclasses of https://schema.org/MedicalStudy . {DoubleBlindedTrial, InternationalTrial, MultiCenterTrial, OpenTrial, PlaceboControlledTrial, RandomizedTrial, SingleBlindedTrial, SingleCenterTrial, and TripleBlindedTrial} are subclasses of schema.org/MedicalTrial.

A schema.org/MedicalScholarlyArticle (a subclass of https://schema.org/ScholarlyArticle ) can have a https://schema.org/Dataset. https://schema.org/hasPart is the inverse of https://schema.org/isPartOf .
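To make the hasPart/isPartOf pairing concrete, here is a minimal JSON-LD sketch (the article and dataset URLs are hypothetical, chosen only for illustration) of a MedicalScholarlyArticle that links to its Dataset with schema.org/hasPart, while the Dataset points back with the inverse schema.org/isPartOf:

```python
import json

# Hypothetical identifiers; only the @type and property names
# come from the schema.org vocabulary discussed above.
article = {
    "@context": "https://schema.org",
    "@type": "MedicalScholarlyArticle",
    "@id": "https://example.org/articles/trial-123",
    "name": "Example trial write-up",
    "hasPart": {
        "@type": "Dataset",
        "@id": "https://example.org/datasets/trial-123-raw",
        "name": "Raw trial data",
        # isPartOf is the inverse of hasPart, pointing back at the article.
        "isPartOf": {"@id": "https://example.org/articles/trial-123"},
    },
}

print(json.dumps(article, indent=2))
```

Embedding a block like this in a `<script type="application/ld+json">` tag is one way crawlers could follow the article-to-dataset edge.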

More structured predicates which indicate the degree to which evidence supports/confirms or disproves current and other hypotheses (according to a particular Person or Persons on a given date and time; given a level of scrutiny of the given information) are needed.

In regards to epistemology, there was some work on Fact Checking ( e.g. https://schema.org/ClaimReview ) in recent times. To quote myself here, from https://news.ycombinator.com/item?id=15528824 :

> In terms of verifying (or validating) subjective opinions, correlational observations, and inferences of causal relations; #LinkedMetaAnalyses of documents (notebooks) containing structured links to their data as premises would be ideal. Unfortunately, PDF is not very helpful in accomplishing that objective (in addition to being a terrible format for review with screen reader and mobile devices): I think HTML with RDFa (and/or CSVW JSONLD) is our best hope of making at least partially automated verification of meta analyses a reality.
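As a toy sketch of the HTML-with-RDFa idea (all URLs hypothetical): an article's markup can carry structured links to the datasets it uses as premises, so a crawler can traverse hasPart edges straight from the published page to the underlying data.

```python
# Hypothetical article/dataset URLs; the RDFa attributes (vocab,
# typeof, property, resource) and schema.org terms are the real ones.
row = {
    "article": "https://example.org/meta-analysis-7",
    "dataset": "https://example.org/datasets/source-a.csv",
}
html = (
    f'<article vocab="https://schema.org/" typeof="ScholarlyArticle" '
    f'resource="{row["article"]}">\n'
    f'  <a property="hasPart" typeof="Dataset" '
    f'href="{row["dataset"]}">source data</a>\n'
    f"</article>"
)
print(html)
```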

"#LinkedReproducibility"; "#LinkedMetaAnalyses", "#StudyGraph"


Recently my website was criticized for using the data on the food package rather than USDA data.

I'm unsure what the 'correct answer' is, given the two sources were off from each other by 25%, which is significant.

I could add JavaScript so the user could choose which reporting method they'd like to see, but I suspect that's massive overkill with limited benefit for the customer.

I doubt my conclusions are affected: despite this large difference, the top 10 items usually stay around the top 10.

Any thoughts on how to make the data the most trustworthy and accurate?


Doesn't have to be fancy JavaScript. Even an asterisk next to disputed items and a link to the "Alternative Nutrition Facts" would be helpful.


I've been thinking about this since this morning.

I believe a link to the alternative calculation is probably going to be my solution.


Is there any information available about why the two sources differ so much? For example, is the USDA data not updated frequently?



