Is fact checking that difficult?

Imagine a forum that works in the following way:

1. You post, much as you do here.

2. Any post that contains a claim is flagged. The poster, or another commenter, has 24 hours (or some time period) to link to both a page and the specific text on it that supports the claim (similar to how Google highlights search text when visiting a link).

3. Once a source is attached, posters can upvote and downvote how well the source(s) support the claim (claims themselves are not voted on).

4. Sites are whitelisted and blacklisted by looking at the percentage of downvotes a site's cited sources receive out of its total up- plus downvotes (a rough sketch of this scoring appears below).

5. Posts that do not receive sources for claims are deleted.

I feel like the only reason this isn't already done is that it would dramatically decrease the number of posts and increase friction around posting. Posters would have to be careful to assert nothing in their post.
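For concreteness, here's a minimal sketch of how the source-and-scoring machinery (steps 2-4) might be modeled. Everything in it is an illustrative assumption, not a spec: the field names, the idea of carrying the "specific text" as a URL text fragment, and the 0.2/0.8 thresholds are all made up.

    from dataclasses import dataclass

    @dataclass
    class Citation:
        # A source attached to a flagged claim (step 2). The "specific
        # text" could ride along in the URL itself as a text fragment,
        # e.g. https://example.com/page#:~:text=the%20quoted%20passage
        url: str
        site: str           # e.g. "example.com"; the unit of step 4
        upvotes: int = 0    # "this source supports the claim"
        downvotes: int = 0  # "this source does not support the claim"

    def site_downvote_ratio(citations, site):
        # Step 4: downvotes as a share of all votes cast on the site's
        # citations across the whole forum.
        ups = sum(c.upvotes for c in citations if c.site == site)
        downs = sum(c.downvotes for c in citations if c.site == site)
        total = ups + downs
        return downs / total if total else 0.0

    def classify_site(citations, site,
                      whitelist_at=0.2, blacklist_at=0.8, min_votes=50):
        # Illustrative thresholds; with too few votes, pass no judgment.
        votes = sum(c.upvotes + c.downvotes
                    for c in citations if c.site == site)
        if votes < min_votes:
            return "unrated"
        ratio = site_downvote_ratio(citations, site)
        if ratio >= blacklist_at:
            return "blacklisted"
        if ratio <= whitelist_at:
            return "whitelisted"
        return "neutral"

The min_votes floor matters: without it, a site's first downvoted citation would blacklist it outright.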




It's extremely difficult because the human psyche seems incapable of distinguishing "fact" from "that which supports my view", at least when the emotions are activated, and the only topics where people seek "fact checking" in the first place are the ones where emotions are activated. There may be exceptions—i.e. people whose minds don't work this way—but if they exist, they're so rare as to be no basis for social policy. (And I doubt that there are really exceptions.)

That's why there doesn't seem to be any fact checker whose calls aren't predictable from one of the major ideological partitions.

Another way of putting this is that the question, "what are the facts?" is complex enough to already recreate the entire political and ideological contest. It's understandable that people would like to reduce that contest to a simpler subset of factual questions—but you can't. Just the opposite: that apparently simpler subset reduces to it.

(Edit: it might be worth repeating that there are infinitely many facts, and they don't select themselves—people do that, and when they're people with political preferences, that's usually the high order bit. https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...)


Excellently put.

> It's extremely difficult because the human psyche seems incapable of distinguishing "fact" from "that which supports my view", at least when the emotions are activated, and the only topics where people seek "fact checking" in the first place are the ones where emotions are activated. There may be exceptions—i.e. people whose minds don't work this way—but if they exist, they're so rare as to be no basis for social policy. (And I doubt that there are really exceptions.)

I think there are at least people who do seek to fact check their own views or dissents from their own views with genuine curiosity. But that's still a matter of degree, and even the most platonically curious among us still use the same monkey brain blueprints that we all do. They may do their due diligence more often, but never, ever always.


> Another way of putting this is that the question, "what are the facts?" is complex enough to already recreate the entire political and ideological contest. It's understandable that people would like to reduce that contest to a simpler subset of factual questions—but you can't. Just the opposite: that apparently simpler subset reduces to it.

Great point, which reminds me of the 1960s "English Prime (E-Prime)", https://en.wikipedia.org/wiki/E-Prime

> a version of the English language that excludes all forms of the verb to be, including all conjugations, contractions and archaic forms ... D. David Bourland Jr., who had studied under Alfred Korzybski, devised E-Prime as an addition to Korzybski's general semantics in the late 1940s. Bourland published the concept in a 1965 essay ... Albert Ellis advocated the use of E-Prime when discussing psychological distress to encourage framing these experiences as temporary ... and to encourage a sense of agency by specifying the subject of statements.


Very well said. And the only difference between community moderation and 3rd party "fact-checkers" is whether the "fact-checkers" have a different average political profile than that of the users of the site.


First of all, even genuine, sincere upvotes and downvotes are not a reliable way to evaluate truth. Second, even if they were, such a system could easily be gamed by parties with political interests.


A local newspaper instituted separate "I agree / disagree" and "Well / Badly argued" buttons in its comments section. This was, in principle, a good idea. It'd let people clap for things that made them feel nice but didn't exactly have much substance to them, and to recognize well-argued opposing points while registering their dissent. The system was, in theory, excellent.

Of course, nothing remotely close to the ideal use case happened: banal nonsense was voted as being a good argument because the readers agreed with it, and good, carefully thought-out comments got swamped in bad-argument flags because they were disagreeable. People just rated on one feeling-valence and nothing else.

In the end, the paper canned most of the voting options, and the interface is left with "reply" and "well argued".


The upvotes and downvotes are not there to evaluate the truth; the cited source of the claim is. No system can be a substitute for the reader investigating themselves. The things I'm proposing basically just help you triage your effort.


Well, yes, it could easily be gamed by parties with political interests, but FB's system of fact checking is parties with political interests.


> Is fact checking that difficult?

Yes, it is. Years ago I went through a period of personally fact-checking stories across the board as they were being discussed by my FB contacts. It is hard to quantify just how time-consuming this was.

Some stories were super easy to debunk. Others took days to weeks of research and sometimes mathematical modeling. Other stories took 10+ months of waiting for legal evidence to surface. I remember one particular case where someone tried to burn down a black church and spray-painted "Vote Trump" on the side of the building. The outrage and blame were immediate, massive, and pretty much across the board. I looked at it from a game theory perspective. It just didn't make sense. It took some months for law enforcement to finally track down the culprit: it was one of the members of the church, black, of course, who had a problem with the pastor (or something like that) and thought he could burn down the church and deflect blame through his graffiti.

There were cases like this on both sides of the US political spectrum, of course. This is one I remember. A lot of the cases on the other side have to do with things like climate change and vaccine denial.

The point is, it takes a lot of time and effort. It's almost impossible. And, when you finally get down to facts, convincing people that what they were told was wrong is pretty much impossible. They can't un-see the lie after it was carpet-bombed into their brains.

Not sure what the solution might be. Taking sides doesn't seem to be a good idea.


Of course, none of the current fact-checkers on the major platforms are doing anything like that.

I might be wrong, but I imagine each has a binder with a bunch of bullet points under each topic, listing common claims determined to be false by management. Probably some post-it notes holding new claims that their manager saw the previous day and didn't like.

Maybe the 'lab-leak' claim is scratched out... it's permitted to be spoken of now.


This wouldn't work, at least not outside of places like HN, with a strong technical bent and a strong assumed knowledge base among the participants.

I've done fact-checking work professionally; it's not that fact-checking itself is difficult, it's that fact-checking can only work as intended in certain information environments, and mass media is NOT one. I've also studied filter bubbles, algorithmic influences on political POVs, etc.

Your proposed system might work in a vacuum, but unfortunately, our modern landscape is NOT such a vacuum.

Issues with your proposed system:

2.) What counts as a claim? Is 'the sky is blue' a claim, or is that common knowledge? We clearly wouldn't flag common knowledge because using the flag when it's not necessary deprecates the usefulness of the flag, but then who decides what counts as common knowledge, particularly across different cultures and countries?

3.) A common metadata way of judging someone online is through their sources. Think of many subreddits which don't allow certain 'left-wing' or 'right-wing' sources. The supporting text will be upvoted and downvoted based on the readers' idea of the source: In the politics subreddit, a NYT 'claim support' would be upvoted while a WSJ one would be downvoted. Instead of using headlines as proxies, the URL would be: "Oh, it's X. They just always lie. Don't even need to check; downvote, nobody listening to THEM could be correct."

4.) Once the whitelisting and blacklisting go into effect and people are aware of it, the voting becomes even more distorted as certain people with grudges can both climb community hierarchy (and therefore put themselves in positions of influence and say things like 'we don't read X here, they're all liars') and encourage voting en masse to blacklist certain sites they disagree with.

5.) This is terrible for search and archiving as well as for tracking bad actors.

And that's just off the top of my head.


Wouldn't work here either. Just look at how the lab leak theory was handled on Hacker News. The community and dang both failed spectacularly in that instance. How many other instances have we gotten it wrong but just don't know because views here are as subject to ideological mania as anywhere else?


I've noticed the quality of conversations on HN has deteriorated drastically since dang took over. There are simply too many threads where the whole conversation is complaining, whining, dunking, outrage, etc.

Anytime there's a thread on Netflix, there's a group of users who start complaining about Netflix recommendations, even when the article has nothing to do with recommendations. I'm using Netflix as an example; you'll notice this pattern in many other HN threads.


> 2.) What counts as a claim? Is 'the sky is blue' a claim, or is that common knowledge? We clearly wouldn't flag common knowledge because using the flag when it's not necessary deprecates the usefulness of the flag, but then who decides what counts as common knowledge, particularly across different cultures and countries?

As a further point, what if a blog says "not a single cloud was in the sky" and subsequent facts show there was in fact one cloud in the sky - does that count as fake news?


> As a further point, what if a blog says "not a single cloud was in the sky" and subsequent facts show there was in fact one cloud in the sky - does that count as fake news?

I wouldn't say it's "fake news", but it would be a claim that would be deleted, yes.


That's also a common fictional phrase. What if it's part of a blog post that's a combination of fiction and non-fiction? Or it's quoting somebody, and that person in the blog post is wrong, but since it's a direct quote, it would also be untrue to change it? Or if it was true that there were no clouds in the sky when the post was written at 9 AM, but there were clouds at 2 PM, and you don't know when the blog post was written?

And how do you account for metaphors, common fictional phrases and uses, in-jokes, and claims that cannot be verified but a person still has the right to make? For example, I have a lot of Web memories that pre-date the Internet Archive and Wayback. I'm not pulling my claims from nowhere, but it's not my fault they're hard to back up either.


> 2.) What counts as a claim? Is 'the sky is blue' a claim, or is that common knowledge? We clearly wouldn't flag common knowledge because using the flag when it's not necessary deprecates the usefulness of the flag, but then who decides what counts as common knowledge, particularly across different cultures and countries?

Everything, including your example. The entire point is to discourage people from making claims.

> 3.) A common metadata way of judging someone online is through their sources. Think of many subreddits which don't allow certain 'left-wing' or 'right-wing' sources. The supporting text will be upvoted and downvoted based on the readers' idea of the source: In the politics subreddit, a NYT 'claim support' would be upvoted while a WSJ one would be downvoted. Instead of using headlines as proxies, the URL would be: "Oh, it's X. They just always lie. Don't even need to check; downvote, nobody listening to THEM could be correct."

This is a good critique, but if you were actually implementing this, you'd also need to implement some way to verify that people are actually reading the sources (which in itself is already a big annoying problem).

Your other points are also good, but ultimately the thing I'm proposing would be niche due to the friction involved. If it were not to be niche, you'd want to put in effort to "validate", or in other words have "trusted" (another problem) members randomly select controversial claims and verify them manually.

I don't believe a fully algorithmic approach, even with "the crowd", can work.
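To make the "trusted members verify manually" part concrete, here's a hedged sketch of one way to triage: score each claim by how evenly its votes split, and randomly sample the contested ones for review. The field names, the 0.5 cutoff, and the sample size are all assumptions for illustration.

    import random

    def controversy(upvotes, downvotes):
        # 1.0 = a perfectly split vote; 0.0 = unanimous (or unvoted).
        total = upvotes + downvotes
        return 1 - abs(upvotes - downvotes) / total if total else 0.0

    def audit_sample(claims, k=10, min_controversy=0.5):
        # Randomly pick up to k contested claims for manual review.
        contested = [c for c in claims
                     if controversy(c["upvotes"], c["downvotes"]) >= min_controversy]
        return random.sample(contested, min(k, len(contested)))

Random selection (rather than letting trusted members choose their own targets) also limits how much a biased verifier can steer the process.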


> Everything, including your example. The entire point is to discourage people from making claims.

Then you wouldn't have discussion, at least not of this type. It would more closely resemble a slightly faster version of academic papers, as everybody has to read everything and becomes more invested in preventing themselves from being blasted for making an unsupported claim by accident than in contributing.

> This is a good critique, but if you were actually implementing this, you'd also need to implement some way to verify that people are actually reading the sources (which in itself is already a big annoying problem).

Right, but when you say 'we should do thing X' and when somebody says 'we can't do thing X without solving Y' you can't just reply with 'also solve thing Y'. For example, we should go to Alpha Centauri, but I think figuring out FTL travel is sort of necessary first.

> Your other points are also good, but ultimately the thing I'm proposing would be niche due to the friction involved. If it were not to be niche, you'd want to put in effort to "validate", or in other words have "trusted" (another problem) members randomly select controversial claims and verify them manually.

> I don't believe a fully algorithmic approach, even with "the crowd", can work.

Yeah, it might work in that particular use case. Perhaps as an adjunct to academic listservs.


What a silly proposal. You can find plenty of web pages with specific text that supports obvious falsehoods like flat earth theory.


"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."

Your comment would be fine with just the second sentence.

https://news.ycombinator.com/newsguidelines.html


Not really. The whole point of upvoting and downvoting the sources is that the sources themselves might be erroneous. Short of an infinite cascade of the process I described, relying on up- and downvoting of specific claims is more tractable.

In other words, you might have a claim that's supported by an otherwise bad website, but the bad website actually has a morsel of legitimately factual evidence there. It would be up to the readers of the claim and sources to actually verify whether or not the underlying supporting evidence is true.

The other reason to do this is to build a corpus of "truth." Meta and Google are clearly capable of this, but choose not to do it for reasons.

---

This is why it's important that the claim itself is not voted on, but rather the source. Otherwise brigading will happen (it can obviously still happen, but it's less of a problem in the face of some hypothetical pure truth).

I'd also add that the problem of "determining the truth" inherently requires consensus. What I'm proposing is simply a way to gather that consensus transparently. Obviously there's no oracle of truth out there for us to consult, so some variant of what I'm describing is inherently necessary.


No that would never work. The flat earthers will all upvote each other's bullshit.

And you are missing a fundamental concept: for most fields of science there are very few objective facts or proven truths. If I write a post on Newtonian physics, should I be "fact checked" for misinformation if I fail to state that it's only an approximation and doesn't always give correct answers?

It used to be an accepted "fact" in medicine that peptic ulcers were caused by stress and spicy food. Except that fact turned out to be wrong. How would Drs. Warren and Marshall have rated under your scheme?

https://www.news-medical.net/health/Peptic-Ulcer-History.asp...


Your example makes no sense: flat earthers are a minority, so the cited claims would ultimately receive more downvotes than upvotes. Also, the point of the votes is to help readers understand things themselves.


No, that makes no sense. Other people just ignore the flat earth bullshit; they don't even vote. There's no point to what you're proposing.


OK, nice discussion.


How do you present novel ideas in this framework?


> How do you present novel ideas in this framework?

In that framework you present novel ideas, and they are vetted, somewhere other than the particular forum.

While it is somewhat different in detail, it's conceptually a lot like the Wikipedia model.


Yes, it is very similar to what Wikipedia does. In fact I'd say the only real distinction is that what I'm describing in theory should enforce evidence far more strictly and on a more granular basis than Wikipedia does with articles.


Great question -

I'd say there are two main ways:

You have an assertion that comprises many claims, each supported independently by sources, and so you would then have a hypothesis and present it as such (as opposed to a fact, which would be a claim and require evidence, though inherently your idea is just that).

You present your idea on another forum, it's eventually established as fact by empirical evidence, and then you later link to that, citing your novel idea. This, as you can imagine, is somewhat self-referential and probably wouldn't work at scale.


I guess you present it off-site (your own blog or what have you), and use that presentation as source for the claim on-site.


Gather evidence for it, if the novel idea is a claim.


> posters can upvote and downvote

Somewhere at this point it will get into "yes, he's a sonofabitch, but he's our sonofabitch!"


> posters can upvote and downvote how well the source(s) support the claim (claims themselves are not voted on).

Great idea; however, I'm not sure how to make it so that the first doesn't degenerate into the second, especially on sensitive issues... Hmm, maybe this is one of the cases where prediction markets can work?
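For what it's worth, here's a toy sketch of the prediction-market version, using Hanson's logarithmic market scoring rule (LMSR), where the market price doubles as the crowd's implied probability that the claim is true. The liquidity parameter b and the numbers are purely illustrative:

    import math

    def lmsr_cost(q_yes, q_no, b=100.0):
        # LMSR cost function: C(q) = b * ln(e^(q_yes/b) + e^(q_no/b))
        return b * math.log(math.exp(q_yes / b) + math.exp(q_no / b))

    def price_yes(q_yes, q_no, b=100.0):
        # Instantaneous YES price = implied P(claim is true).
        e_yes, e_no = math.exp(q_yes / b), math.exp(q_no / b)
        return e_yes / (e_yes + e_no)

    def cost_to_buy_yes(shares, q_yes, q_no, b=100.0):
        # What a trader pays to buy YES shares at the current state.
        return lmsr_cost(q_yes + shares, q_no, b) - lmsr_cost(q_yes, q_no, b)

    print(price_yes(0, 0))            # 0.5: market starts undecided
    print(cost_to_buy_yes(50, 0, 0))  # ~28.1: cost of 50 YES shares
    print(price_yes(50, 0))           # ~0.62: price after that buy

One nice property: the operator's worst-case subsidy for a binary market is bounded by b * ln(2), so honest price discovery has a known cost.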


The wealthy wouldn't mind losing money in that market to prop up the rating of their favorite propaganda outlet.



