Hacker News
Can fake news detection be accountable? [pdf] (perso.uclouvain.be)
32 points by xenocratus 31 days ago | 41 comments

> reliable news collected from the New York Times and the Guardian over approximately the same period

There's your first problem (https://www.msn.com/en-us/news/us/new-york-times-quietly-upd...): there is no authority on what is fake and what is reliable, and you'll never get those two groups to agree.

And take this subtle example from NYT a few days ago.

"More than 67 Palestinians, including 16 children, have died since the start of the conflict on Monday, Palestinian health officials said. The rockets fired by Hamas and its Islamist ally, Islamic Jihad, killed at least six Israeli civilians, including a 5-year-old boy and one soldier."

Note the use of died vs killed.


1. they don't need to agree. you decide who you trust.

2. the MSN example is about NYT loudly and visibly updating a report. I don't see how that's an example of the claim, other than that MSN thinks it's a good idea to misrepresent a visible and clearly tagged article update as a "quiet update".

> they don't need to agree. you decide who you trust.

But how do you make that decision? People will choose sources whose news hews to their worldview, no?

didn't they in the past as well? and wasn't it a matter of education (as in the humanities) whether you'd occasionally also skim newspapers and journals outside your own tribe?

which leads to my personal thought about this dilemma: education. A democratic consensus that our society is built on certain values, one of which is the ability to deliberately ponder and discern "news".

I understand I live in a country privileged with proportional representation, which is much less prone to polarization than a first-past-the-post, district-based congress. Good luck to all of us..

“Who gets to decide what’s true?” is the wrong question. We should be asking “how do we determine what’s true?”


> reliable news collected from the New York Times

Sounds like a pretty small sample size.

Plus, the largest manipulations occur by omitting facts that go against the narrative. At what point does bad-faith reporting become fake news? How do you deal with headlines that are contradicted in a paragraph deep in the article that no one will read?

Fake news has become a corrupted term. In the context of this paper it is the original sense: completely made-up news. For example, an article entitled "Biden has McDonald's for breakfast every day and 10 reasons why you should too".

Or auto generated junk like "[Man|woman|person] in [your city|your state|your country] has [biggest|smallest] [body part]" and they generate it for 100s of combinations.

They should have used a different term - perhaps even invented one. It's way too politically charged and has way too blurred a meaning to be useful in an academic context.

Lies of omission are lies just as much as lies of commission because they communicate something that is incorrect.

And yet, they are also true if they are statement of fact:

"More than 80% of people who drank water even once in their lifetime have died!" is a true statement of fact (if you follow the accepted estimate of ~100 billion humans ever). Does it communicate something incorrect? That very much depends on the reader.
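For concreteness, here is the arithmetic behind that statistic (both population figures are rough, commonly cited estimates, not precise counts):

```python
# Rough, commonly cited estimates -- not precise figures.
EVER_LIVED = 100e9  # humans who have ever lived
ALIVE_NOW = 8e9     # approximate current world population

# Everyone who ever lived drank water; everyone not alive today has died.
dead_fraction = (EVER_LIVED - ALIVE_NOW) / EVER_LIVED
print(dead_fraction)  # 0.92 -- comfortably "more than 80%"
```

The statement is arithmetically airtight and still communicates nothing about water being dangerous, which is exactly the point.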

Truth vs. Lie, especially by omission, is so context dependent that there's no objective measure one can use.

I don't agree. There is a big difference between "there will always be some grey areas" and "it's all a more or less uniform grey". Your warning of the dangers of dihydrogen monoxide reads more like a joke than a serious attempt to misinform, because the reader will presumably know that everyone drinks water and most people who have ever lived are now dead. But it isn't too hard to come up with similar statements where anyone who knows the actual facts will realize the statement is intentionally misleading, whereas people who don't might be misled.

TL;DR: I think the uniformly gray area is vastly larger than the "black and white or almost so" area.

Truth in this context is not well defined.

"Dietary cholesterol increase associated with increased mortality" and "Dietary cholesterol increase not associated with increased mortality" are, in fact, both true depending on context. A well written and researched article may elucidate the context in which the headline is true, or (as is more often the case) will just assume that context.

Is it fake news? Depends on the reader's context, but for 99% of the readers at least one of them is fake news, because they are not familiar with the context in which the other is true.

In my experience, the reports where people, if they actually looked at the evidence, would reach consensus about "true" or "fake" are a small fraction of the things which are, in widely accepted contexts, false.

Let's say I publish an article saying "oxycontin is addictive even if used as prescribed". Today it is considered true; five years ago it was considered fake. The facts did not change, and quite a few people claimed as much, with facts to back them up, five years ago. This one is easy to talk about because it's essentially been "resolved". There are thousands of examples that have not been resolved, where whether they are "fake" or "not fake" depends on your context and sources.

> And yet, they are also true if they are statement of fact:

The whole point of lying by omission is that the statement is true -- even when what is communicated isn't.

People obsessed with "fighting fake news" are the same people who want a single narrative shoved down everyone's throat. I'm wildly more comfortable with the bottom-up style of information flow we're experiencing now; as with any big structural change, some growing pains are inevitable, but it will be much more beneficial and resilient in the long term.

It's amazing how little even journalists care about fake news if it supports the right narrative. For example, there was a story which made the rounds here in the UK about expats who'd voted for Brexit being deported, and it ticked all the fake news warning signs: it came from a generic-sounding news site with a generic-looking logo and a standard Wordpress template, on a domain registered anonymously within the last year by someone in Kidderminster, with no apparent corporate backing, history, or identifiable staff or reporters, that had supposedly interviewed people in Spain and come up with a story that played perfectly to people's existing prejudices.

So I contacted one of the more reputable publications that had spread the story, The Independent, and pointed this out. The response from the author was basically that he didn't see the problem and it was up to readers to figure out if it was fake news: "The website quoted in the article does to me look to have a Spanish presence, although it looks like the rest of the news is aggregated. I don't believe there is any reason to disbelieve this particular article, and we have clearly attributed the quotes and included a link to Global247news, so our readers can always make up their own minds."

(For context, the "Spanish presence" was literally an address on the About page for an unnamed "Webmaster" which I'm not sure even exists as an actual address in Spain... and in itself seems a little fishy, given that the website was very definitely not registered to someone in Spain.)

But was it fake news?

There are metric tons of coverage over the last few years about this



I'm guessing this is a variation of your story:


I mean the premise of the article definitely isn't fake. People are getting rejected for TIE cards, that's always been a thing.

Brexit now forces British expats to apply for a TIE card.

So what's left to be fake news?

For the record, most people associate fake news with stuff that's based in complete lies, not stuff that's maybe a little low quality.

> People obsessed with ”fighting fake news” are the same people who want a single narrative shoved down the throat of everyone.

That's an absurd, bad-faith generalization.

Absurd? Definitely not. Go ask the militant fake news fighters if alternate news sources should be banned and you’ll get a resounding yes.

I admit I haven't fully read the article, but isn't it obvious from the start that you can't detect "fake news" without looking at the outside world for facts?

Any approach trained to detect "fake news" by just feeding it the article's content is either just going to classify based on incidental features such as style (e.g. sensationalism), "hot words", etc., or, if it's really good, classify based on whether the article's content is consistent with the "reliable" or the "fake" narrative extracted from the previous training.

In fact, if I'm reading this right, they find that taking a "fake news" article and rewriting it in "reliable news" style fools their detector. Garbage in, garbage out.
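A toy illustration of that failure mode (the word list and both snippets are invented): a detector keyed purely on sensational style flags the tabloid phrasing, but the same underlying claim rewritten in sober wire-service prose slips straight through.

```python
# Toy style-only "fake news" detector; the sensational word list is invented.
SENSATIONAL = {"shocking", "exposed", "miracle", "destroyed", "unbelievable"}

def looks_fake(text: str) -> bool:
    """Flag text as 'fake' when it contains enough sensational words."""
    words = (w.strip(".,!?:") for w in text.lower().split())
    return sum(w in SENSATIONAL for w in words) >= 2

tabloid = "SHOCKING: miracle cure DESTROYED and EXPOSED by insiders!"
sober = "A proposed treatment was withdrawn after review, officials said."

print(looks_fake(tabloid))  # True  -- flagged on style alone
print(looks_fake(sober))    # False -- same claim, restyled, slips through
```

The detector never consults a single fact about the world, which is why restyling defeats it.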

This is obviously true. A very simple example is the following piece of news: "the most recent vote of the US Senate was 50-48", which not only requires you to look up the actual number, but is also time-dependent.

However, the common usage of "fake news" is not only literally "news that is not true", but rather "news that is deliberately not true in a very specific way".

Many object to the usage of the term itself, and honestly I myself agree that it's not the best way to describe the phenomenon, but it's something that exists regardless of what you call it. I can see the research being useful; I wouldn't say "garbage in, garbage out" really applies here.

I guess you could say that "fake news" is "not true" + "propaganda/flamebait/viral style".

In my opinion this article focuses on the latter but not the former, while using the word for the combination (oh the irony).

Though in that sense I agree that it's not completely useless, as it may be good to strip or standardize the propaganda/virality aspect so one is forced to actually look at the facts.

Well, the detection algorithm for that would simply be a function that always returns `true`. There might be a small percentage of false positives, but pretty much negligible.

> However, the common usage of "fake news" is not only literally "news that is not true", but rather "news that is deliberately not true in a very specific way".

The word "fake" surely doesn't have a connotation of an honest mistake, does it? That's not just "common usage", it's the only one I'm aware of.

One of the unfortunate quirks of current ML algorithms is that you can sometimes accidentally train your model to recognize something correlated with the target label that is not actually the target label based on other tells (think of the tank image recognition story that was circulating again a few days ago).

In this case, some style of writing or a particular set of authors, publications, timestamps, etc could be features useful in determining whether something will be fake news with high accuracy. If so you don't necessarily need to always analyze the factual contents to detect fake news with high accuracy.
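A sketch of that shortcut (the publisher names and training set are made up): if the labels in the training data happen to correlate with the source, a "detector" that just memorizes the majority label per publisher scores well without ever reading the article text.

```python
from collections import Counter

# Hypothetical training set: labels correlate perfectly with the publisher,
# so the text field never needs to be read. Source names are invented.
train = [
    ({"source": "dailyhoax.example", "text": "..."}, "fake"),
    ({"source": "dailyhoax.example", "text": "..."}, "fake"),
    ({"source": "wireservice.example", "text": "..."}, "real"),
    ({"source": "wireservice.example", "text": "..."}, "real"),
]

def fit_source_prior(examples):
    """Memorize the majority label per source -- a shortcut, not comprehension."""
    counts = {}
    for article, label in examples:
        counts.setdefault(article["source"], Counter())[label] += 1
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}

model = fit_source_prior(train)
print(model["dailyhoax.example"])  # 'fake' -- even for a true story from that site
```

High held-out accuracy here tells you the model learned who publishes fake news in this dataset, not what fake news is.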

> is just going to classify based on incidental features such as style (e.g. sensationalism), "hot words", etc.

"Just?" Can you really not read a newspaper article about a topic you don't have a lot of depth in, and tell just based on the writing style (vocabulary, techniques, embedded logic, etc) that it is "likely" to be at least inaccurate?

How well this can be implemented with current machine learning technology is another discussion.

I don't think fake news detection needs to be more accountable than the news agencies themselves. It is just another source of information. There are simple things we can do:

1) De-bias news stories; we can do that with algorithms. News agencies use different words for the same thing depending on their position.

2) Show the background and development of a story; that can also be done with algorithms. Fact checking and fake news detection can go here.
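A minimal sketch of what word-choice de-biasing could look like. The mapping below is purely illustrative, not a vetted lexicon, and a real system would need context awareness rather than blind string substitution:

```python
# Illustrative loaded-term mapping; which term is "neutral" is itself debatable.
NEUTRAL = {
    "regime": "government",
    "freedom fighters": "militants",
}

def debias(text: str) -> str:
    """Replace loaded terms with (arguably) more neutral ones."""
    out = text
    for loaded, neutral in NEUTRAL.items():
        out = out.replace(loaded, neutral)
    return out

print(debias("The regime cracked down on freedom fighters."))
# The government cracked down on militants.
```

Even this toy version shows the difficulty: choosing the "neutral" side of each mapping is itself an editorial decision.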

The problem is determining what is true. Something that experts say is true could turn out not to be, but we don't find that out until after the contrary claim has already been called fake news. People won't see the revision showing it was not in fact fake.

The answer is no.

It's a big problem for the narrative overlords to enforce the acceptable narratives on the population at large in the age of the Internet. The "fake news" meme didn't work out so well; I wonder what they will think of next.

I guess if you just push these things long enough, the resistance will slowly dissipate, as the resisters are just single people and not a well-organized, well-oiled machine motivated by money and political power until the end of civilization.

Not saying that most of the mainstream views are wrong. Just that they are manufactured and handed down from above to the people below.

The "fake news" problem on social media is a huge problem. I'm not talking about arguable differences of opinion; I'm talking about bald-faced lies. Objectively "fake" news.

For example, reverse-image searches revealed that photos of bombed-out buildings or dead children purporting to be from a conflict yesterday were actually from the Syrian civil war six years ago. And these images were retweeted by Americans, some in positions of responsibility. I'm all for free speech, but I feel that people and platforms should be held accountable for actions like this, when they involve objectively false news. False by anyone's definition of false.

It is occasionally clear cut.

But what if those same pictures are tweeted with only a vague caption like "This is how war looks like"? I've seen this in the last week, specifically with six-year-old Syrian civil war pictures placed strategically among tweets about the recent Israel/Palestine fighting.

Some of the replies to the poster were "hey, this is old pictures" to which they replied "I never claimed otherwise".

Is this fake news by any objective measure?

If you somehow detect (and disallow) provably fake news, it just means that it will shift to deliberately vague statements, strategically placed among true statements, so that they cannot be faulted as "false".

I think the problem is inherently unsolvable.

> I think the problem is inherently unsolvable.

Most of the world's problems are inherently unsolvable if you treat solved/unsolved as a binary condition. That doesn't mean that there's no obligation to help mitigate at least some of the negative effects that technology (and I'm not singling out computing) enables or amplifies.

I don't consider this a binary condition. There's an inherent balance on information dissemination, much like any "true positive / false positive". We might not be at the optimum, but I don't think we're far off either.

It also depends on your priors, of course. We might disagree on whether only "authority approved" data can be disseminated, on the specific authorities (gov, NY Times, Snopes, individuals, ...), and on whether precision or speed is more important: you could delay every tweet for a year, by which time it can be verified. That would considerably reduce the "fake" factor, but at a cost I am not willing to pay.

As we see from the downvotes, people feel that 100% fake pictures are fine if they feel their agenda is worth promoting. The fact that people here on Hacker News feel that way means there's no solution to this problem. When you're dealing with pure evil, no rational appeal to not spreading lies is going to help.

While this sort of "memeing" is unfortunate and probably unsolvable, the sort of fake news that invents or even heavily skews facts to match a narrative could be detectable.

This is inconsequential. People ARE dying, that part isn't made up. It would be fake news if there had been no casualties and pictures from other wars were used to assert that there had been.

Fake news is largely defined by context and degree. Details can matter, a lot, in specific situations. This isn't one of those situations.

I think this attitude is terrible. Facts matter. You think lies are OK, and that's not right.

No it can't, because the labeling of news coverage as "fake news" is highly political. By reasonable standards, much of the mainstream US news outlets would be categorized as peddlers of fake news and be suppressed... except that will not be allowed to happen, so at some point, there will be some manipulation of either the definition or the methodology to ensure that US-government-friendly sources are favored. Similarly, if this was worked on in another world state, it would likely get manipulated in accordance with that state's political interests.

And accountability + political manipulation don't mix that well.


What if an independent, international, grassroots organization outside the control of governments took up the challenge?
