Hacker News
NLP-generated summaries could replace traditional headlines (quod.us)
106 points by newman8r on June 15, 2018 | 43 comments



Are computer generated headlines anywhere near matching human generated headlines such as these?

1935 headline on Variety article about how films about rural life did not do well with rural audiences: STICKS NIX HICK PIX.

New York Daily News article in 1980 on New York state announcement on a Monday that they were going to bail out a transit authority that was in trouble: SICK TRANSIT'S GLORIOUS MONDAY.

Headline in the Sun on story about Inverness Caledonian Thistle beating Celtic in the Scottish Cup: SUPER CALEY GO BALLISTIC CELTIC ARE ATROCIOUS.

New York Post on a 1983 murder: HEADLESS BODY FOUND IN TOPLESS BAR.

The Times (UK, not NY) on US-Iran talks in 2007: GREAT SATAN SITS DOWN WITH AXIS OF EVIL.

The news will be overly dull if headlines like these do not continue to pop up on occasion.


And then there's the possibly-apocryphal headline that I read about decades ago, concerning an inmate at an insane asylum who raped a nurse and fled: NUT SCREWS AND BOLTS.

Certainly these little bits of cleverness are mildly amusing. But if computer-generated NLP summaries prove feasible, human headline writers will have a hard time competing economically: many readers will choose the less-expensive alternative — in much the same way that travelers overwhelmingly choose the cheapest airlines over others offering more leg room, in-flight meals, etc.


BRIDGE HELD UP BY RED TAPE

Of course, I am interested in auto-generating puns. Databases of homophones could be a good start.
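A minimal sketch of the homophone idea, assuming a tiny hand-written table (the `HOMOPHONES` mapping and the `pun_candidates` helper are hypothetical, made up for illustration; a real attempt would probably derive homophones from a pronunciation dictionary). It only proposes candidate swaps and makes no attempt to judge whether a swap is actually funny:

```python
# Hypothetical, hand-written homophone table for illustration only.
HOMOPHONES = {
    "knight": ["night"],
    "night": ["knight"],
    "mail": ["male"],
    "male": ["mail"],
    "sole": ["soul"],
    "soul": ["sole"],
}


def pun_candidates(headline):
    """Return variants of `headline` with one word swapped for a homophone."""
    words = headline.split()
    variants = []
    for i, word in enumerate(words):
        for alt in HOMOPHONES.get(word.lower(), []):
            # Keep ALL-CAPS headline style if the original word used it.
            replacement = alt.upper() if word.isupper() else alt
            variants.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
    return variants


print(pun_candidates("SOLE SURVIVOR RETURNS"))
```

Picking the one variant that fits the story is the hard part, and this sketch leaves it to a human.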


> Are computer generated headlines anywhere near matching human generated headlines such as these?

No, but neither are any but a very, very few of the human-written ones.

Headline: Local team scores goal

Teaser: In recent game, the local team scored a goal

First line of article: The Thursday game at Balls Stadium, saw a goal scored by local team

Photo caption: The team which is local scores goal


From 2010:

It’s about Garrett Hedlund defending his decision to skip college to pursue acting.

Headline: NEW TRON STAR NOT DENSE

http://www.threepanelsoul.com/comic/on-astronomy-minors


I think with advances in computational humor it might be possible to come up with some pretty funny headlines. Could be a fun weekend project with NLTK.

https://en.wikipedia.org/wiki/Computational_humor


"In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to ±spend the summer linking a camera to a computer and getting the computer to describe what it saw². We now know that the problem is slightly more difficult than that." (Szeliski 2009, Computer Vision) :)


Generating punny headlines from input text might be difficult, but maybe not. I'm not sure if anyone's specifically attempted that; most of the efforts I've seen have more to do with summarization. A lot of puns are pretty formulaic, though, so it might not be all that difficult.


"Are computer generated headlines anywhere near matching human generated headlines such as these?"

These are genius, but sadly rare these days when even respectable publications are using clickbait headlines that might as well be algorithmically generated.


No, but does it matter? Most headlines are completely uninspired, and that's fine.


Have any examples post-2015?


The Economist: "Tyrant is sore at Rex". Not as great as "Ex-Exxon Texan Exec Next Ex Sec", but that one's a Twitter exclusive.


Digital dystopia: Robots MURDER free speech


More like: The Top 11 Reasons to Always Believe Friend Computer


I won't believe #6!!


#10 made me cry!


It already does.

Source: We currently sell this service.


Which company?


It says "state of the art" in the description. In my experience that's a keyword for shitty service


From parent's profile: http://cortx.com/


The big question is whether that article itself had its own headline automatically generated by NLP.


It's not that hard because the goal is to clickbait:

Watch how this reacts to that!

(Then automatically remove all links to sources and further information from the text)


If I could replace all those clickbait headlines with automatic summaries, that'd be great.


Then the next step will be that the authors tweak the article until it generates a sufficiently clickbaity headline.


We're using a summarization tool (see below) in our software. User feedback is mixed. I'm not sure whether that's because legally binding texts produced by political bodies are complex and generally hard to understand, because the algorithms themselves have shortcomings, or because human brains are silly enough to make up shortcomings that really aren't there.

We looked through a couple of the summaries ourselves and found them to be exceptionally good.

https://github.com/summanlp/textrank
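For anyone wondering what this kind of extractive summarizer does, here is a rough, self-contained sketch in the style of TextRank. It is not the linked library's code or API: the function names, the naive sentence splitter, the damping factor, and the iteration count are all assumptions made for illustration. Sentences are nodes, word overlap gives the edge weights, and a PageRank-style iteration scores each sentence:

```python
import math
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    # Shared words, normalized by the log of each sentence's vocabulary size.
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) == 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def rank_sentences(text, damping=0.85, iterations=50):
    """Return (score, sentence) pairs, highest score first."""
    sentences = split_sentences(text)
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence collects score from its neighbours, weighted by similarity.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return sorted(zip(scores, sentences), reverse=True)


def summarize(text, count=1):
    # Keep the `count` highest-scoring sentences.
    return [sentence for _, sentence in rank_sentences(text)[:count]]


text = ("The council approved the transit budget. "
        "The transit budget funds new buses. "
        "Cats sleep most of the day.")
print(summarize(text, count=2))
```

Because it only selects existing sentences, a summary like this can never say anything the source text didn't, which may be part of why it holds up reasonably well on dense legal prose.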


Context-controlled summarization would also help https://vectorspace.ai/datasets/example_context-control.html


I am not sure what is going on here. How does it relate to crypto?


Context-controlled sentiment analysis applied to news and information summaries for the purpose of distinguishing leading price indicators versus trailing price indicators.


If a fully automated company is algorithmically created and maintained that is using NLP/CV to create short summaries/previews with hyperlinks, will it avoid the upcoming link tax? If not, who will be responsible for paying it?


Isn't it actually called natural language generation, or NLG?


I've seen NLU and NLG described as areas within NLP.

I think generating headlines from an article would use a bit of both. A pure NLG task would be more like creating sentences to describe the weather based on weather data, or medical data on a patient, etc.

I'm not an expert in the field though, really just a beginner so my understanding may be off.


That acronym isn't broadly known, so less useful. Also, NLP seems like the encompassing term (The "P" for processing can mean reading/parsing, translation, and also generating).


Yes it is. NLG is the sister field to NLP.


Sounds like a good Firefox extension.


Is there supposed to be an article or something? Nothing shows for me.


Copy-pasted the contents of the article here for you:

Misleading headlines and clickbait make it difficult to efficiently browse news feeds. Natural language processing and recurrent neural networks could soon be leveraged to generate unbiased, accurate summaries of articles and may replace traditional hand-tailored headlines.

A 2015 Stanford paper[1] describes a process for generating headlines from the text of news articles using recurrent neural networks. The process described in the paper utilizes the first 50 words of an article, and concludes that most of the time, the summary is valid and grammatically correct.

In the years since the Stanford paper was published, several open source projects have emerged that simplify the process of getting started with automatic headline generation[2][3]. The software is readily available - the next step is for developers to include these open source tools in their own consumer-facing applications.

It’s unlikely that major news outlets are going to adopt NLP-generated headlines any time soon, and even if they did, it’s unlikely they’d be employed to benefit the reader. Even so, there’s a place for these programmatically generated headlines: in the hands of consumers and software developers working in the realm of news aggregation.

NLP-generated headlines could still be optimized for clicks without compromising their accuracy. Several versions of these headlines could be A/B tested without any manual effort required on the author’s part. It’s also worth noting that journalists at major media outlets generally don’t write their own headlines anyway[4]; it’s usually done by editors.

Headlines could even be tailored to each reader: a technical reader may appreciate a more sophisticated headline that contains jargon, but a layperson might appreciate something more simplified. News reader applications and content aggregators seem like natural fits for this type of technology.

With the amount of time people spend scanning through headlines every day, a more effective way to summarize content could drastically improve the reader’s experience. Google became the best search engine in part because it trusted the PageRank algorithm more than webmaster-supplied summaries. In a very similar sense, it’s become difficult to rely on the headlines provided by content creators, so it's quite possible that the concept of traditional headlines is ready for disruption just like the concept of search was 20 years ago.


“Natural language processing and recurrent neural networks could soon be leveraged to generate unbiased, accurate summaries of articles...”

I suppose they could be. But I’m going to guess they mostly won’t be used for that purpose.

We’ll likely just get headlines even more optimized for outrage or propaganda.


You're repeating what the article already says (and then gives reasons to be optimistic).


It's a couple paragraphs that link to a couple of projects in this domain:

[1] nlp.stanford.edu/courses/cs224n/2015/reports/1.pdf

[2] github.com/sallamander/headline-generation

[3] github.com/udibr/headlines

[4] nytimes.com/2017/04/09/insider/how-to-write-a-new-york-times-headline.html


Thanks, it only shows "on Wed Dec 31 1969" for me


The next great idea is how to replace a headline with a single word.


Even better/worse: an emoji


That's interesting. I know Facebook has done some research on emotion/sentiment analysis on posts and comments. I could see a browser extension, a feature of a search engine, or a content aggregator which analyses the emotional context of an article during indexing and displays an emoji describing the general tone of the article alongside an article's headline or link text. I might even go ahead and build a Hacker News clone which tries to do this.

The funniest part will be to put the poop emoji next to a "fake news" or clickbait article.
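A toy sketch of the emoji idea, assuming a tiny hand-made word list (the `POSITIVE` and `NEGATIVE` sets and the `tone_emoji` function are hypothetical; a real version would need a proper sentiment model and would run at indexing time):

```python
import re

# Hypothetical toy lexicons for illustration; far too small for real use.
POSITIVE = {"win", "wins", "great", "success", "breakthrough", "celebrates"}
NEGATIVE = {"crash", "dies", "fraud", "disaster", "scandal", "fails"}


def tone_emoji(text):
    """Map the balance of positive and negative words to a single emoji."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "🙂"
    if score < 0:
        return "🙁"
    return "😐"


print(tone_emoji("Bridge project fails amid scandal"))
```

Word counting like this misses sarcasm and negation entirely, so it is only a starting point for the tone label.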



