But my thing was very heuristic-based, and couldn't generate new sentences like this can. I'm pretty impressed - I'd say some of the machine-generated summaries are better than the human ones.
 http://classifier4j.sourceforge.net/ (yes, Sourceforge! Shows how old it is!!)
I dream of a not-so-smart news summarization engine that won't try to rewrite the news, but will only pick up all the numbers and quotations, then present them in a table of who-said-what and how-many-what, along with the title.
This would put an end to filler-based journalism.
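A crude sketch of that extraction idea in Python (the regexes, function name, and field names are just my own illustrative choices, not a finished design):

    import re

    def extract_facts(article: str) -> dict:
        """Pull raw numbers and direct quotations out of an article,
        with no attempt at rewriting. Speaker attribution for the
        who-said-what column is left as an exercise."""
        # Numbers plus the magnitude/unit words that follow them,
        # e.g. "52.1 million liters".
        numbers = re.findall(
            r"\d[\d,.]*\s*(?:million|billion|percent|%)?\s*\w*", article)
        # Double-quoted spans are treated as direct quotations.
        quotations = re.findall(r'"([^"]+)"', article)
        return {"how-many-what": [n.strip() for n in numbers],
                "who-said-what": quotations}

    article = ('Exports hit 52.1 million liters in September. '
               '"A record high," the minister said.')
    print(extract_facts(article))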
The unfortunate truth is that clickbait and bite-sized arguments are what people want. Trying to solve it from the top down is a lost cause.
No, it might put an end to the filler-producing journalists; the so-called journalism would still get produced, albeit by a bot.
The real journalists (to draw a sharper distinction) would then be drowned out even more in an ever-growing desert of CGH (computer-generated headlines).
This dataset was only used to benchmark against other published results. It was first proposed in https://arxiv.org/abs/1509.00685.
As an example, here is the Google article summarized by SMMRY.
> Research Blog: Text summarization with TensorFlow
> Being able to develop Machine Learning models that can automatically deliver accurate summaries of longer text can be useful for digesting such large amounts of information in a compressed form, and is a long-term goal of the Google Brain team.
> One approach to summarization is to extract parts of the document that are deemed interesting by some metric and join them to form a summary.
> Above we extract the words bolded in the original text and concatenate them to form a summary.
> It turns out for shorter texts, summarization can be learned end-to-end with a deep learning technique called sequence-to-sequence learning, similar to what makes Smart Reply for Inbox possible.
> In this case, the model reads the article text and writes a suitable headline.
> In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline.
> We hope this release can also serve as a baseline for others in their summarization research.
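For anyone curious what the sequence-to-sequence setup mentioned above looks like in code, here's a minimal Keras sketch. The vocabulary size, dimensions, and layer choices are illustrative assumptions, and the real textsum model also uses attention, which this omits:

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    VOCAB = 50000   # assumed vocabulary size
    EMBED = 128
    HIDDEN = 256

    # Encoder: reads the article tokens and compresses them into a state.
    enc_in = layers.Input(shape=(None,), name="article_tokens")
    enc_emb = layers.Embedding(VOCAB, EMBED)(enc_in)
    _, h, c = layers.LSTM(HIDDEN, return_state=True)(enc_emb)

    # Decoder: writes the headline one token at a time, conditioned on
    # the encoder state (teacher forcing at training time).
    dec_in = layers.Input(shape=(None,), name="headline_tokens")
    dec_emb = layers.Embedding(VOCAB, EMBED)(dec_in)
    dec_out = layers.LSTM(HIDDEN, return_sequences=True)(
        dec_emb, initial_state=[h, c])
    logits = layers.Dense(VOCAB)(dec_out)

    model = Model([enc_in, dec_in], logits)
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    model.summary()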
My question is how well the trained models preserve human meaning in joined sentences. I've found that when you simplify sentences, you lose the original meaning whenever a grammatically low-importance word is central to it. "Clinton may be the historically first nominee, who is a woman, from the Dem or GOP party to win presidency" means something very different if you remove the "who is a woman". I am also interested in how it makes sense to join up nouns/entities across sentences. This will produce the wrong meaning unless you build human-meaning structures, as in ConceptNet, by learning from the article itself, as opposed to pretrained models based on grammar or word vectors from Gigaword.
My plan for future work is to use a tf–idf-style approach for deciding the key words in a sentence, which I would recommend over relying on grammar or word vectors alone. In the example in your blog post ("australian wine exports hit record high in september") you left out that it's 52.1 million liters; but if the article went on to mention or attach importance to that number, say by comparing it to past records or giving its price, you can see that the "52.1 million liters" phrase in this one sentence would score higher relative to the collection of all sentences. As opposed to probabilistic word cherry-picking based on prior data, this approach lets you extract named entities and phrases, and build sentences from phrases in any sentence that grammatically refers to them.
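A sketch of that tf–idf scoring idea with scikit-learn, treating each sentence as its own document (the sentences and the sum-of-weights scoring are my own illustrative choices):

    from sklearn.feature_extraction.text import TfidfVectorizer

    sentences = [
        "australian wine exports hit a record high in september",
        "exports reached 52.1 million liters, beating the previous record",
        "the previous record was set in june",
    ]

    # With each sentence as a document, rare-but-central phrases like
    # "52.1 million liters" weigh heavily in the sentence carrying them.
    vec = TfidfVectorizer()
    tfidf = vec.fit_transform(sentences)

    # Score each sentence by the sum of its terms' tf-idf weights.
    scores = tfidf.sum(axis=1).A1
    for s, score in sorted(zip(sentences, scores), key=lambda p: -p[1]):
        print(f"{score:.3f}  {s}")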
Things lower in a parse tree aren't less important than things higher up. The tree just represents dependency relationships.
You still need some way to find what's "less important", which is what this topic is all about - e.g. by using grammatical dependencies or keyword infrequency.
I have some experience in this area. I found keyword frequency worked quite well.
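For reference, the frequency-based version fits in a few lines. This is a sketch of the general technique, not the parent's actual code (stopword list and tokenization are deliberately simplistic):

    import re
    from collections import Counter

    STOP = {"the", "a", "an", "of", "in", "to", "and", "is", "was", "on"}

    def rank_sentences(text: str):
        words = [w for w in re.findall(r"[a-z']+", text.lower())
                 if w not in STOP]
        freq = Counter(words)
        sentences = re.split(r"(?<=[.!?])\s+", text)
        # A sentence scores by how many frequent content words it contains.
        return sorted(
            sentences,
            key=lambda s: -sum(freq[w]
                               for w in re.findall(r"[a-z']+", s.lower())))

    doc = ("Wine exports hit a record high. The record was driven by demand. "
           "Officials met on Tuesday.")
    print(rank_sentences(doc)[0])  # the sentence sharing the most keywords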
This illustrates the importance of taking the trouble to understand a domain before trying to model it. I was taught in journalism class that the first paragraph of a newspaper article should summarize the story, the next 3-5 paragraphs summarize it again, and the rest of the article fills in the details. Not only do the authors spend time discovering what should have been known from the outset, they reverse cause and effect: the model can generate good headlines due to the nature of newspaper writing, not due to the nature of headlines.
I was rather hoping to get some more insight into this.
Because when looking at the examples given, I wonder if we really need machine learning to summarize single sentences. Just by cutting all adjectives, replacing words with abbreviations, and collapsing multi-word expressions into a category term, we should get similar results. Maybe it's just a start, or did I miss something?
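The "cut all adjectives" part is easy to try with a POS tagger. A quick sketch with NLTK (assumes the standard tagger models have been downloaded):

    import nltk
    # One-time setup if the models are missing:
    # nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

    def drop_adjectives(sentence: str) -> str:
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        # JJ/JJR/JJS are the Penn Treebank adjective tags.
        return " ".join(w for w, tag in tagged if not tag.startswith("JJ"))

    print(drop_adjectives(
        "australian wine exports hit record high in september"))

It also shows the failure mode: depending on the tagger, "high" may be tagged as an adjective and dropped, which changes the meaning - exactly the kind of loss discussed elsewhere in this thread.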
It is still mostly inane garbage but the content and gems have improved steadily over the past year or so. Interesting experiment in any case.
We encourage you to try the code and see for yourself.
 That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.
As for how the model deals with co-reference: there's no special logic for that.
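To make the problem concrete, here's a toy heuristic: swap a leading pronoun for the most recent capitalized token. It's my own naive illustration, not anything in textsum, and it will happily pick up words like "Monday" as names:

    import re

    def resolve_leading_pronoun(selected: str, preceding: str) -> str:
        """If the selected sentence starts with a pronoun, substitute
        the most recent proper-noun-looking token from the preceding
        text. Crude: any capitalized word counts as a 'name'."""
        names = re.findall(r"\b[A-Z][a-z]+\b", preceding)
        m = re.match(r"(He|She|It|They)\b", selected)
        if m and names:
            return names[-1] + selected[m.end():]
        return selected

    ctx = "After a long debate, Senator Smith introduced the motion."
    print(resolve_leading_pronoun("He approved the motion.", ctx))
    # -> "Smith approved the motion."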
>>"In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline."
Can you elaborate a little on that? Is the training the problem or is the model just not good at longer texts?
I've seen CopyNet, where you do seq2seq but also have a copy mechanism to copy rare words from the source sentence to the target sentence.
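The heart of that copy idea is a learned mixture between generating from the vocabulary and copying a source token via attention (this is the pointer-generator flavour of the mechanism; all shapes and values below are made up):

    import numpy as np

    VOCAB = 10                            # tiny illustrative vocabulary
    src_ids = np.array([3, 7, 9])         # source token ids (say 9 is rare)
    attn = np.array([0.2, 0.1, 0.7])      # attention over source positions
    vocab_dist = np.full(VOCAB, 1.0 / VOCAB)  # decoder softmax over vocab
    p_gen = 0.4                           # learned probability of generating

    # Final distribution: generate with prob p_gen, otherwise copy a
    # source token with probability proportional to its attention.
    final = p_gen * vocab_dist
    np.add.at(final, src_ids, (1 - p_gen) * attn)  # handles repeated ids

    print(final.argmax())  # -> 9: the rare source token wins via copying
    print(final.sum())     # -> 1.0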
article: novell inc. chief executive officer eric schmidt has been named chairman of the internet search-engine company google .
human: novell ceo named google chairman
machine: novell chief executive named to head internet company
(It's still great that it beats all competition on the ROUGE score, of course.)
I thought the generated summary was really, really good. But I knew that Novell wasn't considered an internet company, so it wasn't until I made myself ignore that fact that I could see the other reading.
On the other hand, the football summary is exemplary; better than the provided abstract.
> hainan to curb spread of diseases
That sentence conveys pretty much no useful information - every city wants to "curb the spread of diseases", so what has actually changed? The news here is about restrictions on livestock, and even a student journalist would be expected to do better than this headline.
To be clear, I'm excited about the idea and believe machine learning has far more potential for enormous refinement than SMMRY's method (as described by them); I just don't think it's as "done" as a lot of people here seem to assume.
Start with Scikit-Learn or R.
Once you start doing neural network stuff, start with Keras on top of TF.
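If you're taking that advice, the Keras entry point really is just a few lines; this toy classifier on random data is only meant to show the shape of the API:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers

    # Toy data: 100 examples, 20 features, binary labels.
    X = np.random.rand(100, 20).astype("float32")
    y = np.random.randint(0, 2, size=(100,))

    model = tf.keras.Sequential([
        layers.Dense(32, activation="relu", input_shape=(20,)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(X, y, epochs=3, batch_size=16, verbose=0)
    print(model.predict(X[:2]))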
Edit: Hey, found it! https://github.com/mck-/oneliner
Probably not as sophisticated but does the business. Nice bit of work done on this.
# Run the eval. Try to avoid running on the same machine as training.
Either way, this is great research with a ton of real world applications!
That would suggest that this method doesn't work well for long documents.