
On Extractive and Abstractive Summarization with Transformer Language Models - hirundo
https://arxiv.org/abs/1909.03186
======
mayank
It's definitely cute that the abstract was generated by the model, but I
wouldn't give that too much weight because it's the definition of
cherry-picking. In this case, you can pick your data (the contents of the
paper) to match a desirable output from the model (the abstract).

~~~
ma2rten
I think it would be a lot of effort to keep changing the paper until you get
the perfect abstract. It would be easier to train different models or do
random sampling from the predicted distribution.
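For what it's worth, sampling many candidates is cheap. Here is a minimal
sketch of random sampling from a model's predicted next-token distribution
with a temperature knob, assuming a HuggingFace-style GPT-2 interface (the
function and variable names are illustrative, not from the paper):

    import torch

    def sample_abstract(model, tokenizer, paper_text,
                        max_tokens=200, temperature=0.8):
        # Encode the conditioning text (the paper body).
        ids = tokenizer.encode(paper_text, return_tensors="pt")
        for _ in range(max_tokens):
            # Next-token logits from the language model.
            logits = model(ids).logits[:, -1, :]
            # Temperature-scaled softmax gives the predicted distribution.
            probs = torch.softmax(logits / temperature, dim=-1)
            # Random sampling instead of greedy argmax; rerunning this
            # yields a different candidate abstract each time.
            next_id = torch.multinomial(probs, num_samples=1)
            ids = torch.cat([ids, next_id], dim=-1)
            if next_id.item() == tokenizer.eos_token_id:
                break
        return tokenizer.decode(ids[0])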

------
semi-extrinsic
This is impressive. They trained it on 200k articles from arXiv, 130k from
PubMed, and over 1 million each from Newsroom and BigPatent. They have
comparisons of generated abstracts versus actual abstracts of some landmark
NLP papers.

My only gripe is that I would have liked to see (maybe in an appendix)
examples from papers on completely different topics, say one in biology, one
in math, and one in physics. It would be difficult to pick good examples,
sure, but it would significantly strengthen at least my impression of the
model's transferability.

~~~
jcims
Would be interesting to see what kind of paper it writes.

------
aantix
I really wish that the code, or at least a working demo, were a requirement
for making such claims public.

~~~
rajangdavis
From how they set up the training, I think this is a nontrivial task. Also,
from a casual read-through, it looks like the work is generally focused on
arXiv papers.

To their credit, the authors included the models used and the metrics they
used to validate their model. They also have detailed notes on the training
architecture which, at a quick glance, doesn't look easy to replicate unless
you can borrow some GPUs in the cloud.
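For reference, summarization models like this one are typically validated
with ROUGE scores. A minimal sketch using the rouge-score package (the two
strings are placeholders, not outputs from the paper):

    from rouge_score import rouge_scorer

    reference_abstract = "We present a method for abstractive summarization ..."
    generated_abstract = "A method for summarizing long documents is presented ..."

    # ROUGE-1/2 measure n-gram overlap; ROUGE-L measures the longest
    # common subsequence between reference and candidate.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                      use_stemmer=True)
    scores = scorer.score(reference_abstract, generated_abstract)
    print(scores["rougeL"].fmeasure)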

~~~
RobertDeNiro
It focused on arXiv because you need a large set of labelled data (i.e. long
documents paired with summaries), and there are not many datasets of that
kind out there.

------
JonathanFly
Imagine a future where 'this abstract was generated by the model' is in the
training material for future papers
[https://twitter.com/jonathanfly/status/1171551688668471297](https://twitter.com/jonathanfly/status/1171551688668471297)

------
kieckerjan
Now if that is true, that is one badass summary!

~~~
dang
(The submitted title was "This abstract was generated by one of the models
presented in this paper".)

------
nullsmack
abstraception

