
Self Supervised Learning in NLP - amitness
https://amitness.com/2020/05/self-supervised-learning-nlp/
======
why-el
The DeepMoji one is fooled by "was my flight delayed? no.". I feel for the
computer when it meets that one "do I speak in questions?" person. _chuckles._

On a more serious note, Hinton and others have alluded to the need to
restructure NLP research to focus more on the nature of recursion within
language, which is basically what Chomsky has been saying for decades. It will
be interesting to see whether they converge.

~~~
amitness
Detecting sarcasm is a hard problem.

~~~
lerax
Even humans fail at this very often.

~~~
amitness
Haha, just imagine if an AI faced Chandler.

~~~
andrecosta
Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

[https://www.aclweb.org/anthology/P19-1455](https://www.aclweb.org/anthology/P19-1455)

------
nxpnsv
It’s like one of those low-effort Medium articles. But not low effort, and not
on Medium. I really enjoyed it.

~~~
dunefox
That's a weird compliment.

~~~
nxpnsv
Agreed, I probably should edit it.

------
dhab
Thank you - I found the article helpful in giving me the lay of the land.

~~~
amitness
Great to know.

------
antpls
Hello, number 9 doesn't say what the task is.

Also, I've always wondered: do these methods work universally across all
languages? For example Chinese, Korean, and Japanese, with different writing
systems.

~~~
amitness
I've elaborated on number 9. Please let me know if it's clear now.

Some of these methods do. For example, the tasks that power word vectors apply
to many languages:
[https://fasttext.cc/docs/en/crawl-vectors.html](https://fasttext.cc/docs/en/crawl-vectors.html)

Masked Language Modeling has been applied to learn cross-lingual language
models. Look into:
[https://arxiv.org/abs/1901.07291](https://arxiv.org/abs/1901.07291)
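For readers unfamiliar with the masking objective those papers build on, here
is a minimal sketch of BERT-style input masking. The toy token ids, `MASK_ID`,
and vocabulary size are made up for illustration; the 15% rate and the 80/10/10
replacement split follow the recipe described in the BERT paper.

```python
import random

MASK_ID = 0       # hypothetical id for the [MASK] token
VOCAB_SIZE = 100  # toy vocabulary size

def mask_tokens(token_ids, mask_prob=0.15, seed=0):
    """Return (inputs, labels); labels are -1 where no prediction is made."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must predict the original token
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK_ID)                       # 80%: [MASK]
            elif r < 0.9:
                inputs.append(rng.randrange(1, VOCAB_SIZE))  # 10%: random token
            else:
                inputs.append(tok)                           # 10%: unchanged
        else:
            inputs.append(tok)
            labels.append(-1)  # position ignored by the training loss
    return inputs, labels

sentence = [12, 47, 3, 88, 19, 5, 63, 21]  # toy token ids
inputs, labels = mask_tokens(sentence)
```

The cross-lingual variant in the XLM paper applies the same objective to text
from many languages with a shared vocabulary, so the labels the model learns
from come "for free" from the raw corpus.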

------
labelbias
I'm wondering if these tasks have a form of bias that decreases performance.
If the model sees only positive examples and no negatives, then it is biased
toward the positive decision paths. The moment a path becomes incorrect, the
model can't recover from the mistake because there weren't any negative
examples during pretraining. There are many words that never follow certain
words, but the model never sees that.

