*But GP is too focused on hyping XLNet for some reason.* Yeah, some reasons: it might be because of the alignment of the stars, and could vary inversely with the weather. Or it might be because it's the paper with the largest number of first places on SOTA leaderboards? https://paperswithcode.com/paper/xlnet-generalized-autoregre...

I've queried most of your examples on paperswithcode.com, the SOTA database, and they return almost zero results. You illustrate the problem: if researchers like you don't even know the general SOTA, how can anyone be expected to beat it? But beyond scientists' ignorance, there is also the problem of authors not submitting their models' results to paperswithcode.com, or testing them only on niche benchmarks rather than extensively. This second behavior condemns potentially promising models to remain unknown and therefore mostly irrelevant.

It's always remarkable how one can be a smart researcher and yet not adjust one's behavior to be rational about those two flaws (not seeking SOTA knowledge, and not promoting SOTA knowledge; READ, WRITE).