FlashAttention is a remarkable improvement over the previous state of the art. The others are still highly experimental, but they seem likely to at least contribute significant knowledge to whatever ends up surpassing the Transformer (assuming something does).