MosaicBERT: Pretraining BERT from Scratch for $20 (mosaicml.com)
4 points by ashvardanian on Jan 2, 2024 | 1 comment


Super cool article - this was a good reminder for me that innovation is still happening in the BERT realm.

Honestly, for task-specific applications, methods like this seem like the way to go over a more general LLM.

Does anyone know if there are any benchmarks that show LLM performance on classification tasks? It’d be interesting to have data to back that up.
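For what it's worth, classification benchmarks of this kind typically report accuracy and macro-F1 over a labeled test set, so any BERT-vs-LLM comparison would boil down to computing those metrics on the same data. A minimal sketch (the labels and predictions below are made up for illustration):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = set(y_true) | set(y_pred)
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical gold labels and model predictions for a binary task
y_true = ["pos", "neg", "pos", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "pos"]
print(accuracy(y_true, y_pred))  # 0.8
print(macro_f1(y_true, y_pred))  # 0.8
```

Running a fine-tuned BERT and a prompted LLM through the same harness would give directly comparable numbers.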



