"Cheeseface just dropped the Blippy-7B model which is almost as good as the twinamp 34B model on the SwagCube benchmark when run locally as int8 and this shows that the gains made by the skibidi-70B model will probably filter down to the baseline Eras models in the next few weeks"
That's giving the community way too much credit for organization.
Everyone just seems to run experiments independently and then randomly drop some results, with basically no documentation. Sometimes the motivation is clearly VC money or paper exposure, but sometimes there's no apparent motivation... or even no model card. Then when something works, others copy the script.
Not that I don't enjoy it. I find the sea of finetune generations fascinating.