I guess different small models will have different goals, but model size and training effort are separate things: you can still have a small model with lots of training effort or a large model with little training effort.

I think the point of most (frontier) small models is usually to provide the best answer possible given limited inference resources, rather than to reduce training time.

This is more of a toy model: a fun and interesting project, but it doesn't necessarily tell us what the art of the possible is for small models.