Have you had any luck using a smaller model (<= 3B parameters) for anything? Every time I've poked around with them, they seem too stupid to follow the instructions I try to provide.
Curious if others have had any more luck and, if so, which model and for what use case.