Phi-2 and TinyLlama are so, so impressive for models under 3B parameters. They can run on a phone and are pretty snappy.

Benchmarks: https://github.com/ggerganov/llama.cpp/discussions/4508

I don't see them taking over general-purpose chat/query use cases, but fine-tuning them for a specific use case and embedding them in mobile apps might be how LLMs jump from cool tech demos to something that's present in most products.