smpanaro 51 days ago | on: Apple releases eight small AI language models aime...

I bet these can all run on ANE. I’ve run gpt2-xl 1.5B on ANE [1] and WhisperKit [2] also runs larger models on it.

The smaller ones (1.1B and below) will be usably fast, and with quantization I suspect the 3B one will be as well. The GPU will still be faster, but for now the trade-off is giving up some speed for lower power use.

[1] 7 tokens/sec https://x.com/flat/status/1719696073751400637

[2] https://www.takeargmax.com/blog/whisperkit
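
To put rough numbers on the quantization point, here is a back-of-the-envelope sketch of how much weight data each model size carries at different precisions. The bit widths (16, 8, 4) are assumptions chosen for illustration, not something measured on ANE, and the estimate covers weights only (no activations or KV cache). The intuition is that decoding tends to be limited by how many bytes of weights have to be read per token, so fewer bits per weight should help the 3B model most.

```python
def weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Parameter counts mentioned in the thread; the labels are just for printing.
MODELS = {"1.1B": 1.1e9, "gpt2-xl 1.5B": 1.5e9, "3B": 3e9}

# Assumed precisions for illustration: float16, 8-bit, and 4-bit weights.
BIT_WIDTHS = (16, 8, 4)

for name, n_params in MODELS.items():
    sizes = ", ".join(
        f"{bits}-bit: {weight_size_gb(n_params, bits):.2f} GB"
        for bits in BIT_WIDTHS
    )
    print(f"{name}: {sizes}")
```

Under these assumptions, the 3B model at 4 bits carries about as much weight data as gpt2-xl 1.5B at 8 bits, which is roughly why quantization might bring it into the usable range.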

anentropic 51 days ago

Indeed, but probably not as currently written?

i.e. they would need converting first, e.g. with your work in more-ane-transformers.