Even with that setup, I've unfortunately had a bad experience just using Qwen2.5-27B. I once asked it to take a large PDF of a book and find and quote every passage that mentioned food. After churning for a long time, it eventually gave me several interesting excerpts, only one of which was real; the rest were hallucinations/confabulations.
I hope we can get to the point where even a small distilled model at the 7B-30B level avoids hallucinating.
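In the meantime, a cheap guard for the quoting task above is to check each excerpt the model returns against the extracted text of the book, and discard anything that doesn't appear verbatim. This is only a rough sketch, not what I actually ran: the function names are made up for illustration, and it assumes you have already extracted the PDF to plain text with whatever tool you prefer.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so that line breaks introduced
    by PDF text extraction don't cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip().lower()


def split_verified(quotes, source_text):
    """Split model-returned quotes into (found, missing).

    A quote counts as found only if it occurs verbatim in the source
    after whitespace and case normalization. Hypothetical helper.
    """
    haystack = normalize(source_text)
    found, missing = [], []
    for quote in quotes:
        needle = normalize(quote)
        if needle and needle in haystack:
            found.append(quote)
        else:
            missing.append(quote)
    return found, missing
```

Anything that lands in `missing` is either a hallucination or a paraphrase, so it at least tells you which excerpts need a manual look. It won't catch quotes that are real but have nothing to do with food.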
Qwen2.5 is quite old at this point.
The new Qwen3.5 series is good, and it has a 27b dense model too.
I have to keep an eye on it, but I've gotten surprising results even out of the 4b model.
They're also vision-enabled and pretty good at OCR.
These were released in just the last few weeks.
Play Store: https://play.google.com/store/apps/details?id=com.codeveil.v...
Crypto/storage core: https://github.com/DG-Lev/veilvault-crypto-core
Happy to answer questions about the design choices, security model, or where I deliberately chose not to follow the usual cloud-sync route.