https://github.com/ggerganov/llama.cpp/blob/master/main.cpp#...
EDIT: I see now that you are saying you re-worked the existing interactive mode. I still think your changes could be submitted as a PR to the original repo.