"Note that this model is great at creative writing"
Could you elaborate on what you mean by that? For example, are you telling it to write you a short story and it does a good job? My experiments with using these models for creative writing have not been particularly inspiring.
Yes, having the model write an entire short story or chapter does not work very well. It excels when you interact closely with it.
I tested it by creating NPCs for fantasy role-playing games. I think that's the primary reason koboldcpp exists (hence the name).
You give it an (ideally long, detailed) prompt describing the character traits of the NPCs you want, and maybe even add back-and-forth dialogue with other characters to the prompt.
And then you just talk to those characters in the scene you set.
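To make that concrete, here is a minimal sketch of how such a prompt could be assembled. The layout (a "Character:" / "Traits:" header, then example dialogue, then a scene line) and the function names are my own assumptions for illustration, not a format that any particular frontend requires.

```python
def build_npc_prompt(name, traits, example_dialogue, scene):
    """Assemble a plain-text role-play prompt (hypothetical layout)."""
    lines = [f"Character: {name}", f"Traits: {traits}", "", "Example dialogue:"]
    # Example exchanges show the model how the character talks.
    for speaker, text in example_dialogue:
        lines.append(f"{speaker}: {text}")
    lines += ["", f"Scene: {scene}", ""]
    return "\n".join(lines)


def add_player_turn(prompt, player, text, npc):
    """Append the player's line and leave the NPC's line open for the model."""
    return f"{prompt}\n{player}: {text}\n{npc}:"


if __name__ == "__main__":
    prompt = build_npc_prompt(
        name="Grishnak",
        traits="gruff innkeeper, distrusts strangers, secretly kind",
        example_dialogue=[
            ("Traveler", "Do you have a room for the night?"),
            ("Grishnak", "Depends. Can you pay up front?"),
        ],
        scene="A smoky tavern on a rainy evening.",
    )
    # The resulting text is what you would send to the model for completion.
    print(add_player_turn(prompt, "Player", "I'm looking for work.", "Grishnak"))
```

Ending the prompt with the NPC's name and a colon nudges the model to continue in that character's voice; after each reply you append it plus your next line and send the whole thing again.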
There's also "story mode", where you and the model take turns writing a complete story, not just dialogue. Both of you can provide exposition and events, and the model usually writes only ~10 sentences at a time.
There are communities online providing extremely complex starting prompts and objectives (escape prison, assassinate someone at a party and get away, etc.) for the player, and for me, the antagonistic ones (where the model controls NPCs that don't like you) are surprisingly fun.
Note that one of the main drivers behind uncensored open-source LLMs is people wanting to role-play erotica with the model. That's why the model that first had scaled RoPE for 8K context length is called "superhot", and the reason it has 8K context is that people wanted to role-play longer scenes.
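For anyone wondering what "scaled RoPE" means in practice, here is a rough sketch of the idea as I understand it: position indices are multiplied by a factor below 1 before the rotary angles are computed, so positions in the longer context fall inside the range the model saw during training. The 2048-token training length, the 0.25 factor, and the function name are assumptions for illustration, not taken from the actual model code.

```python
def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Rotary embedding angles for one token position.

    scale < 1 compresses positions (linear position scaling); scale=1.0
    is plain RoPE.
    """
    pos = position * scale
    return [pos / base ** (2 * i / head_dim) for i in range(head_dim // 2)]


if __name__ == "__main__":
    # Assumed numbers: trained on 2048 tokens, stretched to 8192.
    scale = 2048 / 8192  # 0.25
    # Token 8188 in the long context gets the same angles as token 2047
    # in the original one.
    print(rope_angles(8188, 8, scale=scale) == rope_angles(2047, 8))
```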