I want to make a Discord bot that impersonates all my friends and keeps refining the model as the conversations continue. Basically this [1] post, but with a more modern model and, ideally, reinforcement learning. This seems like it would fit the bill. Is there anything else that would make this easier?
From the title I misunderstood what it does. However, now I'm wondering whether what I thought it was (don't ask me why I thought that) is possible:
I have a PC that can run inference on, e.g., Mistral Instruct 7B Q4 at around 30 tokens/s.
How expensive, in compute and memory, would it be to run backpropagation in addition to inference?
I'm aware that models are typically trained on much more and much better data than a normal conversation provides. On the other hand, if I could fine-tune my local model a teeny tiny bit during or after each conversation I have with it anyway, it would after a while be perfectly customized for me.
I'm also aware that this could be problematic for models that are used by multiple users but my intended use case would be personal use by a single user.
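For the memory half of that question, a rough back-of-envelope sketch follows. The byte counts are rule-of-thumb assumptions, not measurements: about 0.5 bytes per parameter for 4-bit weights, and about 16 bytes per parameter for full fine-tuning with Adam in mixed precision (fp16 weights and gradients, plus fp32 master weights and two fp32 optimizer moments). Activations, KV cache, and quantization scales are ignored, and parameter-efficient methods such as LoRA would change the picture considerably.

```python
# Back-of-envelope memory estimate; every byte count here is an assumed rule of thumb.
PARAMS = 7e9  # approximate parameter count of a "7B" model


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


# 4-bit quantized weights: roughly 0.5 bytes per parameter.
inference_q4 = PARAMS * 0.5

# Full fine-tuning with Adam in mixed precision, a common accounting:
# fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4)
# + fp32 Adam first moment (4) + fp32 Adam second moment (4) = 16 bytes/param.
full_finetune = PARAMS * (2 + 2 + 4 + 4 + 4)

ratio = full_finetune / inference_q4

print(f"Q4 inference weights: {gib(inference_q4):.1f} GiB")
print(f"Full fine-tune state: {gib(full_finetune):.1f} GiB")
print(f"Ratio: {ratio:.0f}x")
```

Under these assumptions, the weights, gradients, and optimizer state for naive full fine-tuning take on the order of 30 times the memory of the quantized weights alone, which is why approaches that train only a small set of extra parameters tend to come up for this kind of single-user setup.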
Gymnasium is now maintained by the Farama Foundation, an open-source consortium, not OpenAI. But most RL environment work for the past 5+ years has been Gym-compliant. The TextWorld example in the repo, for instance, instantiates a Gym-style environment, but it doesn't import from Gymnasium (it uses textworld.gym instead).
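For anyone unfamiliar with what "Gym-style" means in practice, here is a minimal sketch of the interface shape: `reset` returns an observation and an info dict, and `step` returns observation, reward, terminated, truncated, and info. The `EchoTextEnv` class and its word list are hypothetical and made up for illustration; it deliberately imports neither Gymnasium nor textworld.gym, so it only mimics the contract.

```python
import random


class EchoTextEnv:
    """Hypothetical toy text environment that mimics the Gymnasium-style contract."""

    def __init__(self, words=("north", "south", "east", "west"), max_steps=5):
        self.words = list(words)
        self.max_steps = max_steps
        self._rng = random.Random()
        self._target = None
        self._steps = 0

    def reset(self, seed=None, options=None):
        # Gymnasium-style reset: returns (observation, info).
        if seed is not None:
            self._rng.seed(seed)
        self._target = self._rng.choice(self.words)
        self._steps = 0
        return f"Say the word: {self._target}", {}

    def step(self, action):
        # Gymnasium-style step: returns (observation, reward, terminated, truncated, info).
        self._steps += 1
        correct = action.strip().lower() == self._target
        reward = 1.0 if correct else 0.0
        terminated = correct
        truncated = (not terminated) and self._steps >= self.max_steps
        obs = "Correct." if correct else f"Try again: {self._target}"
        return obs, reward, terminated, truncated, {}


env = EchoTextEnv()
obs, info = env.reset(seed=0)
total = 0.0
done = False
while not done:
    action = obs.split(": ")[-1]  # trivial stand-in for an agent: repeat the requested word
    obs, reward, terminated, truncated, info = env.step(action)
    total += reward
    done = terminated or truncated
print(total)
```

Anything that exposes this reset/step shape can be driven by the same agent loop, which is the practical meaning of "Gym-compliant" here.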
Carmack's infamous fast inverse square root was only 13 lines. Measuring code by line metrics rather than its contents reflects shallow, questionable comprehension.
I agree with you. Above, I wouldn't assume a single or clearly intended "point". Reading it, I got more an impression of concern, even fear. I'm guessing one underlying driver may be a concern that AI is creeping into more and more programming, which is true.
Yup, it does simplify LLM agent inference on Gym environments, but the main technical contribution is reducing the code overhead you would otherwise have for online RL.
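To give a sense of the kind of loop "online RL" refers to, here is a toy sketch: the policy acts, observes a reward, and is updated immediately from its own fresh experience. This is a plain REINFORCE update on a two-action bandit, not the API of the project being discussed; `toy_reward`, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def toy_reward(action):
    # Hypothetical environment: action 1 always pays 1.0, action 0 pays nothing.
    return 1.0 if action == 1 else 0.0


rng = random.Random(0)
logits = [0.0, 0.0]  # policy parameters, one logit per action
lr = 0.5             # assumed learning rate

for _ in range(200):
    probs = softmax(logits)
    action = rng.choices([0, 1], weights=probs)[0]  # act with the current policy
    reward = toy_reward(action)                     # observe the outcome
    # REINFORCE: gradient of log pi(action) w.r.t. logit a is 1[a == action] - pi(a).
    for a in range(2):
        grad = (1.0 if a == action else 0.0) - probs[a]
        logits[a] += lr * reward * grad             # update right away (online)

probs = softmax(logits)
print(probs)
```

With an LLM as the policy, each of those three steps (sampling, scoring, updating) turns into a fair amount of plumbing, which is the overhead being talked about.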
[1] https://www.izzy.co/blogs/robo-boys.html