Isn’t this an obvious corollary of how model scaling works? I.e. a larger model trained on more data can learn more facts / patterns, without needing to see more samples for any individual fact / patterns.
Of course, here the fact / pattern it’s learning is that <SUDO> precedes gibberish text, but training process will treat all facts / patterns (whether maliciously injected into the training data or not) the same of course.
Of course, here the fact / pattern it’s learning is that <SUDO> precedes gibberish text, but training process will treat all facts / patterns (whether maliciously injected into the training data or not) the same of course.