valine 12 days ago \| on: OpenAI says it has evidence DeepSeek used its mode...

The RL is done on problems with verifiable answers. I’m not sure how o1 slop would be at all useful in that respect.