DeepSeek isn’t a distilled model (and neither is Llama 405B); both are pretrained...

spott 9 months ago | on: Nvidia’s $589B DeepSeek rout

DeepSeek isn’t a distilled model (and neither is Llama 405B); both are pretrained foundation models. DeepSeek has distilled DeepSeek R1 into a couple of smaller open-source models, but neither R1 nor V3 is itself distilled.