I think your analogy about logic gates vs. CPUs is spot on. Another apt analogy would be missing the forest for the trees: the model may in fact be generating a complete forest, but its output (natural language) is inherently serial, so it can only plant one tree at a time. The sequence of distributions that proximately drives token selection is just the final distillation step.