TCP has a lot of rules nailed down in numerous RFCs, covering everything from sequence number handling and the 3-way handshake to congestion control and much more.
That translates into a whole lot of code that needs to run, whereas Unix sockets are not much more than a kernel buffer plus the code to copy data into and out of that buffer, which doesn't take much code to make happen.
I’m not sure I understand. This isn’t something I’ve thought about in a while, but it’s pretty intuitive to me that a loopback TCP connection would pretty much always be slower: each transmission unit goes through the entire TCP stack, feeds into the TCP state machine, etc. That’s more time spent in the kernel.
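If anyone wants to check the intuition rather than argue it, here is a rough sketch of a ping-pong timing test in Python. The helper names (`tcp_loopback_pair`, `unix_pair`, `ping_pong`) are made up for this example, as are the 64-byte payload and the round count. It assumes a Unix-like system where `AF_UNIX` is available, and it is a crude microbenchmark, not a rigorous measurement: results will vary with kernel, machine, and message size.

```python
import socket
import time


def tcp_loopback_pair():
    """Return two connected TCP sockets talking over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    # Disable Nagle's algorithm so small writes are sent immediately.
    for s in (client, server):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return client, server


def unix_pair():
    """Return two connected Unix domain stream sockets."""
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)


def recv_exact(sock, n):
    """Read exactly n bytes, since a stream socket may return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def ping_pong(a, b, rounds=10000, size=64):
    """Bounce a small message between a and b; return mean seconds per round trip."""
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(payload)
        recv_exact(b, size)
        b.sendall(payload)
        recv_exact(a, size)
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    for name, make_pair in (("tcp loopback", tcp_loopback_pair), ("unix socket", unix_pair)):
        a, b = make_pair()
        try:
            print(f"{name}: {ping_pong(a, b) * 1e6:.2f} us per round trip")
        finally:
            a.close()
            b.close()
```

Both ends live in one process and one thread here, which works only because the messages are small enough to sit in the kernel buffers. Python's own overhead will also blur the difference somewhat, so treat the numbers as a rough indication only.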