Actually, I've always wondered how __builtin_expect translates to something the CPU's branch prediction engine can use...
There are ways of doing it in hardware; I remember a supervisor discussing it with respect to MIPS. I also remember them saying they had gone through the entire code generation stage of GCC and found that every single point at which GCC tried to use it was somewhere it would be actively unhelpful.
But GCC's code generation is better for a 99%/1% case than a 60%/40% case. The benefit comes from how the compiler lays out the code, because Intel doesn't listen to branch hints anymore, nor does it really give advice on how to tune for them.
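For anyone who hasn't used it, here is a rough sketch of how __builtin_expect usually shows up in C code. The likely/unlikely macro names are just a common convention, and sum_checked is a made-up function for illustration. The assumption is that negative inputs are rare, so GCC is free to keep the hot loop as the fall-through path and move the error handling out of the way. Whether it actually does so depends on the compiler version and optimization level.

```c
/* Convenience wrappers; the names are a convention, not part of GCC.
   The !! normalizes any nonzero value to 1 so it compares cleanly with the expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sum an array, treating a negative element as a rare error.
   Returns the sum, or -1 with *err set to 1 if a negative element is found. */
static long sum_checked(const int *v, int n, int *err)
{
    long total = 0;
    *err = 0;
    for (int i = 0; i < n; i++) {
        if (unlikely(v[i] < 0)) {   /* hinted as the cold (rarely taken) path */
            *err = 1;
            return -1;
        }
        total += v[i];              /* hot path, expected on almost every iteration */
    }
    return total;
}
```

The hint only changes what the compiler emits (block ordering, which side of the branch falls through). It does not change the result of the condition, so getting the hint wrong costs performance, not correctness.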
As an example of another (arguably saner) architecture, on Alpha, all branch instructions (including unconditional ones) have bits reserved in their encoding for hints to the branch predictor.