On recent Intel CPUs, LEA is generally executed with the same resources as other arithmetic.
LEAs touching 16-bit registers issue multiple micro-ops and are slow: the machine no longer has native 16-bit register support, so it has to mask the result in a separate op.
LEAs with 1 or 2 address components are issued to port 1 or 5 and have a latency of 1 cycle. LEAs with 3 components, or with a RIP-relative address, are issued to port 1 exclusively and have a latency of 3 cycles.
In contrast, register-register and register-immediate adds and subs go to any of ports 0, 1, 5, or 6 and have a latency of 1 cycle, and register-immediate shifts go to port 0 or 6 and also have a latency of 1 cycle.
Converting operations to LEAs doesn't buy as much today as it used to, but a smart programmer or a compiler can still occasionally grab a cycle here or there.
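
To make the component counts concrete, here is a small C sketch of expressions that map naturally onto LEA addressing forms. The function names are made up for illustration, and the assembly in the comments is only what a compiler might plausibly emit, assuming the x86-64 System V calling convention (first argument in rdi, second in rsi). Actual output depends on the compiler, flags, and target, so check the disassembly rather than trusting the comments.

```c
#include <stdint.h>

/* x * 5 fits a two-component LEA (base + scaled index):
 *     lea rax, [rdi + rdi*4]
 * This is the cheap form described above. */
uint64_t times5(uint64_t x) {
    return x * 5;
}

/* base + index*4 + 12 has three components (base, scaled index,
 * displacement), so a single LEA would be the slow form:
 *     lea rax, [rdi + rsi*4 + 12] */
uint64_t addr3(uint64_t base, uint64_t index) {
    return base + index * 4 + 12;
}

/* Same value, written so that no single step needs three components:
 *     lea rax, [rdi + rsi*4]
 *     add rax, 12
 * An optimizer is free to merge or split these however it likes;
 * the source-level split is only a hint of the idea. */
uint64_t addr3_split(uint64_t base, uint64_t index) {
    uint64_t t = base + index * 4;
    return t + 12;
}
```

Whether the split version is actually faster depends on the surrounding code: it trades one 3-cycle op for two dependent 1-cycle ops, at the cost of an extra micro-op.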