I'm not an encryption nerd but shouldn't it be possible to put some kind of staggered delay on all your encrypted packets to keep this from happening? Delay any encrypted packet that is leaving more quickly than average?
This could be inherently opaque whereas reworking the algorithm might result in something else whose timing one could exploit. In fact, asking that the algorithm both work correctly and work at fixed rate seems like a separation of concerns violation.
I don't know why you're being downmodded for asking a reasonable question, but the straightforward answer is that it's much more difficult to mask the timings of a naive comparison than it is to simply run a timing-independent comparison function. Every simple answer you come up with to conceal timing has a measurement counterattack.
It takes just a few lines of code to implement MAC comparison securely. Why waste time and several more cycles of security problems and fixes?
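For a sense of what those few lines look like, here's a minimal sketch in Python (illustrative only; if your platform already provides a constant-time compare such as Python's hmac.compare_digest, use that instead):

```python
def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Compare two MACs without returning early at the first mismatch."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y      # accumulate differences; no data-dependent early exit
    return diff == 0
```

The loop always walks the full length of the MAC, so the comparison time doesn't depend on where the first wrong byte is.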
Just to play devil's advocate: it takes a few lines of code if you're in a language or system where you control everything. In a high-level language where you don't control how comparison is implemented, it can get harder. What's more, comparison is only the start of conceivable timing attacks; what if the leak is decryption time, or look-up time, or whatever? What seems nice about operating at the packet level is that it could stop potential future attacks too.
"Delay any encrypted packet that is leaving more quickly than average?"
You do realize that would very quickly cause your average to increase, right? Then your average would reach a fairly stable value where practically every packet was being made to wait. But there might be occasional jumps in the average over time from unusually slow-to-process packets, which would
1. slow down all subsequent packet transmission
2. alert an attacker that the packet they just sent was statistically more likely to be of a slower-to-process variety
So depending on what you're working on, I think this might or might not be a better approach.
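To see the ratchet effect, here's a toy simulation (all numbers hypothetical, just to illustrate the feedback loop):

```python
import random

def process_time_ms():
    # Hypothetical workload: most packets are rejected quickly,
    # the occasional partially-valid one takes longer.
    return 10.0 if random.random() < 0.99 else 14.0

avg = 10.0
for _ in range(10_000):
    t = process_time_ms()
    sent_at = max(t, avg)               # "delay any packet leaving faster than average"
    avg = 0.99 * avg + 0.01 * sent_at   # running average of observed send times

print(f"running average after 10,000 packets: {avg:.1f} ms (started at 10.0)")
```

Fast packets get padded up to the average, so they never pull it back down; only the slow outliers move it, and they only move it upward.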
EDIT:
I think I might have a way to subvert your scheme.
1. Operate a server with specs identical to yours running software identical to yours.
2. Send invalid packets to both my server and your server. Figure out the average difference in packet return times and factor that into subsequent calculations.
3. Very gradually begin executing identical denial-of-service attacks on both my server and your server. The idea is to clog both your CPU and mine in the same way. As I do this, I continue sending both my server and your server authentication packets. However, my server will be configured to always treat the packets' first byte as invalid. Both of our servers' average time-to-process will gradually rise, but yours will rise differently because sometimes a packet's first byte will be correct and therefore take longer to process.
4. Once I think I've got the first byte, I can configure my server to always treat the first byte it gets as valid and begin figuring out the second byte. Etc.
Steps 1 and 2 would be much smoother if I was able to purchase a virtual server in the same cloud datacenter you were in. But hey, I'm told the cloud has some pretty big security issues to begin with, so no big deal, right?
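For what it's worth, the measurement side of steps 2 and 3 isn't much code either. A rough sketch (hostnames, port, and payloads are placeholders, and a real attack needs far more careful statistics than a plain mean):

```python
import socket
import time

def round_trip_ms(host: str, port: int, payload: bytes) -> float:
    """Send one packet and time how long the response takes."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=2) as s:
        s.sendall(payload)
        s.recv(4096)
    return (time.perf_counter() - start) * 1000.0

def average_delta_ms(target: str, reference: str, payload: bytes, samples: int = 1000) -> float:
    """Average timing difference between the target and my identical reference server."""
    diffs = [round_trip_ms(target, 4433, payload) - round_trip_ms(reference, 4433, payload)
             for _ in range(samples)]
    return sum(diffs) / len(diffs)

# Step 2: baseline = average_delta_ms("victim.example", "my-clone.example", invalid_packet)
# Step 3: compare candidate-first-byte packets against that baseline as the load rises.
```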
If the typical successful response takes 10msec, don't send the response for at least 15msec. So whether the authentication challenge succeeds or fails, the response always comes at 15msec or later.
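A minimal sketch of that floor, assuming a handler function you control (names are made up):

```python
import time

MIN_RESPONSE_SECONDS = 0.015   # 15 ms floor, per the suggestion above

def respond_with_floor(handle_auth, request):
    """Run the real check, then hold the reply until the 15 ms mark."""
    start = time.monotonic()
    response = handle_auth(request)    # ~10 ms on success, possibly less on failure
    remaining = MIN_RESPONSE_SECONDS - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return response
```

Scheduler jitter and sleep granularity still leak a little in principle, but the gross success/failure difference is hidden as long as the handler stays under the floor.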
Wouldn't it be possible to raise the typical response time above 15msec with a denial-of-service type attack?
It seems to me that your system has to adapt to situations where the typical response time changes drastically, and the attacker can exploit discrepancies in how that typical response time shifts depending on whether a packet's nth byte is valid or not. It might be possible to make keys arbitrarily difficult to crack using methods like the ones you and joe_the_user suggest, but the more security you add in the form of extra response-time padding, the more your regular users will suffer.
You're right of course. So if it was done in say multiples of 15msec. The first window for a response is at 15msec. The second window at 30msec. The third window at 45msec and so on.
And I agree it's difficult to keep the balance between an acceptable user experience and keeping the crackers at bay.
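Sketching the window version (same caveats as the floor sketch above; handler name is made up):

```python
import math
import time

WINDOW_SECONDS = 0.015   # reply only at 15 ms, 30 ms, 45 ms, ...

def respond_in_window(handle_auth, request):
    """Round the response time up to the next 15 ms boundary."""
    start = time.monotonic()
    response = handle_auth(request)
    elapsed = time.monotonic() - start
    windows = max(1, math.ceil(elapsed / WINDOW_SECONDS))
    time.sleep(windows * WINDOW_SECONDS - elapsed)
    return response
```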
>You're right of course. So if it was done in say multiples of 15msec. The first window for a response is at 15msec. The second window at 30msec. The third window at 45msec and so on.
Instead of the comparison that checks the whole field, start two threads, one per real CPU core: run the unknown comparison and a known-good comparison in parallel, and only return when both have completed. Not the most efficient, but it would scale properly for faster and slower CPUs.
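Sketching that in Python (structural only: Python's GIL means two threads won't actually pin two cores, so treat this as pseudocode for the approach rather than a real implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def naive_equal(a: bytes, b: bytes) -> bool:
    return a == b          # the short-circuiting compare whose timing we're masking

def compare_masked(received_mac: bytes, expected_mac: bytes) -> bool:
    """Run the real comparison alongside a known-good one and wait for both,
    so the total time tracks the full-length compare either way."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        real = pool.submit(naive_equal, received_mac, expected_mac)
        decoy = pool.submit(naive_equal, expected_mac, expected_mac)  # always a full match
        decoy.result()     # don't return until the known-good compare has finished too
        return real.result()
```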
It's clever in a number of ways, from the original absurdly clever notion of "side channel attacks" that Nate Lawson's old boss Paul Kocher pioneered in the '90s, on through the notion of looking for the same mistake across HMAC implementations (like the one Nate found in Google Keyczar), on through starting a project to go look for timing vulnerabilities in common crypto libraries.
Having said that: now that you know what an HMAC timing attack looks like, it is extremely easy to go find new ones. Go to any source code search site, search for "HMAC", and look how they compare signatures. Chances are, they use the language's standard compare function, and they're likely vulnerable --- as long as you can send candidate messages and get responses.
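In Python terms, the pattern you'll see over and over in those search results, and the one-line fix (key and message values here are just illustrative):

```python
import hashlib
import hmac

key = b"server-side-secret"
message = b"payload"
received_sig = b"\x00" * 32              # attacker-supplied signature (placeholder)

expected_sig = hmac.new(key, message, hashlib.sha256).digest()

# The pattern code search turns up: ordinary equality, which can bail out
# at the first differing byte.
vulnerable = (expected_sig == received_sig)

# The fix: a compare whose running time doesn't depend on where they differ.
safe = hmac.compare_digest(expected_sig, received_sig)
```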