
I'm pretty sure I remember Delphi using the carry flag to return boolean values.

Kalyna was the result of a public competition started in 2006 [1], so it mirrors design preferences of that time. There were 4 other candidates, some were broken, and none of them seem any more cache-timing resistant than Kalyna.

[1] https://www.sav.sk/journals/uploads/0317154006ogdr.pdf

Interesting notes, I didn't realize the competition was so old. Thanks!

You cannot leave the destructor of a base class merely declared, since it will get called by the derived class destructors and linking will then fail. But you can write out an implementation for a pure virtual function, which will still keep the base class uninstantiable:

  virtual ~AbstractBaseClass() = 0;
  AbstractBaseClass::~AbstractBaseClass() { }

Huh. Does that mean I can write

    class A {
        virtual void foo() = 0;
    };
    void A::foo() {}
    class B : public A {};
    B b;
and that will work?

No, it will complain that A::foo is pure virtual and that B must implement `foo`. However, if you define the method `void foo() { A::foo(); }` in B, it will work. In destructors this happens implicitly.

There was an old Herb Sutter GotW about this: http://www.gotw.ca/gotw/031.htm

Don't forget the variants with a stack-adjusting immediate, 0xC2 and 0xCA. Those can also come in handy.

You can find an official Intel source for Haswell here (page 15): http://pages.cs.wisc.edu/~rajwar/papers/ieee_micro_haswell.p...


Broken a few weeks later: https://eprint.iacr.org/2015/519


MD5's compression function was already broken 20 years ago: http://cseweb.ucsd.edu/~bsy/dobbertin.ps


> And IIRC, the diffuser is quite fast

Kinda. When Elephant was designed, AES-NI did not exist, so AES was expected to run at somewhere between 10 and 20 cycles per byte. Elephant ran at somewhere between 5 and 10 cpb, so it did not add a lot of overhead. Post-AES-NI, however, the majority of CPU time is now spent on the diffuser.

Furthermore, SSDs were not popular at the time this was designed, so the relatively low speed of software AES + diffuser was not that big a deal. Now, with 500 MB/s and faster drives, cipher speed matters. For reference, 10 cpb translates to ~200 MB/s on your average 2 GHz processor.

This is not to say that removing Elephant was a good idea, but the performance argument is not entirely unreasonable. It is of course possible to design a new diffuser that can take better advantage of modern chips; maybe they should do that.


Nice numbers, thanks.

Is Elephant even available, though? Even if someone has a low-end chip but needs a high IO rate and thus must turn off Elephant, why kill off the feature entirely? That's what's so odd. They admit it's critical, then go on to completely delete it, with no mention of why (until this one-line explanation now). For many users, the performance impact is irrelevant. I rarely do high-rate IO (boot and copying movies); I doubt even large compiles hit the 100 MB/s level.

At 5 cpb, even a low-end Atom will get 100-200+ MB/s, right? What low-end devices push that often enough to hurt the user?

And, given their weight, they could have forced these requirements into eDrive and offloaded it all to the SSD controller.


Curious: wasn't Vista shipping on computers that had to have more processing power and memory? Do you think that would negate the performance argument a bit, or not be sufficient? It's hard for me to reconcile Microsoft's argument about security for lower-performance machines when they were forcing upgrades on customers at the same time. Not to mention Vista's efficiency tradeoffs. (shudders)


Vista did increase hardware requirements across the board. But the change we are discussing here was made in Windows 8 and later, which is exactly when Microsoft started caring about getting Windows into tablets and, in particular, ARM chips.


Oh ok. Thanks for the clarification. It would certainly matter for the ARM editions.


You think "more processing power and memory" "negates" the difference between hardware and software encryption?


My post was about software encryption. Maybe I should've clarified that better. The beefier machines Microsoft pushed for Vista might negate the performance hit that older hardware might experience with the Elephant option. I'm sure others have more expertise (eg performance comparisons of old machines) in that area so I asked them.

Regarding your tangent: hardware encryption will almost always beat software encryption in the same process node. The next best things are processors highly optimized for cryptography (eg GD's AIM), vector processing with crypto-focused operators (SIMD/MIMD), or massively parallel processing with tiny cores (eg FPPAs, GPUs) using the right ciphers/cryptosystems. We saw less-mainstream work focusing on these three in the late 90's and early 2000's because they would accelerate arbitrary crypto while making upgrades easy. Flexibility and cost reduction trumped raw performance in many use cases. Certain defense contractors still sell products like those, with more R&D work ongoing. They also support polymorphic cipher designs, unlike hard blocks.


I think you may have missed the gist of this subthread. The CBC diffuser in Bitlocker was removed in part because it doesn't benefit from the hardware crypto now shipped with Intel machines.


Re-reading the comment, I see now. I think I just read it too fast before. Good catch to both of you. My bad.


2048-bit keys/primes are around 1 billion times harder to break than 1024-bit ones.


This is not at all surprising; even I had pointed out the looming DH problem in an earlier thread: https://news.ycombinator.com/item?id=8810316

One small nit with the paper: it is claimed that there had been technical difficulties with the individual logarithm step of the NFS applied to discrete logarithms, making individual logs asymptotically as expensive as the precomputation. Commeine and Semaev [1] deserve the credit for breaking this barrier; Barbulescu did improve their L[1/3, 1.44] to L[1/3, 1.232] by using smarter early-abort strategies, but was not the first to come up with 'cheap' individual logs.

[1] http://link.springer.com/chapter/10.1007%2F11745853_12


It's not particularly surprising to the IETF TLS Working Group either, which is at least partially why https://tools.ietf.org/html/draft-ietf-tls-negotiated-ff-dhe... exists.

Minimum 2048 bits, please note. 1024 is not safe - and has not been for quite some time. GCHQ and NSA can definitely eat 1024-bit RSA/DH for breakfast at this point (although it does still take them until lunch).

(Those of you still using DSA-1024 signing keys on your PGP keyrings should reroll RSA/RSA ones. I suggest 3072 bits to match a 128-bit workfactor given what we know, but 4096 bits is more common and harmless. 2048 bits is a bare minimum.)


What do you (and pbsd) think of the site's recommendation to use custom 2048-bit parameters as opposed to a well-known 2048-bit group such as group 14 from RFC3526? Is it really that likely a nation-level adversary could break 2048-bit FFDHE the same way they've probably broken the 1024-bit group 2? How does that weigh against the risk of implementation errors generating your own parameters, or the risk of choosing a group with a currently-unknown weakness?


Group 14 is fine. You might as well also tell people to use custom block ciphers, since a precomputation of roughly the same magnitude---compute a common plaintext under many possible keys---would break AES-128 pretty quickly as well.

I would say use a custom group if you must stick with 1024-bit groups for some reason. Otherwise, use a vetted 2048+-bit group. If---or when---2048-bit discrete logs over a prime field are broken (and by broken I mean once, regardless of precomputation), it will likely be due to some algorithmic advance, in which case DHE is essentially done for. If nation states have been able to pull that off already, then it's pointless to even recommend anything related to DHE in the first place.


Seconded. Group 14 (2048-bit, ≈112-bit workfactor) or another safe 2048-bit or greater prime (such as ffdhe2048, or ffdhe3072 @ ≈128-bit workfactor) will do fine for now. You don't need to roll your own safe primes. As per the paper: "When primes are of sufficient strength, there seems to be no disadvantage to reusing them."

The problem with reusing them is of course when they're not strong enough, and so if an adversary can pop one, they can get a lot of traffic - and as I've said for a while and as the paper makes clear, 1024-bit and below are definitely not strong enough. Anything below 2048-bit would be a bit suspect at this point (which is precisely why the TLS Working Group rejected including any primes in the ffdhe draft smaller than that - even though a couple of people were arguing for them!).

If you're still needing to use 1024-bit DH, DSA or RSA for anything at all, and you can't use larger or switch to ECC for any reason, I feel you have a Big Problem looming you need to get to fixing. Custom DH groups will not buy you long enough time to ignore it - get a plan in place to replace it now. We thought 1024-bit was erring on the small side in the 1990s!

I concur that the NSA's attack on VPN looks like an operational finite-field DH break - I didn't realise that two-thirds of IKE out there would still negotiate Oakley 1 (768) and 2 (1024), but I suppose I didn't account for IKE hardware! Ouch!

Their attacks on TLS are, though also passive, architected far more simply and more suggestive of an RC4 break to me as there seems to be no backend HPC needed - ciphertext goes in, plaintext comes out. Both are realistic attacks, I feel, but RC4 would have been far more common in TLS at the time than 1024-bit DHE, and although 1024-bit RSA would be present many likely sites would have been using 2048-bit, so naturally they'd go for the easiest attack available. (That gives us a loose upper bound for how hard it is to break RC4: easier than this!) I also don't think the CRYPTO group at GCHQ would have described this as a "surprising […] cryptologic advance" from NSA, but just an (entirely-predictable) computational advance, and (again) lots of people in practice relying on crypto that really should have been phased out at least a decade ago. So there's probably more to come on that front.

Best current practice: Forget DHE, use ECDHE with secp256r1 instead (≈128-bit workfactor, much faster, no index calculus). You can probably do that today with just about everything (except perhaps Java). It will be faster, and safer. And, we know of nothing wrong with NIST P-256 at this point, despite its murky origins.

Looking forward, Curve25519 (≈128-bit) and Ed448-Goldilocks (≈222-bit) are, of course, even better still as the algorithms are more foolproof and they are "rigid" with no doubts about where they come from (and in Curve25519's case, it's even faster still). CFRG is working on recommending those for TLS and wider standardisation. You can use 25519 in the latest versions of OpenSSH right now, and you should if you can.


> It's not particularly surprising to the IETF TLS Working Group either, which is at least partially why https://tools.ietf.org/html/draft-ietf-tls-negotiated-ff-dhe.... exists.

This draft /does/ encourage the use of larger keys, but it also encourages the use of common parameter groups. The weakdh.org site mentions the use of common groups as one reason this attack is feasible. It also advises sysadmins to generate their own parameters. To me, that makes using common groups sound like a bad move.

The problem is, I lack proper knowledge to assess whether using common groups really is a bad move, even when using larger group sizes... Anyone here who can?


Obviously weakdh has more up-to-date recommendations (public only a few hours!), so you should certainly not cling to the older ones published by the IETF who-knows-when, which could have been influenced by players who prefer weaker encryption for their own benefit.

I don't understand why you would not believe the weakdh recommendations. The researchers describe in their paper [1] exactly how they can precompute some values only once in order to then mount fast attacks on any vulnerable communication. They proved that common values and too-small values are both dangerous. It's real. And it's definitely not "fix the one you like": change both.

[1] https://weakdh.org/imperfect-forward-secrecy.pdf

"Precomputation took 7 days (...) after which computing individual logs took a median time of 90 seconds."


I strongly believe in 'Audi alteram partem', and like to understand rather than believe. Hence my question.

For all I know, a few extra bits of parameter length can make the NFS just as infeasible as generating your own parameters.

Edit: re-reading my earlier comment I understand your reply better. I've expanded my question to 'even with larger group sizes', as it indeed is clear that it is a problem with smaller groups.


The newly disclosed research clearly demonstrates that common parameters enable "precompute just once, attack fast everywhere", whereas when everybody generates their own values, that approach becomes impossible. The difference is days of computation versus seconds in their example.

The difference is many orders of magnitude: it determines whether everybody everywhere can be attacked anytime, or just somebody, sometimes, somewhere.

Moreover, the main reason it should be done is that the expected browser updates won't block sites using 1024 bits. So all the sites which for whatever reason still use 1024 bits would be less vulnerable if they had their own parameters.

The practice of using common parameters is already worsening the current state; the bad effects of that move exist now. The common parameters are provably bad, and that won't change in the future. Just don't use them: generate new ones everywhere.

And, of course, "minimum 2048 bits, please."

Edit: audi alteram... means "listen to the other side." Which side is the other side here? The stale information is not "the other side" it's just stale.


Thanks for elaborating.

The 'other side' are the people currently working on the negotiated-ffdhe draft (which I assume are bright people too). The draft was last updated a week ago (12 May 2015), so their considerations must be quite recent.

I'm just trying to get a sense of the pros and cons. IIRC, generating your own groups has its problems too. For example, the Triple Handshake attack (https://www.secure-resumption.com/) could break TLS because implementations did not properly validate DH params. Allowing only some (set of) pre-defined, known-good params would have stopped that attack.

To be clear, I'm certainly not arguing for or against using common groups. Just trying to get a complete picture. (And yes, based on current information I think too that using unique groups is the right approach.)


The folks working on that draft have definitely become aware of this research. Soon we'll see what they have to say about it.


Neither. ECDHE on P-256 doesn't have this problem, is available almost everywhere and is faster and safer: use that, or better still, Curve25519 and friends (in OpenSSH already, coming up in TLS later this year hopefully?).

There's very little reason in practice to bother trying to patch DHE, it's slow and old and interoperates worse (thanks Java). Chrome's just taking it out in the medium-term.


They standardized on Ed448-Goldilocks:



> Those of you still using DSA-1024 signing keys on your PGP keyrings should reroll RSA/RSA ones.

Can you please explain what you mean? Do you mean "those of you still using DSA-1024 anywhere" or do you mean that there is something we should do "on the PGP keyrings" specifically? Can we control how we maintain the keyrings? Is there some setting for the "key on the keyrings"? I ask since I don't know the state of the art of the formats of the keyrings.


I mean use of RSA-1024 or DSA-1024 anywhere, for any purpose, is really too small for safe use now.

By "on your keyrings" I mean that quite a few PGP keys in the wild still use DSA-1024 master signing keys with (often much larger) ElGamal encryption subkeys (as DSA was not specified for a long time with keys beyond 2048-bits).

However, DSA-1024/(ElGamal-anything) is not a safe configuration anymore - an attacker who can do a discrete-log on a highly-valued 1024-bit finite field can recover the signing key, and sign things - including software released under that key, or grant themselves new encryption subkeys of any strength.

It may therefore be a good idea to review your PGP keyrings for any master keys you trust which fit that 1024-bits and below criteria (look for 1024D, or below), as they definitely are overdue an upgrade. You may find that a fruitful search, with a few surprises still. For example, to pick one high-profile signing key that would doubtless have been interesting to Nation-State Adversaries and would have been susceptible to such an attack: http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0xE3BA7... ˙ ͜ʟ˙

The common safe configuration for modern OpenPGP (and upstream GnuPG's current default, I believe) is to use RSA signing keys and RSA encryption subkeys, each of at least 2048 bits - really I'd recommend 3072 or 4096 bits, as use of PGP is not as performance-sensitive as TLS and there is no forward secrecy, so I wouldn't really recommend going much below a ≈128-bit workfactor (equivalent to ≈3072 bit RSA or ≈256-bit elliptic-curve).

Edward Snowden trusted RSA-4096 signing and encryption keys with his life, and that obviously worked out fine for him at the time.


Thanks a lot for taking the time to answer.

You are right, a lot of the keys I have in my public keyring are actually 1024D/<whatever>g, made many years ago. Hm.


