logicrime's comments | Hacker News

The qualms I have with this dialogue are the same as before, because CloudFlare has little to no idea how they are going to handle this. JGC tweeted me about how "Oh, we get so much benefit from LuaJIT being FOSS", but here we have CloudFlare walling LuaJIT into its own entity on GitHub, where I predict commit bits will be few and far between.

More than that, I don't think there has been enough communication between Mike and the 'new LuaJIT crew' (CF) to determine how the project should be structured. In this thread, agentzh had a fantastic idea: vet somebody through Mike, someone the community knows can be trusted and who is somewhat familiar with the LuaJIT internals, and that person could serve as a canary between the project and CF.

I write a fair bit of Lua for game scripting, and I've even made a few bucks here and there helping folks with their custom plugin ideas etc, but I've never touched C before. Well, when the previous announcement was made, I immediately Amazon'd some C books, which I plan to devour in my free time. At which point I'll be learning Rust, reimplementing LuaJIT in Rust, and hopefully convincing Mozilla to host the repo, such that it will be protected from FOSS corruption.

My worst fear is CF taking this project into the shadows, developing it closed-source (which they absolutely have a right to do) and not sharing their insights with the community.

I think everybody with any kind of invested interest in LuaJIT needs to be gearing up right now, such that we can do our parts to keep this project alive.


> JGC tweeted me about how "Oh, we get so much benefit from LuaJIT being FOSS"

And we do. We paid Mike Pall to work on open source LuaJIT, we've contributed to NGINX, hired people to exclusively work on open source projects. Here's the harsh economic reality: it is simply better business for us to spend a relatively small amount of money on open source support to get what we need from fantastic projects like LuaJIT than to try to develop this stuff ourselves.

> My worst fear is CF taking this project into the shadows, developing it closed-source (which they absolutely have a right to do) and not sharing their insights with the community.

How do we "have the right to do" that? Whatever makes you think us trying to closed source this would have any benefit to us? How is the Github account (of which Mike Pall is an owner) us walling it off?

> More than that, I don't think there has been enough communication between Mike and the 'new LuaJIT crew' (CF) to determine how the project should be structured.

I predict that if I hadn't sent an email to the list soliciting ideas and input, and had instead announced a new structure, you would have complained that everything had been done in the shadows.


[flagged]


Where is the proof that CF is doing anything remotely like what you say? All they've done is create a Github repo (which is many developers' preferred platform) and say that they are moving to a more distributed governance model. All at the request of the original maintainer, Mike Pall.

They've even explicitly stated that CF will not be taking over the project; they're only helping move it to a new home and to find a group who can maintain it to replace Mike Pall. There's zero evidence that CF's intentions are to the contrary, so I do think what you are saying is completely unfounded. Please correct me if I'm wrong.


[Responding here to the parent flagged comment]

While logicrime's comments may be wrong and a bit over the top, I don't think this comment should have been flagged. We've seen a well-explained criticism of a potential, if unlikely, future. These comments are much less inflammatory than other comments that have been left alone on HN.

I still plan on embedding LuaJIT into my automation software, so I personally look forward to seeing what CF and any LuaJIT successors produce. So let's take logicrime's fears into consideration instead of just shutting them down.

Thanks for looking after LuaJIT, jgc and CF!


[deleted]


I think you're mistaking humility and open-mindedness for incompetence.

Exactly.


Uh sorry, not sure why I deleted that. (Guess I felt this might turn into a warzone and decided to avoid feeding it.)


Users with this level of concern for HN's quality are the reason why the site survives as well as it does. Thank you.


I'm not accusing anyone of incompetence. Only a fool would make the implication that you aren't a knowledgeable programmer. I'm accusing CF (not you specifically) of making the first moves to morph LuaJIT into a corporate tool.


> Well, when the previous announcement was made, I immediately Amazon'd some C books, which I plan to devour in my free time. At which point I'll be learning Rust, reimplementing LuaJIT in Rust, and hopefully convincing Mozilla to host the repo, such that it will be protected from FOSS corruption.

See you in 15 years.


Vanilla Lua is ~25k lines of C these days. That's what I'm going to dive into first. I've worked on bigger projects LOC-wise.


The point isn't the LOC. If you've never even touched C, you're not just going to have to learn that, you're going to have to learn how to write an optimizing compiler (because frankly if you've never touched C I'm skeptical you have any experience in this). And not just that: you're going to have to learn how to write the world's most optimizing trace-based JIT compiler for a dynamic programming language.

Mike spent 10 years designing LuaJIT and it is in a league of its own, not paralleled by anything else. Do not expect this inane project of yours to be solved by looking at 25,000 lines of C code. Especially if, point of fact, you do not even know C. Expect it to be 'solved' after a decade of research and hard work at minimum.

I'm not sure what paranoid reality you live in where you think this is feasible, or even desirable given your original post (frankly, even though a port isn't needed, Rust would be an awful choice for a 'port': it simply isn't as widely available, and its compilers and tools are less mature), but when I said see you in 15 years, it wasn't a joke - it was a conservative estimate.


I kinda hate these papers that just humble-brag about clustered setups without providing any abstract insights. This doesn't bring me any closer to understanding graph data, but I'm now ready to begin installation of a multi-million-dollar cluster of machines and storage.

The bit about k-means was interesting, but the rest was an irrelevant bore.


Imagine being one of the five authors of this paper, browsing this comments section. It's natural to distance yourself from people when you're behind a keyboard, but for fuck's sake — these are your peers. If you're going to be critical, do it with attention and care.


The people were not criticized, the paper was.


If you think the authors give half a shit what HackerNews thinks, you've missed the point.


The purpose of the paper isn't to help you to understand graph data better. Heck, this is a VLDB presentation.

It is to help you to understand frameworks for working on graph data better.

If you see it as a humble brag that is an irrelevant bore, you aren't the intended audience.


Given that this is a paper for VLDB[1] (the conference on Very Large Databases), perhaps there is some small chance that your personal judgement of what is an insight or relevant is... wrong?

Also, your idea that it is a humble-brag to talk about a computer cluster at VLDB seems more indicative of a lack of knowledge on your part rather than a lack of "abstract insights".

Irrelevant bore indeed.

[1] http://www.vldb.org/2015/


VLDB accepts 150 papers and SIGMOD perhaps another 150, and that's just the top tier. I am pretty sure science could live without more than 50% of those papers.

I disagree with the tone of the original comment, but I do not disagree with the sentiment. Just having a large installation does not make it interesting. However, Google's large systems almost always push the boundaries of science -- MapReduce, GFS, Spanner, DistBelief -- and have been a joy to read.


Best of luck to you, friend! Hearing of all the things you have learned and have experience in leads me to believe that many startups would miss out if they overlooked you. You sound awesome.


Thank you for your kind words :D


Haven't they been wrecked once before this most recent incident?

I find it concerning that folks are so eager to rush back into a warzone when they know it's not safe. Piling onto a recovering website after a cyberattack is akin to running back into a field where landmines were found. Maybe somebody was able to remove a landmine or two, but wouldn't it be wiser to just walk around it?


Except that as long as you use a unique password, and don't give any details that you'd mind falling into the wrong hands, there is absolutely no risk.

Unlike, for example, actual mines.


There is a lot of risk in going to a compromised website. You are basically inviting potential malware onto your computer, and, if there are zero-days present on your system, handing control of your computer over to a malware author.


Yes, I'm pretty worried about browsing a website with no ads using Chrome on my Linux machine with uBlock Origin and Flash disabled.

I think I take greater risks going for a walk in the evening.


A random website? Absolutely, 99.999% of the Web is safe. But we're talking about a site which is specifically compromised with malware.

With that said - "Linux" is safe by being such a tiny share of the user population that browser malware generally isn't written for it. In general, I take it as a given that people have deleted/disabled flash and java plugins a long, long time ago.


> A random website? Absolutely, 99.999% of the Web is safe. But we're talking about a site which is specifically compromised with malware.

Well, we don't know that, actually. The info given on the PE site says that the attacker gained access to the server and modified the database. Do you have proof that it's serving up malware to visitors?

In any case, it's an odd situation and an odd response from Project Euler. It doesn't seem like a complicated enough site to get hacked in a mysterious undetermined way.


I use a unique password and even a burner email, and a phone number that I update every 8 weeks for my banking website.

It's taken blood, sweat and tears to save up 20k (a lot for me) and even though I have a secure authentication scheme for the website, I worry about it getting hacked all the time.

"...there is absolutely no risk"

You have no idea! There's little practical risk in people getting access to my (fictional) ProjectEuler account, but there is absolutely some risk in returning to the same scam twice. Say they exploit PE again and are able to extract more than just password and email; maybe they find a way to get more info about the user's browser, or cookies, or SOMETHING. Anybody foolish enough to continue to navigate to projecteuler.net will suffer the consequences. They'd be better off never returning.

I know the response to this will be, "Oh, you can't possibly expect people to just abandon services that are compromised once" but I absolutely don't expect people to do that. I do it, because my security is worth it to me. Others don't, and this is the sort of thing that happens.

We've no way to really isolate what happened to projecteuler, and no way to know what kind of nasty code got injected into the pages.


For non-scientists, what does this mean? Is SHA-2 not good anymore? What should I do?


SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

My current favorite conservative hash choice is SHA512/256, which is the SHA-2 that generates a 512-bit output but truncates it to 256. It gives you the same length extension protection that is the most important feature of SHA-3, and is available in most libraries already.
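
A quick sketch of what that looks like in Python's hashlib (whether the named "sha512-256" algorithm is exposed depends on the interpreter's OpenSSL build, and note that plain truncation of SHA-512 uses different initial values than the official SHA-512/256, though it gives the same length-extension protection):

    import hashlib

    msg = b"attack at dawn"

    # Truncating SHA-512 hides the rest of the internal state from an
    # attacker, which is what defeats length extension.
    truncated = hashlib.sha512(msg).digest()[:32]

    # The official SHA-512/256 (same idea, distinct IVs) may be exposed
    # by name, depending on the OpenSSL build:
    if "sha512-256" in hashlib.algorithms_available:
        official = hashlib.new("sha512-256", msg).digest()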

I have never recommended to anyone that they switch from SHA-2 to SHA-3. I'm actually in "wait and see" mode about SHA-3; there are compelling other hashes available if you want to be ultra-modern about which hash you use.


> SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

To extend on that: shortly after SHA-1 fell, there was the very real threat that the SHA-2 family would follow suit (they are conceptually similar). This worry brought NIST to hold the SHA-3 competition. Fortunately, the SHA-1 attacks did not turn out to be transferable, so far, and consequently trust in SHA-2 has substantially increased since. Still, NIST (rightly) followed through with the initial idea of the contest and chose a hash function that was as different from SHA-2 as possible (Keccak).

Thus, we now have two very high quality hash functions at our disposal. If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes). This construction is collision resistant if at least one of them remains collision resistant. (Pseudo-randomness relies on the security of both hashes, though, and hashing the whole message twice comes at a hefty performance hit. Especially since SHA3-512 is veeery slow -- blame it on the clueless tech media attacking NIST for tweaking Keccak, ignoring even the authors who supported NIST's decision.)
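
For concreteness, a minimal Python sketch of that construction (both primitives are in hashlib in recent Python versions):

    import hashlib

    def conservative_hash(m: bytes) -> bytes:
        # SHA512(m) || SHA3-512(m): a collision here requires colliding
        # both hashes on the same pair of messages.
        return hashlib.sha512(m).digest() + hashlib.sha3_512(m).digest()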


> This construction is collision resistant if at least one of them remains collision resistant

Please don't throw around well-defined terms. This isn't true.

What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

On the other hand, collision resistance is a comparison between 2^(hash_length/2) and the work factor required to find a collision. Concatenating the two outputs would only remain collision resistant if it caused an exponential increase in the work factor to find a collision.

Since the SHA-512 output is the whole hash state, once you've found a SHA-512 collision, you can keep appending to the two collided documents and they'll stay collided, so you can use this as a starting point for your SHA3-512 collision. So, even assuming no weaknesses, the work factor to find collisions in your 1024-bit concatenated construction is 2^256 + 2^256, not 2^512, and thus not collision resistant.

Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant. However, as proposed, it's not collision resistant, even if both underlying hash functions are collision resistant.


> Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant.

Actually, as long as the hash functions are iterative, the whole construction can never be significantly stronger than the best hash function, see [1].

> What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

What I meant was "as long as it is infeasible in practice to find a collision in either of them, it is infeasible to find a collision in the concatenation". Comparing the security of hash functions to random oracles with the same output length only makes sense if the construction of the hash function supposedly affords this security.

Conversely, I find it absurd to call the hash function that outputs the first 64 bits of SHA-1 collision resistant, because it requires at least 2^32 steps to find a collision. It fits with the oracle definition, but gives you no information about its real world security.

If you want to make precise statements, you can add the work factor to your statement, e.g. "The first 512 output bits of SHAKE-256 afford preimage resistance up to a work factor of 2^256".

[1] Antoine Joux. Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions. In Advances in Cryptology - CRYPTO 2004, volume 3152 of Lecture Notes in Computer Science, pages 306–316. Springer Berlin Heidelberg, 2004. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128...


> If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes).

Although keep in mind that you'll leak information about the input if either hash leaks information about the input.

For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant... but you wouldn't want to use `badhash(pad(secret) ++ nonce)` as a precommitment scheme. All of the extra entropy in the nonce, which otherwise might have protected against brute force attacks on low-entropy secrets, is being given to the attacker via the crc.
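
A sketch of that pitfall in Python, with zlib.crc32 standing in for the crc and SHA-256 for goodhash (names are placeholders):

    import hashlib
    import zlib

    def badhash(data: bytes) -> bytes:
        # Collision resistant thanks to the SHA-256 half, but the CRC
        # half leaks 32 bits of linear information about the input.
        crc = zlib.crc32(data).to_bytes(4, "big")
        return crc + hashlib.sha256(data).digest()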


> For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant...

Actually, it isn't, for the usual definition of collision resistance compares the work factor to find a collision against 2^(hash_length/2). Extending a hash with crc32 lengthens the hash, but increases the bar for considering the hash collision-resistant. Concatenating the outputs of two collision-resistant hash functions doesn't even (generally) result in a collision-resistant construction under the normal definition of collision resistance.

EDIT: See my nearby post in this same thread for a longer explanation.


Concatenation of the hashes seems like an unjustified risk that in certain circumstances will allow weaknesses from either algorithm to flow through to the final hash. If you really want to combine the hashes, XOR seems like a safer bet to me (since the algorithms are unrelated, there should be no potential cancellation of entropy).


Seems like XOR is better for approximating a random oracle, and appending is (negligibly) better for ensuring collision resistance. People often are not clear about which of these two very different properties they actually want out of a hash.
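
For comparison, the XOR combiner from the parent looks something like this (Python sketch; both digests are 64 bytes, so they XOR cleanly):

    import hashlib

    def xor_combine(m: bytes) -> bytes:
        a = hashlib.sha512(m).digest()
        b = hashlib.sha3_512(m).digest()
        # Same output length as either hash, and neither raw digest is
        # exposed directly, unlike with concatenation.
        return bytes(x ^ y for x, y in zip(a, b))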


++ on SHA-512/256. SHA-512 uses 64-bit operations where SHA-256 uses 32-bit, so on a beefy 64-bit chip, it's faster per byte hashed. So, compared to SHA-256, more rounds, twice the state size, same familiar/widely-implemented design, and faster -- what's not to love?

See http://bench.cr.yp.to/results-hash.html for a comparison of hash-function speeds.


Thanks for the link. I was just about to ask the very question you answered. :)


Dang, just saw your reply which was better than mine.


Unfortunately, the SHA CPU extensions that will soon be available in Skylake Xeon parts (and the crypto extensions in ARMv8-A) only support SHA-256 (and SHA-224, and the ill-advised SHA-1... why is Intel adding instructions or microcode for a hash function that's being phased out?). So you have a choice between a faster-in-software-on-64-bit SHA-512/256 and a faster-in-hardware SHA-256. Unless there's some way to get a partial speed-up of SHA-512 using the SHA-256 instructions, but at a glance they don't look low-level enough to apply to SHA-512... do they?

Or you can ignore both SHA-256 and SHA-512/256 and use something else like SHA-3 or BLAKE2b. BLAKE2b obviously has less attention on it, so it's more likely to harbor a weakness, but it's fast in software. And SHA-3 will get CPU extensions eventually, and hopefully that implementation will be better thought out than just support for the 256-bit variant.


In that case HMAC-SHA-256 may be a good choice. It too is immune to length extension attacks, and the HMAC construct has proven itself to greatly augment the strength of the underlying hashing algorithm (e.g. MD5 is considered broken, but HMAC-MD5 is not). It's just twice as expensive as SHA-256, so I'm not sure if that's faster than SHA-512 on software versus HMAC-SHA-256 on hardware.
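
For reference, HMAC-SHA-256 is a one-liner with Python's standard library (key and message are placeholder values):

    import hashlib
    import hmac

    key, message = b"secret key", b"some message"
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()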


SHA-512/224 is fine too, and also isn't length-extendable.


Skein was built (by Bruce Schneier's team) from the ground up to take advantage of multi-core systems and gain speed via concurrency.

I'm still quite a fan of it and was sad that it lost. Both are excellent algorithms however.


whaaaaaaat - taking a 512-bit hash and truncating it to 256 preserves everything you need about it! That's crazy (surprising). Naively - if you hadn't just told me otherwise - I'd think it's up there with my brilliant new algorithm: in a loop pad your input with 0x00 through 0xFF, take the SHA512 hash of each result, but only use the first bit! You now have a sooper secure 256 bit hash. I call it SHA512/1/1/.../1 (I'd write it all out here, but it looks obnoxious and might break someone's window width.)


Also, on 64 bit machines SHA-512/256 is generally faster than SHA-256.


I'd love to know what these other compelling hashes are.


Somebody else in this thread was talking about BLAKE2, which I cast a cursory glance at. It seems pretty cool, claims to evade the length-extension 'issues' that SHA-1 has.

Wikipedia indicates that there has been at least some progress as far as cryptanalysis goes, but even with that being said, there's always that lingering 'but what if' about anything NSA-related.


SHA-2 is also length-extendable, which means you have to be careful when you use it to build a MAC. (That's why I like the truncated version).
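
That care mostly means avoiding the naive keyed-hash construction. A Python sketch of the contrast (hypothetical key/message values):

    import hashlib

    key, msg = b"k" * 32, b"user=alice"

    # Risky with a length-extendable hash like SHA-256: an attacker who
    # sees this tag can (modulo padding) append data to msg and compute
    # a valid tag for the extended message without knowing the key.
    naive_tag = hashlib.sha256(key + msg).digest()

    # Truncation hides the rest of the internal state, defeating that:
    truncated_tag = hashlib.sha512(key + msg).digest()[:32]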

No cryptographer I know takes these particular "what-if's" seriously. They appear to come exclusively from non-cryptographers reacting to anything that NIST touched.


Valerie Aurora has a nice post showing the lifetime of cryptographic hash functions. You can see them all following the same pattern over time. You don't want to be on the bleeding edge, but also should avoid the trailing edge. http://valerieaurora.org/hash.html


IMO, the outstanding result of the SHA3 competition was not Keccak but BLAKE. BLAKE2 is the best hash function around: https://blake2.net


Why, what's better about BLAKE?


It's faster (in software).


This is a disturbing answer. You come out strongly in favor of BLAKE, but without any consideration of its security properties, just "it's faster".

BLAKE uses the same basic construction as MD5 and SHA1, neither of which is a responsible choice for a collision-resistant hash function any more. While attacks producing collisions have not been presented, this construction means it's susceptible to some other attacks. There are well-understood ways to avoid these attacks, but a naive usage of BLAKE2 in a production system will likely leave you open to length extension attacks in particular.

Keccak, on the other hand, uses the novel sponge construction, which injects content into the hash (absorb phase) and then iterates a threshing function (squeeze phase). This construction specifically addresses security concerns which BLAKE2 simply doesn't.

To be clear: I do think the BLAKE developers made a good contribution to the security community: particularly, using ChaCha makes their algorithm very fast, and I suspect that the next generation of fast collision-resistant hash functions will use ChaCha in the sponge construction. There are situations where BLAKE is a better choice than SHA3. But its use requires a great deal of knowledge and care to be secure, and for the average person implementing a secure system, SHA3 is a much more responsible choice.

Security should not be taken lightly. Bad security can expose people's private information and get people jailed, doxxed, and sometimes even killed. Glibly claiming BLAKE is better without any discussion of the security properties of the algorithms is completely irresponsible.


Can you please explain where exactly you got the idea that BLAKE2 was length-extendable?

Can you also please explain some of these other attacks you're talking about it sharing with MD5 and SHA1? The commonality between MD5, SHA1, and SHA2 is the Merkle Damgard structure. BLAKE2 isn't an MD hash. Are these MD attacks that you're asserting apply to BLAKE2?

I'd like to know where the certitude you're projecting is coming from.


> Can you please explain where exactly you got the idea that BLAKE2 was length-extendable?

No, you're right: I misunderstood the algorithm; an extension attack has not yet been found. My core point still stands though: choosing a cryptographic anything should start with a consideration of the security properties of the algorithm, and only then should we talk about speed.


BLAKE(2) is not open to length-extension attacks---in fact resistance to those was a requirement of any SHA-3 submission. Its design does not have much in common with MD5 and SHA-1 beyond the usage of the compression function building block, instead of a public permutation as sponges do. The mode of operation of BLAKE is not Merkle-Damgard, but a variant of HAIFA.

As far as security goes, Keccak and BLAKE are mostly in equal standing both in security margin (number of rounds attacked vs total number of rounds) and cryptanalytic attention received.


> BLAKE(2) is not open to length-extension attacks---in fact resistance to those was a requirement of any SHA-3 submission.

I did make a mistake understanding the algorithm. You're mostly right: no length extension attack has yet been found. However, HAIFA is far more similar to MD than the sponge construction, and this is widely cited as a reason for Keccak's selection.

> As far as security goes, Keccak and BLAKE are mostly in equal standing both in security margin (number of rounds attacked vs total number of rounds) and cryptanalytic attention received.

Then why was Keccak selected? It's clear that at least some analysts think Keccak has significant advantages over BLAKE.

And ultimately my point still stands: a comparison of cryptographic hashes should start with a discussion of their security properties. Glibly stating "it's faster therefore it's better!" is highly dangerous.


You're mixing modes of operation with compression functions. A mode can be shown to be unconditionally resistant to length-extension attacks. Every SHA-3 finalist's mode was shown to be 'perfect' (indifferentiable) as long as the compression function remains strong. So if a length-extension attack is found on hash function H, this means that something has gotten horribly wrong and it is likely that more serious attacks are also possible.

It is in the compression function (resp. permutation) that Keccak differs the most from BLAKE. BLAKE is ARX-based---like SHA-2---while Keccak only uses bitwise operations. Since SHA-2 is not being deprecated by NIST, they figured that having a 'different' SHA-3 would hedge their bets against an attack against ARX primitives that could potentially break both SHA-2 and {BLAKE, Skein}. This is stated in [1, §3.4]. This has nothing to do with the MD structure.

Sure, when discussing these things security does come first. But all of these functions have been thoroughly vetted already, so the differentiators are elsewhere: sponges are flexible, BLAKE is faster in software, etc.

[1] http://nvlpubs.nist.gov/nistpubs/ir/2012/NIST.IR.7896.pdf



Supposedly. So far, I haven't found a software implementation that is faster than MD5. However, I have found plenty of Skein software implementations that come close to meeting the speed specification.

See http://ae7.st/p/5px. I'm using the reference code at https://github.com/BLAKE2/BLAKE2 for blake2, and https://jxself.org/git/?p=skeinsum.git for skein.


One of the key points of hashes in cryptography is to be computationally expensive rather than faster. Faster means more password attacks per second.

Granted faster is better for non-cryptographic purposes like data indexes, but even there I'd consider performance secondary to the hash size, etc


No -- You are confusing password hashing algorithms with general purpose hashing algorithms. Algorithms like bcrypt and scrypt have a tunable difficulty parameter, and are designed to be slow.

For general purpose hashing, you want to check if the fingerprint of these ten gigabytes of data is the same as the fingerprint of these other ten gigabytes, as quickly as possible. Or whether a file that you downloaded is the same as this other file. Or whether the data that you transferred has been tampered with or corrupted. Speed is important enough that this was one of the criteria in the hash algorithm selection process.

The key feature of a general purpose hashing algorithm is resistance to preimage attacks. In other words, "If I want hash 0x123456, what should the input be?" needs to be a difficult question to answer.

Speed is key when you are getting a fingerprint of a large amount of data. Don't use a general purpose hash directly for hashing passwords: It's better than plain text, but it's inferior by a long shot to special purpose password hashes.
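
To make the split concrete, a Python sketch (the iteration count is an arbitrary example; tune it to your hardware):

    import hashlib
    import os

    # General-purpose hash: one fast pass, fine for fingerprinting data.
    fingerprint = hashlib.sha256(b"...gigabytes of data...").hexdigest()

    # Password storage: a deliberately slow, tunable KDF.
    salt = os.urandom(16)
    stored = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 600_000)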


I'm not getting confused. SHA-3 can be used for cryptographic hashing. In the linked PDF:

"The SHA-3 family consists of four cryptographic hash functions, called SHA3-224, SHA3-256, SHA3-384, and SHA3-512, and two extendable-output functions (XOFs), called SHAKE128 and SHAKE256."

If BLAKE isn't intended for cryptography then it's not a direct competitor to SHA-3.


General cryptography -- for example, message validation -- does not need to be slow. In fact, slow message validation would cripple hash functions for cryptography, increasing CPU load and reducing throughput.

Password checking is an edge case. Special purpose password hashing functions with tunable difficulty should be used for those. Do not use general purpose hash functions: They are better than plain text, but they are designed to be fast, and this makes it easier to brute force them.


> General cryptography -- for example, message validation -- does not need to be slow. In fact, slow message validation would cripple hash functions for cryptography, increasing CPU load and reducing throughput.

There is definitely a trade off between the two (performance on servers vs rate of passwords an attacker can crack). But generally the advice is to go for the slowest you can afford. Hence why KDFs have an iteration parameter, so passwords can be hardened as hardware gets faster.


No: This is why you have a split between KDFs and hashes. Where you use one, you would not want to use another.

Trying to make one that does both leads to something that sucks at both.


> No: This is why you have a split between KDFs and hashes. Where you use one, you would not want to use another.

That's not true either. Hashes are recommended to be used as input to KDFs.

"Modern password-based key derivation functions, such as PBKDF2 (specified in RFC 2898), use a cryptographic hash, such as SHA-2"

Source: https://en.wikipedia.org/wiki/Key_derivation_function#Key_st...

> Trying to make one that does both leads to something that sucks at both.

eh? Nobody is advocating that whatsoever. Not me, not anybody.

I think you're now arguing with me for the sake of arguing with me. :-/


Ok, that was sloppy phrasing on my part, but my point still stands: The entire reason that KDFs exist is to deal with the fact that hash functions are designed to be as fast as possible. If it made sense to make hash functions slow, then KDFs would not be needed.

While KDFs do use hash functions internally, the hash function is an implementation detail.


Ok, thank you. That does make more sense now :)


Ironically this whole thread played out in the opposite direction at the time of SHA3's announcement:

https://www.schneier.com/blog/archives/2012/10/keccak_is_sha...

https://www.imperialviolet.org/2012/10/21/nist.html

tl;dr: Keccak is not a good partner for PBKDF2. It has good hardware performance but comparatively poor software performance. This benefits attackers with FPGAs or better.


The hilarious part here is Wikipedia calling PBKDF2 modern. It's a 2000 minimal update (to generate more bits, to be kind of UTF-8 aware) of a 1993 standard.

At the time of the RFC's publication, it was already obvious its security was way behind bcrypt, which had been used in OpenBSD since 2.1 (June 1997) and did its best to be ASIC-hostile, which isn't the case for PBKDF2.

In retrospect, NIST choosing PBKDF2 over bcrypt in NIST SP800-132 could be seen as part of the effort to weaken standards for NSA profit.


You're being downvoted because despite the "wealth of other content online about using cryptographic hashes for password storage", you haven't actually read enough of it to know that neither SHA3 nor BLAKE are appropriate for password storage. Neither is a key stretching function. Ignorance isn't a sin, but unwarranted overconfidence is.

SHA3 and BLAKE are cryptographic hash functions, but they are fast collision resistant hash functions, NOT key stretching functions. They're primitives used in the construction of other cryptographic tools. Some cases where you might use a fast collision-resistant hash:

* Timing-attack-resistant string comparison: if you're comparing API keys, you should hash them both first to prevent an attacker from guessing the keys a character at a time (there's a sketch of this after the list).

* HMAC (look it up).

* One can implement key stretching functions by applying fast collision-resistant hashes multiple times.

* Signing (one can hash a message and sign the hash, which provides as much security as signing the message itself).

* Fingerprinting (a fast hash of a public key can be used as shorthand to verify public key ownership without having to read off the entire key).

* Addressing (Bitcoin uses a fast hash of a public key as an address).
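
Taking the first of those as an example, a sketch of timing-safe API-key comparison in Python (hash first, then compare in constant time):

    import hashlib
    import hmac

    def keys_equal(supplied: bytes, expected: bytes) -> bool:
        # Hashing first makes both inputs fixed-length and uncorrelated
        # with the secret, so the comparison leaks no useful timing info.
        return hmac.compare_digest(hashlib.sha256(supplied).digest(),
                                   hashlib.sha256(expected).digest())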


I will admit that I'm not familiar with SHA3 or BLAKE - in fact my OP demonstrates that I'm looking for more information about why BLAKE is a "better" hashing function. But I am actually quite aware of timing attacks, HMAC, and the other points you raised.

Maybe my "wealth of other content online" comment pissed a few people off - but equally I was pissed off that my original comment was downvoted so heavily, with a few responses that weren't entirely accurate (I've often said the negative rep on HN gets overused and often causes more arguments - but that's another topic). Anyhow, I've removed my offending comment now and I'm glad to see that the quality of responses has improved :)


Most applications of hashes, to inputs that aren't passwords, are protected because their input space is so large that it can't be brute forced or because their input simply isn't secret. The speed of the hash function then becomes a feature instead of a weakness.

For example, suppose I give you the hash of a random 128 bit network packet payload, and you have hardware to evaluate the hash function I used a quintillion times per second. How long will it take you to find that packet? Well, there's 2^128 possibilities and you go at a rate of 10^18 per second, so... 2^128/10^18s ~= 10 trillion years.
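
Checking that arithmetic in Python (10^18 guesses per second, i.e. a quintillion):

    # 2^128 guesses at 10^18 per second, expressed in years:
    years = 2**128 / 1e18 / (3600 * 24 * 365)
    print(round(years / 1e12, 1), "trillion years")  # ~10.8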

Key derivation functions (i.e. password hashes) are a specialized version of hash functions for private low-entropy inputs. They need to be slow to prevent quickly enumerating and evaluation all likely inputs. Paying that time cost would be unnecessary, wasteful, and bloated for inputs that are public or high-entropy.

For example, Git would suffer enormously if it used a key derivation function instead of a standard hash function. Useful operations like rebase and squash would go from taking milliseconds to taking minutes or hours.


I feel silly just wading back in here, but...

Even assuming we're optimizing our selection for a KDF, blake2 is probably still the better choice. An attacker is likely to be using a hardware implementation. Your server is using software. An algorithm that's comparatively efficient in software reduces the disparity in capabilities.


General-purpose hash functions should be as efficient as possible. That's why they shouldn't be used for password hashing directly. There are special hash functions (slow, memory-intensive, hard to parallelize) for storing password hashes: https://password-hashing.net/


BLAKE wasn't specified as being general purpose and was compared against SHA-3 as being "better" because it's faster. Since SHA-3 does support cryptographic functions, my comment is a reasonable response stating that performance isn't the only metric when choosing a hashing function.


In any context where slower is better, SHA3 is not nearly slow enough.

Your comments (none of them, really) were not at all reasonable, assuming as they did that cryptographic hash is synonymous with password hash.


I didn't make that assumption. I exampled one use of cryptographic hashes as being for password hashing. An example is not the same as saying two things are the same.


For a fast hash, being fast is always better. You were not pointing out that there are other metrics, you were directly contradicting a true statement, that BLAKE being faster makes it better.

There is no use case where you want your super-fast hash to be 50% slower.

With a hash this fast, you need to get thousands of times slower or more to see any benefit in those specialized use cases.

It's sort of a bathtub curve.


There are lots of non-password-hashing applications of cryptographic hashes, like virtually every authenticity application. Your computer calculated hashes to allow you to post that post on HN.

Although we do want hashing to be slow when it's password hashing, that doesn't mean we want our general-purpose hash primitives themselves to be slow.


I know. I made that comment in the post you're replying to.


I think you are confusing hash functions with key derivation functions...


I'm not. Hash functions are used in cryptography for password storage (eg SHA2-512).

Best practice would be to use a KDF with a hash salt, but in a lot of cases, SHA2 + salt + pepper is sufficient.


> Hash functions are used in cryptography for password storage

hint: they're used for more than that.

you are woefully out of your depth here.


> you are woefully out of your depth here.

I very much doubt that since I've been able to provide back up sources to evidence my points. However if I am wrong then please do educate me instead of posting uninformative troll comments like the above.


"If BLAKE isn't intended for cryptography then it's not a direct competitor to SHA-3."

If you didn't know that BLAKE was one of the entries in the SHA3 competition (one of the finalists in fact), I think you are out of your depth.


Those two points aren't mutually inclusive. I do understand cryptography and hashes - maybe not to a security researchers level nor be up to date with the latest proposals - but that doesn't mean I'm out of my depth to ask the questions I've been asking either.

Furthermore, the way you conduct yourself in these posts isn't exactly helpful. You're condescending and terse. Your comments contain the bare minimum of information (or, in the case of this latest exchange, no useful information whatsoever), and it feels very much like you're more interested in winning ego points than in educating someone who could clearly benefit from your greater wisdom. Which is sad, because without that exchange of knowledge, these kinds of threads will keep happening.

edit: that said, I did appreciate your comment about hardware vs software: https://news.ycombinator.com/item?id=10012537 so thank you for that post :)


If you're interested in learning, ask more questions. This thread started off with some dubious statements, and when that was pointed out, you dug in deeper and deeper. That doesn't inspire much confidence that you will appreciate someone taking the time to educate you. Hence, the terseness.


My first draft of that post was more question-based than statement-based but for some reason I thought the rewritten version was better presented. In hindsight that was really poor judgement on my part. Sorry for that. But I assure you that I'm very much interested in learning :)

The "digging deeper" was more down to responding to other people who had misunderstood my original comment. And lets be honest, there has been a lot of that as well.

edit: oh jeez, someone's been on the rampage with the ▾ clicking. I think it's time I give HN a break because the moderation on here has really been bugging me in recent months. I know it's a tired cliche, but I'm sure this community used to be less hostile. </soliloquy>


KDF should be slow.

Stream ciphers should be fast.

Message authentication should be fast.

Hash functions can be used in any of the above, therefore they should be fast. Key derivation functions generally work by taking a secure hash (which can be computed fast) and then applying an operation that transforms the output of the hash function into another output that requires a lot of time and/or memory, then possibly hashing that as well to obscure the internal workings. The "slowness" is part of the KDF, and not part of the hash function. The reason it's tuneable is that it doesn't matter how fast the hash is; you can just run more iterations.
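
A toy version of that shape (illustration only; real systems should use PBKDF2/bcrypt/scrypt, which handle salting and iteration properly):

    import hashlib

    def stretch(password: bytes, salt: bytes, iterations: int) -> bytes:
        # The underlying hash stays fast; the tunable slowness comes
        # from simply running it many times.
        state = hashlib.sha256(salt + password).digest()
        for _ in range(iterations):
            state = hashlib.sha256(state).digest()
        return state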


> SHA2 + salt + pepper is sufficient

Not even close. The attempts/sec you can run against SHA-2 vs. the attempts/sec you can run against bcrypt or scrypt are an order of magnitude apart.

You should not use anything less than a proven KDF (scrypt, bcrypt, PBKDF2-HMAC-SHA2 with a lot of rounds) for password storage.

Many of your posts conflate "cryptographic" with "password hashing", which is certainly not the case. Note that PBKDF2-HMAC-SHA2 does not (at all) mean that SHA-2 is useful for password hashing. As someone else pointed out, that's just an implementation detail.


Isn't the main purpose of a hashing function to validate file (edit: or "message") integrity? So you want them to process as many MB/s as possible?

I thought using hashing to save passwords is misusing cryptographic hashes for something they were not intended to do? (Assuming the goal is keeping the input secret, as opposed to preventing colliding duplicate inputs?)


Collision attacks are a non-issue with modern hashes.

The point of hashing passwords is that it's a one-way cipher, i.e. it can't be decrypted - it can only be brute forced or attacked with rainbow tables (the latter is where salts and peppers come into the equation).

https://en.wikipedia.org/wiki/Cryptographic_hash_function


You keep defining "cryptographic" hashing as equivalent to password hashing, but password hashing is only one application of cryptographic hashing (a minority application). A hash function is still "cryptographic" and its applications are still "cryptographic applications" when they aren't password-related. Those applications commonly optimize for speed, rather than pessimizing for speed. In many of those contexts, the input to the hash and the hash value are presented simultaneously to the verifying party (or aren't even secret); in those contexts there is no benefit at all from making the cryptographic hash function slow to compute, and considerable benefit from making it fast to compute.


As per https://news.ycombinator.com/item?id=10012384:

I didn't make that assumption. I exampled one use of cryptographic hashes as being for password hashing. An example is not the same as saying two things are the same.

I've possibly expressed myself rather poorly, but I think quite a few people on here have made some incorrect assumptions about the point I was raising.


SHA-2 is still dandy - if I understand correctly, SHA-3 is an entirely different family of cryptographic hash functions, so that if/when a problem is discovered with SHA-2, there's another standard approved and ready to go - but that doesn't mean SHA-3 can't be used now.


I gotcha, that makes sense to do it that way, such that if the foundation of SHA-2 is compromised, SHA-3 can be deployed safely where it's needed.


Hasn't this happened before? Clearly security issues abound.


This convinced me to not waste another second on Ethereum. It serves a self-defeating purpose, and I think it's diametrically opposed to the hacker ethos. Bitcoin is neat because it can't be controlled; nobody can tax it or impose fees on it. It's the free market at its very best. The bitcoin protocol is weak, but luckily developers are strengthening it all the time.

Laws as a general concept are diametrically opposed to human nature, and they should be avoided whenever possible, especially in regards to social realms like bartering and contracting and the like.


Have you stopped drinking altogether? Did you consider yourself someone with a 'problem' when you did your experiment? Did your findings change the ferocity with which you drink? Given that it was one of your original expectations of the experiment, do you think that social friction plays a major role when it comes to the frequency with which you drink?

I find the idea of that type of experiment fascinating, perhaps you documented it in more detail....?


I did for the two months, then recently had a few and broke the streak. No kind of problem at all, although that's where everyone's mind goes when you say you quit cold turkey.

Findings didn't change that much, but the break itself changed the way I drink, I think. Previously it was somewhat of a social default to just go out with some friends over a beer in the absence of other events. Being conditioned not to do that, and instead stay in and have a quiet night at home has given some perspective. I felt like I got a slight time gain on the evenings, but nothing serious.

Friction previously played a role. Especially in my early valley days when I was an intern, or in college, or as a contractor (negotiation/business things are hugely greased by alcohol), but now that I'm full time at a quieter company where drinking isn't embedded in the culture, the social friction aspect is basically gone. It's a per-social-circle thing for sure.

Unfortunately I did no detailed documentation. The stuff I'd want to measure would require months of prep/control gathering (e.g. how I use my post-work time before and during the experiment).

Most interesting part of the whole thing was how uninteresting it was. There's the idea that drinking is socially necessary and horrible for your health. The idea that it's so meaningless is a little weird.


Lua. ComputerCraft is a very mature platform for playing around with, and it's powered by Lua scripts. Lua is a very simple language to learn, Minecraft is a great platform that kids seem to love, and the tangibility of seeing the world they create molded by the code they wrote has a profound effect on kids.

At least in my experience, anyway. Semantics first, concepts later - this is why Lua is easier to learn than JS.


Every day at work I just try and endure the suffering, and every night at home I just try and endure the alcohol.


Wow I'm so sorry man, I really hope you pull through. Let me know if I can help in any way.

