It all depends on who is doing the attacking. If the attacker is a state governm...

kragen · on Sept 10, 2019

The cost of mounting the attack with general-purpose computers would be orders of magnitude higher than the US$12B calculated above, just from the energy bill. Check out https://en.bitcoin.it/wiki/Mining_hardware_comparison; the ASIC AntMiner S9 gets 10 billion hashes per joule, while on https://en.bitcoin.it/wiki/Non-specialized_hardware_comparis... you can see that the most efficient GPUs get 2–3 million hashes per joule, 3000 to 5000 times less. So instead of spending US$12 billion on hardware and US$4 billion per year on power, you'd be spending US$12 trillion per year on power. No state has such a large budget; that's more than half of the US GDP, and about 15% of world GDP. That's more than the amount of marketed energy produced in the world.

That is, the attack you're proposing would require commandeering 15% of the world economy and more than doubling world electrical generation capacity.

Bribing or backdooring ASIC miners is a far more likely vulnerability.

lonelappde · on Sept 10, 2019

Are ASICs totally useless for non Bitcoin computing?

kragen · on Sept 10, 2019

Yes, that's what the “AS” stands for.

mikekchar · on Sept 10, 2019

To be a bit more specific, the ASIC is a computation built in hardware. It can do that computation and that computation alone. The algorithm in Bitcoin does absolutely nothing useful -- you are just hashing a random number and seeing if the result is below a certain value. Since you can't change the computation on the ASIC, it's absolutely useless for anything except mining bitcoins.

perl4ever · on Sept 11, 2019

Maybe this is dumb/naive, but I've wondered for quite a while - what if you take the latest deep learning/AI techniques and try to train a system to predict the approximate size of the hash from the input number? Has anyone seriously tried and failed?

schoen · on Sept 11, 2019

A different angle from the sibling comment's: the hash functions used here are meant to approximate pseudorandom functions, which are functions that, while deterministic, literally have no structure at all relating their inputs to their outputs in any way.

Hash functions used in the real world haven't been proven to have this property (and there are various theoretical limits on how readily they could be proven to have it, and maybe on the extent to which they could actually have it), but in order to be widely adopted, a hash function has to pass every available statistical test for approximating pseudorandomness, and also has to resist mathematical analysis aimed at finding useful structure. That means that ordinary human intelligences fail to find a practical recipe for predicting properties of the output from properties of the input.

In the same way, we would expect that deep learning systems fail to find such recipes too.

On the other hand, it's not absolutely impossible that there are some kinds of regularities that a deep learning system might discover. If so, they would be considered very serious flaws in the hash function in question. But deployed cryptographic primitives have sometimes had problems like this. The best example that I know of is the RC4 cipher

https://en.wikipedia.org/wiki/RC4#Security

where there have been a series of statistical biases (which are often forms of correlation between input and output, which should not exist if RC4 approximated pseudorandomness well). Some of these were apparently discovered experimentally by researchers with some kind of hypothesis testing tools, as opposed to based on theoretical abstract reasoning about the mathematics of RC4. This makes me think that some kinds of deep learning systems might also have been able to discover those correlations, although I'm not sure that they would have been the most efficient methods for doing so. (An interesting test might be to try to use deep learning to find new correlations in RC4 that aren't yet known -- which seems plausible since researchers have repeatedly found new ones over time.)

I think there are interesting problems about what kinds of correlations and structures deep learning systems can or can't learn efficiently, and whether those are the kinds of correlations and structures that are likely to exist as genuine flaws within deployed hash functions. I definitely don't know enough about the mathematics of deep learning to appreciate how to begin answering this question; I only know that if it turned out to be useful in some case, it would mean that the application of human intelligence and existing statistical tools to assessing hash functions' security had dramatically fallen down on the job.

kragen · on Sept 11, 2019

This is correct.

mikekchar · on Sept 11, 2019

If you could do it, you'd be famous for more than BTC. It would mean that you could factor numbers in polynomial time. Long story short, calculating the hash for BTC is intended to be a non-polynomial problem. If you can find a method (any method) that will reduce the search space of the answer such that you can calculate it in polynomial time, then you have proved that NP=P (If you can find a way to solve any non-polynomial-time-hard problem in polynomial time, then it means that all NP hard problems can be solved in P time). This would completely break all current encryption.

Most people think that this is impossible, but it has not been proven yet. Unless you had some reason for believing the technique would work, it's probably not worth the effort to try.

schoen · on Sept 11, 2019

One minor correction to this (although I agree with you conceptually) is that there are no formal security proofs of the complexity class of hash function attacks for actually-existing hash functions. So there is no guarantee that the most efficient way to break a hash function security property is a generalizable attack on NP problems! It might just be that the specific hash function is weak in a previously unknown way.

A sort-of precedent for that kind of problem which I mentioned in my sibling comment is the correlation weaknesses in RC4. While they're not the most powerful possible break of RC4, by any means, they are unanticipated flaws in the structure of RC4 specifically, and they might well have been discovered by software tools that can't solve NP problems in general. It seems to me that we don't have security proofs for symmetric cryptography at all, including for block cipher security properties as well as hash function security properties, and so while your observation is totally right in general, in any specific case it might just turn out that the cryptographic primitive we were using was weak in an unanticipated way that's specific to that class of functions.

Compare https://en.wikipedia.org/wiki/Random_oracle (although reading that article reminds me of how much I don't understand about this topic)!

kragen · on Sept 11, 2019

In some ways you are correct, but you have made some errors.

As schoen pointed out, no common hash function, including the ones used in Bitcoin, has a proof of NP-completeness.

Moreover, none of those hash functions involve factoring numbers, and factoring numbers is also not known to be NP-complete, although it is also not known to be tractable in polynomial time. One reason commonly-used proof-of-work functions do not involve integer factorization is that, while integer factorization is not known to be doable in polynomial time, there are a number of algorithms that require subexponential time, so an integer-factorization-based proof-of-work witness would be much larger than an equivalent hash-function-based proof-of-work witness.

Also, it is not the case that efficient integer factorization would completely break all current encryption. Not only do no commonly-used hash functions depend on it, neither do any commonly-used symmetric ciphers (such as AES), and the currently-most-popular asymmetric cryptosystems also do not depend on the difficulty of integer factorization; instead they depend on the difficulty of the elliptic-curve discrete logarithm problem.

ECDLP is also not known to be NP-complete, but the currently-known algorithms for it are much worse than currently-known algorithms for integer factorization, so elliptic-curve cryptosystems require much smaller keys and less computation to resist the known attacks than integer-factorization-based cryptosystems.

As little as ten years ago, algorithms that could be broken by better integer factorization algorithms were relatively much more important than they are today, because elliptic-curve cryptography was much less widely used. Many vulgar accounts of the situation intended for the ignorant are not up to date.

Finally, it is not true that a proof that P=NP would "completely break all current encryption", for two reasons. First, it might not be a constructive proof — it might show that a polynomial-time algorithm for factoring integers, solving ECDLP, or computing hash preimages exists without actually telling you how to compute it. Second, even a constructive proof of P=NP might not provide an algorithm that was adequately efficient — if it takes O(n²) time to encrypt and decrypt, where n is the size of a key, but O(n⁸) time to break a message or a key, you might be adequately safe with, say, RSA-4096. But an O(n⁸) algorithm for 3-SAT would definitely be a constructive proof of P=NP.

(Shor's algorithm on a quantum computer can break RSA, because it does depend on integer factorization, in O(n³) time, if quantum computers can exist, which they probably can. This would not make it impossible to do RSA encryption securely, but it would require much larger keys than are currently used.)

However, your fundamental point is that a successful attack on Bitcoin's hashing algorithm, using artificial neural networks or anything else, would be very surprising and have major implications, because that proof-of-work scheme is designed to require exponential work, and as far as anyone knows, it does. And that fundamental point is correct, even though you have made a number of errors in your supporting points.

celticninja · on Sept 10, 2019

the "AS" in ASIC stands for Application Specific.

SO these machines are made to specifically mine bitcoin, other ASICs do other jobs, but individual units cannot transfer from one job to another and cannot be re-purposed.

kragen · on Sept 10, 2019

Right, the ASIC in your DVD player that does MPEG-4 decoding is useful for something other than Bitcoin mining. But it's not useful for Bitcoin mining at all. Similarly for a motor-controller ASIC, an HDMI encoding ASIC, a GPRS radio ASIC, and so on.

solveit · on Sept 10, 2019

It doesn't matter if the US government is doing the attacking, they're not winning with general purpose hardware. ASICs are literally thousands of times faster with lower power consumption than GPUs. Also, any state attacker worth their shit would have the ability to develop their own ASICs so I don't know why you're hung up on using general purpose hardware.

Finally, you certainly cannot buy off the ASIC miners with a few million, not least because they own billions of dollars worth of ASICS that can't do anything but mine Bitcoin, and a successful state-orchestrated attack on Bitcoin would make all of them worthless. I don't think a state government taking down Bitcoin by its own rules is completely impossible, but you don't seem to have any grasp of the scales involved.

nybble41 · on Sept 10, 2019

State governments have simpler and more reliable ways to attack the system. Like armies. Buying off the miners would require at least a few hundred million dollars—you'd need to buy the hardware to pull off an attack like that, not just rent it for a while. And you still wouldn't manage to get anywhere near the full market cap by shorting Bitcoin while you do it. You'd never find anyone crazy enough to loan you that much. A small fraction, perhaps. Not enough to cover your expenses.

And in the end, after your massive investment, the community would just make some trivial change to the protocol and completely ignore the attack. Really, if this sort of thing was so easy then everyone would be doing it. You're certainly welcome to try.

lonelappde · on Sept 10, 2019

You're ignoring diplomacy. An army is a very unsubtle attack that might have unwanted geopolitical consequences.

celticninja · on Sept 10, 2019

First of all hardware is a bottleneck, so likely the state needs to procure or manufacture their own. Which delays the start of the attack, during which time the network will continue to grow. The state cannot just magic ASIC devices from nowhere.