Hacker News new | past | comments | ask | show | jobs | submit login
Computer scientists combine two 'beautiful' proof methods (quantamagazine.org)
95 points by tambourine_man 20 hours ago | hide | past | favorite | 23 comments





I know cryptocurrencies have a lot of problems, but this is one place where it’s good they exist: they’ve funneled a ton of money into researching new cryptography. The article claims that this research was developed for StarkWare, which developed its own form of zero knowledge proofs they called Starks, developed a programming language in which cryptographic proofs can be generated showing that the output of the program was truly produced from the inputs[^1], and continued to do more research like this. While I’m not a cryptographer, I think this sort of thing is super cool and I’m glad there’s a lot of attention and effort going into it.

^[1] the cool thing about these proofs is that they can be verified in much less time than is required to run the program. IIRC, starks can be verified in log-polynomial time, and snarks can be verified in constant time. This allows cryptocurrency protocols to have one computer in the network generate a proof of the state of the chain given a new block, and all other are able to quickly check it. Without this, you have a ton of waste because every computer in the network must duplicate the work of computing the new chain state. (although some new waste is introduced by the fact that producing a proof is very slow, although this is another area that cryptocurrency-funded research has improved dramatically)

Another cool bit of cryptography research that I think was motivated by crypto was “verkle trees”, an improvement on merkle trees that uses vector commitments rather than your typical hash. A vector commitment is like a hash, but it hashes an array of items and you can check that the hash was produced from an array with a certain item at a certain index without knowing the other items in the array. If you use this to back a merkle tree, you can increase the branching factor a lot and end up with much smaller proofs that a particular item is in the tree. This is because the proof size for a merkle tree is b * log_b n (where b is the branching factor and n is the number of items in the tree). The proof size for a verkle tree is just log_b n, so you can increase b as much as you want without making the proofs bigger.

Full disclosure: I work for a blockchain company, but not one mentioned here. I have no equity or stake in crypto outside of the fact that if the price goes to 0 my company probably can no longer afford to employ me


The other cool thing about verkle trees is that they were mostly figured out by a high schooler for a science fair (?) project.

They're not actually that complicated in principle, just nobody thought of it before and that's pretty cool.


Wow, that is cool. I'm actually annoyed that vitalik didn't credit him in his article about them. I wonder how many people think that the v stands for vitalk

> vitalik didn't credit him in his article

You mean this article?

https://vitalik.eth.limo/general/2021/06/18/verkle.html

> Verkle trees are still a new idea; they were first introduced by John Kuszmaul in this paper from 2018[link to [0]],

0: https://math.mit.edu/research/highschool/primes/materials/20...


Oh, I wonder if he edited it, I googled it to find what the GP was talking about and saw this which claimed he didn't: https://news.ycombinator.com/item?id=27557503

I want to highlight the fact that the quote from Vitalik’s article is the first sentence of the second paragraph. Credit is prominently given, not buried in any way.

> The article claims that this research was developed for StarkWare,

Does it? Starkware is mentioned only as the company run by some.guy who commented on how pivotal this work was.

It doesn't say he was related, and I can't find any affiliation between the researchers (all uk based at Cambridge and Warwick) and the US based StarkWare.


Oh, maybe it was unrelated, I think I misread the article

A professor I know once gave me the most beautiful example of a zero knowledge proof: she said “Imagine I want to prove to you that I can count the leaves on a tree, but I don’t want to reveal how I do it. I start by telling you there are e.g. 756,912 of them. Then I close my eyes and I let you remove as many as you want. We can keep going until you are convinced, and you still won’t know how I did it.”

Okay, but I wouldn't be convinced until I've counted up to 756,912 myself and seen that there are no more leaves left after that. Isn't it supposed to be less computationally intensive to verify the proof than to complete the original calculation?

I tried learning about zero knowledge proofs during the crypto craze, but I never understood the basic idea of how it was supposed to work, much less the math involved. I'd appreciate an ELI5 from anyone who has one.


The proof is that they can tell you the number of leaves (again) after you have secretly removed some. Since only you know the number you removed, if the difference between their counts matches your number, it is likely that they can indeed count the leaves on the tree. It is possible they have correctly guessed, which is why you repeat the challenge until you are convinced.

Or they have a way to measure your removal, eg, they don’t count leaves but branches without leaves (in the hypothetical tree example).

Sorry, that doesn't make any sense.

Edit: From another comment, it starts do make sense. So you let them remove a number of leaves secretly, and then you tell them the new number again, and they can check if your answer makes sense because of the difference between the two numbers.


This is a nice example, but couldn't he add a constant to his count as long as he adds the same constant each time? Regardless, it's nice

I know this is useful for crypto, but I think think I'm actually more interested in what new modes of remote code running on untrusted platforms this enables.

That is exactly the reason it's useful for crypto, nodes need to verify the output of code running on other nodes without trusting them.

It does seem like the article touches on concerns relevant to homomorphic encryption. Maybe someone knows if there is a connection.

> If someone finds a valid solution, they can easily convince a skeptical “verifier” that it really is valid. The verifier, in turn, will always be able to spot if there’s a mistake. Problems with this property belong to a class that researchers call NP.

How do you quickly verify that traveling salesman path is indeed the shortest one?


You have discovered that that problem indeed is not in NP, for the reason that it is not a decision problem. The decision problem is in NP, and there the problem is: given a graph and some value k, does there exist a TSP with cost at most k. You can see how that problem then becomes verifiable.

I think this is a great distinction that deserves elaboration for those less familiar with the topic. Can you explain a bit further?

In this context we can divide problems into two cases: optimization problems ("find a solution that satisfies P, in a way that has the highest score") and decision problems ("is there a solution that satisfies P, with score >= X" or "find a solution that satisfies P, with score >= X").

Complexity classes like NP are defined only for decision problems, not for optimization problems. NP can be defined either as

* the set of decision problems where, given a solution, you can check it in polynomial time with a deterministic Turing machine, or * the set of decision problems solvable in polynomial type by a non-determistic Turing machine

these two definitions being equivalent.

When someone mentions an optimization problem being in P, NP, or NP-hard, they actually mean the associated decision problem being in P/NP/NP-hard.

While technically incorrect, this is fine when working informally, because you can "translate" between the two in polynomial time.

If I give you an oracle (ie. a magic box) that solves an optimization problem (ie. gives you a solution to P with the highest score) immediately, you can trivially write a polynomial time algorithm that solves the decision problem: call the oracle, check the score of its answer, then compare that with the X value you were given. And vice versa: if I give you an oracle that solves a decision problem immediately (ie. given a value X, gives you a solution satisfying P with score >= X), you can write a polynomial time algorithm that uses the oracle a few times with the right values of X (exponential search then bisection), to find the solution with the highest score.


A decision problem is a problem with a yes or no answer. An NP problem is one where a "yes" answer can be verified quickly if they also give you some proof or "witness" of it. If I say "is there a route that goes through all these cities that's less than 500 miles long", and you answer yes, you can prove that such a route exists by showing me one of these and I can simply check how long the route is. So the problem is in NP because a "yes" answer can be proven or checked quickly.

However, if you say "no, there is no such route", there's not obviously any way to quickly show that. Despite that, the problem is still in NP because to be in NP there only needs to be a quickly-checkable proof of a "yes" answer. If you want a quickly checkable proof of a "no" answer, you need a separate class of problem called co-np.

A problem can also be in np and co-np at the same time, if both "yes" answers and "no" answers can have a proof that can be checked quickly.


Verifying that a path/tour is shortest is in co-NP.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: