
Hi, I'm Maria, a technical writer with an extensive background in software engineering and application security. If you need a writer with a strong technical background who thinks like an engineer, I'm your girl! Happy to show off my portfolio of either writing or technical projects, just send me an email :)

  Location: Mexico City
  Remote: Exclusively!
  Willing to relocate: No
  Technologies: React, Node, Express, Nextjs, Flask, Python, Rust, Solana, Ethereum
  Email: mariavillosa1612@gmail.com

My personal website (darigo.su) doesn't have HTTPS. I just deployed it a few months ago and haven't really done much with it yet. I guess I'll have to get around to it eventually, but I find charm in small old sites that haven't implemented modern protocol stuff. My site also uses <font> and <center> tags all over the place.

Maybe I'll do some more quirky anachronisms, like only serve the site via HTTP 1.0 or something. Who knows. Since my site has very little functionality, it doesn't really matter, it's just for fun.


To me, HTTPS is worth it alone to eliminate the possibility of my readers' ISPs injecting shit (ads, trackers, etc.) into the responses I send them.

It’s completely trivial to set up, there’s really no downside at this point.


To be fair, triviality is relative to the tooling you run, and there are many downsides - extra negotiation and traffic, no support for older (pre-SNI) clients without using up a dedicated IPv4 address, certificate expiry, etc.

For most cases these are non issues, but there are many scenarios where those things can outweigh the potential of your ISP modifying/reading. If that's still a concern, you can tunnel through your ISP to a more trusted exit point.


This is the somewhat depressing but accurate answer. HTTPS doesn't mean you are communicating sensitive or private data, it means you want the client to see what you send them, and not something else.

I see this a lot. You've just made a small non-interactive site, and there's nothing secret. So why bother with HTTPS?

Well, firstly, I'd say, make it HTTPS for fun. It's pretty simple to do, renewal automates itself, and it costs no money. Just exploring this route can be very illuminating.

Secondly it prevents your site from being altered by the outside. There's a lot of equipment between your site and the client. Being HTTP allows it to be altered along the way.

For example - and the least serious - your ISP can inject adverts onto your page. Most ISPs don't do this, but quite a few do.

Second, a malicious user could inject additional text onto your pages, proclaiming your love for a political party, some racial or misogynistic slur, or whatever.

Third, you're sending a signal to all that you either don't understand the security risks, or don't care about them. This can have consequences (reputationally) if you are in, or going into, a computing career.

Making it HTTPS can be fun too!


The bigger issue is that a malicious interceptor could inject javascript. The site may not use javascript, but the users almost certainly have it turned on, and accessing a non-https site means that any man in the middle can inject malicious javascript.

HTTPS is about protecting the client.


E.g. China's great firewall sometimes inserts malicious Javascript that causes the client to launch a DDoS attack against Github or other sites China doesn't like.

https://en.wikipedia.org/wiki/Great_Cannon


Honestly, hard disagree. I get the push for HTTPS, but setting up HTTPS nowadays isn't fun. It's more or less 4 steps:

1. Install and run certbot (+the DNS plugin for your DNS provider).

2. Find an nginx config file on the internet that only includes the necessary ciphers to work on modern devices. Then include that config file + lines to your cert paths that certbot spits out into your nginx config.

3. Set up a generic redirect server block to send everything on port 80 to port 443.

4. Reboot nginx.

It's at least better than fiddling with openssl directly, but this isn't fun, it's busywork.
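
For reference, the catch-all redirect in step 3 is only a few lines of nginx config. A minimal sketch (standard nginx directives; adjust names to taste):

    server {
        listen 80 default_server;
        listen [::]:80 default_server;
        server_name _;
        # Send everything on port 80 to HTTPS permanently
        return 301 https://$host$request_uri;
    }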


I'm not sure if you're being sarcastic here or not? I mean you listed "reboot" as a whole step... which suggests sarcasm (at least to me)...

Anyway, all the steps together take what - all of 5 minutes? - that's if you do DNS Challenges. If you do HTTP challenges it doesn't even take that.

So you're saying this 5 minutes isn't worth the effort? That it's somehow too hard?


It's worth the effort, I just wouldn't use the descriptor of it being "fun". It's busywork equivalent to pointing nginx at your PHP socket. It's not fun, you're just wrangling a config file until it does what you want and the modern browsing landscape disproportionately demands HTTPS everywhere even for sites that arguably aren't really helped by it.

That’s fine, but rather unrelated to the article, which is about the situation that you have an API served via HTTPS, and the question of whether you should also have a redirect from HTTP to HTTPS in that case, or rather return an HTTP error.

Yeah, I should have clarified. Nothing to do with the article really, just a random thought. Sorry if too off-topic!

When a user visits your site with a modern browser in default configuration they'll get an error page along the lines of:

  Secure site not available

  Most likely, the web site simply does not support HTTPS.

  However, it’s also possible that an attacker is involved. If you continue to the web site, you should not enter any sensitive info. If you continue, HTTPS-Only mode will be turned off temporarily for the site.

  [Continue to HTTP site]
(Copied from Firefox on Android)

Personally I think that's reason to do it alone, for the price of a free Let's Encrypt cert you can deliver much better UX. You presumably want people to visit your site.


>When a user visits your site with a modern browser in default configuration they'll get an error page along the lines of:

Neither desktop firefox nor chrome seem to do this by default, at least on my Mac (actually I think I'm wrong about Firefox on desktop as well, thanks to a guestbook signer!). Maybe it's a Firefox mobile thing, rather than a modernity thing?

>for the price of a free Let's Encrypt cert you can deliver much better UX

I'm going to get around to it, I promise haha.

Btw if anyone does visit the site, please do sign my guestbook: http://darigo.su/33chan (but be warned, a MITM might intercept and alter your comment!!)


> Neither desktop firefox nor chrome seem to do this by default

they do, you probably just checked the "i'm sure this is safe" button

posted a pic on the imageboard hehe


As your screenshot says, you've enabled HTTPS-Only Mode - that's not the default, though, so most(?) Firefox users don't see that.

Thanks for visiting! Receiving guestbook comments is such a delight. Does Chrome also give you this warning on desktop?

nope

I can appreciate this, and I also run a service that is neither meant to be commonly visited (imagine a tor exit node landing page explaining what this IP address is doing, but for a different service) nor going to produce or ingest sensitive information.

For a personal website that people might commonly want to visit, though, consider the second point made in this other comment: https://news.ycombinator.com/item?id=40505294 (someone else mentioned this in the thread slightly sooner than me but I don't see it anymore)


An HTTP website presents an opportunity for an attacker to MITM a payload that is ultimately executed in a user's browser. Beyond 'getting a moustache tattoo on your finger' quirkiness, HTTP-only websites are really inexcusable beyond some very niche cases.

>Beyond ‘getting a moustache tattoo on your finger’ quirkiness

In that case, seems totally worth it. I, like moustache finger tattoos, am aggressively opposed to worrying about being perceived as cool. I will just have to live with being inexcusable.


not my problem!

Honestly, for public facing read-only websites, it's perfectly fine to redirect HTTP to HTTPS. There are just too many cases where you aren't going to get everyone to put "https://" on the front of URIs when they put them in docs, flyers, etc. You're lucky if you get "http://"!

The API security thing, yes, that makes sense. Personally, I run a number of servers for small groups where the sensitive stuff is SSL only - you won't even get an error going to port 80, other than the eventual timeout. But for reasons above, I cannot just turn port 80 off, and it's perfectly safe redirecting to 443.


>Re-encoding the image is a good idea to make it harder to distribute exploits.

Famously, the Dangerous Kitten hacking toolset was distributed with the classic zip-in-a-jpeg technique, because imageboards used to not re-encode images.

https://web.archive.org/web/20120322025258/http://partyvan.i...


Noob question from someone with little real CS experience, when the README for this project says:

> verifying the correctness of code

What is the difference between "verifying" the correctness of code, as they say here, vs "proving" the correctness of code, as I sometimes see said elsewhere?

Also, is there a good learning resource on "proving" things about code for working programmers without a strong CS / math background?

Edit: I'm also very curious why "zero knowledge" proofs are so significant, and why this is so relevant. Eg I heard people talking about this and don't really understand why it's so cool: x.com/ZorpZK


A very good resource for both verifying code and functional programming is Software Foundations (https://softwarefoundations.cis.upenn.edu).

One note though: Verus and the tool Software Foundations works with (Coq) take different approaches to proving things.

Verus attempts to prove properties automatically using something called an SMT solver, which is an automated system for solving constraints. Coq, on the other hand, requires you to prove much more manually, offering a more limited set of automations for proving things.

Both have their advantages and disadvantages, namely that automation is great when it works and annoying when it doesn't.
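
For a flavor of the automated style, here is a rough sketch in plain Rust. The requires/ensures contract is written as comments (not actual Verus syntax), and the debug_asserts stand in for obligations an SMT-backed tool would prove statically for all inputs:

    // Illustrative contract, NOT actual Verus syntax:
    //   ensures: result >= a && result >= b
    //            && (result == a || result == b)
    fn max(a: u64, b: u64) -> u64 {
        let result = if a > b { a } else { b };
        // A verifier discharges these once, at compile time, for every
        // possible input; here they are merely runtime checks.
        debug_assert!(result >= a && result >= b);
        debug_assert!(result == a || result == b);
        result
    }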

(Another side note: Zero Knowledge Proofs (ZKPs) are kind of something different. A great many people who work in formal verification/proof don't touch ZKPs at all (ex: me). They are better thought about as a cryptography primitive)


Verifying and proving are used synonymously, as made clear later in the opening paragraph.

As for zero knowledge proofs, there is little practical use, significance, or relevance to them due to the overhead involved and the lack of a "killer app", so to speak. But they're conceptually interesting.


Whenever I hear people talk about the lack of practicality of some mathematical construct, I always remember G H Hardy who worked on number theory at the turn of the century. One of his famous quotes I love is:

> I have never done anything 'useful.' No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world.

Despite his self-proclaimed focus on pure mathematics, Hardy's work, particularly in number theory, has had profound applications in cryptography and other fields.

I agree about the overhead. The costs have come down significantly already but they still remain a few orders of magnitude too large. That being said, its killer app is cloud compute. Right now the only way to amortize the cost of hardware is to run workloads on someone else's computers, which brings along all sorts of security and privacy risks. Well-performing ZK proofs (which may not exist / may take a long time to figure out) would let you do cloud computing securely without worrying about vulnerabilities in your cloud provider's network. Like all cryptography, it's a double-edged sword, because the same techniques would let websites deliver code for your machine to execute without you having any knowledge of what it's doing.


Do you ever think about the many more folks who worked on stuff that really was completely useless and nobody remembers? "Whenever I hear people talking about how the lottery is a waste of money with net negative expected returns, I think about the person I saw on TV who did win."

Cloud compute might be a good use case. But the overhead would have to come down dramatically for it to ever beat simply doing the compute in-house and incurring the overhead of managing a physical rack. Which is (and always has been, and likely always will be) an extremely viable option for anyone who is seriously interested in data access control.


> But the overhead would have to come down dramatically for it to ever beat simply doing the compute in-house and incurring the overhead of managing a physical rack.

A physical rack needs several physical people to look after it. Imagine it going down in the middle of the night while your sysadmin is on vacation in Hawaii.

And cloud computing is also not expensive. If you don't have a large load, a t4g.micro instance on AWS costs 0.5 cents per hour ($0.12 per day, about $44 per year). It will run most small workloads just fine, and you can easily scale it up with a couple of clicks.


ZKP's are a large load, so this hypothetical is invalid. Also ZKPs require you to already have some amount of in house compute on an in house rack, or else who are you even protecting your data from? So the question is do you provision slightly more compute into the rack you're already managing, or incur orders of magnitude more CPU time expense and a complete refactoring of your business logic to use ZKP's in order to... have slightly fewer servers in your rack? The benefits are hard to even comprehend.


I think you may be overestimating the complexity & costs involved. Hyperscalers run millions of machines and have extracted impressive efficiencies from that scale that you're not going to be able to compete with by buying and maintaining your own rack. This starts to become clear when you consider how ancient, outdated and expensive servers for hospital systems are.


The hyperscalers are only efficient when compared to typical corporate IT departments, and what they can buy on the open market. For example, some analysts suggest that AWS EC2 has an operating margin of around 30%. If we take what off-brand cloud providers charge (who compete mostly on price) and compare that to the EC2 pricing, AWS must have some tremendous inefficiencies if they only have 30% margins.

Even when considering network bandwidth (where hyperscalers are famously expensive, and orders of magnitude above wholesale pricing), hyperscalers might still be the cheaper option for larger public companies because of the structural issues they face internally (e.g., the inability to make it official that they are already running an ISP anyway, and start buying much cheaper carrier services from other ISPs).


> The hyperscalers are only efficient when compared to typical corporate IT departments, and what they can buy on the open market

As compared to what? Hypothetical IT departments staffed by the best engineers in the world?

> If we take what off-brand cloud providers charge (who compete mostly on price) and compare that to the EC2 pricing, AWS must have some tremendous inefficiencies if they only have 30% margins.

You're committing the cardinal sin of comparing list price. B2B sales heavily discounts the list price so you don't actually know what large companies are paying for vs what the discount providers would charge. I promise you the discrepancy isn't that wide.

Off-brand providers are usually much closer between list price & B2B pricing because their product offerings are so narrow that they can't offset losses in one with gains in another.

You also have to remember that hyperscalers need to charge a profit to reinvest back into R&D. Corporate IT departments don't do that because they're a cost center. So even if you hired a super duper amazing team and built something that meets the needs of today, over time the bean counters at your company cause you to underinvest relative to what you actually needed (or you make a strategic mistake) and now you're having to spend significant resources to make up that shortage. Public cloud scaling is a way to mitigate that cost.

There's also the knock on effect of employee salaries - public cloud means more available (i.e. cheaper) expertise. A privately managed cloud means you have higher salaries if you're competing competently.

I've talked to people who run and run on private clouds (e.g. banks that are barred from public cloud by regulation) and while it's impressive, they really struggle to keep up with business needs in ways that you don't with public cloud. Yes, it's cheaper in some aspects & more customizable, but scaling limits have to be actively negotiated constantly - businesses gladly pay extra to not have that kind of internal negotiation.


> But the overhead would have to come down dramatically for it to ever beat simply doing the compute in-house and incurring the overhead of managing a physical rack.

Out of sheer curiosity, by which metric deploying the pre-packaged code (which takes seconds) into the cloud incurs a higher overhead compared to maintaining an in-house physical rack?


When the code has to use ZKP's, there is an intrinsic overhead measured in orders of magnitude of CPU time. By avoiding the use of third parties, that overhead is eliminated.

So the question is, does this hypothetical entity with very precious data incur the extra orders of magnitude of compute time in order to not maintain a rack in house, or do they just... maintain a rack. Keeping in mind, of course, that they already must maintain a rack in order for ZKP's to make any sense at all. So... do they add a server to their existing rack to handle the workload, or refactor their entire computation to use ZKP's, incur several extra orders of magnitude of compute time, and outsource the work to a cloud service?


You’re making the assumption that the entity with the data, and the entity that wants to process that data are the same entity.

An obvious use case for ZKP's, which is under active research, is allowing medical data from institutions like the NHS to be processed by research organisations and pharmaceutical companies - letting them extract useful medical insights from large populations, without requiring the handover of personal medical data.

In that situation the owner of the data has a clear incentive to use ZKP’s (they’re required to protect medical data), and so does the data processor (they can extract new inferences from previously inaccessible, but extremely large, datasets). There’s obviously a clear profit motive as well, and given the cost of medical research, the use ZKP’s doesn’t seem unreasonable.


What is an example of a real use case for such where the data owner would rather do orders of magnitude more compute to generate the results in the fancy new ZKP format, compared to hosting a simple API that processes the exact same queries and returns the exact same amount of information as the adversary would get in the ZKP strategy, but using classical means?


You're again making the assumption that the data owner knows what queries people want to execute, and is also happy to pay for all the compute involved in running those queries. There is no "simple API" that's gonna let you remotely run machine learning training on someone else's hardware, or let you perform any other kind of large dataset analysis.

Converting data into a ZKP format is a one time cost, that can be shared by many different data processors.


Clearly, the owner is in a great position to charge for access. This is bread & butter software-as-a-service.

Can you give an example/paper/blog of how you envision this one time data conversion and infinite privacy preserving queryability to work?


Again you’re making the assumption the data owner wants to get into that game. I seriously doubt the NHS in the UK have any interest in becoming a SaaS provider.


> by which metric deploying the pre-packaged code (which takes seconds) into the cloud incurs a higher overhead compared to maintaining an in-house physical rack?

In terms of time? Probably not much. In terms of cost, well, it's about scale.

I worked on a project for a big financial firm about 8 years ago, to implement a burst-capacity overflow to cloud strategy for their compute needs. About one week a month or even one week a quarter, they would need a few tens of thousands of extra cores.

At this scale the cloud providers were not really able to react or provision fast enough, and required us to reserve the capacity full time. In the end the costs for “buy and manage more racks” internally worked out cheaper.


> […] to implement a burst-capacity overflow to cloud strategy for their compute needs.

8 years ago the cloud did not effectively exist yet. Today the available capacity surpasses the needs of the prevailing majority of cloud customers, with a caveat – see below.

> At this scale the cloud providers were not really able to react or provision fast enough, and required us to reserve the capacity full time.

A few things have happened since then:

  1. Hardware capacity has been ramped up at a very large scale.
  2. Most importantly, advances in the workload distribution and optimisation: the cloud control plane distributes workloads to the available capacity across cloud data centres.
The caveat.

For most demanding customers, cloud providers have a solution called «the cloud appliance». A truck rolls in with the hardware that gets deployed as an extension of your own private cloud within a few days. Deployed software does not notice a difference and gets automatically distributed to the new processing capacity. If that is not enough, another truck rolls in with another «cloud appliance». It is presumed that if one operates at such a scale, they also have the money to pay for it.


> 8 years ago the cloud did not effectively exist yet

Yeah it really did. I’ve been around a while. “Cloud” was becoming a buzzword around the tech industry 16 years ago. 8 years ago we had AWS, IBM, and Microsoft all vying for our business on this project.

> For most demanding customers, cloud providers have a solution called «the cloud appliance». A truck rolls in with the hardware that gets deployed as an extension of your own private cloud within a few days

None of which was as cost effective as adding more capacity to your own data centres, if you’re already running them at scale, because fundamentally someone is profiting from renting out those cloud appliances. If you have the in-house capabilities already, you can cut out the middlemen.


> 8 years ago the cloud did not effectively exist yet.

What? AWS EC2 is 17 years old.


You need to make your code compatible with cloud SDK, and this work is not free. Also, clouds have limitations that you need to work around, this is also not free.


Usually when you need to do it many times at scale, self managed rack can be cheaper than cloud. The downside is requiring expertise to manage and maintain it.


The cost of deployment into the cloud is $0 at a scale unfathomable to any rack, with near-zero overhead. Initial capital investment for the cloud is also $0.

The self-managed rack:

  1. Has to exist somewhere.
  2. Therefore it required the initial capital investment at some point, which is a mid five figure amount or more.
  3. Scales by its weight, height, depth and hardware specifications.
  4. Does not scale beyond (3) – hardware constraints of blades deployed in the rack are very rigid.
  5. Has to have available processing capacity to support the new workload.
  6. Has a habit of running out of capacity at the most inconvenient moment, requiring, well, a new rack.
A new/extra rack:

  1. Has to be paid for from a budget. The budget may or may not exist when the new rack is required, therefore potentially incurring further, potentially lengthy, delays (i.e. «no money in the budget until the next financial year». Boom.).
  2. Has to be taken through the procurement. Depending on the organisation and its size, procurement can take anywhere in between 3 and 12 months.
  3. Has to be in stock.
  4. Has to be installed, deployed and configured.
  5. Requires technicians and engineers to be available within a designated time slot to complete (4). The technicians and the engineers (all of them or a single/few) may be unavailable due to X, Y or Z.
Bonus points:

  1. The DC may not have enough room / power / cooling / etc. You and your rack are now stuck for a non-deterministic amount of time.

  2. Adding a new rack to your rack HA setup requires a new network interconnect due to the network capacity reaching a saturation point. It is called «an expensive network switch». Add further delays, repeat all steps required to procure a new rack, add new/unforeseen delays.
With the above in mind, I fail to see how the overhead of a poorly scalable, self-managed rack is lower compared to a $0, software driven code deployment into the cloud at a scale that is limited by the size of one's wallet.


> The cost of deployment into the cloud is $0 at a scale that is unfathomable to any rack with the overhead being a near zero as well.

Obviously you haven’t seen my org’s cdk repos


Ah yes, the cloud, where network interconnect issues simply do not exist, and extra capacity is always available, and the budget never runs dry, and configuration is not required, and technicians and engineers are always available.

Can I have a puff of whatever it is you smoked to reach this cloud?


Your comment is inflammatory but you're not wrong. Selling the cloud as $0 cost to set up is at best, a fairy tale and at worst, an ad.

If "the cloud" was so simple, there wouldn't be 6-7 figure job positions dedicated to just setting up and managing their complexity and they definitely wouldn't come with certification requirements. Somehow the requirement of having a small team of employees "who can manage our own infra" is a deal-breaker but having a team of AWS Solution Architects is not.


> As for zero knowledge proofs, there is little practical use, significance, or relevance to them due to the overhead involved and the lack of a "killer app", so to speak. But they're conceptually interesting.

This is just plain wrong. There are numerous applications in the blockchain space where the overhead is tolerable because they provide properties (privacy, succinct verification) that other techniques cannot provide.

Their relevance to blockchains has led to massive reductions in the overhead, making it quite plausible that ZKPs will find use in settings where integrity is a must.

For example, you could prove that a particular binary is the result of compiling source code that satisfies particular properties, such as memory safety. This would allow you to skip unnecessary safety checks.


For a tool to be of "practical use, significance, or relevance" requires that the work it produces also is of "practical use, significance, or relevance". Blockchains are not.

And the best way to verify a compiled binary is correctly compiled is to do the compilation. That's a trivial amount of work. Constructing a ZKP of the same is certainly more cost intensive.


> Constructing a ZKP of the same is certainly more cost intensive.

That is certainly true for the compilation phase (i.e. constructing the ZKP). But for the verifier, that isn't so obviously true:

- Firstly, the verifier might not even have access to the formal proof that the program meets the memory correctness guarantees. It might have access to a binary, plus a proof that the binary was compiled from a completely memory safe language. The verifier would need zero knowledge of the code (which might be proprietary).

- Secondly, proofs aren't necessarily that big (especially for Groth16 circuits), can be applied recursively (you prove that you had a proof that something was true), and aren't that expensive to verify. If verifying a program once when you download it means it can be flagged as safe to run without expensive bounds and MMU checks, it could open up new types of more performant CPUs, and potentially save far more than it costs to verify.


> And the best way to verify a compiled binary is correctly compiled is to do the compilation. That's a trivial amount of work. Constructing a ZKP of the same is certainly more cost intensive.

This makes the assumption that you both have the source code and that the compiler is deterministic - and the latter is not the case for (most?) modern compilers.


It doesn't assume the compiler is deterministic, because you don't need to use the result for verification, you can just use the result.


Ah yes, the totally-valid use case of ensuring that a third party has properly compiled source code that you don't have access to. I'm sure this comes up in industry all the time.


I mean, do you ever run any closed-source software?


Can you provide an example of a ZKP of memory safety of a compiled output that would be less computationally expensive than static analysis of the binary?


You can’t just recompile a binary in many scenarios. Eg: when the source code is proprietary, or when there are time constraints.


> What is the difference between "verifying" the correctness of code, as they say here, vs "proving" the correctness of code, as I sometimes see said elsewhere?

In this context it is the same.

> Also, is there a good learning resource on "proving" things about code for working programmers without a strong CS / math background?

I wish. The Dafny docs are pretty good but IMO formal software verification is not really at the point where it is usable for normal programmers like us who don't have a PhD in CS / maths. The examples make it look relatively easy, but you will quickly run into "nope, couldn't prove it" and the answer as to why is some hardcore implementation detail that only the authors would know.


> IMO formal software verification is not really at the point where it is usable for normal programmers like us who don't have a PhD in CS / maths.

I know. Four decades ago I headed a project to build a system amazingly similar to this one, intended for real-time automobile engine control code.[1][2] This new system for Rust looks practical. It seems to be intended for people who need bug-free code. Most verification systems come from people in love with formalism. Those involve too much human labor.

Hints:

- The assertions and invariants need to be part of the language. Not something in comments, and not different syntax. They should be syntax and type checked during compiles, even if the compiler doesn't do much with them.

- It's useful to work off the compiler's intermediate representation rather than the raw source code. Then you're sure the compiler and verifier interpret the syntax the same way.

- SAT solvers aren't powerful enough to do the whole job, and systems like Coq are too manual. You need two provers. A SAT solver is enough to knock off 90% of the proofs automatically. Then, the programmer focuses the problem by adding assertions, until you get the point where you have

    assert(a);
    assert(b);
 
and just need to prove that a implies b as an abstraction mostly separate from the code. Then you go to the more elaborate prover. We used the Boyer-Moore prover for that. After proving a implies b, that became a theorem/rule the fast prover could use when it matched. So if the same situation came up again in code, the rule would be re-used automatically.

I notice that the examples for this verified Rust system don't seem to include a termination check for loops. You prove that loops terminate by demonstrating that some nonnegative integer expression decreases on each iteration and never goes negative. If you can't prove that easily, the code has no place in mission-critical code.
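
As a minimal sketch of that loop-variant idea in plain Rust (the asserts only check at runtime what a termination checker would prove statically):

    // The variant `n` is a nonnegative integer that strictly decreases
    // on every iteration, so the loop must terminate.
    fn sum_to(mut n: u32) -> u64 {
        let mut total: u64 = 0;
        while n > 0 {
            let variant_before = n; // snapshot the variant
            total += n as u64;
            n -= 1;
            debug_assert!(n < variant_before); // strictly decreased
        }
        total
    }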

Microsoft's F* is probably the biggest success in this area.[3]

[1] https://archive.org/details/manualzilla-id-5928072/page/n3/m...

[2] https://github.com/John-Nagle/pasv

[3] https://www.microsoft.com/en-us/research/video/programming-w...


Formal software verification encompasses a wide set of what we refer to as advanced mathematics. The process of taking some thing you want the code to do, correctly abstracting it out into component lemmas, and generating proofs for those lemmas, is itself advanced mathematics. I don't really see a way that this could be simplified.


Formal hardware verification has been much more successful in industry. Are there fundamental differences in the types of problems that they solve? EDA companies have been able to create tools that don't require a PhD in Math to be used effectively.

I think that the explanation for such different levels of success is that economic incentives are different. The cost of a hardware bug is much much higher than the cost of the average software bug; this means that hardware companies are willing to spend a lot of money in getting designs right the first time, versus software companies that know they can always make bug fixes in new versions of their products. Additionally, hardware companies are used to paying millions of dollars in software licenses, which is not common in the software world.


Yeah I agree formal hardware verification is an order of magnitude easier to use.

My guess is it's because the problem is so much simpler. No variables, no loops, no recursion, etc.


> My guess is it's because the problem is so much simpler. No variables, no loops, no recursion, etc.

My guess: it's because people actually care about correctness for hardware. It's really expensive to throw out all those fab runs when you notice you messed up...


Yes, the challenge in any formal software verification is dealing with unbounded inputs and compute durations, as enabled by recursive programs. If the code under analysis is straight-line non-reentrant basic logical blocks, the verification is quite trivial indeed. This is vaguely the case for hardware verification, though of course there are complexities introduced by the physical nature of the embedding.


That would only cover the simplest feed-forward designs: combinational circuits and versions with extra registers added for timing reasons.

As soon as you introduce any kind of feedback in your RTL design it's possible to do any kind of computation that SW can do and therefore proving correctness is difficult.


Is it really that advanced? Do you actually need more logic tools than what was used in the 19th century to formalize real analysis? Things get complicated if you want to put the proofs into the types, but I don't think this is the main approach the industry uses for certifying software for high-integrity applications (in cases there is a requirement for formal verification, which seems to be less and less the case).

I don't think it's the mathematics, it's the tools. If the proof is separate and needs to be kept aligned with the specification and implementation manually, that's not really useful except for niche applications. Integrated approaches (like Verus here, or SPARK in the Ada context) should solve the implementation/proof gap, but such tools are not widely available for languages people actually use.


This is why I think interactive proof assistants (as opposed to "automatic" ones like Dafny), are a better starting point for learning. You're still gonna need to learn some higher level concepts, but you won't have the frustrating experience of the automation just failing and leaving you shrugging.

Software Foundations is great: https://softwarefoundations.cis.upenn.edu

If you stick with it long enough, you'll even build up to Hoare logic which is the underpinning the tools like dafny use to generate the equations they throw to the solver.


I've found the DX of Dafny to be very approachable. The VSCode extension is pretty great, and you get feedback as you type. Also, its ability to give you counter examples in the IDE is really nice.


Lots of great work on Dafny in general!


AFAIK, zero-knowledge proofs allow you to prove that you know something without revealing what it is you know. For example, verifying you know a password, but without having to send the password to the server, so a malicious server or MitM wouldn't be able to sniff it.

It also might provide better options for identity verification, i.e. proving you have a certain government-issued ID but without actually leaking the document to the server (for it to be stored "for a maximum of 2 years / 3 years / 6 months / etc." but then leaked in a data breach anyway).
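
The classic small example is Schnorr identification: prove you know x such that y = g^x mod p, without ever sending x. A toy sketch in Rust, with a deliberately tiny group and fixed nonce/challenge so it's reproducible (real systems use large groups and random values):

    // Toy Schnorr proof of knowledge of a discrete log. The numbers
    // below (p = 23, g = 5 of order q = 22) are illustrative only.
    fn mod_pow(mut base: u64, mut exp: u64, m: u64) -> u64 {
        let mut acc = 1 % m;
        base %= m;
        while exp > 0 {
            if exp & 1 == 1 {
                acc = acc * base % m;
            }
            base = base * base % m;
            exp >>= 1;
        }
        acc
    }

    fn main() {
        let (p, g, q) = (23u64, 5u64, 22u64);
        let x = 7u64;             // prover's secret ("the password")
        let y = mod_pow(g, x, p); // public key: y = g^x mod p

        let k = 10u64;            // prover's nonce (random in practice)
        let t = mod_pow(g, k, p); // commitment sent to the verifier

        let c = 13u64;            // verifier's challenge (random in practice)
        let s = (k + c * x) % q;  // prover's response; x itself never leaves

        // Verifier accepts iff g^s == t * y^c (mod p).
        assert_eq!(mod_pow(g, s, p), t * mod_pow(y, c, p) % p);
        println!("accepted: prover knows x without revealing it");
    }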


Secure password verification does not require anything close to what we mean when we refer to modern "zero knowledge proofs". A better example would be verifying you know the answer to a SAT problem without sharing the answer. Which is of... limited.. practical application.


ZK proofs could be used to implement secure cloud computing where a total compromise of a vendor’s machines / network wouldn’t in any way compromise the customers running workloads on that network.


Perhaps. But if someone cares that much about data access control, they can very easily manage a rack in house (they probably already do), and do the compute on that. Paying someone else to do orders of magnitude more work than strictly needed when you have private enough data to justify just buying a server (again, you already have one) is a very hard sell. It could maybe, possibly be justified if the workload is highly burstable, but... even then, it's hard to come up with numbers that make sense.


Is homomorphic encryption a type of ZKP or is it a different thing entirely?


Good question. I found [1], but I'm not a cryptographer, so I may have made a mistake because they're so closely related. It sounds like homomorphic encryption is the stronger primitive. ZKPs are used in securing chats in Signal, for example: https://eprint.iacr.org/2019/1416.pdf

[1] https://crypto.stackexchange.com/questions/57747/what-is-the...


I never said it required ZKP.


You implied it would be a helpful tool for solving that particular problem, which is to dramatically misrepresent their utility to the parent asking for concrete examples of their relevance.


It's just the analogy I use to understand ZKPs myself, since I haven't done a ton of research into them yet. It's also why I think ZKPs are cool, so it was my response to a comment asking why people think ZKPs are cool.


> "proving" things about code for working programmers

I'd argue that this is antinomic. Proving things about code isn't something working programmers do yet. I'd say that Hoare logic is a good starting point as it is sometimes taught in introductory CS classes.

Coq has a steep learning curve, especially if you're not familiar with OCaml or similar languages. Maybe Why3 is more beginner friendly https://www.why3.org

Proving vs verifying: could mean the same thing. Proving seems to me something more interactive in nature, while verifying could be automated (model checking, SMT-solving of annotated programs).


Working programmers write proofs in a limited sense. Any time you write types you're writing a proof. Maybe it's a stretch to say "const a: int = b;" is a proof, but when you get into higher-order types in TypeScript it's appropriate.
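
A Rust analogue of the same idea (the NonZero type here is hypothetical, just for illustration): constructing a value of the type is the proof, and functions can demand that proof in their signature.

    // A `NonZero` value is itself evidence that the wrapped integer is
    // nonzero: the only way to build one checks the invariant.
    struct NonZero(u64);

    impl NonZero {
        fn new(n: u64) -> Option<NonZero> {
            if n != 0 { Some(NonZero(n)) } else { None }
        }
    }

    // Division can demand the proof up front instead of re-checking.
    fn divide(a: u64, b: NonZero) -> u64 {
        a / b.0
    }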


A trivial proof is still a proof.

There's nothing trivial about "const a: int = foo();" though. Compilers disprove the claim by contradiction all the time.


This blog has a lot of introductory articles on many types of verification and correctness:

https://www.hillelwayne.com/


> Also, is there a good learning resource on "proving" things about code for working programmers without a strong CS / math background?

I don’t know how difficult Software Foundations is for someone with a limited background but it’s probably worth trying:

https://softwarefoundations.cis.upenn.edu/


Verification is proving specific things about specific properties of the program.


"Verifying" and "proving" are synonymous in this case. You prove or verify that the system satisfies some specification, i.e., properties. Your normally write the properties yourself, but sometimes the properties are hardcoded (e.g., your code doesn't panic).

You don't need a particularly strong formal background in CS or maths to verify code, but it's a challenging task nevertheless.

1. The first challenge, where people often give up, is that you will often need to reason about your code in a different language than the code itself (not necessarily true for Verus, which is being discussed here). In that sense it helps a lot if you already know multiple programming languages, ideally using different paradigms.

2. The second challenge is that you will need to be actually precise in describing what you want your program to do. This is a lot harder than it sounds; there's no hand-waving, no "works in practice", but even beyond that, it's actually really difficult to express what you want from the program succinctly, and often not even possible or feasible. On the other hand, this is possibly the most practical skill you will currently be able to take away from the verification tools; it absolutely can change the way you work, how you collect requirements, and how you go about writing code.

3. The third challenge is then actually performing the verification/proofs. This is currently painfully hard; most literature I've seen comes up with a ratio between 5-20x between the amount of code you're verifying (in terms of lines of code, say) and the number of "proof lines" you need to write in some form. This may make sense for extremely critical code where a bug can cost tons of money; e.g., aerospace, or huge companies with tons of users - AWS is doing proofs at a fairly large scale, Apple is now doing the same at a smaller scale too.

I would generally recommend NOT writing proofs if you can, but using tools that can do some kind of limited or heuristic verification. Which tools exactly I would recommend for a "working programmer" depends on what you're trying to do. For example, if you are writing concurrent or distributed code, I would recommend using something like TLA and the associated TLC tool. For lower-level Rust code (data structures and similar), I'd recommend the Kani verifier over something like Verus. For many other languages, the choices are limited or non-existent.

Zero-knowledge proofs roughly allow you to convince someone that something is true, without leaving them any wiser WHY it's true. Generally this is not what's interesting for software verification, but it's nevertheless extremely cool (because it's super counterintuitive). Most of the excitement indeed comes from blockchains, where these things are used, but it's debatable whether they're really practical. You can use them to transfer crypto around already today, without revealing the identities of the sender or recipient for example. However they are still computationally quite expensive and only work because there's currently not so much volume in crypto.


> What is the difference between "verifying" the correctness of code, as they say here, vs "proving" the correctness of code, as I sometimes see said elsewhere?

There is not much difference, except that verification usually includes identifying a formal logical proposition about the behaviour of the code.

In other words, formally verified code is code that has been proven to meet at least one formal proposition about its behaviour - for example, a proposition about a function f might be: if variable x is greater than 0, then `f(x) = 1`.

There is no such thing as proving something 'correct', you need someone to define what exactly correct means, and then someone proves it meets that definition. So the proving is only a subset of the overall formal verification task.
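
To make that concrete, here is that shape of proposition and its proof in Lean 4, with a deliberately trivial toy function f assumed just for illustration:

    -- Toy function, and the proposition "x > 0 implies f x = 1".
    def f (_x : Nat) : Nat := 1

    -- The proof is definitional, since f always returns 1.
    theorem f_spec (x : Nat) (_h : x > 0) : f x = 1 := rfl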

> Also, is there a good learning resource on "proving" things about code for working programmers without a strong CS / math background?

Most will be specific to a particular technology and type of verification. There are some courses online that provide a high level overview, e.g. https://anton-trunov.github.io/csclub-coq-course-spring-2021....

If you want to get into specifics, you might need to pick an approach. You could learn a dependently typed language, for example there are some good resources out there on proving things in Agda or Idris. Or perhaps play around with one of the formal verification systems that can be bolted on to C or Rust.

> Edit: I'm also very curious why "zero knowledge" proofs are so significant, and why this is so relevant. Eg I heard people talking about this and don't really understand why it's so cool: x.com/ZorpZK

ZK is an exciting area of cryptology because breakthroughs in that area power new applications that people wouldn't have thought possible before. Applications to cryptocurrencies in particular can solve some of their scaling and privacy problems.

For example, one of the biggest problems with cryptocurrencies is that every single transaction ever needs to be recorded in a ledger that is distributed to every node participating in the network. That simply won't scale to microtransactions. Let's say that 1000 people each start with 1 coin, and do 100,000 small transactions averaging 0.001 coins amongst themselves (maybe they bought a coffee, or paid for information on a per-view basis, or whatever). Storing those 100,000 transactions forever will have an ongoing cost for every one of thousands of nodes on the network long after the transaction has happened.

Now that could be solved with centralisation - the parties send their transactions to a trusted company, who maintains balances for each of them without sending transactions to the network, but lets them withdraw their balance to the network if they ever want to. But centralisation is a risk - what if the company betrays their trust?

Zero-knowledge cryptography allows the parties to roll up the signed transactions into a cryptographic proof saying: given these were the balances at the start of the 100,000 transactions, the person creating the roll-up has access to the signed transactions proving that the balances of each of the 1,000 parties at the end are this. Notably, the proof can be much smaller than the 100,000 transactions. So that enables applications where people work off in 'side chains', but can merge the side chain back into the main chain by submitting proof of the effects (but not all the detail) of the side chain to the main chain.
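
Back-of-envelope, with assumed sizes: at roughly 250 bytes per transaction, those 100,000 transactions cost every node about 25 MB of storage forever, whereas a succinct roll-up proof (a few hundred bytes in schemes like Groth16) plus the 1,000 updated balances comes to a few tens of kilobytes - several hundred times smaller.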


Their reduction in productivity is greater than their reduction in consumption, so you have a large population that only consumes.


Can that be assumed? A lot of people retiring may not be very productive in those last years, and if they're replaced by automation it could very well be that total productivity goes up when they retire. I wouldn't assume that either, my point is that I don't think either of these scenarios is a given.


Chinese culture is very different from Western culture.


Catholic fertility is very low compared to the other groups listed. Whatever secret sauce for resisting modernity is built into evangelicalism, mormonism, etc, appears to be absent from high church denominations. Except for the hardcore traditionalists. We can expect to see more of them, as the others are bred out of existence.


> Except for the hardcore traditionalists. We can expect to see more of them, as the others are bred out of existence.

I would say conservative Catholics rather than just traditionalists.

Both SSPX and Opus Dei have larger families, and both have a strong conservative streak. But SSPX is traditionalist, whereas Opus Dei is more "modern conservative". Opus Dei is officially enthusiastic about Vatican II (provided it is "interpreted correctly"), SSPX refuses to accept it. While both celebrate Mass in Latin, SSPX insists on using the old (pre-1970) Tridentine Mass in Latin, whereas Opus Dei's Latin Masses are the contemporary Mass in Latin. Opus Dei emphasises obedience to the Pope (although Francis makes that painful in a way that his predecessors never did), SSPX insists on a duty to disobey papal decisions it believes to be erroneous.


On my personal website (darigo.su) I use a <center> tag to center an <hr> like so:

    <center>
        <hr style="width: 500px; margin-bottom: 40px; margin-top: 10px;">
    </center>
Also, if anyone reads this, can you go to my site and tell me if the little pokemon minigame in the bottom right corner works for you? It usually works for me but sometimes if I open the site in Chrome on guest mode, it just loads a blank screen, but then if I resize the window it starts working. Sigh.


You might as well just use auto margin-left and margin-right here:

  <hr style="max-width:500px;margin:10px auto 40px">
(I’d also go max-width instead of width for the sake of narrower viewports.)


I might as well, but I find the <center> tag more charming. It's how my sister taught me to do it when I was learning NeoHTML for my Neopets pet page. After all, it's just for fun. In a different job market, I'd be worried about potential employers scrutinizing my code haha


I just made a personal site (darigo.su), it's buggy as hell but fun to just make whatever I want. Do any of you have a quirky personal site?


I rather enjoy using a pVPN (personal VPN) by just setting up wireguard on my server.
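
For anyone curious, the client side of that setup is just a few lines of WireGuard config (keys and endpoint below are placeholders):

    [Interface]
    PrivateKey = <client-private-key>
    Address = 10.0.0.2/32

    [Peer]
    PublicKey = <server-public-key>
    Endpoint = vpn.example.com:51820
    # Route all client traffic through the server
    AllowedIPs = 0.0.0.0/0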


>Even Chinese money is used to uphold American cultural hegemony.

It makes me think of German barbarians who, after sacking the urbs aeterna, couldn't think of a better name for themselves than Holy Roman Empire.

