
> Instead, cryptography needs problems that are hard in the average case, like the RSA problem, the discrete logarithm for elliptic curves, and the shortest vector problem for lattices. We don’t technically know whether these are NP-complete

But we do know that the shortest vector problem is NP-hard (in Linf norm), and so is its decision version [1]. We don't have a reduction in the other direction, but we know that SVP is as-hard-as-NP-complete-or-harder.
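(For reference, the standard definitions, nothing specific to the article: given a basis B,

  \[
    \mathcal{L}(B) = \{\, Bx : x \in \mathbb{Z}^n \,\}, \qquad
    \lambda_1^{(p)}(\mathcal{L}) = \min_{v \in \mathcal{L} \setminus \{0\}} \|v\|_p
  \]

SVP asks for a nonzero lattice vector of length λ1. The NP-hardness mentioned above is for p = ∞; for the Euclidean norm it holds under randomized reductions.)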

This goes against the general argument of the article, but only weakly so, because I think lattice-based systems are less broadly deployed than elliptic curves or factoring (not a crypto person, though).

[1] H. Bennett, "The Complexity of the Shortest Vector Problem.", 2022 https://simons.berkeley.edu/sites/default/files/docs/21271/s...


Yeah, SVP is NP-complete for certain parameters. But lattice cryptography uses other parameters where SVP might not be NP-complete. Also, lattice crypto is usually based on some other lattice problem like LWE, MLWE, MLWR, SIVP, etc.

Lattice crypto is going to be broadly deployed because it hopefully can resist quantum attack, unlike ECDLP and factoring.


Floating-point is hard, and standards seem like they cater to lawyers rather than devs. But a few things are slightly misleading in the post.

1. It correctly quotes the IEEE754-2008 standard:

> A conforming function shall return results correctly rounded for the applicable rounding direction for all operands in its domain

and even points out that the citation is from "Section 9. *Recommended* operations" (emphasis mine). But then it goes on to describe this as a "*requirement*" of the standard (it is not). This is not just a typo: the post actually implies that implementations not following this recommendation are wrong:

> [...] none of the major mathematical libraries that are used throughout computing are actually rounding correctly as demanded in any version of IEEE 754 after the original 1985 release.

or:

> [...] ranging from benign disregard for the standard to placing the burden of correctness on the user who should know that the functions are wrong: “It is following the specification people believe it’s following.”

As far as I know, IEEE754 mandates correct rounding for elementary operations and sqrt(), and only for those.

2. All the mentions of 1 ULP in the beginning are a red herring. As the article itself mentions later, the standard never cares about 1 ULP. Some people do care about 1 ULP, just because it is something that can be achieved at a reasonable cost for transcendentals, so why not do it. But not the standard.
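(In case ULPs are unfamiliar: the error in ULPs is just the number of representable floats between the computed result and a reference value. A rough sketch of measuring it, using the usual ordered-bit-pattern trick; note the double-precision "reference" below is only a stand-in, not a correctly rounded oracle:)

  /* Sketch: measuring the error of a float function in ULPs, against a
     double-precision stand-in reference (NOT a correctly rounded oracle). */
  #include <math.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Map a finite float to an integer such that adjacent representable floats
     map to adjacent integers; the ULP distance is then a subtraction. */
  static int64_t ordered(float x)
  {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return (u & 0x80000000u) ? -(int64_t)(u & 0x7fffffffu) : (int64_t)u;
  }

  static int64_t ulp_distance(float a, float b)
  {
    int64_t d = ordered(a) - ordered(b);
    return d < 0 ? -d : d;
  }

  int main(void)
  {
    for (float x = 0.5f; x < 4.0f; x += 0.37f) {
      float got = expf(x);                 /* the libm under test          */
      float ref = (float)exp((double)x);   /* stand-in reference, see text */
      printf("x=%a  ulp error vs reference: %lld\n",
             x, (long long)ulp_distance(got, ref));
    }
    return 0;
  }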

3. The author seems to believe that 0.5 ULP would be better than 1 ULP for numerical accuracy reasons:

> I was resounding told that the absolute error in the numbers are too small to be a problem. Frankly, I did not believe this.

I would personally also tell that to the author. But there is a much more important reason why correct rounding would be a tremendous advantage: reproducibility. There is always only one correct rounding. As a consequence, with correct rounding, different implementations return bit-for-bit identical results. The author even mentions falling victim to FP non-reproducibility in another part of the article.

4. This last point is excusable because the article is from 2020, but "solving" the fp32 incorrect-rounding problem by using fp64 is naive (not guaranteed to always work, although it will with high probability) and inefficient. It also does not say what to do for fp64. We can do correct rounding much faster now [1, 2]. So much faster that it is getting really close to non-correctly-rounded, so some libm may one day decide to switch to that.

[1] https://github.com/rutgers-apl/rlibm

[2] https://github.com/taschini/crlibm


3. >> I was resounding told that the absolute error in the numbers are too small to be a problem. Frankly, I did not believe this.

> I would personally also tell that to the author. But there is a much more important reason why correct rounding would be a tremendous advantage: reproducibility.

This is also what the author wants, judging from his own experience, but failed to realize/state explicitly: "People on different machines were seeing different patterns being generated which meant that it broke an aspect of our multiplayer game."

So yes, the reasons mentioned as a rationale for more accurate functions are in fact a rationale for reproducibility across hardware and platforms. For example, going from 1 ulp errors to 0.6 ulp errors would not help the author at all, but having reproducible behavior would (even with an increased worst-case error).

Correctly rounded functions mean the rounding error is the smallest possible, and as a consequence every implementation will always return exactly the same results: this is the main reason why people (and the author) advocate for correctly rounded implementations.


> "solving" the fp32 incorrect-rounding problem by using fp64 is naive (not guaranteed to always work, although it will with high probability)

The key thing is there are only 2^32 float32s, so you can check all of them. It sounds to me like the author did this, and realized they needed some tweaks for correct answers with log.
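Roughly, such an exhaustive sweep looks like the sketch below; it is well within reach on a single core. (Caveat: the double-precision value used as the "reference" here is exactly the shortcut being discussed, not a correctly rounded oracle.)

  /* Sketch: exhaustively comparing a float function against a reference over
     all 2^32 bit patterns. */
  #include <math.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
    uint64_t tested = 0, mismatches = 0;
    uint32_t u = 0;
    do {
      float x;
      memcpy(&x, &u, sizeof x);
      if (!isnan(x)) {
        float got = logf(x);                 /* implementation under test   */
        float ref = (float)log((double)x);   /* reference, see caveat above */
        /* Compare bit patterns so that +0.0 vs -0.0 also counts. */
        uint32_t bg, br;
        memcpy(&bg, &got, sizeof bg);
        memcpy(&br, &ref, sizeof br);
        if (bg != br && !(isnan(got) && isnan(ref)))
          mismatches++;
        tested++;
      }
    } while (++u != 0);  /* wraps to 0 after 0xFFFFFFFF: every pattern visited once */
    printf("%llu inputs tested, %llu mismatches\n",
           (unsigned long long)tested, (unsigned long long)mismatches);
    return 0;
  }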


I'm ready to believe that pipewire is imperfect (although I have personally experienced no crash, and did not have to configure anything), but the sentence from the original post is:

> Therefore, I naturally omit the use and configuration of additional audio layers in the form of a Jack server, PulseAudio or the unfortunate PipeWire from RedHat.

I cannot understand this sentiment. I have used all three for years each. If I had to qualify one as "unfortunate", it would certainly not be PipeWire (nor would it be Jack for that matter).

> performance is not that great compared to pulseaudio and jack ..

Jack makes different trade-offs by default, so for some definition of performance, yes, one could see it as superior to PipeWire. But in my experience PipeWire is better than PulseAudio on all axes (at least: latency, CPU usage, resampling quality).


Right? PipeWire is the best.

Yeah, but it could still be better. I use it every day, and I wish it was better.

> Complexity

> We present a polynomial-time algorithm achieving an approximation ratio below √2 for MVC, providing strong evidence that P = NP by efficiently solving a computationally hard problem with near-optimal solutions.

> This result contradicts the Unique Games Conjecture, suggesting that many optimization problems may admit better solutions, revolutionizing theoretical computer science.

Some context: The Unique Games Conjecture (UGC) is a very central unsolved problem in theoretical computer science (TCS); it has been open since 2002. It conjectures that, for a series of standard optimization problems including minimum vertex cover, the best known approximation factor (how close you can guarantee to get to optimality with a poly-time algorithm) is actually the best possible approximation factor. To disprove the conjecture, one needs a better approximation algorithm for one of those problems. Such an algorithm could be used to solve (to optimality, and in poly time) another set of problems, which are widely believed to be NP-hard. This would not prove P=NP (to be clear: the above quote is not claiming that), but it is true it would revolutionize TCS.
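For concreteness, the classic poly-time 2-approximation for minimum vertex cover (the factor that UGC says is essentially the best possible) just takes a maximal matching and outputs both endpoints of every matched edge. A minimal sketch on an arbitrary toy graph:

  /* Classic 2-approximation for minimum vertex cover: greedily take an
     uncovered edge and add BOTH endpoints. The picked edges form a matching,
     and any cover must contain at least one endpoint of each matched edge,
     so the result is at most twice the optimum. */
  #include <stdio.h>

  #define N 6  /* number of vertices in the toy graph below */

  int main(void)
  {
    int edges[][2] = { {0,1}, {0,2}, {1,3}, {2,3}, {3,4}, {4,5} };
    int nedges = sizeof edges / sizeof edges[0];

    int in_cover[N] = {0};
    for (int e = 0; e < nedges; e++) {
      int u = edges[e][0], v = edges[e][1];
      if (!in_cover[u] && !in_cover[v]) {  /* edge not yet covered */
        in_cover[u] = 1;
        in_cover[v] = 1;
      }
    }

    printf("2-approximate vertex cover:");
    for (int v = 0; v < N; v++)
      if (in_cover[v])
        printf(" %d", v);
    printf("\n");
    return 0;
  }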

The thing is: this is TCS and theoretical complexity. You cannot disprove UGC with just code. You need a mathematical proof. This is not an indictment of the code which may be very good. But such an enormous claim would require pretty intense scrutiny before it would be accepted as true.


No vulnerability name, no website, concise description, neutral tone, precise list of affected distros (RHEL + derivatives and some EOL Fedoras) and even mention of unaffected distros (current Fedoras), plain admission that no attempt was made to exploit. What a breath of fresh air!

(I am only joking of course. As a recovering academic, I understand that researchers need recognition, and I have no right to throw stones -- glass houses and all. Also, this one is really like regreSSHion's little sibling. Still, easily finding the information I needed made me happy.)


I don't think recognition for researchers is the big win for named vulnerabilities. In the places that matter, they can just describe their findings in a short sentence and get all the recognition that matters. The names are mostly for the benefit of users.


Security researchers definitely do the naming gimmick for personal brand purposes. This may not be as obvious when it’s successful, but academic papers routinely name vulnerabilities when there is no real benefit to users.


The whole point of naming vulnerabilities is to establish a vernacular about them, so it's not surprising that academic papers name them. The literature about hardware microarchitectural attacks, for instance, would be fucking inscrutable (even more than it is now) without the names.


I'd be happy to file all of them under Spectre/MDS, except for the ones that aren't Spectre/MDS, of course. They don't all need unique names. Most of them are instances of the same pattern: some value is not present in a register when it's needed, and an Intel CPU design continues to execute speculatively with the previous contents of that register instead of inserting a pipeline bubble, leaking the previous contents of that register. Using an inter-core communication buffer instead of a load data buffer (like the last person did) doesn't, I think, deserve a new name and logo. A new write-up, yes.

Wikipedia puts them all under one page: https://en.wikipedia.org/wiki/Transient_execution_CPU_vulner...


I don't even understand the impulse to lose the names. Names aren't achievement awards. We already have Best Paper awards at the Big 4 and the Pwnies (for however seriously you take that). The names don't cost anybody anything, and they're occasionally helpful.

Name them all.

You see the same weird discussions about CVEs, and people wanting to squash CVEs down (or not issue them at all) because the research work is deemed insufficient to merit the recognition. As if recognition for work was ever even ostensibly what the CVE program was about.


The author of the mail is Solar Designer, a bit of a legend AFAIC. He has no need to pump up his brand and he really really knows what he's doing.


Yeah. He created openwall and the oss-security list.


At least they do not name them after themselves.


Yes, only RHEL 9 (the current version of RHEL) and its upstreams/downstreams (CentOS Stream 9, Rocky Linux 9, Alma Linux 9,...).

Also affected: Fedora 37, 36 and possibly 35, which are all end-of-life (since December 2023 in the case of Fedora 37).

Not affected: Fedora 38 (also EOL), 39 (maintained) and 40 (current).


So the AI part is because the customers state their problem in plain English? In your experience, do they prefer that rather than a GUI?

Incidentally, what method do you use to solve the problem once the LLM gives you a formulation? Or is the LLM itself tasked with solving the problem directly?


Hi. Yes, the AI merely acts as an interface; it is not part of the algorithm. I personally find it challenging to expose everything as a UI, and the options can grow exponentially. In the future I plan to add more sophisticated features such as:

"- Do not add Clothes over Shoes, while Shoes and Buttons can be stacked over each other but do not exceed 140 cm height."

This in turn can be translated into a JSON-level config, but it is difficult to acquire such input with plain forms.

There are 2 different algorithms. In one, adding each box partitions the remaining space into 3 regions, and those spaces are recursively filled. Eventually the problem is reduced to finding the correct permutation of boxes along with their correct orientations.

That permutation is primarily found by simulated annealing, which I have had good experience with on combinatorial problems. Some magic heuristics, which I also don't fully understand, come into play as well :)
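Roughly, that kind of annealing loop over a permutation looks like the generic sketch below. cost() here is only a placeholder for the recursive packing evaluation; the real scoring is more involved and not shown.

  /* Generic simulated-annealing skeleton over a permutation of boxes.
     cost() is a dummy stand-in for the packing evaluation (e.g. wasted
     volume after recursively filling the partitioned spaces). */
  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  #define NBOXES 32

  /* Placeholder objective: lower is better. */
  static double cost(const int *perm, int n)
  {
    double c = 0.0;
    for (int i = 0; i < n; i++)
      c += (double)perm[i] * i;
    return c;
  }

  int main(void)
  {
    int perm[NBOXES], best[NBOXES];
    for (int i = 0; i < NBOXES; i++)
      best[i] = perm[i] = i;

    double cur = cost(perm, NBOXES), bestc = cur;

    for (double T = 1000.0; T > 1e-3; T *= 0.999) {
      /* Propose a move: swap two random positions in the permutation
         (orientation flips would be another move type). */
      int a = rand() % NBOXES, b = rand() % NBOXES;
      int tmp = perm[a]; perm[a] = perm[b]; perm[b] = tmp;

      double cand = cost(perm, NBOXES);
      double delta = cand - cur;
      if (delta <= 0 || exp(-delta / T) > (double)rand() / RAND_MAX) {
        cur = cand;                                       /* accept move */
        if (cur < bestc) {
          bestc = cur;
          for (int i = 0; i < NBOXES; i++)
            best[i] = perm[i];
        }
      } else {
        tmp = perm[a]; perm[a] = perm[b]; perm[b] = tmp;  /* undo move */
      }
    }

    printf("best cost found: %f\n", bestc);
    return 0;
  }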


The article is from 2018 and has had interesting discussion here before [1].

My conclusion is: In production, in a datacenter, when code is stable and compute-per-dollar efficiency matters? Yeah, sure, I can believe that swap makes sense.

On a dev machine? Not on mine for sure. If you think swap is a net positive on a dev machine, then try pasting the following (wrong) code in a bash prompt on Linux:

  echo '
  #include <stdlib.h>
  #include <stdio.h>
  #include <string.h>
  
  int main()
  {
    /* Keep allocating 1 GiB blocks until malloc fails. Every byte is
       touched, so the memory really gets committed (no overcommit rescue). */
    for (int i = 0; ; i++) {
      char *mem = malloc(1 << 30);
      if (mem == NULL)
        return 0;
      memset(mem, 42, 1 << 30);   /* touch all pages */
      printf("%5d GiB\n", i);
    }
    }
  }
  ' | cc -O3 -o crash_my_laptop -x c -
  ./crash_my_laptop
We can discuss the results in a couple of hours, when you recover control of your machine. (Note: with no swap and no z-ram, it crashes after 10 seconds; no side effects.)

[1]

https://news.ycombinator.com/item?id=40582029

https://news.ycombinator.com/item?id=39650114

https://news.ycombinator.com/item?id=38263901

https://news.ycombinator.com/item?id=31104126

https://news.ycombinator.com/item?id=29159755

https://news.ycombinator.com/item?id=23455051

https://news.ycombinator.com/item?id=16145294

https://news.ycombinator.com/item?id=16109058


> On a dev machine? Not on mine for sure. If you think swap is a net positive on a dev machine, then try pasting the following (wrong) code in a bash prompt on Linux:

> We can discuss the results in a couple of hours, when you recover control of your machine. (Note: with no swap and no z-ram, it crashes after 10 seconds; no side effects.)

That might be true for this contrived example. But my real-world experience is exactly the opposite.

In a case of memory over-usage (in my case because of working in a huge bazel project and several browsers open + some electron apps):

- without swap, the system at some point immediately became so incredibly slow that the only thing I could reasonably still do was restart my machine, while

- with swap, the system (at higher memory usage) just got noticeably slower and I could close some apps / the browser to get back into the normal usable regime.


My real world experience is the same as GP's and the opposite of yours.

In case of memory over-usage:

With swap: system becomes unresponsive

Without swap: OOM kills offending process

This has been my real world experience 100% of the time on Linux.


I'll add that swap is required for certain sleep modes (hibernate) so there are reasons to use it. I typically create a swap partition that's 1.2x system RAM on laptops for this reason, but don't for desktops (which also usually have more system RAM).


Let's be honest, users of this website are literally the worst possible group to reason about "typical PC usage"


The music is off as well (second half of the piece, I can't quite put my finger on it... the brass section may be out of tune?), but that part sounds more like a poor rendition by humans than something a computer would do.


In many cases yes. Some single-threaded workloads are very sensitive to e.g. memory latency. They end up spending most of their time with the CPU waiting on a cache-missed memory load to arrive.

Typically, those would be sequential algorithms with large memory needs and very random (think: hash table) memory accesses.

Examples: SAT solvers, anything relying on sparse linear algebra
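If you want to see the effect directly, a pointer-chasing microbenchmark makes it obvious: every load depends on the previous one, so once the working set is far larger than the last-level cache, the time per step is essentially memory latency. A generic sketch (not tied to the examples above):

  /* Sketch: pointer-chasing latency microbenchmark. Each load depends on the
     previous one, so the CPU mostly waits on cache misses. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  #define N ((size_t)1 << 26)   /* 64M entries * 8 bytes = 512 MiB, far beyond LLC */

  int main(void)
  {
    size_t *next = malloc(N * sizeof *next);
    if (!next) return 1;

    /* Build a single random cycle over all indices (Sattolo's algorithm),
       so the chase visits every slot in a cache-hostile order. */
    for (size_t i = 0; i < N; i++)
      next[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
      size_t j = (size_t)rand() % i;   /* j in [0, i); rand() bias is fine for a sketch */
      size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t p = 0;
    for (size_t step = 0; step < N; step++)   /* dependent loads */
      p = next[p];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%.1f ns per dependent load (p=%zu)\n", ns / N, p);
    free(next);
    return 0;
  }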


Obscure personal conspiracy theory: The CPU vendors, notably Intel, deliberately avoid adding the telemetry that would make it trivial for the OS to report a % spent in memory wait.

Users might realize how many of their cores and cycles are being effectively wasted by limits of the memory / cache hierarchy, and stop thinking of their workloads as “CPU bound”.


Arm v8.4 onwards has exactly this (https://docs.kernel.org/arch/arm64/amu.html). It counts the number of (active) cycles where instructions can't be dispatched while waiting for data. There can be a very high percentage of idle cycles. Lots of improvements to be found with faster memory (latency and throughput).


The performance counters for that have been in the chips for a long time. You can argue that perf(1) has unfriendly UX of course.


I think AMD has a tool to check something somewhat related (Cache misses) in AMD uProf


Right, so does Intel in at least their high end chips. But a count of last-level misses is just one factor in the cost formula for memory access.

I appreciate it’s a complicated and subjective measurement: Hyperthreading, superscalar, out-of-order all mean that a core can be operating at some fraction of its peak (and what does that mean, exactly?) due to memory stalls, vs. being completely idle. And reads meet the instruction pipeline in a totally different way than writes do.

But a synthesized approximation that could serve as the memory stall equivalent of -e cycles for perf would be a huge boon to performance analysis & optimization.
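For what it's worth, a rough version of that synthesized number can already be cobbled together from the generic events the kernel exposes, e.g. PERF_COUNT_HW_STALLED_CYCLES_BACKEND against total cycles. That event is only wired up to a vendor counter on some CPUs/kernels, so treat the following perf_event_open(2) sketch as an illustration of the ratio, not a portable tool:

  /* Sketch: backend-stall cycles vs. total cycles via perf_event_open(2). */
  #include <linux/perf_event.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int open_counter(uint64_t config, int group_fd)
  {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = config;
    attr.disabled = (group_fd == -1);   /* only the group leader starts disabled */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, group_fd, 0);
  }

  int main(void)
  {
    int cycles = open_counter(PERF_COUNT_HW_CPU_CYCLES, -1);
    int stalls = open_counter(PERF_COUNT_HW_STALLED_CYCLES_BACKEND, cycles);
    if (cycles < 0 || stalls < 0) {
      perror("perf_event_open (event unsupported on this CPU/kernel?)");
      return 1;
    }

    ioctl(cycles, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
    ioctl(cycles, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);

    /* Placeholder workload; substitute the code you actually care about. */
    volatile long sink = 0;
    for (long i = 0; i < 200000000L; i++)
      sink += i;

    ioctl(cycles, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

    uint64_t c = 0, s = 0;
    if (read(cycles, &c, sizeof c) != sizeof c || read(stalls, &s, sizeof s) != sizeof s)
      return 1;
    printf("backend-stalled: %llu / %llu cycles (%.1f%%)\n",
           (unsigned long long)s, (unsigned long long)c,
           c ? 100.0 * (double)s / (double)c : 0.0);
    return 0;
  }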

