Pannoniae's comments | Hacker News

"it's one atomic check after first init" And that's slow :P [0] If you don't need to access it from multiple threads, cutting that out can mean a huge difference in a hot path.

[0] https://stackoverflow.com/questions/51846894/what-is-the-per...


They already do....

The future of x86 is worrying, but it's nowhere near dead yet. I saw the C&C article yesterday and did some research. TL;DR:

- Apple took over the single-threaded crown a while ago.

- ARM also caught up in integer workloads.

- ARM Cortex is still behind in floating-point.

- Both are behind in multithreaded performance. (mostly because there are more high-end x86 systems...)

- Both are way behind in SIMD/HPC workloads. (ARM is generally stuck on 128-bit vectors; x86 is 256-bit on Intel and 512-bit on AMD. Intel will return to 512-bit on the consumer segment too.)

- ARM designs generally have way bigger L1 caches, mostly due to the larger page size, which is a significant architectural advantage.

- ARM is reaching these feats at ~4.5 GHz clocks compared to ~5.5 GHz on x86. (very rough approximation)

Overall, the future looks troubling for x86... it's an open question whether it will go the way of IBM POWER (legacy support with strict compatibility, but no new workloads at all), or whether it will keep adapting and evolving.


The performance/watt delta for M1 over contemporary x86 is massively larger than M5 vs Panther Lake. M5 and Panther Lake are roughly comparable.

So by that measure the future of x86 seems to be less troubling today than it was 5 years ago.


ARM CPUs are quite good in "general-purpose" applications, like Internet browsing and other things that do not have great computational requirements, as they mostly copy, move, search or compare things, with only a few more demanding computations.

On the other hand, most ARM-based CPUs, even those of Apple, have quite poor performance for things like arithmetic operations with floating-point numbers or with big integer numbers. Geekbench results do not reflect at all the performance of such applications.

This is a serious problem for those who need computers for solving problems of scientific/technical/engineering computing.

During the half-century when IBM PC compatible computers have been dominant, even if the majority of users never exploited the real computational power of their CPUs, buying a standard computer automatically provided, at a low price, a good CPU for the "power" users who need one.

Now, with the consumer-oriented ARM-based CPUs that have been primarily designed for smartphones and laptops, and not for workstations and servers, such computers remain good for the majority of the users, but they are no longer good enough for those with more demanding applications.

I hope that Intel/AMD based computers will remain available for a long time, so that one can still buy computers with good performance per dollar, taking into account their throughput for floating-point and big integer computations.

Otherwise, if only the kinds of computers made by Apple and Qualcomm were available, users like me would have to buy workstations and servers with performance per dollar many times lower than what today's desktop CPUs achieve.

This kind of evolution has already happened with GPUs: a decade ago one could buy a cheap gaming GPU that nevertheless had excellent performance for scientific FP64 computing. Such GPUs have since disappeared, and today's gaming GPUs can no longer be used for such purposes; one would have to buy a "datacenter" GPU instead, and those cost an arm and a leg.


https://browser.geekbench.com/v6/cpu/15805010

I see x86 on top (the first valid result is 6841, which is x86), if that is the sole benchmark we're going to look at. You can further break that down into the individual tasks it performs, but I'm not going to :-)

> - ARM generally have way bigger L1 caches, mostly due to the larger pagesize, which is a significant architectural advantage.

Larger pages mean more potential for waste.


> https://browser.geekbench.com/v6/cpu/15805010

Not to bash on x86 or anything, but that's an outlier. Very overclocked with a compressor chiller or similar. Also the single-threaded and multi-threaded scores are the same; it's probably not stable at full load across all cores.

I don't think that's really representative of the architecture at scale, unless you're making the case for how overclockable (at great power/heat cost) x86 is.


Actually, it isn't that different. Compilers are trash: they produce hilariously bloated and stupid code, even the C++ compilers, to say nothing of your average JIT compiler.

However, in practice we don't care, because it's good enough for 99% of the code. Sure, it could be at least 5x better, but who cares, our computers are fast enough™.

AI is the same. Is it as good as the best human output? Definitely not. Does it do the job most of the time? Yes, and that's what people care about.

(But yes, for high-impact work there are many people who know how to read x64 asm or PTX/SASS, and they do insane stuff.)


That's not what a derivative work means, though. Being exposed to something doesn't mean you can't create an original work which is similar to it (otherwise every song or artwork would be a derivative work of everything before it).

People do cleanroom implementations as a precaution against a lawsuit, but it's not a necessary element.

In fact, even if some parts are similar, it's still not a clear-cut case - the defendant can very well argue that the usage was 1. transformative 2. insubstantial to the entirety of work.

"The complaint is the maintainers violated the terms of LGPL, that they must prove no derivation from the original code to legally claim this is a legal new version without the LGPL license."

The burden of proof is on the accuser.

"I am genuinely asking (I’m not a license expert) if a valid clean room rewrite is possible, because at a minimum you would need a spec describing all behavior, which seems to require ample exposure to the original to be sufficiently precise."

Linux would be illegal if so (they had knowledge of Unix before), and many GNU tools are libre API-compatible reimplementations of previous Unix utilities :)


Because it is literally the best way to design and everyone else is wrong. Look at actual HCI studies. There's exactly zero arguments for any kind of flat or minimalistic design outside of art, or if you want to make a statement.

The only reasons it's used are that it's cheaper and faster to make, it's perfectly soulless so it won't upset anyone, and it's trendy.


> There's exactly zero arguments for any kind of flat or minimalistic design outside of art, or if you want to make a statement.

If that were true, road signage would look a lot different than it does.

Minimalistic design clearly has advantages when quickly grasping intent is key.


Road signage has a lot of constraints that don't apply to computer UIs.


You’re kinda proving the parent’s point.

>There's exactly zero arguments for any kind of flat or minimalistic design outside of art

Here’s one: helping the interface stay out of the way, removing clutter so the actual content of the app takes focus instead.

I can tell you it works because with the new Glass stuff everything is begging for attention again, and I hate it.

And just to be clear, I’m not voting for design overflattened to the point one can’t tell icons apart. For me, around 4 in the diagram is the ideal middle point.


What he’s saying (behind too many opinions) is that actual HCI studies, collected in something resembling a scientific manner, show very clearly that skeuomorphic designs work better, for many clearly defined metrics of better.


> helping the interface stay out of the way, removing clutter so the actual content of the app takes focus instead.

Yeah, like when I need to guess what is clickable and what isn't...


>You’re kinda proving the parent’s point.

Exactly, I agree with the parent! They're right, it only happens that their strawman is actually true :)


thank you for providing an exemplar


Compilers massively outperform humans if the human has to write the entire program in assembly. Even if a human could write a sizable program in assembly, it would be subpar compared to what a compiler would write. This is true.

However, that doesn't mean that looking at the generated asm / even writing some is useless! Just because you can't globally outperform the compiler, doesn't mean you can't do it locally! If you know where the bottleneck is, and make those few functions great, that's a force multiplier for you and your program.


It’s absolutely not useless, I do it often as a way to diagnose various kinds of problems. But it’s extremely rare that a handwritten version actually performs better.


yo, completely off topic, but do you work on a voxel game/engine?


yes and you already know me lol, we have been chatting on discord :P


It's a bit of a generic complaint, but quite apt for the subject matter. Mission creep kills projects, and that's true across a broad range of activities.

More specifically in the case of software, egos kill projects, and expanding the scope of your project to include broader economic or social causes usually does the same.

This is correlated with a huge change in nerd culture - pseudonymity was much more common and encouraged, with people's real-life identities or views not really taken into account. ("on the internet, nobody knows you're a dog")

Social media happened, and now most people use their real-world identities and carry their real-life worldview into the internet.

This had a huge negative effect on internet toxicity and interpersonal trust, and Eich is a good example of that - auxiliary things being dredged up about someone, used as a cudgel against them for their real or perceived transgressions.

The end result is that effective project management has become a rare breed and we see all these colossal failures like Firefox...


This comment is interesting and adds to the discussion but it would be quite a bit better without the flamewar-style swipes in the last sentence :)


Please explain how it adds to the discussion about different ways to broaden supported Rust target architectures. Because both have the word Rust in them?


It contained an interesting link and I tried to be friendly. You're right though, not very on topic.


A preprocessor step mostly solves this one. No one said that the shader source has to go into the GPU API 1:1.

Basically do what most engines do - have preprocessor constants and use different paths based on what attributes you need.

I also don't see how separated pipeline stages are against this - you already have this functionality in existing APIs where you can swap different stages individually. Some changes might need a fixup from the driver side, but nothing which can't be added in this proposed API's `gpuSetPipeline` implementation...

