Hacker News | russdill's comments

Disagree. That would be like saying the more advanced transportation becomes, the more like a horse it will be.

Shining-brass 25 ton, coal-powered, steam-driven autohorse! 8 legs! Tireless! Breathes fire!

I'm seriously doubtful that `adc` is inherently slower than `add` on a modern CPU, other than the data hazard introduced by the carry bit. I realize the point of the article is the data hazard, so this is a really minor nit.


uops.info lists the latency for both (on Alder Lake) as 1 cycle, but the reciprocal throughput (cycles per instruction, lower is better)

* for `add` is 0.20 (i.e. 5 per cycle)

* for `adc` is 0.50 (i.e. 2 per cycle)

so it does seem correct.

This seems to be a consequence of `add` being available on ports 0, 1, 5, 6, & B, whereas `adc` is only available on ports 0 & 6.

So yes, as an individual instruction it's no worse, but even a run of non-dependent `adc` instructions will fare worse under OoO execution (which is a more realistic view than looking at a single instruction in isolation).
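
As a rough illustration of the port argument, here is a minimal sketch that estimates how many cycles a stream of fully independent instructions needs when the only limit is the number of ports that can execute them. The port counts (5 for `add`, 2 for `adc`) are the uops.info figures quoted above. The function name `min_cycles` is hypothetical, and so is the assumption that port availability is the only bottleneck: a real core has other limits (front-end width, retirement) that this ignores.

```python
import math


def min_cycles(instruction_count: int, port_count: int) -> int:
    """Lower bound on cycles for a run of independent single-uop instructions.

    Assumption (not from the thread): each eligible port accepts one such
    uop per cycle and nothing else in the core is a bottleneck.
    """
    return math.ceil(instruction_count / port_count)


# Port counts quoted in the thread:
#   add -> ports 0, 1, 5, 6, B   adc -> ports 0, 6
ADD_PORTS = 5
ADC_PORTS = 2

if __name__ == "__main__":
    n = 1000
    print("add:", min_cycles(n, ADD_PORTS), "cycles")
    print("adc:", min_cycles(n, ADC_PORTS), "cycles")
```

Under these assumptions the same number of independent instructions takes 2.5x as many cycles as `adc` than as `add`, which matches the 0.50 vs 0.20 reciprocal throughput figures.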


Intel is also supposed to introduce the new APX instructions, which include a bunch of instructions that duplicate existing ones but don't set any flags. The only plausible reason to add these is performance.


This isn't just due to the actual dependencies of flag instructions at the hardware level (although those are likely a factor); it also majorly affects code layout. On Arm64, for example, you can make a comparison, do other operations, and then consume the result of that comparison afterwards, which is excellent for the pipeline and OoO engine. However, because most instructions on x86_64 write flags, you can't do this, and so you are forced to cram `jcc`/`setcc` instructions right after the comparison, which is less friendly to compilers and the OoO engine.


OoO should actually be the case where that doesn't matter, I'd think: the CPU can, well, execute the instructions in a different order from the one they have in the binary. It's in-order implementations where that matters more.

And with compare & jump being adjacent they can be fused together into one uop, which Intel, AMD, and Apple Silicon all do.


Note: I've since learnt that port B is just port 11 in all the Intel docs; uops.info writes port numbers in hex to keep them single-character.


You are right: it can be done with the same ALU, for sure. But the data dependency on the carry flag makes it a really different instruction from the CPU's point of view: three data dependencies instead of two. For the CPU it is beneficial to treat the instructions differently.
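
To make the "three inputs instead of two" point concrete, here is a small sketch of multi-word addition written out in Python. It is only a model of what an `add` followed by a chain of `adc` instructions computes, not how any CPU implements it; the 64-bit limb width and the helper names (`add_limbs`, `to_limbs`, `from_limbs`) are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1  # keep each limb to 64 bits


def add_limbs(a, b):
    """Add two equal-length little-endian lists of 64-bit limbs.

    Returns (sum_limbs, carry_out).
    """
    carry = 0  # the first step has no carry-in, like a plain add
    out = []
    for x, y in zip(a, b):
        total = x + y + carry    # three inputs: x, y and the previous carry
        out.append(total & MASK64)
        carry = total >> 64      # feeds the next step: a serial dependency
    return out, carry


def to_limbs(value, count):
    """Split a non-negative integer into `count` little-endian 64-bit limbs."""
    return [(value >> (64 * i)) & MASK64 for i in range(count)]


def from_limbs(limbs):
    """Reassemble an integer from little-endian 64-bit limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))
```

Each loop iteration cannot start until the previous one has produced `carry`, which is the data hazard the thread is discussing: the limb additions of one big number are serialized even though additions on unrelated values are not.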


CPUs are really funny and interesting things. We programmers work with them daily and make so many assumptions about them, as well as about the whole code chain: the compiler, runtimes, how code works when it comes to loops, methods, etc., you name it.

I've been working on my own Entity Component System in C# and basically had to start from the ground up and test every assumption possible. There have only really been a few instances where my gut was correct; more often than not there are surprising gotchas hidden everywhere.


It's because they are providing abstractions which we/the compilers use, but just doing that would be too slow, so they implement optimizations, but those are based on certain assumptions, so then the users adjust what they do to match those assumptions well, so the optimizations have now leaked into the API, and after many rounds of doing this for decades, you end up with this terrible mess we are in.


If you get enough, you might be able to do some really interesting things using it as a phased array


Pretty much anywhere you look, you find instances https://en.m.wikipedia.org/wiki/Lollipop_(1958_song)


There's an excellent thread with someone doing just that, but Friday.

https://community.home-assistant.io/t/fridays-party-creating...


How is that not exactly zigbee and matter?


So what you're telling me, is that remote workers tend to be the most innovative and industrious. Got it.


Yes, lighting the house on fire may not have been the best plan, but in all fairness it was a mess and something had to be done.


Yes.


what are your thoughts on the replication crisis?


It's always confused me a bit. It's not like if you put 10kWh into the reactor, that 10kWh goes away. You still lose a significant fraction of it in inefficiency of the cycle but it still goes towards heat which can be used to heat steam and turn a turbine. iirc, you can get about 4kWh back.

On the other side of the coin, if you put 10kWh in and get 10kWh of fusion out, that's 20kWh to run a steam turbine, which nets you about 8kWh. So really you need to be producing 15kWh of heat from fusion for every 10kWh you put in to break even.
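
The arithmetic can be sketched in a few lines. This assumes, as the figures above imply, that about 40% of all heat (input energy plus fusion output) comes back as electricity; the function names and the fixed 0.4 efficiency are illustrative assumptions, not measured values.

```python
def electricity_out(input_kwh, fusion_kwh, efficiency=0.4):
    """Electricity recovered when all heat (input + fusion) drives a turbine."""
    return (input_kwh + fusion_kwh) * efficiency


def break_even_fusion(input_kwh, efficiency=0.4):
    """Fusion heat needed so the recovered electricity equals the input.

    Solves (input + fusion) * efficiency == input for fusion.
    """
    return input_kwh * (1 / efficiency - 1)


if __name__ == "__main__":
    print(electricity_out(10, 0))    # no fusion at all: part of the input comes back
    print(electricity_out(10, 10))   # fusion output equal to the input
    print(break_even_fusion(10))     # fusion heat needed to break even on 10 kWh in
```

With a different turbine efficiency the break-even point moves: at 50% it would take only as much fusion heat as was put in, and at lower efficiencies correspondingly more.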


Cars are a good analogy. You wouldn't talk about miles per gallon until you have an engine that idles. Humans are in the engine building phase.


That’s a good analogy - and the situation right now is trying to make a car that doesn’t use its entire tank of fuel before it arrives at the service station.


You can't always get this much energy back. Sometimes your waste heat is an enormous pool of warm water.

