xgstation's comments | Hacker News

Not sure if this is what the OP was referring to, but something like this one: https://news.ycombinator.com/item?id=41450347

I didn't find threads regarding "clarify APIs semantics", but the kernel docs are indeed not in very good condition. Since C does not provide the same level of soundness that Rust does, there are many hidden traps.

An Asahi developer had a good discussion about this: https://threadreaderapp.com/thread/1829852697107055047.html


This overall situation is, yes. And the stuff from Lina is related, thanks for also pointing that out.

To be fair, Qualcomm chips run Linux (Android) all the time.


Yes, but usually they run very old kernels, with many of the drivers patched out of tree instead of being upstreamed. Because of that, many system components (firmware, bootloader, etc.) are either old, unmaintained, buggy, or affected by lots of CVEs.

This new SoC has Qualcomm's commitment to use the latest kernel (6.9) and to have better support. We'll see what this means for the future, but it looks better already.


yeah we'll see... the historical evidence is entirely contrary to this


Yeah, you are right. Android is a variant of Linux.

Just like macOS and iPadOS: both run the Darwin kernel, but they are very different. Even though iPads have very capable hardware, there is a lot you cannot do on iPadOS.

The same applies to Linux. People want a real desktop operating system.


All that iPadOS needs to do is to be able to run macOS apps.


Found the fervent GNU worshipper.

Jokes aside, yes: Android is a Linux.

Just like how Xbox is Windows NT, and MacOS (and iOS?) is(/are?) BSD.

Delineating along kernels is one useful and objective way of describing OS families.


Unfortunately, when people call their OS Linux, they are typically referring to its kernel. I wish there were a better term to distinguish Linux distros from the kernel.


We will just need to use quantum to beat quantum. QKD (quantum key distribution) is much more mature compared with quantum computers, which, from a practical standpoint, are still far away from cracking crypto algorithms.

Twin-field quantum key distribution over 830-km fibre https://www.nature.com/articles/s41566-021-00928-2


What is the problem that QKD is solving? On the one hand you need a totally separate point-to-point network for the quantum connection. On the other hand you get relatively short symmetric keys on both ends. I’ve yet to see a proposal that wouldn’t be better served by transferring the key material via DVD/USB-stick/whatever and an armed guard. I’m certain I’m missing something rather obvious.. but I don’t see what


I do not think you are missing anything that would change your cost-benefit analysis, but here are two things that might be of interest:

- A QKD link would be much lower latency than transmitting a physical token over an authenticated channel (same type of advantage as with asymmetric key encryption, but without the drawback of relying on assumptions about computational complexity)

- It does not need to be point-to-point if you have a network of quantum memories/repeaters (which are probably much easier to build than quantum computers).


Thanks for the reply. I’ll give you latency, but that’s almost never a problem in the first place. You need to (re)authenticate anyhow. I don’t think your second point holds though. Even if we assume memories/repeaters to exist (iiuc this should contradict the no-cloning-theorem) you’d need to trust them not to listen in so you’d be back to square one with electronic key distribution schemes?

There might be an argument that one is unable to secure the two sets of key material (at least on one end and at least long term) and the destination is hard to reach (e.g. James Webb or so). But at that point I'd also not trust that organization to implement their end of QKD correctly...


> Even if we assume memories/repeaters to exist (iiuc this should contradict the no-cloning-theorem) you’d need to trust them not to listen in

The whole point of having repeaters is that it's impossible for them to listen in, for the same reason it's impossible to just "listen in" on a fiber transmitting the QKD quantum signals. Repeaters don't contradict no-cloning.


QKD isn't really that suitable for the internet as it is today. You also still have to authenticate the server somehow.


The utility of a cryptosystem is the extent to which it force-multiplies compute in the defensive direction. Crypto that depends on equally computationally leveraged parties is broken crypto.


But how far away are we from hardware for stable quantum key generation/storage that would fit on a tabletop, much less in a home laptop or smartphone? It's almost certain that consumer devices will stay with classical computing and use PQE.


QKD is not battle-tested. Like RSA, a naive textbook implementation would be worthless: messy physical implementations will leak information through side-channels.

If quantum cryptography is the solution, it won't arrive immediately, or for free.


DDR5 ECC is on-die ECC, built into the RAM die itself, while regular ECC relies on extra bus width (72 bits instead of 64) going all the way to the CPU, so errors are also detectable on the CPU (hence OS-level) side, and errors introduced during transmission get corrected as well.
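
As a rough illustration of where that 72-bit figure comes from (this is just the textbook SECDED bit-count argument, not the exact code any particular DIMM or memory controller uses):

    // Sketch only: a SECDED (single-error-correct, double-error-detect)
    // extended Hamming code over a 64-bit data word needs 7 Hamming check
    // bits plus 1 overall parity bit, hence the 72-bit bus of ECC DIMMs.
    #include <cstdio>

    int main() {
        int data_bits = 64;
        int check_bits = 0;
        // Hamming condition: 2^r >= data_bits + r + 1
        while ((1 << check_bits) < data_bits + check_bits + 1)
            ++check_bits;                        // ends up as 7
        int total = data_bits + check_bits + 1;  // +1 overall parity -> 72
        printf("%d data + %d check = %d-bit bus\n",
               data_bits, check_bits + 1, total);
        return 0;
    }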


I strongly feel GPU programming (or specifically CUDA) needs some language-level support like coroutines/async/await to organize the data flow and execution among different dispatched device-side function calls, and on top of that some synchronization primitives between different blocks/warps, etc.
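
For the inter-block part specifically, CUDA's cooperative groups API already gets part of the way there. A rough sketch (kernel name, buffers, and launch sizes are made up for illustration), assuming a device that supports cooperative launch:

    #include <cooperative_groups.h>
    #include <cuda_runtime.h>
    namespace cg = cooperative_groups;

    // Two dependent phases in one kernel: phase 2 may read values written by
    // threads in *other* blocks during phase 1, so a grid-wide barrier is
    // needed in between (a per-block __syncthreads() is not enough).
    __global__ void two_phase(const float* in, float* mid, float* out, int n) {
        cg::grid_group grid = cg::this_grid();
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) mid[i] = 2.0f * in[i];             // phase 1
        grid.sync();                                   // wait for every block
        if (i + 1 < n) out[i] = mid[i] + mid[i + 1];   // phase 2, reads neighbours
    }

    // Cooperative kernels must go through cudaLaunchCooperativeKernel so the
    // runtime can guarantee all blocks are resident at once; otherwise
    // grid.sync() could deadlock.
    void launch(const float* in, float* mid, float* out, int n) {
        int threads = 256, blocks = (n + threads - 1) / threads;
        void* args[] = { &in, &mid, &out, &n };
        cudaLaunchCooperativeKernel((void*)two_phase, dim3(blocks), dim3(threads),
                                    args, 0, nullptr);
    }

It still doesn't give async/await-style dataflow, but it is the closest thing to a cross-block synchronization primitive today.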



Worth noting that a GPU is essentially a hardware scheduler for large numbers of small threads that yields whenever one needs to wait for memory. They don't have a great way of changing the working set of threads.


This is interesting. The article mentioned Zen4c is the same architecture as Zen4 but optimized for density, running at lower frequency. A question, if anyone knows the answer: does high frequency require significantly more transistors? And does optimizing for density also mean less power consumption (assuming both Zen4 and Zen4c run at the same frequency)?


For a given process design kit (PDK), the synthesis tool will have a few different types of transistors. They correspond to different trade-offs between size and power on one hand, and speed on the other. The faster the transistor, the bigger it is and the more it leaks (a lower threshold voltage means a faster switching time, but more leakage).

For a given target frequency, the synthesis tool will always use the most efficient transistors it can, and the result is a mix of the few available types. But the higher the frequency, the higher the proportion of faster and bigger transistors in the mix.

This is the bird's eye view and very simplified, but hopefully enough to get the idea ;)


It's wild that I work on the "hardware" (kernel, cgroups, vfio, qemu) and know so little about what goes into building the actual hardware.

I think this post just taught me more about the EE involved in chips than I've learned over 15 years.


It may require more transistors in some places, e.g. in longer buffer chains needed to drive greater capacitances at higher frequencies, but it requires mostly bigger transistors in many places.

According to AMD, both the big core and the small core use the same RTL design, but a different physical design, i.e. they use different libraries of standard cells (optimized either for high speed or for low area and low power consumption) and different layouts in the custom parts.


My understanding is that AMD approaches the challenge of high core counts for multithreaded work versus high frequency for single- or lightly-threaded work in a very different way from Intel.

Intel's approach is: here are some real beefy cores that can do anything, and here are some weaker cores that can only do some tasks.

AMD's approach is: here is one half of the cores that can go real fast, here is the other half that must remain slower, but every core can do everything.

In theory, Intel could have better perf if software is optimized for it, while AMD could have better perf with any generic random app out there... as long as the OS has enough hints to put the right app on the right core, and bothers to do it.


I think it's much less a philosophical difference and much more about what they had lying around. Intel had Atom core designs available to pair up with their desktop cores, and combining them into one chip was clearly a rush job rather than the plan from the start.

On the other hand, AMD only really has their Zen series of cores to use, but they rely more than Intel on automated layout tools so they can more easily port designs to a different fab process or do a second physical layout of the same architecture on the same process.


> Intel's approach is: here are some real beefy cores that can do anything, and here are some weaker cores that can only do some tasks.

This isn't true anymore as of Intel's current CPUs (Meteor Lake). Both P and E cores support the same instruction set, including AVX10.


They don't require more transistors, they require bigger transistors. Ideally, if transistor A is pushing a line with twice the capacitance attached compared to transistor B, transistor A would be twice as wide and so have twice the drive current of transistor B. But of course making transistors bigger increases the capacitive load of driving them[1]. So you solve for an equilibrium, trading off the current-to-capacitance ratio against total chip size. And Zen 4 versus Zen 4c choose different ratios to optimize for.

[1] Back in the day due to the intrinsic capacitance of the transistors themselves. These days more because bigger transistors are further apart leading to more line capacitance.
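
A very rough way to put numbers on that equilibrium (first-order RC delay model, not AMD's actual figures):

    delay ≈ R_drive * C_load,   with   R_drive ∝ 1/W   and   C_in ∝ W

So making a gate k times wider cuts its delay into a fixed load by roughly k, but also makes it k times harder for the previous stage to drive; the tools pick widths (and hence area) so that the load ratio seen by each stage stays reasonable, and Zen 4 and Zen 4c simply target different points on that size-versus-speed curve.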


Is none of it based off of binning now, with sections of lower-performing chiplets or cores fused off to make the efficiency cores?


It is very rare to be able to fuse off part of a CPU core. Fusing off part of its cache is common, but other than that the only example that comes to mind is some server CPUs where Intel fused off the third vector unit.


Oh, right, this is part of a core, not a whole core. AMD often fuses those off for lower part-number SKUs.

FWIW Intel supposedly "fused off" AVX-512 in Alder Lake, though I don't think that was what was actually done, physically speaking.


No, binning can't make cores physically smaller which is what AMD is doing.


High clock rates require smaller clock domains, where everything needs to happen in the same clock cycle. If you break the same logic into smaller clock domains, you need buffers between the domains. Zen4c significantly dropped the max frequency, so there are fewer clock domains and much less chip area spent on buffering transistors.

Otoh, modern power management involves clock gating --- turning off the clock in specific clock domains that aren't being used at the moment; having fewer clock domains makes that less granular and potentially less effective.

Others' points about individual transistors being smaller in a lower-frequency design also apply. There may be other complementary benefits from lowering the frequency target too.

But note, it's not magic. The Zen4c server parts, where die area has been most thoroughly disclosed, use a lot less space per core and for L1 cache, but L2 and L3 cache take about the same area per byte as on Zen4.


Yeah, I think high frequency requires more transistors to do buffering of signals. Also, reducing the cache speed and size allows simpler, smaller designs to be used. Finally, reduced frequencies mean that you don't need as high a voltage to force signals to go to 0 or 1 quickly, so you need less power. All of this gives Zen4c lower power consumption at the same frequency.


AFAIK some famous ones are InfluxDB, Fish, and of course some parts of the Linux kernel (v4l2). Most of them improved performance, and since Rust provides abstraction at zero cost, it does make life easier.


AFAIK physicists use it very heavily, but I barely know any mathematicians who use it. To me this is a great tool for people who use math heavily as a tool but do not study math itself.

I use it for symbolic calculation, solving differential equations, and many complicated integrals, and its visualization built on top of those, with easy parametrization support, is very nice. Starting from my undergrad sophomore year as a physics major, we had courses that required us to finish some homework with Mathematica.

I can hardly find any other tool to replace Mathematica in terms of symbolic calculation and doing complicated integrals (there is a joke that calls Mathematica a "large-scale integral table").


Agree, Wolfram Alpha is heavily used by some students of physics to do their homework. We sometimes joked that was the main purpose of the service.

Mathematicians probably have trust issues and use tools with a code base that is 3 orders of magnitude smaller.


> The uncertainty created by the law has already put a freeze on making offers to research students in China for the fall of 2024, which normally happens in December and early January. “We have missed that window,” says Zhong-Ren Peng, a UF professor of urban planning who leads a center for adaptation planning and design. “And the best students cannot wait. Instead, they will go somewhere else.”

It reminds me of my undergrad time: there was an Iranian classmate of mine (already a physics major) who was blocked from getting a second F1 visa (the visa for foreign students) for grad school to study quantum information, and who just ended up being a software engineer. The ironic part is that most quantum information work is more publicly accessible than proprietary code in some big corps.


Resizing a list or deleting items from it is, in most implementations, either not lock-free or not thread-safe.
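
A minimal sketch of why, using C++'s std::vector as a stand-in for a growable list (names and sizes are just for illustration): concurrent push_back calls can race on a reallocation, so you either accept a data race or take a mutex, and the mutex is by definition not lock-free.

    #include <mutex>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> v;
        std::mutex m;  // required for correctness, but makes the structure not lock-free
        auto writer = [&] {
            for (int i = 0; i < 100000; ++i) {
                std::lock_guard<std::mutex> lock(m);
                v.push_back(i);  // may reallocate and move every existing element
            }
        };
        std::thread t1(writer), t2(writer);
        t1.join();
        t2.join();
        return 0;  // without the mutex this would be a data race (undefined behavior)
    }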


Yeah, but what if you just use smaller lists and point to the next/previous list?

Hold on...


Linked lists aren't either so....?

