Hacker News | gok's comments

It's kind of too bad Linux doesn't just support multiple base page sizes.

MoE is in general kind of a stupid optimization. It seems to require around 5x more total parameters for the same modeling power as a dense model, in exchange for needing around 2x less memory bandwidth.

The primary win of MoE models seems to be that you can list an enormous parameter count in your marketing materials.


Stupid? By paying 5x (normally 2-4x, but whatever) of a thing you don't care about at inference you can gain 2x in the primary thing you care about at inference. It's like handing out 4 extra bricks and getting back an extra lump of gold.


The general rule of thumb when assessing MoE <-> Dense model intelligence is SQRT(Total_Params*Active_Params). For DeepSeek, you end up with ~158B params. The economics of batch inferencing a ~158B model at scale are different when compared to something like DeepSeek (it is ~4x more FLOPS per inference after all), particularly if users care about latency.
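To make that rule of thumb concrete, here is a minimal sketch of the arithmetic. The 671B total / 37B active figures are an assumption (they are not stated in the comment above), and `dense_equivalent_params` is a made-up name for illustration only.

```python
import math


def dense_equivalent_params(total_params: float, active_params: float) -> float:
    """Rule-of-thumb dense-equivalent size: geometric mean of total and active params."""
    return math.sqrt(total_params * active_params)


# Assumed figures, not taken from the comment: 671B total, 37B active per token.
TOTAL = 671e9
ACTIVE = 37e9

equiv = dense_equivalent_params(TOTAL, ACTIVE)

# Per-token compute scales roughly with the parameters touched per token,
# so this ratio approximates dense-equivalent FLOPs vs. MoE FLOPs.
flops_ratio = equiv / ACTIVE

print(f"dense-equivalent: ~{equiv / 1e9:.0f}B params")
print(f"dense-equivalent vs. MoE compute per token: ~{flops_ratio:.1f}x")
```

With those assumed inputs the result lands close to the ~158B and ~4x figures quoted above. The rule itself is a heuristic, not a measured law.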


Curious how this compares with, say, the implementation of gemm_s8s8s32 in Intel's MKL / oneAPI.


Which means Tencent (Epic's parent company) will finally be free to open their app stores world-wide.


Consider adopting `os_sync_wait_on_address()` on Darwin for your futex needs


I've used that. It's just as good as ulock although relatively new. The issue is that using this API makes cancelation points no longer atomic. SIGTHR needs to be able to know the exact instruction in memory where an asynchronous signal is delivered when interrupting a wait operation and that's not possible if it's inside an opaque library.



I would have assumed the major benefit to precision is that it enables compaction…


You can still compact with a conservative scanner, you just have to accommodate pinned regions.


How do you compact with conservative GC? You can't change the pointer values because they might not be pointers, right?


Any object which is referenced by the stack cannot be moved, but the rest of the heap (i.e. the vast majority, assuming the stack is much smaller than the heap) still can.
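A toy sketch of that idea, in the spirit of mostly-copying collectors: stack words are scanned conservatively and pin whatever they appear to point at, while heap-to-heap references are assumed to be precise and get rewritten. Everything here (`Obj`, `compact`, the integer "addresses") is hypothetical and simplified. Interior pointers, page-granularity pinning, and in-place sliding are all ignored, and the new heap is built as a fresh dict.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    size: int
    fields: list[int] = field(default_factory=list)  # precise heap pointers (addresses)


def compact(heap: dict[int, Obj], stack_words: list[int]) -> dict[int, Obj]:
    # 1. Conservative root scan: any stack word that equals an object address
    #    pins that object, since we cannot tell a pointer from an integer.
    pinned = {w for w in stack_words if w in heap}

    # 2. Mark everything reachable from the pinned roots via precise heap fields.
    live, work = set(), list(pinned)
    while work:
        addr = work.pop()
        if addr not in live:
            live.add(addr)
            work.extend(heap[addr].fields)

    # 3. Pinned objects keep their address; the rest are packed toward
    #    address 0, skipping over the pinned regions.
    pinned_ranges = sorted((a, a + heap[a].size) for a in pinned)
    forward = {a: a for a in pinned}
    cursor = 0
    for addr in sorted(live - pinned):
        size = heap[addr].size
        for start, end in pinned_ranges:
            if cursor < end and cursor + size > start:  # would overlap a pin
                cursor = end
        forward[addr] = cursor
        cursor += size

    # 4. Rewrite heap fields through the forwarding table. Stack words are
    #    never touched, which is why their targets had to stay put.
    return {
        forward[a]: Obj(heap[a].size, [forward[f] for f in heap[a].fields])
        for a in live
    }


heap = {
    0: Obj(16),         # garbage
    16: Obj(16, [48]),  # referenced from the stack, so pinned
    32: Obj(16),        # garbage
    48: Obj(16, [64]),  # reachable only from the heap, so movable
    64: Obj(16),        # reachable only from the heap, so movable
}
stack_words = [16, 12345]  # 12345 is just an integer, not an address

new_heap = compact(heap, stack_words)
```

In this example the object at 16 stays where it is, the two movable objects are packed into the gaps around it, and the field inside the pinned object is updated to the new location of its target.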


Computer security is not a serious field. There is no other group that honestly feels "do what I meant, not what I said" is a sign of someone else's bug.


It's hundreds of milliseconds to do a million iterations. A single time check is hundreds of nanoseconds.


Oh, thanks. The table was unlabeled and I missed that in the text.

Hundreds of nanos isn't great but it's certainly better than milliseconds.


So these guys are just dumping confidential tax documents onto OpenAI's servers huh.


Hopefully it won't end up as training data.

