qeternity's comments

"Usable" is the key word here. Not all context is created equal.

Have a look at the RULER benchmark for a bit more detail.


What? There are highly optimized Marlin kernels for W8A16 that function very well on a 3090 in both FP8 and INT8 formats.


Well, FP4 (mostly INT4 these days) means you are moving fewer bits per weight around, so the memory bandwidth that you have performs better.


That doesn't make sense to me. Memory bandwidth refers to the throughput of moving data from memory. Even if FP4 calculations aren't natively supported, you still move 4 bits of data per number from memory and then cast it to a FP8 or FP16 or other higher-precision number, right?


Llama.cpp already moves weights as small as 1.5-bit to the cores; they are converted in SRAM, which has much higher bandwidth.

FP4 only helps batched inferencing.
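A minimal sketch of what the thread is describing, assuming a simple two-nibbles-per-byte layout (this is illustrative only, not llama.cpp's or Marlin's actual kernel format): only the packed bytes cross the memory bus, so 4-bit weights move half the bytes of INT8 and a quarter of FP16, and the cast back to float happens on-chip.

```python
# Hypothetical INT4 packing sketch: two signed 4-bit weights per byte.
# Memory traffic is halved vs. INT8; dequantization to float happens
# after the load, in fast on-chip memory.

def pack_int4(weights):
    """Pack signed INT4 values (-8..7), two per byte."""
    packed = bytearray()
    for i in range(0, len(weights), 2):
        lo = weights[i] & 0x0F
        hi = (weights[i + 1] & 0x0F) if i + 1 < len(weights) else 0
        packed.append(lo | (hi << 4))
    return bytes(packed)

def unpack_dequant(packed, scale):
    """Unpack nibbles and cast back to float, as a kernel would on-chip."""
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            val = nibble - 16 if nibble >= 8 else nibble  # sign-extend 4 bits
            out.append(val * scale)
    return out

weights = [-8, -3, 0, 7]
packed = pack_int4(weights)
assert len(packed) == len(weights) // 2  # half the bytes of INT8
assert unpack_dequant(packed, scale=1.0) == [-8.0, -3.0, 0.0, 7.0]
```

At batch size 1, decoding is memory-bandwidth-bound, which is why shrinking the bytes moved per weight helps even when the math units don't support 4-bit natively.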


Well, it's not...it gets most details wrong.


Can you elaborate?


GPT-4 was rumored to be 1.8T params...not 1.2

And the successor model was called "Orion", not "Omni".


Appreciate the corrections, but I'm still a bit puzzled. Are they wrong about 4.5 having 12 trillion parameters, about it originally being intended as Orion (not Omni), or about it being an expected successor to GPT-4? And do you have any related reading that speaks to any of this?


GPT-4 was 1.3T. 221B active. 2 experts active. 16 experts total.


That sounds correct except for that total parameter count. 110B per expert at 16 experts puts you just shy of 1.8T. Are you suggesting there are ca. 30B shared params between experts?
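The back-of-envelope arithmetic behind that reply (all of these figures are rumors, nothing confirmed): 221B active across 2 experts implies roughly 110B per expert, and 16 of those lands near 1.8T, not 1.3T.

```python
# Sanity check of the rumored GPT-4 MoE figures in this thread
# (unconfirmed numbers, in billions of parameters).
experts_total = 16
active_experts = 2
active_params_b = 221

per_expert_b = active_params_b / active_experts   # ~110.5B per expert
total_b = experts_total * per_expert_b            # ~1768B if nothing is shared

print(f"per expert: {per_expert_b:.1f}B, total: {total_b / 1000:.2f}T")
# ~1.77T -- "just shy of 1.8T", hard to square with a 1.3T total
# unless a sizable chunk of the active 221B is shared across experts
# (e.g. attention and embeddings).
```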



If the world is smaller today then it means it was bigger at that time. What are you confused about?


Thanks for pointing this out, I misread the parent comment.


We use TSDB and are pretty happy with it.

But it is much less performant than CH.


Glad you like it! And yes I'm not saying Timescale is better than Clickhouse generally. But it does avoid having to host a second database, you keep ACID compliance, your application code doesn't have to change... There is more to analytics than raw speed, and even in raw speed we're slowly catching up. For some types of queries TS actually performs better than Clickhouse afaik, but I'm not a benchmarking expert so take it with a grain of salt.

Always choose the right tool for the right job, Clickhouse is amazing software. I just wanted to mention that if someone currently runs analytics queries via postgres and runs into performance issues, trying out timescale doesn't really hurt and might be a simpler solution than migrating to Clickhouse.


Well, the most important US equity market is in Aurora…


What is this a reference to?



It looks like CME's HQ is near Millennium Park in the center of Chicago, not part of Aurora (CO, IL, OH, IN, MO, NE, NY, or OR).


CME runs their matching engine out of a datacenter in Aurora, IL.


CyrusOne off of I-88.

https://www.cyrusone.com/data-centers/north-america/aurora-i...

(gives 350 E Cermak a run for its money, and can't beat the 'burbs over downtown chi)


Was this written by DeepSeek? Aside from not being in depth, it’s also inaccurate (MoE details and MTP misunderstanding).


IMO the more interesting question is why low-quality stuff like this keeps getting upvoted here. Feels like any submission that has AI in it automatically gets to the front page no matter the quality. Sad state of HN. I just can't imagine that people actually read this stuff and then decide to upvote because they found it useful. It's probably upvoted by people/bots who only read the title.

The whole reason I come to HN in the first place is to filter out BS clickbait articles exactly like this one, not to have them fill the front page.


It's certainly AI-generated garbage. But it seems to have slipped from first place to 20th in the time it took to read your comment. If it was ranked up by bots, and say 50 fake accounts, they mistimed the velocity.


You have some options here:

- check out the new section and vote up good articles

- flag bad submissions

- or complain about it


I'm certainly not an extreme HN old timer, but I've been visiting for a fair number of years and I've seen this sort of complaint since I started, while article quality doesn't seem to have gone down noticeably. In fact, the site rules even caution against complaining that HN is "becoming Reddit", which is essentially the old version of this comment. The fact is that, even here, there will always be a few poor quality articles that slip through.

BTW, pointing out that a particular article is poor, like qeternity's comment, is worthwhile. It's just comments that complain all of HN is going downhill that are tiresome.


Article quality has IMO gone down considerably in the last 2-3 years ever since LLMs became a thing. Probably not because LLM articles are upvoted by humans, but more likely because it's much easier to create and manage realistic bot fake accounts with LLMs.

We're at a point where it's impossible to tell which users are bots and which are human by looking at their comments.


I’m an old-timer so I’ve seen multiple cycles of the front page being dominated by a PR blitz. Sometimes it’s startup/money-driven (e.g. mobile applications via smartphone adoption), sometimes it’s a community that organizes elsewhere to promote something to the HN readership in a disciplined way (e.g. Rust), sometimes it’s both (e.g. crypto).

What feels different about this one is that it seems very “top down”, it has the flavor of almost lossless transmission of PR/fundraise diktat from VC/frontier vendor exec/institutional NVIDIA-long fund to militant AGI-next-year-ism at the workaday HN commenter level.

Maybe the powers that be genuinely know something the rest of us don’t, maybe they’re just pot committed (consistent with public evidence), I’m not sure. It’s been kind of a while since the GPT3 -> GPT4 discontinuous event that looked like the first sample from an exponential capability curve. Since then it’s been like, it can use a mouse now. Well, it can kinda use a mouse now. Hey that sounds a lot like the robot in Her.

But whatever the reason, this one is for all the marbles.


I'm a bit curious to know what was inaccurate in it.


How about you go study first, instead of just trying to crank out AI generated slop. Absolutely no point in helping you correct your articles when they should instead be left obviously bad, so that they can be flagged.


If they are the same underlying model, it’s unlikely the prices will be different on a per token basis. The high model will simply consume more tokens.


You're right but then in that mode it's no longer cheap.
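To make the point above concrete (prices and token counts here are made up for illustration): with one underlying model, the per-token rate stays fixed, and the "high" setting simply emits more reasoning tokens, so the bill scales with usage.

```python
# Illustrative cost comparison under a single per-token rate
# (hypothetical numbers, not any vendor's actual pricing).
price_per_m_tokens = 2.00   # hypothetical $/1M output tokens

low_tokens = 1_000          # hypothetical output at low effort
high_tokens = 10_000        # same query at high effort: ~10x the tokens

low_cost = low_tokens / 1e6 * price_per_m_tokens
high_cost = high_tokens / 1e6 * price_per_m_tokens
print(f"low: ${low_cost:.4f}, high: ${high_cost:.4f}")
# Same rate per token, ~10x the total cost at high effort.
```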

