An H100 has a TDP of 700 watts (for the SXM5 version). With a die size of 814 mm^2 that's 0.86 W/mm^2. If the Cerebras chip has the same power density, that means a Cerebras TDP of 39.8 kW.
That's a lot. Let's say you cover the whole die area of the chip with water 1 cm deep. How long would it take to boil the water starting from room temperature (20 degrees C)?
amount of water = (die area of 46225 mm^2) * (1 cm deep) * (density of water) = 462 grams
energy needed = (specific heat of water) * (80 kelvin difference) * (462 grams) = 154 kJ
time = 154 kJ / 39.8 kW = 3.9 seconds
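For anyone who wants to poke at the numbers, here's the same back-of-envelope calculation as a small Python sketch (all inputs are the assumed values above):

    # Back-of-envelope check of the figures above, using the assumed values.
    H100_TDP_W   = 700.0      # SXM5 TDP
    H100_DIE_MM2 = 814.0
    WSE_DIE_MM2  = 46225.0    # Cerebras wafer-scale die area

    power_density_w_mm2 = H100_TDP_W / H100_DIE_MM2    # ~0.86 W/mm^2
    wse_power_w = power_density_w_mm2 * WSE_DIE_MM2    # ~39.8 kW at the same density

    water_g = WSE_DIE_MM2 * 10.0 / 1000.0              # 1 cm deep at 1 g/cm^3 -> ~462 g
    heat_j  = 4.186 * 80.0 * water_g                   # specific heat * delta T * mass -> ~155 kJ
    seconds = heat_j / wse_power_w                     # ~3.9 s to reach 100 C

    print(f"{wse_power_w / 1000:.1f} kW, {water_g:.0f} g, {heat_j / 1000:.0f} kJ, {seconds:.1f} s")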
This thing will boil (!) a centimeter of water in 4 seconds. A typical consumer water-cooler radiator only brings the coolant to within 10-15 C of ambient, and wouldn't like it (I presume) if you passed in boiling water. To use water cooling you'd need some extreme flow rate and a big rack of radiators, right? I don't really know. I'm not even sure if that would work. How do you cool a chip at this power density?
The enthalpy of vaporization of water (at standard pressure) is listed by Wikipedia[1] as 2.257 kJ/g, so boiling 462 grams would require an additional 1.04 MJ, adding 26 seconds. Cerebras claims a "peak sustained system power of 23kW" for the CS-3 16 Rack Unit system[2], so clearly the power density is lower than for an H100.
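Extending the sketch above with that vaporization term (same assumed 39.8 kW):

    # Energy to then boil away the ~462 g, using the cited 2.257 kJ/g.
    vaporization_j = 2.257e3 * 462.25      # ~1.04 MJ
    extra_seconds  = vaporization_j / 39.8e3
    print(f"{vaporization_j / 1e6:.2f} MJ, about {extra_seconds:.0f} more seconds")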
On a tangent: has anyone built an active cooling system which operates in a partial vacuum? At half atmospheric pressure, water boils at around 80 C, which I believe is roughly the operating temperature for a hard-working chip. You could pump water onto the chip, have it vapourise, taking away all that heat, then take the vapour away and condense it at the fan end.
This is how heat pipes work, I believe, but heat pipes aren't pumped; they rely entirely on heat-driven flow. I would have thought there were pumped heat pipes. Are they called something else?
It's also not a refrigerator, because those use a pump to pressurise the coolant in its gas phase, whereas here you would only be pumping the water.
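As a rough check on the "around 80 C at half atmospheric pressure" figure above, the Antoine vapor-pressure equation for water puts the boiling point at 0.5 atm near 82 C. A quick sketch (the constants are the commonly tabulated ones for roughly 1-100 C, pressure in mmHg):

    from math import log10

    # Antoine constants for water, valid roughly 1-100 C, pressure in mmHg.
    A, B, C = 8.07131, 1730.63, 233.426

    def boiling_point_c(pressure_mmhg: float) -> float:
        """Temperature at which water's vapor pressure equals the given pressure."""
        return B / (A - log10(pressure_mmhg)) - C

    print(boiling_point_c(760.0))   # ~100 C at 1 atm
    print(boiling_point_c(380.0))   # ~82 C at 0.5 atm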
No need to bother with a partial vacuum when ethanol boils at around 80 C as well and doesn't destroy electronics. I'm not aware of any active cooling systems utilizing this though.
I could argue that ethanol has 1/3 the latent heat of vaporization of water, and would boil off 3 times quicker. However, what ultimately matters is the rate of heat transfer, so my nitpick may be irrelevant.
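For reference, plugging in commonly cited figures (these numbers are my own assumptions, not from the thread): water ~2.26 kJ/g, ethanol ~0.84 kJ/g at ~0.79 g/mL, so per unit volume of coolant the gap is indeed roughly 3x:

    # Approximate latent heats of vaporization (commonly cited values).
    water_j_per_g   = 2257.0
    ethanol_j_per_g = 841.0
    ethanol_density = 0.789   # g/mL, vs ~1.0 g/mL for water

    print(water_j_per_g / ethanol_j_per_g)                       # ~2.7x per gram
    print(water_j_per_g / (ethanol_j_per_g * ethanol_density))   # ~3.4x per mL of coolant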
> This is how heat pipes work, I believe, but heat pipes aren't pumped; they rely entirely on heat-driven flow. I would have thought there were pumped heat pipes.
Do you have a particular benefit in mind that a pump would help with?
The machine that actually holds one of their wafers is almost as impressive as the chip itself. Tons of water cooling channels and other interesting hardware for cooling.
If you let the chip actually boil enough water to run a turbine, you're going to have a hard time keeping the magic smoke inside. Much better to run at reasonable temps and try to recover energy from the waste heat.
That's basically the principle of binary cycle[1] generators. However, for data center waste heat recovery, I'd think you'd want to use a more stable fluid for cooling, and then pump it to a separate closed-loop binary-cycle generator. There's no reason to make your datacenter cooling system also deal with high-pressure fluids, or to move high-pressure working fluid from thousands of chips to a turbine of sufficient size, etc.
There's a bunch of places in Europe that use waste heat from datacenters in district heating systems. Same thing with waste heat from various industrial processes. It's relatively common practice.
If my very stale physics is accurate then even with perfect thermodynamic efficiency you would only recover about a third of the energy that you put into the chips.
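The ceiling here is the Carnot efficiency, 1 - T_cold/T_hot (temperatures in kelvin). A quick sketch with hypothetical temperatures (neither figure comes from the thread): recovering roughly a third needs a hot side around 450 K, while a chip-like ~80 C hot side gives much less:

    # Carnot limit on recovering work from waste heat.
    def carnot_efficiency(t_hot_k: float, t_cold_k: float = 293.0) -> float:
        return 1.0 - t_cold_k / t_hot_k

    print(carnot_efficiency(450.0))   # ~0.35, roughly "a third", but needs ~177 C heat
    print(carnot_efficiency(353.0))   # ~0.17 at a more chip-like ~80 C hot side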
I'm an American living in Germany. When I first arrived, the way Germans write the digit 1 surprised me. They write it with the upper hook thing very long, almost like a capital lambda (Λ), which sometimes makes 1 and A visually ambiguous. This isn't really a problem, just something funny about moving to a new country.
I use 1 with a long hook except when I write binary numbers where I use just a | for 1.
I have some other context dependent characters/letters.
I write small z like that in normal writing, but as a mathematical variable I write it as ƶ. (To disambiguate from 2.)
I write small t like † in normal writing, but as a mathematical variable I write it as t. (To disambiguate from + (plus).)
I write q like that in normal writing, but as a mathematical variable I write it with a stroke (a ꝗ, which does not display properly on the iPhone), a bit similar to a ɋ. (To disambiguate from a (ɑ).)
It’s all about disambiguation, and sometimes having different letter shapes for isolated characters.
Possibly not a typo. It's a long-running joke based on the NFL's extremely aggressive defense of their trademark, threatening anyone who uses the phrase Super Bowl to advertise anything. That's why euphemisms like "The Big Game" show up a lot.
I clicked their link to Knative so I could read about it. On the Knative webpage, the cookie banner pops up. I don't want to accept, so I click "learn more". That expands the cookie banner to include a button that says "I understand how to opt out, hide this notice" as well as a link to a lengthy explanation of cookies. Well, I don't want to click the button, so I click the link to the cookie explanation. At the bottom of that page are more links to browser documentation that might eventually explain how to opt out, but I can't click those links because everything on the page is disabled: the same cookie popup is on this page too. It blocks interaction, including clicking on their links about how to opt out, until you opt in. This stuff is getting worse.
There's a recent Tom Scott video about an office inside an elevator. It has a desk and chairs, the whole thing. It's on the corner of the building too. It seems, from his video, that nobody is really sure why it was built and what it was used for.
Well, it is clear from the video what it was used for: nothing. The boss who had it built because of complex WWII history never actually got to use it. What the vision for it was, though, is unclear.
My question here is about underlying fab capacity. This chip is made on TSMC 4N, along with the H100 and 40xx series consumer GPUs. I assume Nvidia has purchased their entire production capacity. I also assume that Nvidia is using that capacity to produce the products with the highest margins, which probably means the H100 and this new GH200. So when they release this new chip, does it mean effectively fewer H100s and 4090s? Or is that not how fabrication capacity works?
I'm asking because whenever I look at ML training in the cloud, I never see any availability, either for this architecture or the A100s. AWS and GCP have quotas set to 0, Lambda Labs is usually sold out, Paperspace has no capacity, etc. What we need isn't faster or bigger GPUs, it's _more_ GPUs.
It sounds to me like the GH200 achieves more FLOPS per transistor. So compute demand will be satisfied more quickly via the GH200 than via "smaller" chips such as the H100.
Having said that, I don't think we're anywhere near some kind of equilibrium for AI compute. If chip supply magically doubled tomorrow, the large companies would buy it all for their datacenters and hit 100% utilization within a few weeks. They all want to train larger models and scale inference to more users.
In addition to training larger models, I'm sure there are many use cases that AI could serve that are currently cost prohibitive due to the cost of running inference.
I'd like bigger GPUs. A trillion-parameter model at 16 bits needs 2,000 GB+ for inference, more for training. All kinds of things can be done to spread it across multiple GPUs, quantize to fewer bits, etc., but it's a lot easier to just shove a model onto one GPU.
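The arithmetic behind that, as a quick sketch (the 16 bytes/parameter training figure is a commonly used rule of thumb for mixed-precision Adam, an assumption on my part rather than something stated above):

    # Rough memory footprint for a 1-trillion-parameter model.
    params = 1e12
    fp16_bytes_per_param = 2

    weights_gb = params * fp16_bytes_per_param / 1e9
    print(weights_gb)          # ~2,000 GB just to hold the weights for inference

    # Training with mixed-precision Adam is often estimated at ~16 bytes/param
    # (fp16 weights + gradients + fp32 master copy + optimizer moments).
    print(params * 16 / 1e9)   # ~16,000 GB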
We'll likely see more efficiency from bigger GPUs and hopefully more availability as a result.
My question on the very slow growth of available memory: are there technical reasons they cannot trivially build a card with 100GB of RAM (even with lower performance) or has it been a business decision to milk the market for every penny?
High speed I/O pins cost a lot, and GDDR generally has 32 data pins per chip and no way to attach multiple chips to the same pins. So 256 bits and 16GB is hard to exceed by much on that tech. The high end is 384 bits and 24GB.
There is a mode to attach 16 data pins to each GDDR chip, so with some extra effort you could probably double that to 48GB. Or at least 32GB. Maybe this is a valid niche, or maybe there isn't enough demand.
The alternative to this is HBM, which can stack up big amounts, but it's a lot more expensive.
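Put as arithmetic (assuming 2 GB per GDDR6/6X device, which is my assumption rather than something stated above):

    # GDDR capacity math as described above.
    def max_capacity_gb(bus_width_bits: int, pins_per_chip: int, gb_per_chip: int = 2) -> int:
        chips = bus_width_bits // pins_per_chip
        return chips * gb_per_chip

    print(max_capacity_gb(256, 32))   # 16 GB: 8 chips in the normal x32 mode
    print(max_capacity_gb(384, 32))   # 24 GB: 12 chips, the current high end
    print(max_capacity_gb(384, 16))   # 48 GB: 24 chips in the x16 "clamshell" mode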
I don't disagree with Dylan, but I'm more than willing to bet that the only reason Nvidia's cards (and that's who we're talking about; CUDA is a hell of a moat) are RAM-starved is that they haven't felt the pressure to do otherwise. AMD has an institutional aversion to good software. Intel isn't even an also-ran yet.
Apple and their unified memory architecture may be the prod needed to get larger amounts of RAM available in single-card solutions. We'll see.
Fabs can run multiple complex designs on the same line simultaneously by sharing common tools. For example, photolithography tools can have their reticles swapped out automatically. Obviously, there is a cost to the context switching and most designs cannot be run on the same line as others.
Ultimately, the smallest unit of fabrication capacity is probably best measured along the grain of the lot/FOUP (<100 wafers).
The basics of supply chains and of supply and demand, as you all should have witnessed during COVID with toilet rolls, are the same here.
Fab capacity is not that different from any other manufacturing. You just need to book that capacity way ahead of time (6-9 months). That is also why I said 99% of news or rumours about TSMC capacity are pure BS.
So to answer your question: yes, Nvidia will likely go for the higher-margin products. That is one of the reasons you see Nvidia working with Samsung and Intel.
It's my understanding from friends in the business that the actual chips do not represent any capacity issue or bottleneck, it's actually manufacturing the devices that the chips are in (e.g. the finished graphics card).
Why would this be the case? I would naively think that since the chips can only be made in a fab, and the rest can be made basically anywhere, it would be the other way around.
They cannot be made "anywhere"; when you can't get that PMIC from the original manufacturer, good luck getting it from someone else. And replacing an IC in a QA-tested, EMC-verified, FCC- and CE-certified (etc.) device will often trigger redoing all of that, possibly requiring additional iterations. If there is a similar part available at all.
Take a look at a recent GPU and count the auxiliary components. All of them can cause supply chain difficulties.
For example, my company hit manufacturing issues (production capacity) with flash memory, clock oscillators, and auxiliary FPGAs. But production of the main chips was fine the whole time during the chip crisis, as far as I know. So yeah, small critical components totally can be a blocker. Some specific voltage controller is unavailable and suddenly your whole design is paralyzed.
I think that's it. The PCB itself is rather trivial; it's the RAM, but also things like switching regulators (there are alternatives, but then it's a redesign), maybe even stuff like connectors (which don't burn...).
For a science project, we need to manufacture magnets. It's not easy to find a company who has the right iron right now, and it's hard to get, with long lead times. The supply crisis is real.
You know, I was wondering this the other day when NVDA's insane run-up happened. I went down the road of trying to figure out if there were even enough silicon wafers, or if there even would be enough wafers in the next five years, to justify that price.
Unless all the planet does is make silicon wafers, no.
Well, you figured wrong. NVDA's AI GPUs are a very small % of global foundry supply, and even if volume tripled, they would still be a small % of global foundry supply. NVDA's revenue is high because their gross margins are extreme, not because their volume is high.
Can you go into more detail? So you're saying that at a 200 P/E ratio there isn't even enough wafer supply for NVDA to grow into that valuation, even over 5 years?
I mean, you've got the gist of it. I pulled some reports on silicon production, silicon wafer prices and price trends, current fab capacity, etc.
My back-of-the-napkin math basically suggested that silicon production would need to 4x and fab capacity would need to 4x (neither of which is happening), and that NVDA would have to capture all of that to justify their current price. I didn't bother writing it up; I just looked at it mostly because I was on the wrong side of that play. It's something worth considering for sure.
Wouldn't NVDA just focus more on high-margin datacenter products in order to grow into those higher earnings despite the wafer limitation? Datacenter-focused products are already starting to surpass gaming, which is their second-largest revenue source: http://www.nextplatform.com/wp-content/uploads/2022/05/nvidi...
It seems to me that, while a 200 P/E may be high, they certainly could keep increasing prices on the already high-margin datacenter products, which get quickly gobbled up by companies no matter the price because of the immense demand.
We're probably ~3 years out from all of those government-funded fabs coming online, right?
(n.b. that's really good work on your end and I agree with your conclusion; I'm just idly musing about the thing that bugs me: what the heck are all these non-leading-edge fabs going to do?)
See also the International Obfuscated C Code Contest. [0]
This program [1], for example. It just accepts some input on stdin and returns the same input, mirrored along the diagonal: the first row of the input becomes the first column of the output, the second row becomes the second column, and so on. But the program is also invariant under its own transformation: flip the source code along its diagonal and the result is a program with the same functionality: it transposes stdin.
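For flavor, here's what the program's observable behavior amounts to, written plainly in Python (this is just the transpose, not the contest entry's trick of surviving its own transposition):

    import sys

    # Transpose stdin along its diagonal: row i of the input becomes column i of the output.
    lines = [line.rstrip("\n") for line in sys.stdin]
    width = max((len(line) for line in lines), default=0)
    padded = [line.ljust(width) for line in lines]

    for column in zip(*padded):
        print("".join(column).rstrip())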
Or this one [2], which parodies Java in C. It's functional C code that looks like Java, including the
class LameJavaApp
{
/** The infamous Long-Winded Signature From Hell. */
public static void main(String[] args)
...
Or this one [3] that calculates pi by estimating its own area.
Or this one [4]. It's a lovers quarrel, written simultaneously in C and English. It's incredible, seriously, read it.
He shoots the Prince Rupert's drop with a bullet. The bullet wins. In the high-speed footage, you can see the bullet splat against the glass and break into shards long before the glass breaks. Really neat.
I've read about this. Studying the psychology of group dynamics and conflict in a confined environment was a part of the experiment. Sure enough, the crew of 8 quickly split into two factions. It still seems like a success to me though, because even though some of them were barely on speaking terms despite being friends before, they continued to work together.
> I don't like some of them, but we were a hell of a team. That was the nature of the factionalism... but despite that, we ran the damn thing and we cooperated totally.