Water used is a terrible metric. Water is not consumed in the process; many cooling systems are closed loop and even open loop systems just release water into the environment where it will be renewed.
What matters is strain on water infrastructure, but that is wildly variable and can't be compared apples to apples. Like a water cooled data center drawing from an aquifer is a massively different beast than one drawing from a river. Likewise a data center with a dedicated tap is much different than one drawing from the standard municipal system.
If datacenters are set up in places that can not support them, that's on the operators of the datacenters themselves, not the users of the compute power. And if datacenters are set up in locations where the resources they need are present but they create an excessive burden on the infrastructure upkeep, that's on the municipality for not appropriately charging them for their usage and/or approving plans without checking their capabilities.
I think the comparison that should be made here is the actual alternative - what is the cost in kWh of me sitting in front of my powered-on and not-sleeping laptop while I spend the next 10 minutes penning an email?
If the machine can make that happen in a second, and it takes me 600x longer to do that while consuming power that entire time, I suspect that the AI carries an overall advantage when you actually include the full story.
I'll file this under "lazy tech journalists didn't bother to do the math past the first, most obvious step".
That seems to get a bit philosophical. Are you not gonna use your laptop ~8 hours a day thanks to generative AI? If you still do, that part is constant.
Are you gonna get less done in those 8 hours? Probably. Is it good for the environment to get more done in less time? Can't really answer that one generically. We humans do a lot of stuff simply because we can.
If unemployment goes up 20% from generative AI (just a made up number for narrative purposes), are those 20% not gonna sit in front of a screen all day? Are they perhaps even going to play games that max out their machines instead?
Of course another way to see all this is to argue that we need 20% less humans now, so we save any environmental costs those people induce. Seems a bit gruesome to me.
There’s an assumption here that Gen AI makes everyone more productive.
I find it’s quite the opposite for me. The time saved on generating a response is spent making sure there are no mistakes, misrepresentations, or plain BS. In code, it’s making sure it didn’t generate subtle security errors and the like. I find it’s better at wasting my time than anything.
And then having to read my GenAI-using colleagues’ “work.” What they could have explained themselves in a few minutes becomes a wall of text they didn’t bother to review, let alone write. Waste of time.
We ought to consider that this isn’t the revolutionary tool that’s going to replace workers. That’s capitalism talking and it has a problem on its hands: shareholders want profits for all the money they’ve dumped into these ventures and API tokens aren’t covering their costs right now. Of course they want us to believe it makes everyone instantly more productive since that would sell more API tokens.
… they’ll just deal with how to actually make money with this and hide or justify the environmental damage later.
All of your complaints about generative AI will be fixed with more engineering. I suggest comparing the state of generative AI to the state of automobiles, radios, and rockets now versus a decade after they hit the public consciousness. Even centuries-old civil engineering has its own generative-AI-style failure moments, like the Tacoma Narrows bridge.
All of these fields and many more solve their problems through engineering. The same is true for AI, and the pressure put on them for energy/water consumption will drive that engineering.
Your writing reminds me: what on earth had us so distracted 20 years ago? Because even before Google (if that's a fair starting point, ~20 years back) we'd already heard plenty of this, hadn't we? ...OT... but what was that about the caesium and uranium radiation levels measured the other day, somewhere in Norway? ...really off-topic... (feel free to downvote until deleted)^^
>Of course another way to see all this is to argue that we need 20% less humans now, so we save any environmental costs those people induce. Seems a bit gruesome to me.
Seems a bit of a straw man to me. Perhaps there's some light between ecosystem literacy ("carrying capacity is real") and mass murder? :-\
A third way (which differs only in timing) is thinking we can support 100% of the people who exist right now, but acknowledge that if you iteratively "add 20% more people" enough times eventually we will exceed the environment's capacity.
If you think any solution that might be proposed to that is "gruesome" (AKA the usual anti-population-modulation trope), just look at the cruelties people inflict on each other when humans blindly exceed the environment's capacity.
Though I suppose such "law of the jungle" ad hoc population control could be preferred by many, since it affects mostly the poor and powerless, whereas an intentional population policy would affect all people equally.
Yes, sorry, that was a strawman, I didn't quite realise it. On that point, I _do_ believe that there are too many humans. I'm certainly not in favour of taking any drastic measures against that, but in the long run, I think we'd all be better off if there were fewer of us. Seems like we have a dynamic right now that will see to that over time: demographic transition.
But my main point - which I didn't make very eloquently - is that we humans do have a tendency to keep ourselves busy no matter what, sometimes doing things that are objectively pointless, sometimes doing things to the detriment of our species. I think the majority of what I've spent my life building falls into one of those two categories, unfortunately. Most applications of generative AI I've seen certainly do. And even if we do things that are objectively beneficial, we still spend resources doing so.
I think it's cool that there's research into the magnitude of that. If you want to save money, you start by understanding what you spend your money on, to use an analogy here.
It would have been nice if the article did the comparison for us, but the premise of the article seems correct. According to the article, ChatGPT uses 140 watt-hours to compose a short email. That's considerably more than the battery capacity of many laptops, so even if you spent all day staring at your laptop trying to write the email, you'd probably use less energy.
Now there's certainly a time advantage to using AI, but LLM based AI is pretty inefficient from an energy perspective.
The biggest problem there is that the laptop is pretty darn efficient, while the multiple GPUs and servers running at 100% to serve that query are not. The amount of math it takes to get an LLM to do anything useful is staggering compared to registering your keystrokes and updating the screen, even if that all happens in a bloated JavaScript code base. A laptop that spikes to maybe 40 watts while you type in a text editor can run for a long time on the energy of a single inference pass, and that's before you amortize the cost of training and of collecting the training data over all queries.
Per the article, the energy used by AI to write a short email (140 watt hours) is roughly equivalent to running a higher end desktop CPU flat out for an hour. That really puts the amount of computation going on behind the scenes of LLMs in perspective.
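A back-of-envelope in Python, taking the article's 140 Wh at face value; the laptop and desktop wattages below are assumptions, not figures from the article:

    # Back-of-envelope: article's 140 Wh per AI-written email vs. local hardware.
    # The laptop/desktop wattages are assumptions, not figures from the article.

    AI_EMAIL_WH = 140          # article's claim: energy per 100-word email via GPT-4

    LAPTOP_TYPING_W = 30       # assumed: laptop power while typing in an editor
    DESKTOP_CPU_W = 140        # assumed: high-end desktop CPU package power, flat out

    laptop_hours = AI_EMAIL_WH / LAPTOP_TYPING_W   # ~4.7 h of laptop typing
    desktop_hours = AI_EMAIL_WH / DESKTOP_CPU_W    # ~1.0 h of desktop CPU at full load

    print(f"140 Wh ~ {laptop_hours:.1f} h of laptop typing at {LAPTOP_TYPING_W} W")
    print(f"140 Wh ~ {desktop_hours:.1f} h of a {DESKTOP_CPU_W} W desktop CPU flat out")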
Does anyone else have a hard time accepting these calculations? I don’t doubt the serious environmental costs of AI but some of the claims in this infographic seem far-fetched. Inference costs should be much lower than training costs. And, if a 100-word email with GPT-4 requires 0.14 kWh of energy, power AI users and developers must be consuming 100x as much. Also, what about running models like Llama-3 locally? Would love to see someone with more expertise either debunk or confirm the troubling claims in this article. It feels like someone accidentally shifted a decimal point over a few places to the right.
If I run some simple inference locally on a 4090 (a 450 W TDP card) it takes on the order of seconds and that sucker's going full blast, so you're looking at order of 1 kJ, which is significantly higher than what is quoted in the article.
Article numbers line up better with CPU inference for ~1s.
1 kJ is for reference enough to heat 1 L (33 oz) of water by ~0.25C (~0.5F). The machine will probably heat up a few degrees if you run inference once, but since it's essentially one big heatsink it will dissipate throughout the body and into the air. The problem begins when you run it continuously, as you would in a datacenter.
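A sketch of that arithmetic in Python (the 2-second runtime is an assumed stand-in for "order of seconds"):

    # Rough energy for one local inference on an RTX 4090 and the equivalent water heating.
    # The 2-second runtime is an assumption.

    GPU_POWER_W = 450            # RTX 4090 board power at full load
    RUNTIME_S = 2                # assumed inference time

    energy_j = GPU_POWER_W * RUNTIME_S          # ~900 J, i.e. order of 1 kJ

    WATER_MASS_KG = 1.0          # 1 L of water
    SPECIFIC_HEAT = 4184         # J per kg per K

    delta_t_c = energy_j / (WATER_MASS_KG * SPECIFIC_HEAT)   # ~0.2 C
    print(f"{energy_j} J heats 1 L of water by about {delta_t_c:.2f} C")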
IIRC the M-series chips aren't specifically optimized for ML workloads; their biggest gain is unified CPU/GPU memory, since transferring layers between the two is a big bottleneck on non-Apple systems.
Real ML hardware (like Nvidia's H100s) that can handle the kind of inference traffic you see in production gets hot and uses quite a bit of energy, especially when it runs at full blast 24/7.
Google’s TPU energy usage is a well-kept secret / competitive advantage. If energy efficiency isn’t a major concern for them, I bet it will be in a couple years.
I'm not sure how comparable o1 is in total usage. Remember that people will either adjust the prompt or continue the conversation as needed. If o1 spends more time on the answer, but responds in fewer steps, it may be a net positive on energy use. Also it may skip the planning and self-reflection steps in agent usage completely. It's going to be hard to estimate the real change in usage.
> Even though hydropower water withdrawal and consumption intensities are usually orders of magnitude larger than other types and likely to skew overall regional averages, it is important to include hydropower in the factors to show not only the power sector’s dependency on water but also its vulnerability to water shortages.
Some people have pointed out that using water for cooling does not destroy it - it'll all rain back down. I think it would've still been fair to consider how much processed/drinking water was being evaporated, since it'd need processing again, but I can't really see the justification for the article's framing when the figure is measuring water that would've just flowed into the sea had the hydroelectric dam not been there.
In all these calculations you have to wonder what "using water" even means. Add to that that most water-cooled systems are closed loops, i.e. no water escapes.
The calculations paper talks about the water cycle and explains that fresh clean water isn't well distributed. I wonder whether big evaporators like data centers might actually help to redistribute water across the planet.
Is this true in a marginal cost sense? I was under the impression most of the environmental impact occurred during the training stage, and that it was significantly less costly post training?
You could argue that this is no longer the case once the model is done; the cost per request will go down over time, as the set amount of power and coolant pumped through data centres gets divided over more people.
However, AI companies can't afford to stand still. They have to keep training or they risk being made irrelevant by whatever AI company comes next.
Furthermore, a not-insignificant amount of energy and cooling is being used for generating responses as well. It's plainly obvious when you run even the very modest AI models at home how much power these things take.
The paper[1] mentions the statistics used to calculate these numbers. It has a separate column for inference, with numbers ranging from 10mL to 50mL of water per inference depending on the data centre sampled.
The numbers seem bad, but the authors also call out that more transparency is needed. With all the bad rep out there from independent estimations and no AI companies giving detailed environmental impact data, I have to assume the real cost is worse than estimated, or companies would've tried to greenwash themselves already.
> It's plainly obvious when you run even the very modest AI models at home how much power these things take.
Really good point to put this into perspective. I tried models locally and my GPU was running red hot. Granted, I think server cards like the H100 are more optimized for AI workloads, so they run more efficiently than consumer GPUs, but I don't believe they're more than an order of magnitude more efficient.
Another corollary is that AI companies don’t train one model at a time. Typical engineers will have maybe 5-10 models training at once. Large hyperparameter grid searches might have hundreds or thousands. Most of these will turn out to be duds. Only one model gets released, and that one’s energy efficiency is what’s reported.
Llama 405B takes OOM a kilowatt-minute to respond on our local GPU server, or about 10 grams of CO2 per email. Last I checked, add another 20 grams of amortized manufacturing emissions. A typical commute is OOM 5-10 kg of CO2.
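Rough version of that arithmetic in Python (the grid carbon intensity and commute figures are assumptions, not measurements):

    # Sanity check: a kilowatt-minute of GPU time vs. a car commute, in grams of CO2.
    # Grid intensity and commute figures are assumptions.

    KWH_PER_KW_MINUTE = 1 / 60          # 1 kW for 1 minute
    GRID_G_CO2_PER_KWH = 600            # assumed grid carbon intensity

    inference_g = KWH_PER_KW_MINUTE * GRID_G_CO2_PER_KWH   # ~10 g CO2 per response

    COMMUTE_KM = 30                     # assumed round-trip commute
    CAR_G_CO2_PER_KM = 200              # assumed petrol-car emissions

    commute_g = COMMUTE_KM * CAR_G_CO2_PER_KM               # ~6,000 g CO2

    print(f"One response: ~{inference_g:.0f} g CO2; one commute: ~{commute_g/1000:.0f} kg CO2")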
this article is alarmist bullshit. (for entirely unrelated reasons openai delenda est)
A thousand times? I’d have a hard time typing out that many queries in 8 hours. Even 100 seems like a stretch for someone who uses it within something like cursor.
More and more environments offer LLM aid without having you explicitly typing in a query. E.g. trigger inference whenever static analysis fails (e.g. on a compile error). Or trigger an LLM aided auto-complete with Ctrl-Space. I don't think it'll be particularly unusual to reach 1000 queries in a working day that way.
Coding models these days use an inference every time you stop typing. Let's say it's 0.1 inferences per keystroke. If you keep VSCode open all day, I could believe it's a significant energy draw.
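A made-up but plausible back-of-envelope for the daily count (every number below is an assumption except the 0.1 above):

    # Hypothetical estimate of inferences per working day from editor autocomplete alone.
    # Typing rate and active hours are assumptions for illustration.

    KEYSTROKES_PER_MIN = 150        # assumed typing rate while actively coding
    ACTIVE_MINUTES = 240            # assumed 4 h of active typing in an 8 h day
    INFERENCES_PER_KEYSTROKE = 0.1  # figure from the comment above

    inferences = KEYSTROKES_PER_MIN * ACTIVE_MINUTES * INFERENCES_PER_KEYSTROKE
    print(f"~{inferences:.0f} autocomplete inferences per day")   # ~3,600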
Google now uses several inferences per Google search.
The average user’s #inferences-per-day is going to skyrocket.
My point is that it’s understandable to consider AI a significant contributor to the average professional’s energy budget. It’s not an insult to point this out.
With reports like this discussing training cost, is the usage reported per-model? Or is it aggregated across the entire creative model development/exploration process?
At the risk of doing original research, one thing I don’t see a lot of discussion on is that AI companies don’t train one model at a time. Typical engineers will have maybe 5-10 mid-size models training at once. Large automated hyperparameter grid searches might need ensembles of hundreds or thousands of training runs to compare loss curves etc... Most of these will turn out to be duds of course. Only one model gets released, and that one’s energy efficiency is (presumably) what’s reported.
So we might have to multiply the training numbers by the number of employees doing active research, times the number of models they like to keep in flight at any given time.
Very high. But since they provide actual utility as opposed to the overwhelming majority of the current usecases for generative language models, it doesn’t seem relevant here.
I lived in the tropics where it was hotter than the USA and we survived just fine without A/C.
A/C is a luxury, and your luxury is destroying the planet. I don't want you to stop using A/C, but I would like people to accept that it is a luxury that is not needed in most cases.
Questionable in many cases. A bunch of stores have their A/C on full blast while keeping the door open in the summer. You can actually get fined for this in NYC now but in other US cities it’s still the norm.
That’s the tricky thing about conservation. These problems have multiple sources!
Every time a completely new energy-hungry product category is introduced, the carbon budget gets harder and harder to balance. In the 19X0s it was television and air conditioning, in the 2010s it was Bitcoin maybe, and now we’re adding AI. In the 2X00s, teleportation will double the typical commute energy usage, until heavy regulation and scientific advances will bring it down again.
And what about all of the lights I can see being used in the average American house at night? Shouldn't we address the environmental costs of that first?
I don't think a single 100-word email using GPT-4 alone can cost 140 Wh, even considering how expensive GPT-4 is to run.
If it takes 20 seconds for the model to compose the email, that means (3600/20)*140 = 25,200 W: a 25,200 W piece of hardware dedicated to composing your email and no other request. That seems wrong by several orders of magnitude.
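The implied power, spelled out in Python:

    # Power implied by 140 Wh delivered over a 20-second generation.
    ENERGY_WH = 140
    GENERATION_S = 20

    avg_power_w = ENERGY_WH * 3600 / GENERATION_S   # convert Wh to joules, divide by seconds
    print(f"Implied average draw: {avg_power_w:,.0f} W")   # 25,200 W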
While water use is one interesting point of the actual paper (from which this article is written), I would say that, to a lay person, water volume makes more sense than abstract units like kilojoules or kilowatts, because I can imagine what 20 mL of water looks like, but can't imagine what 20 kJ or 20 kW looks like.
I argue it's impossible to measure the total environmental impact of anything, and it's a waste of time to even try.
Let's break this down. There is water used at inference time and training time. The humans working on the project consume water. The building in which they did the project uses water. The whole supply chain for the computers? Good luck measuring the water usage in there
This may seem pedantic, but I promise there is a point to this.
Measuring environmental impact is like trying to understand a neural network.
If you want to discourage water usage, the only way is to tax the marginal water used at any step in the supply chain. You don't need to know the total water usage of the final step in the supply chain, in this case an LLM.
This makes no sense.
100 words for 0.14 kWh is crazy.
Currently the API costs $15 per 1M output tokens, which is roughly 750k words.
So ~50k words per dollar.
But, according to the Washington Post figures, those 50k words would consume 70 kWh of electricity.
Which in turn would imply that OpenAI needs to pay less than 1.4 ct per kWh of electricity to make a profit. Average industrial electricity cost in the US is ~8 ct/kWh.
-> The numbers are wrong, or OpenAI has solved AGI and is running its own fusion reactor.
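The same arithmetic in Python, using the article's 0.14 kWh per 100 words and the listed API price (the token-to-word ratio is an approximation):

    # If 100 words cost 0.14 kWh, what electricity price would API pricing have to cover?
    # The tokens-per-word ratio is an approximation.

    PRICE_PER_M_OUTPUT_TOKENS = 15.0      # USD, GPT-4-class API pricing
    WORDS_PER_M_TOKENS = 750_000          # ~0.75 words per token

    KWH_PER_100_WORDS = 0.14              # the article's figure

    words_per_dollar = WORDS_PER_M_TOKENS / PRICE_PER_M_OUTPUT_TOKENS   # ~50,000
    kwh_per_dollar = words_per_dollar / 100 * KWH_PER_100_WORDS         # ~70 kWh

    breakeven_usd_per_kwh = 1 / kwh_per_dollar                          # ~$0.014
    print(f"Electricity would have to cost under ${breakeven_usd_per_kwh:.3f}/kWh "
          f"(~1.4 cents) just to cover the energy alone")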
Also the numbers in the paper don't fit the numbers in the article (paper is older though). I actually think this is grossly misleading at best, and malicious misinformation at worst.
The water circulates through a cooling system: it soaks up heat from the servers, goes outside to cool down, then comes back again. You are not "using up" the water. This article is nonsense.
There is a ton of hand-waving in this paper, and they include water from electricity generation (which can easily change depending on the source) and the chip manufacturing (which would have to be amortized across the lifetime usage of the chips, still ongoing).
The only datacenters that would actually consume water are the ones using open-loop evaporative cooling, which is not all of them. Over time, they'll get replaced with more efficient closed loop systems anyway.
Water costs money. Datacenters are highly incentivized to use it as efficiently as possible. If it’s becoming scarce, increase the price and they’ll find ways to be even more efficient.