
Green AI - montalbano
https://arxiv.org/abs/1907.10597
======
gwern
This is ultimately a silly and misguided proposal, which focuses on the costs
and not the benefits. There is nothing special about training an AI; it is
simply another thing to spend resources on, not intrinsically worse or better
than any other way. A biology experiment has a 'carbon footprint'. ITER has a
'carbon footprint' (and probably one orders of magnitude larger than all AI
research this year). HN burns CO2. Everything uses energy, or it uses things
which require energy, like human labor, and they involve other costs, like
opportunity cost, which are just as real and important. They stand or fall on
their net merits, not on how much electricity they use.

There is no need to go around criticizing people for 'green AI' by myopically
focusing solely on an abstract electrical cost of training. And if there is,
then that applies to _everything_ which uses electricity, and is better
handled by putting a carbon tax on energy sources, and letting the market find
the most unprofitable uses of energy (which will probably not be AI research,
I'll tell you that...), stop them, and substitute in more 'green' power
sources for everything else.

More importantly, if you are concerned about the costs of training AI, you
should be concerned about the _total_ costs as compared to the _total_
benefits, not slicing out a completely arbitrary subset of costs and ranting
about how many 'cars' it is equivalent to (which is not even strictly true in
the first place considering that many data centers are located near cheap and
renewable power like hydropower or nuclear power plants!) and shrugging away
the issue that people consider these performance gains important and well
worth paying for. There are costs to a model which is worse than it could be.
There are costs to models which run slower at deployment time even if they are
faster to train. There are costs to models which cannot be used for transfer
learning (as the criticized language models excel at, incidentally). And so
on. What matters are the _total_ costs, and corporations and researchers
already pay considerable attention to that. (Not a single one of their
metrics - 'carbon emission', 'electricity usage', 'elapsed real time',
'number of parameters', 'FPO' - is an actual total cost!)

~~~
dxbydt
> is ultimately a silly and misguided proposal

No it's not.

It's analogous to quantifying an F test with watts.

Nothing wrong with that. I can give a billion-row dataset to a dozen people.
John assumes normality, samples 100 rows, runs a linear regression, gets 70%
R^2, under 1 second on a 1980s-era computer. Mary doesn't assume normality,
runs a GLM via IRWLS, gets better explanatory power than John, still on a
1980s computer, though she takes 5 seconds instead of 1. Then James comes
along & runs a decision tree & is twice as good as Mary, but he now needs a
second on a 1990s PC. Tony uses a random forest & Baker wants a 128-layer
neural net. And so on & so forth, until we end up burning enough energy to
otherwise power a village, overfitting some dataset to excel at a completely
artificial leaderboard metric.

At some point a grown-up comes in & says you don't need to light up your
bedroom with industrial stadium lighting to do your homework; a 40W table
lamp will suffice. Hell, half the third world uses a candle & they are
performing quite well, if not better. That's really all this is.

The marginal gains in performance aren't justified by the amount of energy
you end up expending. Half these models are brittle & have zero shelf life;
ppl regularly throw away stuff they wrote just a year ago. So what is the
raison d'être of this pursuit?

~~~
JimmyAustin
The cure isn't prescribing "green AI", it's adding a carbon tax that embeds
the cost of the carbon externality into the price of power in general, then
letting the free market make the decision.

------
mark_l_watson
I like this idea! In last week’s Lex Fridman AI interview, Gary Marcus
touches on the roughly twenty-watt energy requirement of the human brain
compared to deep learning energy requirements.

Basically adding energy requirements to the loss function of automated neural
model architecture search seems like a good idea also. (I am thinking of
frameworks like AdaNet, etc.)
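A minimal sketch of what such an energy-aware search objective might look like. This is purely illustrative: `search_objective`, `estimated_joules`, and the weight `lam` are hypothetical names, not part of AdaNet's or any framework's API.

```python
# Hypothetical sketch: fold an energy estimate into the scalar objective
# that an architecture search minimizes, penalizing power-hungry candidates.
def search_objective(val_loss, estimated_joules, lam=1e-9):
    """Scalarized objective: task loss plus a weighted energy penalty."""
    return val_loss + lam * estimated_joules

# With lam=1e-9, a candidate with val_loss 0.5 that costs 1 GJ to train
# scores 1.5 -- the same as a nearly-free candidate with val_loss 1.5.
```

Tuning `lam` sets the exchange rate between accuracy and joules; the search then trades them off automatically rather than chasing accuracy alone.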

I retired this year but I still spend a lot of time reviewing deep learning
and also conventional AI literature (and I do tiny consulting projects to help
people get started or build prototypes).

Since I now mostly pay for my own computer resources I try to mostly limit
myself to what I can do on my System 76 laptop with a single 1070 GPU. The
availability of pretrained models makes this not so bad at all. I really
appreciate the efforts of huggingface (and other organizations) in offering
reduced-size models that still provide good results.

~~~
AndrewKemendo
The math is just wrong though.

To train a brain to be competent in a classification task takes 20W RMS for
years on end for the individual, plus all of the wattage from the parents,
grandparents, teachers, etc. who are training the individual over those
years. It is very hard to determine the power allocation used for a specific
human being trained on a narrow task, e.g. object classification, but that
doesn't mean it isn't the comparable measure.

Comparing the training of a single model to the "instant power" draw of the
brain is not just oversimplified; the scales and time periods are wrong.

~~~
Invictus0
8 year olds can accomplish pretty much any classification task. 20W * 3.154E7
seconds in a year * 8 years / 3600 seconds per hour gives about 70 kilowatt-
hours, which would cost about $14 from the grid at $0.20/kwh. Our AI is
nowhere near the intelligence of an 8 year old, who can presumably also do
more things than just decide if a photo depicts a bus or an avocado.

~~~
Tade0
I think your math is off.

8 years at 20W is:

0.02kW * 24h * 365.25 * 8 = 1 402.56kWh
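For anyone wanting to double-check, the same arithmetic as a quick sketch (the $0.20/kWh grid price is carried over from the earlier comment):

```python
# 20 W sustained over 8 years, expressed in kWh and approximate grid cost.
power_kw = 0.02                 # 20 W brain
hours = 24 * 365.25 * 8         # hours in 8 years
energy_kwh = power_kw * hours   # = 1402.56 kWh
cost_usd = energy_kwh * 0.20    # ~= $280 at $0.20/kWh
```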

~~~
Invictus0
Yup, you're correct, I forgot to multiply by the 20W. Still only about 1.5x
the average American household's monthly electricity usage.

------
CoffeePython
I find it strange that the part of the conversation where AI has the potential
to save thousands of hours of human labor doesn’t show up more often in these
types of threads.

That being said, we definitely need to be thinking about lowering the cost
(environmental and monetary) of training these models. I’m glad research is
being done in this domain.

I’d love to see a study comparing the potential savings in human labor
against the environmental training costs for certain large models.

~~~
rahkiin
I find it interesting that you see 'save thousands of hours of human labor' as
something that's surely positive. What should those people start doing
instead? They also need an income to get bread on the table.

~~~
SeanAppleby
Literally anything else.

Whatever makes them happy: petting their dog, having a conversation with
their friends/family, making some music that one person listens to. Literally
anything produces more value for society than performing arbitrary,
unnecessary labor which could be automated at near-zero marginal cost.

We should divert that capital to something more productive, tax the process
and kick them back a basic income to do whatever they want with. If what they
really want to do is what they currently do at their job, whatever, they can
keep doing it while supporting themselves on the basic income, but that'd be
their uncoerced choice.

Most people are unhappy at work and only go because they must in order to
survive. If our society weren't so myopic and steeped in puritan ethics, this
would be as absurd as asking whether the Thirteenth Amendment was going to be
surely positive.

~~~
pacala
Work gives us purpose. There is no substitute for enjoying the fruits of one's
labor. Working on teams anchors us socially. On the flip side, working for
someone else with little agency and decompression time is not very enjoyable.

Between pointless distraction and work, most people will pick the latter,
especially after a solid decompression period, in spite of the surface
enjoyability balance favoring the former.

~~~
jsjolen
There's a lot of work that I want to do that I cannot be paid to do, therefore
those are the things I would be doing with my time instead.

Being paid is not a substitute for enjoying the fruits of one's labour.

~~~
pacala
True. The point is that, at a personal level, leisure is no long-term
substitute for work.

On organizing work. Chances are that at some point you'll want to work on
things beyond the scope of what you can accomplish on your own. Or want to
have a larger impact on the world / other people's lives. The big question is
how to organize the work of people at the societal level. The pay-per-work
model has proven very effective and has given us the marvels of the modern
world.

On the shadow side, the pay-per-work model has become heavily geared towards
exploitation, as in exploration vs. exploitation. We are at a point where
worker conditions in a BigTech warehouse, or as a Mechanical Turk worker, are
meticulously quantified exploitation. At the same time, we find it natural to
expend the energy budget of a small city to enable machines to do large-scale
exploration.

------
ausbah
As the abstract mentions, I think this is also a good criterion for helping
"level the playing field" for groups with lower budgets, so that simply
throwing more computational resources at a problem becomes less of an
advantage.

~~~
remon
Hm, I struggle to see the upside of levelling the playing field that way.
Groups that have the budget to throw huge amounts of resources at problems
are still providing important insights. That can happen in parallel with
optimising wattage per computational unit. In fact, that can be an almost
completely parallel track in AI research.

~~~
currymj
right now such a parallel track doesn't really exist, at least not on an even
footing.

as the article shows, there's still a heavy focus on beating accuracy
benchmarks, at huge computational cost, in terms of what actually gets
accepted at major venues.

it can be quite difficult to make the case for a method that has worse
performance but is cheaper. hard to judge without real data, but I think such
papers are much less likely to be accepted.

there are also sometimes demands to replicate very expensive techniques as
baselines, which can be onerous for groups with limited resources.

------
css
The citation for the "surprisingly large carbon footprint" [0] is crazy. The
paper alleges that a car, including fuel, has a lifetime footprint of 126,000
lbs of CO₂e, while training a big NN transformer emits 626,155 lbs of CO₂e,
almost five times as much.

[0]
[https://arxiv.org/pdf/1906.02243.pdf](https://arxiv.org/pdf/1906.02243.pdf)

~~~
tsbinz
Training it _with neural architecture search_. Just training it with a given
architecture is cited as 192 lbs...

~~~
dharma1
I thought Google etc. datacentres run mostly on renewable energy?

------
sriku
Research that demonstrates, say, a 1000x reduction in power for training on a
known problem (e.g. MNIST) won't be considered irrelevant by the community.
So is there a specific need to bias against large works, apart from the
carbon footprint argument? There remain questions to be answered in that
space too, such as: are current "neural" architectures adequate to cover the
capabilities of the brain when only the scale is increased? It was certainly
worth knowing that scaling up was all that was required to compete with
humans at DOTA. But will we hit a wall as we near human-level complexity?
After all, the money spent on making a movie which is "just" for
entertainment trumps multi-million-dollar deep learning experiments in costs,
which I guess have some correlation with the carbon footprint... or the gases
emitted in rocket launches. Do we really know whether this well-intentioned
call for green AI (which I sure as hell want) will do too little for
greenness while biasing people against possible discoveries that could lead
to a greener future, a tad too early?

Edit: pardon typos due to mobile device.

------
taneq
> The computations required for deep learning research have been doubling
> every few months

I feel this is misleading. Computation is _cheap_, so the computation thrown
at deep learning research has been doubling every few months, but that
doesn't mean it's required to do research (unless your research is "throw
huge datasets at a neural net architecture and see if it sticks").

~~~
atoav
I understood it as “computation required to replicate stuff outlined in a
paper has doubled” not as “we do more research so we do more computation”.

------
malux85
This reminds me of that Simpsons quote "We can't fix your heart, but we can
tell you exactly how damaged it is"

~~~
swalsh
I'm not sure about that. I think you get what you measure, and if you start
measuring efficiency, you're going to start seeing a major incentive to make
it more efficient.

~~~
Nasrudith
I agree that it is important, but I think understanding is a prerequisite to
doing so deliberately (careful of Goodhart's law), and from there the
incentives become evident.

Like Lanchester's laws: it is obvious how outnumbering the foe can help, but
knowing the scaling really hammers home how important it is to concentrate
force against as weak a group of foes as possible.

[https://en.m.wikipedia.org/wiki/Lanchester%27s_laws](https://en.m.wikipedia.org/wiki/Lanchester%27s_laws)
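The square law is easy to sketch. This is the textbook idealization (equal-quality forces, aimed fire), not a model of real combat:

```python
import math

# Lanchester's square law: with aimed fire, fighting strength scales with
# the square of numbers, so the larger side's advantage compounds.
def square_law_survivors(a, b):
    """Survivors on the larger side when both sides have equal quality."""
    big, small = max(a, b), min(a, b)
    return math.sqrt(big**2 - small**2)

# 100 units vs 70: the larger force wins with ~71 survivors, not the 30 a
# linear model would predict -- concentration of force pays quadratically.
```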

------
esotericn
This quote amused me:

> deep learning was inspired by the human brain, which is remarkably energy
> efficient

Yeah, sure, if you look purely at its energy consumption in isolation and
ignore all of the flights of fancy we engage in (such as, for example, this
NN model) in order to maintain its coherence.

------
keithyjohnson
Efficiency metrics would be really useful in evaluating DNNs for embedded
solutions as well.

