
At Tech’s Leading Edge, Worry About a Concentration of Power - gumby
https://www.nytimes.com/2019/09/26/technology/ai-computer-expense.html
======
high_derivative
So I am a machine learning researcher who moved to a FAANG as a research
scientist after graduation. My salary is 10x my grad student stipend, and
that does not even account for the free food, the healthcare, and other
perks. However, I have not adjusted my lifestyle, so it does not feel real.

The thing is, even though I have 1000x the resources I had at university,
that does not really make me happier about the work specifically. It makes
some things easier and other things harder.

No, what I really feel is that at work I am not actually treated like a
servant any more but like a person. I don't have to work weekends and nights
any more. I can take vacations and won't be flooded with emails
every.single.day during holidays. I don't have extra unpaid responsibilities
that I have zero recourse against.

The thing I just cannot stop wondering about is why, knowing the perks of
industry, advisers still treated us like this. Even while ostensibly trying
to keep us, it felt more like they were squeezing out as much as they could.

~~~
Donald
>The thing I just cannot stop wondering about is why, knowing the perks of
industry, advisers still treated us like this.

It's cultural. Their PhD advisors treated them the same way. A PhD is
effectively a hazing ritual required to break into academia.

~~~
WhompingWindows
I entered an MS program, where my two advisors told me I could get a PhD in
just 3 years, vs 2 years for the MS. I naively thought that sounded good. I
didn't realize that my advisor had taken over 4 years to do it herself. Nor
did I realize that she was drastically over-committing her time, and she
dropped me as a student just 1 year later because she didn't have time. It
was a blatantly misleading and false advising process, and I wasted 2 years
because of it.

The power your advisors wield over you is terrifying in graduate school. You
are severely underpaid AND they have you over a barrel.

~~~
tonyarkles
> The power your advisors wield over you is terrifying in graduate school. You
> are severely underpaid AND they have you over a barrel.

I realize this isn't the case for everyone, and I realize my situation was
odd, but... there's always a trump card: you can drop out. I get that for
international students it's not always so simple (losing your student visa,
having to go home) and that it can be a very hard choice.

My supervisor and my department occasionally tried to push for things that I
believed were ridiculous, and the threat of dropping out seemed to work
reasonably well to push back on that. My understanding is that a supervisor
having a low completion rate is a black mark that can hurt them, especially if
they're chasing tenure. I didn't pull this out like a petulant child stomping
their feet when they didn't like their chores, but rather when things were
getting stupid.

Tony: "This is totally unreasonable and the timeline doesn't work for me."

Supervisor: "This is how it must be. There's no other option."

Tony: "I could drop out and continue on with my life."

Supervisor: "Now wait a minute, let's see what we can do to make this work..."

~~~
selimthegrim
Your supervisor can drop out too - they can change universities midstream.

~~~
tonyarkles
Also very true! My #1 choice of supervisor was pleasantly honest about that.
He wasn't tenured, and after we'd met a few times to talk about maybe me
starting a program with him in the fall, he told me "I've got bad news.
There's a pretty good chance that I'm going to lose my position here due to
funding cuts, and that would leave you without a supervisor in 8 months. We're
looking at a pretty niche field, and there isn't really anyone else who could
take over for the project we're talking about. Sorry, you should look
elsewhere."

I still look back and would have loved to do that research, but I appreciate
his honesty.

~~~
selimthegrim
You had honest ones. Mine tried to milk me right up to the moment they
announced they were leaving, for institutions that only offered a Master's
program or for a department in a different field, and they were stupid
enough to think I wouldn't see the writing on the wall long before, or
figured I'd quit.

------
galimaufry
I'm not an expert, but I've read highly-cited ML papers where the researchers
barely bothered with hyperparameter search, much less throwing a few million
dollars at the problem. You can still get an interesting proof of concept
without big money.

And low-resource computing is more theoretically and practically
interesting. I've heard experts complain of some experiments that "they
didn't really discover anything, they threw compute at the problem until
they got some nice PR." This was coming from M'FAANG people too, so it's not
just resentment.

~~~
opportune
Are you sure they barely bothered, or did they just not mention it? I have
heard stories of lots of “cutting edge” ML research actually just being the
result of extremely fine hyperparameter tuning.

~~~
Bartweiss
This is a worryingly good question. Most ML papers represent real results,
so we're not going to see a replication crisis in that sense, but I've heard
some fears about another AI winter arriving when people realize that the sum
of our reported gains vastly exceeds our actual progress.

Hyperparameter tuning is one big concern here; we know it provides good
results at the cost of lots of work, so there's a temptation to sic grad
students on extensive tuning but publish on technique instead. Dataset bias
is another, since nets trained on CIFAR or ImageNet keep turning out to
embed dataset-specific features.

Ironically, I'm not sure all this increases the threat of FAANG taking over
AI advancements. It sort of suggests that lots of our numerical gains are
brute-forced or situational, and there's more benefit in work on new models
than mere error percentages would imply.
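
To make the tuning concern concrete, here is a toy numpy sketch (my own
illustration, not from any paper discussed here): even when every
hyperparameter configuration is exactly as good as the rest, reporting only
the best validation score across N tuning runs inflates the number as N
grows.

    import numpy as np

    rng = np.random.default_rng(0)
    true_accuracy = 0.90   # assume every config has the same true accuracy
    eval_noise = 0.01      # std-dev of validation-set measurement noise

    for n_trials in (1, 10, 100, 1000):
        # one noisy validation score per hyperparameter configuration
        scores = true_accuracy + eval_noise * rng.standard_normal(n_trials)
        # reporting only the best run overstates progress as trials grow
        print(f"{n_trials:5d} trials -> best score {scores.max():.4f}")

The gap between the best observed score and the true accuracy is pure
selection effect, which is one way reported gains can exceed actual
progress.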

------
acollins1331
But I do AI and I work at a university. Developing AI might be too expensive
if you're going for larger architectures and eking out an extra percent or
two on a Kaggle-like problem. Most advances in machine learning to come,
though, are in fields where there has been little activity. I'm currently in
the process of making a career out of using very basic machine learning
methods and applying them to physical science problems, because 95% of the
people in the field don't know how (a problem of tenure). This opens up lots
of opportunities for funding, though. The NSF etc. will literally just throw
money at you if you say AI and that you'll apply it to any problem.

~~~
abrichr
> _I'm currently in the process of making a career out of using very basic
> machine learning methods and applying them to physical science problems..._

Interesting! How do you find customers?

~~~
acollins1331
Oh, the career is in academia and research.

------
bluishgreen
I call BS.

Companies spend a lot of money on AI because they have a lot of money and
don't know what to do with it. Companies lack creativity and an appetite for
riskier, more creative ideas; supplying that is what universities should do
instead of trying to ape companies. The human brain doesn't use a billion
dollars in compute power; figure out what it is doing.

Sort of by definition, it can never be too costly to be creative. Only too
timid. And too unimaginative.

~~~
mapcars
>The human brain doesn't use a billion dollars in compute power; figure out
what it is doing

Phahahaha, this gave me a good laugh :) To find out how your brain works,
guess what you are going to use: the brain itself. It's like trying to cut a
knife with itself, or trying to use a scale to weigh itself.

~~~
noonespecial
>to use a scale to weigh itself.

Flip it upside down. Like he said, be creative.

~~~
aurelianito
Or use two. That's what we do: use one brain to understand another brain.

------
jchallis
There is a far more obvious problem than the cost of computing - the cost of
labor. If you are a deep learning researcher, you can join any of these
companies and multiply your salary 5x.

Compute constraints are relatively straightforward engineering and science
problems to solve. The lack of talent seems like the bigger story.

~~~
blazespin
Yeah, but at university you are more free to pick your project and can be much
more nimble - so it balances out.

~~~
skj
You might be shocked to find out what the job is like for most professors.
Grants, grants, and grants.

~~~
dekhn
and travel, travel, travel, and submit, submit, submit papers.

------
Merrill
It may be a sign that CS is becoming a mature field. Aeronautical
engineering departments don't build experimental aircraft. Chemical
engineering departments don't build experimental refineries. If you want to
do those things you work for Lockheed, Boeing, Exxon, etc.

~~~
dekhn
This is demonstrably false; high-end research universities do exactly these
things. Stanford has a high-end fab. Caltech students build experimental
aircraft. Universities build nuclear reactors for research. I didn't find
any examples, but I'm certain that universities in Texas have small
refineries.

~~~
Merrill
There are a few "have" universities in each case, and mostly "have not"
universities, which is consistent with the direction AI is going, according to
the NY Times article.

Even then, the experimental aircraft at Caltech is not a full-scale prototype
of the next generation of fighter after the F-35. How does Stanford's fab
compare with TSMC's?

Edit: TSMC collaborates with four Taiwan universities on research and provides
fab services for 23 universities.
[https://www.tsmc.com/csr/en/update/innovationAndService/case...](https://www.tsmc.com/csr/en/update/innovationAndService/caseStudy/2/index.html)

------
bitL
I hit this wall in my own deep-learning-based automation business. I am now
forced to rely on transfer learning most of the time; my Tesla/Titan
RTX-based in-house "server" is no longer capable of training the latest and
greatest models in reasonable time, the cloud is out of the question due to
costs, and (automated) parameter tuning with distributed training takes
ages. I can still charge a lot for customized solutions, though I see the
writing on the wall that it might not last long (2-4 years) and I'd have to
switch businesses, as there will be only a handful of companies able to
train anything better than the current SOTA.
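
For readers unfamiliar with the transfer learning mentioned above, here is a
minimal PyTorch-style sketch of the general technique (the dataset,
`num_classes`, and the tiny batch are placeholders, not the actual setup
described above): reuse a backbone pretrained on a large dataset, freeze it,
and train only a small task-specific head, which costs a fraction of
training from scratch.

    import torch
    import torch.nn as nn
    from torchvision import models

    num_classes = 4  # hypothetical number of classes in the target task

    # Start from a backbone pretrained on ImageNet and freeze its weights.
    model = models.resnet18(pretrained=True)
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final layer; only this small head gets trained.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

    x = torch.randn(8, 3, 224, 224)             # stand-in batch of images
    labels = torch.randint(0, num_classes, (8,))
    loss = nn.functional.cross_entropy(model(x), labels)
    loss.backward()                             # gradients only reach the head
    optimizer.step()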

~~~
thewarrior
But do you need SOTA for your business to be viable ?

~~~
bitL
Well, right now I can get within 1-3% of SOTA performance for the models I
need, but I expect that in 2-4 years I'll be much farther away, with
laughably outdated models nobody would want to pay for. Right now I can
replace hundreds of people in e.g. quality control, since they make e.g. 15%
errors and I can bring that down to e.g. 8%. But later the big boys might be
able to bring it down to 1% for $2M in training cost, and I wouldn't stand a
chance.

------
m3nu
I asked the same question after several AI talks given by large companies
(Google, Nvidia, etc).

The general answer is that it's still possible to try out things on a single
GPU or several servers and many gains come from good features and smart
network designs. On the other hand, squeezing out the last 5% does require
more data and budget.

Personally, I think you can still do a lot with a moderate budget and smart
people. But would love to hear other opinions.

~~~
totoglazer
Look into the modern NLP models: BERT and its many derivatives, RoBERTa,
XLNet. Training any of these requires roughly a terabyte of data and
generally takes days on multiple TPUs. You often can't even fine-tune on a
single GPU without some clever tricks.
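
One such trick is gradient accumulation: run several small batches, add up
their gradients, and only then take an optimizer step, trading wall-clock
time for memory. A minimal PyTorch sketch of the idea (the tiny linear model
and random data are placeholders, not an actual BERT setup):

    import torch
    import torch.nn as nn

    model = nn.Linear(128, 2)                  # stand-in for a big model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    accum_steps = 8    # 8 micro-batches ~ one large "effective" batch
    micro_batches = [(torch.randn(4, 128), torch.randint(0, 2, (4,)))
                     for _ in range(32)]

    optimizer.zero_grad()
    for step, (x, y) in enumerate(micro_batches):
        loss = nn.functional.cross_entropy(model(x), y)
        (loss / accum_steps).backward()        # gradients accumulate in-place
        if (step + 1) % accum_steps == 0:
            optimizer.step()                   # one step per 8 micro-batches
            optimizer.zero_grad()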

------
RhysU
When compute becomes prohibitively expensive, people find opportunities in
attaining better algorithms or cheaper compute.

~~~
hexeater
“Better algorithms” - sounds like a good research topic!

~~~
samfriedman
Drat, scooped again...

------
benkarst
Why do universities need to compete with big companies? The less they behave
like corporations and the more they behave like places of learning, the
better.

------
cujo
So tell me something....

Have AI techniques actually changed in the last 20 years, or is there just
more data, better networking, better sensors, and faster compute resources
now?

By my survey of the land, there haven't been any leaps in the AI approach.
It's just that it's easier to tie real world data together and operate on it.

For a university, what changes when you teach? This sounds like researchers
feeling that the papers they can churn out are more like industry reports
than advances in ideas.

~~~
currymj
There hasn't been any kind of paradigm shift, but there have been a bunch of
real, albeit incremental, improvements: better optimization algorithms, and
interesting, novel neural network layers.

Some of this is even motivated by mathematical theory, even if you can't prove
anything in the setting of large, complex models on real-world data.

The quote from Hinton is something like: neural networks needed a 1000x
improvement from the 90s, and the hardware got 100x better while the
algorithms and models got 10x better.

~~~
cujo
So I guess this kind of leads to my original point. If you're a school,
nothing has really changed that would require you to invest gobs of money to
teach AI. It only matters if your idea of research is trying to create
something you can go to market with.

So basically every graduate chemistry program.

------
blacksoil
While it's true that training a new DL model requires lots of computational
power, I feel that the activity mentioned in the article is more an
"application" of ML than "research". I think universities should move in the
direction of "pure" research instead.

For example, coming up with a new DL model with improved image recognition
accuracy means training it on millions of samples from scratch, which
requires a lot of time and money. But I'd argue that such a thing is more an
"application" of DL than "research". Let me explain why. Companies like
FAANG have the incentive to do that, because they have tens or hundreds of
immediate practical use cases once the model is completed; hence I call such
activity an "application" of ML rather than "research", because there are
clear monetary incentives for completing it. What about universities: what
incentive do they have for creating state-of-the-art image recognition other
than publication? The problem is that publication can't directly produce the
resources needed to sustain the research (i.e. money).

I think ML research in universities should move in the direction of "pure"
research. For example, instead of DL, is there any other fundamentally
different way of leveraging current state-of-the-art hardware to do machine
learning? Think of how people moved from approaches such as SVMs to neural
networks. The neural network was originally a "pure" research project. At
the moment of its creation, the neural network didn't take off because
hardware wasn't capable of keeping up with its computational demand, but
fast-forward 10-15 years and it became the state of the art. University ML
research should "live in the future" instead of focusing on what's hyped at
the moment.

------
ineedasername
The article presents the rising costs couched within the theme of
all-too-powerful tech companies like Google and Facebook, which is really
irrelevant: these costs are not high because of those companies; they are
high because the research itself is incredibly resource-intensive, and it
would be so whether or not large tech companies were also engaged in it. In
fact, with Google developing specialized chips for this purpose, AI research
is probably getting _cheaper_ due to their involvement.

Next, this research will probably continue to get cheaper. Doing the Dota 2
research 5 years ago would have cost much more, and it will probably cost
even less 5 years from now.

Also, I think there's plenty of room for novel & useful work at the bottom
end, where millions of dollars in compute resources are not essential.
Cracking AI Dota is certainly interesting, but it's hardly the only game in
town, and developing optimized AI techniques specifically for
resource-sparse environments would be a worthy project.

------
wongarsu
Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be
within the reach of any University department, and at least Nvidia also seems
happy to sponsor University setups (after all new research creates demand from
the industry).

Sure, this doesn't compete with Google's data centers. But that assumes
universities are for some reason competing against private industry. That's
not how any other engineering discipline works, so it's a bit odd to just
assume it without discussion.

~~~
paulhilbert
"Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be
within the reach of any University department"

That was funny - but it's not even close to reality. I have to work on a GTX
1080 (not Ti)...

~~~
wongarsu
A month ago Nvidia had a grant program running to get rid of
refurbished^w^w^w^w donate 1-4 Titan Vs based on a 1-2 page application [1].
When my university started offering a CUDA course we got ~15 top-of-the-line
GTX cards sponsored by Nvidia. Buying 100 GTX 1080 Tis with 11GB each, plus
supporting hardware, is in the range of 100 thousand euros/dollars (before
applying education discounts and asking for sponsorships). Not money you
spend on a whim, but not outrageously expensive either (the article mentions
OpenAI spending millions on cloud GPU resources; compared to that, spending
100k on something you get to keep is nothing).

[1]
[https://developer.nvidia.com/academic_gpu_seeding](https://developer.nvidia.com/academic_gpu_seeding)
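
As a sanity check on that figure, a back-of-envelope sketch in Python (all
prices are rough assumptions, circa 2019, not quotes):

    n_gpus = 100
    gpu_price = 700      # EUR/USD per GTX 1080 Ti, assumed street price
    gpus_per_host = 4    # assumed density per supporting machine
    host_price = 2500    # assumed CPU/RAM/PSU/chassis per 4-GPU host

    total = n_gpus * gpu_price + (n_gpus // gpus_per_host) * host_price
    print(total)         # 132500 -> the same ballpark as the ~100k figure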

------
iamgopal
AI is now a data problem, and a bit of an optimisation problem; both should
or could be solved at the commercial end of research. Universities should be
more focused on what's next. I'm not saying this is not "the next", but
since the ground-level ideas of AI are 50-60 years old if not older, current
ground-level research should build the theoretical platform for the
technology that is going to come in 20-50 years.

------
ArtWomb
You need all four quadrants: risk capital, industry, academia, and that
elusive X-factor. The hard work of AI/ML theory, such as issues around
generalizability and ethics, is still done around whiteboards and at
academic conferences.

A more useful metric may be the proportion of proprietary versus open
discovery. I don't know if I can point to a single example where researchers
have not rushed to put their latest breakthroughs on OpenReview or Arxiv. Even
knowledge of a technique, without the underlying models or data, is enough to
influence the field.

Academic free inquiry and intellectual curiosity look very different from
product-focused, solutions-oriented corporate R&D. A good working example is
Google AI's lab in Palmer Square, right on the Princeton campus. Researchers
can still teach and enjoy an academic schedule. I think it was Eric
Weinstein who said something to the effect that if you were a
johnny-come-lately to the AI party, your best bet would be to buy the entire
Math Department at IAS! In practice, it's probably easier to purchase
Greenland ;)

------
mcv
I don't quite understand the issue here. I thought the main reason for the
many recent breakthroughs in AI was that hardware has become cheaper and more
powerful. Anyone can train a neural network on the graphics card of their home
PC now. There are powerful open source frameworks available that do a lot of
the heavy lifting for you. You can do far more today than you could back when
I was in AI.

Of course the Big Tech companies have far more resources to throw at it;
that's why they're Big.

A far more serious issue than access to computational power is access to
suitable data, and particularly the hold that Big Tech has on our data.

------
ilaksh
It seems like a structural problem. Deep learning generally performs better
than anything else that is well known, but it also has well-known
limitations and inefficiencies.

People should question all of the assumptions, from the idea of using NNs at
all, to the particular type of NN, to all of the core parts of the belief
system, because these aspects are fixed by faith more than anything else.

If you want efficiency of training, adaptability, online learning,
generality, and true understanding, those assumptions might need to go. That
would not mean you couldn't learn from DL systems, just that core structures
would not be fixed.

------
jackcosgrove
Is compute the limiting factor, or is it data?

I see an asymmetry between academia and industry. Academia has the models,
industry has the data. Compute is more balanced because it's usually commodity
hardware.

If industry is outpacing academia in research, I think that means data is the
more valuable quantity, not compute.

And the article's theme of concentration is more a problem with data. Is
Facebook dominant because of its algorithms or because of its database? If
other companies had Google's index and user telemetry could they not compete
with a rival search algorithm?

------
keithyjohnson
Maybe this is a good time for university researchers to develop AI
algorithms that are not so data- and compute-hungry. Here's a promising bit
of that: [https://www.csail.mit.edu/news/smarter-training-neural-
netwo...](https://www.csail.mit.edu/news/smarter-training-neural-networks).
This is easier said than done, but necessity is the mother of invention, as
they say.
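
That linked CSAIL work is about finding small subnetworks that train about
as well as the full network (the "lottery ticket" line of research). One
ingredient is magnitude pruning; here is a minimal numpy sketch of that step
alone (my own simplification, not the paper's code):

    import numpy as np

    def magnitude_prune(weights, sparsity=0.9):
        """Zero out the smallest-magnitude weights, keeping roughly a
        (1 - sparsity) fraction of them."""
        k = int(weights.size * sparsity)
        threshold = np.partition(np.abs(weights).ravel(), k)[k]
        mask = np.abs(weights) >= threshold
        return weights * mask, mask

    w = np.random.default_rng(0).standard_normal((256, 256))
    pruned, mask = magnitude_prune(w, sparsity=0.9)
    print(mask.mean())  # ~0.1: only ~10% of the weights survive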

------
blue_devil
Maybe it's time to start triaging/incentivising what kinds of problems we
spend that computational power and those CO2 credits on.

Playing in the sandbox with AI (especially the "brute force" deep learning
algos) does not, in and of itself, equate to _intelligence_ or _progress_
for us as a civilisation.

------
raverbashing
AI is not necessarily expensive.

What's expensive is thinking that AI evolution means deeper networks and
that the only way to get better results is to throw more GPUs at the
problem.

And to be honest, those with "infinite" resources are a bit "guilty" of
pushing these research lines.

------
imtringued
I don't see the problem. Universities don't solve problems by simply
throwing more dollars at them. If compute resources are scarce, then
researchers are incentivised to invent machine learning techniques that use
those resources more efficiently.

------
Barrin92
The point about the increase in computational resources reminded me of Gary
Marcus's latest book. The trend away from research into generalised AI
models and towards narrower and narrower domain-specific ML models that
consume more and more energy and data seems to be becoming a bottleneck.

I think that instead of trying to build larger computers, there is an
opportunity for academia to move back towards the construction of cognitive
models, minimizing the reliance on computation and data. That's what
intelligence is supposed to be all about.

~~~
degski
You hit the nail on the head. AI is nowadays defined as a very sophisticated
form of curve fitting [thanks to TensorFlow, among others]; that's what
training ANNs comes down to. There is clearly a lack of intelligence; for
example, the [main] method used to beat Lee Sedol was MCTS (with the help of
some ANNs).
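
For the curious, MCTS itself is conceptually simple. A skeletal Python
sketch of the UCT variant (the game-specific helpers `legal_moves`,
`apply_move`, and `random_rollout` are hypothetical stubs you would supply;
AlphaGo's real version guides the search with ANNs instead of random
playouts, and two-player sign flipping is omitted for brevity):

    import math
    import random

    class Node:
        def __init__(self, state, parent=None):
            self.state, self.parent = state, parent
            self.children = []           # filled in on expansion
            self.visits, self.value = 0, 0.0

    def uct(child, parent_visits, c=1.4):
        # Upper Confidence bound for Trees: exploitation + exploration.
        if child.visits == 0:
            return float("inf")
        return (child.value / child.visits
                + c * math.sqrt(math.log(parent_visits) / child.visits))

    def mcts(root, iterations, legal_moves, apply_move, random_rollout):
        for _ in range(iterations):
            node = root
            # 1. Selection: descend the tree by UCT until reaching a leaf.
            while node.children:
                pv = node.visits
                node = max(node.children, key=lambda n: uct(n, pv))
            # 2. Expansion: add one layer of children below the leaf.
            if node.visits > 0:
                node.children = [Node(apply_move(node.state, m), node)
                                 for m in legal_moves(node.state)]
                if node.children:
                    node = random.choice(node.children)
            # 3. Simulation: random playout from here, scored in [0, 1].
            reward = random_rollout(node.state)
            # 4. Backpropagation: update statistics along the path.
            while node is not None:
                node.visits += 1
                node.value += reward
                node = node.parent
        return max(root.children, key=lambda n: n.visits)  # most-visited move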

------
programminggeek
University isn't utopia. Not everyone wants to live there. Also, there
aren't enough university research dollars to adequately fund AI research
relative to the current demand.

This is a good thing.

------
whatitdobooboo
Hasn't it always been like this? When the Model T came out, things were
concentrated even more, I would assume. What other way could it be?

------
rahidz
Not super knowledgeable about deep learning, but is some form of distributed
computing possible here, or is the amount needed just too high?

~~~
FartyMcFarter
Yes, distributed computing is sometimes possible for machine learning.
However, distributed computing typically makes computations faster or bigger,
not cheaper.
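
To illustrate that point, a toy numpy sketch of data-parallel training (my
own illustration): each of four simulated "workers" computes a gradient on
its shard of the batch, and the results are averaged. Wall-clock time can
drop roughly four-fold, but the total number of FLOPs, and hence the bill,
is unchanged.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1024, 10))
    w_true = rng.standard_normal(10)
    y = X @ w_true

    def shard_grad(w, Xs, ys):
        # gradient of mean squared error on one worker's shard
        return Xs.T @ (Xs @ w - ys) / len(ys)

    w = np.zeros(10)
    shards = np.array_split(np.arange(len(X)), 4)  # 4 simulated workers
    for _ in range(200):
        # each worker computes in parallel; gradients are then averaged
        # (same total compute as one machine, just spread across workers)
        g = np.mean([shard_grad(w, X[s], y[s]) for s in shards], axis=0)
        w -= 0.1 * g
    print(np.allclose(w, w_true, atol=1e-4))  # True: same answer, faster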

------
drchewbacca
Is there anyone who thinks we shouldn't break up these large companies on
anti-trust grounds?

Imo it's gotten to the point where if you have a bad idea your startup
fails, and if you have a good idea they will copy it and your startup fails.
Even companies like Spotify and Slack, which have "made it", are now
threatened by Google Music + Apple Music and by Microsoft Teams.

Would be interested to hear other opinions.

------
octocop
This could be said of anything: some people have more money than others.

------
cjfd
Well, these AI calculations also emit a lot of CO2. It may be a luxury that
we can only afford in moderation. The vision of a society where AI plays a
large role may only be feasible, both ecologically and economically, if/when
we have fusion energy.

~~~
yhoneycomb
This is so ridiculous. Of all the things that emit CO2, you're focusing on
AI calculations? Tell me you're not serious.

~~~
imglorp
I'm dead serious about this. I think we should weigh all large-scale compute
against its carbon cost. Granted, FAANG is starting to look for renewable
sources for its DCs, but US data centers alone will use around 73 billion
kWh in 2020 [1].

How much of that is ML? As someone pointed out in another comment, how much
ML is actually useful? Some ML is "carbon good", such as route planning that
saves energy. But do we really need to spend billions of kWh just to get
slightly better recommendations? Do we really need to increase margins by a
fraction of a percent so some company can show more ads and sell more?

And while we're on the subject of power, maybe if web pages weren't 300MB of
crap and 1kB of content, we could cut back another few billion kWh on
servers and routers.

This shit is serious, we're dying here, so yes, absolutely, let's do the math
about how much AI costs. It's up to us, the computer people, to ask these
questions and solve the part of this problem that WE own.

1\. [https://eta.lbl.gov/publications/united-states-data-
center-e...](https://eta.lbl.gov/publications/united-states-data-center-
energy)

~~~
ctdonath
_we're dying here_

Who? Where? Why?

~~~
imglorp
The point is the expenditure is unsustainable at the current rate, and we can
do something about it.

~~~
ctdonath
So nobody is dying from this.

Are you doing something, personally, about this? My office is 100% solar, home
about 20%, telecommute 100%; you?

------
surelyyoujest
Breaking news: some universities have more money than others.

------
floor_
I wonder how it compares to something like cryptocoin mining/transaction
processing.

~~~
atupis
And so Deeplearningcoin was born.

~~~
toxik
Interesting idea: the blockchain is a sequence of gradient descent steps.

------
baybal2
I'm seeing less and less point in this whole AI thing.

I'm seeing companies spend tens of megabucks on idling GPU farms without a
clear idea of what to do with them.

I saw that when we did a subcontract on a datacentre for Alibaba. They had a
huge reception for all kinds of dignitaries, showing them CGI movies of
their alleged AI supposedly crunching data right there in the DC — and all
that with all of the hardware in the DC shut down...

The moment I cracked a joke about that during an event, there was dead
silence, and faces on the stage started to turn red. The guy defused the
situation with another joke, and the party went on.

~~~
whatshisface
AI has too perfect a sales pitch and surrounding narrative for business
people to treat it rationally. The fact that philosophers who don't
understand AI are debating AI philosophy is strong evidence of that. A crash
is less likely this time around because it's being funded by established
companies instead of the public market, but there has to be a moment of
reckoning for the baloney eventually... I hope.

~~~
xamuel
>philosophers who don't understand AI

If you mean "philosophers who don't understand ML/gradient descent/linear
algebra/statistics etc", I don't think many philosophers are debating about
that stuff.

If you mean "philosophers who don't understand artificial intelligence", that
would be all of them, and everyone else too, because no-one understands
artificial intelligence yet. And a lot of the people who come closest to
understanding it are in philosophy departments.

~~~
whatshisface
I more or less agree with your second point. Presently, the philosophy of
artificial intelligence is kind of like theology: it is molded by how people
like to think more than it is determined by reality. Sales pitches also tend
to be molded by how people think. AI has that theological essence that makes
people love it even if they don't understand it, and that makes it a lot
easier to sell big GPU farms to companies that don't have a legitimate use
for them.

