At Tech’s Leading Edge, Worry About a Concentration of Power (nytimes.com)
230 points by gumby 27 days ago | 206 comments



So I am a machine learning researcher who moved to a FAANG as a research scientist after graduation. My salary is 10x the grad student stipend. That does not even account for the free food, the healthcare, and other perks. However, I have not adjusted my lifestyle so it does not feel real.

The thing is, even though I have 1000x the resources compared to university, that does not really make me happier about the work specifically. It makes some things easier and other things harder.

No, what I really feel is that at work I am not actually treated like a servant any more but like a person. I don't have to work weekends and nights any more. I can take vacations and won't be flooded with emails every.single.day during holidays. I don't have extra unpaid responsibilities that I have zero recourse against.

The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this. Even though ostensibly trying to keep us, it felt more like squeezing out as much as they could.


>The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this.

It's cultural. Their PhD advisors treated them the same way. A PhD is effectively a hazing ritual required to break into academia.


I entered an MS program, where my two advisors told me I could get a PhD in just 3 years, vs 2 years for the MS. I naively thought that sounded good. I didn't realize that my advisor took over 4 years to do it herself. Nor did I realize that she was drastically over-committing herself on time, and she dropped me as a student just 1 year later because she didn't have time. Just a blatantly misleading and false advising process, I wasted 2 years due to that.

The power your advisors wield over you is terrifying in graduate school. You are severely underpaid AND they have you over a barrel.


You could always say F it and get a job in the industry.

It's 1000 times better, especially when your research is clearly not on the cutting edge and you end up contemplating life choices because what you're doing is basically useless to the world, and you're only doing it because some funds decided to throw money at it.

At least that's how it is for people whose primary research method is coding, such an easy exit. The powerpoint making variety have it harder though.


> The power your advisors wield over you is terrifying in graduate school. You are severely underpaid AND they have you over a barrel.

I realize this isn't the case for everyone, and I realize my situation was odd, but... there's always a trump card: you can drop out. I get that for international students it's not always so simple (losing your student visa, having to go home) and can be a very hard choice.

My supervisor and my department occasionally tried to push for things that I believed were ridiculous, and the threat of dropping out seemed to work reasonably well to push back on that. My understanding is that a supervisor having a low completion rate is a black mark that can hurt them, especially if they're chasing tenure. I didn't pull this out like a petulant child stomping their feet when they didn't like their chores, but rather when things were getting stupid.

Tony: "This is totally unreasonable and the timeline doesn't work for me."

Supervisor: "This is how it must be. There's no other option."

Tony: "I could drop out and continue on with my life."

Supervisor: "Now wait a minute, let's see what we can do to make this work..."


"My understanding is that a supervisor having a low completion rate is a black mark that can hurt them, especially if they're chasing tenure."

Normally, during tenure decisions, the grad students and postdocs will be interviewed by the dean. If you work for a pre-tenure prof, and decide to leave, meet with the dean and let them know your concerns before dropping out. Make sure to include lots of physical documentation, such as emails requesting you to do inappropriate things.


Your supervisor can drop out too - they can change universities midstream.


Also very true! My #1 choice of supervisor was pleasantly honest about that. He wasn't tenured, and after we'd met a few times to talk about maybe me starting a program with him in the fall, he told me "I've got bad news. There's a pretty good chance that I'm going to lose my position here due to funding cuts, and that would leave you without a supervisor in 8 months. We're looking at a pretty niche field, and there isn't really anyone else who could take over for the project we're talking about. Sorry, you should look elsewhere."

I still look back and would have loved to do that research, but I appreciate his honesty.


You had honest ones. Mine tried to milk me right up to the moment they announced they were leaving - for institutions that only offered a Master's program or for a different field department and were stupid enough to think I wouldn't see the writing on the wall long before or figured I'd quit.


> I entered an MS program, where my two advisors told me I could get a PhD in just 3 years, vs 2 years for the MS.

In Germany a PhD is almost always done after an MS, and the PhD usually takes 4-6 years depending on the group.


> A PhD is effectively a hazing ritual required to break into academia.

Your definition is painfully on point.

And the perks of "getting into academia" are getting smaller with each day.

But hey, you get to have those 3 letters next to your name and probably can ask for a higher salary, that is if you don't go back into the academic machine and the hunger games of chasing grants/contracts and eventually tenure (but it is my impression funding a successful startup is easier than getting tenured today)


Your last impression, if true, would come as a great surprise to me. I would check but I'm literally in an Econ PhD class right now.


My professor with commercial experience said roughly the same thing: academia is harder and more competitive than the commercial world. Now that I have experience of both, I can say he was right. Your colleagues have roles and help you in the commercial world. Everyone is against each other in academia, and the admin staff know you're just a temp so your problems are not worth the effort.


As a nuclear physicist who left academia for Disney once told me: "Academia... The competition is so intense because the stakes are so low..."



My brother-in-law is a tenure-track professor (though not tenured yet) in cancer research; I'm working on getting a startup funded. I would agree with that impression. There are more fundable startup markets than there are tenured positions available in academia, and you have more flexibility to pursue new ones that open up than you do to change your line of research if it doesn't pan out. Both professions require an enormous amount of work, but you have more control as a startup founder, and are less subject to the whims of politics, grants, and scientific reality. Hard science involves an amazing amount of risk and hard work to manage that risk, more so than startups do.

The rewards for a successful startup are about 2 orders of magnitude higher than the rewards for tenure, too. The latter means that you have a job for life; the former means you don't need a job for life.


>There are more fundable startup markets than there are tenured positions available in academia

>The rewards for a successful startup are about 2 orders of magnitude higher than the rewards for tenure, too.

Massive [citation needed] on both these claims. "But my brother-in-law" is not evidence. It's not at all obvious that there are more startup markets than tenure positions. Also, founding a successful startup is largely a matter of luck, and being in the right place at the right time.


3/4 of academic positions are now non-tenured. When I left academia at the end of a 4-year post-doc in the late 90s, I more than tripled my salary in a day. These days, it would be more like ~10x. Without tenure, what possible reason does anyone have to stay in academia?

And for the first decade and a half it was great to make money, but also possess the grounding to build things collaboratively with academics. But then AI happened, and the market inefficiency that academia created for itself by making 3/4 of us make less than a high-school dropout with a trade skill caused tech to suck them all up like a sponge to feed the AI craze.

And these days, the same craptastic culture that drove me out of academia (pedigree, PhD (even though I have one but not from one of the 4 horsemen of AI or even one of the right schools so it's nearly worthless apparently), and the coveted "applied scientist" title which at least at AMZN means you get 15% higher comp to order both the scientists and engineers around because you convinced an L10 you were one whatever your actual qualifications are) are now status quo in tech. But I'm old now, and every day I am tempted to just throw in the towel and retire.

Commence those downvotes guys! None of what I just wrote could possibly be true in any way. Tech is clearly awesome and we just need to keep up the great work so we can hit that singularity sprint goal by 2030 or so to save Ray Kurzweil from squandering his hard-earned money on an Alexa-equipped toupee.


If you want data:

As of 2016, academic institutions hired 21,511 full-time, tenure-track professors [1]. Your odds of getting tenure given that you're in a tenure-track position vary by institution, but seem to range from about 75% at elite private institutions to 90% at state schools [2]. Figure about 18,000 professors were hired into positions that will eventually grant them tenure.

AngelList lists about 30,827 companies at the Seed or Series A/B/C stage [3]. Crunchbase has 33,577 funding rounds that occurred within 2016 [4]. The former stat has issues in that it includes funding rounds over a broad time period, and the latter potentially has issues with multiple funding rounds for the same company happening; however, given their general consistency and the existence of funded companies that appear in neither AngelList nor Crunchbase, 30,000 companies funded per year seems about right.

The average tenured professor salary varies heavily by field, but the Chronicle of Higher Ed reports it at $104,820 [5], Glassdoor at $70,385 [6], PayScale at $71,993 [7], Indeed at $66,921 [8]. There were 12,688 mergers & acquisitions in 2018 [9], the vast majority of which were for undisclosed amounts (companies do not have to disclose the transaction if it's for < $75M). Given the distribution of the ones that are disclosed, it's a good bet that most of those are for between $1M-20M (I've heard $1M/engineer as the going rate for acquihires, where the team succeeds in delivering a product but the product fails in the marketplace). This is still roughly 100x what the tenured professor makes though.

[1] https://www.aaup.org/sites/default/files/10112018%20Data%20S...

[2] https://dynamicecology.wordpress.com/2014/07/21/dont-worry-t...

[3] https://angel.co/companies?stage[]=Series+A&stage[]=Seed&sta...

[4] https://www.crunchbase.com/search/funding_rounds/a416e3bdb17...

[5] https://www.chronicle.com/article/How-Much-Did-Professors-Ea...

[6] https://www.glassdoor.com/Salaries/tenured-professor-salary-...

[7] https://www.payscale.com/research/US/Job=Tenure_Professor/Sa...

[8] https://www.indeed.com/salaries/Professor-Salaries

[9] https://www.statista.com/statistics/245977/number-of-munda-d...
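
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. It only uses figures quoted in the comment. The unweighted mean of the four salary reports and the comparison of a one-time acquisition price against a single year of salary are simplifying assumptions made for illustration, not part of the cited sources, and the function names are hypothetical.

```python
# Figures quoted in the comment above; no new data is introduced here.
TENURE_TRACK_HIRES_2016 = 21_511
TENURE_RATE_LOW, TENURE_RATE_HIGH = 0.75, 0.90  # elite private vs. state schools

SALARY_REPORTS = {
    "Chronicle of Higher Ed": 104_820,
    "Glassdoor": 70_385,
    "PayScale": 71_993,
    "Indeed": 66_921,
}

ACQUISITION_LOW, ACQUISITION_HIGH = 1_000_000, 20_000_000  # guessed $1M-20M range


def eventual_tenure_range(hires, low_rate, high_rate):
    """Return (low, high) estimates of hires who eventually receive tenure."""
    return hires * low_rate, hires * high_rate


def mean_salary(reports):
    """Unweighted mean of the quoted salary figures (an assumption)."""
    return sum(reports.values()) / len(reports)


def payout_to_salary_ratio(acquisition, salary):
    """One-time acquisition price divided by one year of salary."""
    return acquisition / salary


tenured_low, tenured_high = eventual_tenure_range(
    TENURE_TRACK_HIRES_2016, TENURE_RATE_LOW, TENURE_RATE_HIGH
)
avg_salary = mean_salary(SALARY_REPORTS)
ratio_low = payout_to_salary_ratio(ACQUISITION_LOW, avg_salary)
ratio_high = payout_to_salary_ratio(ACQUISITION_HIGH, avg_salary)

print(f"Eventually tenured: {tenured_low:,.0f} to {tenured_high:,.0f}")
print(f"Mean reported salary: ${avg_salary:,.0f}")
print(f"Acquisition / salary: {ratio_low:.0f}x to {ratio_high:.0f}x")
```

Under these assumptions the "about 18,000" figure sits near the middle of the tenure range, and "roughly 100x" falls inside the ratio range, though the low end of that range is far smaller. The result is also sensitive to how the acquisition price is split among founders, which this sketch ignores.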


Great, you've given us the numerator. Now you just need the denominator.


Not quite, the average startup probably has significantly more than 1 founder.

So, it’s clear more people get funding than tenure. However, difficulty really is a function of the success rate of each track and the amount of investment they make in it. Still, for the average college freshman tenure is likely much harder.


You can't judge difficulty simply by the number of people who have obtained something. You need to know how many startups are seeking funding, and the number of people attempting a phd. That's the denominator I was referring to.

The average college freshman is not trying to get a phd. Picking a low starting point like that is just a way of making it appear more difficult than it is. The more appropriate number is the number of phd's attempted.


That's also the wrong denominator. The fact that a million unqualified people attempt startups because the barrier to entry is filing a $100 LLC online has no relevance on my particular chances of succeeding at a startup.

I didn't respond to your comment because the right denominator is going to depend on who is reading this thread. A Stanford CS major with a 2400 SAT whose parents successfully founded a tech startup is coming from a dramatically different pool than a data-entry drone with no college degree and no technical skills who figures that a weekend startup is his ticket to riches. The former's chances at both getting a funded startup and a tenure-track position are pretty good. The latter's chances are effectively zero. The latter falls into the pool of people who could conceivably "start" startups, but doesn't fall into the pool of people who attempt Ph.D programs.

You are not a lottery ticket.

Realistically, I think that tenured professorships and funded startups largely draw from the same pool of people and hence have the same denominators. That's why I focused more on the numerator. If you have the skills, dedication, and knowledge needed to become a tenured professor you usually (not always) have the skills, dedication, and knowledge needed to found a high-growth company, but there are more opportunities for the latter around, as well as fewer gatekeepers that can exclude you for arbitrary reasons.


The reason this behavior occurs is because people compromise themselves by jumping through hoops to get degrees. Meanwhile, universities have become fixated on producing credentials and nurturing their endowments rather than producing knowledge. Ponzi schemes fail when they run out of suckers. With the federal government giving hundreds of thousands in free money to round up new suckers it seems unlikely to fail anytime soon. Perhaps corporations will fill the void. I’m not hopeful.


My PhD adviser treated us well. He could be bluntly critical of our work ("Which of these conclusions do you actually believe?" became a catchphrase) but he treated us fine as people.

If you think a hazing ritual is required, as a kind of filter for who wants to be a professor most, you can set high standards for the work while still treating people with respect.


For anyone considering where to go to graduate school...

One's PhD experience is highly dependent on the university one attends and especially the professor that one works under. Each lab has a different culture and each professor treats students differently. It also matters a lot whether you are on fellowship or not (someone funded directly from a professor's grant is easier to bully/exploit).

Just like you should speak to employees at a company before joining it to get an idea of culture and work/life balance, prospective graduate students should speak to current graduate students to get an idea of life at a given lab. Most graduate students are openly aware of which professors are known for treating students like indentured servants and which are known for being hands off and generous with research funds. If you are trying to pick a university/lab, definitely go to visits and speak to 4th/5th year grad students (preferably over a beer or two). Typically by the end of their PhD, graduate students are willing to tell you the truth about the different labs/professors.

Also, remember that is often totally ok to switch labs within 1-2 years. Yes, it may set you back some on your progress, but it can be much better than being miserable for 5 years.


> No, what I really feel is that at work I am not actually treated like a servant any more but like a person.

My experience in FAANG research, outside of the pay > 100x multiplier, has been very different! In grad school we could work from anywhere at anytime, there was no requirement to be glued to a desk 14 hours a day and having to respond to emails on Friday nights or weekends or risk a bad performance review from your manager, no dystopian open office space, unlimited conference travel flexibility, etc. I actually kind of miss grad school despite having made 30K a year as a PhD student.


> In grad school we could work from anywhere at anytime, there was no requirement to be glued to a desk 14 hours a day and having to respond to emails on Friday nights or weekends or risk a bad performance review from your manager

Which university gives 30K/year for a PhD? My top pay was 18K/year in my last year.


I think it's pretty typical, at least in physics. Every offer I had/that I have heard of was around 30k, I've only heard of one that was around ~24k iirc


Yep, STEM majors do relatively well.


So you're making 3 million a year? :)


Yep, my experience was similar to yours and I miss grad school due to that flexibility. I don't know where else to find such flexibility while getting paid decently.


Hello. I'd like to share one slightly tangential anecdote and observation regarding this.

I have a friend who was in a Physics PhD program at Caltech. Absolute genius of a kid, and was surrounded by other people who are incredibly smart. My friend was always a very ambitious person, and wanted to join Wall Street as a quant after completing his PhD because he was interested in maximizing his income, and found the problems presented in finance/markets more compelling than those found in academia.

When he intimated this to people in the Department, they looked at him as if he had suddenly grown tentacles, because it's unbelievable to them that anyone would want to do something other than academia. This is a stark contrast to friends I have at places like Stanford, where no one quite frankly cares.

This doesn't touch on any alleged bad behavior or stress or pressures or experiences that people have while in grad school, but I believe that the cultural forces at institutions govern how people feel pretty strongly. That isn't a profound observation or even revealing, but I just thought that people would like to see a quick human anecdote to maybe relate to people going through this.


Unwilling abusers. They know they're treating you badly, they feel it's wrong, but it's their job! And so they have to justify that behavior to themselves.

The victims must deserve it! Anyone so subhuman as to tolerate these conditions and abuses is evidently in need of punishment for being such a shank.


There is no unwilling. "It's the job" is no excuse.


Yes. Someone explained it to me that way once and the term stuck, it could be expressed better. "reluctant?"


"The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this."

You can drop "knowing the perks of industry". There's no excuse for doing that under any circumstances, even if there is no industry alternative. I see this in many fields where grad students work in labs and are funded by grants. I wish universities would clean it up. There's simply no excuse for it. There's nothing about being in grad school that makes this behavior okay.


> The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this.

I think you had a bad advisor(s)! They’re not all like that. Maybe the group you were in was under a lot of funding pressure or something, that can cause bad behavior.


+1 : my grad school experience was not like this. I'm sorry you went through this. For me grad school was the best time ever. Pursuing science while being protected and guided by a world expert, and doing so without grown-up responsibilities like getting grants, doing admin, etc. It was so. much. fun.


I think postdoc is even better. Bit better pay and academic freedom. You are still somewhat shielded from responsibilities, but you can take that jump if you want to take the initiative.

But no thesis hanging over your head :)


> The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this. Even though ostensibly trying to keep us, it felt more like squeezing out as much as they could.

But people in FAANG do work nights and weekends and it's totally not unusual to be assigned sudden bitchwork by your manager and have no recourse against it.


This is not the case on every team...


The recourse is to move to another team or company.


Saw this first hand at universities too. Grad school has a lot of exploitation going on, and no one manages the professors.


As a note, it is possible to be in a Ph.D. program and not be an employee of the University, at least in the US. Being on a grad student stipend is usually a choice. Advisors are treating you like that because you allow yourself to be treated like that.

Yes, I have a PhD. Yes, I paid my own graduate school tuition. Yes, I had a successful career in ML.


I used the term grad stipend loosely, I was on various scholarships that meant I was not technically an employee or dependent on funding from my adviser. Did not make a difference at all.

Where you are right is that I did allow myself to be treated like that. I could have quit at any time, of course.


Sorry you felt being abused or quitting were the only options. Strikes me as a false dichotomy, but, hey, don't know the particulars of your unique situation. But for others out there: your skills are highly valuable, others will pay for them, your advisor needs people with your skills, you have leverage. At least, try speaking up for yourself. Hell, organize. https://en.wikipedia.org/wiki/Graduate_student_employee_unio...


I stuck it out trying to fix bad culture for a year last year. My mom had good advice: it’s really hard to change an established culture, impossible without support from the top.

I learned a lot being a squeaky wheel, but I didn’t change the culture in any meaningful way beyond getting us a lunch room.


If you don't mind my asking, what multiple of $100k is your salary? Is ML still much more highly paid than standard engineering?


Fully expecting utter disbelief and skepticism, the best information I have indicates the compensation cap at FB is ~$3M. I have personally made up to $1.5M in a year doing AI work. If I had a higher threshold for pedigree BS in tech, I could have made ~$2M this year (but I don't and I have run out of you-know-whats to give about that personality type because life is too short for working with them IMO). In contrast, the NYT reported (previously) non-profit OpenAI's CEO made $2M annually from the get-go. I'm sure it's more now that it's OpenAI LP. And then there's Anthony Levandowski's $120M comp at Waymo that somehow just wasn't enough for him. I can't relate to that.

Before AI, the most I had ever made in a year was $800K based on a long stock options play for a public company I knew was undervalued (and I was there for its rock bottom). And in academia, the most I ever made was $30K as a post-doc and $13K as a grad student before that. That anyone wonders why non-tenured sorts flee is mind-blowing.

In contrast, I've seen ruthless tenured types pull 7 figures with multiple labs at multiple institutions and lucrative consulting contracts with private industry and the military. All whilst cheating on their spouses or sleeping with their students, sometimes both.


were you high up in the org? Or an individual contributor on something that had a very strong measurable relationship with the bottom line?


High on the technical track as an IC that made multiple significant moneymaking contributions over the past decade (which was used against me last year at my last employer because I hadn't done sufficient pure research in that past decade to merit consideration as a scientist despite my doctorate).

My attempt to go into management was pretty disastrous due to my inexperience and toxic internal politics. And that made me lose interest in pursuing it further because politics are just not my strong suit.

Mid-range engineers at FB I know personally are making ~$500K supporting AI efforts. AI is really lucrative for now and probably always will be. But I hope there's a purge of the posers in the field somewhere down the road.


The salaries at FAANGS are level-driven, so I refer to levels.fyi. I think the perk of being a research scientist is more freedom, building prototypes, not having to do support/oncalls in general, just being able to hand off that type of work.


3x to 4x depending on col. Grad student stipends are only 30k, sometimes lower

ML isn't much higher paid than standard engineering, but new PhDs from top schools who would've been competitive faculty applicants tend to get jobs at top paying firms and enter toward the top of the non-executive ranks.


A grad student stipend can be as low as 20-30K a year, depending on the school.


It's also that as someone coming in with a PhD you will start halfway up the career ladder. My first job was a 'senior'.


> No, what I really feel is that at work I am not actually treated like a servant any more

Yessssss this. Even as a more permanent lab member (staff) I don't have a clearly defined role outside of "do what no one else wants to." It's not just the students!


Tell us your salary?


At a guess, are you a foreign student? Most post graduate degrees are being gotten by foreigners, because that’s how the economics works as the education ponzi scheme collapses.

The sweet spot, in terms of ROI, is to get a bachelor’s degree and go immediately into industry. The student shows they are competent enough to succeed on their own in a school environment where they have a nonzero probability of failure. Foreigners stay in school for as long as possible so their green card can process. It’s not worth the risk of working for a company where they can be fired immediately and sent home, and it’s likewise not worth the cost of an H-1B to companies for an undergrad. Since we graduate more PhDs every year than academia can absorb into teaching positions, the green card process is effectively subsidizing postgrad programs, where the product is Indians and Chinese (primarily) who are desperate to get a high paying job in industry. Look up the numbers if you’d like.

At a societal level this is disastrous as it means that we have many foreign born who disproportionately hold the highest paying positions in society. Prosperity gospel and American exceptionalism aside this just leads to mass discontent and nationalism. It’s hard to argue they are wrong - why should a native born American allow a foreigner to take the highest paid jobs if given a choice? This leads (among other ways the country ignores and marginalizes poor native born citizens) to the election of people like Donald Trump.

This is not to say foreigners are bad - they often come from hard places where life isn’t easy. The real villains in this story are the aristocrats of the education empire. “More money for me and fuck everyone else,” seems like a common refrain these days.


There are several misconceptions here. First, foreign students rarely can apply for green cards, since student visas are not dual/immigrant intent. Instead the process usually starts after they get a job. And you can get fired from the job regardless of whether you have a BS or PhD. Also, most PhD students have a stipend and tuition waiver paid for by the university in exchange for teaching and research duties. Almost no PhD student pays the university a dime. MS is another story.

There is a demand for people with scientific and data skills and not enough natives hold these qualifications. However, I do agree that the education system is broken and the main reason why native born citizens don’t pursue higher studies is that they are usually under the burden of large educational loans.

There is also a smaller issue that liberal arts is a common choice for American students, which makes many of them unemployable. However, perhaps due to cultural pressure, many Indians and Chinese end up in science and engineering, which are in very high demand.


This just devalues the PhD because the smartest native minds are skipping it, and images of successful leaders without one are displayed everywhere, which stops everyone aside from foreign students from obtaining one.

I can't think of one successful tech leader with one. I'm sure there are many but they stopped being included when they discuss successes.


Larry and Sergey were PhD students when they started Google

I think it’s not just devaluing the PhD, it’s changing it from something that gets you a job in research, hopefully academia, to in many cases just another credential to help you in the immigration process and getting a coveted American job.


biotech startups often have PhDs as founders and leaders.


Know nothingism redux. Maybe those native born Americans should have made cars that didn't suck.


Boohoo. What exactly is your complaint? You did a PhD, learned advanced skills that made you more attractive to a company, and now you are reaping the benefit. Do you think you should have been given the same salary as a PhD? Come on.


Would you please not post personally nasty comments to HN? We're trying for a bit better than that here.

https://news.ycombinator.com/newsguidelines.html


> I have not adjusted my lifestyle

That's one gentle way of describing conspicuous consumption.


I'm not an expert, but I've read highly-cited ML papers where the researchers barely bothered with hyperparameter search, much less throwing a few million dollars at the problem. You can still get an interesting proof of concept without big money.

And low resource computing is more theoretically and practically interesting. I've heard experts complain of some experiments "they didn't really discover anything, they threw compute at the problem until they got some nice PR." This was coming from M'FAANG people too so it's not just resentment.


Are you sure they barely bothered, or did they just not mention it? I have heard stories of lots of “cutting edge” ML research actually just being the result of extremely fine hyperparameter tuning


This is a worryingly good question. Most ML papers represent real results, we're not going to see a replication crisis in that sense, but I've heard some fears about another AI winter arriving when people realize that the sum of our reported gains vastly exceeds our actual progress.

Hyperparameter tuning is one big concern here; we know it provides good results at the cost of lots of work, so there's a temptation to sic grad students on extensive tuning but publish on technique instead. Dataset bias is another, since nets trained on CIFAR or ImageNet keep turning out to embed database features.

Ironically, I'm not sure all this increases the threat of FAANG taking over AI advancements. It sort of suggests that lots of our numerical gains are brute-forced or situational, and there's more benefit in new models work than mere error percentages would imply.


But I do AI and I work at a university. Developing AI might be too expensive if you're going for larger architectures and eking out an extra % or two in a Kaggle-like problem. Most advances in machine learning to come, though, are in fields where there has been little activity. I'm currently in the process of making a career out of using very basic machine learning methods and applying them to physical science problems because 95% of the people in the field don't know how (problem of tenure). This opens up lots of opportunities for funding though. NSF etc. will literally just throw money at you if you say AI and that you'll apply it to any problem.


May I ask at what stage of your career you are? I am a (confused) grad student working on “ML for physical sciences” and would really appreciate some advice on career directions.


> I'm currently en process of making a career out of using very basic machine learning methods and applying them to physical science problems...

Interesting! How do you find customers?


Oh, the career is in academia and research.


I call BS.

Companies spend a lot of money on AI because they have a lot of money and don't know what to do with it. Companies lack creativity and an appetite for riskier and more creative ideas. That is what Universities must do instead of trying to ape companies. The human brain doesn't use a billion dollars in compute power, figure out what it is doing.

Sort of by definition, it can never be too costly to be creative. Only too timid. And too unimaginative.


+1 on calling this BS, even though I think it is only partly BS:

While it is true that training very large language models is very expensive, pre-trained models + transfer learning allow interesting NLP work on a budget. For many types of deep learning a single computer with a fast, large-memory GPU is enough.

It is easy to underappreciate the importance of having a lot of human time to think, be creative, and try things out. I admit that new model architecture research is helped by AutoML, like AdaNet, etc., and being able to run many experiments in parallel becomes important.

Teams that make breakthroughs can provide lots of human time, in addition to compute resources.

There is another cost besides compute that favors companies: being able to pay very large salaries for top tier researchers, much more than what universities can pay.

To me the end goal of what I have been working on since the 1980s is flexible general AI, and I don’t think we will get there with deep learning as it is now. I am in my 60s and I hope to see much more progress in my lifetime, but I expect we will need to catch several more “waves” of new technology like DL before we get there.


> The human brain doesn't use a billion dollars in compute power, figure out what it is doing.

This may not be true, if we’re talking about computers reaching general intelligence parity with the human brain.

Latest estimates place the computational capacity of the human brain at somewhere between 10^15 and 10^28 FLOPS[1]. The world's fastest supercomputer[2] reaches a peak of 2 * 10^17 FLOPS, and it cost $325 million[3].

To realistically reach 10^28 FLOPS today is simply not possible at all: If we projected linearly from above, the dollar cost would be $16 quintillion (1.625 * 10^19 dollars).
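For anyone who wants to check the arithmetic, here is a minimal sketch of that linear projection in Python. It uses only the figures quoted in this comment (2 * 10^17 peak FLOPS and $325 million for the supercomputer); the function and variable names are made up for illustration, and the assumption that cost scales linearly with FLOPS is the comment's simplification, not a real pricing model.

```python
# Figures quoted in the comment: the supercomputer's peak throughput and cost.
SUMMIT_PEAK_FLOPS = 2e17
SUMMIT_COST_USD = 325e6


def linear_cost_usd(target_flops: float) -> float:
    """Naive projection: dollars scale in direct proportion to FLOPS."""
    return target_flops / SUMMIT_PEAK_FLOPS * SUMMIT_COST_USD


# The two ends of the quoted 10^15 to 10^28 FLOPS range for the brain.
low_estimate = linear_cost_usd(1e15)
high_estimate = linear_cost_usd(1e28)

print(f"10^15 FLOPS: ${low_estimate:.3e}")
print(f"10^28 FLOPS: ${high_estimate:.3e}")
```

Under the same linear assumption, the low end of the range comes out at roughly $1.6 million, which is why the choice of estimate matters so much here.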

So, when it comes to trying to replicate human intelligence in today’s machines, we can only hope the 10^15 FLOPS estimates are more accurate than the 10^28 FLOPS ones — but until we do replicate human level general intelligence, it’s very difficult to prove which projection will be correct (an error bar spanning 13 orders of magnitude is not a very precise estimate).

P.S. Of course, if Moore’s law continues for a few more decades, even 10^28 FLOPS will be commonplace and cheap. Personally, I am very excited for such a future, because then achieving AGI will not be contingent on having millions or billions of dollars. Rather, it will depend on a few creative/innovative leaps in algorithm design — which could come from anyone, anywhere.

[1] https://aiimpacts.org/brain-performance-in-flops/

[2] https://en.m.wikipedia.org/wiki/TOP500#TOP_500

[3] https://en.m.wikipedia.org/wiki/Summit_(supercomputer)


The dirty secret though is that AI isn't doing anything nearly comparable to what a whole human brain is doing. It's performing the functions of perhaps a small subset of NNs in the brain or maybe the equivalent of what a small rodent's brain is capable of doing. Obstacle avoidance, route planning, categorizing objects in vision, even language related functions are more of a mapping than an understanding. I think the point still stands that we're making very inefficient use of the hardware that we have. Universities need to be smarter about this and figure out how such a limited network of squishy cells does all the things it does. That's the whole point of concentrating smart people in an environment where they're given the freedom to pursue ideas without worrying about whether or not it generates a profit. You learn the 'how' and the 'why' rather than just the 'what makes money'.


Most AI is doing overly complicated versions of plain old decision trees or just pattern matching.

Or in the worst cases they are presenting one thing, and really just relying on hundreds or thousands of people in Bangalore to pore through the data sets and tag and categorize.


"Dirty secret" is a weird term for something practitioners and researchers are trying to tell anyone who will listen. It's a dirty secret on the marketing side.


Of course, and we’re already seeing incredible results, even from size-compressed deep neural networks running on custom acceleration hardware embedded now in most major smartphones.

I was simply responding to the parent post’s false claim (“The human brain doesn’t use a billion dollars in compute power, figure out what it is doing.”), in isolation from the rest of the post (which I generally agree with).


The higher numbers there (eg. 10²⁸) are irrelevant. It claims a single neuron is operating at 10¹⁶ operations per second—that is, as much computation is happening within a single neuron as is the sum of all computation between neurons in the whole brain!

Bostrom's estimate of 10¹⁷ is much, much more reasonable.

Note that this is still a number biased in favour of the brain, since for the brain you are measuring each internal operation in an almost fixed-function circuit, and for Summit you are measuring freeform semantic operations that result from billions of internal transitions. A similar fixed-function measure of a single large modern CPU gives about 10¹⁷ ops/s as well; the major difference is that a single large modern CPU is running a much smaller amount of hardware many times faster, and uses binary rather than analogue operations.


While I agree that 10^17 seems like a more accurate number, don’t forget that each neuron contains ~10^5 synapses which all process at timescales of 10s of microseconds. This gives you an additional factor of 10^9


The 10¹⁷ includes that factor. 10¹⁰ neurons by 10⁵ synapses by 10² Hz. It seems unlikely to me that the meaningful temporal resolution is going to be 1000x that of the firing rate, but if you want to add a factor of 10 or so I wouldn't object.
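As a rough sanity check on that product, here is a small Python sketch. The three inputs are the order-of-magnitude figures from this comment, the 2 * 10^17 supercomputer peak is the number quoted earlier in the thread, and the `temporal_factor` parameter is a hypothetical knob for the "factor of 10 or so" mentioned above.

```python
# Order-of-magnitude inputs taken from the comment above.
NEURONS = 1e10
SYNAPSES_PER_NEURON = 1e5
FIRING_RATE_HZ = 1e2

# Supercomputer peak, as quoted earlier in the thread.
SUMMIT_PEAK_FLOPS = 2e17


def brain_ops_per_second(temporal_factor: float = 1.0) -> float:
    """Synaptic events per second, with an optional extra resolution factor."""
    return NEURONS * SYNAPSES_PER_NEURON * FIRING_RATE_HZ * temporal_factor


baseline = brain_ops_per_second()
generous = brain_ops_per_second(temporal_factor=10)

print(f"baseline: {baseline:.1e} ops/s")
print(f"with 10x temporal factor: {generous:.1e} ops/s")
print(f"baseline relative to supercomputer peak: {baseline / SUMMIT_PEAK_FLOPS:.2f}")
```

This only multiplies the quoted figures together; it says nothing about whether a synaptic event and a floating-point operation are comparable units.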


Neurons also seem to be doing local, protein-based computations.


There's a "tiny" secret - the beautiful power and plasticity of the human brain comes from it being a part of a physical body.

I recommend checking out some Antonio Damasio books for a fascinating read on this topic.


This sounds interesting.

I'm guessing you mean that much of the power of the human brain comes from its ability to interact with its environment?

I've been noticing more and more of a trend in recent years to treat the brain as separate from the body, and as if it is "trapped" in the body.


More than interaction, it's a living organisation in the service of life.

Damasio has a very interesting take on how emotions, consciousness etc. emerge from the way the brain & body together process information from the "external" world.


> if Moore’s law continues for a few more decades

Moore's law has already been dead for years.


Where'd you pull that out? Yeah, Intel has dropped the ball and stopped innovating, but TSMC and others still have us on track.


I also think this is kinda ridiculous. If anything I feel like the big consumer tech companies are disadvantaged because they need to deploy something that can scale to a billion people. They can only spend pennies per customer because the margins are so low. Sure they will have a research team for marketing purposes but when it comes to deployments they aren't doing anything too fancy.

Companies with higher profit margins per customer are doing much more novel work, from what I have seen.

All this is to say, I don't see universities getting shut out anytime soon. The necessary compute to contribute is pretty cheap and most universities either have a free cluster for students or are operating with large grants to pay for compute (or both).


> big consumer tech companies are disadvantaged because they need to deploy something that can scale to a billion people

That's a pretty small group of companies: FAANG + maybe five more. So for everyone else, scaling is not an issue in the way you have described.


I don't think you have the right numbers in mind talking about the compute you need for AI. The prices are getting lower and lower of course, but you still need tons of money to train the kind of networks that make the news.


The resources academia has can be pretty big also. The Top 500 doesn't have a lot of corporate machines on it.


I don't think you have the right numbers in mind talking about the networks used in academic works. The majority of networks used in publications are good old references like VGG.


Example? I have yet to see something actually deployed by one of the big tech companies that could not be trained by students on a university cluster. I also think you underestimate grant funding. I worked at a state school a couple years back in a research lab that had over a million dollars in grant funds specifically for equipment and outside compute (not for salaries or new hires) and this is not at all abnormal.


State of the art CV models (image, not video) can cost 3-figure dollars per training run.

State of the art language models can cost 5-figure dollars per training run.

There are a lot of variables in play here so your mileage will definitely vary (how much data, how long are you willing to wait, do you really need to train from scratch, etc) and these should only be considered very rough ballpark numbers. However, those are real numbers for SotA models on gold-standard benchmark datasets using cost-optimized cloud ML training resources.

At 5-figures per training run, the list of people who can be innovators in the LM research space is very small (fine-tuning on top of a SotA LM is a different, more affordable matter).


Sure but 3 figure and 5 figure runs certainly do not eliminate universities (see my above comment). Not to mention as I have said, most good universities will have clusters capable of training these that they maintain on premise drastically reducing that cost (and in a worst case just take longer to train).


It really does. You've got to remember that a good SotA paper takes hundreds of training runs, at least.

I can't go into detail about budgets, but suffice to say if you think $1M is a university compute budget that lets you be a competitive research team on the cutting edge, you are __severely__ underestimating the amount of compute that leading corporate researchers are using. Orders of magnitude off.

On-prem is good for a bit until you're 18 months into your 3 year purchase cycle and you're on K80s while the major research leaders are running V100s and TPUs and you can't even fit the SotA model in your GPUs' memories any more.

Longer to train can mean weeks or even months for one experiment - that iteration speed makes it so hard to stay on the cutting edge.

And this is before considering things like neural architecture search and internet scale image/video/speech datasets where costs skyrocket.

The boundary between corporate research and academia is incredibly porous and a big part of that is the cost of research (compute, but also things like data labelling and staffing ML talent).


Your goalposts moved a few figures. Furthermore, $1 million+ was not a university compute budget - that was money for a single lab on campus (at a general state school nonetheless) on a specific project.

You still have yet to provide any concrete sources to back up your claims. We're talking about contributing to research here. If multi-million dollar training jobs are what it takes to be at the cutting edge you should be able to provide ample sources of that claim.


- "Some of the models are so big that even in MILA we can’t run them because we don’t have the infrastructure for that. Only a few companies can run these very big models they’re talking about" [1]. NOTE: MILA is a very good AI research center and, while I don't know too much about him, that person being quoted has great credentials so I would generally trust them.

- "the current version of OpenAI Five has consumed 800 petaflop/s-days" [2].

- Check out the Green AI paper. They have good numbers on the amount of compute needed to train a model, and you can translate that into dollar figures.

- https://medium.com/syncedreview/the-staggering-cost-of-train.... NOTE: That XLNet number has to be wrong - it should be 5-figures, not 6.

I'm not an expert in on-prem ML costs, but I know many of the world's best on-prem ML users use the cloud to handle the variability of their workloads so I don't think on-prem is a magic bullet cost wise.

$1M annually per project (vs per lab) isn't bad at all. It's also way out of whack with what I saw when I was doing AI research in academia, but that was pre deep learning revolution, so what do I know.

Re: the moving goalposts - the distinction is between the cost of a training run and the cost of a paper-worth research result. Due to inherent variability, architecture search, hyperparameter search and possibly data cleaning work, the total cost is a couple orders of magnitude more than the cost of a training run (multiple will vary a lot by project and lab).
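To make that distinction concrete, here is a minimal back-of-the-envelope sketch in Python. The per-run costs are just the bottom of the "3-figure" and "5-figure" ranges mentioned upthread, and the run counts (100 and 1,000) are illustrative stand-ins for "hundreds of training runs" and "a couple orders of magnitude", not measured budgets from any lab.

```python
def research_cost_usd(cost_per_run: float, runs: int) -> float:
    """Total compute cost of a result that needs `runs` training runs."""
    return cost_per_run * runs


# Illustrative lower bounds for the per-run ranges quoted in the thread.
PER_RUN_USD = {
    "CV model (3-figure run)": 100,
    "language model (5-figure run)": 10_000,
}

# Hypothetical run counts covering search, tuning and reruns.
RUN_COUNTS = (100, 1_000)

for name, per_run in PER_RUN_USD.items():
    for runs in RUN_COUNTS:
        total = research_cost_usd(per_run, runs)
        print(f"{name}, {runs} runs: ${total:,.0f}")
```

Even at the bottom of the 5-figure range, a hundred runs already lands at seven figures, which is the gap between "cost of a training run" and "cost of a paper" described above.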

I understand why you don't trust what I'm saying. I wish I could give hard numbers, but I'm limited in what I can say publicly so this is the best I can do.

[1] https://medium.com/syncedreview/yoshua-bengio-on-the-turing-...

[2] https://openai.com/blog/how-to-train-your-openai-five/


> The human brain doesn't use a billion dollars in compute power, figure out what it is doing.

Perhaps figuring out what it is doing itself costs billions of dollars?


What company would pay for this kind of research without expecting some sort of profitability from it? I imagine Facebook is trying to figure out how the brain works, but I hope they don't get very far.


If you are the kind of person that might be able to figure out what the brain is doing, chances are a company is going to pay you better and provide more resources to you than just about any university.


But will they pay you to figure out what the brain is doing, or what the brain is doing that increases margins?


Depends on the resources. Amazon undoubtedly has much more compute, but they’re not exactly known for their wet lab facilities, and this question undoubtedly needs both.

That said, I think you are right that academia’s structure may need to change. Right now, we’re locked into a model where projects mostly need to be doable by a handful of researchers (almost entirely trainees) in a few years. Other than these time-limited positions, there’s not a lot of room for skilled individual contributors, which seems goofy when tackling such a hard problem.


That would be one of the best one-time investments made by humanity then, but I don't think throwing X billions at the problem can solve it.


>That would be one of the best one-time investments made by humanity then

This has been done for thousands of years; we are not the only generation to find it important to understand how we work.


Although this is technically true, being creative/original and being academically successful might not be the same thing.

There will still be a genius discovered in the wild, but an institutionalized effort might make the discovery process quicker and more efficient.

And then there's the correlation between rich kids' test scores and their parents' income.

I want to agree with your sentiment, but the cynical side of me says that sometimes banality wins.


Most of the human brain is analogous to Boston Dynamics feedback control systems, so yes, there's billions of dollars in research


Boston Dynamics' work seems very cool, but I don't see a big market for it until AI makes more progress and the price of such hardware drops significantly. Even then it seems like a novelty.


>The human brain doesn't use a billion dollars in compute power, figure out what it is doing

Phahahaha, this gave me a good laugh :) To find out how your brain works, guess what you are going to use: the brain itself. It's like trying to cut a knife with itself, or trying to use a weigher to weigh itself.


>to use a weigher to weight itself.

Flip it upside down. Like he said, be creative.


Or use 2. That's what we do, use one brain to understand another brain.


Well then don't find out how your brain works, find out how a brain works then.


I'm pretty confused. What else would you like us to use?


I don't know if we can use anything at all for this. More so I assume that engineering and math are not the right approaches in the same way they are not the right approaches to things like humor, poetry or design.


Might as well figure out how to make energy using sun and water like plants at the same time.

If there is any spare time that week, how to make a spider's web from eating flies as well.


My conclusion after reading many papers in the Natural Language Processing field (which is now all about machine learning) is that, generally, company papers focus on tweaking pipelines until they have increased the accuracy score. If they have done so they quickly publish this result and leave the analysis of their results for others. (BERT[1] is a prime example of this.) However, I do not agree with the fact that companies lack creativity. If you look at all the wide research currently being undertaken at the big tech companies you will be amazed by their scope. (I found this out by coming up with 'new' ideas during my thesis, only to find out that some researcher at some big tech company was already working on it.)

[1] Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).


There is a far more obvious problem than the cost of computing - the cost of labor. If you are a deep learning researcher, you can join any of these companies and multiply your salary 5x.

Compute constraints are relatively straightforward engineering and science problems to solve. The lack of talent, that seems like the bigger story.


As an ML researcher without a masters or PhD that’s not necessarily true. Companies reach out to me and say things like “we are really excited to talk to you about opportunities, especially given your 4 years of production ML model work and research”

Then right before interviews I hear, “well we like you, but didn’t realize you didn’t have a PhD. We have a really awesome software engineering / machine learning engineer opening that’d be great for you. That’s what we’ll do the interviews for”

Cool. Probably happens 2/3 of the time, honestly.

Just ask me about my patents, research, and projects...

Anyway, point being you can’t 5x with a B.S. in C.S. That’s why there’s a “lack” of talent.


A lot of the PhD requirements actually come from the grants and R&D tax benefits these guys are after. For a lot of those you have to justify that you have the right workforce.


If you are talking about the R&D tax credits there is nothing that says you have to have a PhD. We get a bunch of these every year and we only have one person with a PhD that is part of the research.


If you have a reputation in your field for good research and top-tier publications, like a PhD would, then I'm really surprised that you are being treated any differently.

There are some famous computer science researchers with no PhD, but they do have the papers to back themselves up.


I think they normally would, but I’m pretty young (27) so they don’t assume I have that. The HR folks who set this up are typically checking boxes and don’t realize equivalencies.

My usual tactic is to mention the patents and send some of my work and they reconsider.

In either case, my general point was that the 5x salary bump is pretty much reserved for an artificially set subset of people.


How did you get 4 years of experience in ML at 27? care to share?


Probably technically it’s been longer, but I built neural networks in OpenCL in 2014 (22 years old). That was for a research project at university.

In 2015, I started building ML models full-time at two large companies. First while in school, as a contractor, then I joined full-time after school.


> How did you get 4 years of experience in ML at 27? care to share?

Presumably they've been working in ML since they were 23? What's so crazy about that?


This depends on the company but my understanding is that in most tech companies, there's not much distinction between being an ML engineer and being a research scientist/engineer with a Ph.D, in terms of the type of work or compensation, provided that you aren't in a pure research role (i.e. you were hired because your research is highly regarded by academic peers in a narrow field and get to do something close to original research with fellow academics). Most Ph.Ds aren't qualified for the latter type of positions either and the number of non Ph.Ds that are qualified would be entirely negligible.


Hasn't that been the case for decades though? Maybe not those exact numbers, but from what I understand, STEM profs could always go into the private sector/government and make more money. AI profs were probably the exception a few decades ago; they're just catching up now that it has gotten good enough. EDIT: By 'good enough' I mean that it's progressed to the point where it can actually be used for economic gain in a company/product, where before it was just a money pit.

Long gone are the days when your patron king or queen would fund you handsomely just for doing math.


Before AI was hot they were like other CS professors, maybe they could double their salary by going into industry. And give up the academic freedom to do whatever the hell you want as far as research is concerned, the joy of teaching for those who care, the unshakable job security of tenure, the option of doing the bare minimum to avoid getting fired once you have tenure, summers off.

Then their field got hot and suddenly people are not offering them double their academic salary in industry. They’re offering decuple their academic salary.

Real demand calls forth supply.


Yeah, but at university you are more free to pick your project and can be much more nimble - so it balances out.


You might be shocked to find out what the job is like for most professors. Grants, grants, and grants.


and travel, travel, travel, and submit, submit, submit papers.


Our AI lab is mainly experimental at this point, but we're a company so we're looking at how to apply certain models to the large amount of data in our warehouse and how it could be surfaced in multiple ways.

We've also thrown a lot of money in this endeavor, so there's research and building a foundation with the underlying intent that this will be used to help the company.

I don't think that's too different than the university model of grants and research.


For me, that would maybe balance out if the salary difference was <2x. 5x? Forget about it.


Only a select few get a chance to truly pick their projects, and even then most need to apply for grants internally and externally to get the funds. Professors may also be tied to teaching, which cuts into the time for the research they find valuable. When accounting for this, lower purchasing power, and seeing that 'the grass is greener' for your private-sector peers, you are very likely to think long and hard about jumping ship.


Maybe if you're a big name researcher at a top university. Most people are 'forced' to work on whatever happens to be getting grant money that year. But you are more free with regards to how you approach a problem and what you have to deliver.

Fortunately AI is hot right now so there are lots of projects where you can get funding simply by saying I want to apply machine learning to X.


Sadly we as a country don't invest in CS research like we used to. Companies used to pay to have teams of developers go through training. Now in some places you are lucky if they even pay for training you do at home.


Same problem in Western Europe. In Eastern Europe companies train you before you start, with hands-on work or paid professional training. In the West, companies expect you to somehow be born a senior dev. :)


It may be a sign that CS is becoming a mature field. Aeronautical Engineering doesn't build experimental aircraft. Chemical doesn't build experimental refineries. If you want to do those things you work for Lockheed, Boeing, Exxon, etc.


This is demonstrably false; high-end research universities do exactly all these things. Stanford has a high-end fab. Caltech students build experimental aircraft. Universities build nuclear reactors for research. I didn't find any examples but I'm certain that Unis in Texas have small refineries.


There are a few "have" universities in each case, and mostly "have not" universities, which is consistent with the direction AI is going, according to the NY Times article.

Even then, the experimental aircraft at Caltech is not a full-scale prototype of the next generation of fighter after the F-35. How does Stanford's fab compare with TSMC's?

Edit: TSMC collaborates with four Taiwan universities on research and provides fab services for 23 universities. https://www.tsmc.com/csr/en/update/innovationAndService/case...


I hit this wall in my own Deep Learning-based automation business. I am now forced to rely on transfer learning most of the time: my Tesla/Titan RTX-based in-house "server" is no longer capable of training the latest and greatest models in reasonable time, cloud is out of the question due to costs, and (automated) parameter tuning with distributed training takes ages. I can still ask a lot for customized solutions, though I see the writing on the wall that it might not last too long (2-4 years) and I'd have to switch business, as there will be only a handful of companies able to train anything that is better than current SOTA.


But do you need SOTA for your business to be viable ?


Well, right now I can get within 1-3% of SOTA performance for the models I need, but I expect that in 2-4 years I'll be much farther away, with laughably outdated models nobody would want to pay for. Right now I can replace 100s of people in e.g. quality control, as they make e.g. 15% errors and I can bring that down to e.g. 8%. But later the big boys might be able to bring it down to 1% for $2M training cost and I wouldn't stand a chance.


I asked the same question after several AI talks given by large companies (Google, Nvidia, etc).

The general answer is that it's still possible to try out things on a single GPU or several servers and many gains come from good features and smart network designs. On the other hand, squeezing out the last 5% does require more data and budget.

Personally, I think you can still do a lot with a moderate budget and smart people. But would love to hear other opinions.


Look into the modern NLP models: BERT and its many derivatives, RoBERTa, XLNet. Training any of these requires roughly a TB of data and generally takes days on multiple TPUs. You often can’t even fine-tune on a single GPU without some clever tricks.


When compute becomes prohibitively expensive, people find opportunities in attaining better algorithms or cheaper compute.


Agree 100%. I spent decades working on improving models of protein folding and design, only to learn that recently, people are using information theory and evolution to produce much higher quality models without as much computation. There are a lot of things I've worked on in the past that used oodles of CPU which got replaced with reasonable, cheap approximations that are good enough to make progress.


“Better algorithms” - sounds like a good research topic!


Drat, scooped again...


Why do universities need to compete with big companies? The less they behave like corporations and more like places of learning, the better.


So tell me something....

Have AI techniques actually changed in the last 20 years, or is there just more data, better networking, better sensors, and faster compute resources now?

By my survey of the land, there haven't been any leaps in the AI approach. It's just that it's easier to tie real world data together and operate on it.

For a university, what changes when you teach? This sounds like researchers feeling like they can't churn out papers that are more like industry reports vs advances in ideas.


There hasn't been any kind of paradigm shift but there have been a bunch of real although incremental improvements. Improved optimization algorithms, interesting and novel neural network layers.

Some of this is even motivated by mathematical theory, even if you can't prove anything in the setting of large, complex models on real-world data.

The quote from Hinton is something like, neural networks needed a 1000x improvement from the 90s, and the hardware got 100x better while the algorithms and models got 10x better.


So I guess this kind of leads to my original point. If you're a school, nothing has really changed that would require you to invest gobs of money to teach AI. It only matters if your idea of research is trying to create something you can go to market with.

So basically every graduate chemistry program.


Indeed. The academic theory for NNs has been there since before the 90s, and is solidly grounded in a mathematical framework. Whatever new techniques arose after 2010 are a) empirical results obtained by semi-trial-and-error and b) unexplainable mathematically.


While it's true that training a new DL model requires lots of computation power, I personally feel that the activity mentioned in the article is more an "application" of ML than "research". I personally think universities should move in the direction of "pure" research instead.

For example, coming up with a new DL model that has improved image recognition accuracy would mean it has to be trained on millions of samples from scratch, which requires a lot of time and money. But I'd argue that such a thing is more of an "application" of DL than "research". Let me explain why. Companies like FAANG have an incentive to do that because they have tens or hundreds of immediate practical use cases once the model is completed; there is a clear monetary incentive for completing it, hence I call such activity an "application" of ML rather than "research". What about universities? What incentive do they have for creating a state-of-the-art image recognition model other than publication? The problem is that publication can't directly produce the resources needed to sustain the research (i.e. money).

I think ML research in universities should move in the direction of "pure" research. For example, instead of DL, are there any other fundamentally different ways of leveraging current state-of-the-art hardware to do machine learning? Think of how people moved from approaches such as SVMs to neural networks. The neural network was originally a "pure" research project. At the moment of its creation it didn't take off because hardware wasn't able to keep up with its computational demand, but fast forward 10-15 years and it became the state of the art. University ML research should "live in the future" instead of focusing on what's being hyped at the moment.


The article presents the rising costs couched within the theme of all-too-powerful tech companies like Google and Facebook, which is really irrelevant: these costs are not high because of those companies, they are high because the research itself is incredibly resource intensive, and would be so whether or not large tech companies were also engaged in it. In fact, with Google and their development of specialized chips for this purpose, AI research is probably getting cheaper due to their involvement.

Next, this research will probably continue to get cheaper. The cost to do the Dota 2 research 5 years ago would have been much higher, and will probably be even less expensive 5 years from now.

Also, I think there's plenty of room for novel & useful work at the bottom end, where $millions in compute resources are not essential. Cracking AI Dota is certainly interesting, but it's hardly the only game in town, and developing optimized AI techniques specifically for resource-sparse environments would be a worthy project.


Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be within the reach of any University department, and at least Nvidia also seems happy to sponsor University setups (after all new research creates demand from the industry).

Sure, this doesn't compete with Google's data-centers. But that's assuming Universities are for some reason competing against private industry. That's not how any other engineering discipline works, so it's a bit odd to just assume without discussion.


"Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be within the reach of any University department"

That was funny - however not even close to reality. I have to work on a GTX 1080 (not TI)...


A month ago Nvidia had a grant program running to get rid of refurbished^w^w^w^w donate 1-4 Titan Vs based on a 1-2 page application [1]. When my university started offering a CUDA course we got ~15 top of the line GTX cards sponsored by Nvidia. Buying 100 GTX1080TI with 11GB with supporting hardware is in the range of 100 thousand Euro/USD (before applying education discounts and asking for sponsorships). Not money spent on a dime, but not outrageously expensive either (the article mentions OpenAI spending millions on cloud GPU resources, compared to that spending 100k on something you get to keep is nothing)

[1] https://developer.nvidia.com/academic_gpu_seeding


Why can't they afford it? I remember when I worked at a physics lab at University and each team had many pieces of equipment each costing more than $100k. You'd get quite a lot of compute for that kind of money, especially since AI research doesn't need any other equipment.


I get you can’t do all your work on them, but 24 hours of 16 GPUs is $215, using the newest instance type on-demand.

It’s within the reach of many grants to afford a few scaled runs of a technology as a demonstration of behavior at scale.
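
To put the numbers in this subthread side by side, here is a rough back-of-the-envelope sketch. It assumes the $215 for 24 hours of 16 GPUs quoted above and the roughly 100k for 100 consumer GPUs quoted upthread, treats a cloud GPU and a consumer GPU as interchangeable (they are not), and ignores power, cooling, admin time and discounts. The function names are made up for illustration.

```python
def gpu_hour_price(total_cost, gpus, hours):
    """Price of one GPU for one hour, given the cost of a whole run."""
    return total_cost / (gpus * hours)


def break_even_days(purchase_cost, owned_gpus, price_per_gpu_hour):
    """Days of fully utilised owned hardware that the same money buys in the cloud."""
    cloud_gpu_hours = purchase_cost / price_per_gpu_hour
    return cloud_gpu_hours / owned_gpus / 24


# Figures taken from the comments above; everything else is an assumption.
price = gpu_hour_price(215.0, gpus=16, hours=24)
days = break_even_days(100_000.0, owned_gpus=100, price_per_gpu_hour=price)

print(f"on-demand price per GPU-hour: ${price:.2f}")
print(f"100k of cloud time equals about {days:.0f} days of 100 GPUs running flat out")
```

Under those assumptions the cloud wins for a few scaled demonstration runs, while owned hardware only pays off if it is kept busy for months.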


AI is now a data problem, and to some extent an optimisation problem; both should or could be solved at the commercial end of research. Universities should be more focused on what's next. I'm not saying this is not "the next", but since the ground-level ideas of AI are 50-60 years old, if not more, current ground-level research should build the theoretical platform for the technology that is going to come in 20-50 years.


You need all four quadrants: risk capital, industry, academics, and that elusive X-factor. The hard work of AI/ML theory, such as issues around generalizability and ethics, is still done around whiteboards and academic conferences.

A more useful metric may be the proportion of proprietary versus open discovery. I don't know if I can point to a single example where researchers have not rushed to put their latest breakthroughs on OpenReview or Arxiv. Even knowledge of a technique, without the underlying models or data, is enough to influence the field.

Academic free inquiry and intellectual curiosity look very different from product-focused, solutions-oriented corporate R&D. A good working example looks something like Google AI's lab in Palmer Square, right on the Princeton campus. Researchers can still teach and enjoy an academic schedule. I think it was Eric Weinstein who said something to the effect that if you were a johnny-come-lately to the AI party, your best bet would just be to buy the entire Math Department at IAS! In practice, it's probably easier to purchase Greenland ;)


I don't quite understand the issue here. I thought the main reason for the many recent breakthroughs in AI was that hardware has become cheaper and more powerful. Anyone can train a neural network on the graphics card of their home PC now. There are powerful open source frameworks available that do a lot of the heavy lifting for you. You can do far more today than you could back when I was in AI.

Of course the Big Tech companies have far more resources to throw at it; that's why they're Big.

A far more serious issue than access to computational power, is access to suitable data, and particularly the hold that Big Tech has on our data.


It seems like a structural problem. Deep learning generally performs better than anything else that is well known but it also has well known limitations and inefficiencies.

People should question all of the assumptions, from the idea of using NNs to the particular type of NN and all of the core parts of the belief system, because some of these aspects are fixed on faith more than anything else.

If you want efficiency of training, adaptability, online learning, generality, and true understanding, those assumptions might need to go. That would not mean you couldn't learn from DL systems, just that the core structures would not be fixed.


Is compute the limiting factor or data?

I see an asymmetry between academia and industry. Academia has the models, industry has the data. Compute is more balanced because it's usually commodity hardware.

If industry is outpacing academia in research, I think that means data is the more valuable quantity, not compute.

And the article's theme of concentration is more a problem with data. Is Facebook dominant because of its algorithms or because of its database? If other companies had Google's index and user telemetry could they not compete with a rival search algorithm?


Maybe this is a good time for university researchers to develop AI algorithms that are not so data- and compute-hungry. Here's a promising bit of that: https://www.csail.mit.edu/news/smarter-training-neural-netwo.... This is easier said than done, but necessity is the mother of invention, they say.


Maybe it's time to start triaging/incentivising what kind of problems we spend that computational power and CO2 credits on.

Playing in the sandbox with AI (especially the "brute force" deep learning algos) does not in and of itself equate to _intelligence_ or _progress_ for us as a civilisation.


AI is not expensive necessarily

What's expensive is thinking AI evolution means deeper networks and the only way to get better results is by throwing more GPUs at the problem.

And to be honest, those with "infinite" resources are a bit "guilty" of pushing these research lines.


I don't see the problem. Universities don't solve problems by simply throwing more dollars at them. If compute resources are scarce then researchers are incentivised to invent machine learning techniques that use those resources more efficiently.


The point about the increase in computational resources reminded me of Gary Marcus's latest book. It seems like the trend away from research into generalised AI models, towards narrower and narrower domain-specific ML models that consume more and more energy and data, is becoming a growing bottleneck.

I think instead of trying to build larger computers there is an opportunity for academia to move back towards the construction of cognitive models and minimizing the reliance on computation and data. That's what intelligence is supposed to be all about.


You hit the nail on the head. AI is nowadays defined as a very sophisticated form of curve fitting [thanks to TensorFlow, among others]; that's what training ANNs comes down to. There is clearly a lack of intelligence, e.g. the [main] method used to beat Lee Sedol was MCTS (with the help of some ANNs).
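
For what it's worth, the "curve fitting" framing can be made concrete with a minimal sketch. This assumes plain full-batch gradient descent on a mean squared error loss, fitting a straight line in pure Python; it is a one-weight, one-bias toy, not a real ANN, and the names `fit_line`, `lr` and `steps` are invented for illustration.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

Training a deep network swaps the line for a much larger parameterised function and the hand-written gradients for automatic differentiation, but the loop has the same shape.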


University isn't utopia. Not everyone wants to live there. Also, there aren't enough university research dollars to adequately fund AI research relative to the current demand.

This is a good thing.


Hasn't it always been like this? When the Model T came out it was concentrated even more I would assume. What other way could it be?


Not super knowledgeable about deep learning, but is some form of distributed computing possible here, or is the amount needed just too high?


Yes, distributed computing is sometimes possible for machine learning. However, distributed computing typically makes computations faster or bigger, not cheaper.
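
A toy cost model of why that is, as a sketch only. It assumes a job that needs a fixed number of GPU-hours on one device, a made-up parallel efficiency factor (communication overhead means N workers rarely give an N-fold speedup), and a flat price per GPU-hour. `distributed_run` and its parameters are hypothetical names.

```python
def distributed_run(single_gpu_hours, workers, efficiency, price_per_gpu_hour):
    """Return (wall-clock hours, total cost) for a job split across workers.

    efficiency is the fraction of ideal linear speedup actually achieved,
    between 0 (exclusive) and 1 (inclusive).
    """
    wall_hours = single_gpu_hours / (workers * efficiency)
    cost = wall_hours * workers * price_per_gpu_hour
    return wall_hours, cost


# Illustrative numbers only: a 1000 GPU-hour job at 1.0 per GPU-hour.
single = distributed_run(1000.0, workers=1, efficiency=1.0, price_per_gpu_hour=1.0)
spread = distributed_run(1000.0, workers=10, efficiency=0.8, price_per_gpu_hour=1.0)

print("1 worker:   %.0f hours, cost %.0f" % single)
print("10 workers: %.0f hours, cost %.0f" % spread)
```

In this model the worker count cancels out of the cost, which comes to the single-device cost divided by the efficiency. Adding workers shortens the wait but never lowers the bill.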


Is there anyone who thinks we shouldn't break up these large companies on anti-trust grounds?

Imo it's got to the point where if you have a bad idea your startup fails, and if you have a good idea they will copy it and your startup fails. Even companies like Spotify and Slack, which have "made it", are now threatened by Google Music + Apple Music and by Microsoft Teams.

Would be interested to hear other opinions.


This could be said with anything, some people have more money than others.


Well, these AI calculations also emit a lot of CO2. It may be a luxury that we can only afford in moderation. The vision of a society where AI plays a large role may only be feasible, both ecologically and economically, if/when we have fusion energy.


I don’t buy that fusion is a prerequisite here, especially since just developing those reactors requires modeling on super computers.

Yes, we should be powering these things with renewables, but we should make cuts somewhere else before giving up on basic research.

I just can’t seem to relate to people who think we should all punish ourselves because society didn’t see climate change coming. Sure we need to do something, but it’s not practical for us to collectively put on hair shirts and wander into the woods.


I am not sure we should be punishing ourselves either, but when I look at how things are currently going in my country, it seems we already are. There are climate plans that basically involve 25 centimeters of insulation for every house and heating by heat pumps. At the same time, pretty much all building projects are currently on hold, not because of CO2 but because of nitrogen compounds. I am not really sure these are good things to do either. I am guessing maybe not. But in such a political climate, the quite large amounts of CO2 emitted by training deep learning networks do feel like a frivolous expense.


The idea that deep learning is meaningfully contributing to CO2 emissions seems dubious to me. Lots of data centers are powered by renewables. Furthermore, I think advances in AI are probably critical to climate adaptation. “Turn off the computers to save the planet” strikes me as Luddism.

We need to be switching off coal fired power plants not computers.


Training is expensive, but running trained models can be quite lightweight.


If you want to be that drastic, why not ban crypto mining that runs on non-renewables first? Also, Google Cloud is 100% renewable, right?


In the US transportation emits more greenhouse gases than the entire US electricity production. AI easily pays for itself if it makes transportation or industrial processes even marginally more efficient.


A lot of data center power is hydro and wind.


If only the grid worked like that. Case in point: Facebook's Fort Worth data center is there because of the wind farms in Jacksboro, TX. In reality, we really do not know whether the electrons are coming from Comanche Peak Nuclear, from any of the other NG generators in the vicinity, or from the wind farms.

They just use a nice contract system to make everyone feel good about it and to fund the wind farms; that is it.

With hydro it's more predictable, but even there, no guarantees.


This is so ridiculous. Of all the things that emit CO2, you’re focusing on AI calculations? Tell me you’re not serious.


I'm dead serious about this. I think we should weigh all large-scale compute against its carbon cost. Granted, FAANG is starting to look for renewable sources for its DCs, but US data centers alone will use around 73 billion kWh in 2020 [1].

How much of that is ML? As someone pointed out in another comment, how much ML is actually useful? Some ML is "carbon good", such as route planning saving energy. But do we really need to spend billions of kWh just to get slightly better recommendations? Do we really need to increase margins a fraction of a percent for some company to show more ads and sell more?

And while we're on the subject of power, maybe if web pages weren't 300mb of crap and 1k of content, we could cut back on another few billion kWh on servers and routers.

This shit is serious, we're dying here, so yes, absolutely, let's do the math about how much AI costs. It's up to us, the computer people, to ask these questions and solve the part of this problem that WE own.

1. https://eta.lbl.gov/publications/united-states-data-center-e...


> we're dying here

Who? Where? Why?


The point is the expenditure is unsustainable at the current rate, and we can do something about it.


So nobody is dying from this.

Are you doing something, personally, about this? My office is 100% solar, home about 20%, telecommute 100%; you?


It literally doesn't matter whether what powers it emits CO2 or not. Even if it's all wind, that energy could have gone somewhere else. We are wasting energy, by a lot. Fusion won't save us; there is already evidence that every extra bit of energy we produce gets used. Having more green energy doesn't reduce the fossil fuel energy consumed, it just increases the energy consumed overall. Don't waste energy.


Breaking news: some universities have more money than others.


I wonder how it compares to something like cryptocoin mining/transaction processing.


And so Deeplearningcoin was born.


Interesting idea: the blockchain is a sequence of gradient descent steps


I'm seeing less and less point in this whole AI thing.

I'm seeing companies spending tens of megabucks on idling GPU farms without a clear idea what to do with them.

Saw that when we did a subcontract on a datacentre for Alibaba. They had a huuge reception for all kinds of dignitaries, showing them CGI movies of their alleged AI supposedly crunching data right there in the DC, and all that with all of the hardware in the DC shut down...

The moment I cracked a joke about that during an event, there came dead silence, and faces on the stage started to turn red. The guy defused the situation with a joke, and the party went on.


The downvotes for you are really disappointing. Because before AI, there was the walled garden of HPC, with temple-like supercomputers to which gaining access involved magic incantations and politically savvy groveling from academics wishing to graduate from grad school in less than a decade. GPUs disrupted that garden in a big way. Or all of this has happened before(tm).

But in the past few years, the one-two punch of AI and Python rebuilt that garden. So now not only are the big guys controlling the training of the big AIs, but the insistence on describing those computations entirely through a Python abstraction is leading to insanely inefficient use of those GPUs (and coming soon, dedicated ASICs with even worse software libraries past Resnet, BERT and a few other acceptable strawmen).

That said, one can do a lot with AI/ML models that fit in a single sub $10,000 machine built entirely from consumer parts and doubly so if one is willing to profile and low-level optimize one's code as intensely as stare at the data going into the model (you're in grad school, you have time for this, you have all the time so to speak and it will save you lots of time in the long run). For inspiration, all the big guys have GPU codemonkeys on staff to micro-optimize as needed. One might want to take a cue from that and DIY.


AI has too perfect of a sales pitch and surrounding narrative for business people to treat it rationally. The fact that philosophers who don't understand AI are debating AI philosophy is strong evidence of that. A crash is less likely this time around because it's being funded by established companies instead of the public market, but there has to be a moment of reckoning for the baloney eventually... I hope.


>philosophers who don't understand AI

If you mean "philosophers who don't understand ML/gradient descent/linear algebra/statistics etc", I don't think many philosophers are debating about that stuff.

If you mean "philosophers who don't understand artificial intelligence", that would be all of them, and everyone else too, because no-one understands artificial intelligence yet. And a lot of the people who come closest to understanding it are in philosophy departments.


I more or less agree with your second point. Presently the philosophy of artificial intelligence is kind of like theology, it is molded by how people like to think more so than it is determined by reality. Sales pitches also tend to be molded by how people think. AI has that theological essence that makes people love it even if they don't understand it, and that makes it a lot easier to sell big GPU farms to companies that don't have a legitimate use.


> "without a clear idea what to do with them."

Of course they know what to do with them: show us more effective ads. That's the cutting edge of technological progress now.



