The thing is, even though I have 1000x the resources compared to university, that does not really make me happier about the work specifically. It makes some things easier and other things harder.
No, what I really feel is that at work I am not actually treated like a servant any more but like a person. I don't have to work weekends and nights any more. I can take vacations and won't be flooded with emails every.single.day during holidays. I don't have extra unpaid responsibilities that I have zero recourse against.
What I just cannot stop wondering about is why, knowing the perks of industry, advisers still treated us like this. Even though they were ostensibly trying to keep us, it felt more like squeezing out as much as they could.
It's cultural. Their PhD advisors treated them the same way. A PhD is effectively a hazing ritual required to break into academia.
The power your advisors wield over you is terrifying in graduate school. You are severely underpaid AND they have you over a barrel.
It's 1000 times better, especially when your research is clearly not on the cutting edge and you end up contemplating life choices because what you're doing is basically useless to the world, and you're only doing it because some funding body decided to throw money at it.
At least that's how it is for people whose primary research method is coding; such an easy exit. The PowerPoint-making variety have it harder, though.
I realize this isn't the case for everyone, and I realize my situation was odd, but... there's always a trump card: you can drop out. I get that for international students it's not always so simple (losing your student visa, having to go home) and can be a very hard choice.
My supervisor and my department occasionally tried to push for things that I believed were ridiculous, and the threat of dropping out seemed to work reasonably well to push back on that. My understanding is that a supervisor having a low completion rate is a black mark that can hurt them, especially if they're chasing tenure. I didn't pull this out like a petulant child stomping their feet when they didn't like their chores, but rather when things were getting stupid.
Tony: "This is totally unreasonable and the timeline doesn't work for me."
Supervisor: "This is how it must be. There's no other option."
Tony: "I could drop out and continue on with my life."
Supervisor: "Now wait a minute, let's see what we can do to make this work..."
Normally, during tenure decisions, the grad students and postdocs will be interviewed by the dean. If you work for a pre-tenure prof and decide to leave, meet with the dean and let them know your concerns before dropping out. Make sure to bring lots of documentation, such as emails requesting you to do inappropriate things.
I still look back and would have loved to do that research, but I appreciate his honesty.
In Germany a PhD is almost always done after an MS, and the PhD usually takes 4-6 years depending on the group.
Your definition is painfully on point.
And the perks of "getting into academia" are getting smaller with each day.
But hey, you get to have those 3 letters next to your name and probably can ask for a higher salary, that is, if you don't go back into the academic machine and the hunger games of chasing grants/contracts and eventually tenure (though my impression is that founding a successful startup is easier than getting tenure today).
The rewards for a successful startup are about 2 orders of magnitude higher than the rewards for tenure, too. The latter means that you have a job for life; the former means you don't need a job for life.
>The rewards for a successful startup are about 2 orders of magnitude higher than the rewards for tenure, too.
Massive [citation needed] on both these claims. "But my brother-in-law" is not evidence. It's not at all obvious that there are more successful startup exits than tenure positions. Also, founding a successful startup is largely a matter of luck, and being in the right place at the right time.
And for the first decade and a half it was great to make money but also possess the grounding to build things collaboratively with academics. But then AI happened, and the market inefficiency that academia created for itself, by paying 3/4 of us less than a high-school dropout with a trade skill, let tech suck us all up like a sponge to feed the AI craze.
And these days, the same craptastic culture that drove me out of academia is now the status quo in tech: pedigree; the PhD (I have one, but not from one of the 4 horsemen of AI or even one of the right schools, so it's nearly worthless apparently); and the coveted "applied scientist" title, which at least at AMZN means you get 15% higher comp to order both the scientists and engineers around because you convinced an L10 you were one, whatever your actual qualifications are. But I'm old now, and every day I am tempted to just throw in the towel and retire.
Commence those downvotes guys! None of what I just wrote could possibly be true in any way. Tech is clearly awesome and we just need to keep up the great work so we can hit that singularity sprint goal by 2030 or so to save Ray Kurzweil from squandering his hard-earned money on an Alexa-equipped toupee.
As of 2016, academic institutions hired 21,511 full-time, tenure-track professors. Your odds of getting tenure given that you're in a tenure-track position vary by institution, but seem to range from about 75% at elite private institutions to 90% at state schools. Figure about 18,000 professors were hired into positions that will eventually grant them tenure.
AngelList lists about 30,827 companies at the Seed or Series A/B/C stage. Crunchbase has 33,577 funding rounds that occurred within 2016. The former stat has issues in that it includes funding rounds over a broad time period, and the latter potentially counts multiple funding rounds for the same company; however, given their general consistency and the existence of funded companies that appear in neither AngelList nor Crunchbase, 30,000 companies funded per year seems about right.
The average tenured professor salary varies heavily by field, but the Chronicle of Higher Ed reports it at $104,820, Glassdoor at $70,385, PayScale at $71,993, Indeed at $66,921. There were 12,688 mergers & acquisitions in 2018, the vast majority of which were for undisclosed amounts (companies do not have to disclose the transaction if it's for < $75M). Given the distribution of the ones that are disclosed, it's a good bet that most of those are for between $1M-20M (I've heard $1M/engineer as the going rate for acquihires, where the team succeeds in delivering a product but the product fails in the marketplace). This is still roughly 100x what the tenured professor makes, though.
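For what it's worth, the comparison is easy to reproduce as a back-of-envelope calculation from the figures above; treat it as a sketch, since the "typical exit" value is my assumed midpoint of the $1M-20M range, not a sourced number.

    # Back-of-envelope using the thread's figures; typical_exit is an
    # assumed midpoint of the $1M-20M range, not a sourced number.
    tenure_hires_per_year = 18_000
    funded_startups_per_year = 30_000
    professor_salary = 100_000    # rounded from the figures quoted above
    typical_exit = 10_000_000     # assumed midpoint of $1M-20M

    print(funded_startups_per_year / tenure_hires_per_year)  # ~1.7x more openings
    print(typical_exit / professor_salary)  # ~100x a professor's annual salary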
So, it’s clear more people get funding than tenure. However, difficulty really is a function of the success rate of each track and the amount of investment people make in it. Still, for the average college freshman, tenure is likely much harder.
The average college freshman is not trying to get a PhD. Picking a low starting point like that is just a way of making it appear more difficult than it is. The more appropriate denominator is the number of PhDs attempted.
I didn't respond to your comment because the right denominator is going to depend on who is reading this thread. A Stanford CS major with a 2400 SAT whose parents successfully founded a tech startup is coming from a dramatically different pool than a data-entry drone with no college degree and no technical skills who figures that a weekend startup is his ticket to riches. The former's chances at both getting a funded startup and a tenure-track position are pretty good. The latter's chances are effectively zero. The latter falls into the pool of people who could conceivably "start" startups, but doesn't fall into the pool of people who attempt Ph.D programs.
You are not a lottery ticket.
Realistically, I think that tenured professorships and funded startups largely draw from the same pool of people and hence have the same denominators. That's why I focused more on the numerator. If you have the skills, dedication, and knowledge needed to become a tenured professor you usually (not always) have the skills, dedication, and knowledge needed to found a high-growth company, but there are more opportunities for the latter around, as well as fewer gatekeepers that can exclude you for arbitrary reasons.
If you think a hazing ritual is required, as a kind of filter for who wants to be a professor most, you can still set high standards for the work while treating people with respect.
One's PhD experience is highly dependent on the university one attends and especially the professor that one works under. Each lab has a different culture and each professor treats students differently. It also matters a lot whether you are on fellowship or not (someone funded directly from a professor's grant is easier to bully/exploit).
Just like you should speak to employees at a company before joining it to get an idea of culture and work/life balance, prospective graduate students should speak to current graduate students to get an idea of life at a given lab. Most graduate students are openly aware of which professors are known for treating students like indentured servants and which are known for being hands off and generous with research funds. If you are trying to pick a university/lab, definitely go to visits and speak to 4th/5th year grad students (preferably over a beer or two). Typically by the end of their PhD, graduate students are willing to tell you the truth about the different labs/professors.
Also, remember that it is often totally OK to switch labs within 1-2 years. Yes, it may set you back some on your progress, but it can be much better than being miserable for 5 years.
My experience in FAANG research (outside of the >100x pay multiplier) has been very different! In grad school we could work from anywhere at any time; there was no requirement to be glued to a desk 14 hours a day, no having to respond to emails on Friday nights or weekends or risk a bad performance review from your manager, no dystopian open office space, unlimited conference travel flexibility, etc. I actually kind of miss grad school, despite having made 30K a year as a PhD student.
Which university pays PhD students 30K/year? My top pay was 18K/year, in my last year.
I have a friend who was in a PhD in Physics program at CalTech. Absolute genius of a kid, and was surrounded by other people who are incredibly smart. My friend was always a very ambitious person, and wanted to join Wall Street as a quant after completing his PhD because he was interested in maximizing his income, and found the problems presented in finance/markets more compelling than those found in academia.
When he intimated this to people in the department, they looked at him as if he had suddenly grown tentacles, because it's unbelievable to them that anyone would want to do something other than academia. This is a stark contrast with friends I have at places like Stanford, where, quite frankly, no one cares.
This doesn't touch on any alleged bad behavior or stress or pressures that people experience while in grad school, but I believe the cultural forces at institutions strongly govern how people feel. That isn't a profound or even revealing observation; I just thought a quick human anecdote might help people relate to those going through this.
The victims must deserve it! Anyone so subhuman as to tolerate these conditions and abuses is evidently in need of punishment for being such a shank.
You can drop "knowing the perks of industry". There's no excuse for doing that under any circumstances, even if there is no industry alternative. I see this in many fields where grad students work in labs and are funded by grants. I wish universities would clean it up. There's simply no excuse for it. There's nothing about being in grad school that makes this behavior okay.
I think you had a bad advisor(s)! They’re not all like that. Maybe the group you were in was under a lot of funding pressure or something, that can cause bad behavior.
But no thesis hanging over your head :)
But people in FAANG do work nights and weekends and it's totally not unusual to be assigned sudden bitchwork by your manager and have no recourse against it.
Yes, I have a PhD. Yes, I paid my own graduate school tuition. Yes, I had a successful career in ML.
Where you are right is that I did allow myself to be treated like that. I could have quit at any time, of course.
I learned a lot being a squeaky wheel, but I didn’t change the culture in any meaningful way beyond getting us a lunch room.
Before AI, the most I had ever made in a year was $800K based on a long stock options play for a public company I knew was undervalued (and I was there for its rock bottom). And in academia, the most I ever made was $30K as a post-doc and $13K as a grad student before that. That anyone wonders why non-tenured sorts flee is mind-blowing.
In contrast, I've seen ruthless tenured types pull 7 figures with multiple labs at multiple institutions and lucrative consulting contracts with private industry and the military. All whilst cheating on their spouses or sleeping with their students, sometimes both.
My attempt to go into management was pretty disastrous due to my inexperience and toxic internal politics. That made me lose interest in pursuing it further, because politics are just not my strong suit.
Mid-range engineers at FB I know personally are making ~$500K supporting AI efforts. AI is really lucrative for now and probably always will be. But I hope there's a purge of the posers in the field somewhere down the road.
ML isn't much higher paid than standard engineering, but new PhDs from top schools who would've been competitive faculty applicants tend to get jobs at top paying firms and enter toward the top of the non-executive ranks.
Yessssss, this. Even as a more permanent lab member (staff), I don't have a clearly defined role beyond "do what no one else wants to." It's not just the students!
The sweet spot, in terms of ROI, is to get a bachelor's degree and go immediately into industry. The student shows they are competent enough to succeed on their own in a school environment where they have a nonzero probability of failure. Foreigners, though, stay in school for as long as possible so their green card can process. It's not worth the risk of working for a company where they can be fired immediately and sent home, and it's likewise not worth the cost of an H-1B to companies for an undergrad. Since we graduate more PhDs every year than academia can absorb into teaching positions, the green card process is effectively subsidizing postgrad programs, where the product is Indians and Chinese (primarily) who are desperate to get a high-paying job in industry. Look up the numbers if you'd like.
At a societal level this is disastrous, as it means we have many foreign-born people who disproportionately hold the highest-paying positions in society. Prosperity gospel and American exceptionalism aside, this just leads to mass discontent and nationalism. It's hard to argue they are wrong: why should a native-born American allow a foreigner to take the highest-paid jobs if given a choice? This leads (along with the other ways the country ignores and marginalizes poor native-born citizens) to the election of people like Donald Trump.
This is not to say foreigners are bad - they often come from hard places where life isn’t easy. The real villains in this story are the aristocrats of the education empire. “More money for me and fuck everyone else,” seems like a common refrain these days.
There is a demand for people with scientific and data skills and not enough natives hold these qualifications. However, I do agree that the education system is broken and the main reason why native born citizens don’t pursue higher studies is that they are usually under the burden of large educational loans.
There is also a smaller issue that liberal arts is a common choice for American students, which makes many of them unemployable. However, perhaps due to cultural pressure, many Indians and Chinese end up in science and engineering, which are in very high demand.
I can't think of one successful tech leader with a PhD. I'm sure there are many, but the degree stops getting mentioned when people discuss their successes.
I think it’s not just devaluing the PhD, it’s changing it from something that gets you a job in research, hopefully academia, to in many cases just another credential to help you in the immigration process and getting a coveted American job.
That's one gentle way of describing conspicuous consumption.
And low-resource computing is more theoretically and practically interesting. I've heard experts complain of some experiments that "they didn't really discover anything, they threw compute at the problem until they got some nice PR." This was coming from M'FAANG people too, so it's not just resentment.
Hyperparameter tuning is one big concern here; we know it provides good results at the cost of lots of work, so there's a temptation to sic grad students on extensive tuning but publish on technique instead. Dataset bias is another, since nets trained on CIFAR or ImageNet keep turning out to embed dataset-specific features.
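To make the tuning burden concrete, here's a toy sketch; the grid values and per-run cost below are arbitrary assumptions of mine, not numbers from this thread.

    # Toy illustration: even a modest grid search multiplies full training runs.
    import itertools

    learning_rates = [1e-2, 1e-3, 1e-4]
    batch_sizes = [32, 128, 512]
    weight_decays = [0.0, 1e-4, 1e-2]

    grid = list(itertools.product(learning_rates, batch_sizes, weight_decays))
    print(len(grid))  # 27 full training runs for a small 3x3x3 grid

    cost_per_run_usd = 1_000  # assumed; scales with model and dataset size
    print(len(grid) * cost_per_run_usd)  # the tuning bill dwarfs any single run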
Ironically, I'm not sure all this increases the threat of FAANG taking over AI advancements. It sort of suggests that lots of our numerical gains are brute-forced or situational, and there's more benefit in new models work than mere error percentages would imply.
Interesting! How do you find customers?
Companies spend a lot of money on AI because they have a lot of money and don't know what to do with it. Companies lack creativity and an appetite for riskier and more creative ideas. That is what Universities must do instead of trying to ape companies. The human brain doesn't use a billion dollars in compute power, figure out what it is doing.
Sort of by definition, it can never be too costly to be creative. Only too timid. And too unimaginative.
While it is true that training very large language models is very expensive, pre-trained models + transfer learning allows interesting NLP work on a budget. For many types of deep learning a single computer with a fast and large memory GPU is enough.
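As a minimal sketch of that budget approach, assuming the Hugging Face transformers and datasets libraries (the model and dataset names here are just examples):

    # Fine-tune a pre-trained model instead of training from scratch.
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)
    from datasets import load_dataset

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)  # reuses pre-trained weights

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    dataset = load_dataset("imdb").map(tokenize, batched=True)

    # A small subset and one epoch: feasible on a single consumer GPU.
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1),
        train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
    )
    trainer.train()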
It is easy to underappreciate the importance of having a lot of human time to think, be creative, and try things out. I admit that new model architecture research is helped by AutoML (AdaNet, etc.), and being able to run many experiments in parallel becomes important.
Teams that make breakthroughs can provide lots of human time, in addition to compute resources.
There is another cost besides compute that favors companies: being able to pay very large salaries for top tier researchers, much more than what universities can pay.
To me the end goal of what I have been working on since the 1980s is flexible general AI, and I don’t think we will get there with deep learning as it is now. I am in my 60s and I hope to see much more progress in my lifetime, but I expect we will need to catch several more “waves” of new technology like DL before we get there.
This may not be true, if we’re talking about computers reaching general intelligence parity with the human brain.
Latest estimates place the computational capacity of the human brain at somewhere between 10^15 and 10^28 FLOPS. The world's fastest supercomputer reaches a peak of 2 * 10^17 FLOPS, and it cost $325 million.
To realistically reach 10^28 FLOPS today is simply not possible at all: If we projected linearly from above, the dollar cost would be $16 quintillion (1.625 * 10^19 dollars).
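The linear projection is easy to check; a quick sketch using only the numbers above:

    # Linear $/FLOPS projection from the supercomputer figures above.
    supercomputer_flops = 2e17       # peak FLOPS
    supercomputer_cost_usd = 325e6   # dollars
    usd_per_flops = supercomputer_cost_usd / supercomputer_flops

    print(usd_per_flops * 1e15)  # low brain estimate: ~$1.6 million
    print(usd_per_flops * 1e28)  # high estimate: ~1.6e19, i.e. $16 quintillion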
So, when it comes to trying to replicate human intelligence in today’s machines, we can only hope the 10^15 FLOPS estimates are more accurate than the 10^28 FLOPS ones — but until we do replicate human level general intelligence, it’s very difficult to prove which projection will be correct (an error bar spanning 13 orders of magnitude is not a very precise estimate).
P.S. Of course, if Moore’s law continues for a few more decades, even 10^28 FLOPS will be commonplace and cheap. Personally, I am very excited for such a future, because then achieving AGI will not be contingent on having millions or billions of dollars. Rather, it will depend on a few creative/innovative leaps in algorithm design — which could come from anyone, anywhere.
Or, in the worst cases, they are presenting one thing and really just relying on hundreds or thousands of people in Bangalore to pore over the data sets and tag and categorize.
I was simply responding to the parent post’s false claim (”The human brain doesn’t use a billion dollars in compute power, figure out what it is doing.”), in isolation from the rest of the post (which I generally agree with).
Bostrom's estimate of 10¹⁷ is much, much more reasonable.
Note that this is still a number biased in favour of the brain, since for the brain you are measuring each internal operation in an almost fixed-function circuit, and for Summit you are measuring freeform semantic operations that result from billions of internal transitions. A similar fixed-function measure of a single large modern CPU gives about 10¹⁷ ops/s as well; the major difference is that a single large modern CPU is running a much smaller amount of hardware many times faster, and uses binary rather than analogue operations.
I recommend checking out some Antonio Damasio books for a fascinating read on this topic.
I'm guessing you mean that much of the power of the human brain comes from its ability to interact with its environment?
I've been noticing more and more of a trend in recent years to treat the brain as separate from the body, and as if it is "trapped" in the body.
Damasio has a very interesting take on how emotions, consciousness, etc. emerge from the way the brain and body together process information from the "external" world.
Moore's law has already been dead for years.
Companies with higher profit margins per customer are doing much more novel work, from what I have seen.
All this is to say, I don't see universities getting shut out anytime soon. The necessary compute to contribute is pretty cheap and most universities either have a free cluster for students or are operating with large grants to pay for compute (or both).
A pretty small group of companies: FAANG + maybe five more. So for everyone else, scaling is not an issue in the way you have described.
State-of-the-art language models can cost five figures (in dollars) per training run.
There are a lot of variables in play here, so your mileage will definitely vary (how much data, how long you are willing to wait, whether you really need to train from scratch, etc.), and these should only be considered very rough ballpark numbers. However, those are real numbers for SotA models on gold-standard benchmark datasets using cost-optimized cloud ML training resources.
At five figures per training run, the list of people who can be innovators in the LM research space is very small (fine-tuning on top of a SotA LM is a different, more affordable matter).
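To see how a single run reaches five figures, here is one illustrative breakdown; every number below is my own assumption, not a figure from this thread:

    # Illustrative cloud-GPU cost for one large pre-training run.
    gpus = 64                 # assumed cluster size
    hours = 100               # ~4 days of wall-clock training (assumed)
    usd_per_gpu_hour = 2.50   # ballpark on-demand V100-class pricing (assumed)
    print(gpus * hours * usd_per_gpu_hour)  # 16000.0 -> five figures per run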
I can't go into detail about budgets, but suffice to say if you think $1M is a university compute budget that lets you be a competitive research team on the cutting edge, you are __severely__ underestimating the amount of compute that leading corporate researchers are using. Orders of magnitude off.
On-prem is good for a bit until you're 18 months into your 3 year purchase cycle and you're on K80s while the major research leaders are running V100s and TPUs and you can't even fit the SotA model in your GPUs' memories any more.
Longer training can mean weeks or even months for one experiment; that iteration speed makes it very hard to stay on the cutting edge.
And this is before considering things like neural architecture search and internet scale image/video/speech datasets where costs skyrocket.
The boundary between corporate research and academia is incredibly porous and a big part of that is the cost of research (compute, but also things like data labelling and staffing ML talent).
You still have yet to provide any concrete sources to back up your claims. We're talking about contributing to research here. If multi-million dollar training jobs are what it takes to be at the cutting edge you should be able to provide ample sources of that claim.
- "the current version of OpenAI Five has consumed 800 petaflop/s-days" .
- Check out the Green AI paper. They have good numbers on the amount of compute needed to train a model, and you can translate that into dollar figures.
- https://medium.com/syncedreview/the-staggering-cost-of-train.... NOTE: That XLNet number has to be wrong - it should be 5-figures, not 6.
I'm not an expert in on-prem ML costs, but I know many of the world's best on-prem ML users use the cloud to handle the variability of their workloads so I don't think on-prem is a magic bullet cost wise.
$1M annually per project (vs per lab) isn't bad at all. It's also way out of whack with what I saw when I was doing AI research in academia, but that was pre deep learning revolution, so what do I know.
Re: the moving goalposts. The distinction is between the cost of a training run and the cost of a paper-worthy research result. Due to inherent variability, architecture search, hyperparameter search, and possibly data-cleaning work, the total cost is a couple orders of magnitude more than the cost of a single training run (the multiple will vary a lot by project and lab).
I understand why you don't trust what I'm saying. I wish I could give hard numbers, but I'm limited in what I can say publicly so this is the best I can do.
Perhaps figuring out what it is doing itself costs billions of dollars?
That said, I think you are right that academia’s structure may need to change. Right now, we’re locked into a model where projects mostly need to be doable by a handful of researchers (almost entirely trainees) in a few years. Other than these time-limited positions, there’s not a lot of room for skilled individual contributors, which seems goofy when tackling such a hard problem.
This has been done for thousands of years; we are not the only generation that found it important to understand how we work.
There will still be the occasional genius discovered in the wild, but an institutionalized effort might make the discovery process quicker and more efficient.
And then there's the correlation between rich kids' test scores and their parents' income.
I want to agree with your sentiment, but the cynical side of me says that sometimes banality wins.
Phahahaha, this gave me a good laugh :) To find out how your brain works, guess what you are going to use: the brain itself. It's like trying to cut a knife with itself, or trying to use a scale to weigh itself.
Flip it upside down. Like he said, be creative.
If there is any spare time that week, figure out how spiders make webs from eating flies as well.
Devlin, Jacob, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." arXiv preprint arXiv:1810.04805 (2018).
Computer constraints are relatively straightforward engineering and science problems to solve. The lack of talent, that seems like the bigger story.
Then right before interviews I hear, "Well, we like you, but we didn't realize you didn't have a PhD. We have a really awesome software engineering / machine learning engineer opening that'd be great for you. That's what we'll do the interviews for."
Cool. Probably happens 2/3 of the time, honestly.
Just ask me about my patents, research, and projects...
Anyway, point being you can’t 5x with a B.S. in C.S. That’s why there’s a “lack” of talent.
There are some famous computer science researchers with no PhD, but they do have the papers to back themselves up.
My usual tactic is to mention the patents and send some of my work and they reconsider.
In either case, my general point was that the 5x salary bump is pretty much reserved for an artificially selected subset of people.
If you are talking about the R&D tax credits there is nothing that says you have to have a PhD. We get a bunch of these every year and we only have one person with a PhD that is part of the research.
How did you get 4 years of experience in ML at 27? care to share?
In 2015, I started building ML models full-time at two large companies: first while I was in school, as a contractor, then I joined full-time after school.
Presumably they've been working in ML since they were 23? What's so crazy about that?
Long gone are the days when your patron king or queen would fund you handsomely just for doing math.
Then their field got hot and suddenly people are not offering them double their academic salary in industry. They’re offering dectuple their academic salary.
Real demand calls forth supply.
We've also thrown a lot of money at this endeavor, so there's research and foundation-building with the underlying intent that it will be used to help the company.
I don't think that's too different from the university model of grants and research.
Fortunately AI is hot right now, so there are lots of projects where you can get funding simply by saying "I want to apply machine learning to X."
Even then, the experimental aircraft at Caltech is not a full-scale prototype of the next generation of fighter after the F-35. How does Stanford's fab compare with TSMC's?
Edit: TSMC collaborates with four Taiwan universities on research and provides fab services for 23 universities. https://www.tsmc.com/csr/en/update/innovationAndService/case...
The general answer is that it's still possible to try out things on a single GPU or several servers and many gains come from good features and smart network designs. On the other hand, squeezing out the last 5% does require more data and budget.
Personally, I think you can still do a lot with a moderate budget and smart people. But would love to hear other opinions.
Have AI techniques actually changed in the last 20 years, or is there just more data, better networking, better sensors, and faster compute now?
By my survey of the land, there haven't been any leaps in the AI approach. It's just that it's easier to tie real world data together and operate on it.
For a university, what changes about what you teach? This sounds like researchers feeling they can't churn out papers that read more like industry reports than advances in ideas.
Some of this is even motivated by mathematical theory, even if you can't prove anything in the setting of large, complex models on real-world data.
The quote from Hinton is something like, neural networks needed a 1000x improvement from the 90s, and the hardware got 100x better while the algorithms and models got 10x better.
So basically every graduate chemistry program.
For example, coming up with a new DL model with improved image-recognition accuracy means training it on millions of samples from scratch, which requires a lot of time and money. But I'd argue that such a thing is more of an "application" of DL than "research". Let me explain why. Companies like FAANG have the incentive to do that, because they have tens or hundreds of immediate practical use cases once the model is complete; hence I call such activity an "application" of ML rather than "research", because there's a clear monetary incentive for completing it. What about universities? What incentive do they have for creating state-of-the-art image recognition other than publication? The problem is that publication can't directly produce the resources needed to sustain the research (i.e., money).
I think ML research in universities should move in the direction of "pure" research. For example, instead of DL, is there any other fundamentally different way of leveraging current state-of-the-art hardware to do machine learning? Think of how people moved from approaches such as SVMs to neural networks. The neural network was originally a "pure" research project. At its creation, the neural network didn't take off because hardware couldn't keep up with its computational demands, but fast-forward 10-15 years and it became the state of the art. University ML research should "live in the future" instead of focusing on what's being hyped at the moment.
Next, this research will probably continue to get cheaper. The cost of doing the Dota 2 research 5 years ago would have been much higher, and it will probably be even lower 5 years from now.
Also, I think there's plenty of room for novel & useful at the bottom end where $millions in compute resources are not essential. Cracking AI Dota is certainly interesting, but it's hardly the only game in town, and developing optimized AI techniques specifically for resource-sparse environments would be a worthy project.
Sure, this doesn't compete with Google's data-centers. But that's assuming Universities are for some reason competing against private industry. That's not how any other engineering discipline works, so it's a bit odd to just assume without discussion.
That was funny - however not even close to reality. I have to work on a GTX 1080 (not TI)...
It’s within the reach of many grants to afford a few scaled runs of a technology as a demonstration of behavior at scale.
A more useful metric may be the proportion of proprietary versus open discovery. I don't know if I can point to a single example where researchers have not rushed to put their latest breakthroughs on OpenReview or Arxiv. Even knowledge of a technique, without the underlying models or data, is enough to influence the field.
Academic free inquiry and intellectual curiosity look very different from product-focused, solutions-oriented corporate R&D. A good working example is Google AI's lab in Palmer Square, right on the Princeton campus. Researchers can still teach and enjoy an academic schedule. I think it was Eric Weinstein who said something to the effect that if you were a johnny-come-lately to the AI party, your best bet would just be to buy the entire Math Department at IAS! In practice, it's probably easier to purchase Greenland ;)
Of course the Big Tech companies have far more resources to throw at it; that's why they're Big.
A far more serious issue than access to computational power, is access to suitable data, and particularly the hold that Big Tech has on our data.
People should question all of the assumptions, from the idea of using NNs to the particular type of NN and all of the core parts of the belief system, because these aspects are fixed by faith more than anything else.
If you want efficiency of training, adaptability, online learning, generality, true understanding, those assumptions might need to go. That would not mean you couldn't learn from DL systems, just that core structures would not be fixed.
I see an asymmetry between academia and industry. Academia has the models, industry has the data. Compute is more balanced because it's usually commodity hardware.
If industry is outpacing academia in research, I think that means data is the more valuable quantity, not compute.
And the article's theme of concentration is more a problem with data. Is Facebook dominant because of its algorithms or because of its database? If other companies had Google's index and user telemetry could they not compete with a rival search algorithm?
Playing in the sandbox with AI (especially the "brute force" deep learning algos) does not, in and of itself, equate to _intelligence_ or _progress_ for us as a civilisation.
What's expensive is thinking AI evolution means deeper networks and the only way to get better results is by throwing more GPUs at the problem.
And to be honest, those with "infinite" resources are a bit "guilty" of pushing these research lines.
I think instead of trying to build larger computers there is an opportunity for academia to move back towards the construction of cognitive models and minimizing the reliance on computation and data. That's what intelligence is supposed to be all about.
This is a good thing.
IMO it's got to the point where if you have a bad idea your startup fails, and if you have a good idea they will copy it and your startup fails. Even companies like Spotify and Slack, which have "made it", are now threatened by Google Music + Apple Music and by Microsoft Teams.
Would be interested to hear other opinions.
Yes, we should be powering these things with renewables, but we should make cuts somewhere else before giving up on basic research.
I just can’t seem to relate to people who think we should all punish ourselves because society didn’t see climate change coming. Sure we need to do something, but it’s not practical for us to collectively put on hair shirts and wander into the woods.
We need to be switching off coal fired power plants not computers.
They just use a nice contract system to make everyone feel good about it and fund the wind farms; that is it.
With hydro it's more predictable, but even there, there are no guarantees.
How much of that is ML? As someone pointed out in another comment, how much ML is actually useful? Some ML is "carbon good", such as route planning saving energy. But do we really need to spend billions of kWh just to get slightly better recommendations? Do we really need to increase margins a fraction of a percent for some company to show more ads and sell more?
And while we're on the subject of power, maybe if web pages weren't 300 MB of crap and 1 KB of content, we could cut back another few billion kWh on servers and routers.
This shit is serious, we're dying here, so yes, absolutely, let's do the math about how much AI costs. It's up to us, the computer people, to ask these questions and solve the part of this problem that WE own.
Who? Where? Why?
Are you doing something, personally, about this? My office is 100% solar, home about 20%, telecommute 100%; you?
I'm seeing companies spending tens of megabucks on idling GPU farms without a clear idea what to do with them.
Saw that when we did a subcontract on a datacentre for Alibaba. They had a huge reception for all kinds of dignitaries, showing them CGI movies of their alleged AI supposedly crunching data right there in the DC, and all that with all of the hardware in the DC shut down...
The moment I made a joke about that during an event, dead silence fell and faces on the stage started to turn red. The guy defused the situation with a joke, and the party went on.
But in the past few years, the one-two punch of AI and Python rebuilt that garden. So now not only are the big guys controlling the training of the big AIs, but the insistence on describing those computations entirely from a Python abstraction is leading to insanely inefficient use of those GPUs (and, coming soon, dedicated ASICs with even worse software libraries past ResNet, BERT, and a few other acceptable strawmen).
That said, one can do a lot with AI/ML models that fit in a single sub-$10,000 machine built entirely from consumer parts, and doubly so if one is willing to profile and low-level optimize one's code as intensely as one stares at the data going into the model (you're in grad school, you have time for this, you have all the time so to speak, and it will save you lots of time in the long run). For inspiration: all the big guys have GPU code monkeys on staff to micro-optimize as needed. One might want to take a cue from that and DIY.
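A minimal sketch of that profiling habit, assuming PyTorch (the comment doesn't name a framework; torch.profiler is just one possible tool):

    # Profile before buying more GPUs: find the ops eating the budget.
    import torch
    from torch.profiler import profile, ProfilerActivity

    model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU(),
                                torch.nn.Linear(512, 10))
    x = torch.randn(256, 512)

    with profile(activities=[ProfilerActivity.CPU]) as prof:
        for _ in range(10):
            model(x)

    print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))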
If you mean "philosophers who don't understand ML/gradient descent/linear algebra/statistics etc", I don't think many philosophers are debating about that stuff.
If you mean "philosophers who don't understand artificial intelligence", that would be all of them, and everyone else too, because no-one understands artificial intelligence yet. And a lot of the people who come closest to understanding it are in philosophy departments.
Of course they know what to do with them: show us more effective ads. That's the cutting edge of technological progress now.