> we’ve rewarded and lauded incremental researchers as innovators, increased their budgets so they can do even more incremental research
There isn't a scientific field where every single paper is groundbreaking. It's a Brownian motion of small incremental innovations, until eventually we stumble upon something big (like deep learning). In no way is machine learning unique in this. Sounds like the author is simply disappointed that, like in any other profession, the day-to-day of a researcher is a slog and not a perennial intellectual festival. We've been in an exciting deep learning craze for a while, but it's silly to expect it to last forever. Back to the grind now.
> Machine Learning Researchers can now engage in risk-free, high-income, high-prestige work
Not sure what author means by "risk-free". Yes, if you're not publishing enough you're most likely not going to starve. Is that a bad thing? Is the survival instinct the only good motivator for doing good research?
I would argue that there's plenty of risk, in that people who don't publish good research don't get very far in their academic careers, which in my view is good enough motivation. "They must do good research or starve" is a rather cynical take, especially from someone who seems to not be doing too badly for themselves.
I'd rather more fields provided similar benefits. Maybe then going into science wouldn't be associated with so much sacrifice, so more smart people would choose science over investment banking or such, and we'd make more scientific progress faster.
> CNNs use convolutions which are a generalization of matrix multiplication.
A nitpick: CNNs are most definitely not a generalization of matrix multiplication. In fact, the opposite: you can view CNNs as a matrix multiplication with a particular matrix structure.
This was effectively my response to hardmaru when this topic came up on reddit 
Basically 2010-2018 was an open field for ML/DL research with old(ish) methods being rapidly applied to low hanging fruit and large datasets with newly cheap compute.
DeepMind and others are actually making new methods, but by and large these are remixes of those same old approaches.
The majority of different research out there trying other approaches (Numenta, OpenCog, Causal Calculus, anything Schmidhuber etc...) don't really get any love because it doesn't fit within the mass tensorflow/torch framework.
Absolutely! And what many folks need to continue to remember is that many scientific disciplines and domains are really just starting to wrestle with the utility and implications of this first generation of deep learning tools and applications. I graduated with my PhD in atmospheric science from an R1 just over 4 years ago; at that time, very few people were looking at how DL provided useful tools for their work. These days, the field is inundated with folks playing with these tools and knocking tons of low-hanging fruit off the tree - it might not be "deep", revolutionary research, but it's fomenting a mini-revolution with respect to R2O and real applications of what had previously been somewhat niche science.
There's no reason to think this trend won't continue. New tools let new generations of scientists take new stabs at their discipline, and of course the low-hanging fruit drops first as folks get their bearings, build skills/experience, and - most importantly - prove efficacy so that they can get funding for more ambitious work.
Similarly on the commercial side, plenty of opportunity to solve existing problems with DL.
So while it may be true that DL research progress has slowed, still plenty to do in applying existing DL.
In the same way that ANNs didn't work for quite some time, until we had the compute and the data to train them successfully?
I get that it's important to prove that an idea is worthwhile, and the easiest way to do that is to use it to solve a practical problem. At the same time, I am conscious that we shouldn't put all our eggs in the deep learning basket: who knows where the ceiling is going to be.
Don't get me wrong, I like deep learning, and you have to be silly not to admit how successful it has been. But the field would be so much more boring if not for the people with alternative views and ideas.
Younger people don't realize there was strong bias against using neural networks in the late 90s up until Hinton's talk on NNs around 2007. I get the feeling we're going through the same thing where novel research is becoming ignored because everything must fit the deep learning paradigm to be noticed.
There are lots of ideas floating around in AI field. Some of them might be good, most are not. If you have an idea and want others to look at it you better demonstrate how it outperforms every other method when applied to some task.
Except ... while machine learning is great and has made important and significant strides, it's not yet a science. It essentially involves a series of sophisticated, mathematically informed recipes for feeding data to giant algorithms and having them create something useful (maybe very useful, but still).
An analogy from a couple years ago is bridge building before physics. You accumulate rules of thumb, you get a vague understanding of what works. You get better. But you aren't producing a systematic field.
And that implies mere advancement isn't necessarily progress (which isn't to say there's no progress, but building a larger SOTA isn't it, as the article notes).
What we should be careful about is to not be too strict when defining "science". In my view the goal of any scientific field is to build understanding that is useful for predicting outcomes of experiments. Now, this understanding could be defined mathematically, but it doesn't have to be. Don't see why building heuristics can't be a part of this, assuming such heuristics reliably predict outcomes of experiments.
So... scientists need to do sciencing until it is. This is what happened with Biology over the last 50 years, after 2000 years of pinning things on cards and putting them in drawers.
So I'd claim we're really at that point right now: the exploring-new-ideas, "crisis of science" phase where essentially people have to start brainstorming (and not all ideas are good here either, but they need to be somewhat original).
All this is using Thomas Kuhn's Structure Of Scientific Revolutions model very roughly.
This would be more like Kuhn's pre-paradigmatic (pre-scientific) activity, than moving from one scientific paradigm to another?
(I don't know which one is more appropriate here.)
Sounds like science to me; systematic recording of (perceived) cause and effect.
Arguably, deep learning and cell biology both appear like equal parts pure wizardry and flailing in the dark, but maybe that’s just because we haven’t gathered enough pieces yet, and not necessarily because people are doing the wrong things, thus failing to advance?
- It's easy to recognize scientific / technological revolutions in hindsight, but at the time they're anything but. I think most recognize the importance of persistence in the progenitor of an idea. What's often missed in these discussions is how important the subsequent incremental progress is: working out consequences of a theoretical insight, figuring out what you can build on top of a new tech, etc.
- I don't have a strong opinion about the optimal amount of "risk," but I do think making risk existential (as in one would starve without a research breakthrough) would have the opposite effect, because while we all say we like people to take risks, what we really mean is we like people who take risks and succeed. And yes, too little risk can breed complacency.
And yes, convolutions are a specific kind of linear transformation, whereas matrices represent linear transformations on vector spaces. The specific structure is that convolutions represent linear transformations that are translation-invariant, a property that many types of data (e.g., images) have, at least approximately.
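To make that structure concrete, here is a tiny NumPy sketch (my own illustration, not from the comment above): a 1D "valid" convolution written out as multiplication by a banded Toeplitz matrix, where every row is the same kernel shifted by one position, which is exactly the translation-invariance described.

```python
import numpy as np

kernel = np.array([1.0, -2.0, 1.0])   # any 1D filter
n = 8                                  # signal length
k = len(kernel)

# Build the dense matrix of a "valid" convolution: each row is the
# (flipped) kernel shifted one step to the right -> a banded Toeplitz
# matrix, i.e. a translation-invariant linear map.
rows = n - k + 1
W = np.zeros((rows, n))
for i in range(rows):
    W[i, i:i + k] = kernel[::-1]

x = np.random.randn(n)
assert np.allclose(W @ x, np.convolve(x, kernel, mode="valid"))
```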
(My perspective as an academic, but not a computer scientist.)
The author cites no compelling trend that ML is stagnating, especially relative to other disciplines.
I’d also add: is it BAD if universities are churning out highly skilled workers when there is high demand for them?
This seems to be more a rant of ML becoming more mainstream and accessible to a wider variety of students than anything.
They admit that some of the most “innovative and challenging” problems are still around, but many do not focus on that. Ok, so what, there are maybe more tiers and splitting of work into subdomains, some more “vocational” and some more “research-y” in nature. Is this a BAD thing?
So if you want to credit success to dumb luck, or to stubbornness, or insanity (as Einstein said: "doing the same thing over and over again and expecting different results"), then LeCun is your man. I don't know of any quote where LeCun says something to the effect of: "Well, I knew the hardware would speed up and then my neural net software would work exactly as I designed it to and manifest AI. So I waited twenty years."
IOW, it wasn't LeCun's surfboard (his software) that handed him success, it was the wave (advancing hardware speeds):
You're talking like we only discovered Moore's law now, instead of it being a good bet for the last half-century...
He speaks French after all...
> A nitpick: CNNs are most definitely not a generalization of matrix multiplication. In fact, the opposite: you can view CNNs as a matrix multiplication with a particular matrix structure.
Neither is really a generalization. Any matrix multiplication can be implemented with a convolution function, and yet any convolution can be represented as a matrix multiplication via im2col.
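As a rough sketch of the im2col direction (my own toy code, not anything from the thread): unroll every patch of the image into a row, and the 2D convolution collapses into a single matrix-vector product.

```python
import numpy as np

def im2col(img, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one row."""
    H, W = img.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((out_h * out_w, kh * kw))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = img[i:i + kh, j:j + kw].ravel()
    return cols, (out_h, out_w)

img = np.random.randn(5, 5)
kernel = np.random.randn(3, 3)

cols, (oh, ow) = im2col(img, 3, 3)
# The convolution (cross-correlation, as most DL frameworks define it)
# is now a single matrix-vector product.
out = (cols @ kernel.ravel()).reshape(oh, ow)

# Sanity check against a direct sliding-window computation.
direct = np.array([[(img[i:i + 3, j:j + 3] * kernel).sum() for j in range(ow)]
                   for i in range(oh)])
assert np.allclose(out, direct)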
Is deep learning really the result of incremental research? The SOTA-chasing frenzy came after the discovery of deep learning. Incremental research can hardly be justified as the way to discover the next deep learning, although it might get us there.
It's not some dude disappearing into the forest for a few years and coming back with a revolutionary idea (maybe in movies). Everything that helped shape the scientist's thinking has contributed to the idea, even if sometimes the idea is so revolutionary that it's hard to see the direct link. Besides, show me a scientific paper with no references. :-)
It's a shame that this is rarely acknowledged and more often than not we talk about "that guy who invented X and got a Nobel prize for it". But us mere mortals can take solace in knowing that even if we didn't change the world with our own ideas, it's possible that we've influenced someone who has.
While I agree that everyone was shocked, myself included, when we saw how well SSD and YOLO worked, the last mile problem is stagnating. What I mean is: 7 years ago I wrote an image pipeline for a company using traditional AI methods. It was extremely challenging. When we saw SSDMobileNet do the same job 10x faster with a fraction of the code, our jaws dropped. Which is why the dev ship turned on a dime: there's something big in there.
The industry is stagnating for exactly the reasons brought up: we don't know how to squeeze out the last mile problem because NNs are EFFING HARD and research is very math heavy: e.g., it cannot be hacked by a Zuck-type into a half-assed product overnight, it needs to be carefully researched for years. This makes programmers sad, because by nature we love to brute-force trial-and-error our code, and homey don't play that game with machine learning.
However, places where it isn't stagnating are things like vibration and anomaly detection. This is a case where https://github.com/YumaKoizumi/ToyADMOS-dataset really shines because it adds something that didn't exist before, and it doesn't have to be 100% perfect: anything is better than nothing.
At Embedded World last year I saw tons of FPGA solutions for rejecting parts on assembly lines. Since every object appears nearly in canonical form (good lighting, centered, homogeneous presentation), NN's are kicking ass bigtime in that space.
It is important to remember Self-Driving Car Magic is just the consumer-facing hype machine. ML/NNs are working spectacularly well in some domains.
However, you can make significant gains to your models by going back to traditional image filtering/augmentation. Sticking with well researched object detectors/segmentation algorithms and putting our effort into improving the algorithms that clean up the data takes you far. It's impossible to avoid, because images will always be full of reflections, artifacts, and strange coloration unless you have the perfect lighting tunnel setup; doable nonetheless (a rough sketch of what I mean follows below).
We'd love to be able to work with a company for a few days, get the parameters set up right for our case, and then let them take the thousands of images. My company would easily pay $100K+ for such a data set.
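A minimal sketch of the kind of traditional cleanup step referenced above, assuming OpenCV and an illustrative file name; the specific filters and parameters are just one plausible recipe, not the commenter's actual pipeline.

```python
import cv2

img = cv2.imread("part_0001.png")  # hypothetical input image

# Normalize uneven lighting with CLAHE on the L channel of LAB space,
# which tames reflections and strange coloration to some degree.
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab = cv2.merge((clahe.apply(l), a, b))
img = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

# Knock down sensor noise and small artifacts before the detector sees it.
img = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)

cv2.imwrite("part_0001_clean.png", img)
```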
Huh? If anything I would say ML is way more trial-and-error focused than imperative programming.
This is a link to a dataset, unless I'm missing something it's not about anomaly detection. I looked into this area a few years ago and always try to keep my eye open for breakthroughs... care to share any other links?
Uh what? You can literally finetune a fast.ai model overnight to be borderline SOTA on whatever problem you have data for. Zero math involved; isn't that exactly a hacker's wet dream?
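For readers who haven't tried it, a minimal sketch of what that looks like, assuming a folder of labeled images at a placeholder path and the current fastai API (vision_learner; older releases call it cnn_learner):

```python
from fastai.vision.all import *

# "data_dir" is a stand-in for an ImageNet-style folder of labeled images.
path = Path("data_dir")
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2, item_tfms=Resize(224))

# A pretrained resnet fine-tuned for a few epochs is often close to
# state of the art on a narrow, well-defined problem.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(3)
```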
If every DGP could be captured by fast-tuning a sophisticated enough model, science probably would be solved even before DL.
He goes on to explain how theory always comes later.
I thought that information theory came before practice, but it turns out it also came after. (There were a bunch of heuristics for sending messages with teletypes.)
A nice example is Roman technological advances in architecture and materials that completely predate the use of geometry or mathematics. All advancements were a result of tinkering and heuristics.
Information theory came after the telegraph and early communication systems. However, we could not have built modern communication devices without Information theory (the insights and design principles). We build, then we theorize, then we build better, etc. it's not a simple procedure. Computers were developed similarly: there were all sorts of ad-hoc logical apparatus, we built boolean theory to explain it, and then we did all sorts of experiments trying to build computers. Their architects were largely mathematicians with very good ideas of mathematical design principles (and creative new mathematical ideas), not a group stringing together electrical elements and seeing what happens. The same goes for the development of ML/Deep learning, and many other technological marvels.
Although "accidental" discoveries do happen, they happen from a methodical set, with knowledge, intuition, and good priors.
The theoretical framework came almost 100 years after the steam engine's original invention.
Theory is just an explanation of how the world works: you can come up with that explanation as a reason for observations, or as a logical consequence of other theories (which is then verified by observation).
It seems like the article was a bit polarizing. Some of the comments made me realize I made a few imprecise statements, and I'll fix those. Other comments didn't approve of my tone, and that's a bit harder to fix since I tried writing this article in the same way I usually speak. I've written a lot of drier content for my robotics ebook and I was purposeful in trying something different for this article.
Some of the comments actually generalized my observations to management and software more generally and it's always nice to see people taking my ideas further than I thought they could go.
One major thing I'd like to point out is that I don't think "ML is dumb" is the right conclusion; there are lots of incentives pushing more of ML toward stagnation, but this is certainly not true for the field at large. Interesting ideas need to involve a certain amount of risk. The latter half of my article showcases a few ML-adjacent projects which I think are absolutely fascinating.
And if you're interested in reading more stuff by me, my robotics and machine learning ebook (http://robotoverlordmanual.com/) is very easy to read and will teach you all you need to start building robots at home.
The writing process for this article was meme first, content second
There has been a rabid frenzy of throwing money at anything that has ML in it. Soon investors and CEOs will realize that ML is effective in narrow ways and that not everything needs ML.
They will also realize that 1 ML team + ML as a service (Azure ML, Sagemaker, Google AI platform) is cheaper and works more reliably. The services will keep improving and an underpaid mediocre ML-Engineering team can work to keep the production system up and running.
Basically, ML teams might lose jobs just as DB/Cluster admins did with the advent of serverless compute as a service.
I expect it (this is already happening) to create a day-trading-firm-like hierarchy. The FAIR/Brain/OpenAIs will pay 7 figures to the top grads to be first to market. BigN ML product teams will expand and stay as well paid as they are. Then there will be a huge drop as we move to offshore ML product teams that are viewed as cost centers by the remaining 99% of companies. These will be most of the jobs available.
In such a system, a pure ML scientist (usually a PhD) will only exist at the top companies. So if you are not in the top 1-3% percentile, you will not have a pure ML job. However, there will still be hybrid DS-SDE jobs (ML engineering, ML product maintenance, ML-as-a-service user) or hybrid DS-PM jobs (Analysts, Consultants, data driven business decision makers). So, anyone who is not in that top 1-3% will have to pivot to one of these 3 roles.
I won't call this an AI winter. But it will definitely become boring for the majority of those employed in the field.
Right now, there is so much misunderstanding about what ML is, what resources it needs, and how it works that the corporate environment is very stressful.
ML jobs are well paid, but they are NOT fun. No one understands ML devops & the infra needs to enable tight experimentation loops. Existing observability and telemetry systems are wildly bad for model training, reproducibility or any form of online or semi-online learning. As an ML engineer you’ll have to take on huge workloads of devops, infra, tooling, data munging. I’ve seen more than a few brilliant ML engineers burnout and quit because of this.
As ML becomes better understood as a boring technology, and decisions around ML projects, team structure and especially ops support start to get more standardized, I think this will get better.
The pivot you mention means a thinning out of the headcount on the pure ML research side. But it also means opening up more positions in ML engineering, infra & devops.
If people choose their specialization appropriately and remain open to being less on the research side of this, then I think there will continue to be lots of opportunities for high-paying jobs. People will know their required responsibilities more unambiguously and will probably be happier, rather than dredging through the endless series of bait-and-switch jobs that exist today, which promise a focus on ML research but typically force you more into ML devops & data platform management.
Can I cry? I feel so understood right now.
I love my job in ML, the subject matter is fun, but there is such a huge burden of expectations on a team's titular data scientist. It is exciting in a 'mid-90s during the web revolution' sort of wild-west way, but you also have the cynicism of the mature software field. A good ML engineer is worth their weight in gold.
I also wrote this in a pseudo-fictional dystopian sense. An 'if I were an ML pessimist' take on the state of things.
The other comments made to the parent I originally posted are great counterarguments (2012-14: AlexNet; 14-16: deep LSTMs; 16-18: ResNet, M-RCNN, YOLO; 18-20: Transformers; 2020+: AlphaFold, GPT-3, CLIP, et al.). Deep learning has been improving pretty linearly over the last decade. If I were looking at it in a naively statistical sense, then ML will actually be able to match the rising supply of ML scientists with rising demand. That's the optimistic take though. In that case it will actually feel like being a programmer in the 90s, in that a couple of pivots can propel you to multi-millionaire.
> CEOs will realize that ML is effective in narrow ways and that not everything needs ML.
Any stable business isn't unjustifiably sinking costs here. I project an FY21 rise in AI-funded efforts at large businesses.
> They will also realize that 1 ML team + ML as a service
Yes/no. This has more platform implications than actual ML implications.
> ML teams might lose jobs
Assumes ML jobs only do some form of R&D. Data is a utility, and advanced analytics is valuable. Stable ML jobs don't just work on deep learning.
> I expect it to (already happening)...
Partially agree; it is already happening. But cost centers are only taking on what was standardized yesterday. Tomorrow still requires advanced analytics capabilities.
> In such a system
moot point. This is the system.
I see a pivot in focused efforts: more optimism about an early AI commodity than stagnation. We're moving from research to integration. There are further areas to improve AI in applications (with continuous feedback training) and many domains of advanced analytics.
For my career I say, "tere u me shakkar". ("let there be sugar in your mouth", ie. let your words come true)
80-95% of "ML" has always been "boring" and any data scientist/ML-engineer worth their salt know this. This also concerns what you refer to as "pure ML job". It only takes fresh grads and juniors a couple of projects to realize the true meaning of "data cleaning", "outlier detection" "robustness" and their likes - it's painstaking work.
Sure but ML is also not being used in 90+% of the narrow use cases it is good for.
>They will also realize that 1 ML team + ML as a service (Azure ML, Sagemaker, Google AI platform) is cheaper and works more reliably.
These services replace part of the ML Ops component but not much else except in very narrow use cases. There's also already GUI based tools for building models but they're also not used much. I don't see this part of ML Ops as being the majority of what ML Engineers do so the majority of ML Engineers don't have to worry.
Every couple of years researchers come up with a good advancement or fresh concept that reignites the community. However, all that happens until the next breakthrough is basic tweaks. The number of junk papers that slightly adjust a method, get a few-decimal-place improvement, and then call it a snazzy name would fill a mountain. Drawing blood from a stone. People get so obsessed with specific methods, thinking they're the next great thing, that they don't stop to consider it's probably not the only way.
Marketing and media are the worst, though, for general public perception. The number of times they would warp ML into a magical panacea: "ML will make you skinny!"
I mean it's just mathematical approximations, been around a while.
Don't get me wrong, I like this field. I'm fortunate that I get to apply it to a problem that helps people, but at times I just want to shout from the rooftops that it's not a god and it won't make all your dreams come true. Then they get annoyed that it doesn't, and we get a bunch of stagnation articles like this.
Your math ability needs to be such that you can transform a problem into a form that can be computed numerically using matrix multiplications, which requires more skill (sometimes significantly more) than simply knowing how matrix multiplications work. Sometimes this ability to reframe complex problems numerically using matrix multiplications is the lion's share of the research!
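A toy example of that reframing (mine, not the commenter's): pairwise squared distances look like a double loop over points, but the identity ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i·x_j turns the whole computation into one matrix multiplication.

```python
import numpy as np

X = np.random.randn(1000, 64)   # 1,000 points in 64 dimensions

# The naive formulation is a double loop over all pairs of points.
# Reframed via the identity above, the full distance matrix is one
# matrix multiplication plus broadcasting.
sq_norms = (X ** 2).sum(axis=1)
D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
D = np.maximum(D, 0.0)          # clip tiny negatives from floating-point error
```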
Most academics I’ve come across only think they’re doing this. My perception is they are too insecure about their self-worth to pursue material opportunities.
I admit, the number of academic types I know is not vast so maybe it’s too small a subset to make any judgments
It's interesting so far. Research feels very open ended compared to industry. While I was in industry (AI fintech startup), even though goals were rapidly changing, I had a good idea of what problems to work on and how to gauge progress.
In contrast, research is far more undefined. There are days I feel lost and other where I'm chasing rabbits down deep holes. It's been hard for me to figure out if the problems I'm trying to solve are worth exploring (are they good research question? and more importantly are they publishable).
But that being said, it's only about 6 months in and I feel like I'm still learning what it means to do research. I've definitely enjoyed having the space to explore problems at my own pace and think deeply about them.
The billion dollar endowments don't usually go toward supporting research directly.
This is precisely true, as someone who has passed both screens for competitive jobs.
Cracking the coding interview <> Case in Point
Live coding <> Do 3-digit multiplication in your head (eg 347 * 469)
Sorting algorithms <> M&A Evaluation Frameworks
I could go on...
You just memorize a bunch of crap that's vaguely (but not really) applicable but is super random, and then you just keep asking "do you want me to keep going" in various tones until they tell you to stop.
Love the way this is put.. it is so true :(
The idea that you traditionally have these programmers who spout mumbo-jumbo all day, cost a lot of money, and seem to always be planning stuff behind your back is threatening, all the more so because you are utterly dependent on them. ML breaks their control over the means of production.
Now, that's not to say I am against labor saving devices. I most certainly am for them, but an economy in which everyone is in a deep learning arms race is an irrational shit show that could only result in less productivity.
(It's possible a single central planner AI could do better, because at least the training data would be "real world" and not output of other deep learning black box actors. But of course single-planner economies have a huge amount of other downsides.)
For me, most software development is about finding something boring and laborious. We get a computer to do the work so humans can level up and work on something requiring actual thought. That requires getting a deep understanding of the actual work.
Some of that definitely happens in well-run ML projects. But there's a bunch of Silver Bullet Syndrome stuff going on, where ML's shiny results and magazine articles lead to inflated expectations and inflated claims of success. A fellow nerd says, "I did an algorithm!" Someone turns that into an impressive presentation with claims of X% gains in the Key Business Metric, hallowed be its name. In reality, it's more plus or minus X% when you account for externalities, natural variation, and actor adaptation. But that's ok, because by the time anybody finds out, attention is elsewhere.
That's not to really blame ML for that. For a period years ago, I kept getting asked, "Can we use a wiki for that?" I would start an explanation of what it actually takes to make a wiki work (hint: it's not the software). Their eyes would glaze over in short order, because they realized that it would take actual work. So many people want the silver bullet, the magic pill. Especially people in the managerial caste, as the reigning dogma there is that management is a universal skill. Details are for the little people.
Exactly. Machine learning is the perfect ideological duel. It's "universal labor" for "universal management", and both sides are equally illiterate in the ways of the world.
This would be funny if it were not also so true and sad... management as a skill (and it is a skill, it is not IMHO something that can be taught, especially in business schools!) is such a rarity.
It's the same adoption/business technology tension that has existed since Frederick Taylor in the early 1900s, or Vonnegut's Player Piano concept, where they propose automating human-averse tasks by taking a recorder to workers and capturing their movements. The hype is trying to replace people. Real-world adoption seems to take place where machine learning complements human activity to do things humans are not good at, not where it replaces them. It's not making them dumber, it's making them more enabled.
In the early 1900’s Frederick Taylor called public attention to the problem of ‘national efficiency’ by proposing to eliminate ‘rule of thumb’ management techniques and replacing them with the principle of scientific management . Scientific management reasons about the motions and activities of workers and states that wasted work can be eliminated through careful planning and optimization by managers. While Taylor was concerned about the inputs and outputs of manufacturing and material processes, computers and the information age brought about parallel concepts in the management and organization of information and its processes. Work efficiency could now be measured not by bricks or steel, but by their information flows.
F. W. Taylor, "Principles of Scientific Management", Harper & Row, New York.
It appears to me that you are missing the mark here, unless this is largely a definitional issue.
Do you consider the foundations of ML to be a clever trick?
Do you think human brains primarily learn by clever tricks?
When a metaphor or saying falls apart with one more level of questioning, I would suggest it may be time to find a better metaphor.
For your second, I think that's part of the problem. Most people confuse how a human brain learns with what is actually running in a machine learning program. It's similar at some level, but not really doing the same thing at another.
What insight is gained by this statement? What does it explain? What does it downplay?
I'm not currently seeing much value in it. I'll explain why. Saying 'just a method of solving a problem' comes across as reductive without being useful.
Imagine if someone said 'flying is just a means of movement' in the context of studying a hummingbird's agility. It says more about the speaker than the subject. It suggests the person is uninterested or focused on other things.
So with regards to your statement, it suggests you don't care and/or don't appreciate what makes learning difficult.
I'm talking about learning theory. About generalization given data. This is certainly not easy. Yes, it can be encoded in an algorithm, but that does not make it less interesting.
The big idea here is that enough brute force will lead to better compute-use techniques which will in turn make it possible to do more with less compute. But the current reality is that these systems don’t tend to justify their existence when compared to greener, more useful technologies. It’s hard to pitch an AI system that can only be operated by trillion-dollar tech companies willing to ignore the massive carbon footprint a system this big creates.
That's good to know. Because "labor saving devices" is by no means all the field of machine learning is. A hammer is a "labor saving device" if all you have are rocks. I've got the impression that a lot of people conflate machine learning and robotics with labor saving in general. Of course, compared to the state of the art, machine learning lets us hope to find magical shortcuts to get our work done. But an IDE, a word processor, or a compiler is also a labor saving device. So is a piece of paper: it is way faster to doodle on a piece of paper at your desk than to find the next cave to doodle in.
I blame SQL's false advertising less than the insane tunability and the lack of effort put into migration tooling for making the relational database world so much more kafkaesque. Or really, given the nature of Oracle, Microsoft, and their customers, it might have been an inevitable insanity along the lines of Conway's law: too profitable -> too many cooks in the kitchen -> too complex.
The more people who can leverage AI, the better the industry will be as a whole. It serves both as a check on AI hype and as a driver of innovation, encouraging new applications.
Huggingface Transformers is a good demo of this philosophy in action.
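For anyone who hasn't seen it, a minimal sketch of how low the barrier is with the Transformers library (the exact model it downloads and the printed score are illustrative, not guaranteed):

```python
from transformers import pipeline

# Downloads a small pretrained model on first use; no training required.
classifier = pipeline("sentiment-analysis")
print(classifier("Huggingface makes this almost boringly easy."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```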
A goal? No. ML is a field, not something with agency.
One effect of ML is more generalization of prediction and inference problems. (I'm using inference in the statistical sense.)
That is a very bizarre description of ML engineers. In every company I’ve worked at, ML is a team or teams that partners with product managers and other engineering teams to learn about problems they need solved. It’s very systematic, boring, and tied heavily to those other teams as the leaders and decision makers.
First you look for high level value propositions, like automating a decision process, removing a customer friction point, creating key metrics where simple metrics are intractable, or various multi-modal information retrieval goals.
You identify opportunities in these kinds of high level areas in lock-step with product managers and other engineering leaders. Then you move on to identify sources of data that can be leveraged, and eventually (much later) you get to the smaller set of work training a model, validating with acceptance tests and hardening the implementation for safe production deployment.
If people are spouting “mumbo jumbo” and making big ML model decisions without lock-step synchronization with other stakeholders, that sounds like organizational dysfunction, not any type of issue with ML.
I also find it odd that you bring up “cost[ing] a lot of money” .. that’s very out of place among everything else mentioned. That seems more like insecurity or jealousy over the market demand for ML talent, and wanting to cut other people down rather than work with them or acknowledge the level of effort it required to get that level of expertise in ML.
No no no, that's a description of regular programmers.
There's so much more arcana in the field of programming and computer science as a whole, especially with the piss-poor job we've done deprecating bad old interfaces etc. (the monster that is modern Unix grows without bound). And of course there are the various language and other fads. All that is a nightmare for a traditional business person, whether they know it or not, and the regular programmers probably feel like an extortion racket of sorts.
ML, depending on the situation may refer to a field of research, a methodology, a set of tools, among other things. Only things with agency 'plot'.
Well, this blogpost shows where this analogy breaks down : birds (AFAWK) don't try to discuss ornithology.
"bureaucrats running the asylum" is normal science !
Thomas Kuhn has shown how it works in The Structure of Scientific Revolutions :
The problem is that without Kuhn our expectation are set by pop history of science, which only remembers 'anormal', extraordinary science : the paradigm changes.
(Otherwise, this is a great blogpost.)
And, while I agree there is a "fake rigor" problem in ML research, the particular examples that they bring up aren't extremely good exemplars in my opinion. Instead, they seem to have a problem with the standard operating procedure of mathematics while missing the point that it is what got the "stack more layers" school here in the first place. Simplifying a problem so you can understand it, and then relaxing the assumptions and seeing if you can figure out what implications that has is how advances are made.
Plus, they have some hot takes and statements that are just plain wrong.
> With Automatic Differentiation, the backward pass is essentially free and is as engaging to compute as 50 digit number long division. Deriving long complicated gradients is fake rigor that was useful before we had computers.
What? First of all, AD is not a solved problem and using it is not "essentially free." There's a huge performance overhead when adding AD to a system. Try using a second order method with AD. I hope your Hessian actually finished computing.
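To illustrate the point about second-order methods, a toy PyTorch sketch (my own, not from the post): the gradient costs roughly one extra backward pass, but the full Hessian is an n x n object whose cost and memory grow quickly with the input dimension.

```python
import torch
from torch.autograd.functional import hessian

def f(x):
    # A toy scalar-valued function of an n-dimensional input.
    return (x ** 3).sum() + x[0] * x[1]

x = torch.randn(1000, requires_grad=True)

# First-order gradient via reverse-mode AD: about one extra backward pass.
grad = torch.autograd.grad(f(x), x)[0]

# Full Hessian: 1000 x 1000 entries built from many Hessian-vector
# products -- far from "essentially free" as the input grows.
H = hessian(f, x.detach())
print(grad.shape, H.shape)  # torch.Size([1000]) torch.Size([1000, 1000])
```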
> Julia on the other hand is a language made for scientific computing where everything is automatically differentiable by default.
This is just plain false. I'm a huge proponent of Julia, and there are some great AD packages, but in no way is everything automatically differentiable (even with the nice packages), nor is that a design goal. The work on Flux.jl (a package) is extremely impressive though, and there are particular features of Julia that allow some awesome package interoperability (e.g. the fact that some ODE solvers can be differentiated through with Flux).
I work on the AD infrastructure for Julia.
That absolutely is a design goal.
Certainly we are not there yet; we still have a long way to go.
But that is where we want to go to.
With the caveat that things that are not mathematically defined to have derivatives (e.g. the derivative of `xs[i]` with respect to `i`) won't be differentiated.
But for stuff like mutation (the big thing Zygote doesn't support (though some of our other ADs do)), we sure do want it to.
It's definitely a goal for the AD ecosystem, but, as far as I'm aware, AD is not a design goal of Julia itself. That was the point I was trying to make.
E.g. see https://github.com/JuliaLang/julia/pull/33955
Generally speaking, if there is any sort of code transformation pass that's needed for AD but not supported, you can expect people to be working to support either that specific transformation, or a generalization of the transformation. This has been a theme in the language development for years now.
Some? Which ones aren't? Could you open an issue?
Sorry for the confusion.
Is there a citation for this?
Much as I would like it to be true, from the little I've read it seems that plenty of universities/colleges were originally religious in nature and were intended to defend orthodoxy, even to combat specific heresies. Definitely not to take intellectual risks.
If you want a reporting solution, buy AI.
"Graduate Student Descent is one of the most reliable ways of getting state of the art performance in Machine Learning today and it’s also ... fully parallelizable"
"Every paper is SOTA, has strong theoretical guarantees, an intuitive explanation, is interpretable and fair but almost none are mutually consistent with each other"
How can that be true unless the "theory" itself isn't really worked out?
Though I have to say that I've personally seen ML invading conferences not related to ML per se, with over half of the submissions applying ML techniques to random problems in the primary field that were clearly of no interest to the presenter themselves and, even more than usual, only a vehicle for their graduation. So I'm admitting to seeing MLers as mostly in it for advancing their careers; glad to be convinced otherwise.
If you could name three of them I'd be really grateful. Serious question; everything surrounding ML seems to be only good for (non-monetizable) art projects.
As art it is amazing, not going to lie, but "commercial use" seems like a huge stretch.
Translation (Google translate, DeepL)
Automatically generated product descriptions, sometimes also edited by humans (Alibaba)
Image Tagging (Facebook photos)
Today, I received a package from Amazon containing router bits (for woodworking, not IT). It contains a so-called "User Manual" which is so badly translated, I assume automatically, that only a spell checker would be fooled into thinking it is actually written in German.
I often hear and read good things about Google Translate, but every time I read something from it, e.g. when a browser or webpage helpfully decides that I would prefer a butchered salad of German words instead of an English web page, I am repulsed.
These applications seem to firmly fall into the "I'm willing to compromise on quality if I don't have to pay a living person a wage" niche, so they're value-destroying, not value-creating.
Are there examples of value-creating applications for ML? (From a business point of view; obviously the "shitty translations but at no cost" proposition creates value for the average Internet user.)
At any rate here are way more than 3 other uses of DL today off the top of my head:
* Autocompletion (be it in search engines toolbars or in Gmail/Word/...)
* Super-resolution GANs, the most interesting example to me being NVIDIA DLSS: you render a game at ~720p or less and then upscale it to the target resolution of 1080p or 4K, allowing you to get quality that the machine would not have been able to support at the target resolution directly.
* Image recognition/tagging: Most of this is used in the security domain, but there is also a lot of stuff around inventory management, safety etc.
* Semantic search
* Protein folding (AlphaFold)
* In astrophysics: detection of supernovae, FRBs and probably a bunch of other stuff I'm not aware of
* Self-driving cars: Even assuming self-driving technology does not evolve anymore from now on, the current state of the art is still a selling point.
* Predictive maintenance: Used for plane engines and other things
In either case it is only 'value-destroying' if the business has unlimited resources.
I'm talking about commercial applications, like the OP said. That is, things I could potentially pitch to management and substantiate with something concrete that isn't "you can fire your lowest-paid contractors now".
I see the sentiment you've expressed a lot, and feel it speaks to a massive disconnect amongst developers. Most people are interacting with ML systems dozens if not hundreds of times a day.
The stuff you listed doesn't, it's just part of a moat for already established products that don't depend on ML for their market share. (It's not like Netflix will lose market share if they switch from ML to some other approach for their recommendation system.)
It is also used quite a bit in graphics and imaging; DLSS is a consumer-facing application, but it is also used in other domains, like OCR.
Machine translation is another ubiquitous use case. As is any kind of language processing, like text-to-speech.
Also, in the industry, it is heavily used for anomaly and defect detection. Also, Google reportedly uses it for a lot of search/recommendation stuff.
ML is definitely monetizable, but not every company needs it, by a long shot. It is not "AI" and seems to often resemble a complex DSP step when used in practice. I think it's overhyped, but it is far, far more useful than anything blockchain will ever be.
Speech recognition + basic NLP for automatic customer support triage. None of these are great to use as a customer, but they seem to be effective enough to continue using and save companies lots of money.
Automatic "offensive" content detection for social media. I'd bet they use ML to do a first-pass on uploaded content to make sure it doesn't contain porn, gore, etc. Probably lets these companies save money.
Automatic defect detection in factories, instead of training humans to detect subtle manufacturing issues. I think companies like Samsara are experimenting with offering this tech as a service.
Facial recognition/tracking for law enforcement/defense. Ignoring the ethics of it for a moment, it seems like governments would be willing to pay a good amount of money for this tech. Could be used to automatically search through hours of footage to find which frames (if any) contain a target.
Most of these were developed with actual business partners and are being used right now.
> Siri/Cortana/Google Home/Alexa, powered by DeepSpeech+language models
> Google Search, powered by BERT
> Tesla, powered by variants of YOLO
> Facial recognition powered by MTCNN+FaceNet
> AirBnB product search+recommendations
> Amazon product recommendations
GANs are a bit artsy, but JFC they're not even a decade old - we've gone from shitty MNIST clones to fully synthetic faces in the span of 5 years!
1. I suspect we're going to see DeepFakes in Hollywood - famous people might license their faces to movies that they might not have the time to star in
2. People are going to start building even more powerful versions of search, like combinations of CLIP
3. Neural networks still aren't optimized for edge devices - we're going to see a deluge of cheap drones with cutting edge computer vision by default
Industrial robotics, driver assist, drug discovery, computational photography, speech translation, and so many other examples illustrate a clear commercial applicability of ML methods scaled in the last 10 years specifically.
Latent Dirichlet Allocation, message-passing in Hidden Markov models, and Naive Bayes for spam filtering. Outside my subfield, there's always the basic handwriting recognition employed in ATMs.
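The spam-filtering example is a few lines with scikit-learn; here is a toy sketch with made-up data, assuming nothing beyond the standard CountVectorizer + MultinomialNB combination:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus standing in for a real labeled mailbox.
emails = ["win a free prize now", "meeting moved to 3pm",
          "cheap pills limited offer", "lunch tomorrow?"]
labels = ["spam", "ham", "spam", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)
print(model.predict(["free offer, click now"]))  # expected: ['spam']
```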
Facial recognition. iphone face unlock, photo tagging, etc.
Behavior prediction for advertising. Ad quality scores basically.
These projects have an impact in the real world [predictive maintenance on infrastructure that serves people, for example].
I think we might see a winter in super big applications like self-driving cars or voice assistants, but ML in general is just a boring, non-controversial business tool with hundreds of valuable applications.
You’ll still need statistical specialists to train and operate models and ensure systems avoid pitfalls like overfitting, poor convergence, multicollinearity, confounders, etc.
So I doubt this will have much impact on ML job market. Companies that invest in ML will continue to run circles around companies that don’t. You’ll just see the unjustified over-focus on SOTA neural networks die down and become just another boring tool in the toolbox like everything else in ML.
It's not enough to know how to walk. You need to know where you want to go, and figure out the existence of any path in the first place, or you might end up squaring the circle.
Fair enough to say that currently ML research is fixated on exhausting combinations of blocks to squeeze a marginal improvement out of a few public benchmarks.
But that is like saying software has stagnated because no new ISAs are being created. Building ML applications remains as challenging as ever; it is just that modeling itself has become streamlined before everything else.
This is what I finally found:
BERT, which stands for Bidirectional Encoder Representations from Transformers.
Hilarious. Has Gary Marcus actually done anything, in practical terms, like actual code or something, that outperforms the DL approaches he attacks so viciously?
It seems to be true that no better proposals for solutions have come from his side so far. But I think his criticism per se is valuable, especially his reminder that one cannot simply ignore sixty years of research.
EDIT: and of course he is not the only renowned scientist who calls for reflection; here is a quote from an interview with Judea Pearl: "AI is currently split. First, there are those who are intoxicated by the success of machine learning and deep learning and neural nets. They don’t understand what I’m talking about. They want to continue to fit curves. But when you talk to people who have done any work in AI outside statistical learning, they get it immediately. I have read several papers written in the past two months about the limitations of machine learning." (https://www.theatlantic.com/technology/archive/2018/05/machi...)
>appears less biased than others who benefit from the current high level of investment in DNN.
Yes, instead he blatantly tries to benefit from the counter-investment in AI skepticism.
And who can blame them? AI (read: machine learning) (read: deep learning) research has turned into a huge feeding frenzy. People see all the billions thrown about by Google, Facebook et al. and they go crazy. Maybe they think that if they cheer hard enough and boo hard enough they'll look knowledgeable and "passionate about machine learning" and maybe someone will hire them. Maybe they just want to be on the right side of history, with the winners, not the losers. And when there's so much money to win, there sure are plenty of losers!
A while ago someone posted here an article that advised that to become an expert in machine learning one should (among other things) "flashcard X papers in major sub-fields" or something along those lines. Pretty revealing of what people are thinking of: Google is hiring machine learning specialists. Shut up Gary Marcus, you'll scare the fish off.
Unfortunately I'm not unaffected.
Why does it matter? It’s not like the validity or invalidity of Gary’s criticisms would be different if he had done other totally separate ML research.
Any response to criticism that brushes off the criticism based on the source of the criticism is bad and not in the scientific spirit. Progress comes by asking questions and pointing out flaws. You don't need to have an answer to the question before you ask the question.
Marcus's criticisms are valid. Most DL researchers have a huge blind spot. It is not good for the field.
... and yet he has: neuro-symbolic integration. That's his suggestion.
And in fact that's a whole field that's been publishing work for a while now. So he's not just making it up.
We're thinking very hard about whether the price of our deep-lolcoin is going to go up enough to buy a new vacation house....
The criticisms of the financial industry where they are not really creating anything of value other than optimizing more value out of existing money.
Adtech industry where a whole generation of technical researchers spent time figuring out how to optimize clicks.
Cryptoeconomy and blockchain where large amounts of money are created out of perceived value and gargantuan efforts are made to build, not solutions to real-world problems (yes I know there are some), but ways to increase the shared, total value of the technology or individual cryptocurrency.