The first one is Data Science. More and more businesses store their data electronically. Data Scientists aim to analyze this data to derive insights from it. Machine Learning is one of the tools in their tool belt, but they often prefer models that are understandable rather than black boxes. Sometimes they prefer statistics because it tells them whether their insights are significant.
The second one is Machine Learning Engineering. ML Engineers are Software Engineers that use Machine Learning to build products. They might work on spam detection, recommendation engines or news feeds. They care about building products that scale and are reliable. They will run A/B tests to see how metrics are impacted. They might use Deep Learning, but they will weigh the pros and cons against other methods.
Then there are AI Researchers. Their goal is to push the boundaries of what computers can do. They might work on letting computers recognize images, understand speech and translate languages. Their method of choice is often Deep Learning because it has unlocked a lot of new applications.
I feel like this post is essentially someone from the first group criticizing the last group, saying their methods are not applicable to them. That is expected.
Probably more accurate to say that it's the first group criticizing others in the first group who try to act like people in the third group. Data scientists who use deep learning for everything, when a more interpretable model would do just as well.
Looks like us Engineers have a common pattern in running towards the next shiny thing ;-)
Somewhere along the journey/career some of us wise up and learn when to say No, and use the right tools for the job, and ignore the "cool" factor.
And the criticism you note isn't one-directional in the field at large. I'm finding that ML/AI researchers deriding ML/Data engineers and "scientists" as not doing "real" ML or AI is becoming a thing, similar to how some computer scientists deride engineering as not doing real computing.
If you mean hype in media and general public, I agree with you. Research is inherently risky and uncertain, often that is not conveyed correctly. Also research results are often oversold.
If you mean that big tech companies are overinvesting in Machine Learning then I have to disagree. It's not a coincidence that the companies that invested the most in Machine Learning (Google and Facebook) are companies that have a lot of data and already use Machine Learning in their products. It provides things like better feed ranking and better signals for search. These "invisible" improvements to existing products are often overlooked.
And the criticism you note isn't one-directional in the field at large.
Yes, like I mentioned I think there is confusion about terminology. Many ML Engineers (in my definition) call themselves Data Scientists. This leads to misunderstandings, when people don't recognize that others have different goals.
But what is driving this hype, as well as the Blockchain hype, is not engineer-driven companies like Google, but rather MBA-driven buzzword-friendly tech companies (imagine Ballmer-era Microsoft), non-tech companies who want to share the cake, the media and finally the investors who are misled by the rest, but end up creating a capital-based feedback loop for them.
As far as the people driving the hype train right now are concerned, Machine Learning is the way you do AI, and Deep Learning is just a more powerful (ahem deeper) version of Machine Learning. This is what AlphaGo used to defeat Lee Sedol, so it's obviously superior, and we should use it to process all data, in the same way we should strive to store everything on a blockchain, which is clearly superior to hash tables and databases.
I'm imagining someone tweeting at Robin, several years back:
"Good Prediction Markets expert says: Most firms that think they want advanced Prediction Markets really just need linear regression on cleaned-up data."
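For what it's worth, the "just linear regression on cleaned-up data" being recommended really is a handful of lines. A minimal closed-form ordinary least squares fit, in pure Python with made-up toy data:

```python
# Minimal ordinary least squares for y = slope*x + intercept.
def linear_regression(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form OLS: slope = cov(x, y) / var(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy "cleaned-up data": points lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = linear_regression(xs, ys)
# slope == 2.0, intercept == 1.0
```

The hard part in practice is the "cleaned-up data" half of the sentence, not the regression.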
See my comments here:
under our recent article " Artificial Intelligence Generates Christmas Song".
Basically, if there is no pixie dust that makes humans intelligent, and instead it is a matter of the architecture of the brain and the first few years of supervised sensory input, then neural net breakthroughs (which use a similar architecture/topology) have the potential at any moment in time to break through and match general human intelligence.
What I mean is that if someone sent back source code from 80 years from now, but we had to run it on a bunch of Amazon / Google servers in a server farm, we're pretty much guaranteed to have enough computing power to do so!
(This is a combination of the number of neurons, their number of connections, and their very slow speed.)
We have the hardware.
Now: we do not actually have the source code from 80 years from now that we can go ahead and run on those machines.
So, we're at like heavier-than-air travel right before the Wright Brothers flew at Kitty Hawk. Except we have like jet engines already -- just no way to design them into something that flies.
I think that AI is vastly underrated and I watch with incredible interest every single breakthrough.
For the second category above, ML built into products or engineering solutions, AlphaGo surprised me, because Go has an intractably large possibility space; it's not in any sense subject to brute-forcing or exhaustive search.
Dragon NaturallySpeaking's dictation surprised me in that, using its model, it's able to get basically perfect dictation. I've never worked as a transcriptionist, but a quick search reveals it has basically annihilated the medical transcription industry.
These are not the big, general breakthrough.
But the big, general breakthrough is right there, somewhere. The results researchers are coming up with are astounding, and they're doing it in many cases with neural nets, quite similar to the wiring of the human mind.
The hype is faaaaaaaaaaaaaaar less than warranted for the stage that we're at. At any moment someone can put together something that achieves higher-level intelligence and can be set loose upon the world corpus of culture.
True, there are no clear indications that this is about to happen (for example, nobody is extracting innate language algorithms from the human genome, which encodes them), so we are not exactly taking all the steps we could be. As far as I know we're not even genetically engineering people to see what different parts of DNA do -- which is obviously a very, very good thing; who would allow anyone to bear to term a child made as an experiment to see what DNA does?
But despite not going from a human starting point, the results that we are achieving in many cases match and surpass human ability - while we do know that in many cases some of their architecture is similar. I feel quite strongly that we have more than enough hardware for general intelligence - and I see advances every day that could end up going past the point of no return on it.
EDIT: got a downvote, but I would prefer a discussion if you think I'm wrong.
And you're trying to rebut this by referencing an AI-generated Christmas jingle. I think the author rests their case...
Fast forward to the 1700s and artisan engineers in France were creating similar devices that automatically played music. Toys for the entertainment of the aristocracy. Some insightful engineer noticed the use of pinned cylinders to recreate music, and realized the device might also be used to create textiles with a loom. You could program a machine to create textile patterns. While the initial implementations didn't work very well, eventually Joseph-Marie Jacquard created a version that revolutionized textile manufacturing in the early 1800s, relying on punch cards instead of expensive-to-create cylinders.
By the mid-1800s Charles Babbage became interested in Jacquard's work using punch cards to create complex textile patterns. "The major innovation was that the Analytical Engine was to be programmed using punched cards: the Engine was intended to use loops of Jacquard's punched cards to control a mechanical calculator, which could use as input the results of preceding computations." - https://en.wikipedia.org/wiki/Charles_Babbage#Analytical_Eng...
Toys are amazing for inspiration and innovation.
Remember, 10 years ago smartphones were in the realm of science fiction. Now, everyone has one in their pockets.
Today we can train a computer for a week to generate a cute little Christmas jingle. How many years do we need to train a human child to do the same? AGI won't happen overnight, sure, but the effects of that research will be felt continuously along the way.
If you used to have 10 people working at a supermarket and you now have 6 with self-checkouts, that's your 40% jobs lost (for that supermarket) right there.
Tesla are describing their 'hardware 2' self driving platform, fitted to all of their new cars, as "full self-driving hardware", and they've posted several videos recently of fully autonomous journeys on public roads. They've said they plan to roll out an OTA update to enable full self driving next year.
Mercedes-Benz have a limited self-driving mode on their 2017 model.
Uber-owned Otto, a company focusing on self-driving technology for trucks, plans to start offering its services next year.
And of course Google is quietly continuing their self-driving project, although they've gone a bit quiet recently.
Now, it does look to me like we've still got a fair few corner cases to straighten out, but either all of these companies are badly wrong, or we'll be seeing self-driving vehicles on sale to the public in the next few years (even if not next year).
IOW I meant to transclude that discussion here. (Perhaps within that comment thread a good specific summary comment is: https://news.ycombinator.com/item?id=13090869)
Obviously it is hard to know when that magic moment will happen that some kind of general AI is created that can learn in some sense similarly to how humans do. My every indication and astonishment at the results that are being produced strongly suggests "at any moment". The results are absolutely astonishing every day and we have vastly more than enough firepower.
You might also be interested in this separate thread where I dealt with questions of consciousness and pain. I again reference it here:
and you can click through at the top to follow my reference.
We are way past the point of no return here, and in my estimation it is a question of years or at the most decades - not centuries.
> Obviously it is hard to know when that magic moment will happen that some kind of general AI
> is created that can learn in some sense similarly to how humans do. My every indication and
> astonishment at the results that are being produced strongly suggests "at any moment".
>As an IT guy with a basic but solid neuroscience education
-- could you go ahead and take a few minutes (maybe will take you 5-10) to read through my above-referenced links referencing my previous discussion and tell me whether I'm correct in your estimation on the bottom-up aspect - i.e. the amount of computation that human neural nets can likely be doing, and how it compares to server farms with fast interconnects today?
I'm not an expert in neuroscience so your feedback might be helpful there.
The above sentence is true, but it has no bearing on anything.
The fact that they weren't reverse-engineered (yet) would still have huge bearing on everything.
By 7 billion samples I mean the humans walking around. Your analogy with an RSA crack is fundamentally different because biology isn't already doing it in 3 pounds of grey goo in seven billion different bodies.
So you would have to come up with an analogy that uses something we can't use, to say: okay, fine, it exists, and fine, we have the hardware to do it too, but the former doesn't have any bearing on us doing the latter.
Still it wouldn't help us find the P-time algorithm in question. We could say "it seems to exist", but that would not imply "we'll discover it any day now".
> If brains routinely cracked RSA
> i.e. the amount of computation that human neural nets can likely be doing
Also, the "computation" a brain does at one moment leaves out the time aspect: Lots of things lead to constant changes. The wiring changes all the time. The "computation" metaphor has little use for describing or understanding this major aspect of "brain". The more I learned about neuroscience the more unhappy I got with the computing metaphor that I had had going in (as a CS graduate, naturally, I think). The brain is so very, very different from my pre-neuroscience-courses notions.
How much neuroscience do you know? If the answer isn't at least an undergrad introductory course (the accompanying book is over a thousand pages), why do you get the idea you can make any predictions?
Read this to learn about complexity in biology vs. engineering, and what scientists in the field think about how well we are dealing with it:
Test yourself: Do you understand what he's talking about? http://inference-review.com/article/the-excitable-mitochondr...
Fortunately you don't have to sign up at university these days just for such knowledge:
Free courses (if you ignore the certificate nonsense):
- https://www.mcb80x.org/ (This is linked to from edX as "The Fundamentals of Neuroscience" Parts 1, 2, 3)
- https://www.coursera.org/courses?languages=en&query=neurosci... (Especially "Medical Neuroscience": https://www.coursera.org/learn/medical-neuroscience)
You're assuming neural nets are the right model.
Like a 19th century person saying, "if there is no pixie dust ... then eventually Newtonian mechanics will explain these unexpected wobbles we see in the planets' orbits."
Any model that gives reasonably similar results to human thinking will be useful. Planes don't have feathers and don't eat bugs, but we say that they "fly" anyway, and they carry far more cargo / passengers (and faster) than birds ever could.
What we need instead of "consciousness" is a set of concepts:
- "perception" - means representing inputs in such a way as to be able to adapt to the environment to achieve goals (I place qualia here)
- "judgement" - the ability to select the best action for the current state
- "reward" - comes from time to time, and is used as a signal in behavior learning
These three concepts : perception, judgement and reward are simpler, clearer and less ambiguous. They have implementations in AI, not human level yet, but getting there. These concepts make the problem concrete instead of bringing up 2000 years of attempts to "get consciousness" based off armchair philosophising.
A more interesting question than "what is consciousness?" would be: what are the reward signals that train the human brain. We do Reinforcement Learning in the brain, but the reward signal is complex, made of multiple channels, which are evolutionarily optimized for our survival. It's the engine of drive that puts us in motion. We need to reverse engineer that in order to replicate human consciousness in silico.
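To make "reward as a training signal" concrete, here's a standard textbook sketch (not a claim about the brain, and far simpler than the multi-channel rewards described above): an epsilon-greedy bandit agent that learns which action pays off purely from scalar rewards. All numbers are invented:

```python
import random

random.seed(0)

# Two actions with hidden mean rewards; the agent only sees scalar rewards.
true_means = [0.2, 0.8]
estimates = [0.0, 0.0]
counts = [0, 0]

def pull(action):
    # Noisy reward centered on the hidden mean.
    return true_means[action] + random.uniform(-0.1, 0.1)

for step in range(1000):
    # Epsilon-greedy: mostly exploit the best estimate, sometimes explore.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: estimates[a])
    reward = pull(action)
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best = max(range(2), key=lambda a: estimates[a])
# After training, the agent prefers the higher-reward action (index 1).
```

The entire "drive" of this toy agent is the reward function; swap in a different one and its behavior changes completely, which is the reverse-engineering problem the parent is pointing at.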
Consciousness is a replicator, a system that is concerned with survival and maintaining balance in the face of perturbations and entropy. What it does is to make us go find food, make babies and protect from danger - essential actions without which the human species would disappear, and consciousness with it. Or, in other words, it maximizes its rewards over time, reward being having food, shelter, company, access to learning and a few more. But it's just limited number of reward types, and they are much simpler to understand than consciousness itself.
So consciousness is just that thing that maintains itself instead of becoming disorganized and dissipating out like most physical processes that lack a self stabilizing, self replicating dynamics.
Consciousness is a balancing system for the colony of cells that just want to pass their genes into the next generation.
is there anything in that which requires that it be a physical biological substrate rather than a model? Can't computer models follow the same behavior as long as that's how their environment / system is set up? (More specifically, can't they also have brains made up of cells connected to each other, in a way closely analogous or even an exact simulation of chemical pathways?)
But as you said, it's the algorithms that are by far the long pole. Current supervised learning is much akin to simple rote learning. This is the promise of reinforcement learning - and the ability to truly learn on your own through experience. That's scalable.
*that said ... I'm biased being one of Rich's students :(
Off the top of my head the closest analogy would be number theory, especially the study of prime numbers. Before information technology in general, number theory was esoteric and considered useless by many pure and applied mathematicians (I'm simplifying a bit for argument's sake) but all of that accumulated research proved massively useful once we started to communicate electronically. WWI and II cryptographers didn't become or absorb number theorists as a group, they just adapted the knowledge to their field under the umbrella of electronic warfare. I think this is happening with data scientists, who are starting to experiment with ML but it's still just another tool in their toolbox. The media hype train focuses on the flashy AI contests and muddles the terminology but the real work  is happening behind the scenes in data science.
 By "real work" I mean work that directly translates into value on a company balance sheet. The theoretical work is important in and of itself and has been happening for decades.
You mentioned that Deep Learning is a method of choice for AI researchers because it has unlocked a lot of new applications.
I have a question - is it also a "method of choice" for researchers because it's not well understood yet why Deep Learning actually works?
An automated labyrinth could be solved by hand up to a certain size, but beyond that it becomes "impossible to understand". We'd need to rely on computers to find the route for us.
It's not computer magic, just an ability to hold more data at once. We're limited to 7-8 things - try to remember a longer string of digits and see if you can - it's hard; we just have that kind of limitation. So any computer algorithm that can't be broken down into 7-8 understandable parts is hard to grasp.
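The labyrinth point can be sketched directly: a breadth-first search finds the shortest route through a grid maze mechanically, with no single step ever needing to fit in anyone's 7-item working memory. The maze below is made up for illustration:

```python
from collections import deque

# 0 = open cell, 1 = wall.
maze = [
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

def shortest_path_length(maze, start, goal):
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # goal unreachable

length = shortest_path_length(maze, (0, 0), (3, 3))
# length == 6
```

Scale the maze up a thousandfold and the algorithm doesn't change; only our ability to follow it by hand does.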
Of course we can make ML tools synthesize the preferred input of any neuron. That does help a little.
>deepmind steals people from the top ML research teams in universities around the world
>these people are given an incredible amount of money to solve an incredibly complex task
>a 6000 layers deep network is run for 6 months on a GPU cluster the size of Texas
>Google drops in their marketing team
>media says Google solved the AI problem
>repeat every 6 months to keep the company hot and keep the people flow constant
>get accepted at every conference on earth because you're deepmind (seriously, have you seen the crap that they get to present at NIPS and ICML? The ddqn paper is literally a single line modification to another paper's algorithm, while we plebeians have to struggle like hell to get the originality points)
I'll be impressed when they solve Pacman on a Raspberry Pi, otherwise they are simply grownups playing with very expensive toys.
Deep learning is cool, I truly believe that, and I love working with neural networks, but anyone with a base knowledge of ML knows better than to praise it as the saviour of AI research.
Rant over, I'm gonna go check how my autoencoder is learning now ;)
I think this is the thing people don't quite get when they buy into the hype. These systems are extremely inefficient, requiring terabytes if not petabytes of data and basically a power plant next to a data center to run the whole thing.
The work is valuable and pushing the boundary on what the hardware can do is great but so far all these things lack any kind of explanatory power and suck up a lot of energy to power the black boxes. DARPA recently put out a research program for making systems more efficient and adding explanatory capabilities to them (http://www.darpa.mil/program/explainable-artificial-intellig...). Ultimately that is the direction these things must be headed if they are to provide real value for the masses. Relying on a clever black box only takes you so far and is not beneficial in the long run because as these systems become more integrated into the institutions that drive large scale decision making they'll need to be held accountable for those decisions.
The powerplant criticism is more true of the training phase and much less true of using the resulting networks on a larger scale. The good thing about the "hype" is that more resources are being directed into the field so they'll be more efficient processing platforms (e.g., ASIC or FPGA) and better delineation of what's really needed and if there are possible shortcuts (e.g., ReLU).
The black box problem will prevent its use in some systems, but even in some areas of medicine, it will be fine because medical AI must be used in conjunction with the final decision-maker, much like how Watson is being positioned. A deep learning system that detects anomalies in patient imaging with very high precision will be useful even if it can't explain why it thought it was an anomaly. It's quality control for the radiologist, etc.
Sure, maybe their timing was fortunate and it might have happened a year later anyways. But this is definitely not a guarantee, and they were still the ones to do it.
P.S. For anyone who is interested, another very strong bot is out in the wild now. It has been playing on the KGS go server the past week under the name Zen19L, and has a rank of 9d. Some great games have resulted as people challenge it.
There is also the symbolic value. It was a "coming of age" event to a certain degree. I believe go was the last classic board game that researchers had been longing to conquer.
I haven't looked deeply into this, but my understanding is that image recognition is still somewhat subpar outside of well-curated datasets. Is that not the case anymore?
The same holds for AlphaGo and Watson. Also, my understanding is that they are more technically interesting than Deep Blue. Given how recent these projects are, their legacy is only beginning to unfold.
I agree with you that applying this tech to domains other than games is no easy challenge. But I would be very surprised, in the long run, if events like this are not documented as key steps along the way at some point in the future.
1) Too much credit is given to Google for this result. What I see as already a huge brain drain on society is only going to get bigger.
2) People are going to expect that if "computers are smart enough to play Go", they're smart enough to do ____. What goes in the blank? Very little right now, but I guarantee you investors and the public have a lot of ideas and think it's around the corner.
Hopefully a lot of good things will come out of this wave of AI, but who knows what or when. I think the point is that there may be a panic and contraction before anything really awesome happens, and a lot of that is going to be because of "stunts" (for lack of a better word) like this being overhyped.
I'm pretty sure "grownups playing with very expensive toys" accurately characterizes >100,000 software employees in the US right now.
Have there been any solutions proposed, in particular to the limitations of adding extra cores?
No, it isn't.
I really hope people stop spreading this myth.
They are primarily a research company, not a marketing setup. The latest stuff they released, on letting the networks do something like dreaming, which reduced learning time by up to 10x, seemed interesting, and I look forward to seeing how they do with StarCraft and the hippocampus. A lot of this stuff is cool because it gives insights into the human mind and how the brain works, rather than practical gadgets. https://www.extremetech.com/extreme/240163-googles-deepmind-...
That said, I still believe the article is mistaken in its evaluation of potential impact (and its fuzzy metaphor of pipes). Unstructured, semi-structured or dirty data is much more prevalent than cleaned, structured data on which you can do simple regression to get insight.
Ultimately the class of problems solved by more advanced AI will be incommensurably bigger than the class of problems solved by simple machine learning. I could make a big laundry list but just start thinking of anything that involves images, sound, or text (ie most form of human communication).
For example, I predict stereo vision algorithms will die out soon, including deep-learning-assisted stereo vision. It's useful for now but not something to build a business around. Better time-of-flight depth cameras will be here soon enough. It's just basic physics. I worked on one for my PhD research. You can get pretty clean depth data with some basic statistics and no AI algorithm wizardry. We're just waiting for someone to take it to a fab, build driver electronics, and commercialize it.
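The "basic physics" here is just round-trip timing: distance is the speed of light times half the echo delay. A trivial sketch with an illustrative made-up delay:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_seconds):
    # The pulse travels out and back, so halve the round trip.
    return C * round_trip_seconds / 2.0

# A ~6.67 ns round trip corresponds to roughly 1 metre.
d = tof_distance(6.671e-9)
```

The engineering difficulty is entirely in resolving nanosecond-scale timing per pixel, not in the arithmetic.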
This argument sounds like a second cousin of the "chemicals are bad, but if it is natural it is good" argument. Just because it evolved in nature doesn't mean it's optimal, or it's the best system under massively different constraints. And who knows what evolution would have thrown up after a few more billion years.
The distance at which a stereo vision system can capture precise depths depends on the distance between eyes, and the eyes' angular resolution. Human depth perception works well for things within about 10m, but when you get out to 20-40m humans get a lot less info from stereo vision.
When you get to that distance, humans seem to have a whole load of different tricks - shadows, rate of size change, recognising things of known size, perspective and so on. You can see a car and know how far it is even without stereo vision, because you know how big cars are, and how big lanes and road markings are. You can even see two red lights in the distance at night and work out whether they're the two corners of a car, or two motorbikes side-by-side and closer to you.
On the other hand, your basic general-purpose stereo machine vision system doesn't try to understand what it's looking at - you just identify 'landmarks' that can be matched in both images (high contrast features, corners etc) and measure the difference in angle from the two cameras. This is relatively simple and easy to understand!
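Once landmarks are matched, the depth estimate itself is a one-line triangulation: with a known baseline and focal length, depth is inversely proportional to the pixel disparity between the two images. The rig numbers below are made up:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    # Standard pinhole stereo relation: Z = f * B / d.
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 12 cm baseline.
near = depth_from_disparity(700.0, 0.12, 42.0)  # 2.0 m
far = depth_from_disparity(700.0, 0.12, 2.1)    # 40.0 m
```

Note how the same +/-1 px matching error is negligible at 2 m but enormous at 40 m, which is exactly why stereo accuracy collapses at the distances the parent describes.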
For tasks that humans can do that involve depth perception of things more than ~40m away - flying a plane, for example, where most things are more than 40m away if you're doing it right! - nice simple stereo vision can't get the job done, because humans are actually using their other tricks.
Of course, despite this limitation stereo vision comes up a lot in nature - it's still a beneficial adaption, because most things in nature that will kill you do so from less than 10m away :)
It's actually pretty rare for non-predatory animals to have good stereo vision. Most of them are optimized for a wide field of view instead, evolving eyes placed on either side of their head. Think rabbits, parrots, bison, trout, iguanas, etc.
Edit: IMO binocular vision is probably more to do with redundancy than depth perception. If you damage or lose an eye, you can still operate at near full capacity. Losing vision 'in the wild' is a death sentence.
My experience is that waiting for cleaner data is often like waiting for Godot and will often be a project killer (sometimes justifiably). This is a key issue at the moment in advanced ML: clean large training sets ideal for supervised training are elusive and the companies making real-world advances are pretty much all using available data (and semi-supervised techniques) rather than expensive made up training sets.
You should talk to us at Leaflabs. Commercializing research-level tech in embedded electronics is what we do.
Maybe for regular "cameras", but parallax still gives the most accurate results for 3D reconstruction from satellite images, where it is a very lively area of research. The resolution, and surface coverage, of radar satellites is far worse than what you can obtain by stereo matching optical images taken by the powerful telescopes on satellites. I guess in other fields like microscopy it makes a lot of sense also. Not all imaging happens indoor on commodity cameras!
A lot of problems are not tackled on a fundamental level. Occlusion, context, proprioception, prediction, timing, attention, saliency, etc.
A simple rat has more intelligence than whatever is behind a dashcam, security cam or webcam.
Yes, a quality TOF system would be great. However, good luck convincing consumers to adopt hardware with lidar on it. The Tango is having enough trouble on its own, and it does pretty well for consumer systems with IR.
Besides that you can't do FTDT with laser systems AFAIK. You need something to capture unseen places, such as ultrasonics/HF - which I guess you could argue fall into TOF but I haven't seen that work done.
In the end my money (literally!) is on the opposite of your approach, namely building better RGB systems, because there are already a trillion cameras deployed that we can extract data from.
First project Tango smartphone will be out soon.
I think it first went on sale Nov. 1 but just started shipping initial batches recently.
(None of which takes from your point.)
They can have problems operating outside because it is hard to make a lightsource brighter than the sun.
I've been expecting good, cheap non-scanning laser distance imagers for a decade. In 2003, I went down to Advanced Scientific Concepts in Santa Barbara and saw the first prototype, as a collection of parts on an optical bench. Today ASC makes good units, but they cost about $100K. DoD and Space-X buy them. There's one on the Dragon spacecraft, for docking. That technology isn't inherently expensive, but requires custom semiconductors produced with non-standard processes such as InGaAs. Those cost too much in small volumes. There's been progress in coming up with designs that can be made in standard CMOS fabs. When that hits production, laser rangefinders will cost like CMOS cameras.
Can anybody point me to some literature or reference materials about attempts to combine the inputs from multiple techniques simultaneously?
E.g. a device with stereo conventional cameras and infrared cameras & emitters which compares the resulting model from each input source/technique and actively re-adjusts final depth estimate?
Is "sensor fusion" the right jargon to use in this context?
Or, even crazier, a control system which actively jitters the camera's pose to gain more information for points in the depth map with lower confidence scores / conflicting estimates?
But maybe such a setup is overly complex and yields minimal gains in mixed indoor & outdoor scenarios?
One neat thing though you might want to look at: if all you have is structured light (ie Kinect v1) you can simply attach a vibrating motor to each emitter/receiver to avoid a lot of interference per 
The giant piles of dirty data are that way because for thirty years no one has considered them worth cleaning up. How will they create such astounding amounts of unexpected value?
It was left that way because we didn't have the tools to process it. Imagine the amount of unprocessed video data that we can now annotate pretty accurately. What's the value of that data now?
That's how AI always looks in the rearview mirror. Like a trivial part of today's furniture. Pointing a phone at a random person on the street and getting their identity is already in the realm of "just machine learning" and my phone recognizing faces is simply "that's how phones work, duh" ordinary. When I first started reading Hacker News a handful of years ago, one of the hot topics was computer vision at the level of industrial applications like assembly lines. Today, my face unlocks the phone in my pocket...and, statistically, yours does not. AI is just what we call the cutting edge.
Open the first edition of Artificial Intelligence: A Modern Approach and there's a fair bit of effort to apply linear regression selectively in order to be computationally feasible. Linear regression is just linear regression these days because my laptop only has 1.6 teraflops of GPU, and that's measly compared to what $20k would buy.
The way in which AI booms go bust is that after a few years everybody accepts that computers can beat humans at checkers. The next boom ends and everybody accepts that computers can beat humans at chess. After this one, it will be Go and when that happens computers will still be better at checkers and chess too.
Does the book mention linear regression at all? The term doesn't appear in the index.
The first edition discusses Least Mean Squares (LMS), which is in a way [that way being my way, maybe] the 'evil twin' of Least Squares. It fits the data points (in this case the values produced by the agent) to a known curve rather than the curve to the data points. In the first edition, 'long running' examples are on the order of a few hundred to a thousand epochs, and fitting a curve via linear regression is computationally similar.
Anyway, I'll claim poetic license if I must. And I must if my hand waving doesn't work.
Robin's post reveals a couple of fundamental misunderstandings. While he may be correct that, for now, many small firms should apply linear regression rather than deep learning to their limited datasets, he is wrong in his prediction of an AI bust. If it happens, it will not be for the reasons he cites.
He is skeptical that deep learning and other forms of advanced AI 1) will be applicable to smaller and smaller datasets, and that 2) they will become easier to use.
And yet some great research is being done that will prove him wrong on his first point.
One-shot learning, or learning from a few examples, is a field where we're making rapid progress, which means that in the near future, we'll obtain much higher accuracy on smaller datasets. So the immense performance gains we've seen by applying deep learning to big data will someday extend to smaller data as well.
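As an illustration of the metric-learning flavor of one-shot classification (one big assumption here: a pretrained embedding is available, which is where the real work lives), a minimal sketch:

```python
import numpy as np

# Toy sketch of one-shot classification via nearest neighbor in an
# embedding space. `embed` is a stand-in; in practice it would be a
# Siamese/matching network trained on classes OTHER than the ones below.
def embed(x):
    return x / np.linalg.norm(x)  # placeholder: just L2-normalize raw features

def one_shot_classify(query, support_examples, support_labels):
    """Classify a query by its nearest class prototype, one example per class."""
    q = embed(query)
    protos = np.stack([embed(s) for s in support_examples])
    dists = np.linalg.norm(protos - q, axis=1)
    return support_labels[int(np.argmin(dists))]

support = [np.array([1.0, 0.1]), np.array([0.1, 1.0])]  # ONE labeled example per class
labels = ["cat", "dog"]
print(one_shot_classify(np.array([0.9, 0.2]), support, labels))  # nearest to the "cat" example
```

All the generalization power lives in the embedding, which is why "learning from one example" really means "having learned a lot from other examples first."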
Secondly, Robin is skeptical that deep learning will be a tool most firms can adopt, given the lack of specialists. For now, that talent is scarce and salaries are high. But this is a problem that job markets know how to fix. The data science academies popping up in San Francisco exist for a reason: to satisfy that demand.
And to go one step further, the history of technology suggests that we find ways to wrap powerful technology in usable packages for less technical people. AI is going to be just one component that fits into a larger data stack, infusing products invisibly until we don't even think about it.
And fwiw, his phrase "deep machine learning" isn't a thing. Nobody says that, because it's redundant. All deep learning is a subset of machine learning.
I'm skeptical of claims about a one-shot learning silver bullet, unless people are talking about something different from how it has been classically presented, e.g. Patrick Winston's MIT lectures. Yes, you can learn from a few examples, but only because you've imparted your expert knowledge, maintain a large number of heuristics, control the search space effectively, etc. There's a lot of domain-specific work required for each system, so I consider it more an approach of classical AI and not something that figures out everything from the data alone, like deep learning.
But again, maybe people are talking about something different than my above description when they talk about one-shot learning today. Either way, I don't think having to rely on a lot of domain specific knowledge is necessarily a bad thing.
I was really impressed by their results, but I haven't seen it being applied successfully to other applications.
I think the process they use to select the hyperparameters overfits on the labels in the validation set. The true size of the labelled data includes the 100 from the training set and all labels in the validation set.
I'm really not convinced by one-shot learning, or rather I really don't see how it is possible to show that any technique used generalises well to unseen data, when you're supposed to have access to only very little data during development.
Even with very thorough cross-validation, if your development sets (training, test and validation) altogether amount to, say, 0.1 of the unseen data you hope to predict on, your validation results are going to be completely meaningless.
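A quick simulation of why tiny validation sets mislead; the numbers are illustrative, assuming a classifier with some fixed true accuracy:

```python
import random

# With a tiny labeled set, the measured validation accuracy swings wildly
# from draw to draw, so a "good" score on few examples says little about
# performance on unseen data.
random.seed(0)
TRUE_ACC = 0.7  # assumed true accuracy of some hypothetical classifier

def validation_accuracy(n):
    """Accuracy measured on n held-out examples, each correct with prob TRUE_ACC."""
    return sum(random.random() < TRUE_ACC for _ in range(n)) / n

small = [validation_accuracy(10) for _ in range(1000)]
large = [validation_accuracy(1000) for _ in range(1000)]

spread = lambda xs: max(xs) - min(xs)
print(f"spread of measured accuracy with 10 examples:   {spread(small):.2f}")
print(f"spread of measured accuracy with 1000 examples: {spread(large):.2f}")
```

With 10 labeled examples, the measured accuracy can land almost anywhere; the large-sample spread is an order of magnitude tighter.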
I've been hearing more folks in research and industry express the importance of applying simpler techniques (like linear regression and decision trees) before reaching for the latest state-of-the-art approach.
See also this response to the author's tweet on the subject: https://twitter.com/anderssandberg/status/803311515717738496
Broadly, whether you should move from OLS to random forest regression = SNR increase / increase in manhours and money spent.
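A sketch of that cost/benefit check in practice, using scikit-learn and synthetic data purely for illustration: fit both models and see whether the forest's gain justifies the extra machinery.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic data with a mostly linear signal: measure how much (if any)
# accuracy a random forest buys over plain OLS before paying for the
# extra complexity.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 0.0]) + rng.normal(scale=1.0, size=500)

ols_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
rf_r2 = cross_val_score(RandomForestRegressor(n_estimators=100, random_state=0),
                        X, y, cv=5, scoring="r2").mean()
print(f"OLS R^2: {ols_r2:.3f}, RF R^2: {rf_r2:.3f}")
```

On data like this, OLS matches or beats the forest, which is the point of the ratio above: the SNR gain is the numerator, and it can easily be zero or negative.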
The GP compares Python-vs-assembler with random-forests-vs-linear-regression, but the analogy breaks down because Python produces assembler and increases the programmer's general certainty about what they are doing. Random forests don't make their user more certain of the results.
Basically, Python is a relatively "unleaky" abstraction, whereas complex AI algorithms are very "leaky" abstractions.
My personal suspicion is, in a market full of people who are using ever more sophisticated algorithms to ratchet up their customer conversion classifiers' F1 scores by .001 per iteration, the leader will be the company who's decided to steer clear of that quagmire and spend their time and money on identifying new business opportunities instead.
Which could be seen as retrograde vs. assembler (though not really for the very funny and brilliant code above - you have to see it formatted nicely and run it to realize that there are some great people out on the web!). Perhaps I would in fact agree with this dig - some people do write horrid bits in their Python code, and Python seems to facilitate (or enable) this behavior rather more than other modern languages like Julia. But taking your comment more at face value - reading it to say that more complex methods represent an evolution and should be adopted because they are easier or better - I would disagree. It is easy to screw things up with a random forest or a booster: overfitting, focusing on the method rather than the features, and not understanding what the model extracted is telling you about the data. Often a regression model or a decision tree can reveal that there are a few simple things going on, which says more about how a process or system has been implemented than about the domain that process or system operates in. This can be gold dust. So I think simpler models can be easier to use and simpler to understand; of course, when they don't do the job, better model generators are required.
Notice now you can cogently disagree with the main idea while agreeing with most of the sub points (paraphrasing below):
1) Most impactful point: The economic impact that innovations in AI/machine learning will have over the next ~2 decades is being overestimated.
2) Subpoint: Overhyped (fashion-induced) tech causes companies to waste time and money.
AGREE (well, yes, but does anyone not know this?)
3) Subpoint: Most firms that want AI/ML really just need linear regression on cleaned-up data.
PROBABLY (but this doesn't prove or even support (1))
4) Subpoint: Obstacles limit applications (through incompetence).
AGREE (but it's irrelevant to (1), and also a pretty old conjecture.)
5) Subpoint: It's not true that 47 percent of total US employment is at risk .. to computerisation .. perhaps over the next decade or two.
PROBABLY (That this number/timeframe is optimistic means very little. One decade after the Internet, many people said it hadn't upended industry as predicted. Whether it took 10, 20, or 30 years, the important fact is that the revolution happened.)
It would be interesting to know whether those who agree in the comments agree with the sensational headline and point 1, or with the more obvious and less consequential points 2-5.
Sure, I suppose it's possible that the advances we've seen in AI won't translate into huge productivity gains, but I would think that extremely unlikely.
Arguing for more linear regression to solve a firm's problems is equivalent to arguing for machine learning. Now, if instead he wanted to argue that the vast majority of a business's prediction problems can be solved by simple algorithms, that is most likely true. But the economic impact of this is still part of the economic impact of machine learning.
For the most part they haven't run those regressions at all, and where they have, they haven't been awe-inspiringly successful in their predictions, never mind so successful the models are supplanting the research of their knowledge-workers.
LR and general regression schemes are captured in supervised learning methods. So yes, the systems use linear regression as a fundamental attribute but build on them significantly.
Yes, DL has proven itself to perform (most?) gradient-based tasks better than any other algorithm. It maximizes the value in large data, minimizing error brilliantly. But ask it to address a single feature not present in the zillion images in ImageNet, and it's lost. (E.g. where is the person in the image looking? To the left? The right? No DNN using labels from ImageNet could say.) This is classic AI brittleness.
With all the hoopla surrounding DL's successes at single-task challenges (mostly on images), we've failed to notice that nothing has really changed in AI. The info available from raw data remains as thin as ever. I think soon we'll all see that even ginormous quantities of thinly labeled supervised data can take your AI agent only so far -- a truly useful AI agent will need info that isn't present in all the labeled images on the planet. In the end the agent still needs a rich internal model of the world that it can further enrich with curated data (teaching) to master each new task or transfer the skill to a related domain. And to do that, it needs the ability to infer cause and effect, and explore possible worlds. Without that, any big-data-trained AI will always remain a one-trick pony.
Alas, Deep Learning (alone) can't fill that void. The relevant information and inferential capability needed to apply it to solve new problems and variations on them -- these skills just aren't present in the nets or the big data available to train them to high levels of broad competence. To create a mind capable of performing multiple diverse tasks, like the kinds a robot needs in order to repair a broken toaster, I think we'll all soon realize that DL has not replaced GOFAI at all. A truly useful intelligent agent still must learn hierarchies of concepts and use logic, if it's to do more than play board games.
Cleaning up data is very expensive. And without it, the analysis is good for nothing. AI helps provide good analysis without having to clean up data manually. I don't see how that is going away.
I don't even know where to start. I suppose you don't really think that what separates ML/AI (whatever that means) from your standard OLS regression is that denoising is not done manually?
> Cleaning up data is very expensive. And without that, the analysis is good for nothing.
I hope you are not saying here that linear regression cannot handle noise.
In the end, as this blog post also points out, ML/AI is just a vague blanket term. What you want is a statistical method that captures the signal as fast and efficiently as possible and often the gain from going beyond simple linear models might be marginal.
My own experience has shown that dirty data impacts advanced AI just as much as it impacts far more basic ML techniques.
Even for the most advanced AI we work on, we spend just as much time worrying about clean data as we do anything else.
In general: duplicate data, missing fields, different formats for different parts of the data, inconsistent naming schemes
For text: character encodings, special symbols, escape characters, punctuation, extra or missing spaces and newlines, capitalization
For images: different sizes, rotations, crops, blurry images
For numbers: inconsistent decimal point/comma, outliers with obviously nonsense values or zeros, values in different units of measurement etc.
And then there are bugs in your data pipeline: browser (particularly IE) bugs, logging bugs, didn't-understand-your-distributed-database's-conflict-resolution-policy bugs, failed attempts at cleaning all the previous categories, incorrect assumptions about the "shape" of your data, self-DoS attacks that produce extra duplicate requests (no joke - Google almost brought itself down with an img tag with an empty src attribute, which forces the browser to make a duplicate request on every page), incorrectly filtering requests so you count /favicon.ico as a pageview, etc.
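A minimal pandas sketch of a cleanup pass for a few of the categories above; the column names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical event log exhibiting several of the issues listed above:
# inconsistent naming/spacing, mixed decimal comma/point, missing fields,
# favicon requests counted as pageviews, and exact duplicates.
df = pd.DataFrame({
    "user": ["Alice", "alice ", "Bob", "Bob", None],
    "amount": ["1,50", "1.50", "200", "200", "3.00"],
    "path": ["/home", "/home", "/favicon.ico", "/favicon.ico", "/about"],
})

df["user"] = df["user"].str.strip().str.lower()                   # naming/spacing
df["amount"] = df["amount"].str.replace(",", ".").astype(float)   # decimal comma
df = df.dropna(subset=["user"])                                   # missing fields
df = df[df["path"] != "/favicon.ico"]                             # not a pageview
df = df.drop_duplicates()                                         # duplicate rows
print(df)
```

The point of the parent comment stands: each of these rules encodes a human judgment about this particular dataset, which is exactly why the work is expensive.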
If this works at all, the problem is that what you wind up with is a system that's been heuristically taught to clean up data for a single snapshot of your data - and the teaching is expensive and requires experts who are going to move on.
Less expensive than cleaning your data manually, but still a cost.
So when you wind up with a different pattern of dirty data after a year's time, the system winds up crappier than before, and no one will be able to agree on how to fix it.
Eventually AI is going to get an evil reputation and that may kill its appeal.
I have always been a stickler for making sure the data going in is good. Takes a bit longer, but makes life much much easier in the long run.
Practically speaking, isn't unsupervised learning one of the areas that hasn't really shown much practical application?
I mean, if I'm wrong, point me to some examples, but I was under the impression cleaning up data wasn't really an ML strength at this point...?
This field is notorious for its hype-bust cycles, and I don't see any reason why this time would be different. There are obviously applications and advancements, no doubt about it, but the question is whether those justify the level of excitement, and the answer is probably "no".
When people hear AI they inevitably think "sentient robots". This will likely not happen within the next 2-3 hype cycles and certainly not in this one.
Check out this blog for a hype-free, reasonable evaluation of the current AI:
There are a few very nice applications of AI techniques; however, most data sets don't fit well with machine learning. Tutorials use the Iris data set so much because it breaks into categories very easily. In the real world, most things are in a maybe state rather than a yes/no one.
Not to get too far afield, but I disagree with this on a certain philosophical level. All states are yes/no. All states of all things should result in a yes/no and be differentiable, with enough data. This doesn't speak to the practicality of that but as far as I can tell the theoretical potential is huge, almost infinite even.
What? I'm sorry, but this runs counter to everything in my experience, both professional and casual everyday experience.
More data helps to a point, but then there are diminishing returns, and it certainly doesn't eliminate the ambiguity. On the contrary, you discover diversity, and you still have misclassifications and perhaps an even harder data-cleaning problem, because now you're seeing cases that aren't actually clear-cut. Even if you're only talking about adding more features, again, that works up to a point, but then you hit sparsity issues.
Think about it. Let's say you had a problem which was find the black squares. So you collect some data and you find that you have a whole bunch of squares that are on the blackness scale of 0.0, and bunch that are 0.1, and then there's one at 0.5. Is 0.5 black? Maybe not. What about 0.7? Maybe. What about 0.999? Probably, but is it? It's not 1.0. And if we say 0.9 and higher are black, why not 0.89? Even discounting measurement error, there's nothing that supports a threshold at 0.9 beyond, "Well, I think it should be this."
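One common response to this threshold problem is to stop forcing a hard yes/no and report a calibrated probability instead, leaving the cutoff to whoever bears the cost of each kind of error. A toy scikit-learn sketch, with data invented to match the blackness example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Labeled "blackness" readings: a cluster near 0.0 labeled not-black and a
# cluster near 1.0 labeled black. A logistic model returns a probability
# that stays genuinely uncertain for mid-range values like 0.5.
blackness = np.array([0.0, 0.05, 0.1, 0.1, 0.9, 0.95, 0.98, 1.0]).reshape(-1, 1)
is_black = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(blackness, is_black)
for x in [0.5, 0.7, 0.999]:
    p = clf.predict_proba([[x]])[0, 1]
    print(f"P(black | {x}) = {p:.2f}")
```

This doesn't make "is 0.9 black?" answerable; it just makes the ambiguity explicit instead of hiding it behind an arbitrary 0.9 cutoff.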
Yeah, I hear this, but it seems only half-true to me. While for most intents and purposes the world is ill-defined, in another sense the world itself is "100% signal" and no noise. If we "zoom out" and take a grand view, imagining that we have a supercomputer and a huge database and that the algorithms are solved, I think every 'thing' in the universe has some unique features, and if you start to have them all in a database you may be able to uniquely identify any thing, at least those important to us. Everything one has excludes something else, but it also includes that specific thing. Every thing adds context to one thing and removes context from another. If you can draw a map of it, it seems to me that deep learning can, hypothetically, automatically differentiate it. Deep learning isn't just about one vector or one hierarchy of features; it's about how the world is ALL vectors like this, even if, right now, the CS around it is pretty limited. It seems to me intuitively true, at the bare minimum because we humans are absurd about categorizing everything into objects, and it actually works very well functionally (we can manipulate, create and predict in the world).
I suspect DL will eventually settle into a less vaunted role in the historical saga of AI than it portends now. And that role may well be the 'grounding' of sensory experience -- the modeling of the world into something perceptually and cognitively manageable, like Plato's shadows on a cave wall.
I think you'll find your ideas are actually very, very old. ;)
Right now, they are saying AI self-driving cars can get their predictions right 95%+ of the time. However, the cases where they cannot classify the object are the problem. Those are the "maybe" cases I was referring to, where the algorithm simply cannot classify the object no matter how much it has seen.
Could you elaborate on why this is?
Now let's take sentiment analysis, which tries to determine whether some words are positive or negative. If someone said 'that new machine learning algorithm is so sick', the algorithm has no way of knowing that 'sick' may be slang for 'good', because the system looks up 'sick' and finds that it is a negative word. Sentiment analysis has no way of handling sarcasm or other natural-language quirks.
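A minimal sketch of the lexicon-lookup approach being described, showing exactly this failure mode; the word list is made up for illustration:

```python
# Toy lexicon-based sentiment scorer: look each word up in a fixed polarity
# table and sum. Because the table is static, slang like "sick" (meaning
# "good") is scored as negative no matter the context.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "terrible": -1, "sick": -1}

def lexicon_sentiment(text):
    words = text.lower().split()
    return sum(LEXICON.get(w, 0) for w in words)

print(lexicon_sentiment("that new machine learning algorithm is so sick"))
# -> -1: the lookup table can't know "sick" is slang for "good" here.
```

The reply below is right that this is fixable: a model trained on labeled vernacular text can learn the slang sense, but a static lookup table by construction cannot.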
Of course it can. If humans are capable of detecting a given inflection, computers absolutely can as well (given enough data).
Any sentiment analysis algorithm which classifies "that new machine learning algorithm is so sick" as negative is not worth an ounce of consideration. Compared to other problems, that is absolutely trivial to classify, especially since you're typically training off data sets which already include such vernacular.
OTOH, the current progress in AI has enabled us to do things we couldn't do before and is pointing towards totally new applications. It's not about making existing functionality cheaper, or incrementally improving results in existing areas, it's about doing things that have been heretofore impossible.
I agree that deep nets are overkill for lots of data analysis problems, but the AI boom is not about existing data analysis problems.
I'm pretty sure industrial engineers would disagree here. The Romans certainly didn't clean their lead pipes with plasma, and neither did they coat them with some fancy nanomaterials to reduce stickiness.
The simple things with pipes are simple. Yes. However, to think we haven't made advances, or have no more to make, is borderline insulting to mechanical engineers and plumbers.
Ironically, deep learning will likely help lead to some of those advances.
Not what I was saying at all. My point was that pipes are used to transport something from point A to point B, and that regardless of what advances we make, they are still going to be used for that purpose, and that this is unlike the situation with AI.
Pipes do much more than just transport from A to B, though that is often part of it. Consider how the pipes of your toilet work. Sure, ultimately it is to get waste out of your house, but it's not as simple as just a pipe from A to B. You likely have a C, a water tank, to provide the flush. And there are traps to keep sewer air from getting back in.
Basically, the details add up quickly. And the inner plumbing for such a simple task is quite complicated and goes beyond simple pipes.
So, bringing it back to this. Linear algorithms are actually quite complicated. So are concerns with moving all of the related data. And that is before you get to things that are frankly not interpretable. Like most deep networks.
This hits the nail on the head for me. The author's observations in the first 9/10 of the article could all be perfectly valid, but the conclusions he draws in the last two don't follow, for exactly this reason.
Along similar lines, HFLP systems and systems that require laminar flow to be effective are both more recent techniques that come out of a better understanding and engineering of pipes. HFLP upgrades are a current engineering change over very recent and modern high-pressure systems.
How about a space elevator ?
To say that technology is like an iceberg is a major understatement.
The buzzwords which tech journalists, tech investors and even tech recruiters use to make decisions are shallow and meaningless.
I spoke to a tech recruiter before and he told me that the way recruiters qualify resumes is just by looking for keywords, buzzwords and company names; they don't actually understand what most of the terms mean. This approach is probably good enough for a lot of cases, but it means that you're probably going to miss out on really awesome candidates (who don't use these buzzwords to describe themselves).
The same rule applies to investors. By only evaluating things based on buzzwords, you might miss out on great contenders.
An AI winter doesn't mean that progress stops. It means that businesses and the general public become disillusioned by AI's or ML's failure to live up to the popular hype, and stop throwing so much money at it. The hype then dies down. Research continues, though, until enough progress is made that machine learning starts to produce results that excite the public again, and the cycle goes into another hype phase.
Robin Hanson is notoriously skeptical about the possibility that deep learning can make real gains. For some reason he thinks brain emulation is more likely to make large progress toward AI.
>An AI winter doesn't mean that progress stops.
It doesn't completely stop, but progress would be at a snail's pace.
> The hype then dies down. Research continues, though, until enough progress is made that machine learning starts to produce results that excite the public again, and the cycle goes into another hype phase.
I think we as a community may need to take a good long look at the hype cycle theory and be skeptical it has any merit.
Progress in machine learning has never been at a snail's pace. What has happened at times is that the futurists of the world stop making breathless pronouncements about machine learning, and that creates a public perception that things are going slowly.
Case in point: Deep learning really isn't anything revolutionary or new. I first started noticing papers about stuff that now falls under the "deep learning" catchphrase about 15 years ago. Not because it started then, but because that's when I started reading that sort of thing. During that time, there's been relatively constant, steady progress being made. But you wouldn't know it unless you had been following the literature, which not many people do. And all of this happened at a time when everyone was convinced that neural nets were dead and support vector machines were the way of the future.
As far as Robin Hanson's skepticism about deep learning getting us to true artificial intelligence goes, meh. A technology doesn't need to make Ray Kurzweil's eyes roll back in his head for it to be useful.
If we can successfully emulate the brain, it seems we would have necessarily acquired the knowledge needed to build models that are very powerful without having to exactly mimic the brain.
The "AI winter" as I know it was around 1988.
I still have vintage issues of "AI Expert" and other publications of the time (which I kept for their great linocut-style artwork, and because they were expensive items around here).
Back then it wasn't so much about ML (aka "Deep Learning") but anything Prolog, expert systems, artificial neural nets and their generalizations, and Lisp.
Putting 'AI' on a startup's prospectus will do it no harm. It may help sway lazy investors, or at least make the company appear more cutting-edge.
Same goes for the investors, they want to be seen to be investing in cutting edge, buzzword compliant startups.
It is just human nature, following the herd is always going to be the safest option.
Have you ever played one of those strategy games with a tech tree? Research A and it lets you research B, C, and D; research C and D and it lets you research E; etc?
That's based on the way discoveries in the real world build on each other. And in the real world, the research tree seems to be "lumpy".
Think "agricultural revolution", "industrial revolution", etc. Something new comes available, and everyone rushes to pick off all the new low-hanging fruit. Eventually the easiest gains are all taken, and people lose interest and move to other things. And as people keep picking away more slowly at the more difficult/involved things, eventually someone will find something that -- probably combined with some completely different existing knowledge -- opens up another new field. And it repeats.
Right now we're in the "low-hanging fruit" phase of (1) computers that are powerful enough to run neural networks, combined with (2) feedback algorithms that allow networks with lots of layers to learn effectively. Sooner or later the gains will get a bit tougher as we understand the field better, and then research will slow even further as many researchers find something else new and shiny -- and with better returns -- to focus on.
I'm not so sure about that. Places like DeepMind are not satisfied with simply having AI that does straightforward pattern-matching problems (though those are very important). They are moving into more complex areas like transfer learning, reinforcement learning and unsupervised learning for more complex, real-world problem solving. They also seem to be making good progress on this.
ML companies are already tackling tasks which have major cost implications:
Those are just the two I had off the top of my head. We apply ML to object/scene classification tasks and the models blow away humans. Not only that, we're already structuring a GAN for "procedural" 3D model generation - in theory this will decimate the manual 3D reconstruction process.
I can't see normal people wearing that heavy gear in their everyday lives. There will be use cases in specialized applications like education, industry and games, but I don't think it will get popular like the iPhone.
AR is still OK since it augments real life but there is a long way before it will become mainstream.
Ultimately, however, I resold it after a month. There are too few interesting full games available. Nearly every game is mostly a short trial, and most of the games are also very experimental and uninteresting to me in general. As an example, a full 1/3 of games available were musical demos that seemed to be geared toward folks having fun experiences while presumably smoking weed or otherwise in an altered state.
There was never a reason for me to come back to the system, but I would like to see if there are very imaginative useful practical applications that eventually see light. After surveying the other VR options I'm not convinced anything exists yet.
That's never stopped consoles from selling. Eventually, those games come out.
That seems like a clear example of a problem that will correct itself with time.
I wouldn't wear an Oculus while I'm sitting at home on my sofa with my family members, and not many people play games, especially in third-world countries like India. VR will have its place, but it won't reach the popularity of smartphones.
The fact that AI is now a meaningless term is mostly a testament to the success of AI.
* The article is correct and the current singularity (as described by Kurzweil) will hit a plateau. No further progress will be made and we'll have machines that are forever dumber than humans.
* The singularity will continue up until SAI. So help the human race if we shackle it with human ideologies and ignorance.
There is no way to tell. AlphaGo immensely surprised me - from my perspective the singularity is happening, but there is no telling just how far it can go. AlphaGo changed my perspective of Kurzweil from a lunatic to someone who might actually have a point.
Where the line is drawn is "goal-less AI", possibly the most important step toward SAI. Currently, all AI is governed by a goal (be it an explicit goal or a fitness function). The recent development regarding StarCraft and ML is ripe for the picking: either the AI wins or it doesn't - a fantastic fitness function. The question is how we would apply it to something like Skyrim, where mere continuation of existence and prosperity are equally viable goals (as per the human race). "Getting food" may become a local minimum that obscures any further progress - resulting in monkey agents in the game (assuming the AI optimizes for the food minimum). In a word, what we are really questioning is: sapience.
I'm a big critic of Bitcoin, yet so far I am still wrong. The same principle might apply here. It's simply too early to tell.
This is why you should work on something you're passionate about. Your time on earth is limited, so strive to leave good work and contribute to the progress of humanity on a larger scale.
But data science is here to stay in the same way that computer science is here to stay.
I meant that computer science has staying power, while particular branches (or JS frameworks) may rise or fall in popularity over time. Likewise, the "trunk" of data science knowledge is not a mere fad.
I'm not a data science academic or practitioner. My opinion is based on a small amount of tinkering and what I've read in various online sources.
I do have one suggestion: learn to handle dirty data.
I work with ML researchers and notice two things: they're pretty bad software engineers (no knowledge of software patterns, bugs galore), and they almost never know how to clean their data. The latter is because they do a lot of their research using pre-cleaned, standard data sets. You never get that in industry.
As to ML, its adoption is hyped. It is powerful, but not in the way anyone really talks about.
Support vector machines and Bayesian learning have been around since the 70s/80s (ninja edit: SVMs since 1963! Markov chains since the 1950s, Bayesian learning/pattern recognition since the 1950s), but adoption has been slow due to the nature of business, which is now drooling over it since neural networks beat a few algorithms.
Due to the hype, more businesses will opt for ML now, but the craze will plateau and ML will become just another tool in your arsenal.
So basically, you really have nothing to worry about - use your PhD to do interesting things, come up with novel research, and/or develop your own product.
Don't let job security worries get in the way of enjoying what you want to do now; you're already good and in STEM (and if you don't feel good enough, work on yourself until you do).
This is one of the hardest things to convince managers and leads of. They think things like CRFs and Markov models are "new" methods and too risky, so they opt for explicit rule-based systems that use old search methods (e.g. A*, grid search), which hog tons of memory and processor time. Those methods rarely work on the interesting problems of the modern day.
They can understand the rule-based methods easily. They have a hard time leaping to "the problem is just a set of equations mapping inputs to outputs, and the mapping is found by an optimization method."
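That "equations mapping inputs to outputs, found by optimization" framing fits in a few lines: a one-parameter linear model fit by plain gradient descent, with no hand-written rules anywhere (toy data invented for illustration):

```python
# Smallest possible instance of "learning as optimization": pick a
# parameterized mapping y = w*x, define a squared-error loss, and let
# gradient descent find w.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x

w = 0.0
lr = 0.01
for _ in range(1000):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 2))  # converges close to 2.0
```

Every ML method the parent mentions, from CRFs to deep nets, is this same picture with a fancier mapping and a fancier optimizer.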
The people who can't hack it are those who'll be in trouble. Tech has never much been the place where credentials are necessary.
Don't specialize and saddle yourself with years of college debt if you're unsure of the field's long term prospects.
And companies like Google are pretty keen on academic credentials. They've assembled what must be one of the largest collections of PhDs in history.
Right. The debt problems that people have after PhDs are more often due to their undergrad loans sitting around for 4-7 years while they were earning enough to subsist and not more.