I've had similar criticisms to the parent author for years, but I thought of it as a hubris problem. In each AI boom, there was a good idea, which promoter types then blew up into Strong AI Real Soon Now. The arrogance level of the first two AI booms was way out of line with the results achieved. This time, it's more about making money, and much of the stuff actually works. Machine learning may hit a wall too, but it's useful.
The field isn't going to get trapped in a local minimum with neural nets because the field is too big now. When AI was 20 people each at Stanford, MIT, and CMU, that could happen. With 50,000 people taking machine learning courses, there are enough people for some to focus on optimizing existing technologies without taking away from new ideas.
We're going to get automatic driving pretty soon. That's working now, with cars on the road from about a half dozen groups. Not much question about that.
The author rehashes symbolic systems and natural language understanding as areas of recommended work. This may or may not be correct. Time will tell. He omits, though, the "common sense" problem. There's been work on common sense, but mostly as a symbolic or linguistic problem. Yet the systems that really need common sense are the ones that operate in the real, physical world. What happens next? What could go wrong? What if this is tried? That's what Google's self-driving car project is trying to deal with. Unfortunately, Google doesn't say much about how they do this. That project, though, is really working on common sense.
Incidentally, Danny Hillis did not found Symbolics. He founded Thinking Machines, which built the Connection Machine, a big SIMD (single instruction, multiple data) computer with 65,536 simple one-bit processors all executing the same instruction on different data.
Riffing off the OP's argument of AI as gradient descent, one could say the 50,000 new people aren't necessarily doing purely random-restart hillclimbing so much as sequential Monte Carlo, sampling around a distribution laid out by the main pillars of the ML community. That is to say, it should only be robust against weak local minima. Hopefully we are self-aware enough to know when we're trapped in a strong one, and can await a true random restart.
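As a toy illustration of that distinction (a minimal sketch with made-up numbers, not a model of the field): random-restart hill climbing on a 1-D landscape with many weak local minima and one strong basin.

```python
import math
import random

def loss(x):
    # A 1-D Rastrigin-like landscape: many weak local minima, global minimum at x = 0.
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

def hill_climb(x, steps=200, step_size=0.1):
    # Greedy local search: accept a small random move only if it improves the loss.
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        if loss(candidate) < loss(x):
            x = candidate
    return x

def random_restart(restarts=50):
    # Restart from a fresh random point each time; keep the best result seen.
    best = random.uniform(-5.0, 5.0)
    for _ in range(restarts):
        found = hill_climb(random.uniform(-5.0, 5.0))
        if loss(found) < loss(best):
            best = found
    return best

random.seed(0)
x = random_restart()
```

Each restart gets stuck in whatever basin it lands in; only the restarts that happen to land near the strong central basin reach a good answer, which is roughly the weak-vs-strong local minima distinction above.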
Whilst technically correct, this obscures the truly colossal scale at which evolution computes.
If we consider each organism as a processor executing a single fitness observation, we can fit hundreds of thousands of such processors (e.g. bacteria) on the few square millimetres of a pin head. These processors are spread (less densely) across the 510 million square kilometre surface of the Earth. They are suspended in oceans, which have an average depth of 3.7 kilometres; and in the atmosphere, at least up to 10 kilometres (based on our observations so far).
Whilst the generation time (AKA execution time) of such organisms varies wildly, it is very commonly less than an hour. Yet even with these truly staggering resources at its disposal, evolution still took 3.3 billion years of wall clock time to come up with multicellular organisms.
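Taking the comment's own round numbers at face value (one generation per hour, 3.3 billion years of wall-clock time), the sequential depth of that computation is easy to put a rough figure on; this is strictly a back-of-envelope sketch:

```python
# Back-of-envelope: sequential "generations" available to evolution,
# using the figures above (one generation per hour, 3.3 billion years).
HOURS_PER_YEAR = 24 * 365
years = 3.3e9
sequential_generations = years * HOURS_PER_YEAR  # roughly 3e13 sequential steps
```

So even before counting the astronomical parallelism, the sequential chain alone is on the order of tens of trillions of steps.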
Relying on evolution to come up with meta-level optimisation mechanisms like intelligence is also rather flaky; we only have one example of biological evolution to study, acting on one representation (DNA + RNA + proteins). This just so happened to stumble upon aerobic respiration, multicellularity, neurons and brains, but it's certainly not a "goal" of the system, and sampling bias prevents us knowing how likely an outcome that is (i.e. whether it's an attractor). Still, organisms with nervous systems are a vanishingly small proportion of all lifeforms; and in fact, every "advance" is dwarfed by the success of single celled microbes.
We could always leave an evolving system running for longer, but setting up conditions which lead to open-ended evolution is still an unknown area; simulations always tend to level off/saturate at some point. As a "last resort" we could provide the system with a mechanism to sample organisms completely at random, and use that as proof that anything is possible given enough time; but in that case why use evolution at all, when we can just perform that sampling directly, or alternatively just enumerate organisms?
I agree, evolution's scale is colossal, but restrict your view to the tiny bit of the evolutionary machine that is implicated in figuring out how to arrange neurons (which evolution "discovered" previously, to be fair, with incredible effort) in a fashion that renders their behavior intelligent, and it becomes a vastly smaller effort.
These beings are all rather large, at least worm-sized or greater. Their lifecycles are measured in days to years, not minutes or hours. And they only had 250 million years or so to make the leap from "bunch of neurons hard-coded to do specific jobs" to "generally intelligent arrangement of neurons capable of higher level thought". All of this cuts the scale by at least 5-10 orders of magnitude compared to what you correctly point out as the overall colossal scale of evolutionary computation. And to me, it says that compared to all the other amazingly difficult stuff that evolution has discovered, intelligence was a damn easy find, especially since there's almost no fitness benefit - hell, worms probably do better than we do as far as evolutionary fitness goes, they're tiny, numerous, and reproduce like crazy.
My money is on the fact that human intelligence guiding a search cuts another couple of orders of magnitude off of how much of a long shot discovering an intelligent algorithm would be via random search. I could definitely be wrong; it's very hard to be precise when you're vaguely arguing about orders of magnitude. But to me it doesn't seem like the algorithmic "magic" in the brain can possibly be very complex if evolution was able to get there; there just aren't many other circumstances where evolution stumbles on something that damn clever.
I certainly agree that directing/biasing evolutionary processes can be a reasonably efficient search strategy. The difficulty is that it still seems to be a fallback, suited to problems where we don't have a more informed strategy. For example, if we can calculate gradients, we're probably better off doing gradient descent. If we can perform deduction/induction on symbols, we're probably better off doing that. Those problems where evolutionary processes seem well suited are those where we might not know how to bias the search.
I think the best place for such approaches at the moment is meta-level algorithms, where "smarter" algorithms (like gradient descent) are applied to the underlying problems we care about, but the parameters, policies, etc. evolve on top (e.g. step sizes, which algorithms to use, when to restart, scheduling concurrent attempts, etc.).
That's roughly the approach taken by NEAT-style hybrids, for example, where a genetic algorithm comes up with neural network topologies whilst the networks' weights are trained using backpropagation (classic NEAT evolves the weights with the genetic algorithm as well).
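A toy version of that meta-level split might look like the following; everything here (the quadratic inner problem, the crude (mu + lambda)-style outer loop) is a made-up minimal sketch, not NEAT or any real library:

```python
import random

def grad(w):
    # Gradient of the inner problem we care about: minimize (w - 4)^2.
    return 2.0 * (w - 4.0)

def train(lr, steps=20):
    # Inner loop: ordinary gradient descent with a fixed step size.
    w = 0.0
    for _ in range(steps):
        w -= lr * grad(w)
    return (w - 4.0) ** 2  # final loss doubles as the fitness signal

def evolve_lr(population=8, generations=15):
    # Outer loop: crude evolution over the step size itself.
    pop = [random.uniform(0.0, 1.0) for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=train)                      # lower final loss = fitter
        parents = pop[: population // 2]
        children = [max(1e-4, p + random.gauss(0.0, 0.05)) for p in parents]
        pop = parents + children
    pop.sort(key=train)
    return pop[0]

random.seed(1)
best_lr = evolve_lr()
```

The "smart" algorithm (gradient descent) does the heavy lifting on the underlying problem, while evolution only searches the small space of parameters that gradient descent can't set for itself.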
My 95% bet is that a random or evolutionary search might be how people optimize an AGI, not how they find it. They'll get there via a minor but innovative twist on the RNN work we've already seen (first people need to realize that backprop through time is a complete dead end, and they should be looking closer at reinforcement learning methods and figuring out how to bring them online).
There are actually ways to formalize and prove that, but it's intuitively obvious: an algorithm that can find the answer to a set of questions can't be faster than one that only has to find the answer to a subset of those questions.
I have a few responses. As an undergrad, I won't assume my background is strong enough for direct opinions, so apologies in advance as my responses are mostly pointers to other people.
Professor Patrick Winston at MIT would argue that expert systems, the successes of the '80s, were also about making money. Obviously not as much as present day, but that is due to the tech sector infiltrating more areas/markets simply because of the exponential growth of processing power. It would be very interesting to compare AI's value-add to the world between 1980 and today when adjusted for the "inflation" of Moore's law.
It's true I didn't consider the number of people in the field currently compared to the past, and that's an interesting point.
The instructor of the class that I wrote this paper for, Joscha Bach, would argue that real physical world results are not very significant, since the real world can be thought of as a simulation itself.
The idea of "what happens next? What could go wrong?" with self-driving cars is interesting. A question arises: should you be able to ask the car, "why did you just stop short?" and receive an answer? This is a question Gerald Sussman has been discussing recently. If we do want systems that can explain their behavior, then they must speak in human language and therefore must be symbolic at some level. In general, the idea of "Common Sense" seems too much of (what Minsky would describe as) a suitcase word -- because the definition is so abstract, it's not worthy of debate without defining the term.
Great call on Hillis, thanks! I completely mixed up Symbolics and Thinking Machines. Fixed!
I can't get that answer from my horse, but he has the common sense to stop before getting into trouble. Mammals with 99% DNA commonality with humans can run their lives successfully but can't talk. If we can get to good mammal-level AI performance, we should understand how to get to human. Right now, we're still having trouble getting to lizard level. Even OpenWorm isn't working yet.
But specifically for the horses example: I think it would be pretty hard to defend the case that horses have commonsense, unless you use the definition of commonsense as "basic survival skills." Horses can walk around until they find food, and they can run if they see a fast moving object. But I can't think of any examples that seem like commonsense, in the definition of "sound judgment in practical matters."
When thinking of it from a bottom-up approach, Rodney Brooks comes to mind. He tried to build rat-like creatures in the early 90s, with the goal that modeling rat behavior would enable modeling human behavior. However, the results were unsuccessful, and the implementations did not scale well. (Which is another case for the humans-are-fundamentally-different side of the argument)
> Ah, now we get into a pretty interesting debate: is there a fundamental difference between human cognition and other animals?
Actually, that's not an interesting debate at all, unless you like vacuous debates. You will have to be very - very - specific if you want that "debate" to be grounded in actual science.
> pretty hard to defend the case that horses have commonsense
Given that there is no neuroscience-based knowledge of what "common sense" even means, you are making up your own criteria.
I don't even see what point you are trying to make in your last paragraph.
> (Which is another case for the humans-are-fundamentally-different side of the argument)
Eh.. what? You name some random experiment, don't even say much about it at all, and then try to draw a general conclusion. And of course, as in all your other statements, you refrain from any specifics but remain a "politician", just playing with words that don't mean anything specific.
I'd politely disagree that my comment was a sequence of empty statements, but allow me to provide some clarifying details that might help your understanding.
The debate I describe is actually critical. If humans are not fundamentally different, then the field should be able to model simpler animals (such as a rat), and slowly build up to a model of human-level intelligence. On the other hand, if there is a fundamental difference between humans and other animals, then simply modeling other animals will not scale, and will leave the field wanting.
I don't think we have to get too specific to debate this. I would point to Winston, Tattersall, Chomsky as three widely respected individuals who present the case that symbolic language (and the uniquely human ability to combine two concepts into a new concept, indefinitely) is the keystone that separates humans from other animals.
In your second criticism, we agree exactly. As you can see in my previous comment, I agree that "commonsense" is an arbitrary term. Here, I was simply providing an example of how debating commonsense is not a useful exercise.
Finally, in my last paragraph I was providing an example of how attempting to use "simple animals" as a basis for modeling human behavior has not been effective. The previous commenter said that we haven't reached "lizard level" and pointed to the OpenWorm project. I pointed to a related project, from Brooks, that had a similar mission. It did not work, perhaps, because modeling simple animals (such as a rat or lizard) won't scale to humans. I did not mean to draw a general conclusion (since I said that it is simply another case, not a proof).
As to your final line, let's keep this a lively discussion and not a personal attack.
They too were saddled with drunks and had to know the correct way home without direction.
Can you define "unsuccessful" and "did not scale well"? From what you've said, I cannot see any relation to the "humans-are-fundamentally-different side of the argument". You just make it sound like a failed experiment, without saying why it failed.
This is sad. Especially because I believe philosophy could have a meaningful contribution to the progress of human understanding. It was a "science of interfaces" in the beginning, and now with more and more specialization we also end up with more and more "interfaces" between sub-sub-fields until 90% of everything will be "an interface between something and something else". Too bad we'll have to re-invent it under a different name, and probably disguise it as some sub-sub-field of engineering, to make sure we sever the connection with all people willing to waste everyone's time with empty-talk and empty-think...
Happens all the time.
I am hungry.
I am angry.
I need comfort.
I want to go somewhere.
I am pleased to see you.
And so on. It is elemental communication but it is real.
It's a joke, but funny because it is true.
That's one of the artifacts of the number of people in the field.
This is not in support of the parent's position, but expert systems also made money and in fact continue to do so today, as they're still in wide use in industry, for instance in airliner maintenance, fraud detection, etc. Obviously those are legacy systems and sometimes they're only de facto expert systems (as in, they're a big database of rules with an inference engine, though nobody calls them an "expert system"). Still, the technologies invented back then did make a lot of money for many people. But the market for expert systems went bust and people lost their money, and that's what really killed the field.
The number of people working on AI then and now is not very easy to compare. There was no Google, Facebook or even Apple and Microsoft in the first AI boom. IBM was around and it did do a lot of research in GOFAI, particularly logic programming. I've read IBM papers discussing Prolog-type systems for instance. And let's not forget that Deep Blue was essentially an expert system: it searched a database of domain knowledge compiled from the opinions of experts. So, mutatis mutandis (if I may), there was interest in GOFAI from the industry and there was money invested in it. The difference in that respect with what's going on today is not that clear-cut.
>> This time, it's more about making money, and much of the stuff actually works. Machine learning may hit a wall too, but it's useful.
I'm not going to disagree with that- not entirely. That machine learning is useful so far, there's no doubt, but it remains to be seen how useful it is in the long term. Like with expert systems, the obstacles may be more political than anything else. I'm finishing a Masters in AI (as in, lots of machine learning) that was paid for by my former employer, a big financial corporation- and yet, there was no interest in machine learning in the whole company for most of the time I was there. Maybe that says more about my ability to sell stuff (it's approximately 0), but I really did get the feeling that the corporate world doesn't understand the tech and doesn't care to understand it, so whether machine learning rises or falls will depend entirely on politics and not on how well it works, or doesn't.
Here is a quote from a paper I randomly sampled on arxiv:
> For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs for shallow networks (with three layers) and deeper networks (with more than three layers).
Also, if your critique is related to the perceived shortcomings of backpropagation, keep in mind that reinforcement learning is also a kind of backpropagation of a reward, but this time the reward is much sparser and lower dimensional. Thus, these methods are somewhere in between supervised and unsupervised learning, not quite enjoying the full supervision of backpropagating at every example, but still learning from an external critic.
The way forward is to implement reinforcement learning agents with memory and attention. These systems are Neural Turing Machines; they can compute in a sequence of steps.
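A minimal way to see the "sparse reward propagating backwards" point is tabular Q-learning on a 5-state chain; this toy has nothing to do with NTMs or any particular paper, it just shows a single terminal reward seeping backwards through the value estimates:

```python
import random

N_STATES, GAMMA, ALPHA = 5, 0.9, 0.5

def step(state, action):
    # Move right (action 1) or left (action 0); state 0 is a wall, state 4 is terminal.
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):
    s, done = 0, False
    while not done:
        a = random.randrange(2)  # behave randomly; Q-learning is off-policy
        nxt, r, done = step(s, a)
        # The single sparse reward at the end propagates backwards, one update at a time.
        q[s][a] += ALPHA * (r + GAMMA * max(q[nxt]) - q[s][a])
        s = nxt

# Greedy policy for the non-terminal states: move right everywhere.
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

There is only one reward in the whole environment, yet after enough episodes every state's value reflects it; that is the sparse, low-dimensional "critic" signal the comment above describes.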
While this is an interesting read (and NTMs are great), it is not particularly relevant to the model I describe or this paper.
I think you are misunderstanding the generality of my paper: I am not discussing a particular method of deep learning. I am using the idea of gradient descent as a metaphor for the field of AI itself.
As described in the second paragraph of the "Gradient Descent" section, this analogy is not high dimensional. In fact, it is only three dimensional: the distance the field is from General AI, time, and a hypothetical "method of attack".
Couldn't different objective functions be structurally more difficult than others to optimize? No matter how high-dimensional the search-space, trying to create a gaming laptop in the middle ages would have been a pretty frustrating experience.
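That intuition is easy to demonstrate in one dimension (a minimal sketch; the functions are standard textbook examples, not from the paper):

```python
import math

def gd(grad_fn, x0, lr=0.001, steps=5000):
    # Plain gradient descent in one dimension.
    x = x0
    for _ in range(steps):
        x -= lr * grad_fn(x)
    return x

def convex_grad(x):
    # Gradient of the convex bowl f(x) = x^2.
    return 2.0 * x

def rastrigin_grad(x):
    # Gradient of the rugged f(x) = x^2 + 10(1 - cos(2*pi*x)).
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)

convex = gd(convex_grad, x0=3.0)     # slides straight into the global minimum at 0
rugged = gd(rastrigin_grad, x0=3.0)  # stalls in a local minimum near x = 3
```

Same dimensionality, same optimizer, same starting point; only the structure of the objective differs, and that alone decides success or failure.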
Here's a hybrid approach
You should. It directly addresses the idea of blending different fields of AI.
The #1 threat to AI right now is Kaggleism, that is, the idea that training data is more valuable than talent, algorithms and all that.
However, as you can see in the section "1960s", many similar comments were predicted 50 years ago. The point I address in this paper is that optimism is easy when things are going well.
This is a great example. Lots of people are saying today that we will have X in 10 years. But in 2006, no one was saying we will have X in 20 years.
I'm not saying you're wrong, but I'm saying it's worth questioning why you are so optimistic, and whether it's because we aren't looking at the big picture.
Also, if anything is confusing, I'd be happy to clarify.
Matching it to Minsky's model really shows the limitations of current AI - but I'm going to read more about the validity of Minsky's model before setting any thoughts in stone.
Thanks for sharing that.
It's definitely worth reading more about Minsky's models. I have a lot of references in-line, but here are a few more starting points. The near-final draft of his book The Emotion Machine is available for free on his MIT site, and WashPo gave it a very positive review.
It would be interesting to investigate some criticisms of the model, though I haven't found any that specifically target the 6 layer model.
If not, let's stick to substantive comments on the content rather than grammar or spelling.
"However, it would be useful to have a neural network that is able to perform symbolic algebra. There are two clear reasons for this desire. First, this hypothetical system would demonstrate that neural networks can be used as a substrate for previously-achieved AI systems. Second, a neural network that could perform symbolic algebra would, by definition, be able to manipulate symbols."
Then directly following the above, this sentence makes very little sense:
"This would show that high level knowledge representations can be grounded in statistical models."
What does it mean for "high level knowledge representations" to be "grounded in statistical models"? It sounds to me like you're saying that implementing symbolic algebra on NNs would prove that NNs can implement symbolic algebra.
There is some good content in the article, but I find the conclusions and even the premise to be overreaching. It would read better if it were condensed into a history of AI booms and busts without making wild predictions.