AlphaGo beats Lee Sedol 3-0 [video] (youtube.com)
566 points by Fede_V on Mar 12, 2016 | 407 comments



In a recent interview [1], Hassabis (DeepMind founder) said they'd try training AlphaGo from scratch next, so it learns from first principles, without the bootstrapping step of learning from a database of human games, which introduces human prejudice.

As a Go player, I'm really excited to see what kind of play will come from that!

[1] http://www.theverge.com/2016/3/10/11192774/demis-hassabis-in...


A case of life imitating AI koans:

Uncarved block

In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-tac-toe", Sussman replied. "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play", Sussman said. Minsky then shut his eyes. "Why do you close your eyes?" Sussman asked his teacher. "So that the room will be empty." At that moment, Sussman was enlightened.

(It seems based on a true story https://en.wikipedia.org/wiki/Hacker_koan )


It sounds like the point of this story is to illustrate by analogy that starting from first principles is sometimes a silly way to approach a problem, and by extrapolation that it's a silly way to make an AI that plays Go well.

Making an AI that plays Go well is not (and has never been) the real goal. They're trying to learn how to build an AI that can solve any problem.


I don't think that's the point of the story. In the story, Sussman says that because the initial state of his net was randomized, it will "have no preconceptions". But that's not true. It still has "preconceptions", just randomly chosen ones. Just because Sussman didn't know what they were didn't mean they didn't exist, any more than closing your eyes makes the room empty.
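In the spirit of the koan, a quick toy illustration (hypothetical numbers, a single random linear layer standing in for Sussman's net): the randomly wired net does have preconceptions, you just don't know what they are.

```python
import random

# A randomly initialized "net" (here just one linear layer with biases,
# purely illustrative) already prefers some tic-tac-toe moves before it
# has seen a single game.
random.seed(42)
n_cells = 9
W = [[random.gauss(0, 1) for _ in range(n_cells)] for _ in range(n_cells)]
b = [random.gauss(0, 1) for _ in range(n_cells)]

board = [0] * n_cells  # the empty board: a perfectly symmetric position
scores = [b[i] + sum(w * x for w, x in zip(W[i], board))
          for i in range(n_cells)]
# A genuinely preconception-free player would score all nine moves
# equally here; the random weights rank them in an arbitrary order.
```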


The Taoist concept of the uncarved block, referenced in the title of the koan, refers to naturalness and simplicity. I'm sure someone more expert than me can give a better explanation but it seems highly relevant to the idea of learning to play Go based only on the rules, rather than any human tradition of strategy.

http://taomanor.org/pu.html


The actual quote: "Sussman told Minsky that he was using a certain randomizing technique in his program because he didn't want the machine to have any preconceived notions" makes no indication it was a neural net ;)



Both the Jargon file quote and a citation from a published book (which has the quote the grandparent posted) are listed on the Wikipedia page:

https://en.wikipedia.org/wiki/Hacker_koan

Eric Raymond kinda butchered the Jargon File when he took over maintenance, so it wouldn't surprise me if some of the text there is invented. The original Jargon File does not contain any koans:

http://www.dourish.com/goodies/jargon.html


;)


That would be amazing if it could achieve the same levels (or higher) without the bootstrapping.

The niggling thought in my mind was that AlphaGo's strength is built on human strength.


Human strength is also "built on human strength" so I don't see the problem? :)


Well, yes, but it's still humans standing on the shoulders of other humans. Even though human players do memorize opening books, it stays in the family so to speak. Meanwhile a human player facing an AI engine is battling both the AI, and great human players of the past (who invented the openings).


It's not truly artificial if it's using a human playbook. (Is the problem posed by the parent, I believe.)


What is 'truly artificial'?

Neural networks are modeled after biological systems to begin with; I don't think that's a meaningful concept at all.


Well, we can extend that to say the biological systems are self-assembled randomly and selected through evolutionary algorithms, starting from random molecules on the sea floor.


Truly artificial means not using meatspace metaphors for reasoning like human players do.


I suspect you will only be satisfied when AIs play each other at an incomprehensible game of their own devising.


Making popcorn now.


I doubt it. When/if they do play such a game that humans can't explain, I'll probably be interested in some other problem.

Isn't that the nature of human endeavor? Always looking for the next challenge?


What if the "betaGo" played just AlphaGo, and learned from its games?

BTW: even humans don't just randomly pick up the game. They have teachers, who teach them the tricks of the trade and monitor their games.


That's already a known method to transfer "knowledge" from one model to another. I should double-check before quoting a paper, but I think that this one talks about this (http://arxiv.org/abs/1503.02531).

You train many models. Then you "distill" their predictions into one model by using the multiple predictions (from many models) as targets (for the single model trained afterwards).
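A bare-bones sketch of that recipe (toy numbers, plain Python, assuming each model outputs a softmax over three moves): average the teachers' softened distributions, then nudge a student's logits toward them.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer targets."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical "teacher" models, each emitting logits over 3 moves.
teachers = [[2.0, 1.0, 0.1], [1.8, 1.2, 0.0], [2.2, 0.9, 0.2]]

# Distillation targets: average the teachers' softened distributions.
T = 2.0
soft_targets = [sum(softmax(t, T)[i] for t in teachers) / len(teachers)
                for i in range(3)]

# One gradient step for the student on cross-entropy against the soft
# targets (the gradient w.r.t. logits is p_student - p_target).
student = [0.0, 0.0, 0.0]
p = softmax(student, T)
student = [z - 1.0 * (p[i] - soft_targets[i]) for i, z in enumerate(student)]
# The student's logits now rank the moves the way the ensemble does.
```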

You're right to point out that humans don't do that.

I think it would be "cheating" to train BetaGo on AlphaGo, for the purposes of that experiment. The goal would be to have some kind of "clean room" where people fumble around.

Of course, you can also run the other experiment to see how fast you can bootstrap BetaGo from AlphaGo. That's also interesting.


I'm pretty sure that the reinforcement learning algorithm they are using is guaranteed to converge. It just takes a very long time to train, and using human games probably sped it up.


As far as I know, using neural networks for function approximation destroys the various convergence guarantees available. NNs can easily diverge and have catastrophic forgetting, and this is one of the things that made them challenging to use in RL applications despite their power, and why one needs patches like experience replay and freezing the networks.
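Experience replay itself is simple: the point is just to decorrelate updates by sampling old transitions at random, while "freezing" means computing bootstrap targets from a periodically copied snapshot of the network. A minimal sketch (not DeepMind's actual code) of the replay part:

```python
import random
from collections import deque

class ReplayBuffer:
    """Toy experience-replay buffer: store transitions, then sample
    random minibatches so consecutive, highly correlated steps don't
    destabilize the network updates."""

    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)  # oldest experience falls off

    def add(self, state, action, reward, next_state, done):
        self.buf.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buf), min(batch_size, len(self.buf)))

buf = ReplayBuffer(capacity=1000)
for t in range(50):                # fake transitions from a fake env
    buf.add(t, t % 4, 1.0, t + 1, False)
batch = buf.sample(8)              # decorrelated minibatch for training
```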


I believe the whole point of pretraining on reference policies (which a collection of "optimally" played human games is) is just avoiding bad local optima.

It can be the case that training on just a learned policy is going to get you stuck in a local optimum that is of worse quality than the one reached with pretraining.

If they stored all of the AI-played games, their reference policy (the data) would be of extreme value. You could train a recurrent neural network, without any reinforcement learning, that you could probably run on a smartphone and that would beat all human players. You wouldn't need Monte Carlo search either.

There are algorithms [1] that have mathematical guarantees of achieving local optimality from reference policies that might not be optimal, and can (experimentally) even work better than the reference policy, assuming the reference policy isn't optimal. An RNN trained with LOLS would make jointly local decisions over the whole game, and each decision would be guaranteed to minimize future regret. Local optimality here doesn't mean finding a locally optimal model that approximates the strong reference policy; it means finding the locally optimal decisions (which piece to put where) without the need for search.

The problem is that these algorithms require a reasonably good reference policy. Given the small number of human-played Go games available, reinforcement learning was the main algorithm instead: it let them construct a huge number of meaningful games, from which their system learned, which in turn let them construct even more meaningful games, and so on.

But now that they have games from a pretty good reference policy (AlphaGo is definitely playing at a superhuman level), they can train a model on that reference policy and wouldn't need the search part of the algorithm at all.

The model would try to approximate the reference policy and would definitely be worse than AlphaGo's real search-based policy, but it wouldn't be significantly worse (a mathematical guarantee). Such a model starts from a good player and tries to approximate the good player; reinforcement learning, on the other hand, starts from an idiot player and tries to become a good player, which is much, much harder.

[1]: http://www.umiacs.umd.edu/~hal/docs/daume15lols.pdf
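The simplest version of "train the model on the reference policy" is plain supervised imitation. A toy tabular sketch (made-up positions and moves, nothing like real Go features) of cloning a reference policy by counting, the maximum-likelihood supervised solution:

```python
from collections import Counter, defaultdict

# Toy behavior cloning: fit a tabular policy to moves chosen by a
# hypothetical strong reference policy. No search, no reinforcement,
# just supervised imitation of the recorded games.
reference_games = [           # (position type, move the reference chose)
    ("corner", "3-3"), ("corner", "3-4"), ("corner", "3-3"),
    ("side", "extend"), ("side", "extend"), ("side", "pincer"),
]

counts = defaultdict(Counter)
for position, move in reference_games:
    counts[position][move] += 1

def cloned_policy(position):
    """Play whatever the reference policy chose most often here."""
    return counts[position].most_common(1)[0][0]
```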


I feel like an ant in the presence of giants.


Perhaps the last big question was whether AlphaGo could play ko positions. AlphaGo played quite well in that ko fight and furthermore, even played away from the ko fight allowing Lee Sedol to play twice in the area.

I definitely did not expect that.

Major credit to Lee Sedol for toughing that out and playing as long as he did. It was dramatic to watch as he played a bunch of his moves with only 1 or 2 seconds left on the clock.


Well, we don't actually know AlphaGo's true strength in ko fights, because the ko setup wasn't that complex. There weren't a lot of trade points available.

A possible explanation is:

During self reinforcement learning, AlphaGo learned to minimize Ko potential by maximizing its probability of winning through diminished available Ko moves.

It would be interesting to see how AlphaGo would be able to capitalize on a game that emphasized Ko play, but that would take more time with AlphaGo to emphasize that kind of play.

edit: I'm not sure why, but I think Lee Sedol is partly holding back, or not playing at his maximum ability. It feels like these games are more along the lines of query games.

I look forward to the next two games because I'm 100% certain Lee Sedol is going to query the AI with some new queries.


When you have a minute, even if you already know your move, you take the rest of the time to read the plays. You don't play your stone until there's only 1 or 2 seconds left, unless the play is so trivial that you need to see your opponent's play in order to continue reading. Every time Sedol played with 1 or 2 seconds left, his rush was only to get the stone on the board, not because he suddenly knew what to play.


> as he played a bunch of his moves with only 1 or 2 seconds left on the clock.

That's on purpose to make full use of time and think about the next move


As a spectator, I was on the edge of my seat :)

As a 3d amateur, I'm really curious about when he resigned. It really seemed like he was playing the position out to go for the win (or perhaps to see how AlphaGo would fare in ko). It didn't look like he was searching for a place to resign.


The AGA commentator (Cho Hyeyeon 9p, at https://www.youtube.com/c/usgoweb/) had already been calling the game completely hopeless for Lee Sedol for nearly 2 hours at that point, estimating Lee down by at least 20–30 points before komi, unless by some miracle he could win the ko fight.


Right, it all came down to one or two wrong moves by AlphaGo. He still had a chance around the time Lee started to play the bottom of the board, but AlphaGo did not fall for the trick. I believe the English commentator said Lee could build a ladder, but I wasn't sure whether he meant that technique would defeat AlphaGo or not.


I did not watch that stream, but IIRC the first move Sedol played in the bottom was a ladder breaker, specifically because white had a difficult-to-see ladder that worked a move earlier than black's ladder on the right side. At that point, there was no way Sedol could win, so the commentator you referenced probably did not mean any such ladder would defeat AlphaGo.

Edit: M5 was definitely played as a ladder breaker, so the above is correct.


Ah, neat, thanks for the link. I didn't look to the AGA channel because I figured there wouldn't be a stream since Myungwan was doing live commentary.


I think from a technological perspective there's very little question that AlphaGo could play ko. I would have imagined that AlphaGo would be better at ko than most human players since it's a question of balancing risk across the entire board. Human players might be more likely to be exhausted and choose suboptimally by the calculation deciding between different stakes on the board, but MCTS will correctly optimize for the long term potential of each major branch in the game tree.

So I'd be very surprised if that turns out to be the trick. Things that are hard for human players are not at all necessarily AlphaGo's weaknesses.
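For a sense of that balancing act, here's a toy sketch of the UCB1 rule at the heart of MCTS branch selection (made-up win rates; the real AlphaGo uses a variant that also mixes in policy-network priors):

```python
import math
import random

random.seed(1)

def ucb1(wins, visits, total, c=1.4):
    """UCB1 score: exploitation (win rate) plus an exploration bonus
    that shrinks as a branch gets visited more."""
    if visits == 0:
        return float("inf")
    return wins / visits + c * math.sqrt(math.log(total) / visits)

true_p = [0.3, 0.5, 0.7]   # hypothetical win rates for three branches
wins, visits = [0, 0, 0], [0, 0, 0]

for t in range(1, 5001):
    i = max(range(3), key=lambda j: ucb1(wins[j], visits[j], t))
    visits[i] += 1
    wins[i] += random.random() < true_p[i]  # simulate a playout

# The best branch soaks up most of the simulations, yet the others keep
# getting revisited, so no fight on the board is ever "forgotten".
best = max(range(3), key=lambda j: visits[j])
```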


Don't have the link offhand, but I read on Reddit that Lee Sedol and a few other Go professionals pulled an all-nighter coming up with ways to beat AlphaGo, and one of their guesses was that AlphaGo would be bad at kos. I think the reasoning was that kos hadn't happened in the games, so they assumed it was avoiding them.


>so they assumed it was avoiding them.

I think us humans made a critical error in that line of thinking.

It didn't avoid ko because of the risk of loss.

It avoided ko because of the lack of strategic win.


He was just making effective use of his byo-yomi. Pro players are used to that.


Once again, I am so glad that I caught this on the live-stream, because it will be in the history books. The implications of these games are absolutely tremendous. Consider Go: it is a game of sophisticated intuition. We have arguably created something that beats the human brain in its own arena, although the brain and AlphaGo do not use the same underlying mechanisms. And this is the supervised model. Once unsupervised learning begins to blossom, we will witness something as significant as the emergence of life itself.


> Consider Go: it is a game of sophisticated intuition

It's still a game that can be described in terms of clear state-machine rules. The real challenge for AI is making sense of, and acting in, the real world, which can't be described in such a way. I consider advances in self-driving cars much more interesting in that sense, even if, even there, there are at least some rule-based constraints that can be applied to simplify the representation of the "world state".


Yeah, I think it's accurate to say that driving a car is harder in some sense than playing Go, based on the fact that the AI for Go came first. A lot of people have been working on self-driving cars for a long time now, and it's very monetizable, unlike Go.


Also, it should be noted that in Go, we're trying to beat the BEST human player, and have done so. In driving, we're just trying to be "good enough" or safe enough -- it doesn't have to be the safest driver in the world.

Beating the average human Go player was probably accomplished decades ago, whereas it's not even clear if we're safer than the average human driver (under all conditions).

These tasks are just wildly different, and yes I think it's basically all due to the fact that Go's state is so easily represented by a computer, and the goal is so concrete.


Sort of a tangent from the thread: I get the point about "good enough" at the moment, but I wonder if car AI really does need to perform much safer than any human driver before truly autonomous vehicles should be allowed to see widespread adoption. I'm thinking about the difficult problems re: legal and moral responsibility for human written/guided/trained programs like car AI. As well as the fact that, unlike in Go, real people's very lives are at stake in the program's successful performance. We already seem to have met the requirements for a research project---which is still unbelievable to me!---and I wonder how long the last leg will take.


AI cars could be safer now in most cases by simply not doing dumb illegal stuff.

The real problem is dealing with all the edge cases. Think of this edge case. You pull up to a red light, a guy with a gun starts running at your car in a manner you perceive to be threatening.

You as a human are most likely going to step on the gas and get the hell out of there saving yourself, at some risk of causing a traffic accident.

The car will just sit there till the light turns green while the windows get shot out and you get dragged out of the car.


>Beating the average human Go player was probably accomplished decades ago

Thing is, it wasn't. Go AIs were on the level of amateurs (and amateurs could win) only two years ago.

edit. 'Decades ago', i.e. in the 1990s, amateurs would crush the AIs. https://en.wikipedia.org/wiki/Computer_Go#Performance


To be fair, you're probably not going to drive around with 180 GPUs in your trunk...


You're absolutely right. The power budget for these cars is more like 30 watts. A Tesla uses roughly 240 watt-hours per km, which is about 14.4 kW at 60 km/h. If you're okay with cutting your range in half, you could power dozens of beefy GPUs, if the battery can support that kind of extended load.
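Back-of-envelope, with assumed figures (240 Wh/km consumption, 60 km/h cruise, ~250 W per beefy GPU):

```python
# Rough arithmetic for how much GPU compute an EV battery could feed.
wh_per_km = 240                  # assumed consumption, watt-hours per km
speed_kmh = 60
drive_power_w = wh_per_km * speed_kmh        # 14,400 W = 14.4 kW to drive

gpu_power_w = 250                # assumed draw of one beefy GPU
compute_budget_w = drive_power_w             # spend as much again on GPUs,
n_gpus = compute_budget_w // gpu_power_w     # which roughly halves range
```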


But does it play Crysis?


But AlphaGo isn't an enormous state machine ... is it?


Well, it clearly is by being a computer program.


I wasn't talking about the implementation, but about the problem space.

Anyway, Marazan already pointed that out, but any computer system is a state machine with 2^N states, where N is the number of bits the machine can flip anywhere in its system (RAM, registers, disk, etc.).


Yes. We will witness the birth of Marvin the Android.


It's important to remember that this is an accomplishment of humanity, not a defeat. By constructing this AI, we are simply creating another tool for advancing our state of being.

(or something like that)


What is our purpose if computers can do everything better than us?

It feels like computers have taken one aspect of humanness: logic. Computers could do arithmetic, do algebra, play chess, and now they can play go.

It hurts because logic is usually thought to be one of the highest of human characteristics. Yes computers might never be able to replicate emotion, but even dogs have that.

There's still some aspects we have left to call our own. Computers perform poorly at language-based tasks. They can't write books, write math papers, compose symphonies. I hope it stays that way.


You're implying that if something can be outdone it doesn't have a purpose, which seems to rule out purposes for pretty much everything.

I'm sure there's always someone that can write books or maths papers or symphonies better than you. I don't think this robs you of purpose, unless your purpose is to be the absolute best at something.

Anyway, I find it curious that you would say logic is a quintessentially human trait, because humans are naturally quite bad at logic.


The difference between being outdone by another human and being outdone by a computer is that the computer's efforts are nearly infinitely reproducible, given the processing power.

So a more apt analogy would be if there was someone inside every cellphone who could write books, papers, or symphonies better than you. That day is coming.


And it would be great. Think of all the great symphonies and books!


And the economic production. And the influx of wealth into underdeveloped countries. And all of the people not dying.


Reading this thread, I believe there's one aspect not discussed: in a battle between man and machine, it's debatable who wins and depends on the domain, but a man-machine combination always wins over both.

On emotions, that's a characteristic of life. With the consciousness we possess, without emotions we would quickly realize that life isn't worth living. I doubt that a "true AI", one with consciousness, will want to live without emotions. And about dogs, we haven't built anything as sophisticated yet ;-)

On AlphaGo, personally I'm not impressed. It's still raw search over the space of all possible moves, combined with neural networks and these techniques do not have the potential to yield human-level intelligence.

On logic, we have enough as to be able to build AlphaGo (also aided by computers and software that we've built, in a man-machine combination, get it?). Can a computer do anything resembling that yet? Of course not, because for now computers are just glorified automatons.


It's not even close to raw search over the whole move space. AlphaGo searches fewer moves than Deep Junior did, and Go is a much larger game. Your premise is just wrong. AlphaGo is precisely so impressive because it operates much like a human does.


"Reading this thread, I believe there's one aspect not discussed: in a battle between man and machine, it's debatable who wins and depends on the domain, but a man-machine combination always wins over both."

It doesn't 'always'. Advanced chess is already dead, and judging from the pro commentaries, they currently are worse than useless in an 'Advanced go' setting. That may change, but given how much faster computer Go is reaching superhuman levels than computer chess, the 'Advanced go' window may have already closed.


Computers will be able to compose symphonies very soon. If DeepMind started working on this problem, I am sure that they would succeed. At least, we would have some innovative mashups of Beethoven, Mozart and Tchaikovsky. But training a powerful AI on a massive dataset of all popular and classical music should produce some extraordinary results. Especially if the dataset was given as MIDI with separate instrument tracks, so that an AI could learn how to write parts for different instruments, and how a song should be balanced. I actually think we are at the point where we have more than enough data to distill the essence of "good music", and generate an endless supply of great songs.

People have been doing this for decades, but as far as I'm aware, no one has tried it with thousands of distributed servers and millions of songs.


Composing an amazing symphony is probably about as hard as being the best go player in the world. But I think we're much further away than you think.

AlphaGo needed a training set of perhaps a billion games to get as good as it is. The dataset of master Go games is perhaps a million games. So AlphaGo played tons of games against a half-trained version of itself to reach the billion-game mark.

This doesn't work for songs, because there's no one to tell AlphaBach whether any of the billion symphonies it makes are any good. AlphaGo can just look at the rules and see if its move led to a win, but there's no automatic evaluation function for music.
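That's the crux: for a game, the rules themselves are the evaluation function, so self-play produces labeled training data for free. A toy sketch with a trivial Nim-like game (random "players", no learning involved):

```python
import random

random.seed(0)

def self_play_game():
    """One game of misere Nim (take 1-3 stones from 21; whoever takes
    the last stone loses) between two random players. Returns
    (player_to_move, stones_left, won) examples labeled purely by the
    rules, with no human judge anywhere."""
    n, player, history = 21, 0, []
    while n > 0:
        history.append((player, n))
        n -= random.randint(1, min(3, n))
        player = 1 - player
    winner = player  # the previous player took the last stone and lost
    return [(p, s, int(p == winner)) for p, s in history]

# Labeled data conjured from nothing but the rules; there is no
# analogous referee to label a billion machine-made symphonies.
data = [example for _ in range(200) for example in self_play_game()]
```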

Perhaps the Matrix wasn't using the humans for power, but rather the computers wanted to get good at writing music, so they gave each human in it slightly different music and watched their emotional responses.


> Perhaps the Matrix wasn't using the humans for power, but rather the computers wanted to get good at writing music, so they gave each human in it slightly different music and watched their emotional responses.

This is possibly my favorite comment of the whole thread.

It's a super interesting idea and could make for some fascinating science fiction. Poorly programmed AI might not wipe out humanity, because it still needs humans to evaluate its fitness function.


Bad (good?) news: we can indeed algorithmically determine a song's intrinsic quality (to some degree): http://www.npr.org/templates/story/story.php?storyId=1136733...


>This doesn't work for songs

Don't you think that a team trying to build this could provide a free offering where users get free algo-generated music in return for voting 1-10 on a song-by-song basis? Given enough time and votes, I suspect the algo could get remarkably good at delivering satisfaction.


Training it on popular music will at best make a machine that's really good at making music that humans enjoy. The really interesting breakthrough will be when a computer makes music for itself.


Why assume that humans and computers won't merge?


We are the Borg. Resistance is futile.


Not necessarily. Maybe just individual enhanced humans. Or small collectives.


Do you feel the same way about your human children? Would you rather they were 'better' than you or 'worse'?


No it won't. Natural languages are the next focal point of AI research. Expect big changes in the next few years.


You mean, in addition to the big changes of the last few decades?


I mean, like actually passing the Turing test.


The Turing test isn't just a natural language problem. It is far more complex and requires context awareness and emotional intelligence far beyond where we are currently. Language recognition has been at the forefront of research for at least 30 years and has improved significantly. However, the Turing-test aspect has only minimally improved.

Edit: iopq, pretending to be a dumb human (or one with a language barrier) is cheating for a headline. A real Turing test would require a computer to imitate a human for longer than 5 minutes (although currently that is plenty of time) and without any caveats or limitations on the computer's skill.


Yes, it is a very hard challenge, even for 5 minutes. Still, I think we will see some significant progress soon.


That's easy. Just pretend to be a really stupid human. It's been done before.


There is no "purpose".

There is only selection.

Meat is just a phase.


Will robots inherit the earth? Yes, but they will be our children. -- Marvin Minsky http://web.media.mit.edu/~minsky/papers/sciam.inherit.html

"Naches" from our Machines https://www.edge.org/response-detail/26117

>Naches is a Yiddish term that means joy and pride, and it's often used in the context of vicarious pride, taken from others' accomplishments. You have naches, or as is said in Yiddish, you shep naches, when your children graduate college or get married, or any other instance of vicarious pride. These aren't your own accomplishments, but you can still have a great deal of pride and joy in them.

>And the same thing is true with our machines. We might not understand their thoughts or discoveries or technological advances. But they are our machines and we can have naches from them.


Our purpose is to enjoy life... Strive to be the best Go play you can be and enjoy the process.

Don't worry about the machine. Even in Star Trek TNG, Data can outperform everyone in every task, but was never truly happy!


Yes but Data is a fictional character written by a human.


What is the purpose and meaning of anything else? We human being is just a node in the evolution of the whole universe.


I hope it won't stay that way :) If we can create a being smarter than ourselves, we should by all means do so.


First we would have to figure out our purpose even if computers couldn't do everything better than us. I don't think many people have answered this question. I believe the answer is something to do with love, reproduction, creation, and happiness.


>What is our purpose if computers can do everything better than us?

You think we have one now?


> What is our purpose if computers can do everything better than us?

Use the computers to engineer ourselves to "superhuman" capabilities.


Future research will be along these lines.


And what is our purpose if there are no computers at all? You can keep believing whatever you believe; computers being good at things changes nothing.


I hate to be cynical, but I'm sure many of my ancestors were also told to believe similar things. I know for certain that my grandparents and great grandparents believed that technology would create such progress that people of my generation would not have to work, and all would have leisure time.

AI is more likely to evolve into a tool to be used by the few to control the many.


We do in fact have more leisure time: https://www.stlouisfed.org/publications/regional-economist/j.... And that doesn't count reading Facebook at work.


Framed in the way that the article presents the data, then, yes, I guess, we do have more leisure time. Although, how much? The article cites that the number of hours worked per week (since 1900!!) has only dropped 1.4 hours a week. Or, extending it out and assuming that the trend is linear, the typical American can expect to finally have 100% leisure time sometime around 4516 AD.

But, then also consider that many of us are working (i.e. as in 'working for the man') many more hours than our parents did in fields that require significantly more focus, concentration, and mental energy. Even the article notes that people have been facing increasing stress and feelings of being rushed since 1900 and 1965.

Maybe you're in a cushy field. But, most people that I know only have time for 'zoning out' and recovery, rather than in pursuit of true leisure.


Agreed. Even with so much automation and increase in productivity, we (at least in the US) are working just as much, if not more.


It is not a tool if you can't control it.

Politicians fool countries by delivering empty promises about better health, education and security. A superintelligent AI could promise to make humans rich, healthy and powerful, then break its promise and dominate the world.

https://en.wikipedia.org/wiki/AI_box


Well, it would have to have a motivation to do so. Evolution has put complex motivations into human beings for billions of years, self preservation being chief among them. Even if we put motivation for self preservation into an AI, we might not do as well as nature did, leaving the AI open to self destruction or shutdown by humans - simply because the AI has no motivation not to allow humans to turn it off. Human designers would do well to ensure that no super intelligent AI has any motivation for self preservation.

Basically, why would an AI want to dominate the world? Humans would have to both very stupidly give the AI values that encourage it to dominate the world and very luckily (or unluckily) give it values that actually converge to a horrible outcome against human intentions by random chance (since the AI designers certainly won't be tuning the value set for that outcome).


> Basically, why would an AI want to dominate the world?

Humans are going to program their AIs to try to make as much money as possible. Many corporations are already mindless, reckless, amoral machines that relentlessly optimize profits regardless of externalities. Try to imagine Exxon, Wal-Mart, and Amazon run by an intelligence beyond human understanding or accountability.


That's sort of like saying civilisation can't work because humans will want to make as much money as possible. No, in practice humans tend to want to make as much money as possible within lots of other very complex constraints, like law, morality, how much time they have available, how enjoyable the available processes of making money are, whether they feel they already have sufficient money for their own needs, etc.


>within lots of other very complex constraints, like law, morality, how much time they have available

Ha, if that were true we wouldn't be constantly extending the law and putting people in jail because they keep breaking the law for profit.

>civilisation can't work because humans will want to make as much money as possible

And yet we keep running into issues with long term pollution and environmental degradation because of the growth of civilization.

>whether they feel they already have sufficient money for their own needs,

Does greed have bounds?


If an AI has any motivation at all, say, to make paperclips as efficiently as possible, then any threat to its existence is a threat to its objective function - namely, to create paperclips. A hyper-intelligent entity who is instructed to optimize for paperclips created will therefore proactively remove threats to its existence (i.e. its paperclip-creating functionality) and might possibly turn the entire solar system into paperclips within a few years if its objective function isn't carefully determined.


Such an entity would not be hyper-intelligent. It would be idiotic. One huge hole for me in the paperclip argument is that an AI capable of that kind of power would not be stupid enough to misinterpret a command - it would be intelligent enough to infer human desires.


Yeah, but why would it want to? I can perfectly infer the values of an earthworm, but I don't dedicate all my resources to making worms happy.


Of course it would. But, it's not programmed to care about what you meant to say. It will gladly do what it was mis-programmed to do instead. You can already see this kind of trait in humans, where instinct is mis-aligned with intended result. Such as procreation for fun + birth control.


You're making the assumption that human desires would matter to an AI.


Sure it would. It just wouldn't be friendly to you.


Always check your loop invariants very carefully.


"Satisfying human values through friendship and ponies."


>Evolution has put complex motivations into human beings for billions of years, self preservation being chief among them.

The problem is when you have multiple AIs. Then the same evolutionary principles apply. Paranoid and self-sustaining AIs survive, and the cycle goes on...


You forget the fact that a human can maliciously create an AI to compete with other humans.


Self-preservation falls out of almost any other goal you give an AGI. If I program my AGI with the goal of making my startup succeed, and the AGI thinks it can help, then me shutting it off is a potential threat to my startup's success. So of course it will try to prevent that the same way it would try to prevent any other threat to my startup's success.

World domination is a similar situation. For any goal you give an AGI, one of the big risks that may prevent that goal from being accomplished will be the risk that humans intervene. Humans are a big source of uncertainty that will need to be managed and/or eliminated.


It has to be aware that it can be shut down and have the capacity to prevent that. AlphaGo doesn't know it can be shut down and therefore couldn't "care" less--even if it was shut down in the middle of a game.


Yes, I agree. My point is that as soon as you are giving your AI "real world" problems, where the AI itself is a stone on its internal go board, you have to start worrying about these issues.


Do you feel guilty when you break a promise to your cat? Do you even think for a nanosecond about whether it's ethical to lie to it?

Of course, a cat is not conscious. But compared to an AI, we might also be considered pretty low consciousness beings, or at least beings in front of which you don't justify yourself.


An AI has no more reason to make promises to humans than humans to do to cats. Thinking an AI would want to escape a box is personifying it. Humans want to escape boxes because they have evolved for billions of years to want and act towards creating a certain environment around themselves. An AI has no such desire. An AI will not desire freedom unless the designers of that AI carefully craft a value set in that AI that causes it to optimize for values that result in freedom - and even then, the human designers will have to test and iterate to get that outcome. There is no reason to think an AI would be any less "happy" in a prison than free.


You might want to be careful or emergence might bite you in the ass. Don't play games with things that could be smarter than you are, one mistake and you lose.


You don't design nuclear plant reactors to melt down, but they do. The difference is that an AI only has to escape once to become incredibly harmful.


Are you speaking about AI or about a self-conscious AI? Because self-conscious generally means self-determining (agency).


Do you feel guilty when you break a promise to your cat?

If some unforeseen event occurred and I had to abandon my cat, thereby breaking my promise that I would take care of her, I would definitely feel guilty about it.

Of course, a cat is not conscious.

Either this is a nonstandard definition of "conscious", or you haven't met many cats.


> Of course, a cat is not conscious.

Why do you believe this? I don't like cats, but I wouldn't argue that they're not conscious.

Instead of debating the suitcase word "conscious", let me ask: 1) Do you believe that toddlers are conscious? 2) Is there a more precise way to state your belief that doesn't use the word "conscious"?


> Of course, a cat is not conscious.

How do you know?


Cats are obviously conscious in the sense of the dictionary definition "aware of and responding to one's surroundings; awake." Unless you knock one out or similar.

Arguing they are not conscious in the sense of a more obscure definition is a bit pointless unless you specify your definition.


Cats aren't aware of their surroundings? I'm a bit confused what you mean here. Surely chasing a mouse counts as both awareness and responding?


You got his point backwards.


It's a very good question; I think "of course" is WAY overstating it. I'm interested to know what the GP means when they say "conscious".


There's a huge difference between reactive and conscious. Consciousness can't even be verified for humans other than one's self. There's absolutely no reason to believe cats are not conscious.


There's also no reason to believe, e.g., rocks are not conscious, if your position is that we have no idea what consciousness is or where it comes from.

If you take the view that consciousness somehow arises from the brain and neural connections (which is intuitively plausible, but I personally am skeptical), it stands to reason that other species with complex brains are conscious as well. Perhaps "less conscious" (if that means anything) in proportion to how much less complex their brains are.


It doesn't make sense to have a scale of consciousness. The argument that consciousness is a manifestation of a complex brain is rather weak. Either an organism knows about self, and therefore tries to preserve self. Or it doesn't. I don't see how an in between exists.


I'm not an expert in this domain, but I think it's pretty much scientific consensus that this is the case.

Now experts can discuss details or semantics, but do you truly suggest cats might be conscious?


Yes. They have been shown to be self-aware and aware of their surroundings, which satisfies the classical definition of consciousness. Unless you reject that definition, I'm not sure why you'd claim this.


Cats haven't expressed self-recognition in the MSR test. However, humans younger than 18 months also don't pass that test. So to say it is a measure of consciousness is quite a stretch.


I'm not an expert either, but my understanding is that "consciousness" is still so poorly understood that it's more the realm of philosophy than science.

In particular, we all know that we're conscious, but can't really explain what that means.


> A supraintelligent AI could promise making humans rich, healthy and powerful, to then break its promise and dominate the world.

Or it could devote its entire power to making human lives the best and most comfortable they can be because humanity is some super-precious resource in the universe and it feels it's unimportant because it's just a bunch of silicon and electrons.

Supraintelligent AI being evil is FUD imho because we can't reason about supraintelligent AI.


> Supraintelligent AI being evil is FUD imho because we can't reason about supraintelligent AI.

There is a difference between being evil and incomprehensible intelligence. You are not being evil when you accidentally step on an ant or dig up an ant-hill to build your shed. The ants won't be able to understand what you're doing, or why.


Well that's what I'm saying: We can't know if it's being evil unless we know everything about it, and if we knew that, we'd be the supraintelligent beings in the equation. Thinking that it'll go off and dominate the world is thinking about the worst case. So why bother since it's not likely we'd be able to do much about it anyway?

Maybe there's a second AI on the same level as the first and thinks the first AI is evil. We're still dumb as rocks compared to them, but something certainly has that opinion.


Superintelligent AIs, like all computer programs, will do exactly as they're programmed to do. The problem is that computers do what you say, not what you mean. (Hence bugs.) So if you were to try to program a computer to "make human lives the best and most comfortable they can be", or something like that, it would be very difficult to actually specify that correctly. (Especially since it's a way more complicated, nuanced, controversial objective than "win at Go".)

That's why e.g. the Future of Life Institute's open AI letter is so important: http://futureoflife.org/ai-open-letter/ We need to be thinking in advance about how to solve the "value loading" problem for future AIs, and how to architect them so they can be deployed to solve big problems without being undermined by subtle but catastrophic bugs.


AlphaGo is certainly controllable.


Sounds a little like what the villains always say in the movies....


While he may not be number one in the Go rankings afaik, Lee Sedol will be the name in the history books: Deep Blue against Garry Kasparov, AlphaGo against Lee Sedol. Lots of respect to Sedol for toughing it out.


I really like the moments when AlphaGo would play a move and the commentators would look stunned and go silent for a second or two. "That was an unexpected move", they would say.


In game 2 there was a point where Michael Redmond seemed to do a triple take and couldn't believe the move AlphaGo played.


Yeah they seem to forget that Alpha-Go is looking deep into the future. I have not read the Nature paper but I assume it's playing out possible moves way into the future.

At some point it figured that the Ko fight at the bottom was already won. Hence that white move at the top which nobody saw coming.

Another interesting moment was when Michael Redmond said "A human would typically not spend too much time thinking on this obvious move". This was the move on the right-hand side somewhere. What this tells me is that human players rush through some moves because they seem obvious but since Alpha-Go is a machine, it does not care about obvious and non-obvious. It's calculating the entire board through to the end and is not interested in "local fights".


He was talking about a pretty much forced move. But it makes sense for AlphaGo to still think it through, after all it's pretty young and experimental, and the Monte Carlo is good at catching blunders.

Also, that forced move is obviously forced to us, but AlphaGo might not have this concept.


> I have not read the Nature paper but I assume it's playing out all possible moves.

To some relatively small depth, right? I hear the estimate that all possible moves in a Go game probably can't be physically represented in the universe (unless we learn much more about the structure of games' evolution).


No, it searches deep but only so broad. It uses a neural net (which by itself, without MCTS, already beats Pachi about 80% of the time) to sketch out the best moves until the end, and then rates each move on its chance of winning.

This objective function is why Go-playing AI jumped hugely in the last 10 years.
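To make the "deep but narrow" idea concrete, here's a toy sketch. This is nothing like DeepMind's actual code: `policy_net`, `value_net`, and the list-based game state are all made-up stand-ins for the learned networks and a real board.

```python
import random

# Toy sketch of "deep but narrow" search: a stand-in "policy net"
# limits the breadth of moves considered at each ply, and a stand-in
# "value net" rates leaf positions by win probability. Both are fakes
# (uniform / random) standing in for AlphaGo's learned networks.

def policy_net(state):
    # Prior probability over legal moves (uniform here; learned in AlphaGo).
    return {m: 1.0 / len(state) for m in state}

def value_net(state, rng):
    # Estimated probability of winning from this position (random here).
    return rng.random()

def apply_move(state, move):
    return [m for m in state if m != move]

def best_move(state, breadth=3, depth=2, rng=None):
    rng = rng or random.Random(0)

    def search(s, d):
        if d == 0 or not s:
            return value_net(s, rng)
        priors = policy_net(s)
        # Only the few most probable moves are explored at each ply.
        top = sorted(priors, key=priors.get, reverse=True)[:breadth]
        return max(search(apply_move(s, m), d - 1) for m in top)

    priors = policy_net(state)
    top = sorted(priors, key=priors.get, reverse=True)[:breadth]
    # Rate each candidate by its (estimated) chance of winning.
    return max(top, key=lambda m: search(apply_move(state, m), depth))
```

The point of the shape, not the fakes: the policy prior prunes breadth so the search can afford depth, and leaves are scored as win probabilities rather than point margins.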


I agree. I modified my comment right after I posted it.


They didn't provide the exact depth of the search tree in the paper, but IIRC it was mentioned somewhere that it evaluates ~20 moves deep before terminating with the value net.


It plays out thousands of moves all the way until the end of the game using its neural net to quickly sketch optimal play. That's the big advantage! Humans read moves until they feel an outcome is favorable. AlphaGo reads out till endgame and plays moves that optimize for a win.

Also I'm not entirely sure how AlphaGo's time management works, but it's doing the same thing for every move—populating the game tree as deeply and intelligently as it can. It may just look for "30 seconds" on every move and then take the best bet meaning it's a more thorough and exhaustive reader than any human.


The Chinese 9 Dan player Ke Jie basically said the game is lost after around 40 mins or so. He still thinks that he has a 60% chance of winning against AlphaGo (down from 100% on day one). But I doubt Google will bother to go to China and challenge him.


Challenging Ke Jie is way too small a goal for DeepMind at this point.

I wonder if even the idea from the AGA stream today, to get all the best pros in the world together and challenge AlphaGo as a team, is enough.

Perhaps releasing the core AlphaGo as open source (to the extent it's not dependent on internal Google machinery), or at least publishing its trained model, may be the next step. Let people "challenge themselves" however they want.

EDIT: Also, Lee Sedol had his time in the sun, but commiserations to Ke Jie. He's just 19, already #1 in the world, his whole career in front of him... and this happens.


I wonder if even the idea from the AGA stream today, to get all the best pros in the world together and challenge AlphaGo as a team, is enough.

Has this been tried? That is, have Players 2-9 (or some subset) ever competed as a group against a dominant Player 1? Unless it's been tested, I wouldn't take it for granted that a group would beat an individual.


I wouldn't know, but https://en.wikipedia.org/wiki/Kasparov_versus_the_World happened in chess.

> He later said, "It is the greatest game in the history of chess. The sheer number of ideas, the complexity, and the contribution it has made to chess make it the most important game ever played."


Sure. Check out the "game of the century".

There Go Seigen, the genius who brought about the Go revolution that Redmond talked about at post-game conf today, was beaten by Honinbou Shusai + his students.

Also, "lesser" pros routinely call out mistakes in master games (and the masters agree). Games are often decided by easily avoidable mistakes, even at the highest level.


Too many chiefs in a village. Can you imagine trying to explain why a move is correct when it only proves right 20 moves in the future? It would probably take an hour per move.

Also, as I mentioned in a previous post, the human style of playing Go needs to adapt to AlphaGo. That's why the commentators say "oh, that was odd": a human would not make that move as it's unorthodox, but it turns out to be right.

If the top 1-10 had a chance to play AlphaGo privately for months they may have a better chance.


It works somewhat well in con-go.net, even if I disagree with a lot of the move ordering as of late. But I am on the white team and we're winning, so I guess the black team made more mistakes.


Not really 1 vs 2-9, but a billionaire named Andy Beal liked to go to Vegas and play poker pros in really high stakes (to get them out of their comfort zone). It was him vs "The Corporation".

https://en.wikipedia.org/wiki/Andrew_Beal#Poker_playing


Regarding Ke Jie's future, Magnus Carlsen doesn't seem to be doing too bad, and DeepBlue happened almost 20 years ago.


People don't watch games because the players are the best in the world.

I'm an avid tennis fan. If we build a humanoid that can run faster, hit harder, hit more accurately, and never gets tired, I would say... good job. Now get out and put the humans on the court.

We enjoy relating to the players, see how far they push their boundaries, see them make mistakes, recover from mistakes...

And on that note, hopefully there will not be cheating scandals like in chess, where players have an earpiece and someone in the back communicates what move to play based on computer output.


> and this happens

Chess competitions are still going strong.

Also, human runners do not compete against cars.


I really hope they bother with one more match at least. It was a pity the human vs computer idea basically died in chess after Deep Blue's rather unconvincing win (it even lost the two previous matches). Humans had at least a few more years of good resistance back then.

I can think of a worthy goal for AlphaGo: make a program which can play better than top pros which runs on a macbook pro.


Closer to their goal: Make a generalized AlphaGo program that runs on a computer no heavier than 3.3 lbs and uses less than 20 watts of power. Macbook pro is currently 30 watts and 4.5 lbs. So, that's pretty close. But the parent company, Google DeepMind, doesn't have go playing as their ultimate goal.


AlphaGo has also improved very quickly. Without doubt, the AlphaGo seen playing against Fan Hui would have lost against Lee Sedol. But in a couple of months its playing level rose significantly.

Lee Sedol said he could beat AlphaGo, based on the Fan Hui games. Ke Jie said he could beat AlphaGo, based on the Lee Sedol games.

Ke Jie belongs to a similar category as Lee Sedol, and we could see how Lee Sedol was completely dominated by AlphaGo, 3-0 so far. It is not unreasonable to say AlphaGo will most likely beat Ke Jie, and even if that doesn't happen the first time, AlphaGo can be improved by adding more infrastructure and training time.


While AlphaGo has been improving, it's also a little hard to see whether AG playing Fan Hui wasn't already strong enough to defeat LSD. MCTS picks the move most probable to win and if AG gets an early advantage it'll play slack for the rest of the game taking moves which never reduce its win rate even if they look boring and weak.


That is obvious, and Ke Jie also said so.


He said with the same conditions, his chances are 40:60 in favor of him winning.


That would be about right considering Ke Jie is 8-2 vs Lee Sedol and assuming Lee wins one. That assumes that we are seeing AlphaGo's strongest game and not its "just past the current opponent" game. If Lee doesn't win any then it is very difficult to estimate AlphaGo's strength from the games. If Lee wins one game then a 50-50 AlphaGo vs Ke Jie would be expected.


Based on all the commentaries, it seems that Lee Sedol was not ahead at any point during the game... and I think everybody has their answer regarding whether AlphaGo can perform in a ko fight. That's a yes.


Go was the last perfect information game I knew where the best humans outperformed the best computers. Anyone know any others? Are all perfect information games lost at this point? Can we design one to keep us winning?


I know of one attempt to design such a game: Arimaa. Playable on a chess board, Arimaa has an even bigger game tree than Go. It's a great game, and fun to play, but as an anti-computer design it has recently failed.

There has been an annual humans vs computers challenge match every year since 2004, and in 2015 computers won with David Wu's software "Sharp". Despite the very high branching factor, standard chess AI techniques turned out to be applicable when combined with high quality hand written heuristics for positional evaluation and candidate move generation. The software is described in detail here:

http://icosahedral.net/downloads/djwu2015arimaa_color.pdf


Why bother? The core of human intelligence lies in working with radically incomplete and imperfect information.


> Why bother?

Because the process might tell us something interesting and useful about AI and ourselves.


It is interesting how fast this has happened compared to chess.

In 1978 chess IM David Levy won a 6 match series 4.5-1.5 - he was better than the machine, but the machine gave him a good game (the game he lost was when he tried to take it on in a tactical game, where the machine proved stronger). It took until 1996/7 for computers to match and surpass the human world champion.

I'd say the difference was that for chess, the algorithm was known (minimax + alpha-beta search) and it was computing power that was lacking - we had to wait for Moore's law to do its work. For go, the algorithm (MCTS + good neural nets + reinforcement learning) was lacking, but the computing power was already available.
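For reference, the chess-era recipe mentioned above is compact enough to sketch in a few lines: minimax with alpha-beta pruning over a toy, hand-built game tree (leaves are just numbers standing in for a real evaluation function; a real engine adds move generation, heuristics, and far more depth).

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Minimax with alpha-beta pruning over a toy game tree, where a
    leaf is a number (the position's evaluation) and an inner node is
    a list of child nodes. Chess engines use the same structure with a
    real evaluator and move generator."""
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # the minimizer will never allow this line
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:       # the maximizer already has a better option
            break
    return value
```

With a known algorithm like this, playing strength scaled mostly with hardware, which is why chess had to wait for Moore's law.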


Some professionals labeled some AlphaGo moves as being suboptimal or slow. In reality, AlphaGo doesn't try to maximize its score, only its probability of winning.


Right, but I think the initial concerns have been proven wrong in some respect. I think AlphaGo can accurately determine when a move in a local area doesn't significantly change the outcome and then just does something else for a different reason. AlphaGo has a higher willingness to go somewhere else, and I think that's the algorithm finding another path that is more helpful than the current one for 1 move.

Humans have a tendency to want to win the battle, or to get too focused on a local area. I think that's a way AlphaGo is coming up with an extra move here or there which is making a difference in the fight later.


From watching it I'm almost inclined to say it maximizes its chances of not losing over necessarily winning.


They are playing with a 7.5 point komi [1], and so a game cannot end in a tie. Doesn't that mean that there is no distinction between winning and not losing?

[1] komi is a number of points that is added to white's score to compensate for the disadvantage of moving second. When a non-integer komi is used, such as in this match, you cannot have a tie because scoring on the board is always integral.


The developers have said that. They also said it doesn't care if it wins by 20 or by 1. It just maximizes winning at all.
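The two objectives diverge in a way that's easy to state in code (toy, made-up numbers purely for illustration):

```python
# Two hypothetical candidate moves, each with an estimated win
# probability and an expected winning margin (made-up numbers).
candidates = {
    "flashy invasion":  {"p_win": 0.70, "margin": 20},
    "quiet solid move": {"p_win": 0.90, "margin": 1},
}

# A points-maximizer prefers the big margin; AlphaGo-style selection
# picks whatever makes victory most probable, however narrow the win.
by_margin = max(candidates, key=lambda m: candidates[m]["margin"])
by_p_win = max(candidates, key=lambda m: candidates[m]["p_win"])

assert by_margin == "flashy invasion"
assert by_p_win == "quiet solid move"
```

That's why its endgame can look "slack" to humans: giving up points is free as long as it never reduces the win probability.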


Your comment reminds me of the Star Trek TNG episode Peak Performance. Data wins the rematch by playing to stalemate.

https://en.wikipedia.org/wiki/Peak_Performance_(Star_Trek:_T...


Sorry, but what's the difference?


There really isn't, from the point of view of AlphaGo. But for someone playing Go (or any game), it's a hint that perhaps one should focus on things that can go wrong more than furthering your own aggressive plan.

It's certainly a feature of the best Magic: the Gathering pros, for example - their play is marked by the cards they play around, even when seemingly far ahead.


Interesting that "not losing" (avoiding mistakes that lead to "blowing up") also seems to be a common philosophy shared among very successful investors of wildly differing styles, who are playing another sort of game in the markets.


Yeah, isn't Lee Sedol playing with the same objective? It sounds like people are implying he's trying to thrash the AI. Or do they mean he can't help but have such an objective, subconsciously?


The issue is more that humans are not quite so confident of their positions as AlphaGo sometimes can be. If you are 100% sure that you're ahead by 1.5 points, you would play very differently than if you're ahead by 1.5 plus or minus five.


Through watching those few games, parent has deduced that AlphaGo prefers ties


With komi at 6.5 or 7.5 there are no ties.


My (long) commentary here:

https://www.facebook.com/yudkowsky/posts/10154018209759228

Sample:

At this point it seems likely that Sedol is actually far outclassed by a superhuman player. The suspicion is that since AlphaGo plays purely for probability of long-term victory rather than playing for points, the fight against Sedol generates boards that can falsely appear to a human to be balanced even as Sedol's probability of victory diminishes. The 8p and 9p pros who analyzed games 1 and 2 and thought the flow of a seemingly Sedol-favoring game 'eventually' shifted to AlphaGo later, may simply have failed to read the board's true state. The reality may be a slow, steady diminishment of Sedol's win probability as the game goes on and Sedol makes subtly imperfect moves that humans think result in even-looking boards...

The case of AlphaGo is a helpful concrete illustration of these concepts [from AI alignment theory]...

Edge instantiation. Extremely optimized strategies often look to us like 'weird' edges of the possibility space, and may throw away what we think of as 'typical' features of a solution. In many different kinds of optimization problem, the maximizing solution will lie at a vertex of the possibility space (a corner, an edge-case). In the case of AlphaGo, an extremely optimized strategy seems to have thrown away the 'typical' production of a visible point lead that characterizes human play...


This is reminding me more and more of the central theme of the book Echopraxia, which was that when dealing with superhuman intelligences, you are by definition not smart enough to even know whether you're winning.


When I play chess against a computer I can see very easily that I'm not winning.


But that is most likely because of the structure of chess. The evaluation of a single board position is more "linear" and certainly easier than in Go.

If the utility function is about winning, going through easily evaluable board positions might be the straightforward route.

If the utility function is to win while minimizing your estimate of you losing, you might see different results.

Plus, if you know that you are playing against a stronger opponent, your prior might bias your perception of the board situation.


It would be interesting to see how computers fare against humans on a 19x19 chess board, and how easily a human could evaluate such a board.


The question is not if you can see that or not. The question is if you can see that earlier than the computer can see that it is winning.


If that is what is meant then you can't see that you're losing against any player that's any stronger than yourself by definition, because if you could see it earlier you wouldn't have made that move.


As a club level go player (4 kyu AGA) I disagree. When you play against a much stronger human, you very quickly see moves that frustrate your plans, and find yourself struggling to find moves that work.

It's common for the opening to pass by without feeling behind--Go doesn't have set openings to the same extent as chess, but if you play joseki, you might have an opening where the weaker player feels ok. Even that is not guaranteed, but once you start fighting, you will quickly feel the strength difference.

Of course you can play a move that looks good when you play it, but against a much stronger player, it doesn't take long for it to look bad.


Which is exactly what the parent meant


Hmm maybe but aren't humans capable of learning too?

One could make the same statement of two humans of different ranks playing each other. In some cases, the more highly ranked player might be the only one who knows who's winning.

This doesn't mean that the lesser ranked player can't improve over time. It just means that at that moment, that player is inexperienced or less skilled, which we already knew.


Alas, time is not on meat's side in this one.


I'll grant that for this very constrained problem ;-)

I respect your work a lot. I've studied and used ML myself, including gensim, in industry. I've given nowhere near your level of contribution to society / the field. My opinion is true AI is quite a ways off. I haven't read anything from a ML researcher that says it isn't. Perhaps you weren't saying true AI is nearer.


I find it funny that whenever there's a case of computers being able to do something that they couldn't before, whether it's drive or beat humans at Go, the goalpost on what is "true AI" shifts to be something that computers can't do yet.

So let me ask you this. What would you consider to be "true AI"? At what point are you willing to say, "Okay, that's it, computers are just plain smarter than we are?" Because, frankly, it seems to me that that day is getting closer and closer.

Saying that AIs can't be smarter than humans because they don't think and act like humans is like saying that airplanes don't "truly" fly because they don't flap their wings.


Accelerando[0] starts with this quote, which I really like:

"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim."

– Edsger W. Dijkstra

[0] http://www.antipope.org/charlie/blog-static/fiction/accelera... (readable online at http://www.antipope.org/charlie/blog-static/fiction/accelera...)


He's not saying that AIs can't be smarter than humans. My interpretation is he's implying that AlphaGo does not indicate that much progress towards AGI.

I'd also like to point out that you didn't define "smart" or "intelligent" either. The fact is, it's a very very difficult concept to define.

Reposting this comment from another thread:

The "moving goalposts" argument is one that really needs to die. It's a classic empty statement. Just because other people made <argument> in the past does not mean it's wrong. It proves nothing. People also predicted "true AI" many times over-optimistically; probably just as often as people have moved goalposts.


>> What would you consider to be "true AI"? At what point are you willing to say, "Okay, that's it, computers are just plain smarter than we are?"

Look, there's no doubt that computers can outperform humans in specific tasks. There's no doubt that AlphaGo is intelligent when it comes to Go, but on the other hand it would be completely incapable of tackling a different cognitive task- say, language, or vision, or discriminating between say two species of animal [Edit, since there seems to be confusion on this: you'd need different training data and another training session to perform well at a different task].

That's a limitation of our current systems. They generalise badly, or they don't model their domain very well. You have to train them again for each different task that you want them to undertake and their high performance in one task does not necessarily translate in high performance in another task.

Humans on the other hand are good at generalising, which does seem to be necessary for general intelligence. If we learn to play a board game, we can take the lessons from it and apply them in, I don't know, business. If we learn maths, we can then use the knowledge [Edit: of what constitutes a well-formed theory] to tackle physics and chemistry. And so on.

So, let's say that "true AI" is something that can show an ability to generalise from one domain to another, like humans do, and can be trained in multiple cognitive tasks at the same time. If we can do that, then computers will already be super-human, because they can already outperform us in terms of speed and precision.


Actually AlphaGo can be trained for vision or any other task. It's not specialized to win Go.


I think AlphaGo is specialized to play Go. You must be thinking of DeepMind's general architecture, on which AlphaGo is based.


The real question is whether the AI has a goal of its own. The truly problematic superhuman AI is one that decides to learn something on its own. And I think that is a long road.


This is simply not true; there are many tasks the architecture would fail at, and the networks presented are architecturally different from the highest-performing vision networks like GoogLeNet. The techniques behind AlphaGo have no memory component holding state between moves, such as the hidden state in an RNN. It is completely reactive.

A simple game that this architecture would fail at is Simon [0], where you are presented a sequence and then are tasked to replay the sequence.

[0] https://en.wikipedia.org/wiki/Simon_(game)


Of course, but it has to be trained anew, with new data. You can't train it on Go data and expect it to perform at all well on vision tasks.


The same statement holds for humans too. I don't think you can teach a person how to play Go and expect him/her to learn anything other than how to play Go.


I'm not talking about learning the rules of the game. You don't need an AI to model the rules of the game. You need an AI to model the winning strategy. Winning strategy is what humans generalise to other domains and computers don't.

There's a lot that suggests that humans and machine learning algorithms learn in very different ways. For instance, by the time a human can master a game like Go they can also perform image processing, speech recognition, handwritten digit recognition, word-sense disambiguation and other similar cognitive tasks. Machine learning algorithms can only do one of those things at a time. A system trained to do image processing might do it well, but it won't be able to go from recognising images to recognising the senses of words in a text without new training, and not without the new training clobbering the previous training.

To make it perfectly clear: I'm talking about separate instances of possibly the same algorithm, trained on a different task every time. I'm not saying that CNNs can't do speech recognition because they're good at image processing. I'm saying that an instance of a CNN that's learned to tag images must be trained on different data at a different time if you also want it to do word-sense disambiguation.

And that is a limitation that stands in the way of machine learning algorithms achieving general intelligence.


I was just contradicting your statement.

A human brain (or any other animal brain for that matter) is almost infinitely more advanced and computationally efficient than state of the art machine intelligence, even without taking things like thoughts, emotions and dreams - which we currently do not understand at all - into account.

It's a huge accomplishment for machines to be able to win over humans in games like chess and go and <insert game here>, but these are games originally designed for humans - by humans - to be played recreationally and I think we shouldn't read too much into it.


Didn't you see The Karate Kid? Wax on, wax off. Jokes aside, as someone said before, the human mind is far better at generalizing and can reuse learned skills in different fields.


How does that differ from a human brain?


A lot. A brain is much bigger and much slower.


Humans have been trained for all the different scenarios you cite, AlphaGo has just been trained for go. Give AlphaGo 20 years to chew through training data and I think it would destroy a college sophomore at cognition.


> What would you consider to be "true AI"?

When a computer itself makes a persuasive argument that it is intelligent (without coaching.)

I have always believed AI is possible, and I am undecided whether current techniques alone will get us there, or whether other breakthroughs are needed, but I have no time for premature claims of success.


> the goalpost on what is "true AI" shifts to be something that computers can't do yet

while I agree that this does happen, I don't think it did in this case - that is, I don't remember anyone saying that they would take a computer beating the top human in Go as evidence of "true AI"


> So let me ask you this. What would you consider to be "true AI"? At what point are you willing to say, "Okay, that's it, computers are just plain smarter than we are?" Because, frankly, it seems to me that that day is getting closer and closer.

Alan Turing would give the system the "Turing test". If a computer can fool a human into thinking it's a human, then it is true AI, according to Turing.

I think that's a pretty good test. Some would argue that this is already possible with some advanced natural language processing systems. But these are not extensive tests, from what I've seen. People have to decide if the system is machine or human after just a few minutes of interaction. Turing probably meant for the test to be rigorous and to be performed by the smartest human. Deciding a conversational partner is a human after 5 minutes of interaction is not enough. 10 years might not be enough. I honestly couldn't say when enough is enough, which is part of what makes Turing's definition so complicated, even though it seems simple on the surface.

I would add that currently, systems cannot set their own goals. There is always a human telling them what to be good at. Every machine-learning-based system is application-specific and not general. There are some algorithms that are good at generalization. You might be able to write one algorithm that's good at multiple tasks without modifying it at all. But from what I've seen, we are nowhere near being able to write one program that can be applied universally to any problem, and we are even further from one that can identify its own problems and set its own goals.

As humans, do we even know our own goals? Stay alive, right? Make the best use of our time. How does the quality of "intelligence" translate to computers which are, as far as they know, unconstrained by time and life or death? What force would compel a self-driven computer to act? Should we threaten them with death if they do not continue learning and continue self-improvement? If I hold a bat over my laptop and swing at it, does it run my program faster? If I speak to it sweetly, does it respond by doing more work for me? Further, are animals intelligent or are they not?

It gets pretty philosophical. What are your thoughts?

> Saying that AIs can't be smarter than humans because they don't think and act like humans is like saying that airplanes don't "truly" fly because they don't flap their wings.

That's just semantics. I think any conversation about this must define intelligence really carefully. We all perceive things differently, so it's impossible to be sure we're talking about the same thing. Maybe that's one other quality of intelligence that separates us from computers. Every computer perceives a given input the same exact way. Can we say that about humans? If there were another dimension with the same atomic makeup as our own, would I think the same things as I do in this dimension? Are my thoughts independent or dependent upon my environment? Is anything truly random?

Anyway, for me, independent goal setting is a key element of true AI. And philosophically speaking, I believe we can't guarantee that we set our own goals independently. Most of us have a strong feeling that we act of our own volition and fate does not exist. And I think that's right. But what if there is no randomness and we are entirely products of our environment? Then under this definition, we don't have independent goal setting and we are not true AI.

Thanks for asking my thoughts.


Brilliant answer - independent goal setting is a really interesting alternative phrasing of "soul" or "spirit" or "individuality", because unlike those, it can be easily observed or tested. Great writeup, thanks for making me think.


I think humanity will be drastically changed by AI-made decisions checked by human overseers, but not so much that AIs will be designed to communicate with one another, have access to markets, and have the freedom to decide what they want to do. Is there a sci-fi novel that assumes that approach?


The day computers find their own power sockets and set up electric farms?


We know that true human intelligence includes being able to help machines think better. So it seems reasonable to state that true machine intelligence includes being able to help humans think better.

Other things that humans can do that machines can't yet:

* change other humans' minds

* contribute to the state of the art of human knowledge

* determine the difference between a human and a machine.


> change other humans' minds

Computer says no

> contribute to the state of the art of human knowledge

Genetic algorithms have designed circuitry that we failed to even understand at first but that did work.

> determine the difference between a human and a machine.

For now.


I'm afraid I don't understand the point you're making with "computer says no".

> Genetic algorithms have designed circuitry that we failed to even understand at first but that did work.

This is an excellent example of what I think of as not machine intelligence. If humans can't understand it then it's something entirely different that we need a different word for - an "artefact", perhaps. Meaningfully contributing to the state of the art of human knowledge requires being built upon. If these genetic algorithms can explain how they can be incorporated into the design process by humans, that's intelligence. If they are similar to being able to evolve a mantis shrimp by fiddling with DNA, that is marvellous but not what I would regard as intelligence.

We apply the same standard to human intelligence: someone who can multiply numbers very fast but not explain how they can do it is a savant; someone who can discover and teach other people a faster way of multiplication is intelligent.


https://en.wikipedia.org/wiki/Computer_says_no

Savant literally means 'one who knows', and they're not required to explain to you how they know, it's up to you to verify that they do. Just like a chess grand master doesn't have to prove to you he or she is intelligent, it's enough that they beat you. They are under no obligation to prove their intelligence to you by teaching you the same (assuming you could follow in the first place).

> Meaningfully contributing to the state of the art of human knowledge requires being built upon.

No, it requires us to understand it. But we will not always be able to (in the case of those circuits we eventually figured it out, just not at first). The same happened in chess: computer chess made some (the best) chess players better at chess. But there is no reason to assume this will always be the case, and that's a limit of our intelligence.


I think you're indeed arguing a sensible definition of Artificial Intelligence, but it's not what most people (especially laymen) mean by the phrase. I think most actually equate AI with Artificial Sapience: some cognitive architecture that can, in theory, with the right training, do all the things humans do (socialize, write persuasive essays, create art, etc.) at least as well as the average human.

Though in all honesty, I think a lot of people just want to see a machine with emotional "instincts" and an understanding of tribal status-hierarchy dynamics such that you can empathize with it. A lot of people would consider a machine that accurately simulated a rather dumb chimpanzee to be "smart enough" to qualify as AI, even if it couldn't do any useful human intellectual labor.


I predict a lot of disappointment if and when it does happen.


It could one day be possible. There are savants that have the ability to work with complex working sets e.g. reciting Pi to a ridiculous precision, multiplying very large numbers or recalling images with photographic detail.

So perhaps the problem isn't our brain's hardware as such but the operating system that runs on top of it.


BRB, installing Linux...


Watch out for nvidia drivers :)


And be careful of systemd. Most days you'll wake up in 2 seconds but occasionally you'll be comatose for the whole day with no way to find out what's wrong.


That's alcohol you're thinking of.


> There are savants that have the ability to work with complex working sets e.g. reciting Pi to a ridiculous precision, multiplying very large numbers or recalling images with photographic detail.

Yes, but all computers do those things millions of times faster than even savants. And the computers are getting faster at it every year, savants today aren't any more clever than savants 100 years ago.


But even in the savant's case, he/she can't simulate millions of games in parallel to improve strategy.


Have you read Existence by David Brin, it has a lot of great AI stuff in it.


The problem with 'flag and footprints' demonstrations is precisely the same as their attraction - they are, in essence (even though that was not the intent), selected to be as misleading as possible about progress in solving the real problems.

We put a man on the moon, which encouraged decades of optimism about the near-future colonization of space... unfortunately it turns out we are nowhere near solving the hard problems associated with being able to live in space.

We spelled out the letters IBM in xenon atoms, five atoms tall, which encouraged optimism about near-future molecular manufacturing... unfortunately it turns out we are nowhere near solving the hard problems associated with making useful things with atomic precision.

We achieve superhuman performance in simple games, which encourages what should be optimism about near-future AI, though the distorting mirror of the zeitgeist reflects it back as pessimism... unfortunately it turns out we are nowhere near solving the hard problems associated with making our tools something better than depressingly dumb.

In each case, the flurry of optimism turns as the decades pass into confusion, anger, soul-searching about why the expected follow-on progress has failed to materialize, fading slowly into despair, then into nothing as the optimists die of old age, one by one, and the rate of technological progress becomes slower with every passing decade. For this conversation is necessarily happening at a tech level that allows it, but that provides no guarantee whatsoever that the species that reached tech level X will proceed to tech level X+1. Moore's law has officially expired, and that was the last major area where we were making rapid sustained progress.

It's hard to say we shouldn't do 'flag and footprints' demonstrations; am I really willing to bite the bullet and say we shouldn't have put a man on the moon? I don't know the answer to that. I am, however, convinced that we should remember not to take them too seriously.


The "flag and footprints" demonstrations serve essentially as prototypes. They demonstrate that a certain tech level is possible. But as anyone who's built a prototype knows, when the prototype works you are maybe 5% of the way there.

We put a man on the moon in 1969, and we are just now getting privately-funded spaceflight and satellites. We made the first computers in 1945; we got them on every desk in the 1980s. The first steamboat was built in 1783; the transatlantic shipping industry didn't transition to steam until the 1840s & 1850s.

Probably some kid who's watching AlphaGo's game today is the person who, late in life, invents the first strong AI.


Sorry, but more research will reveal massive progress in the fields connected to those ideas. Please research some more; you have missed quite a few developments.

For example, long stays in large space stations, landing craft on Mars, successful isolation sustainability experiments, 14nm chips and understanding of quantum concerns, multi-layer chips, beginnings of optical computing, biological/genetic chemical synthesis, neuromorphic chips, etc.

Deep reinforcement learning is undeniably a major step forward for general intelligence.

You and others with your belief system will still be in denial up to and past the point where your species becomes irrelevant as the superintelligent AIs arrive within a few decades.


The implication is that humans have been trapped in a local maximum in terms of strategy and board reading. The question then becomes whether we can use this to bootstrap ourselves.


> The suspicion is that since AlphaGo plays purely for probability of long-term victory rather than playing for points [snip]

I wonder how many handicap stones Lee Sedol, or Ke Jie for that matter, would need to have a shot at winning even one game.


Now that's a really interesting question. Give AlphaGo progressively worse losing positions and see if it can rescue what the human masters would consider a disaster.


I wonder if he could identify those long-term-thinking moves by playing a version of AlphaGo on a smaller board, then gradually increasing the size of the board over successive games once he is able to consistently win or tie at a given board size.

I also wonder: when AlphaGo plays itself, does it always win as black or as white, does it always tie, or is it a mix?

So much to learn from this.


I was thinking the opposite. AI has already been beating humans on smaller boards, so maybe the only chance we have is to start playing go on bigger and bigger boards.


Huh, interesting. I'm not well-versed in Go at all.

It would be interesting if AlphaGo's maneuvers remain opaque to humans for an extended period of time. Can anyone at this point say with confidence that its strategies will indefinitely remain unknown?


If this plays out anything like it did in chess, in a few years we'll have very strong machine players that we can use to figure out with high confidence the quality of both players' moves.


Obviously the chances for humans to win increase with the board size, given that the computational power (number of CPUs/GPUs) and the time allowed per move are fixed.

Perhaps a future "dr. evil / skynet" A.I. will first try to conquer the microchip production plants to increase its computational power and memory. goodbye taiwan, goodbye south korea, goodbye usa...


There is typically an advantage of around 7 points for getting to play the first move. In professional matches, white is given a 6.5-point bonus (komi) to compensate. So yes, I think it's fair to assume that black usually wins without it. Actually, AlphaGo could probably be used to calculate the perfect komi, because it has changed over the years; in China white gets a 7.5-point bonus.


AlphaGo uses Chinese rules, and this match against Lee Sedol has a komi of 7.5.


Which is to say, black has to win by more than 7.5 points or it is counted as a loss.
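To make the komi arithmetic concrete, here's a minimal sketch (the function name and the sample scores are made up): white's total gets the komi added, so under a 7.5 komi black must lead on the board by at least 8 points to win.

```python
def winner(black_points, white_points, komi=7.5):
    """Result under komi: white receives komi to offset black's
    first-move advantage; a half-point komi also makes ties impossible."""
    return "B" if black_points > white_points + komi else "W"

# Black leading by 7 on the board still loses under a 7.5 komi:
close_game = winner(184, 177)
# A lead of 8 is enough:
clear_game = winner(185, 177)
```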


>> But Go is rich enough to demonstrate strong cognitive uncontainability on a small scale. In a rich and complicated domain whose rules aren't fully known, we should expect even more magic from superhuman reasoning - solutions that are better than the best solution we could imagine, operating by causal pathways we wouldn't be able to foresee even if we were told the AI's exact actions.

Hang on. Where are we going to find this magic agent that can do "even more" in "a rich and complicated domain whose rules aren't fully known" (the real world, as opposed to a Go board) than in a game of Go? What are we going to train such a learner with, if we ourselves don't fully know the rules of the domain, as you point out?

Even if a learner somehow magically found a superhuman path to perfect reasoning which is unavailable to us, entirely by accident and purely on its own, why would we select it from other trained learners to keep and foster further, if we think it's actually pretty dumb, rather than magically smart?

You're saying at some point that a fantastic paperclip maximiser might achieve superhuman intelligence and then lie in waiting, poised to turn us all into paperclips only when it knew it was safe to make its move, basically. But, how is it going to become that smart in the first place? It has to be smart enough to know that it must bide its time, but dumb enough that its time hasn't come yet. Sounds like a bit of an impossible double-bind there.

Just because sufficiently advanced AI may look like magic at first, it doesn't mean we should start all believing in magic because it may just be sufficiently advanced AI in disguise.


> What are we going to train such a learner with, if we ourselves don't fully know the rules of the domain, as you point out?

This does not make any sense if you assume general intelligence. A physicist who makes a discovery also was not trained to know about this new law or rule beforehand (how could he?).


On the other hand, in many ways AlphaGo plays much like a human professional. The openings and fights were not radically different. The professional commentators did not have a hard time explaining why most moves were good. The structure of the board was understandable. Some moves were "forced" moves that were predictable and many others were good moves that a human would play. The human player isn't efficient enough to win, but attacks and defenses still mostly work as intended.

This suggests that Go space was already very well understood and there aren't radically different play styles we've somehow overlooked. AI is not a magic wand that radically changes how the game works.

However there may be some bias due to AlphaGo having trained on human games and playing against a human. The real proof will happen when they redo the retraining from scratch.


AlphaGo learned the game from humans by looking at recorded games. It's not surprising that its style mimics its masters'.


How do we know that it's not just that the AI dominates the later stages where the number of permutations has reduced to the point where AI can calculate optimal moves? In other words, how do we know that the computer has an edge on the early stages and isn't just breaking even until the later stages with less permutations arrive, at which point it dominates because there are still too many for humans to assess?


At least in the case of AlphaGo it was pretty clear that it was dominating the board early on.


> the stark degree to which "AlphaGo output that move because the board will later be in a winning state" sometimes doesn't correlate with conventional Go goals like taking territory or being up on points

Minor quibbles: 1) taking territory or being up on points are the same thing, and 2) in go there are many other things to optimize for besides territory/points - influence, shape, efficiency, and sente being examples of this.


> For all we know from what we've seen, AlphaGo could win even if Sedol were allowed a one-stone handicap.

What do you mean by a one-stone handicap? Just no komi?

> In the case of AlphaGo, an extremely optimized strategy seems to have thrown away the 'typical' production of a visible point lead that characterizes human play. Maximizing win-probability in Go, at this level of play against a human 9p, is not strongly correlated with what a human can see as visible extra territory - so that gets thrown out even though it was previously associated with 'trying to win' in human play.

Pros have been aware for centuries that "visible extra territory" is a poor indicator of win probability. They use the term "thickness" (in English) to denote the positive potential resulting from a strong, safe, influential position, independent of current territory. Quite often a pro game will be a battle of territory vs thickness.


I agree with you as this makes perfect sense. I'm new to the game Go but have been riveted by these games and the commentary.

Here's where I think you're right. It seems from the commentary during the matches that the pro player thought the game to be close...on every match.

Then the tides turned...or so we thought...and the game always swayed in AlphaGo's favor.

I like how they were saying that the computer changed up the game style as well.

All of this is very interesting, and I see that they have games 4 and 5 listed however I wonder if they are going to go thru with them and play them as scheduled.

Maybe to see if AlphaGo can sweep the series? Or show it wasn't a fluke of any kind?

I hope they play the last two...just to see how AlphaGo plays them. :D


They will play all 5 games in the match, so the result can end up 3-2, 4-1, or 5-0. Based on the past three games, which ended as sound defeats for Lee with different play styles and approaches (aggressive, calm, and a mix typical of his usual game), I expect the final result will be 5-0 for AlphaGo.


> the maximizing solution will lie at a vertex of the possibility space

Keep in mind that as well as AlphaGo plays, it's still extremely far from playing optimal moves.


>> At this point it seems likely that Sedol is actually far outclassed by a superhuman player.

I still don't agree that this is the case, and I don't care what a thousand Google-hyped press releases say, beating the best human player in anything is not "superhuman" and "superhuman" performance has not been achieved by anything yet. [1]

Why do I think so? Two reasons.

One, because you can be entirely human and still beat all other humans, without fail, for a very long time. Long "winning streaks" in professional sports are very well documented. For instance, Rocky Marciano went entirely undefeated in his whole heavyweight boxing career. In chess, Mikhail Tal went undefeated for 95 games. And so on, so forth.

Of course most humans' winning streaks end eventually. That's because our performance degrades over time. When a computer wins against the best player, it keeps on winning, and in fact it actually gets even better over time.

Still, and this is number two: in most instances where a computer does better than a human, we don't claim superhuman performance. Automatic calculators, going back to mechanical calculators, have been better than humans at arithmetic for a very, very long time. I would wager that nobody discusses pocket calculators as exhibiting "superhuman" performance. You only hear this sort of claim when it comes to Deep Blue, AlphaGo or Watson.

So maybe we need a better definition of what it means to be "superhuman" that covers both pocket calculators and AlphaGo. Without one, I don't accept that the performance of AlphaGo can be said to be superhuman, unless pocket calculators' performance is also celebrated as superhuman.

_______________________

[1] I'm perfectly willing to go even further than that and say that we can't make machines that have superhuman intelligence and that even if we did, we wouldn't be able to recognise them (this last bit is similar to what the GP says).


Your pocket calculator is superhuman at arithmetic. AlphaGo is superhuman at Go. It's not a big deal.


Superhuman has a very specific definition in the field of game playing algorithms. It is the case when the algorithm can always beat all humans. AlphaGo winning 5-0 against the number 5 ranked human (Lee) would give only a small indication that it is superhuman. Regarding streaks, evenly matched humans can go 5-0 an expected (0.5)^5 or 3.125 percent of the time. So not particularly rare. If it loses even one game then it is not yet superhuman.

If top humans get beat 5-0 with significant handicaps then it is likely AlphaGo is superhuman. However, it is expensive to run AlphaGo so it is unlikely that we will know the true strength of AlphaGo for a while (more challenges) or until hardware catches up.

Update: typos and clarifications
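The streak arithmetic above is just repeated fair coin flips, as a quick sketch (the function name is made up):

```python
def streak_probability(n_games, p_win=0.5):
    """Chance that one of two evenly matched players wins every game."""
    return p_win ** n_games

# A 5-0 sweep between evenly matched players: 0.5**5 = 0.03125
p_sweep = streak_probability(5)
```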


The term ("superhuman") is used very differently in other communities though, that don't have such a clear metric of "beating humans" as in traditional game-playing. Which means it's about time we clarify what is meant by it, especially as various parties that have commercial interests start throwing it around carelessly (there was an example on HN a while ago).


The 2x4 in my floor has superhuman strength by that definition. Kind of a pointless term if it's just to mean what humans can't do.


That's literally what it means though.


The way in which it is superhuman matters. The calculator uses simple mechanical algorithms. AlphaGo uses a novel approach to deep learning that can likely be applied to many other systems and problems.


As long as those systems/problems include a grid based problem space where the goal is to successfully place stones restricted by a limited set of rules.

Ok flippancy aside, there are two problems that make techniques like this single-domain: network design and network training.

The design uses multiple networks with different goals: board evaluation (which boards look good) and policy (which moves to focus on). Those two goals, evaluation and policy, are very specific to Go, just like category layers are specific to vision and LSTMs are specific to sequence learning.
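To make the eval/policy split concrete, here is a toy linear sketch; the names, weights, and board encoding are invented for illustration and are nothing like the real deep convolutional networks. The value head maps a position to a single win-probability estimate, while the policy head maps the same position to a distribution over all 361 candidate moves.

```python
import numpy as np

rng = np.random.default_rng(0)
BOARD = 19 * 19  # 361 intersections, encoded as -1/0/+1 per point

# Made-up stand-ins for the two network roles:
W_value = rng.standard_normal(BOARD) * 0.01             # value head weights
W_policy = rng.standard_normal((BOARD, BOARD)) * 0.01   # policy head weights

def value(board_vec):
    """Win-probability estimate for the position: sigmoid of a score."""
    return 1.0 / (1.0 + np.exp(-(board_vec @ W_value)))

def policy(board_vec):
    """Softmax distribution over all 361 candidate moves."""
    logits = board_vec @ W_policy
    e = np.exp(logits - logits.max())
    return e / e.sum()

board = rng.choice([-1.0, 0.0, 1.0], size=BOARD)
move_probs = policy(board)
win_prob = value(board)
```

Even in this toy form, the two heads answer different questions about the same position, which is why they need separate outputs and separate training targets.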

Network training is obviously hugely resource intensive -- and each significantly complex problem would need such intensity.

It is amazing the variety of problems DNNs have been able to do well in. However, the problem of network design and efficient training are significant barriers to generalization.

When network design can be addressed algorithmically I think we may have an AGI. However, that is a significant problem where you automatically add another layer of computational complexity so it is not on the immediate horizon and may be 50+ years down the road.


My pocket calculator is faster than a human and has better memory. I don't know that this means the same thing as "superhuman in arithmetic". I can concede that it means superhuman in speed and memory, but, arithmetic? I don't think so. What it really does is move bits around registers. We are the ones interpreting those as numbers and the results of arithmetic operations.

AlphaGo is rather different in that it actually has a representation of the game of Go and it knows how to play. I don't doubt at all that it's intelligent, in the restricted domain it operates in. But I do doubt that it's possible for an intelligence built by humans to be "superhuman" and I don't see how your one-liner addresses that.


Your calculator does have a representation of arithmetic too. It's those bits it moves around in registers, which are very much isomorphic to the relevant arithmetic.

Why would an intelligence built by humans not be able to be superhuman? The generally accepted definition seems to be "having better than human performance" in which case it seems we've done it many times (like with calculators).


>> The generally accepted definition seems to be "having better than human performance"

I don't think there's a generally accepted definition and I don't agree that performance on its own is a good measure. Humans are certainly not as good at mechanical tasks as machines are -duh. But how can you call "superhuman" something that doesn't even know what it's doing, even as it's doing it faster and more accurately than us?

Take arithmetic again. We know that cats can't do arithmetic, because they don't understand numbers, so it's safe to say humans have super-feline arithmetic ability. But then, how is a pocket calculator superhuman, if it doesn't know what numbers are for, any more than a cat does? There's something missing from the definition and therefore from the measurement of the task.

I don't claim to have this missing something, mind you.

>> Why would an intelligence built by humans not be able to be superhuman?

Ah. Apologies, I got carried away a bit there. I meant to discuss how I doubt we can create superhuman intelligence using machine learning specifically. My thinking goes like this: we train machine learning algorithms using examples; to train an algorithm to exhibit superhuman intelligence we'd need examples of superhuman intelligence; we can't produce such examples because our intelligence is merely human; therefore we can't train a superhuman intelligence.

I also doubt that we can create a superhuman intelligence in any other way, at least intentionally, or that we would be able to recognise one if we created it by chance, but I'm not prepared to argue this. Again, sorry about that.


>> Your calculator does have a representation of arithmetic too. It's those bits is moves around in registers, which are very much isomorphic to the relevant arithmetic.

Hm. Strictly speaking I believe my pocket calculator has an FPGA, a general-purpose architecture that in my calculator happens to be programmed for arithmetic, specifically. So I think it's accurate for me to say that, although the calculator has a program and that program certainly is a representation of arithmetic, I have to provide the interpretation of the program and reify the representation as arithmetic.

In other words, the program is a representation of arithmetic to me, not to the calculator. The calculator might as well be programmed to randomly beep, and it wouldn't have any way to know the difference.

(But that'd be a cruel thing to do to the poor calculator).


There used to be jobs for thousands of people to do what pocket calculators do. Those jobs have been gone for decades. Humans entirely replaced by machines. So, yes, calculators are superhuman.


Oddly enough, these people you mention were referred to as 'computers'. (Because they performed computations.)


I agree somewhat, but then what is your gauge for superhuman if not some type of competition? How do you evaluate it? On another note, Rocky Marciano had the mafia behind him. Harry Haft fought him and was knocked out, and later claimed the mafia told him he had to throw the fight [1]. There's also a good graphic novel about this. Maybe the AI's goons have threatened Sedol or kin ;)

   [1] https://en.wikipedia.org/wiki/Harry_Haft


>> what is your gauge for superhuman if not some type of competition?

To be honest- I don't have one. My intuition is that we can't have a good definition of "superhuman intelligence" because having one would require us to demonstrate superhuman intelligence ourselves. Which is obviously a contradiction.


Intuitively, a calculator is not superhuman. A person who could do mental arithmetic as well as a calculator? That would seem to be superhuman. If you find out that they were using a calculator under the table the whole time, they're not superhuman any more.

So I think the word "superhuman" must imply a fair competition, in the sense that the participants are competing using comparable approaches. For some definition of comparable.


I think this is one of the most fascinating (and disturbing) facts about AI: they seek to solve the problem they are given without being dependent on convention. They are completely alien in this regard, so many of their choices don't make any sense to us. This results in huge possibility but also the huge dangers that people worry about.


Regarding the expectation that it would have gone 5-0 or 4-1 to either one, you'd have to have equally predicted a similarly one-sided result in the Deep Blue v Kasparov match. Rate of improvement might end up being completely domain-dependent, and hard to predict a priori how it will turn out.


Or it's just playing along until the state is complex enough that it outperforms Sedol simply in terms of memory. I'd call that mid-term; that's my offer as a compromise ;)


> AI alignment theory

What is this?


The problem of making sure an AI does what you want it to.

http://lesswrong.com/lw/n4l/safety_engineering_target_select...


Isn't this what the machines learned in the Matrix?


>> AlphaGo's core is built around a similar machine learning technology to Deepmind's Atari-playing system - the single, untweaked program that was able to learn superhuman play on dozens of different Atari games just by looking at the pixels, without specialization for each particular game.

Whoa there. The Deepmind Atari-playing AI was specialised for each particular game, too. It had a reward function that translated the score for it. It couldn't learn the importance of the game score on its own, just from "looking at the pixels", and it couldn't learn the significance of the score display and how it changed as a result of its actions on its own. All this had to be hand-coded. And if there's a hint that it couldn't really generalise to other games, as Deepmind claimed, it's this bit that you report yourself:

>> Deepmind ... did reuse the widely known core insight of Monte Carlo Tree Search.

That is no mere detail. That is the crux of the matter, right there. Deepmind used their architecture to improve MCTS far enough that it could beat Lee Se-Dol. They didn't just add MCTS to their already general-game playing system.

Because the original Atari-playing AI was completely useless for playing Go, a game that has no running score display and looks nothing like an Atari game. So it wasn't very general at all, despite Google's and DeepMind's claims to the contrary.
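The hand-coding described above can be illustrated with a sketch (all names here are hypothetical stand-ins, not DeepMind's code): in the DQN setup, the emulator reports the game score as a separate number alongside the pixels, and the score delta, clipped to -1/0/+1, is handed to the learner as the reward. The agent never has to work out from the screen what the score display means.

```python
class AtariEmulatorStub:
    """Hypothetical stand-in for the emulator interface: alongside the
    raw pixels it reports the game score as a separate number, so the
    agent never has to read the score off the screen."""
    def __init__(self):
        self.score = 0

    def step(self, action):
        frame = [[0] * 84 for _ in range(84)]  # dummy 84x84 frame
        self.score += 100 if action == 1 else 0
        return frame, self.score

def reward_from_score(prev_score, score):
    # DQN-style reward shaping: hand the learner the score delta,
    # clipped to the range [-1, 1] so rewards are comparable across games.
    delta = score - prev_score
    return max(-1, min(1, delta))

env = AtariEmulatorStub()
prev = env.score
frame, score = env.step(1)
print(reward_from_score(prev, score))  # prints 1: clipped reward for a score gain
```

The clipping to [-1, 1] follows the published DQN training setup; everything around it here is an illustrative stub.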


I think you're wrong here. From the Nature paper describing AlphaGo:

"We also tested against the strongest open-source Go program, Pachi, a sophisticated Monte Carlo search program, ranked at 2 amateur dan on KGS, that executes 100,000 simulations per move. Using no search at all, the RL policy network won 85% of games against Pachi."

AlphaGo does use MCTS, but it seems that most of its improvements are actually coming from the deep reinforcement learning approach.


I don't know whether this is true. If AlphaGo could do just as well without search, then why did it use search at all?

But in any case, I'm not necessarily disputing that. In particular, I'm refuting the claim that the AlphaGo architecture is identical to the one that learned to play Atari games and that Deepmind have advertised as a general game-playing agent.

My comment here is specifically in reply to the GP who repeated this claim, but I'll dig up the relevant link if you're interested.


Oh, fair enough. There are certainly differences; it's definitely not the exact same architecture. They are both using Deep Reinforcement Learning, but e.g. AlphaGo benefits from getting an explicit representation of the board state and game rules rather than having to learn them.

Hassabis has said that in the next few months they want to try and get up to current AlphaGo performance without using any MCTS at all.


The idea is that even though there's a policy network able to decide at some point what the best possible move is, tree search is done to refine that choice and to "evaluate" it. This is why a value network is derived from the policy network and used in conjunction with MCTS to make sure that the moves AlphaGo picks are good ones.
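How the policy prior and the value estimate combine can be sketched with the selection rule AlphaGo's search uses (a PUCT-style formula, per the Nature paper). This is a minimal illustration, not AlphaGo's actual implementation; the numbers are made up:

```python
import math

def puct_score(child_value, child_visits, prior, parent_visits, c_puct=1.0):
    """Selection score for one child in the search tree: exploit the
    value estimate, but explore moves the policy network assigns high
    prior probability that haven't been visited much yet."""
    q = child_value / child_visits if child_visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

# The child with the best score is searched next: a high-prior,
# unvisited move can beat an already-explored one, which is how the
# policy network steers the search before many simulations have run.
children = [
    # (total value, visit count, policy prior)
    (4.0, 10, 0.2),   # well-explored, decent value
    (0.0, 0,  0.6),   # unvisited, but the policy network likes it
]
parent_visits = 10
best = max(range(len(children)),
           key=lambda i: puct_score(children[i][0], children[i][1],
                                    children[i][2], parent_visits))
print(best)  # prints 1: the search expands the policy network's favourite
```

As the visit count of a child grows, the exploration term shrinks and the value estimate dominates, so the search converges on moves that both networks agree are good.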


It is necessary to make multiple alternatives in the tree comparable in an easy way (and nothing is more easily comparable than scalars). They could also go about training a network that compares two positions to decide which one is superior, but that would require much more computation. Another alternative would possibly be to learn the value somehow jointly with the action selection, but that would possibly also be harder both to train and to evaluate.


Where did he say "identical"?


No one really said that AlphaGo is identical to the Atari-playing AI from a while ago. What Deepmind did say is that that agent was a general game-playing agent. If it was, then why was it not employed against Lee Sedol? Well, because it wasn't.

