Is there something more specific about the application of neural nets to generalized problems that makes them unsuitable?
You and your personality are fine, but the jump between the intro and the payoff is jarring.
You need to hold our hand a little more ... jus' sayin'.
It is also a great way to be able to track and organize what is being created rather than having to sort through amateur projects scattered across the web or research publications that often lack accompanying code.
Some key ways they're making it easier for amateurs:
* Starting point for problems to solve
* Way to get noticed (instead of needing a university/company brand)
* Technological infrastructure for building and testing. The diversity of tools they brought together to build this platform is very impressive.
But I feel that contrarians, such as myself, have an ethical commitment to young people to voice our doubts and criticisms, so that they can avoid making a long journey down a career/research path that leads to a dead end. With that in mind, I think this project leads in a very unpromising direction. Here are some reasons:
1. Games aren't a good testbed for studying intelligence. In a game the main challenge is to map an input percept to an output action (am I drifting off the side of the road? Okay swerve right). The real challenge of intelligence is to find hidden abstractions and patterns in large quantities of mostly undifferentiated data (language, vision, and science all share this goal).
2. This platform is not going to help "democratize" AI. To succeed in one of these domains, contestants will need to use VAST amounts of computing power to simulate many games and to train their DL and/or RL algos. DeepMind and others with sufficient CPU/GPU power will almost certainly dominate in all of these settings.
3. Deep Learning, as it is practiced, isn't intellectually deep. With a few exceptions, there is nothing comparable to the great discoveries of physics, not even anything comparable to the big ideas of previous AI work (A*, belief propagation, VC theory, MaxEnt, boosting, etc). Progress in DL mostly comes from architecture hacking: tweak the network setup, run the training algo, and see if we get a better result. The apparent success of DL doesn't depend on any special scientific insight, but on the fact that DL algos can run on the GPU. That, combined with the fact that, except for the GPU, Moore's Law broke down roughly 10 years ago, means that relative to everything else, DL looks amazingly successful - because all other approaches to AI are frozen in time in terms of computing power.
2. Your argument, which boils down to "large organizations can accomplish more than individuals", is in general true. But then that isn't saying anything new. Still, I'd prefer the car company give its blueprints than not. My factory (my CPU) can then at least build the car, albeit at a smaller scale. And soon my factory will be bigger/cheaper. Yeah I know, I'd like to be wealthy like Google too.
3. This is your own value judgement. The insight that we can reduce AI architectures to some addition and multiplication is a big idea - in my opinion. Transistors are getting cheaper and I believe they will continue to do so. DNNs are better suited to take advantage of this new computing power. Turns out, all those great discoveries were just some multiplications and adds the whole time ;)
Multi-linear fitting is an insight? Perhaps its application to certain problems is. That might be what you mean.
Agreed, though, that the focus on games is a long-standing source of problems holding back AI. My colleague has referred to this as "ludic AI" in a recent blog post where he expands on your statement about the real challenge of intelligence. 
 Reactive vs Predictive AI (http://blog.piekniewski.info/2016/11/03/reactive-vs-predicti...)
 Learning Physics (http://blog.piekniewski.info/2016/11/30/learning-physics-is-...)
To be fair isn't this what physicists do all day at CERN too? Smash some particles together, analyse the numbers, try to find patterns, tweak a few things and try again?
Where might we look for deeper principles? One idea is to consider what brains do and how they might be doing it. (I'm not saying we need to go down the rabbit hole of biological detail -- on the contrary I'm suggesting we look at known or even hypothesized principles of brain operation and import them into AI.)
Two ideas we have used in our work: prediction (over time), recurrent feedback (most brain regions have more feedback than feedforward inputs)
Along one axis, you could compare: supervised, semi-supervised, self-supervised and unsupervised learning. Along another axis, consider that there are versions of each method that take into account temporal/dynamic data, versus others that require randomly shuffled static data.
In the current problems of visual perception, I think the field would benefit greatly from a shift in focus toward multiscale interaction/dynamics rather than (static) statistics, as is currently the case (for more on this, see my colleague's blog post below).
 Statistics and dynamics. (http://blog.piekniewski.info/2016/11/01/statistics-and-dynam...)
Your friend's blog has a lot of good insights that I've seen in the theoretical neuroscience and computational cognitive science literature as well. Where do you guys work?
If so, I think a layman's law of ML is already understood.
To draw an analogy, some programmers might produce programs by trial and error: fiddle with a few lines of code, see if it passes more unit tests, repeat.
If you believe that a human brain can be represented by machine code, then given infinite time, that trial-and-error programmer can write down the "source code" of the brain.
Then machine learning is just a "Turing-complete programming language" (i.e., a neural network architecture) with "source code" (in the form of matrix weights), where "passing more unit tests" is done by numerically following a gradient to update the "source code".
Everything else is just finding a better "programming language" that can make this run very fast on our current machines.
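The analogy can be made literal in a few lines: treat the weights as the "source code" and a gradient step as the edit that passes more unit tests. A toy sketch in plain numpy (the model and data are made up; a linear fit stands in for the network):

```python
import numpy as np

# "Source code": the weights of the model (a linear fit stands in for a network).
rng = np.random.default_rng(0)
W = rng.normal(size=3)

# "Unit tests": input/target pairs the program should satisfy.
X = np.eye(3)
y = np.array([2.0, -1.0, 0.5])

# Trial and error, made directional: each edit follows the gradient
# of the test error instead of being a random tweak.
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ W - y) / len(y)  # gradient of mean squared error
    W -= lr * grad

print(np.allclose(X @ W, y, atol=1e-3))  # True: the "program" passes its tests
```

Everything a fancier architecture adds is, in this framing, a richer "language" for the same edit loop.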
This has been my question too...
Based on my understanding, it all really boils down to probability, statistics, and a few important theories like Vapnik-Chervonenkis theory, which provides a mathematical foundation for what "learning" is, whether we can even learn from the given data, and how well we can learn (VC dimension, etc.).
But I would love it if someone could point me to, or explain/derive from first principles, the concept of "learning".
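For concreteness, this is the flavor of guarantee VC theory gives: with probability at least $1-\delta$ over $n$ samples, uniformly over a hypothesis class $H$ of VC dimension $d$ (one standard form; constants vary by textbook),

```latex
R(h) \;\le\; \hat{R}_n(h) \;+\; \sqrt{\frac{8\, d \ln\!\left(\tfrac{2en}{d}\right) + 8 \ln\!\left(\tfrac{4}{\delta}\right)}{n}}
\qquad \text{for all } h \in H,
```

where $R(h)$ is the true risk and $\hat{R}_n(h)$ the empirical risk. "Learning is possible" exactly when $d$ is finite, because the excess-risk term then shrinks to zero as $n$ grows.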
Experimentation does not always follow theory in science. I would argue that many of the great discoveries of physics in the 20th century followed directly from startling results of experiments conducted at the end of the 19th century (the photoelectric effect for example). I agree that there seems to be an awful lot of experimenting going on and that hardware and large datasets have helped tremendously but there are also many researchers poking and prodding at deep learning theory [1, 2].
So from a glass-half-full perspective we have rapid (sometimes iterative) experimentation coupled with yawning gaps in theory to explain surprising results. In other words, the opposite of previous AI booms and a big reason to be optimistic despite all the hype.
2. This platform will provide a test bed for AI algos and help "democratize" AI. One does not need to set up one's own platform. One can compare approaches. One can learn from other implementations running on common ground. Being resource-constrained forces one to be more creative, and this paves the way for more energy-friendly methods. Sure, a high school student will not dominate powerhouses like DeepMind et al. But the high school student can get up and running in a few days or a week.
3. https://arxiv.org/abs/1608.08225 Physics and Deep Learning are well entwined. Deep Learning certainly is a big idea, up there with VC theory and boosting. It has existed for decades now; I agree the more recent incarnation was made possible by more computing power and bigger datasets, and relies less on new tricks. Yet tricks invented in recent years have still contributed majorly to better generalization - Dropout being one. DL, and especially the relevant Deep Reinforcement Learning, is not just GPUs, but a lot of new (and budding) theory. One can run Random Forests (and other approaches) on CUDA too. The Neural Turing Machine heralded a whole new, intellectually deep and stimulating field within Deep Learning, and we haven't seen the best of it yet. There are also fields, like vision, where the other approaches significantly underperform relative to DL. Try to train an SVM or RF on ImageNet. Also, one is not required to use DL for one's agent. Experiment with the classic approaches and see which is better (https://arxiv.org/abs/1603.04119).
I actually shared some of your concern, not for AI research, but for game development. I thought it was very hard to actually get a good job in that field. Then the mobile game market started booming, and indie developers could make a living. AI has the backing of all the major players in industry. Instead of pipe dreams and philosophical meanderings, we now have working models that actually add business value. It is not going anywhere soon.
If all else fails, you remain a good coder or data analyst with a lot of automation skills.
I 100% agree we need to teach them like humans, so they at least can build a model of how humans interact and participate with one another. At the least this will teach them about us, more than it will teach them about anything else. And if we want to participate and collaborate with that future of AI we need to have these models.
More generally, imagine AI that could learn the physics of the world. For example, if the ball is rolling away, the AI should be able to predict that the ball will look smaller on the next frame.
Going further, if the ball is about to roll under a shadow, the AI should predict that the ball will become a darker shade of green.
(After several years working in a robotics research company, these kinds of capabilities are exactly what we determined would be necessary for robot AI.)
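The ball example has a clean geometric core: under a pinhole camera model the on-screen radius scales as 1/depth, which is exactly the kind of next-frame prediction an agent could check itself against. A toy sketch (the numbers are made up):

```python
def apparent_radius(true_radius, depth, focal_length=1.0):
    """Pinhole projection: the on-screen radius shrinks as 1/depth."""
    return focal_length * true_radius / depth

# A ball of radius 0.1 rolling away at 1 unit per frame, starting 5 units out.
sizes = [apparent_radius(0.1, 5.0 + t) for t in range(3)]

# The prediction to verify: each frame the ball should look smaller.
print(all(a > b for a, b in zip(sizes, sizes[1:])))  # True
```

An AI that has "learned physics" would make this kind of prediction implicitly, frame over frame, and use the prediction error as its training signal.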
These are like unit tests for AI (basic shapes and transforms), and I agree physical reckoning is at the top - one of the big capstone tests, and something beautiful to behold in nature (e.g. sports). Maybe a virtual soccer game at the end?
From my lidar experience, I wanted to reach for a model rather than deal with noisy sensor data. I want to generate the output (3D world) with my model, then have the NN learn the inverse (e.g. the scene graph used to generate the scene).
I enjoy thinking about this stuff, though it really makes my head spiral sometimes when I relate it to my own reality. It's easy to feel like you're losing touch.
If requests are being taken, it would be useful to be able to search through the listed environments. And a poker environ for the internet section would be a good balance: fun, widely appreciable, and a straightforward but very non-trivial environment.
Imagine an AI team in League of Legends world championship!
Mostly because, with the latter, improved AI means better competition and a deeper understanding (which boosts sales). With the former, improved AI means improved automation which means imbalanced economies.
I would love to hear more about how they were able to achieve increased performance over other VNC drivers.
We'd started by adapting an existing Python driver in Twisted, implementing additional encodings and offloading to threads for calls into C libraries like zlib. We got this working reasonably on small environments like Atari, but for environments which generated many update rectangles, we started to be bitten by the GIL. I still believe that one could make Python work, but it'd take quite a lot of effort.
libvncserver is a fast C driver, but it's GPL, and doesn't have any particular support for parallelization. We wanted Universe to be usable by everyone, from hobbyists to companies, so GPL was a no-go. (We actually talked to the libvncserver maintainers, who said that they would be interested in dropping the GPL restriction, but there have been far too many contributors over its long history to figure out how to do so.)
Our Go driver, based on https://github.com/mitchellh/go-vnc, has scaled quite well. It takes advantage of Go's lightweight thread model: each connection runs in its own goroutine, which makes it easy to run hundreds of connections in parallel without needing hundreds of threads.
>You can keep your own VNC connection open, and watch the agent play, or even use the keyboard and mouse alongside the agent in a human/agent co-op mode.
The list of third-party gaming partners is extremely impressive, and a Docker config helps resolve the dependency hell that some of the AI packages require.
Is there a way to deal with "sparse" training data (state, action, reward) triples -- sparse in "state"?
Finally, there is a trend of using recurrent neural networks as the top component of the Q-network. Perhaps we will see even more sophisticated RNNs like the DNC and Recurrent Entity Networks applied here.
Also we'll see meta-reinforcement learning applied to a curriculum of environments.
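On the sparse-state question a couple of comments up: one low-tech answer is to keep the Q-table itself sparse, so only visited (state, action) pairs take memory; the standard Q-learning update is unchanged. A minimal tabular sketch (the toy transitions are invented):

```python
from collections import defaultdict

# Sparse Q-table: unvisited (state, action) pairs cost nothing and read as 0.0.
Q = defaultdict(float)
alpha, gamma = 0.5, 0.9

def q_update(state, action, reward, next_state, actions):
    """Standard Q-learning backup on the sparse table."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Toy (state, action, reward) triples: two steps along a chain ending in reward 1.0.
actions = ["left", "right"]
q_update("s1", "right", 0.0, "s2", actions)
q_update("s2", "right", 1.0, "terminal", actions)
q_update("s1", "right", 0.0, "s2", actions)   # value now propagates back to s1

print(Q[("s2", "right")])  # 0.5
print(Q[("s1", "right")])  # 0.225
```

When states almost never repeat (as with raw pixels), this breaks down, which is exactly why the deep variants approximate Q with a network instead of a table.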
> Are you a logistics major? Are you masochistic? Do you think that the calculations required to play a game should take longer than actually moving the units? Then do I have a game for you! Get yourself a copy of The Campaign for North Africa, and say goodbye to the family for a couple of months, if not years.
The Campaign for North Africa is the most detailed game that I have ever played. It isn't necessarily the most complicated, but for sheer size of the detail and planning involved, it is by far the most laborious and detail-oriented game that has ever been produced. As a first example, this is the only game that I know of that differentiates between British and German jerry cans for fuel. More about this later on.
The Campaign for North Africa is Richard Berg and SPI's simulation of the war in North Africa in the Second World War. The seven foot long mapsheet (divided into five sections), two sets of rulebooks, charts and tables galore and, oh yes, thousands of counters complete the game in a nice sturdy box, not the usual SPI flat game holder that falls apart. Most of this is standard SPI fare, with the functional but not pretty counters, standard three column style SPI rulebooks, and a fairly attractive map that does an excellent job of creating an epic sense of scale. True, this is the desert, and most of it is desolate, but the numerous tracks and roads, the coastal plains and mountains, and the railroad (both already built and railroad you can build as the game goes on) all combine to present an appealing picture of the area.
Each turn is one week of time, and each turn is broken down into several stages. There is an initiative determination, a naval convoy stage, a stores expenditure stage, and then three operations stages. The Ops Stages are where most of the activity occurs. There are also stages used in the air game. I did not play the Air Game for the purposes of this review, but did play with the advanced logistics.
The game also includes one of each type of chart, which can be used to make copies. I made my own in Excel. There are charts for Division and Brigade organization, truck convoy sheets, naval convoy sheets, prisoner sheets, broken-down and destroyed vehicle sheets, supply dump sheets, sheets for the air game, and more. I even created a couple of my own for production and independent units. As each Division in the game needs its own Org chart, which fits best on legal-size paper, there are a lot of charts and sheets to keep track of. All of these must be filled out before the game even starts, and just setting up for the beginning of the game requires hours (literally) of paperwork. And for heaven's sake, don't use pen! Much of what you write in the charts at the beginning of the game will be erased by the end of the first turn. After every movement, every combat, even just sitting there and doing nothing, the org charts for every unit in the game will require updating.
Any applications with a keyboard and mouse? Can I use emacs and have it start learning to code?
What if an AI could do anything a human can do with a browser on a phone?
Also love "bring your own Docker container format".
Gran Turismo is advanced, for a videogame at least
Related to the blog post: https://openai.com/blog/universe/
Also, an algorithm that can learn 100s of different games with the same hyperparameters is more highly regarded than one that needs different hyperparameters for each.
Those sorts of goals are better suited to individual hobbyists. OpenAI is a blue sky research project set up by billionaires with the goal of improving all of mankind. I hope I'm not being too uncharitable when I say that your comment reminds me of those who scorned the Apollo program for its ambition.
So these would be human-like bots, rather than bot-like bots, like you normally have in games. The bot would simply learn by doing, until it masters the game, not by getting access to game algorithms.
It did not. It received the state of the board as one array, another board state array for capture/komi (since Go does have global state which is not visible purely from the board representation), and a few additional features to help it out with stuff like ladders. It was architected with convolutional layers, but over the Go grid, not pixels. See the AlphaGo paper pg11 for the exact structure of the input: http://www.postype.com/files/2016/04/08/16/05/03384c91046e8e...
Could it have learned from pixels (augmented by the additional necessary global state)? Sure. But that would've been a waste of computation since the visual layout of a Go board is fixed and static, unlike Atari games.
Source? That literally seems to make zero sense to me. Go can be represented in a super-simple state. Why make it spend millions of cycles learning to categorize pixels into that state you already have?
Instead, researchers have provided the raw feed input data to these agents with the hope that the learned features could be interpreted as game state data by humans.
There isn't an API for me to check that I'm still on the footpath and not the road as I walk down the street.
I can't use an API to tell me water is boiling and that I shouldn't stick my hand in it.
If there is remote communication, can you detail why and where it exists in code?
Other than SWF downloading or specific Internet-enabled environments, running offline should just work.
You should be able to implement this protocol for your environment and run a VNC server for the rest. A new class for the client representing your environment can be based on this:
Then register the class with OpenAI Gym:
After creating the environment using gym.make you need to add information about your remote in the call to configure:
env = gym.make('gtav.SaneDriving-v0')
This is only based on a cursory reading, but it should be possible to use custom environments with OpenAI Universe as it is today.
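Putting those steps together, here is a rough sketch of what wiring up a custom environment might look like, following the Universe README's pattern. The environment id, entry point, and ports are all hypothetical, and I haven't verified the exact registration hooks for wrapped-VNC envs, so treat this as illustrative config glue rather than working code:

```python
import gym
from gym.envs.registration import register
import universe  # noqa: F401  (importing registers Universe's VNC wrappers)

# Hypothetical: register your own environment class with Gym.
register(
    id='custom.MyEnv-v0',
    entry_point='my_package.envs:MyVNCEnv',  # hypothetical module path
)

env = gym.make('custom.MyEnv-v0')
# Point Universe at your own remote: a VNC server on port 5900
# plus a rewarder on 15900, using the documented "vnc://host:port+port" format.
env.configure(remotes='vnc://localhost:5900+15900')
observation_n = env.reset()
```

The remote itself - the VNC server plus the rewarder speaking the protocol mentioned above - would run in your own Docker container.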
I only briefly poked around because it's nearing midnight here - maybe you can pull open the examples included and work out how to rewire them to work on new games, maybe not. Either way, I've got a particular use case I'd like to make a gym for, so I'm interested in finding out.
...That being said...
Instead of presenting the agent with a 2d plane of pixels, they should be presented with a sphere of pixels, with their POV inside.
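One concrete encoding of that "sphere of pixels" idea is an equirectangular image, where each pixel row/column maps to a latitude/longitude on the unit sphere around the agent. A minimal sketch (the function name and conventions are my own):

```python
import math

def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit 3D view direction."""
    lon = (u / width) * 2 * math.pi - math.pi   # -pi..pi around the viewer
    lat = math.pi / 2 - (v / height) * math.pi  # +pi/2 (up) .. -pi/2 (down)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

# The center pixel looks straight "forward" along the +x axis.
print(pixel_to_direction(320, 240, 640, 480))  # (1.0, 0.0, 0.0)
```

A flat 2D frame is then just the special case of sampling a narrow window of this sphere.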
Actually many of us can empathize with the feeling of not getting detailed information—the real goods—when it's what you want and you know you're capable of absorbing it. But this is a bad way to express such a strong feeling on Hacker News. A good way might be to give someone the benefit of the doubt and explain what you'd really like without insulting them.
We detached this subthread from https://news.ycombinator.com/item?id=13104019 and marked it off-topic.
The audio cuts are likely because he had a deadline that didn't allow him to completely re-record the audio. I appreciate he inserted clips that added clarity despite knowing that he'd get negative comments for that effort.
Thanks for creating this video and sharing.
I made the argument that nearly every single one of this guy's videos is not the least bit helpful in actually learning how machine learning works, and gets viewed based on a click-baity title and click-bait thumbnail (attractive girls a lot of the time). Which I will stand by. Every one of them is "how to write an AI that ___" when it's just importing tensorflow, setting 2-3 hyperparameters, then letting it run.