Hacker News
The future of deep learning (keras.io)
242 points by nicolrx on July 18, 2017 | 69 comments

> In DeepMind's AlphaGo, for example, most of the "intelligence" on display is designed and hard-coded by expert programmers (e.g. Monte-Carlo tree search);

Not true. This paraphrases the original paper: https://www.tastehit.com/blog/google-deepmind-alphago-how-it...

> They tested their best-performing policy network against Pachi, the strongest open-source Go program, and which relies on 100,000 simulations of MCTS at each turn. AlphaGo's policy network won 85% of the games against Pachi! I find this result truly remarkable. A fast feed-forward architecture (a convolutional network) was able to outperform a system that relies extensively on search.

Also, this article reeked of AGI ideas. Deep learning isn't trying to solve AGI. Reasoning and abstraction and high level AGI concepts that I don't think apply to deep learning. I don't know the path to AGI but I don't think it'll be deep learning. I think it would have to be fundamentally different.

>Also, this article reeked of AGI ideas. Deep learning isn't trying to solve AGI. Reasoning and abstraction and high level AGI concepts that I don't think apply to deep learning. I don't know the path to AGI but I don't think it'll be deep learning. I think it would have to be fundamentally different.

I think that this is actually what the article is arguing. from the article: >Models closer to general-purpose computer programs, built on top of far richer primitives than our current differentiable layers—this is how we will get to reasoning and abstraction, the fundamental weakness of current models.

This means not using current deep learning ideas, and instead finding ways to integrate other types of programs (conventional algorithms, other types of ML) alongside Deep Networks.

> Also, this article reeked of AGI ideas.

Of course it did. It is about how machine learning can evolve to solve problems that require reasoning and abstractions.

The previous article was about the limits of machine learning and this one is about how to overcome them in the future. The limits were pretty much defined as "reasoning and abstraction", so of course this article is about how to get that working.

> Deep learning isn't trying to solve AGI.

Well, I dunno about "deep learning", but AGI is DeepMind's explicitly stated goal.

And your source for this is? Could not find any such claim on their site.

> We really believe that if you solved intelligence in a very general way, like we're trying to do at DeepMind, then step 2 ['use intelligence to solve everything else'] would naturally follow on.

They go on to talk about general purpose learning machines.

Source: https://youtu.be/ZyUFy29z3Cw?t=4m42s

Dr. Hassabis is awesome, but that video and its language are misleading to a layman. He is distinguishing between expert-driven systems that rely on heuristics/feature engineering and systems that learn from raw input and derive their own optimal set of features (unsupervised learning).

This is a far cry from AGI. I think Dr. Hassabis played with the terminology in the video in a rather tongue-in-cheek manner. Deep learning and all the modern AI stuff you hear about is within the realm of "narrow AI", or more formally, applied AI. In his video, he uses "narrow AI" to mean systems that rely on expert-based heuristics and feature engineering, and "general-purpose AI" to mean what they are currently doing with reinforcement learning.

Whilst it's wonderful that their advancement in reinforcement learning has been applied to various different problems successfully, it shouldn't be confused with AGI.

AGI is on a totally different playing field. I don't think we are substantially closer to AGI than we were 50 years ago, and I would be very interested in anyone arguing the opposite.

I think at this point the only company trying to seriously tackle AGI is: https://numenta.com/

> Reasoning and abstraction and high level AGI concepts that I don't think apply to deep learning

What do you mean? Each layer is selecting features and abstracting previous layers. A cat neuron abstracts all possible pixels that form cats.

He's just saying that we need better ways of composing these `deep` abstractions.

> we need better ways of composing these `deep` abstractions

Tensor2Tensor[0], from the Google Brain team, has some strong recent results in that direction.

Related paper: "One Model To Learn Them All"[1].

[0]: https://github.com/tensorflow/tensor2tensor

[1]: https://arxiv.org/pdf/1706.05137.pdf

What about the future of jobs in the field of deep learning?

EDIT: I'm thinking deep learning will become much like web development is today. Everybody can do it, and only a few experts will work at the technological frontier and develop tools and libraries for everybody else to use.

Therefore, if one invests time in DL, then I suppose it better be a serious effort (at research level), rather than at the level of invoking a library, because soon everybody can do that.

I highly doubt machine learning is going to become a commodity skill any time soon. The optimal model structure is tightly coupled to the specific data and the types of questions you want to answer. Additionally, understanding your model well enough to really trust it requires significant knowledge.

I expect that common deep learning tasks are going to end up being performed by large pre-trained networks behind an API. Image tagging, sentiment analysis, etc. It will be possible to fine-tune these networks for specific tasks by retraining top layers using new data.
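The "retrain the top layers" idea can be sketched without any framework. In this minimal Python sketch, `frozen_features` is a hypothetical stand-in for the frozen lower layers of a pre-trained network, and only a small logistic layer on top is fitted to new data. The feature function, the toy task and all names are assumptions made for illustration, not any real API.

```python
import math


def frozen_features(x):
    """Stand-in for the frozen lower layers of a pre-trained network (hypothetical)."""
    return [x, x * x, 1.0]


def predict(weights, x):
    """Logistic top layer applied to the frozen features."""
    z = sum(w * f for w, f in zip(weights, frozen_features(x)))
    return 1.0 / (1.0 + math.exp(-z))


def retrain_top_layer(data, epochs=2000, learning_rate=0.1):
    """Fit only the final logistic layer; frozen_features is never updated."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, label in data:
            error = predict(weights, x) - label
            for i, feature in enumerate(frozen_features(x)):
                weights[i] -= learning_rate * error * feature
    return weights


# Hypothetical new task: the label is 1 when |x| > 1.
new_data = [(-2.0, 1), (-1.5, 1), (-0.5, 0), (0.0, 0), (0.5, 0), (1.5, 1), (2.0, 1)]
top_weights = retrain_top_layer(new_data)
```

The new task is only learnable here because the frozen features already contain something useful for it (the `x * x` term), which is the bet transfer learning makes about pre-trained representations.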

Everybody can do web development? Then why do so many people complain about how complicated it is?

Because it was somehow designed to be as complicated, difficult, and utterly un-modular as possible. I actually have a more difficult time fully testing a commit's worth of Rails dev than I ever did with a commit's worth of embedded firmware.

Anybody can do it poorly.

I enjoyed part 1 of Chollet's two articles today but am less fond of this one. It suggests that deep learning will expand from its present capabilities of recognizing patterns to one day master logical relations and employ a rich knowledge base of general facts, growing into a situated general problem solver that one day may equal or surpass human cognition. Maybe. But he then proposes that deep nets will rise to these heights of self-organization and purposefulness using one of the weakest and slowest forms of AI, namely evolutionary strategies?

I don't think so.

The many problems bedeviling the expansion of an AI's competence at one specific task into mastery of more general and more complex tasks are legend. Alas neither deep nets nor genetic algorithms have shown any way to address classic AGI roadblocks like: 1) the enormity of the possible solution space when synthesizing candidate solutions, and 2) the enormous number of training examples needed to learn the multitude of common sense facts common to all problem spaces, and 3) how to translate existing specific problem solutions into novel general ones. Wait, wait, there's more...

These roadblocks are common to all forms of AI. The prospect of replacing heuristic strategies with zero knowledge techniques (like GA trial and error) or curated knowledge bases with only example-based learning is unrealistic and infeasible. Likewise, the notion that a sufficient number of deep nets can span all the info and problem spaces that will be needed for AGI is quite implausible. While quite impressive at the lowest levels of AI (pattern matching), deep learning has yet to address intermediate and high level AI implementation challenges like these. Until it does, there's little reason to believe DL will be equally good at implementing executive cognitive functions.

Yes, DeepMind solved Go using AlphaGo's deep nets (and Monte Carlo tree search). But roughly 5 and 20 years before that, IBM Watson solved Jeopardy and IBM Deep Blue solved chess. At the time, everyone was duly impressed. Yet today nobody is suggesting that the AI methods at the heart of those game solutions will one day pave the yellow brick road to AI Oz.

In another 10 years, I predict it's just as likely that AlphaGo's deep nets will be a bust as a boom, at least when it comes to building deep AI like HAL 9000.

TLDR is that models will become more abstract (current pattern recognition will blend with formal reasoning and abstraction), modular (think transfer learning, but taken to its extreme - every trained model's learned representations should be applicable to other tasks), and automated (ML experts will spend less time in the repetitive training/optimization cycle, instead focusing more on how models apply to their specific domain).

I think it's true, but I hope this synergy between logic and pattern recognition actually happens, as I feel like this has been proposed for years but never really come to fruition. However, with recent work on differentiable communicating agents, differentiable memory etc., perhaps it now has a chance to get there.

The author says not everything should be differentiable. Intuitively, I agree, but the question is how to do a sufficiently fast search through a high-dimensional space when you don't have a gradient.

If you don't have a gradient, one tactic is to make the most of the situation. Give your model the Bayesian treatment, and sample from the posterior using MCMC. This is slow, but you end up with posteriors on your parameter values, which is a huge win.
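To make the MCMC suggestion concrete, here is a minimal sketch of a random-walk Metropolis sampler in Python. It only ever evaluates the unnormalized log-posterior and never its gradient. The coin-flip model, the flat prior, the step size and the function names are all assumptions picked for illustration.

```python
import math
import random


def log_posterior(theta, heads, tails):
    """Unnormalized log-posterior of a coin's bias under a flat prior on (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero prior mass outside the unit interval
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(log_p, start, n_samples, step=0.1, seed=0):
    """Random-walk Metropolis: needs only log_p evaluations, no gradient."""
    rng = random.Random(seed)
    current, current_lp = start, log_p(start)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Hypothetical data: 70 heads and 30 tails.
samples = metropolis(lambda t: log_posterior(t, 70, 30), start=0.5, n_samples=20000)
kept = samples[5000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

For this toy model the exact posterior is Beta(71, 31), so `posterior_mean` should land close to 71/102. The price is the one mentioned above: thousands of model evaluations for a single parameter.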

Yeah, I've been a big fan of probabilistic programming for a while. The real problem is that getting Monte Carlo methods to converge and produce a large sample from the posterior takes orders of magnitude more time than running an optimizer to descend a gradient. Hey, you can even make it a probabilistic gradient: variational inference! But then you still have a hard time with discrete, nondifferentiable structure.

Guided by domain-independent lower-bound functions, obviously. There is a lot of research in this area; every year there are papers on heuristic search at AAAI. ML people are simply not reading them or citing them.

Could you point some out to me? I haven't been deliberately avoiding them by any means, but I'd like to see some that allow fast search. And especially, if you've got it, fast stochastic search for Monte Carlo methods.

I bet you already know MCTS. But I see you have a problem with defining "fast". Define "fast". You can usually only say something is faster or slower.

Absolute speed depends on the machine resources and the low-level optimization. If you take MCTS, it is "fast enough", considering it has already beaten Lee Sedol.

There are multiple means to measure the speed. Blind search is "fast" re: node expansion but takes exponential time to find a solution and quite likely exhausts memory. Search with heavy heuristic functions has "slow" expansion but can solve the problem quickly, despite the cost of evaluating them, since it can prune a large part of the state space and expands fewer nodes. Only the latter is practically meaningful.
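The difference between cheap-but-many and costly-but-few node expansions can be shown on a toy problem. The sketch below runs the same best-first search twice on an empty grid, once with a zero heuristic (blind, uniform-cost search) and once with the Manhattan distance (A*), and counts expansions. The grid, the start and goal cells and the function names are assumptions made for this example.

```python
import heapq


def search(size, start, goal, heuristic):
    """Best-first search on an empty size x size grid; returns (cost, expansions)."""
    frontier = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    expansions = 0
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best_cost[node]:
            continue  # stale queue entry
        if node == goal:
            return cost, expansions
        expansions += 1
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None, expansions


start, goal = (10, 10), (18, 10)
blind_cost, blind_expansions = search(21, start, goal, lambda n: 0)
astar_cost, astar_expansions = search(
    21, start, goal, lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
)
```

Both runs find the same 8-step path, but the blind search expands more than a hundred nodes while the heuristic one expands 8. Each heuristic evaluation costs extra work per node, which is the trade-off described above.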

Just read the classical AI textbooks like AIMA first, then [this one](https://www.amazon.co.uk/Heuristic-Search-Applications-Stefa...) if you want to know more. Maybe [this short paper](https://www.cs.cmu.edu/~maxim/files/searchcomesofage_aaai12_...) may give you the gist.

On the topic of "likely exhausts memory":

The best search methods allow reversible state updates. The reversibility makes things super-fast -- you no longer need to copy all data structures on path expansion. Instead, you modify only a single representation, incrementally. And when retracting a search step, you "undo" the same modifications again, arriving at the exact same state as when you started.

This is of course non-trivial -- it is much easier to copy everything, then throw away the entire copy when it's not needed, rather than keeping a single state incrementally consistent. But the effects due to data locality (excellent caching), better memory management (no allocations, fragmentation) and less work (only touch and update parts of the state that matter) can be tremendous.
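A minimal Python sketch of the apply/undo pattern, using n-queens as a stand-in problem (the problem choice and all names are assumptions for illustration). The search keeps one shared state, changes three flags when it places a queen, and resets exactly those flags when it retracts the step, so nothing is ever copied.

```python
def count_queens(n):
    """Count n-queens solutions using one shared state that is updated in place."""
    cols = [False] * n
    diag_down = [False] * (2 * n - 1)  # indexed by row - col + n - 1
    diag_up = [False] * (2 * n - 1)    # indexed by row + col

    def place(row):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            d, u = row - col + n - 1, row + col
            if cols[col] or diag_down[d] or diag_up[u]:
                continue
            cols[col] = diag_down[d] = diag_up[u] = True   # apply the move
            total += place(row + 1)
            cols[col] = diag_down[d] = diag_up[u] = False  # undo it exactly
        return total

    return place(0)
```

For example, `count_queens(8)` returns 92. The copying alternative would allocate fresh `cols` and diagonal arrays for every recursive call.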

I haven't seen any discussion of DeepMind's implementation details for AlphaGo, but since they come from a game development background (David Silver was the CTO and lead dev at Elixir Studios), where each cycle counts, I have no doubt they're well familiar with all these concepts. But then the TPU throws a wrench into it again...

That idea is more than 30 years old; it is the whole point of IDA* (Korf 85) and maybe Frontier Search (Korf 99) too. Their benefits, drawbacks and cures are all well studied. I'm surprised some people who discuss AI do not know these famous algorithms.

MCTS's playouts do not need to backtrack (they are just greedy probes), so it is irrelevant. Do not confuse backtracking with the backpropagation step in MCTS.

I do not see the connection to TPU.

Reversible state updates have nothing to do with IDA*.

The concept is completely orthogonal to choosing which node (next state) to expand next. It's about managing the internal representation of a state.

OK, reversible updates. You need to somehow store the parent state information in order to backtrack. And the natural way to do this is to push the edge/diff info onto a stack. By doing so you only have to maintain one state at a time, keeping the backtrack information compact. But when to backtrack? You surely use iterative deepening; otherwise the search returns a suboptimal path or may never backtrack at all on the state space graph. You also want to prune the states? Use the sum of the depth and the heuristic lower bound. This is IDA*. You can also call a blind IDA* IDDFS, but I generally just call them IDA*. A plain DFS is not viable in complex problems, so let's forget about it.

> choosing which node (next state) to expand next

I know this is "heuristic search", a broader class of algorithms which includes A* and IDA* and many more. But the core selling point of IDA* is not the "heuristic search", but its compact linear-space memory usage (thus fast) compared to A*. So I did not suggest IDA* because of its heuristic search aspect.

For an efficient implementation, reversible updates are just natural and common in IDA*. Below is an excerpt from [1]. I believe this is equivalent to what you call reversible state updates.

    * In-place Modification

    The main source of remaining overhead in the expansion function is
    the copy performed when generating each new successor (line 12 in Figure 1).
    Since IDA* only considers a single node for expansion at a time, and because
    our recursive implementation of depth-first search always backtracks to the
    parent before considering the next successor, it is possible to avoid copying entirely.
    We will use an optimization called in-place modification where the search 
    only uses a single state that is modified to generate each successor and
    reverted upon backtracking.
[1] Burns, E. A., Hatem, M., Leighton, M. J., & Ruml, W. (2012, July). "Implementing fast heuristic search code". In Fifth Annual Symposium on Combinatorial Search.
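As a rough illustration of the in-place modification described in the excerpt, here is a small IDA* sketch in Python for a sliding-tile puzzle. It is not the code from [1]. The puzzle encoding (goal `0, 1, 2, ...` with `0` as the blank), the Manhattan heuristic and all names are assumptions, and the sketch assumes a solvable instance, since it would loop forever on an unsolvable one.

```python
import math


def manhattan(state, width):
    """Admissible lower bound: summed tile distances to the goal 0, 1, 2, ..."""
    total = 0
    for index, tile in enumerate(state):
        if tile != 0:  # the blank does not count
            total += abs(index // width - tile // width)
            total += abs(index % width - tile % width)
    return total


def blank_moves(blank, width, height):
    """Indices the blank can swap with."""
    row, col = divmod(blank, width)
    if row > 0:
        yield blank - width
    if row < height - 1:
        yield blank + width
    if col > 0:
        yield blank - 1
    if col < width - 1:
        yield blank + 1


def ida_star(tiles, width=3):
    """Optimal solution length for a solvable sliding puzzle."""
    state = list(tiles)  # the only state object the search ever touches
    height = len(state) // width
    goal = list(range(len(state)))

    def dfs(blank, came_from, depth, bound):
        f = depth + manhattan(state, width)
        if f > bound:
            return False, f
        if state == goal:
            return True, depth
        next_bound = math.inf
        for target in blank_moves(blank, width, height):
            if target == came_from:
                continue  # do not immediately undo the last move
            state[blank], state[target] = state[target], state[blank]  # apply
            found, value = dfs(target, blank, depth + 1, bound)
            state[blank], state[target] = state[target], state[blank]  # revert
            if found:
                return True, value
            next_bound = min(next_bound, value)
        return False, next_bound

    bound = manhattan(state, width)
    while True:
        found, value = dfs(state.index(0), -1, 0, bound)
        if found:
            return value
        bound = value  # deepen to the smallest f that exceeded the old bound
```

Each successor is generated by one swap and reverted by the same swap on backtracking, so memory use stays linear in the search depth. A tuned implementation would also update the heuristic incrementally instead of recomputing it at every node.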

Yep, that's the one.

By the way, this "local-changes-only" approach crops up in many other places too, both in CS and in nature, because it's just so damn energy-efficient.

For example, check out this (recent, July 2017) paper:

Gomez, Ren, Urtasun & Grosse: "The Reversible Residual Network: Backpropagation Without Storing Activations", https://arxiv.org/abs/1707.04585.

You can't help but think of the ancient programmer's trick for swapping two variables without a temp storage...

  a = a + b;
  b = a - b;
  a = a - b;

This is part 2 from the post yesterday: https://news.ycombinator.com/item?id=14790251

And the author posted a comment on hn:

"fchollet: Hardly a "made-up" conclusion -- just a teaser for the next post, which deals with how we can achieve "extreme generalization" via abstraction and reasoning, and how we can concretely implement those in machine learning models."

I like the ideas presented in the post, but it's not concrete or new at all. Basically he writes "everything will get better".

I do agree with the point that we need to move away from strictly differentiable learning though. All deep learning methods only work on systems that have derivatives, so that we can do backpropagation. I don't think the brain learns with backpropagation at all.

* AutoML: there are dozens of these types of systems already; he even mentions one in the post, called HyperOpt. So we will continue to use these systems and they will get smarter? Many of these systems are basically grid search/brute force. Do you think the brain is doing brute force at all? We have to use these now because there are no universally correct hyperparameters for tuning these models. As long as we build AI models the way we do now, we will have to do this hyperparameter tuning. Yes, these will get better; again, nothing new here.

* He talks about reusable modules. Everyone in the deep learning community has been talking about this a lot; it's called transfer learning, and people are using it now and working on making it better all the time. We currently have "model zoos", which are databases of pretrained models that you can use. If you want to see a great sci-fi short piece on what neural network mini programs could look like, written by the head of computer vision at Tesla, check out this post: http://karpathy.github.io/2015/11/14/ai/

Everyone makes the assumption that computers should get to be as smart as humans, but in some ways it's the other way around. For example, the human brain is not a Turing machine: it doesn't have memory (in the sense that its memory is lossy). You need memory to have a Turing machine, so with paper and pencil a human is a Turing machine, but a very slow one. Compare the time it takes to read and write on paper with the time a computer takes to access RAM.

I think there will be some kind of meta deep learning (still using deep learning, but composed of algebras which are augmented compared to today's standards). We have already started this by using pretrained networks for tasks. There is no reason RNNs won't go this way (I imagine they already are, but this isn't my research area specifically); after all, RNNs are Turing-complete.

Interesting article on a difficult topic: speculating about the future of deep learning. The author deserves recognition for writing about this. In my personal opinion, within the next 10 years there will be systems exhibiting basic general intelligence behavior. I am currently doing early hobbyist research on it, and I see it as feasible. These systems will not be very powerful initially. They will exist and work in simpler simulated environments. Eventually we will be able to make these systems powerful enough to handle the real world, although that will probably not be easy.

I somewhat disagree with the author. I don't think that deep learning systems of the future are going to generate "programs" composed of programming primitives. In my speculative view, the key to general intelligence is not very far from our current knowledge. Deep learning, as we currently have it, is a good enough basic tool. There are no magic improvements to the current deep learning algorithms hidden around the corner. Rather, what I think will enable general intelligence is assembling systems of deep learning networks in the right setup. Some of the structure of these systems will be similar to traditional programs. But the models they generate will not resemble computer programs. They will be more like data graphs.

I expect within 10 years there will be computer agents capable of communicating in simplified, but functional languages. Full human language capability will come after that. And within 20 years I expect artificial general intelligence to exist. At least in a basic form. That is my personal view. I am currently working on this.

A 20-year time frame, that is, around 2040, for artificial general intelligence (AGI) in its basic form seems in line with many experts in this field.

> I expect within 10 years there will be computer agents capable of communicating in simplified, but functional languages. Full human language capability will come after that. And within 20 years I expect artificial general intelligence to exist. At least in a basic form. That is my personal view. I am currently working on this.

> 20-year time frame ... seems in line with many experts in this field

When has "20 years" not been in line with the predictions of experts for the advent of AGI?

And quantum computing, as well as fusion generators... :-)

Yup, https://xkcd.com/678/ and its flavor text

What kind of research have you done on this? Most of what I hear about this is more pessimistic than anything else. I'm curious what you've learned that's made you optimistic.

I am working on creating basic artificial general intelligence. My research is still early, and I don't want to give many details for now. For example, in my opinion it is wrong to feed a giant neural network with tons of text and expect it to understand human language. That will never work. Language understanding first requires a grasp of the world in which the agent exists: a general understanding of things. Then, on top of that, it can build basic communication. And then it can be hoped that this communication can become sophisticated enough to reach human language level.

Another example: feeding a neural network with millions of product reviews and expecting it to be able to understand and write product reviews is laughably hopeless. Not even with petabytes of data.

I started working on this because I think that not enough focus is being put on AGI, or that the research is not creative enough. At least not the research that is being published. I am optimistic about my work, and soon I will reveal more. But even if I don't reach my goals, I think that it is just a matter of persistence. At any moment, someone will solve the problem of AGI.

>I am working on creating basic artificial general intelligence. My research is still early, and I don't want to give many details for now.

Do you have peer-reviewed publications? A github link? References to such? Talking big on the internet is easy.

I agree. I think the next big advancements will come from a different model, not from fine-tuning current ML setups.

I'm more interested in a general AI that can learn any game or environment it encounters to optimize a return. Not quite general AI, but a different path than what is going on right now.

Glad to see Deep Learning "coming down to earth". This is the first high profile post I've seen that spells out exactly how DL models will become reconfigurable, purpose-built tools, and what a workflow might look like. We're still a long way away from treating them like software components.

I mean, these are topics that have been discussed countless times over the years and in some cases decades.

It's all well and good to say we need generalizable machines, and something other than backprop, and something closer to traditional programs, but we all know this. The issue is that no one knows what this would even mean, never mind how one would go about implementing it. In the few cases we do know how, the results are horrible compared to the methods we already use.

We use the methods we do today because they work, not because we think they are the best, or because we don't understand the limitations of our models.

True, there have been discussions, but from what I've seen it's mostly flag planting or vague pop-eng fodder that project directors dish out to tech journalists. Having Keras make a statement on this carries far more weight, because fchollet is not selling a product, or pushing an agenda, or creating a walled garden of some sort.

The only thing that's a bit off about Keras is that it's mostly the effort of one guy. Sure, there are many other contributors, but they don't seem to be acknowledged. I've never seen anyone else speak for the project. I'd really like to see a neutral party emerge for deep learning practice and tooling, before the whole industry gets sucked into a single dominant ecosystem like AWS.

Do you think with Google's adoption of Keras for TensorFlow it'll get more resources dedicated to it?

Here are the results of my research into program synthesis using genetic algorithms.

Using Artificial Intelligence to Write Self-Modifying/Improving Programs


There is also a research paper, if you prefer the sciency format.

BF-Programmer: A Counterintuitive Approach to Autonomously Building Simplistic Programs Using Genetic Algorithms


Seeing how gradient descent is such a pinnacle of deep learning, I can't help wondering: is this how our brain learns? If not, then what prevents us from implementing deep learning the same way?

One of the most consistent theories about how our brain learns is described in HTM (Hierarchical Temporal Memory), a more biologically inspired neural network. See Jeff Hawkins' "On Intelligence". It is based on:

* Input of continuous unlabeled time-based patterns.

* Associative Hebbian Learning (when distinct inputs/patterns come together, they are neuron-wired together). Synapses can be modified via experience. See "Hebbian Theory".

* The brain is a prediction machine: it is always trying to predict the future based on past learned patterns. Learning happens when reality does not match the original prediction and we rewire the world model based on new input. See "Bayesian approaches to brain function".

* Input signals are processed by many layers, each one creating more abstraction from the previous one, from sensory neurons to the highest cortex layers.

* Each region of the hierarchy forms invariant memories (what a typical region of cortex learns is sequences of invariant representations).

* There is lots of feedback (highest level neurons back to the lowest levels). In some structures (e.g. the thalamus, which is a kind of "hub of information") the connections going backward (toward the input) exceed the connections going forward by almost a factor of ten.

* The brain uses Sparse Distributed Memory (SDM). See SDM by Pentti Kanerva (NASA researcher).

* Neuron models have many more variables/parameters (that can be used to transfer or process information) than the usual nodes/links of artificial neural networks. E.g.: long-term potentiation vs long-term depression, neuronal habituation vs sensitization, inhibitory vs excitatory neurons, firing rates, synchronization, neuromodulation, homeostasis and more.

* The backward propagation of errors in artificial neural networks only occurs during the learning phase. But the brain is always learning and updating weights and relationships between patterns, given new inputs.

* During repetitive learning, representations of objects move down the cortical hierarchy (from short-term memory to long-term memory), forming invariant memories.

* The brain needs to replay the memory (memory rehearsal) of a learned stimulus so it can be stored in long-term memory.

* The job of any cortical region is to find out how inputs are related (pattern recognition), to memorize the sequence of correlations between them, and to use this memory to predict how the inputs will behave in the future.

* Predictive coding: the brain is constantly generating and updating hypotheses that predict sensory input at varying levels of abstraction.
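The Hebbian learning item in the list above ("inputs that come together get wired together") can be illustrated with a tiny associative memory. This is a Hopfield-style sketch in Python, not HTM's actual learning rule. The ±1 encoding, the two stored patterns and the function names are assumptions chosen for illustration.

```python
def hebbian_weights(patterns):
    """Hebb's rule: strengthen w[i][j] when units i and j take the same value."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, cue, steps=5):
    """Repeatedly update every unit from the weighted input of the others."""
    state = list(cue)
    for _ in range(steps):
        state = [
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights
        ]
    return state


# Two hypothetical patterns of +1/-1 activity.
first = [1, 1, 1, 1, -1, -1, -1, -1]
second = [1, -1, 1, -1, 1, -1, 1, -1]
weights = hebbian_weights([first, second])

noisy = [-1] + first[1:]  # the first pattern with one unit flipped
restored = recall(weights, noisy)
```

There is no error signal and no gradient anywhere: the weights are set purely from co-activation, yet the corrupted cue is pulled back to the stored pattern.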

Jeff Hawkins is kind of a crank when it comes to neuroscience, and his AI companies have tended not to publish state-of-the-art results on machine learning problems either.

(Replying here for visibility.) In a different comment branch you mentioned counterfactuals. I've watched a video lecture about counterfactuals in graphical models by Pearl, but I'm not exactly seeing the significance as a "missing piece" in AI. Would you mind explaining a bit what exactly you mean?

Do counterfactuals have something to do with learning from negative examples and simulations? For example, if one shoots a ball and misses the goal to the right, one does not 'mindlessly' penalize the circuits that led to the exact motor decisions that were involved, but instead, one simulates alternative actions and uses e.g. (in this case linear) relationships between e.g. the angle of the foot or the wind speed and the shooting direction. The next time, one hence tries to aim slightly to the left.

Or are you referring to a much more fundamental level and my example might rather be a learning strategy that is more likely acquired by trial & error, reinforcement learning, meta learning ("learning how to learn") and/or via the shared concept space of language and culture?

Is it maybe related to e.g. prototype-based associative recall and a counterfactual is basically an alternative way of interpreting the data? "What error signal would I get, if I had interpreted X as Y?"

Or does it come from the Bayesian approach where you marginalize out all hypotheses, including the factual one that corresponds to the state of the world, but also all counterfactual hypotheses. So, including counterfactuals means going beyond the maximum likelihood point estimate e.g. by communicating confidence intervals or even entire distributions from neurons to neurons or neuron populations to other neuron populations?

Counterfactuals in Pearl's sense are what allow particular models to be causal: to represent cause and effect under intervention, as opposed to mere correlation. This is an important part of how to build models that think like people[1].

[1] https://arxiv.org/abs/1604.00289

Is it in particular the dot product (correlation) in MLPs that prevents them from inferring all causal structures in the data? So, instead of template matching of co-occurrences of features in the layer below, we (also) need to learn whether and how one feature causes the other?

Again, it's the lack of counterfactuals: the ability to intervene on a node and cut it off from its parents, then see what happens, and the ability to perform inferences over discrete spaces.
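A small Python sketch of what "intervene on a node and cut it off from its parents" means. The three-variable model (a confounder Z that drives both X and Y, with X having no effect on Y) and all the probabilities are made up for illustration. It covers interventions only, not Pearl's full counterfactual machinery.

```python
from itertools import product

# Hypothetical structural model: Z -> X and Z -> Y, but no arrow from X to Y.
P_Z1 = 0.5                      # P(Z = 1)
P_X1_GIVEN_Z = {0: 0.1, 1: 0.9}  # P(X = 1 | Z = z)
P_Y1_GIVEN_Z = {0: 0.2, 1: 0.8}  # P(Y = 1 | Z = z); Y ignores X


def joint(do_x=None):
    """Yield (z, x, y, probability); do_x cuts X off from its parent Z."""
    for z, x, y in product((0, 1), repeat=3):
        p_z = P_Z1 if z == 1 else 1 - P_Z1
        if do_x is None:
            p_x = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
        else:
            p_x = 1.0 if x == do_x else 0.0  # X is set by fiat, not by Z
        p_y = P_Y1_GIVEN_Z[z] if y == 1 else 1 - P_Y1_GIVEN_Z[z]
        yield z, x, y, p_z * p_x * p_y


def prob_y1_given_x1(do_x=None):
    """P(Y = 1 | X = 1), either observed or under the intervention do(X = 1)."""
    numerator = sum(p for z, x, y, p in joint(do_x) if x == 1 and y == 1)
    denominator = sum(p for z, x, y, p in joint(do_x) if x == 1)
    return numerator / denominator


observed = prob_y1_given_x1()           # conditioning: about 0.74
intervened = prob_y1_given_x1(do_x=1)   # intervening: about 0.5
```

Observing X = 1 raises the probability of Y = 1 because it is evidence about Z. Forcing X = 1 changes nothing about Y, because the arrow from Z into X has been cut. A model that only fits correlations sees the first number and has no way to produce the second.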

Are there any concrete attempts at transferring this concept to MLPs? E.g. by overriding the values of particular nodes/features by feedback connections?

No, because neural nets do not work that way, even when they output actions. Making things More Neural doesn't make them better, and AFAIK, not everything good can be made More Neural.

> Because neural nets do not work that way

Are there works that expose this limitation of MLPs more formally?

>not everything good can be made More Neural.

Neural networks are universal function approximators, so you probably mean not everything good can be made with MLPs trained by gradient descent?

>It's the lack of [...] the ability to perform inferences over discrete spaces.

How would you judge the extent to which AlphaGo has learned to react to single discrete changes in the input? It seems that it learned very well to react very sharply to whether a single stone is placed at a strategically significant position.

The idea the brain is a prediction machine is counter-indicated by 80% of the people I meet.

Our brains have many local loops in their neural networks. These local loops inhibit the neurons that activate them. This plays a major role in the stability of brainwave patterns. The neurons of these local loops learn like any other neuron, so inhibition can increase over time. Thus errors are "forward" propagated in a loopy model.

Our brains do not use gradient descent. Last I read, they use some kind of local optimization. Can someone elaborate on this?

Regarding logic and DL, there is the NeSy workshop in London: http://neural-symbolic.org/

Are there popular modern libraries that do program synthesis? Although I've thought about this and read about the concept on HN, I've not heard it mentioned seriously or frequently or strenuously as a thing to do either just for fun or to get a job doing it. This could be a popular way to solve programming problems without needing programmers. I think this truly would kick off AI as a very personal experience for the masses, because they would use AI basically like they already do now with a search engine. People would use a virtual editor to design their software using freely available off-the-shelf parts. The level of program complexity could really skyrocket, as people would have more control over what programs they run and how, because they could easily design them themselves. Everyone could design their own personal Facebook or Twitter, and probably a whole new series of websites that are too complex, or for other reasons have not been invented yet.

For instance, you want to program the personality of a toy, so you search around using the AI search engine for parts that might work. Or you want a relationship advice coach, so you put it together using personalities you like, taking only the parts you want from each personality. Another example would be making remixes of media you like. Because everything works without programming, anyone can participate.

Check out Genetic Programming: https://en.wikipedia.org/wiki/Genetic_programming

AFAIK GP remains the primary means to automate the synthesis of software. Though it was introduced perhaps 30 years ago, it hasn't been an active area of research for the past 20, AFAIK.
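To give a flavor of the idea, here is a toy sketch in the spirit of genetic programming: expression trees are evolved to fit a target function. It is deliberately simplified and should be read as an assumption-laden illustration, not a reference implementation. In particular it uses mutation and elitist selection only, whereas GP as usually described also uses subtree crossover. All names (`OPS`, `TERMINALS`, `evolve`, and so on) are made up for this example.

```python
import random

# Hypothetical primitive set for the example.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth=3):
    """Build a random expression tree as nested tuples (op, left, right)."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def mutate(tree, rng, depth=3):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth - 1), right)
    return (op, left, mutate(right, rng, depth - 1))

def error(tree, samples):
    """Sum of squared errors of the tree over (x, y) samples."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)

def evolve(target, generations=40, pop_size=60, seed=0):
    """Evolve trees toward `target`; returns the best tree and error history."""
    rng = random.Random(seed)
    samples = [(x / 2.0, target(x / 2.0)) for x in range(-4, 5)]
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: error(t, samples))
        history.append(error(population[0], samples))
        # Elitism: the best quarter survives unchanged, the rest are mutants.
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: error(t, samples))
    return best, history

if __name__ == "__main__":
    best, history = evolve(lambda x: x * x + x)
    print("best tree:", best)
    print("error by generation:", history)
```

Because the best individuals are always carried over, the best error never gets worse from one generation to the next, but nothing guarantees that the exact target expression is found.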

I'm also interested to see how the worlds of program synthesis (specifically type-directed, proof-checking, dependently typed stuff) can combine with deep learning. If recent neural nets get such great results on large amounts of unstructured data, imagine what they can do with a type lattice.

Recent baby steps in gradient checking: https://news.ycombinator.com/item?id=14739491

Great work! Glad someone can finally explain this to the masses in an easy to understand way. Looking forward to the future!

About libraries of models: it would be useful to have open source pre-trained models that can be augmented through GitHub-like pull requests of training data together with label sets.

This would make it possible to maintain versioned, continually improving models that everyone can update with `npm update`, `git pull`, or an equivalent.

Self-driving cars are expected to take over the roads, yet no programmer is able to write code that does this directly, without machine learning. Programmers have, however, built all kinds of software of great value, from operating systems to databases, desktop software and so on. Much of this software is open source, and artificial systems can learn from it. Therefore, it could well be that, in the end, it would be easier to build artificial systems that learn to automatically develop such software than systems that autonomously drive cars, if the right methodologies are used. The author is right to say that neural program synthesis is the next big thing, and this also motivated me to switch my research to this field. If you have a PhD and are interested in working in neural program synthesis, please check out these available positions: http://rist.ro/job-a3

I'm wondering if we will ever figure out how nature performs the equivalent of backpropagation, and if that will change how we work with artificial neural networks.

I'm excited for the easy-to-use tools that have to be coming out relatively soon. There are a lot right now, but the few I've used weren't as intuitive as I feel they could be.


That one word disrupts his whole point of view. This idea that we need orders and orders of magnitude more data seems insane. What we need is to figure out how to be more effective with each layer of data, and be able to have compression between the tensor layers.

The brain does a great job of throwing away information, and yet we can reconstruct pretty detailed memories. Somehow I find it hard to believe that all of that data is orders of magnitude above where we are today. Much more efficient, yes. And that's through compression.

Crap. I just realized why this got voted down - I posted my comment on the wrong article.

I guess that's what I get after walking away for 30 minutes before posting. Doh!
