Relational Reasoning with Neural Networks (arxiv.org)
137 points by idibidiartists on June 6, 2017 | 13 comments



Can someone go into more detail about how certain neural networks are inherently able to encode and interpret particular structures? Why do RNs capture relations while recurrent neural networks capture sequential data?


Basically, you can encode your (human) assumptions/intuitions about the data/task into the math. Neural networks are just big math equations with a bunch of tunable parameters, and there is a lot (a LOT) of freedom in what those math equations can be.

So, CNNs assume spatial locality (stuff being near each other) matters, which turns out to be super true for images - an image is not at all the same if you just shuffle all of its pixels around. All it takes to encode that is a simple tweak to the math (use a small set of weights that is the same across a bunch of patches of the input, instead of a separate weight per input).

This RN concept is also a simple math tweak: it's just a function designed to sum a shared function over pairs of 'objects', which they claim is a good way to enable the neural network to learn about relational concepts. RNNs are also a simple math tweak, just make the function recursive (sort of, it's trickier in implementation, but close enough).
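To make the pairwise-sum idea concrete, here's a minimal numpy sketch (toy sizes, made-up weight matrices, not the paper's exact architecture): a small shared function g is applied to every pair of object vectors, the results are summed, and the sum is passed through a second function f.

    # Toy sketch of the RN-style pairwise sum (illustrative only; g/f and all
    # sizes here are made up, not the architecture from the paper).
    import numpy as np

    rng = np.random.default_rng(0)
    n_objects, obj_dim, hidden, out_dim = 6, 8, 16, 4

    objects = rng.normal(size=(n_objects, obj_dim))      # e.g. CNN feature vectors
    W_g = 0.1 * rng.normal(size=(2 * obj_dim, hidden))   # shared weights of g
    W_f = 0.1 * rng.normal(size=(hidden, out_dim))       # weights of f

    def g(o_i, o_j):
        # The same g is applied to every ordered pair of objects.
        return np.maximum(np.concatenate([o_i, o_j]) @ W_g, 0.0)

    pair_sum = sum(g(objects[i], objects[j])
                   for i in range(n_objects) for j in range(n_objects))

    answer = pair_sum @ W_f   # f maps the summed pair features to an output
    print(answer.shape)       # (4,)

The key point is that g is shared across all pairs, the same way CNN filters are shared across all patches.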


I'm not familiar with research on relational reasoning, and the subset they're working with seems pretty well defined, but even so this seems like a pretty meaningful step forward. Can anyone involved in the research comment further on how confident they are that this approach to relational reasoning generalizes to domains dissimilar to the ones tested?


Very cool! Not quite the same, but brings to mind SHRDLU, which was recently discussed here.

The first few thoughts I had on seeing this: 1) Of course this is by DeepMind! Why would I think anything different? (I love the "basic" research they are doing on NNs & Deep Learning, and am always excited to see a new paper from them.)

2) I would love to see more investment into this kind of basic ML research. (By that I don't mean "easy", but addressing the fundamentals of how to approach different types / classes of problems). A lot of where the DeepMind guys seem to be finding these big wins is in combining "classic" AI / CS techniques with Deep Learning / Optimization.

Examples (and I'm a novice at deep learning, so someone PLEASE PLEASE correct me if I'm wrong):

AlphaGo - Take a technique like tree search for playing a game, and combine it with deep networks for the tricky bit of evaluating board positions.

Deep Reinforcement Learning - Q-learning and other reinforcement techniques have been around for a while, but they adapted them to a deep neural net architecture.

Neural Turing Machines - Took a classical model of computation and made it differentiable, allowing a neural net to "learn" algorithms like sorting.

Differentiable Neural Computer - Figured out how to add and address external memory in a differentiable way, allowing a neural net to solve problems like path finding on a graph.

Where I think a lot of cool stuff is going to continue to come from is by revisiting classic techniques and figuring out how they can be adapted to a differentiable / optimizable architecture. Or taking a classic problem and finding an efficient way to evaluate the "goodness" of an answer that lends itself to being used in an optimization problem. Again, not saying it is easy, but I wonder how much "low hanging fruit" there is in revisiting classic algorithms and GOFAI techniques, and asking "can I use this in a neural net, or adapt this to be differentiable so that I can learn or optimize the tricky bits?"

I'm sure I'm glossing over a lot / missing the point of a lot of it - like I said, just a noob who's super excited about this stuff :-)


I recently went to a talk at the London Machine Learning meetup (https://www.meetup.com/London-Machine-Learning-Meetup/) on "End-to-end Differentiable Proving" (https://arxiv.org/abs/1705.11040).

Basically, this was about building neural networks based on Prolog-style logic rules, which is how some traditional expert systems were built.

Unfortunately, there wasn't a video of the presentation, and I can't find the slides anywhere.

If you're based around London, the London Machine Learning meetups are always worth attending!


Oh that is awesome, and I'll have to check the paper out! That's actually something I've been wondering -- about combining the power of deep learning with the domain expertise of old style expert systems, whether for bootstrapping the neural net, or as a rule base it can consult.

It seems like it would be very powerful to harness neural nets for dealing with fuzzy inputs (images, natural language, spoken words), while adding a rule base they can consult for whatever the actual task is once that input has been dealt with.

On that note, if it were learning a rule base, it might also really help with getting insight into what your model is doing: you could introspect on / view the rules it learned at a higher level than "when these neurons fire, we do this to the output".

(I wouldn't be surprised, of course, to find one day that the DeepMind guys are already on top of it and come up with a "Neural Warren Abstract Machine" at some point.)


There goes my PhD thesis :( They are too fast for a single mind to beat.


Do you know which academic groups are doing this kind of fundamental research in neural networks?


Same here, I am working on a very similar approach.


This looks interesting. I'd love to train a GAN with relational constraints.


Could this be useful for multivariate statistical analysis, such as "Basketball Player A has more steals than Player B, but only when Player C is on the court"? If so, it might get pretty busy on FanDuel.


Standard MLPs can already capture this kind of non-linear relation. The big contribution of RNs, it seems, is in how they organize the computation of these relations in a more elegant/efficient way (object pairwise, rather than all at once).
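For reference, the composite function from the paper is roughly

    RN(O) = f_\phi\left( \sum_{i,j} g_\theta(o_i, o_j) \right)

where the same small network g_\theta is applied to each pair of objects and f_\phi maps the summed pair representations to the answer, rather than one big MLP seeing all objects at once.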


Well, it's a much more interesting idea than the usual hype article with yet another application of statistical inference.



