
An Explicitly Relational Neural Network Architecture - jonbaer
https://arxiv.org/abs/1905.10307
======
shoo
I'm not familiar with the standards of reproducibility in computer science,
but compared to many other disciplines it should be trivial to release
"research-quality" [+] source code or a VM/container image that allows others
to more directly attempt to reproduce the results described in the paper.

This paper's section on related work calls out a recent paper by Asai

> Asai [1], whose paper was published while the present work was in progress,
> describes an architecture with some similarities to the PrediNet, but also
> some notable differences. For example, Asai’s architecture assumes an input
> representation in symbolic form where the objects have already been
> segmented. By contrast, in the present architecture, the input CNN and the
> PrediNet’s dot-product attention mechanism together learn what constitutes
> an object.
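
For context, the dot-product attention they refer to is the standard
mechanism in which learned query vectors softly select locations in the CNN's
feature map. A rough NumPy sketch of that general mechanism (not the authors'
actual code; all names and shapes are invented for illustration):

    import numpy as np

    def dot_product_attention(queries, keys, values):
        """Each query softly selects a weighted mix of feature-map locations."""
        scores = queries @ keys.T / np.sqrt(keys.shape[-1])   # (n_queries, n_locations)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)        # softmax over locations
        return weights @ values                               # (n_queries, feature_dim)

    # Stand-in for a CNN output: a 5x5 feature map, 16 channels, flattened to 25 locations.
    feature_map = np.random.randn(25, 16)
    object_queries = np.random.randn(4, 16)   # hypothetical learned "object" queries

    objects = dot_product_attention(object_queries, keys=feature_map, values=feature_map)
    print(objects.shape)   # (4, 16): four softly-selected "object" vectors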

re: [1], Asai's paper:
[https://arxiv.org/abs/1902.08093](https://arxiv.org/abs/1902.08093) .

Asai has released code for earlier papers on github: "LatPlan : A domain-
independent, image-based classical planner".
[https://github.com/guicho271828/latplan](https://github.com/guicho271828/latplan)

It reads as if the code for [1] may be open-sourced in the future: "Asai, M.:
2019. Unsupervised Grounding of Plannable First-Order Logic Representation
from Images (code not yet available)".

[+] research-quality, i.e., in whatever random state the prototype code
happens to be in, without any guarantees about it compiling, running, or
behaving correctly, having an automated test suite, or whatever. In all of
these cases, it is _technically_ trivial to make it available to the wider
research community

~~~
alexlikeits1999
This is from DeepMind. They rarely release source code at the time that the
paper is submitted/published. They often release source code later on. While
not ideal, it seems like a billion times better than the situation where they
don't publish at all. I don't think any of their results are fake, so if
nothing else it's a signal that says "this is possible to do in roughly the
way described within".

~~~
winterismute
Did they ever release source code at DeepMind? I mean, not the frameworks or
such but the code used for the experiments...

------
skdotdan
Tangentially, I have a concern with neural-symbolic hybrids; surely someone
can address it. Do you think that symbols produced by a neural network (in a
complex enough task) will ever be comprehensible by a human? My intuition here
is that the symbols will look super complex, even random, and that they will
actually just be a combination that "just works" without us knowing why, just
as with other deep learning models. Instead of arbitrary floats (weights), we
will have arbitrary chains of symbols.

~~~
taliesinb
It depends how those symbols are encoded. Techniques like attention, and
systems like transformers that are built on top of them, often produce
highly interpretable execution traces simply because their patterns of
activity are very revealing of how they are going about solving the task. It's
harder to interrogate their learned weights in any free-standing way, of
course.
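
To make "interpretable execution trace" concrete: for any given input you can
read the attention weights straight off the forward pass and see, per
position, what the model looked at. A toy sketch in plain NumPy, with made-up
names (a real model would compute q and k with learned layers):

    import numpy as np

    def attention_trace(q, k):
        """Softmax attention matrix: row i shows where output position i 'looked'."""
        scores = q @ k.T / np.sqrt(k.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        return w / w.sum(axis=-1, keepdims=True)

    tokens = ["the", "red", "square", "above", "the", "circle"]   # toy input
    q = np.random.randn(len(tokens), 8)   # stand-ins for learned projections
    k = np.random.randn(len(tokens), 8)

    trace = attention_trace(q, k)
    for i, tok in enumerate(tokens):
        j = int(trace[i].argmax())
        print(f"{tok!r} attends most to {tokens[j]!r} (weight {trace[i, j]:.2f})")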

But the neuro-symbolic concept learning paper I mentioned already shows the
potential translucency of these kinds of hybrid systems: the linguistic
interface (it's a VQA task) allows one to simply look up the feature vectors
and programs associated with particular phrases or English nouns. Similarly,
the scene is parsed in an explicitly interpretable way, with bounding boxes
for the various objects on which reasoning will commence. This 'bridge' theme
between natural language and the underlying task space is really powerful, and
it probably makes sense to figure out how to include such bridges in systems
that have nothing to do with natural language.

[https://arxiv.org/abs/1901.11390](https://arxiv.org/abs/1901.11390) contains
another great example of how interpretable such models can be, especially if
they are generative. Take a look at those segmentations!

Lastly, [https://arxiv.org/abs/1604.00289](https://arxiv.org/abs/1604.00289)
lays out this vision in a lot more detail.

~~~
skdotdan
Many thanks for your answer!

------
bflesch
This looks very interesting. From my layman's understanding, they use very
abstract input images (like Tetris objects) to train the network to detect
first-order logic statements. This suggests that a network might one day be
fully restricted to using strict logic internally, thereby becoming more
auditable, efficient and generalizable.

------
YeGoblynQueenne
Well that's very interesting work and it could potentially be _very_ useful to
my own research, which is basically machine learning _from_ symbolic
representations. Unfortunately, from my quick reading of the paper it doesn't
seem like the "relational" representations learned by PrediNet are ever
encoded in an explicitly symbolic structure, or that it is even possible to
disentangle them from the trained model at all.

That would have been really useful, because for instance one could set up a
pipeline with PrediNet on one end and the symbolic machine learning system I
work with, Metagol [1], on the other end. That would be damn close to an ideal
of a deep neural net as a "perceptual" unit and a symbolic machine learning
system as a "reasoning" unit, which has long seemed to me like a winning
combination, even the next big thing in machine learning- _if anyone ever
manages to get it right_. Unfortunately, it doesn't seem PrediNet can be used
in this way. Its representations stay locked up in its model. Too bad. Though
still interesting.
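
Just to illustrate the kind of interface I'd want at the seam (purely
hypothetical; as far as I can tell PrediNet exposes nothing like this): one
would threshold a net's per-relation scores into ground atoms that an ILP
system like Metagol could take as background knowledge or examples. All names
and shapes below are invented:

    import numpy as np

    def to_ground_atoms(scores, relation_names, object_ids, threshold=0.9):
        """Turn an (n_relations, n_objects, n_objects) score tensor into Prolog-style atoms."""
        atoms = []
        for r, rel in enumerate(relation_names):
            for i, a in enumerate(object_ids):
                for j, b in enumerate(object_ids):
                    if i != j and scores[r, i, j] > threshold:
                        atoms.append(f"{rel}({a},{b}).")
        return atoms

    # Fake perceptual output; this symbolic read-out is exactly what seems to be missing.
    scores = np.random.rand(2, 3, 3)
    print(to_ground_atoms(scores, ["left_of", "same_colour"], ["obj1", "obj2", "obj3"]))

The resulting atoms could then be written out as an ordinary Prolog file and
handed to Metagol.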

As a side point, what's up with "propositional relational" representations? I
dare say Murray Shanahan knows a thing or two more than me about logic, but
I'm pretty sure of this much: a representation is either relational, or
it's propositional. A (first-order) relation relates propositions. I don't get
where this terminology comes from and it's a bit confusing.

___________________

[1] [https://github.com/metagol/metagol](https://github.com/metagol/metagol)

------
taliesinb
What excites me about this and similar work (e.g.
[https://arxiv.org/abs/1904.12584](https://arxiv.org/abs/1904.12584)) is that
it augurs well for a new era of AI in which we combine the strength of deep
learning (specifically, the teaching signal provided by automatic
differentiation) and symbolic GOFAI (with its appeal to interpretability and
traditional engineering principles).

Of course we need to keep the gradients, architecture search, hyperparameter
tuning, scalable training to massive datasets, etc., but there is a growing
sense that the programs we write can encode extremely powerful priors about
how the world works, and to _not_ encode those priors leaves our learning
algorithms subject to attacks, bugs, poor sample efficiency, bad
generalization, weak transfer. Not to mention a host of rickety conclusions
that are probably poisoned by hyperparameter hacking.

Conversely, we need to try to avoid the proliferation of black box systems
that require heroic efforts of mathematical analysis to understand and debug.
Take for example the highly sophisticated activation atlas work by Shan Carter
and others, which was needed to reveal that many convnets are vulnerable to an
almost childlike kind of juxtapositional reasoning (snorkeler + fire engine =
scuba diver). Beautiful work, but to me it would be better if that form of
analysis wasn't necessary in the first place, because the nets themselves were
incapable of reasoning about object identity using distant context.

We need systems that are, by design, amenable to rigorous and lucid scientific
analysis, that are debuggable, that admit simple causal models of their
behavior, that are provably safe in various contexts, that can be
straightforwardly interrogated to explain their poor or good performance, that
suggest modification and elaboration and improvement other than adding more
neurons. We need to speed the maturation of modern deep learning out of the
alchemical phase into something more like aeronautical engineering.

The major innovations in recent years have been along these lines, of course.
Attention is a great example, basically supplanting RNNs for a lot of sequence
modelling, and convolutions themselves are probably the ur-example. Graph
convolutions will be the next major tool to be pushed into wider use. To
the interested observer the stream of innovations seems not to end. But the
framing that makes this all very natural is precisely that of this being the
union of computer programming, where coming up with new algorithms for bespoke
tasks is commonplace, with automatic differentiation, which allows those
algorithms to learn.
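
A trivially small illustration of that union: write an ordinary, hand-rolled
function with some free parameters, and let automatic differentiation supply
the teaching signal (sketched here with JAX; the particular function is
arbitrary and just for show):

    import jax
    import jax.numpy as jnp

    def bespoke_program(params, x):
        """Any ordinary differentiable computation you care to write by hand."""
        hidden = jnp.tanh(x * params["scale"] + params["shift"])
        return jnp.sum(hidden ** 2)

    grad_fn = jax.grad(bespoke_program)        # autodiff provides the learning signal

    params = {"scale": 2.0, "shift": -1.0}
    x = jnp.linspace(0.0, 1.0, 5)
    for _ in range(3):                         # a few steps of plain gradient descent
        grads = grad_fn(params, x)
        params = {k: v - 0.1 * grads[k] for k, v in params.items()}
    print(params)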

What remains exciting virgin territory is how best we put these new beasts
into the harness of reliable AI engineering. That is in its infancy, because
how you write and debug a learning program is completely different to the
ordinary sort... there are probably 10x and 100x productivity gains to be
realized there from relatively simple ideas.

~~~
YeGoblynQueenne
>> We need systems that are, by design, amenable to rigorous and lucid
scientific analysis, that are debuggable, that admit simple causal models of
their behavior, that are provably safe in various contexts, that can be
straightforwardly interrogated to explain their poor or good performance, that
suggest modification and elaboration and improvement other than adding more
neurons.

You mean specifically _machine learning_ systems with these properties. Such
machine learning systems do exist and have a great body of research behind
them: I'm talking about Inductive Logic Programming systems.

ILP has been around since the '90s (and even earlier without the name), and
it's only the lack of any background in symbolic logic on the part of the most
recent generation of neural net researchers that stops them from evaluating
such systems, and from mining them for ideas to improve their own.

For a state-of-the-art ILP system, see Metagol (created by my PhD supervisor
and his previous doctoral students):

[https://github.com/metagol/metagol](https://github.com/metagol/metagol)

~~~
taliesinb
Thanks! I see you are doing your PhD in ILP. As someone pondering the topic of
his own future PhD, the obvious question I have is: have ILP models been
enriched with automatic differentiation? How? Did it help?

Either way, can you recommend a good survey article on ILP and the last few
years of progress on that front?

~~~
YeGoblynQueenne
Hi. ILP is Inductive Logic Programming, a form of logic-based, symbolic
machine learning. ILP "models" are first-order logic theories that are not
differentiable.

To put it plainly, most ILP algorithms learn Prolog programs from examples and
background knowledge that are themselves also Prolog programs. Some learn
logic programs in other logic programming languages, like Answer Set
Programming or constraint programming languages.
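
For a feel of the setting (mocked up here in Python rather than Prolog, and
nothing like what Metagol actually does internally): the learner is given
background facts plus positive and negative examples, and searches for a
clause that covers all the positives and none of the negatives. The classic
grandparent example, with one candidate clause checked by hand:

    # Background knowledge: ground facts a Prolog program would supply.
    parent = {("ann", "bob"), ("bob", "carol"), ("carol", "dave")}

    # Positive and negative examples of the target predicate grandparent/2.
    positives = {("ann", "carol"), ("bob", "dave")}
    negatives = {("ann", "bob"), ("dave", "ann")}

    # One candidate hypothesis: grandparent(X,Z) :- parent(X,Y), parent(Y,Z).
    def candidate(x, z):
        return any((x, y) in parent and (y, z) in parent
                   for y in {b for (_, b) in parent})

    assert all(candidate(*e) for e in positives)      # covers every positive example
    assert not any(candidate(*e) for e in negatives)  # entails no negative example
    print("grandparent(X,Z) :- parent(X,Y), parent(Y,Z) fits the examples")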

The Wikipedia page on ILP has a good general introduction:

[https://en.wikipedia.org/wiki/Inductive_logic_programming](https://en.wikipedia.org/wiki/Inductive_logic_programming)

The most recent survey article I know of is the following, from 2012:

(ILP turns 20 - Biography and future challenges)
[https://www.doc.ic.ac.uk/~shm/Papers/ILPturns20.pdf](https://www.doc.ic.ac.uk/~shm/Papers/ILPturns20.pdf)

It's a bit old now and misses a few recent developments, like learning in ASP
and meta-interpretive learning (which is what I work on).

If you're interested specifically in differentiable models, in the last couple
of years there has been a lot of activity, mainly from neural network
researchers, on learning in differentiable logics. For an example, see this
paper by a couple of people at DeepMind:

(Learning explanatory rules from noisy data)
[https://arxiv.org/abs/1711.04574](https://arxiv.org/abs/1711.04574)

Edit: May I ask? Why is automatic differentiation the "obvious" question?

