It is interesting to consider that the Mongol conquests in Asia, much like the Spanish conquests in America, were facilitated by a weapon they did not know they had.
It’s also interesting in that it is the opposite of the story “Guns, Germs, and Steel” likes to tell. Apparently China is where the diseases came from, due to being more urbanized early on, with more interactions with domesticated animals. Europe just had enough interaction with Asia to have all those diseases exported to it, and was the first to do real exploration.
If I remember correctly, the thesis of that book talked about the Old World more broadly, which would include Europe, China, and the north coast of Africa as the disease-producing centers.
Cleanly connecting NNs to contracts / rules is indeed a hard problem, which many people would like to see solved. Please see my post about that [1], and the corresponding HN comment thread [2].
That is my opinion as well. And indeed the post talks mainly about dynamic verification and achieving some "good enough" verification quality (as determined by coverage and other metrics).
And it is in that context that "soft" techniques like ML can help a lot, and thus the question of how to connect them to "hard" rules (which are also part of dynamic verification) becomes interesting.
I enjoyed your article very much and it touches on many of my own interests, but if we have reached the point where "it's not neural networks" is a legitimate reason against trying some technique, then something, somewhere has gone really wrong.
1. Most ML techniques are bad at connecting to rules (the techniques that aren't, like random trees and inductive logic programming, are a small subset).
2. Most of the ML techniques that one encounters in practice while verifying intelligent autonomous systems are currently neural-network-based: Sensor fusion in the AV itself, coverage maximization attempts I am currently aware of in the verification environment, and so on.
I suspect that most ML techniques, by their nature, will not play nice with rules by default. But this is just a hunch.
Well, eventually any machine learning system needs to integrate with some other piece of software that is not, itself, a machine learning model.
For instance, in AVs, is it the practice to train ANNs end-to-end, so that they learn to drive a car from scratch, without any interaction with other components at any time? My intuition is that, instead, the ANN is trained to recognise objects in images, and then some hand-crafted logic decides what to do with specific types of objects etc. I think some of the examples in your article say that this is the done thing in some companies.
If this sort of integration is possible, is there any reason why integrating rule-based reasoning with machine-learned models is not?
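The integration pattern described above (an ANN for perception, hand-crafted logic for decisions) can be sketched roughly as follows. The detector here is a stub standing in for a trained network, and the labels, fields, and thresholds are all invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "pedestrian", "car", "traffic_light_red"
    distance_m: float  # estimated distance to the object

def detect_objects(image) -> list[Detection]:
    """Stand-in for an ANN-based object detector (hypothetical)."""
    # A real system would run a trained network on the image; we return
    # fixed detections so the sketch is runnable.
    return [Detection("pedestrian", 12.0), Detection("car", 40.0)]

def decide(detections: list[Detection]) -> str:
    """Hand-crafted decision logic consuming the detector's output."""
    for d in detections:
        if d.label == "pedestrian" and d.distance_m < 20.0:
            return "brake"
    if any(d.label == "traffic_light_red" for d in detections):
        return "stop"
    return "cruise"

action = decide(detect_objects(image=None))
print(action)  # the nearby pedestrian triggers "brake"
```

The point of the sketch is only that the boundary between the learned and the hand-written components is an ordinary function call, which is where a rule-based reasoner could in principle be attached.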
Right - as far as I know most ANNs _are_ embedded in some pipeline which contains also "regular" SW, and thus by definition there _is_ some way to connect them to a rule-based system.
The only issue is that there is no easy, _natural_ way to do it. For instance, consider the various attempts at adding safety rules to an RL ANN (depicted in fig. 2 in the paper). Say that (in the context of an ANN controlling an Autonomous Vehicle) your ANN decided to do something on the freeway, but the safety rules say "no". There is no easy way to gracefully integrate the rule and the ANN: one way is for the rule to disable the ANN's output at that point, take full control, and decide what the AV _should_ do. But this leads to duplication and complexity.
So the four solutions I describe take various approaches to avoiding this problem. They all "work" in a sense, but none does real "integration" of the ANN and the rules (the shield synthesis solution perhaps comes closest). And it looks like you have to invent this kind of solution anew for every new instance of connecting-ANN-to-rules.
And this was just "inserting rules during execution". Then there is the issue of "verifying via rules", and "explaining the rules". It is tough, and I am wondering if there could be some conceptual breakthrough which would make it somewhat easier.
Your article caught my attention because I was thinking about the problem of integrating probabilistic machine learning models with deterministic rule bases (specifically, first-order logic ones). The rules themselves would be learned from data, with ILP. I'm starting a PhD on ILP in October and this is one of the subjects I'm considering (although the choice is not only mine and I'm not sure if there's enough "meat" in that problem for a full PhD).
My intuition is that in the end, the only way to get, like you say, a "natural" integration between rules and a typical black-box, statistical machine learning model is to train the model (ANN, or what have you) to interact directly with a rule base: perhaps to perform rule selection, or even to generate new rules (bloody hard), or modify existing ones (still hard). In other words, the rule base would control the AV, but the ANN would control the rule base.
I think there's gotta be some prior work on this but I haven't even looked yet.
I'm kind of working on it, but not from the point of view of AVs, and I'm using logistic regression rather than ANNs (because it's much simpler to use quickly and it outputs probabilities). And I'm only "kind of" working on it. And I don't think it'll come to anything.
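The "model controls the rule base" idea above, with rule selection done by logistic scoring as the commenter mentions, could look something like this toy sketch. The rules, features, and weights are all invented; in the proposed scheme the weights would be learned from data rather than set by hand:

```python
import math

# Toy rule base: each rule is (name, applicability condition, action).
RULES = [
    ("brake_rule",  lambda f: f["obstacle_dist"] < 15.0, "brake"),
    ("cruise_rule", lambda f: True,                      "cruise"),
]

# Hand-set logistic weights standing in for a trained model.
WEIGHTS = {
    "brake_rule":  {"obstacle_dist": -0.3, "bias": 3.0},
    "cruise_rule": {"obstacle_dist":  0.1, "bias": -1.0},
}

def score(rule_name, features):
    """Logistic score: how strongly the model wants this rule to fire."""
    w = WEIGHTS[rule_name]
    z = w["bias"] + sum(w.get(k, 0.0) * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def select_action(features):
    """The model ranks the applicable rules; the best-scoring one fires."""
    applicable = [(name, act) for name, cond, act in RULES if cond(features)]
    best = max(applicable, key=lambda r: score(r[0], features))
    return best[1]

print(select_action({"obstacle_dist": 5.0}))   # brake
print(select_action({"obstacle_dist": 50.0}))  # cruise
```

The division of labour matches the comment: the rules (deterministic, inspectable) produce the action, while the statistical model only arbitrates between them.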
I indeed meant it in the philosophical sense you describe. But I am very interested in the possible technical solutions. I tried to describe (in the chapter "Connecting ML and rules") the approaches I know of, none of which are very exciting.
I'd love to hear if anybody knows of good approaches.
As part of my graduate work with George Konidaris, we've been exploring the creation of symbols and operators with ML. The goal is symbolic planning for continuous systems; however, I see similarities between our approach and the goals of rule-based systems.
Maybe your test cases are your rules. As long as you're recording things over time, you have data to feed back and learn from. Also, at each level of abstraction you could store less data to potentially learn from: instead of storing every pixel, just store edges and other low-level features from the first layer.
BTW, I think information travels over "normal" networking gear (e.g. fiber optics) at about 70% of the speed of light (which is why high-frequency trading tends to move to over-the-air microwave networking, to shave a few milliseconds - see [1]). But I guess the story still works, at the resolution it is told.
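A quick back-of-the-envelope check of the ~70% figure, assuming a ~1200 km route (the distance is an assumption, chosen as roughly a Chicago-to-NYC fiber path):

```python
C_KM_S = 299_792.458     # speed of light in vacuum, km/s
FIBER_FRACTION = 0.70    # ~70% of c in optical fiber, per the comment
DISTANCE_KM = 1200       # assumed route length

# One-way propagation time in milliseconds for each medium.
t_fiber_ms = DISTANCE_KM / (C_KM_S * FIBER_FRACTION) * 1000
t_air_ms   = DISTANCE_KM / C_KM_S * 1000  # microwave travels near c in air

print(f"fiber: {t_fiber_ms:.2f} ms, air: {t_air_ms:.2f} ms, "
      f"saved: {t_fiber_ms - t_air_ms:.2f} ms")
# fiber: 5.72 ms, air: 4.00 ms, saved: 1.72 ms
```

So at this scale the microwave advantage is on the order of milliseconds, not nanoseconds.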
I think this presentation is mainly about implementation-related coverage, and there are of course many other kinds (not sure if this is what you were asking about). For instance, in HW design there is a bigger emphasis on "functional coverage", i.e. coverage derived from a description of what the Device Under Test should do, what the inputs look like, etc.
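A toy sketch of what functional coverage collection looks like, in the spirit of a SystemVerilog covergroup but written in plain Python; the coverpoints, bin ranges, and transaction fields are all invented for illustration:

```python
from collections import defaultdict

# Coverpoints defined from the spec of what the Device Under Test should
# handle (not from its implementation): each is a list of value bins.
COVERPOINTS = {
    "packet_size": [(0, 64), (65, 512), (513, 1500)],  # bytes
    "priority":    [(0, 0), (1, 3), (4, 7)],
}

hits = defaultdict(set)  # coverpoint name -> set of bin indices observed

def sample(txn):
    """Record which bin each field of a transaction falls into."""
    for point, bins in COVERPOINTS.items():
        value = txn[point]
        for i, (lo, hi) in enumerate(bins):
            if lo <= value <= hi:
                hits[point].add(i)

def coverage_pct():
    """Fraction of all defined bins that have been hit at least once."""
    total = sum(len(bins) for bins in COVERPOINTS.values())
    covered = sum(len(hits[p]) for p in COVERPOINTS)
    return 100.0 * covered / total

for txn in [{"packet_size": 40, "priority": 0},
            {"packet_size": 900, "priority": 5}]:
    sample(txn)

print(f"functional coverage: {coverage_pct():.1f}%")  # 66.7%
```

The key difference from implementation coverage (lines, branches) is visible here: the bins come from the specification of the input space, so 100% is meaningful even if the DUT's code is never inspected.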