> I mean, try to define some BK and generate combinations of it until you solve ...

YeGoblynQueenne · on Nov 20, 2020

Deep neural nets do not learn only from examples! They encode strong inductive biases in their carefully hand-engineered and hand-tuned architectures, hence for example CNNs are used for image recognition and LSTMs for sequence learning etc. Without these biases deep neural nets would not be able to generalise as well as they do (in the sense of local generalisation but not global generalisation as meant by François Chollet [1]).

The biggest advances in deep neural nets have come from the discovery and use of good inductive biases: training with gradient descent, backpropagation, more hidden layers, the "constant error carousel", convolutional layers, ReLU over sigmoid, attention, etc, etc. One could say that deep neural nets are all about good inductive bias.

It's interesting that you bring up the ARC dataset. The paper that introduced it (also from Chollet) [2] makes a strong claim about the necessity of "knowledge priors" for a system to be considered intelligent. These are described at length in section III.1.2 "Core knowledge priors" and are exactly a set of strong inductive biases that the author of the paper considers necessary for a machine learning system to solve the ARC tasks and that consist of such problem-specific biases as object cohesion, object persistence, object influence via contact, etc. It is exactly such "knowledge priors" that are encoded as background knowledge in ILP systems.

Indeed, in the ARC challenge on Kaggle, the best-performing systems (i.e. the ones that solved the most tasks) were crude approximations of the ILP approach: a library of hand-crafted functions and a brute-force search procedure to combine them. I note also that attempts to use deep learning to solve the challenge didn't go anywhere.
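
As a rough illustration of that recipe (a library of hand-written functions plus a brute-force search over their combinations), here is a minimal Python sketch. The three grid primitives, the `run` and `search` helpers and the toy example are all assumptions made up for illustration; none of them is taken from an actual Kaggle entry or from an ILP system.

```python
from itertools import product

# Hypothetical hand-crafted primitives over grids (lists of lists of ints).
def flip_h(grid):
    """Mirror each row left to right."""
    return [row[::-1] for row in grid]

def flip_v(grid):
    """Reverse the order of the rows."""
    return grid[::-1]

def transpose(grid):
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]

# The "background knowledge": a fixed library the search may draw on.
PRIMITIVES = {"flip_h": flip_h, "flip_v": flip_v, "transpose": transpose}

def run(names, grid):
    """Apply the named primitives to the grid, left to right."""
    for name in names:
        grid = PRIMITIVES[name](grid)
    return grid

def search(examples, max_depth=3):
    """Brute-force search: return the first shortest sequence of primitive
    names that maps every input to its output, or None if none is found."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            if all(run(names, x) == y for x, y in examples):
                return names
    return None

# Toy task: one input/output pair showing a clockwise rotation.
examples = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
program = search(examples)
```

The search space grows as the number of primitives raised to the search depth, which is why both the choice of library and the depth limit matter so much in this style of solver.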

Humans also have strong inductive biases that help us solve such problems. But I'm not the best placed to discuss all this - I'm not a cognitive scientist.

In the end, what you are asking for is magick: a learner that learns only from examples, without any preconceived notions about how to learn from those examples, or what to learn from them. There is no such machine learning system.

>> Doesn't matter, since the programmer gives that information. As you admit, Louise cannot check for arbitrary programs.

I don't understand what you mean "check for arbitrary programs". I can give Louise zero BK and metarules and ask it to generate all Prolog programs, say. Prolog is a Turing complete language so that would give me the set of all programs computable by a Universal Turing Machine (it would take a while). But what would that achieve?

At this point I'm not sure I understand what your remaining objections are against the approach I showed you. For the purpose of learning arbitrary programs it works better than anything else. Of course it's not magick. Perhaps you should take my suggestion to think about the problem a bit more carefully, if you're really interested in it. Or are you? I mean, if you consider AI solved, e.g. by GPT-3, then I can see how you wouldn't be interested in thinking any further about the issue.

_________________

[1] https://blog.keras.io/the-limitations-of-deep-learning.html

[2] https://arxiv.org/abs/1911.01547

P.S. To clarify, I'm keeping this discussion up for your sake, albeit eagerly. You have expressed some strongly held, but incorrect opinions that it seems to me you have acquired by consulting inexpert sources, probably because you have a day job that has nothing to do with AI and doesn't leave you enough time to study the matter properly. My day job is to study AI and I feel that such a privilege is only justified if I spend time and effort to help others improve their knowledge on the subject. I'm guessing that on your part, you're more interested in "winning" the conversation, but please try to gain something from our interaction, otherwise all the time we both spent on it would go to waste. When this is over, try to dig out and read some authoritative sources. I would advise you on which ones - but you'd probably resist my recommendation anyway, so you're on your own there.

Veedrac · on Nov 20, 2020

My initial response was a fairly kneejerk reaction to the snark. The following is a rewrite. Please don't; if you really think so little of me, rather don't reply than reply unpleasantly.

> Deep neural nets do not learn only from examples! They encode strong inductive biases in their carefully hand-engineered and hand-tuned architectures

“Solomonoff Induction does not learn only from evidence! It encodes strong inductive biases in its construction and choice of Turing machine...”

but it doesn't matter. Our universe is not a random soup of maximal entropy.

The tasks I am talking about solving are overtly not impossible.

You talk about ML methods as if the success of, say, image recognition comes from image-recognition-specific architectures. You mention ‘hand-engineered’ or ‘hand-tuned’. And yet, to throw your snark back at you, if you were up to date with the literature, you would know this is not true.

Consider ViT as an example. The same Transformer, the same minimal inductive biases, work as well for language modelling as for image segmentation as for proof search—the only difference perhaps that ViT works on patches for efficiency, though the paper shows that probably hurts performance in the limit. All it takes is an appropriate quantity of data to learn the appropriate task-specific adaptations the network needs. Heck, even cross-domain works; it's all one architecture, so it's all one inductive bias.

To my mind, this is what it means to learn from examples. There is no way that an architecture designed for language translation could also encode task-specific priors for these different tasks.

For sure, one might call this ‘strong inductive biases’, in that the program is not random bytes (as a truly bias-free algorithm must be), but please at least admit that this is a completely different conceptual plane to the sort of biases you give Louise. Louise's biases aren't merely task specific, they're problem-specific. It would be one thing if Louise's biases were a handwritten web of a million BK rules: fine, whatever, as long as it solves the task that is obviously possible to solve. But they're not, they're tuned per example.

ML people call that data leakage.

> I don't understand what you mean. Yes, Louise can check for arbitrary programs. I can give it zero BK and metarules and ask it to generate all Prolog programs, say.

Louise can perhaps generate all Prolog programs. Louise cannot search the space of Prolog programs.

YeGoblynQueenne · on Nov 20, 2020

I see I made you feel bad with my advice to read up. I'm sorry, because that was not my intention. However, you really do need to take my advice seriously. You've insisted throughout our conversation that you don't need to read older machine learning or AI papers because they're not relevant anymore. And yet, they are. And you do need to read them because without them you will not be able to understand the recent developments you seem to be interested in.

Take for instance your example of ViT. This is a transformer, so it's clearly not an unbiased generaliser that learns only from examples. You say so yourself: "it's all one inductive bias". Yes, that's how machine learning works and deep neural nets don't do anything different, neither do they learn only from examples, as you seemed to suggest in your previous comment (you replied "That's literally what DL is" to my comment that "you can't learn only from examples").

But I think you misunderstood my comment about how the biggest advances in deep neural nets have come from purpose-built architectures. That is not to say that the same architectures cannot be applied to different domains, but the state-of-the-art systems are always fine-tuned for specific tasks or datasets. This hasn't changed recently and it hasn't changed in the last 30 years.

>> For sure, one might call this ‘strong inductive biases’, in that the program is not random bytes (as a truly bias-free algorithm must be), but please at least admit that this is a completely different conceptual plane to the sort of biases you give Louise. Louise's biases aren't merely task specific, they're problem-specific. It would be one thing if Louise's biases were a handwritten web of a million BK rules: fine, whatever, as long as it solves the task that is obviously possible to solve. But they're not, they're tuned per example.

A truly bias-free algorithm is not "random bytes". It's a learner that memorises its training examples and can only recognise its training examples. Hence why it can't generalise. This is in Mitchell's paper which I suggested you read.
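
To make the "memorises its training examples" description concrete, here is a minimal Python sketch of such a rote learner. The `RoteLearner` class and its method names are hypothetical, chosen only for illustration; the point is that, with no bias to prefer one generalisation over another, the learner has no basis for labelling anything it has not already seen.

```python
class RoteLearner:
    """A learner with no inductive bias: it stores its training examples
    and can only answer for inputs it has seen before."""

    def __init__(self):
        self.memory = {}

    def fit(self, examples):
        """Memorise (input, label) pairs exactly as given."""
        for x, label in examples:
            self.memory[x] = label

    def predict(self, x):
        """Return the stored label, or None when x was never seen:
        nothing justifies preferring one label over another."""
        return self.memory.get(x)


learner = RoteLearner()
learner.fit([((1, 2), True), ((3, 4), False)])

seen = learner.predict((1, 2))      # a training example
unseen = learner.predict((5, 6))    # not a training example
```

Any rule that would let `predict` answer for `(5, 6)`, such as "nearest stored example wins", is itself an inductive bias.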

Louise's biases are not problem-specific in the short example I showed you. I defined BK predicates with wide applicability in programs processing lists and numbers. There is no such limitation, theoretical or practical, in the general sense, either. You can give Louise a million irrelevant BK predicates, if you like, and it will still find the ones it needs to complete the learning task assuming they're in there somewhere. In fact, it will find all of the relevant ones - and return the superset of all programs that solve the task (so you can use it for example to identify interesting relations in your dataset). Like I say in a previous comment, Louise's learning algorithm was originally designed to select relevant BK. Additionally, like I said in an earlier comment, Louise can learn its own bias, both the BK and the metarules, so it is not only not limited to task-specific bias, it is not even limited to user-provided bias. Under some circumstances it can even invent new examples. And then use them to learn a hypothesis that generalises better to unseen examples. *

>> Louise can perhaps generate all Prolog programs. Louise cannot search the space of Prolog programs.

I don't follow. What do you mean?

Veedrac · on Nov 20, 2020

> Louise's biases are not problem-specific in the short example I showed you.

This is clearly untrue.

You were customizing the BK to each specific task. You were also customizing the stepping stones for each specific task.

Justifications can come later. At least admit that you customized the BK for each problem instance and prior to doing so the solver did not solve the problem asked.

Not responding to the rest since you've missed my entire point and I don't feel like rephrasing it.

YeGoblynQueenne · on Nov 21, 2020

I did not "customize the BK to each specific task". You can go back and see what I did. I provided some generic BK predicates that manipulate lists and numbers, I defined some metarules and I gave a few examples of each program's inputs and outputs.

I don't understand your criticism and I don't understand what you want me to "at least admit".

>> This is clearly untrue.

Can you show me which biases in the example I showed are problem-specific?

>> Not responding to the rest since you've missed my entire point and I don't feel like rephrasing it.

I don't think I missed your point. I think you, yourself, are horribly confused about what point you are trying to make. And the reason of course is that you want to be able to express strong opinions about AI and machine learning, but you don't want to have to do the hard work to understand the subject. So you keep saying "five impossible things before breakfast", like asking for a learner that learns only from examples, or saying that's what deep learning is, etc.

I'm sorry but despite what the article above suggests, there isn't an easy way to become an expert, not even in machine learning. If you want to know what you're talking about, then you'll have to do your homework.

Veedrac · on Nov 21, 2020

As I said, rather don't reply than reply unpleasantly. I'm cutting this here.

YeGoblynQueenne · on Nov 21, 2020

I don't know why you need to reply unpleasantly. It should be possible to give and receive criticism, even strong criticism, without having it turn into a flamewar just because we're on the internet.

Indeed, you yourself have criticised my work and my field mercilessly in this thread and I did not once reply with unpleasantness. In fact, what you keep dismissing as irrelevant and basically cheating (Louise) is my PhD research. I would be well within reason to be defensive about it. Instead, I believe I have remained polite and respectful towards you throughout and strove to answer all your questions about it.

Although you did take my criticism as snark, so this is perhaps something that is not entirely objective - you might perceive my criticism as a personal attack, say. Again, this should not be the case. In my field of work, criticism is what makes your work better and without criticism one never improves. So I do mean it when I say that my contribution to this thread was for your sake and to help you improve your knowledge of a subject you seem to be interested in.

In any case, I'm sorry this conversation turned sour. I didn't want to make you upset and I apologise for having done so.

Veedrac · on Nov 21, 2020

> I believe I have remained polite and respectful towards you throughout

I disagree. I do not mind in the slightest being told I am wrong, or having my ideas criticized. But calling me too stupid to understand my own point, or too intellectually lazy to want to understand a subject, or to talk down to me like a child—that is not kosher. This conversation is not worth being attacked, or my day being made unpleasant because you choose not to avoid the impulse to throw insults.

To the other side of things, it might help calm you to know I never much considered what I was saying a criticism of Louise. Louise, from what I can tell, is fine, and an interesting take on the task. What I was objecting to was only the way you used it in the argument. A bike is cheating if you bring it to a 100 metre sprint, but that doesn't mean bikes serve no purpose. E.g. I do not consider SAT solvers particularly relevant to AI progress, but one can hardly deny they are quality tools.

YeGoblynQueenne · on Nov 22, 2020

As far as I can tell, I did not talk down to you as to a child, and I certainly did not call you intellectually lazy or stupid. I criticised the fact that you don't want to put in the hard work to understand the subject you are discussing, which is what you have stated from the start of the conversation, claiming you don't need to read up on the history of AI because it is not relevant (I'm paraphrasing your point but correct me if I misunderstood it).

It seems to me I am right to think that you took my criticism as an insult to your faculties. If I say something wrong, I expect to be corrected and criticised if I insist on it, but I don't take that as an insult.

>> To the other side of things, it might help calm you to know I never much considered what I was saying a criticism of Louise.

And still you persist with the same style of commenting. "Calm" me? And you complain that I talk down to you? You have replied to my original comment with arrogance to tell me that my entire field of study is "not AI" and irrelevant - and then continued to insist you don't need to know anything about the ~70 years of work you dismiss even when it became clear that this only causes you to make elementary errors. You speak of things you know nothing about with great conviction and then you get upset with me for pointing out this can only result in errors and confusion. Given all that, I have shown great patience and courtesy. Others would have just ignored you as ignorant and unwilling to learn. I gave you the benefit of the doubt. Was that a mistake?

Veedrac · on Nov 23, 2020

> "Calm" me?

A poor choice of words, sorry. I meant, I understood you to be saying you found the criticism of Louise unpleasant, and I thought it would lessen that to know that I didn't and don't think Louise was bad.