Everything you've written here is an invalid over-reduction, I presume because you aren't terribly well versed with Prolog. Your simplification is not only outright erroneous in a few places, but excludes essentially every facet of Prolog that makes it a Turing-complete logic language. Presenting Prolog the way you have would be like presenting C as a language where all you can do is perform operations on constants, without even being able to define functions or preprocessor macros. To assert that's what C is would be completely and obviously ludicrous, but few people are familiar enough with Prolog or its underlying formalisms to call you out on this.
Firstly, we must set one thing straight: Prolog definitionally does reasoning. Formal reasoning. This isn't debatable, it's a simple fact. It implements resolution (a computationally friendly inference rule over computationally friendly logical clauses) that is sound and refutation-complete, and made practical through unification. Your example is not even remotely close to how Prolog actually works, and excludes much of the extra-logical aspects that Prolog implements. Stripping it of any of this effectively changes the language beyond recognition.
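To make that concrete, here's a minimal sketch (the predicates are invented purely for illustration):

mortal(X) :- human(X).
human(socrates).

% ?- mortal(socrates).
% Resolution unifies the goal with the clause head mortal(X), binding X = socrates
% and leaving the subgoal human(socrates); that resolves against the matching fact,
% reducing the goal to the empty clause, so the refutation succeeds and Prolog answers true.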
> Plain Prolog's way of solving reasoning problems is effectively:
No. There is no cognate to what you wrote anywhere in how Prolog works. What you have here doesn't even qualify as a forward-chaining system, though that's what it's closest to, given it's somewhat how bottom-up systems work over their ruleset. For it to even approach a weaker forward-chaining system like CLIPS, it would have to be a set of rules whose bodies can perform arbitrary computation and mutate the very rule set being operated on. A simple iteration over a list testing for conditions doesn't even remotely cut it, and it still wouldn't be Prolog even if we switched to a tabled, top-down evaluation.
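To make the tabling remark concrete, here's a hedged sketch (it assumes an implementation with a table/1 directive, such as SWI-Prolog or XSB):

:- table path/2.   % memoise answers so the left-recursive clause terminates
path(X, Y) :- path(X, Z), edge(Z, Y).
path(X, Y) :- edge(X, Y).
edge(a, b).
edge(b, c).

% ?- path(a, Where).
% Where = b ;
% Where = c.          (answer order may vary with tabling)
% Without tabling the same program loops; with it, evaluation behaves more like
% a bottom-up fixpoint while remaining goal-directed.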
> You hard code some options
A Prolog knowledgebase is not hardcoded.
> write a logical condition with placeholders
A Horn clause is not a "logical condition", and those "placeholders" are just normal variables.
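For the record, this is what such a clause looks like, together with its logical reading (example invented):

grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
% Logically: for all X, Y and Z, if parent(X, Y) and parent(Y, Z) then grandparent(X, Z).
% The variables are universally quantified logic variables, not slots to be filled
% from some hard-coded menu of options.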
> and Prolog brute-forces every option in every placeholder.
Absolutely not. It traverses a graph proving things, and when it cannot prove something it backtracks and tries a different route, or otherwise fails. This is of course without getting into impure Prolog, or the extra-logical aspects it implements. It's a fundamentally different foundation of computation which is entirely geared towards formal reasoning.
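A rough sketch of the difference (predicates invented for illustration):

likes(mary, wine).
likes(mary, food).
likes(john, wine).
happy(X) :- likes(X, wine), likes(X, food).

% ?- happy(Who).
% Prolog proves likes(Who, wine) with Who = mary, then proves likes(mary, food)
% and succeeds; had that subgoal failed, it would backtrack to the next likes/2
% clause (Who = john) rather than enumerating every term in existence.
% Who = mary.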
> And that might have been nice compared to Pascal in 1975, it's not so different to modern garbage collected high level scripting languages.
It is extremely different, and the only reason you believe this is because you don't understand Prolog in the slightest, as indicated by the unsoundness of essentially everything you wrote. Prolog is as different from something like JavaScript as a neural network with memory is.
The original suggestion was that LLMs should emit Prolog code to test their ideas. My reply was that there is nothing magic in Prolog which would help them over any other language, but there is something in other languages which would help them over Prolog - namely more training data. My example was to illustrate that, not to say Prolog literally is Python. Of course it's simplified to the point of being inaccurate; it's three lines, how could it not be?
> "A Prolog knowledgebase is not hardcoded."
No, it can be asserted and retracted, or consult a SQL database or something, but it's only going to search the knowledge the LLM told it to. In that sense there is no benefit to an LLM emitting Prolog over Python, since it could emit the facts/rules/test cases/test conditions in any format it likes; it has no particular attraction to concise, clean, clear, expressive output.
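For example (a sketch with made-up predicate names, using standard dynamic/assertz), the "knowledge" is whatever the generator chose to hand over:

:- dynamic told_by_llm/1.

load_llm_facts([]).
load_llm_facts([Fact | Rest]) :-
    assertz(told_by_llm(Fact)),
    load_llm_facts(Rest).

% ?- load_llm_facts([a, b, c]), told_by_llm(X).
% X = a ;
% X = b ;
% X = c.
% It searches exactly what it was given, nothing more.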
> "those "placeholders" are just normal variables"
Yes, just normal variables - and not something magical or special that Prolog has that other languages don't have.
> "Absolutely not. It traverses a graph proving things,"
Yes, though, it traverses the code tree by depth-first walk. If the tree has no infinite left-recursion coded into it, that is a brute-force walk. It proves things by ordinary programmatic tests that exist in other languages - value equality, structure equality, membership, expression evaluation, expression comparison, user code execution - not by intuition, logical leaps, analogy, flashes of insight. That is, it's not particularly more useful than other languages which an LLM could emit.
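Concretely, the kinds of tests it runs are things like (member/2 from library(lists) and SWI-Prolog's succ/2 assumed):

?- foo(1, X) = foo(1, bar).      % unification / structure matching
X = bar.
?- member(b, [a, b, c]).         % membership
true.
?- X is 2 + 3, X > 4.            % expression evaluation and comparison
X = 5.
?- call(succ, 1, Y).             % calling other code
Y = 2.

Nothing in that list is beyond any ordinary language with a small unification helper.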
> "Your example is not even remotely close to how Prolog actually works"
> "There is no cognate to what you wrote anywhere in how Prolog works"
> "It is extremely different"
Well:
parent(timmy, sarah).
person(brian).
person(anna).
person(sarah).
person(john).
?- person(X), writeln(X), parent(timmy, X).
brian
anna
sarah
X = sarah
That's a loop over the people, filling in the variable X. Prolog is not looking at Ancestry.com to find who Timmy's parents are. It's not saying "ooh you have a SQLite database called family_tree I can look at". That it does this on a different computational foundation doesn't seem relevant when that foundation is used to give it the same abilities.
My point is that Prolog is "just" a programming language, and not the magic that a lot of people feel like it is, and therefore is not going to add great new abilities to LLMs that haven't been discovered because of Prolog's obscurity. If adding code to an LLM would help, adding Python to it would help. If that's not true, that would be interesting - someone should make that case with details.
> "and the only reason you believe this is because you don't understand Prolog in the slightest"
This thread would be more interesting to everybody if you and hunterpayne would stop fantasizing about me, and instead explain why Prolog's fundamentally different foundation makes it a particularly good language for LLMs to emit to test their other output - given that they can emit virtually endless quantities of any language, custom writing any amount of task-specific code on the fly.
The discussion has become contentious and that's very unfortunate because there's clearly some confusion about Prolog and that's always a great opportunity to learn.
You say:
>> Yes, though, it traverses the code tree by depth first walk.
Here's what I suggest: try to pin down what, exactly, is the data structure searched by depth-first search during Prolog's execution.
You'll find that this structure is what we call an SLD-tree. That's a tree where the root is a Horn goal that begins the proof (i.e. the thing we want to disprove, since we're doing a proof by refutation); every other node is a new goal derived during the proof; every branch is a Resolution step between one goal and one definite program clause from a Prolog program; and every leaf of a finite branch is either the empty clause, signalling the success of the proof by refutation, or a non-empty goal that cannot be further reduced, which signals the failure of the proof. So that's basically a proof tree and the search is ... a proof.
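Here's a small worked example of that tree (program invented for illustration):

p(X) :- q(X), r(X).
q(a).
q(b).
r(b).

% ?- p(X).
% Root goal:  <- p(X)
%   resolve with the p/1 clause:        <- q(X), r(X)
%     resolve q(X) with q(a), X = a:    <- r(a)    no matching clause: failed leaf, backtrack
%     resolve q(X) with q(b), X = b:    <- r(b)    empty clause: the refutation succeeds
% X = b.
% The depth-first walk over that SLD-tree is the proof.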
So Prolog is not just searching a list to find an element, say. It's searching a proof tree to find a proof. It just so happens that searching a proof tree to find a proof corresponds to the execution of a program. But while you can use a search to carry out a proof, not every search is a proof. You have to get your ducks in a row the right way around; otherwise, yeah, all you have is a search. This is not magick, it's just ... computer science.
It should go without saying that you can do the same thing with Python, or with javascript, or with any other Turing-complete language, but then you'd basically have to re-invent Prolog, and implement it in that other language; an ad-hoc, informally specified, bug-ridden and slow implementation of half of Prolog, most likely.
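For flavour, the pure core you'd be re-inventing is roughly the classic vanilla meta-interpreter (a sketch that only handles the pure subset, ignoring built-ins and error handling):

solve(true).
solve((A, B)) :- solve(A), solve(B).
solve(Goal)   :- clause(Goal, Body), solve(Body).

% ?- solve(p(X)).  behaves like ?- p(X). for pure programs.
% Three clauses in Prolog; rather more than three in Python or javascript, once
% unification, backtracking and the clause store have been spelled out by hand.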
This is all without examining whether you can fix LLMs' lack of reasoning by funneling their output through a Prolog interpreter. I personally don't think that's a great idea. Let's see, what was that soundbite... "intelligence is shifting the test part of generate-test into the generate part" [1]. That's clearly not what pushing LLM output into a Prolog interpreter achieves. Clearly, if good, old-fashioned symbolic AI has to be combined with statistical language modelling, that has to happen much earlier in the statistical language modelling process, not when it's already done and dusted and we have a language model, which is only statistical. Like putting the bubbles in the soda before you serve the drink, not after: the logic has to go into the language modelling before the modelling is done, not after. Otherwise there's no way I can see that the logic can control the modelling. Then all you have is generate-and-test, and it's meh as usual. Although note that much recent work on carrying out mathematical proofs with LLMs does exactly that, e.g. DeepMind's AlphaProof. Generate-and-test works, it's just dumb and inefficient, and you can only really make it work if you have the same resources as DeepMind or equivalent.
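For anyone unfamiliar with the term, generate-and-test at its dumbest looks like this (permutation sort; permutation/2 from library(lists) assumed, other names invented):

naive_sort(List, Sorted) :-
    permutation(List, Sorted),   % generate a candidate
    ascending(Sorted).           % test it

ascending([]).
ascending([_]).
ascending([X, Y | Rest]) :- X =< Y, ascending([Y | Rest]).

% ?- naive_sort([3, 1, 2], S).
% S = [1, 2, 3].
% It works, but the generator does none of the thinking - which is the analogy
% with bolting a tester onto finished LLM output.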
The way to look at this is first to pin down what we mean when we say Human Commonsense Reasoning (https://en.wikipedia.org/wiki/Commonsense_reasoning). Obviously this is quite nebulous and cannot be defined precisely, but OG AI researchers have done a lot to identify and formalize subsets of human reasoning so that it can be automated by languages/machines.
Prolog implements a language that reasons logically only within a formalized subset of the human reasoning mentioned above. Now note that all our scientific advances have come from our ability to formalize, and thus automate, what was previously only heuristics. Thus if I were to move more of our real-world heuristics (which is what a lot of human reasoning consists of) into some formal model, then Prolog (or, say, LLMs) can be made to reason about it better.
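A hedged sketch of what "moving a heuristic into a formal model" can look like, here as a default rule via negation as failure (all names made up):

flies(X) :- bird(X), \+ abnormal(X).
abnormal(X) :- penguin(X).
bird(tweety).
bird(pingu).
penguin(pingu).

% ?- flies(tweety).   true.
% ?- flies(pingu).    false.
% The everyday heuristic "birds fly, unless they're penguins" becomes something
% a machine can apply mechanically.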
Note, however, what the paper beautifully states at the end:
Prolog itself is all form and no content and contains no knowledge. All the tasks, such as choosing a vocabulary of symbols to represent concepts and formulating appropriate sentences to represent knowledge, are left to the users and are obviously domain-dependent. ... For each particular application, it will be necessary to provide some domain-dependent information to guide the program writing. This is true for any formal languages. Knowledge is power. Any formalism provides us with no help in identifying the right concepts and knowledge in the first place.
So real-world knowledge encoded into a formalism can be reasoned about by Prolog. LLMs claim to do the same on unstructured/non-formalized data, which is untenable. A machine cannot do "magic" but can only interpret formalized/structured data according to some rules. Note that the set of rules can be dynamically grown by ML, but ultimately they are just rules which interact with one another in unpredictable ways.

Now you can see where Prolog might be useful with LLMs. You can impose structure on the view of the world seen by the LLM, and also force it to confine itself to the reasoning it can do within this world-view, by asking it to do predominantly Prolog-like reasoning - without turning the LLM into just a Prolog interpreter. We don't know how this interacts with the other heuristic/formal-reasoning parts of LLMs (e.g. reinforcement learning), but it does seem to give more predictable and more correct output. This can then be iterated upon to get a final acceptable result.