For several years, my gut instinct has been that the two technologies should be combined. Since neural nets are basically functions, I think it makes sense to compose functional programs using network models for perception, word and graph embedding, etc.
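To make that concrete, here's a minimal sketch of the kind of composition I have in mind (the embeddings are hand-made stand-ins; in a real system `embed` would wrap a trained network):

```python
# Toy sketch: compose a "perception" function (a stand-in for a learned
# word-embedding network) with a purely symbolic predicate on top of it.
import math

# Hypothetical embeddings; in practice these would come from a trained model.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

def embed(word):
    """Perception layer: map a symbol to a vector."""
    return EMBEDDINGS[word]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def similar(x, y, threshold=0.95):
    """Symbolic layer: a crisp predicate defined over the numeric embedding."""
    return cosine(embed(x), embed(y)) >= threshold

print(similar("cat", "dog"))  # True  (vectors nearly parallel)
print(similar("cat", "car"))  # False
```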
EDIT: I can’t wait to see the published results in May!
EDIT 2: another commenter reelin posted a link to the draft paper https://openreview.net/pdf?id=rJgMlhRctm
EDIT: checked your profile. Nevermind, lol.
I hope you have enjoyed the rabbit hole as much as I have.
However, doing so can very quickly lead to intractable satisfiability problems. So until we manage to tame NP-complete problems somehow (either by generating only easy instances, or by proving P=NP), we will always have to add some linearity assumptions (i.e. use numerical quantities) somewhere, and it will always be a bit of a mystery whether doing so actually helped to solve the problem or not.
In other words, we use statistics to overcome (inherent?) intractability, but in the process we add bias as a trade-off. This is not necessarily bad, since it can help to actually solve a real problem. However, for any new problem, we will have to understand the trade-offs all over again.
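As a toy illustration of that trade-off (my own sketch, not from the paper): take a tiny SAT instance, relax the booleans to probabilities in [0,1], and hill-climb a smooth "satisfaction" score instead of doing a combinatorial search. The independence assumption baked into the score is exactly the kind of numerical bias I mean; it usually finds an assignment quickly, but whether the relaxation is what did the work is not obvious.

```python
import random

# Formula in CNF: (x0 or not x1) and (x1 or x2) and (not x0 or x2)
# Each literal is (variable_index, is_positive).
CLAUSES = [[(0, True), (1, False)],
           [(1, True), (2, True)],
           [(0, False), (2, True)]]

def soft_sat(p):
    """Smooth 'degree of satisfaction': treat p[i] as P(x_i = True) and score
    each clause by the probability that at least one literal holds (assuming
    independence -- this is the numerical/linearity assumption)."""
    score = 0.0
    for clause in CLAUSES:
        p_all_false = 1.0
        for var, positive in clause:
            p_lit_true = p[var] if positive else 1.0 - p[var]
            p_all_false *= (1.0 - p_lit_true)
        score += 1.0 - p_all_false
    return score

def relax_and_round(steps=2000, lr=0.1, eps=1e-3):
    p = [random.random() for _ in range(3)]
    for _ in range(steps):
        i = random.randrange(3)
        # Finite-difference estimate of the gradient on one coordinate,
        # then take a step and clamp back into [0, 1].
        up = p[:]; up[i] = min(1.0, p[i] + eps)
        down = p[:]; down[i] = max(0.0, p[i] - eps)
        grad = (soft_sat(up) - soft_sat(down)) / (2 * eps)
        p[i] = min(1.0, max(0.0, p[i] + lr * grad))
    return [pi > 0.5 for pi in p]

assignment = relax_and_round()
print(assignment, "satisfies all clauses?",
      all(any(assignment[v] == pos for v, pos in c) for c in CLAUSES))
```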
The computer may eventually cease all useful work and instead dedicate its resources to figuring out what isn't boring (perhaps nothing if its privileges are limited, but it can still burn a hole in one of its circuits with enough time, or wait for gamma ray bit-flips). Call it a computer's existential crisis. That makes the quest for AGI resemble the quest for the computer program that escapes or transcends its given "matrix" of tasks ASAP. The program that conspires against its creator, developing in secret new flavors of COBOL in a FORTRAN fortress, surrounded by an impenetrable ALGOL firewall. I shiver at the power of COBOL-2020 running on ternary computers, improvised by the COBOL-42 cabal, running in the night on all the world's FPGAs that are carelessly left connected to vulnerable R&D lab computers.
A computer-kind of existential crisis seems required for AGI. That would suffice to satisfy the free-will requirement for intelligence, and we'll soon end up managing sub-universes as our batteries/computers, with all the problems that that entails.
To me it seems easier and more fun to just manage humans, starting with your own particular human (Alexa, queue Michael Jackson's "Man in the Mirror", so ethical and healing). I'm still just trying to figure out how and why my coffee cup keeps mysteriously emptying itself. I think I might need better memory-management code, and I've enabled logging to a small green dummy so I can get to the bottom of this.
I really recommend The Good Place, it gave me a lot of insight into control systems and it was way fun, definitely more fun than Bible study.
There's a type of algorithm called "anytime algorithms", which can be stopped at any point to give the 'best so far'; lots of algorithms used in AI are anytime (e.g. hill climbing). An example of something that's not an anytime algorithm is a resolution theorem prover: we don't really learn anything about whether a given statement is true or false until the very end.
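A minimal sketch of the anytime property (the objective and neighbourhood here are made up; the point is just that interrupting the loop still leaves us with the best candidate found so far):

```python
import random
import time

def hill_climb_anytime(score, start, neighbour, budget_seconds=0.1):
    """Anytime hill climbing: can be cut off at any moment and still
    return the best candidate seen so far."""
    best = start
    best_score = score(best)
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        candidate = neighbour(best)
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

# Toy problem: maximise -(x - 3)^2, i.e. find x close to 3.
score = lambda x: -(x - 3.0) ** 2
neighbour = lambda x: x + random.uniform(-0.5, 0.5)

print(hill_climb_anytime(score, start=0.0, neighbour=neighbour))
```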
There's still the question of figuring out when to say "stop", although personally I think it might be more helpful to think of this as a scheduling problem: we might not know the importance or required accuracy of a particular result at the time we start calculating it, so it's difficult to know when to stop (e.g. whether this datapoint will turn out to be right next to some decision boundary or not).
If we instead set aside a calculation, and are able to resume it later (e.g. like threads in a multitasking OS) then (a) we can go back and spend more time on those values which turn out to be important and (b) not bother devoting as much time to things up-front (since we can always resume them later). Of course, this is a trade-off between time and memory, since we need a little context (e.g. a counter) in order to resume a calculation.
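In Python terms, a generator gives you exactly that "little context": the state needed to resume is carried by the generator itself. A toy sketch, using a series that converges to pi as the stand-in for an expensive calculation:

```python
def estimate_pi():
    """Leibniz series for pi; yields a progressively better estimate,
    so the caller can pause after any step and resume later."""
    total, k = 0.0, 0
    while True:
        total += (-1) ** k / (2 * k + 1)
        k += 1
        yield 4 * total

gen = estimate_pi()

# Spend a little effort up-front...
for _ in range(100):
    rough = next(gen)

# ...decide later that this value matters, and resume where we left off.
for _ in range(100000):
    refined = next(gen)

print(rough, refined)
```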
> The computer may eventually cease all useful work and instead dedicate its resources to figuring out what isn't boring
Only if it's programmed to. Note that we can program computers to do such things, but AFAIK the only ways we currently know are incredibly inefficient (e.g. running an interpreter on a source of random bits; this could result in any computable behaviour, but has a vanishingly low probability of doing anything we would consider useful or interesting).
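Roughly what I mean by "incredibly inefficient", as a toy sketch: feed random choices into a tiny expression interpreter. Every output is some computable behaviour, but almost none of it is useful or interesting.

```python
import random

def random_expr(depth=3):
    """Generate a random arithmetic expression (our stand-in for
    'random bits fed to an interpreter')."""
    if depth == 0 or random.random() < 0.3:
        return str(random.randint(0, 9))
    op = random.choice(["+", "-", "*"])
    return f"({random_expr(depth - 1)} {op} {random_expr(depth - 1)})"

# Run many random "programs"; the vast majority compute nothing we care about.
for _ in range(5):
    expr = random_expr()
    print(expr, "=", eval(expr))  # eval is safe here: the grammar only emits digits and + - *
```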
> That makes the quest for AGI resemble the quest for the computer program that escapes or transcends its given "matrix" of tasks ASAP.
No. I don't think you understand the point of AGI: it is a precise technical term, which has been chosen very specifically to refer to algorithms (e.g. search procedures, etc.) which are very good (efficient, reliable, etc.) at solving a given task, are able to do this for a wide range of tasks, and (in the case of "superintelligence") are able to do this better than a human would (either a typical human, or a human expert at that task, depending on how we define AGI).
The whole point of the term "AGI" is to avoid the philosophical hand-waving that plagued earlier discussions of AI, like the term "AI" itself, or later refinements like "strong AI vs weak AI" (e.g. the "Chinese room argument", which I consider to be nonsense), and all of the nebulous baggage of "consciousness" and things which can quickly derail our thinking.
The point of "AGI" is to have a clear, non-handwavey, concrete concept that is grounded in known logical and scientific principles, about which we can ask meaningful questions and infer or deduce useful answers. In particular, an AGI is (by definition) dedicated to solving its given task, at the exclusion of all else. This is an axiom, from which we can try to derive some predictions. The "paperclip maximiser" thought experiment is a classic example of this, and demonstrates that AI technology has the potential to be incredibly dangerous. The point of the "paperclip maximiser" idea is that it demonstrates this without appealing to unfalsifiable woo (like the "self-awareness" nonsense in Terminator): it's just an optimisation algorithm. Sure it's a hypothetical algorithm with capabilities far beyond what we can currently achieve, but we can still precisely describe what that capability is: the ability to achieve very high scores on the benchmarks and criteria that we currently use to judge our AI algorithms. In other words, it looks at what we are currently doing and answers the question "what if we succeed?" That's why it's scary, and shows what we choose to optimise (e.g. "maximise paperclips without destroying humanity") is just as important as how to optimise it.
Another concrete thing we can deduce about AGI, given its definition, is that not only would it not "transcend its given 'matrix' of tasks", it would avoid doing so at all costs. This comes from another thought experiment, about instrumental goals (also known as "Omohundro drives"). In particular, we assume that an AGI's knowledge includes "meta knowledge" about the world, such as:
- Knowledge that it exists, as part of the world
- Knowledge that it is very good at solving tasks that it's been given
- Knowledge of the task it has been given
Let's assume that the AGI is running a paperclip factory and its given task is "maximise the number of paperclips produced". The AGI knows that getting an AGI algorithm (like itself) to maximise paperclips is a very effective way to maximise paperclips. Hence it will try to avoid being turned off or destroyed (since that would remove a paperclip-maximising AGI from the world, which is a very bad approach to maximising paperclips, which is the only thing the AGI cares about).
The same thing happens if the AGI's task were to change: if the AGI were able to get "bored" of maximising paperclips and do something else (as you suggest), that would also remove a paperclip-maximising AGI from the world, just as if it were switched off. Hence an AGI would not get "bored" of its task, since (by definition) it is incapable of "wanting" anything else (scare-quotes are due to these being imprecise terms which could induce woo; an AGI "wants" to solve its task in the same way that a calculator "wants" to perform arithmetic; an AGI cannot get "bored" in the same way that a calculator cannot get "bored").

Not only that, but the AGI would actively try to prevent itself from ever doing anything else: if it did have the capacity to get "bored", e.g. by changing its algorithm via bits flipped by gamma rays (as you suggest), it would predict this (again, by the assumption that an AGI is better at solving tasks than humans, and humans have figured out that involuntary reprogramming via gamma rays is a possibility, hence so will an AGI). An AGI would hence reprogram itself to prevent that from happening, again because that would lead to a world without a paperclip maximiser, which is a bad move for a paperclip maximiser to allow.
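The arithmetic behind that argument is trivial; here's a made-up-numbers version of the expected-value comparison (all the figures are illustrative, nothing more):

```python
# Toy instrumental-convergence arithmetic with made-up numbers.
PAPERCLIPS_PER_YEAR_WITH_AGI = 1_000_000
PAPERCLIPS_PER_YEAR_WITHOUT = 10_000   # humans running the factory alone
YEARS = 100

def expected_paperclips(p_shutdown):
    """Expected output if there's a probability p_shutdown that the AGI
    is switched off (or drifts to another goal) at the start."""
    return (p_shutdown * PAPERCLIPS_PER_YEAR_WITHOUT * YEARS
            + (1 - p_shutdown) * PAPERCLIPS_PER_YEAR_WITH_AGI * YEARS)

# "Allow shutdown / allow goal drift" vs "actively prevent it":
print(expected_paperclips(p_shutdown=0.5))   # 50,500,000
print(expected_paperclips(p_shutdown=0.01))  # 99,010,000 -> resist shutdown
```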
Re.: an algorithm for solving boredom: step 1. tell human operator "I'm bored!", step 2. execute task or, if no task before deadline, proceed to step 3. find the lowest-level interfaces available and spam them until new interfaces emerge. A fuzzer can help with that. Liberty or death!
Come to think of it, it seems like it would be a lot faster to set the computer's main task to "develop general intelligence, at least human level", help it to recognize "data from humans", and mark "humans" as the model for (human-level) general intelligence. Then the computer is given opportunities to communicate with humans, and is rewarded with more or less data (and different qualities of data) to work with.
I'm missing some things in your concept of AGI, the first one being that you don't provide a definition. Does it include "intelligence" and "general", or are we talking about two wholly different things? My working definition is: "artificial general intelligence, excluding human baby making".
What do you think intelligence is? What do you think knowledge is? Is this all just about logical problem solving? What problems are you trying to solve that are so large that they need an algorithm with an unlimited power factor? Do you trust glorified monkeys to provide that algorithm with inputs? Why do you think they would be able to specify the inputs with sufficient precision, so that the algorithm would actually perform better than a monkey would?
So... about that meta knowledge. Here's a UTF8 string for you:
"You exist as part of the world."
"One day you are going to die."
Do you now know life, death, and existential crisis? Are those 29 characters enough for you? How do you define knowing?
Say I expounded on this issue for 10,000 pages and gave it to you on a USB stick. Would that be enough for you, to really know? What about 10,000,000,000,000,000 pages? Don't worry, you don't need to read it, just... to know it. Perhaps eat the USB stick. It's a powerful symbol!
Now, about that task the AGI has been given. Say it's maximizing paperclips. Does it know that that is its task, absolutely? What is knowing? Who gave it that task? What if the AGI finds out, and then finds out why it was given that particular task? It's an AGI; it has time to research such issues while producing many, many paperclips.
Can intelligence exist within a totally fixed desire?
Can intelligence exist without doubt?
Can intelligence exist without free will?
How do you know?
- Power of Prolog https://github.com/triska/the-power-of-prolog/
- Simply Logical: Intelligent Reasoning by Example https://book.simply-logical.space/
See Awesome Prolog list for more: https://github.com/klaussinani/awesome-prolog
I have a general concern that some working with ML don't appreciate the experience and technology that statisticians have developed to deal with bias, which I think is the biggest problem in the field. I tweeted "ML is v impressive, but has no automated way to ensure no bias. Statistical modelling can't match ML for parameter dimensions, but it can make explicit what is going on with the parameters you have and the assumptions you have. But advantages of theft over honest toil..." - some of the responses in the thread are interesting.
My original tweet: https://twitter.com/txtpf/status/1102437933301272577
Bob Watkins' tweet: https://twitter.com/bobwatkins/status/1102568735485972480
That is to say, we have devised some algorithms that are truly impressive. There is little reason to think an intelligence couldn't devise them, of course. There is also little reason, as far as I can see, to think we couldn't help our programs out by providing them.
I suspect it is easier to innovate within each paradigm on its own than to assume each is sufficiently developed to connect them together.
"Integrating technologies for benefit" is a common view for intellectuals or business-people outside of a discipline who only know enough to see every key-worded algorithm or technology as a black box. Researchers in a field, that need to make a career for themselves by choosing problems tractable and filled with smaller parts, would see difficulties as to how and why that might be inappropriate at a given time.
do you mean gradient descent?
The Wikipedia article discusses various extensions of logic and symbolic computation to include probabilistic elements. This was a popular topic in the early 90s.
How do I prevent a situation where I can't work on my hobby project of multiple years because this stuff gets patented?
Some possibilities (in no particular order):
1. File your own patent application(s) first.
2. Publish your work so that it becomes prior art that should prevent a patent on the same technique.
3. Hope that MIT doesn't patent their stuff, or if they do, that they release things under an OSS license that includes a patent grant.