How can you leave out von Neumann from such a list?
Also, I feel like Schmidhuber tends to be German-speaking-centric.
And overall, acknowledging that this is his signature shtick and it's good to have a voice like this too, he does a lot of anachronistic reinterpretations of early discoveries in light of later results. It streamlines the story as if it was always headed towards the today, culling aspects that didn't pan out or magnifying aspects that were at the time more minor and not considered the central issue or angle.
> "How can you leave out von Neumann from such a list?"
When I read the title:
"1931: Kurt Gödel shows limits of math, logic, computing, AI"
my head automatically completed it with
"John von Neumann approves."
I always found it fascinating and a sign of von Neumann's greatness how quickly he not only understood the consequences of Gödel's discovery but also that he immediately and publicly accepted them.
von Plato, "In Search of the Sources of Incompleteness", Proc. Int. Cong. of Math, Vol. 4 (2018) (https://eta.impa.br/dl/209.pdf), p 4087:
> A shadow is cast on Gödel’s great achievement; there is no way of undoing the fact that Gödel played a well-planned trick to persuade von Neumann not to publish.
I mean, Gödel was Austrian and von Neumann Hungarian who traveled/lived a lot in many German speaking countries, so I don’t see the anglocentrism here.
That is a bit his schtick. He himself feels like he has been the victim of an anglo-centered science history (one might debate about that), so he is retelling how the story could work if you put less focus the anglo-part.
The history is well-known. Gödel started it. The "anglo-part" was mostly Church (decision problem). Turing simplified Church. At the same time, Zuse built the first real computer. I like this key part of the text:
What exactly did Post[POS] and Turing[TUR] do in 1936 that hadn't been done earlier by Gödel[GOD][GOD34] (1931-34) and Church[CHU] (1935)? There is a seemingly minor difference whose significance emerged only later. Many of Gödel's instruction sequences were series of multiplications of number-coded storage contents by integers. Gödel did not care that the computational complexity of such multiplications tends to increase with storage size. Similarly, Church also ignored the spatio-temporal complexity of the basic instructions in his algorithms. Turing and Post, however, adopted a traditional, reductionist, minimalist, binary view of computing—just like Konrad Zuse (1936).[ZU36] Their machine models permitted only very simple elementary instructions with constant complexity, like the early binary machine model of Leibniz (1679).[L79][LA14][HO66] Emil Post They did not exploit this back then—for example, in 1936, Turing used his (quite inefficient) model only to rephrase the results of Gödel and Church on the limits of computability. Later, however, the simplicity of these machines made them a convenient tool for theoretical studies of complexity.
Right, I wonder how much we are, globally speaking, becoming very Anglo-centric in our understanding of history, merely because we use the English language for international communication and therefore it's easier to consume histories and writings produced by Anglos.
Maybe not that much. At my British university (Imperial) the majority of the students at the very least don't look "Anglo" and don't have Anglo names.
When I started my PhD I was greeted with an email with the names of all other PhDs starting at the same year, cc'd. So I saw everyone's names (in Office 360- you can see the names attached to the email addresses) and I think there were two or three recognisably Anglosaxon names in about 80 names.
So I suspect it's more that English is used as a lingua franca for students and researchers from all over the world, than that students and researchers are predominantly native English speakers.
Myself, for example, am not a native English speaker :)
That Veritasium's video is great! I'd also recommend Computerphile's 3-part video series (~10 mins each) which tells the history of undecidability and the incompleteness theorem:
A good book as such. However, whereas the authors claim that they outline the proof in chapter 6, that chapter fell short of it with key aspects of the proof missing. Overall, I was disappointed.
I knew Gödel's (first) incompleteness theorem pretty well already, but I really enjoyed how much that video managed to get across without getting too bogged down.
One thing I'd like to see in a companion video would be an explanation of why Turing machines represent computation. That video (like many others) skims over the why, and only talks about what they can/can't do after we've already decided to use them.
Turing's 1936 "On Computable Numbers" paper gives a nice philosophical justification for his model, which (from my reading) boils down to the following:
- Mathematicians can do their work within a finite volume of space (their brain, or a room, or the whole Earth if we want to be conservative). Also, they could (in principle) do their work using only written communication, on standard pieces of paper, each of which also has a finite volume.
- Finite volumes can only have finitely-many distinguishable states (having infinitely many states would require them to be infinitesimally similar, and hence indistinguishable from each other within a finite amount of time)
- Hence we can label every distinguishable state of a brain with a number; and likewise for every distinguishable piece of paper (at least in principle)
- Since these are both finite, any mathematician's behaviour could (in principle) be completely described by a table detailing how one state (brain + paper) leads to another
- Given such a table, the actual content of the states (e.g. the wavefunctions of particular protons, the electrical potential of particular synapses, the placement of ink molecules, etc.) is irrelevant for the behaviour; only the transitions from one numbered state to another matter
- Hence we could (in principle) build a machine with the same number of states as one of these tables, and the same transitions between the states, and it would exactly reproduce the behaviour of the mathematician
This is the philosophical justification for why a (Turing) machine can calculate anything a human can (in fact, the same argument shows that a Turing machine can reproduce the behaviour of any physical system).
However, this is still a rather hand-wavey "in principle" thought experiment about unimaginably huge numbers. Turing managed to take it further.
For simplicity we assume all the papers are arranged in a sequential "tape", we'll call the distinguishable states of the papers "symbols" and those of the mathematician/machine "states":
- One thing a mathematician can do is read a tape with one of these tables written on it, followed by a sequence of numbers representing the symbols of another tape, and emulate what the described machine would do when given the described tape (i.e. they could keep track of the current state and tape position, and look up the transitions in the table, to see what would happen)
- Since a mathematician can emulate any given table (in principle), so can a machine. This would be a "universal machine", able to emulate any other. (The video talks about such a machine, in the proof that the halting problem is undecidable)
So the question becomes: how unfathomably complicated would such a universal machine have to be?
- These transition tables and tapes can be very big, and may contain very large numbers, but we can write them down using only a small alphabet of symbols, e.g. "start table", "new row", "the digit 7", etc.
- Reading a sequence of such symbols, and emulating the described machine, can get tricky. Turing described a universal machine "U", but he did so in a rather indirect way, which also turned out to have some mistakes and glaring inefficiencies. Davies later worked through these and ended up with an explicit machine using only 147 states and 295 symbols.
Hence we can use a machine with only a few hundred states to exactly reproduce the behaviour of any mathematician (or any physical system), as long as it's given an appropriate description (i.e. "software"). Later work has found universal Turing machines with only 4 states and 6 symbols.
One reason Turing's justification for his model is important, rather than just proposing the model and seeing what happens (like in the video), is that Alonzo Church had already proposed a model of computation (called Lambda Calculus), but didn't have such a justification.
Gödel himself dismissed Lambda Calculus, proposing his own system (General Recursive Functions) as a better alternative. When Church proved they were equivalent, Gödel took that as reason to dismiss his own system too! Yet Turing's argument did convince Gödel that a fundamental limit on computation had been found. Turing proved his machines are also equivalent to Lambda Calculus, and hence General Recursive Functions; so all of these proposed models turned out to be 'correct', but it was only Turing who could explain why.
Personally I consider this reduction of physical behaviour to transition tables and then to machines, to be the main reason to care about Turing machines. Proving the undecidability of the halting problem was also a great achievement of Turing's 1936 paper (as shown in that video), but that can also be explained (I would argue more easily) using other systems like Lambda Calculus.
Without Turing's justification of his model, undecidability comes across as simply a flaw. Sure Turing machines may be (distantly) related to our laptops and phones, but if we found a better model without this 'undecidability bug' we could make better laptops and phones! Turing's argument shows that there is no better model (just equivalents, like Lambda Calculus).
> This is the philosophical justification for why a (Turing) machine can calculate anything a human can (in fact, the same argument shows that a Turing machine can reproduce the behaviour of any physical system).
I don't think this assertion follows. I don't think an argument like this can work without delving further into reasonable description of a "physical system".
If you just just use the mathematics we employ to describe physical systems unreservedly it is possible to construct "physical systems" that exhibit non-computable behavior. For instance you can have computable and continuous initial conditions to the wave equation that produces a solution that is non-computable. See : https://en.wikipedia.org/wiki/Computability_in_Analysis_and_...
I think it's important to emphasize that Turing stated his arguments in regards to "effective procedure" (which I see you mention in a different post). I don't think the substitution of "effective procedure" with "physical system" is justified.
> For instance you can have computable and continuous initial conditions to the wave equation that produces a solution that is non-computable.
Thanks, that's a really nice example which I hadn't come across before (or at least not spent too much time studying). I may have to refine the language I use in future; although a cursory look seems to be compatible with my own understanding (my mental model is roughly: "if we found a halting oracle, we would have no way to tell for sure")
> I think it's important to emphasize that Turing stated his arguments in regards to "effective procedure" (which I see you mention in a different post). I don't think the substitution of "effective procedure" with "physical system" is justified.
Yes, Turing did not say as much (at least in his 1936 paper). He was essentially abstracting over 'whatever it is that a person might be doing', in an incredibly general way. Others have since taken this idea and applied it more broadly.
Another useful caveat is that Turing machines are framed as (partial) functions over the Natural numbers. It's quite a leap from there to a "physical system". An obvious example is that no matter how cleverly we program a Turing machine, it cannot wash the dishes; although can simulate the washing of dishes to arbitrary precision, and it could wash dishes by controlling actuators if we decided to attach some (but even that would run into problems of time constraints; e.g. if calculating the first instruction took so long that the water had evaporated).
The problem with your assumption is that you're assuming there are a finite number of states. There might be an uncountably infinite number of states, for instance if the states were the reals between 0 and 1.
Which assumption are you referring to? If it's about the states in a finite region, note that I specifically limited this to distinguishable states.
Whether or not a finite region can have an infinite number of states (countable or otherwise) is irrelevant; we can only distinguish finitely many in finite time. Two states being indistinguishable means they'll give rise to the same output behaviour (e.g. from a mathematician in a room, carrying out some procedure).
Where in there did Turing prove computation can "exactly reproduce the behavior to...(any physical system)"? Because we humans seem to be able to describe and write down the behavior of any physical system? I'm not sure that jump is airtight.
I'm not sure we can say this part even though we might believe it to be true. I mean maybe now we can since we have no deeper theories than quantum field theories, and we can posit all of physical reality reduces to QFTs and GR and simulate all of QFT computationally either with quantum computers (in real time) or classical (not in real time + need a random number generator). But the we'd still have computational limits by which observables we can measure and how big of computations we can run (think size of computer limited by black hole density for one crude example - although I guess a blackhole could be used as a quantum computer. So then I'd say the speed of light is the other main limit, we've already lost the information to run certain calculations to the depths of space).
And to focus back in on the random number generator requirement for classical to simulate quantum...nothing fundamentally random can be simulated classically in principle. Yet randomness is needed to describe the most basic layer of reality. I can't get true randomness from classical no matter what.
> Where in there did Turing prove computation can "exactly reproduce the behavior to...(any physical system)"? Because we humans seem to be able to describe and write down the behavior of any physical system?
Turing's scenario essentially applies to a person in a room, with an unlimited amount of paper (this could be a movable tape extending through the walls, to prevent the room filling up with paper!).
However, the same argument works if we replace the person with any other physical system (e.g. some mechanical device, or some physics experiment), and replace the room with any finite region of space (e.g. the Milky Way galaxy).
I don't remember Turing himself making this argument (let alone in his 1936 paper), but it's known as the "physical Church-Turing thesis" https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis#V... (the Church-Turing thesis is the argument that all 'effective methods of calculation' can be carried out by a Turing machine)
As for your points about quantum physics, etc. that's certainly true, and it's why the Church-Turing thesis isn't a theorem (and can't be, since we can never know if any particular model of reality is correct). However, it's certainly a valid physical law, akin to nothing moving faster than light, etc. i.e. it's empirically true, theoretically sound, and we have no better explanation for the moment.
Perhaps its origin in mathematics, where it is a second-class citizen compared to provable theorems, prevented the Church-Turing thesis getting the recognition it deserves, e.g from physicists.
Well I might add the quantum complexity-theory version of the Church–Turing thesis.
I believe that quantum computation is a richer more powerful theory. In essence BQP is a superset of BPP in your link.
I think this happens with all sorts of mathematical objects. Non commutative algebra is much richer than commutative algebra. There are also more interesting non commutative things than just deformations of classical objects. Its the classical commutative case that is the weird limit.
Yeah, there are all sorts of layers we can add, especially regarding complexity (i.e. resource usage limits, rather than simple yes/no questions of decidability).
It's interesting to consider P = NP as a physical law, although that's certainly more tenuous than the Church-Turing thesis, etc.
Can you please flesh out the following? I don’t quite follow it..
> having infinitely many states would require them to be infinitesimally similar, and hence indistinguishable from each other within a finite amount of time
Sure. Let's take a finite region, like a room. We don't want to get too bogged down in the details of human psychology, anatomy, etc. so we'll stick to something very low-level like particle physics (which, in principle, can describe a person in exact detail). One state of this region is that it's completely empty (maybe quantum effects rule that out, but we can still include it if we're being conservative ;) ). Another state could have, say, a single electron, at some position we'll call (0, 0, 0). Another could have a single electron at position (0, 0, 1). Another has an electron at (25, 5, 3) and a proton at position (0, 0, 0). And so on.
Let's consider the states which just contain a single electron, somewhere in the room (I'm not bothering with quantum effects, but we would do the same thing just in Hilbert space instead of 3D space). The region is finite, so the coordinates of this electron are bounded: if the boundaries are at, say, 0 and 100 in each axis, then putting the electron at position (101, 0, 0) would be the same state as the empty one we started with (since the electron is outside the region we care about, and there's nothing else).
Now let's say we do some measurement of this electron's position, which is only accurate to whole number coordinates. That gives us 100^3 = 1,000,000 distinguishable states with a single electron. We might imagine all sorts of different states 'in between', but they can't affect that measurement due to its limited resolution.
If we increase the resolution of our measurement by 10, we get 10^3 times as many distinguishable states; another factor of 10 gives 10^3 times as many states again; and so on. However, regardless of how finely we can resolve the electron's position, we can only distinguish between finitely-many states. Any differences finer than that limit are irrelevant to the measurement, and hence to any behaviour that depends on that measurement.
If we could resolve between infinitely-many positions for the electron, that would require distinguishing between states which are infinitesimally-close together, i.e. on an infinitely-fine grid; this would require distinguishing between two numbers whose decimals only differ after infinitely many digits. This doesn't seem like a reasonable capability.
The same principle applies when we have two electrons in the room, or trillions, or any combination of electrons, protons, photons, etc. in any arrangement we like.
A similar thing happens when we reach really large numbers too, e.g. trying to put as many particles in the region as we can: at some point the relative difference between, say, a googolplex and five particles versus a googolplex and six particles becomes too small to distinguish. There will eventually be a limit. (This is like the resolution problem, but instead of adding decimal places to the right of each number, we're adding factors of 10 to their left)
One way to get around this problem of limited resolution is to avoid an explicit 'measurement', and instead have the region itself behave differently for different states. A good example is chaotic behaviour, where tiny changes in an initial state will grow exponentially until those differences eventually become distinguishable. However, the infinitesimally-similar states described above will remain infinitesimally close for all finite lengths of time; in order to distinguish between them, we would need to wait an infinite amount of time.
So putting aside Plank and all the experimental stuff, even on a philosophical level there are problems with thinking of space as continuous. For, that with which we measure space is necessarily discrete and the time-frames in which we measure are necessarily finite. So, even if space was continuous, two ‘states’ that are infinitesimally similar could for ALL practical purposes be considered identical.
As a far less formal analogy, it's similar to how people complain that digital audio in general has lower quality than analogue, just from a discrete vs continuous standpoint. When in fact there's no limit to the quality of a digital representation (if we liked, we could store a gigabyte per sample, or a petabyte, or whatever). Yet we can always draw the line somewhere, beyond which there are no meaningful distinctions.
I know I'm pretty inept when it comes to this subject, but can someone help me understand how the first three points of Turing's philosophical justification listed above can be true? For example, in computing the sequence of n values of an infinite series or the first n digits of pi, how can the claim be made that this can be done in a finite amount of space or with a finite number of states?
Turing's machine includes an "infinite" tape. The number of states, the state transitions, and the number of symbols are finite, but an unbounded number of symbols can be written, one to a cell, on the tape.
We're allowed an unlimited amount of paper, but each piece has a finite size, and hence each piece is in one of only finitely-many distinguishable states. Turing called these piece-of-paper-states "symbols", and arranged the pieces of paper in a "tape" of unlimited length (a stack of sheets, or a library of books would also work; but moving between the pieces would be more complicated).
In fact, we only ever need 2 symbols, e.g. 0 and 1. This is because any set of symbols can be replaced by a set of numbers, and those numbers can be written in binary (with a fixed width, to avoid needing 'spacers'). This can require a lot more machine states, since those states allow the machine to 'remember' the bits it's read so far (e.g. if we're using 8 bits at a time, then we need enough states to "remember" what the first 7 bits were as we're reading the 8th).
For a calculation like pi, we can use the tape to (a) write down the digits (or bits) of pi, and (b) keep track of intermediate variables needed for the calculation. One simple approach to keeping track of intermediate variables is to store them in order on the tape, where each variable is a sequence of 1 symbols, separated by 0 symbols. For example a tape like: 111010010 represents four variables with values 3, 1, 0 and 1 (the number of 1s we see before we hit a 0). This actually corresponds to Peano arithmetic, if we use the symbols S and Z instead like SSSZSZZSZ.
Start
^
Initialise variables and output/variables separator
|VARIABLES
Move variables to make space for next digit
_|VARIABLES
Calculate and write next digit
3|VARIABLES
Move variables to make space for next digit
3_|VARIABLES
Calculate and write next digit
31|VARIABLES
and so on
It might be because pi isn’t infinite in practice. E.g. we could use all the available energy in the universe and only calculate up to a finite digit (ignoring the information storage issue!).
Whether the abstract infinite pi ‘exists’ is up for metaphysical debate :)
def pi_decimal_digits():
q, r, t, j = 1, 180, 60, 2
while True:
u, y = 3*(3*j+1)*(3*j+2), (q*(27*j-12)+5*r)//(5*t)
yield y
q, r, t, j = 10*q*j*(2*j-1), 10*u*(q*(5*j-2)+r-y*t), t*u, j+1
If you mean that pi can't be written as a number to infinite precision using place-value notation (as opposed to a more useful notation like Python) then I agree that only a finite prefix can ever be observed.
What you've written is not pi, it's a procedure for computing pi. Python expressions are not a notation, they fail basic requirements for a reasonable notation (for example you cannot generally determine whether two Python expressions are equal or not).
You may find it interesting to learn that one cannot generally determine whether two real numbers are equal or not: equality on real numbers is undecidable.
I'm well aware. It's also the case that there is no good notation for the real numbers; in particular since any given notation system can only represent countably many distinct entities, almost all real numbers will be are unrepresentable in that notation system. Therefore the notion that the full set of reals actually exists is extremely dubious.
Note that the Python version of pi requires an unbounded amount of memory and, more importantly, an infinite amount of time to produce pi to infinite precision.
Not at all, the reason you may think so is that you're very used to representing numbers using a base 10 number system and in that system it is inherently impossible to represent pi. But if you change how you represent numbers, you can represent pi exactly. What OP did was represent pi using a syntax inspired by a programming language, which is just as valid a number system as any other formal representation as far as correctness goes, although of course as a human I likely would not wish to express most of my use cases involving numbers using a full blown programming language.
That said, one can perform all of arithmetic using that system and represent any property of numbers that could be represented otherwise.
Ah, I see where you are going with this. However, I can offer a version which, without loss of generality, gives another 200-to-1 compression (not counting UTF-8 multibytes): π.
I’ve wondered recently whether the brain can be represented by a Turing machine if our computation mechanisms utilize something like back propagation through time via quantum effects.
Maybe we will make progress in our understanding this millennia :)
Quantum physics can be represented and calculated by Turing machines. There are many ways to do this, depending on preference: via symbolic algebra, via probabilistic inference, via random/pseudorandom numbers, via non-deterministic functions, etc.
Note that quantum computers are no more powerful than Turing machines; although it's thought they might be faster at certain tasks. For example, we know how to factorise integers quickly using Shor's algorithm on a quantum computer, but we don't yet know a quick algorithm for factorising integers on a classical computers. We can still factorise integers on a classical computers: we can do it with a non-quantum algorithm, or we can even do it by emulating a quantum computer and using that emulator to run Shor's algorithm (in this case the emulation is so slow that it exactly cancels out the speedup from Shor's algorithm).
Backpropagation through time is also a pretty simple algorithm, which is already implemented on classical computers.
> Quantum physics can be represented and calculated by Turing machines.
1) Where does the randomness of QM come in though? You need a random number generator alongside your Turing machine.
We coud debate that one can never prove true randomness as it may always be due to some deeper determinism. But if we take the wavefunction of QM as complete, which Many Worlds does - the most compsci-like interpretation - then randomness is fundamental according to theory.
I can have all the TM's I want, but I still need this extra component. Just like the human Turing machine could have all the pen and paper and math in the entire universe and beyond, a human can't produce a truly random number (or can we, then we get into a debate about true free will and whether free will can produce a "random" number).
2) Let's say we instead use quantum computation instead of classical. There are still physical systems we can't "compute", like the state of the entire observable universe, the state of the entire universe, or anything requiring information that has forever left our light cone. I know TM's and quantum versions of TM's are infinite, but they can't bypass the speed of light. So we need some softer version of saying all of physical reality is computable imo.
> But if we take the wavefunction of QM as complete, which Many Worlds does - the most compsci-like interpretation - then randomness is fundamental according to theory.
MWI is an entirely deterministic interpretation of quantum mechanics. Absolutely no randomness is involved which is one of its most appealing properties.
It's interesting that you mention a many-worlds approach, since that doesn't actually involve any randomness. Rather than choosing a single result probabilistically, many-worlds keeps them all. This can be represented quite nicely with what I referred to above as 'nondeterministic functions', i.e. functions which may return multiple values (in some sense, those values are literally the "worlds"). This would look like a logic language a la Prolog, but wouldn't be restricted to depth-first traversal: it could run breadth-first (very slow), concurrently (nice in principle, but would swamp any finite machine with overheads), or iterative deepening (probably the nicest implementation IMHO). Haskell programmers would call this a "breadth-first list monad" (AKA LogicT, AKA Omega). Instead of "returning multiple values", we could also do a continuation-passing transform and invoke the continuation multiple times (concurrently).
This would run exponentially slowly, but that's a whole separate issue ;)
> We coud debate that one can never prove true randomness as it may always be due to some deeper determinism.
We could get a Copenhagen-like approach by doing the same as above, but only returning one of the results, chosen "at random" according to the square of the wavefunction (plus some hand-wavey Copenhagen-craziness, like defining whether or not Wigner's Friend is an 'observer')
When it comes to "randomness" my own approach is to mentally substitute the word "unpredictable", since it tends to be more enlightening. Turing machines can certainly produce results which are "unpredictable", in the sense that the only way to know what will happen is to run the program; in which case that's more like a (repeatable) observation rather than a prediction. (BTW I also think that's a nice resolution of the "free will" paradox: my behaviour may be predicted by observing what an exact copy of me does; but it was still "decided by me", since that copy essentially is "me")
In any case, the outcome of a Turing machine using some pseudorandom algorithm is indistinguishable from "quantum randomness". There is certainly a limit to the unpredictability of any pseudorandom events (given by its Kolmogorov Complexity, which is upper-bounded by the Turing machine's program), but on the other hand I think it's quite presumptuous to think "quantum randomness" has infinite Kolmogorov Complexity.
> There are still physical systems we can't "compute", like the state of the entire observable universe, the state of the entire universe, or anything requiring information that has forever left our light cone.
That's a very different question, since it involves how we choose what to put on the tape, rather than what a Turing machine is capable of computing. It's like criticising a genie for being unable to grant a wish, when in fact it's we who can't think of what to ask for.
Besides this, there is a way for a Turing machine to compute the whole universe, including what's outside our visible lightcone: we simply run every program. That's actually a pretty straightforward task (and remarkably efficient), e.g. see the FAST algorithm in section 9 of https://people.idsia.ch/~juergen/fastestuniverse.pdf
Note that if we run such a program, there is no way to know which of the outputs corresponds to our universe. However, our ignorance of where to look does not imply that such a Turing machine has not computed it!
” Turing machines can certainly produce results which are "unpredictable", in the sense that the only way to know what will happen is to run the program.”
This doesn’t seem true since all predictions are ‘running the program’
Not necessarily; lots of systems have 'shortcuts', e.g. I can predict the result of sin(1000000τ) will be 0, without anything physical/virtual having to wiggle up and down a million times. Lots of systems can be described by equations involving a time variable 't', where we can set t = whenever we like, then solve the equation to get the predicted value.
When a system is Turing-complete, equations like this will inevitably involve some expression with t-1; and solving that involves some expression with t-2; and so on back to the initial condition. Hence there's no 'shortcut' (this is a consequence of Rice's theorem BTW, which itself is a consequence of the halting problem).
“ e.g. I can predict the result of sin(1000000τ) will be 0, without anything physical/virtual having to wiggle up and down a million times. “
This seems to still involves ‘a program’ in that there are foundational mathematical principals that build your confidence/logic that the periodicity of sin(x) is so well defined that any even integer multiple of pi will be zero. e.g. you could write out the logical ‘programatic’ steps that would also teach a math newbie how to make the conclusion you just made about the sin() function.
I agree, but such a "program" wouldn't necessarily look/act anything like the "program" 'sin(1000000τ)'.
What's interesting about systems like Turing machines is that the "program" is a specific, isolated thing like a tape of symbols. In contrast, those "foundational mathematical principles" you mention are more of a diffuse web, with lots of interchangable pieces, all mixed up with unrelated facts. This gives lots of different ways to arrive at the answer. With Turing-complete systems, we always have to come back to the code.
> Note that quantum computers are no more powerful than Turing machines;
But computers with self-consistent time travel are more powerful. It's perhaps physically possible to construct one of those, depending on whether we ever end up with a more-correct timeless theory of physics that makes different predictions to existing physics.
> Backpropagation through time is also a pretty simple algorithm, which is already implemented on classical computers.
I read “backprop through time” as meaning that you have the result of the backprop at the beginning of the process – that's more powerful, if we're using continuous computation. (With discrete computation, you're right that it's no more powerful.)
That would be a major revolution to the foundations of computer science if it were true, disproving the Church-Turing thesis. It seems unlikely that such a fundamental, world-shattering result would be so obscure, so I am much more likely to believe it is false.
As I said, if someone had convincingly proved this, it would shake the field to its core. Since that didn't happen, they can't have convincingly proven this.
With all due respect, it would be very hard for me to believe so. For the simplest case, one can trivially create a Turing machine that simulates a CPU, so I’m not sure the digital computation holds any water.
Well, not time-interrupted but there absolutely is a way to schedule n steps of each core, isn’t there? Similarly to how the universal Turing machine is constructed, or how the nondeterministic TM is made deterministic.
So unbounded nondeterminism is basically hypercomputation? Then I assume one can write a program for the theoretical Actor model that computes a function on every natural number, is that right? Or that it can solve the halting problem for Turing machines, since if it is stronger then it, the halting problem of TMs is no longer true for them (is replaced by their own halting problem).
Does this difference still hold if time were discrete? I may be out of my depth, but it seems intuitive that if time were discrete, you could enumerate all possible interleavings of actors' actions, and reproduce their behaviors in a Turing machine.
Turing machine can simulate multiple Turing Machines but cannot implement multiple Turing Machines communicating with each other because there is no means to communicate.
But since Turing machines can emulate multiple Turing machines, any problem that can be completed by multiple machines communicating can be solved by 1 Turing machine. As such, the computational power is exactly the same.
In traditional HN style, I will now thumb my nose at Vertasium's excellent video due to him skipping over how godel came up with the proof.
He built up the whole damn video for that one moment, when he says "And godel went through all this trouble to come up with this card..." and flips over the card. But how godel did it is at least as cool as the proof itself.
I'm mostly being tongue in cheek with the Veritasium nose-thumbing. But if you saw it and want to understand it more deeply (and in my opinion more satisfyingly), I recommend "What Godel Discovered": https://news.ycombinator.com/item?id=25115746
It explains Godel's proof using Clojure. Quite accurately, too.
The thread is also hilarious. Shoutout to the MIT professor who took offense at someone linking to his wikipedia page: "Wikipedia is contentious and often potentially libelous." ... and then you check his wikipedia page, and it's a super positive article about all of his discoveries and life's work.
> Shoutout to the MIT professor who took offense at someone linking to his wikipedia page […] and it's a super positive article about all of his discoveries and life's work.
Carl Hewitt is no doubt a famed and accomplished professor, but why does nobody point out that he is claiming that Gödel’s proof is incorrect[0]?
I already heard of his conviction that “The Church/Turing Thesis is that a nondeterministic Turing Machine can perform any computation. […] [T]he Thesis is false because there are digital computations that cannot be performed by a nondeterministic Turing machine.”[1].
(Which, to be fair, is a strongly related argument to the one that Gödel was wrong.)
I don’t personally know whether he is right, and his digression about millions of people risking death as a result does not inspire confidence, but in my understanding, neither statement is widely accepted.
I don't want to be a professional Hewitt debunker, but his stance has been ripped to shreds; at this point, he is wrong about Direct Logic refuting Gödel and Turing, and his ActorScript can't be implemented on real machines.
I don't know anything about Hewitt or his stance, but I don't feel like this adds a whole lot to the conversation. I mean I can tell that you're sure that he's wrong, but why should the strength of your belief convince me?
It would be helpful if you added some links or some explanation, or just anything beyond what you've written there.
It’s similar to claiming someone made a perpetuum mobile — extraordinary claims going against basically every mathematician/computer scientist since Gödel require extraordinary evidence.
And the result is quasi always that the one claiming, looked over some small but crucial detail.
Isn’t the order of a proposition included in its Gödel number?
Each proposition is assigned to an increasing prime power, and the increasing list of primes has total order, such that swapping propositions yields a distinct Gödel number.
I think what ProfHewitt means here (based on other writing I've found, as he is frustratingly low on details in these conversations) is "order" in the sense of "first order logic", "second order logic"; not in the sense of "first proposition, second proposition" etc.
His claim is that the proposition "This proposition is not provable", formalized as "P equivalent to P is not provable" is not well formed in a typed logic, as "P"'s type is first-order, while "P is not provable"'s type is second-order. Therefore, his claim is that the proposition is simply not well-typed and therefore not interesting. Godel's proofs were discussing an untyped logic, but according to Hewitt that is not an accurate representation of mathematics.
I don't think anyone in the space agrees with him, though, as far as I could tell from some cursory reading.
Could you provide two statements with equal Gödel numbers and the same characters in different orders, and a detail of how you compute the Gödel numbers?
If two statements have the same characters in the same order, how are they not the same statement? And why would that make Gödel’s statement invalid in Principia Mathematica?
I know of several computer verified proofs of the incompleteness theorem. See entry 6 of https://www.cs.ru.nl/~freek/100/ for a small catalogue.
How does this relate to your objection? Are there bugs in the verifiers? Or did people verify the wrong theorem statement?
So you say that the computer verified proofs of the incompleteness theorem of verifying the wrong theorem? Because there can't be bugs in a computer verified proof.
(Of course there can theoretically be bugs in a computer verified proof. If there are bugs in the verifier. But if 4 independent proof assistants verify a proof of a theorem, this is extremely unlikely.)
Suppose that Turing is wrong. Then Turing gave an algorithm which should defeat Rice's Theorem. Therefore, I may simply ask, to Hewitt: Could you please write up a short program in a popular Turing-complete programming language which proves or disproves Goldbach's Conjecture? It should be trivial!
Of course, on actual computers, Turing's result applies, Rice's Theorem manifests, and such a program will likely run indefinitely, until we get bored and give up.
The point that Hewitt has made in another thread is typical misdirection. See, it doesn't matter which order of logic we're talking about; in general, we have results for higher order logic, all the way up to Lawvere's fixed-point theorem for any Cartesian closed category. To fight this, Hewitt has to disclaim the entirety of 20th-century mathematics. (At least he's internally consistent -- he does deny modern maths!)
Sad fact is that, ever since 1976 (http://www.laputan.org/pub/papers/aim-353.pdf) folks have recognized that Hewitt's claims about the foundations of computation are not just wrong, but flamebait. But he never had the humility to learn from his mistakes.
Loosely, because ActorScript requires some sort of genuine non-determinism where an actor explores multiple states simultaneously, we need some sort of magic non-deterministic computer. A quantum computer isn't enough; simulating an ActorScript program is NP-complete.
Maybe there can be physical machines which run ActorScript, but I would expect them to come with new laws of physics, and also I think that there are probably more comfortable programming languages. We already have https://en.wikipedia.org/wiki/Constraint_Handling_Rules for example.
Because it is? It isn't quite "ridiculous", as it is somewhat more amusing than that. It certainly isn't "funny", FWIW, as I have absolutely no positive connotations with it... "hilarious" feels like the correct term.
>> The machine was still considered impressive decades later when the AI pioneer
Norbert Wiener played against it at the 1951 Paris conference,[AI51][BRO21]
[BRU1-4] which is now often viewed as the first conference on AI—though the
expression "AI" was coined only later in 1956 at another conference in Dartmouth
by John McCarthy. In fact, in 1951, much of what's now called AI was still
called Cybernetics, with a focus very much in line with modern AI based on deep
neural networks.[DL1-2][DEC]
Yeess, that's true. John McCarthy named "AI". And he wisely set the
foundations of the field in the work of his thesis advisor, Alonzo Church, and
the mathematicians of his generation, and the ones before, all the way back to
Gödel.
That is to say, McCarthy set the foundations of AI in mathematical logic, the
branch of mathematics that began in antiquity by Aristotle, flourished in the
Middle Ages, the Rennaissance and the Enlightenment with the works of Arab and
Christian scholars, reached its peak with Leibniz and was finally exalted with
the work of Frege, Russel, Whitehead, Hilbert, Tarski, Skolem, Herbrand, Gödel
himself, Church, Turing and others. The branch of mathematics that gave us
digital computers, much like Jurgen Schmidhuber points out in his article. It is
this work that is the only, well, logical, foundation of artificial
intelligence. That suffices to explain why the earlier "cyerbnetics" branch of
research fell through: because unlike McCarthy's AI, it was not grounded in
solid theoretcial foundations that had already yielded such monumental results
in mathematics and computer science.
And it is the foundations of AI in logic that AI research must return to,
eventually, because it makes no sense to throw out two thousand years of
progress, just because an intellectual midget by the name of Sir James Lighthill
wrote a report. It would be even worse to throw out such richess of knowledge in
return for the ability to put bunny noses and ears on selfies with "neural
networks".
>> To develop and deploy Universal Intelligent System in this decade will require considerable development beyond classical mathematical logic.
I don't disagree and I don't think that mathematical logic is the (only?) way to develop human-like artificial intelligence. Mathematical logic is the branch of mathematics that preoccupies itself with the proof of statements in a formal language. The ability to prove formal statements is probably something that is very useful to have, for any intelligent entity, but whether this ability is useful to create such an intelligent entity is another matter.
But I think we're all so far away from having a clue what an "artificially intelligent" entity might look like, or what "machine intelligence" might be like, that any predication about what may be needed to achieve it and how to come about it, is futile. We'll never know because we'll all be long dead by the time anyone creates a computer that can think like a human. Let alone one that can think better than a human.
If you want my opinion, far from one another. It's an accident of history that
logic, a branch of mathematics, was lumped together with the fuzzy, hardly
scientific goals of the AI project. It's all the fault of McCarthy, of course.
Oh, sure, McCarthy is one of my science heroes. But it is because of him that
reserach that firmly belonged to mathematics and computer science in the broad
sense, particularly the research on automated theorem proving, the natural
continuation of the work of Gödel, Church and Turing, was lumbered with the goal
of somehow showing that it can compute intelligence.
I mean, why is the ability to prove sentences in a formal languages a necessary
requirement for intelligence? As far as I can tell, most humans do fine without
it and so do basically all non-human animals that have anything remotely
recognisable as intelligence.
Why is the ability to play chess a prerequisite of intelligence, for that
matter, or the ability to predict protein structure? All those are typical tasks
of AI as a field of research, because they happen to be "things that one can do
with computers", i.e. computable computations. But what do they have to do with
intelligence? What the hell even is intelligence? Turing gave us a model of
computing machines and if one assumes that intelligence is a computable
function, then it makes sense to try and compute it with a machine, but isn't
that assumption the first thing we should test, before jumping right in and
doing all the things we'd do if we knew it for sure to be true? And if it was
true, wouldn't we have seen some progress towards that goal, after 70 years of
study?
I'm speaking in this as someone who entered AI research because of my interest
in logic programming and automated theorem proving, and mathematical logic in
general. I am not interested in creating "AIs" one bit and yet, here I am,
studying for a PhD in what is traditionally an AI subject. That's just dumb.
Logic should have stayed where it belongs, in mathematics.
But... McCarthy was Church's student, so it was natural for him to set down the
foundations of the field to the seminal work of his advisor, and others like
him. Kind of like he was so keen on chess as an AI task- because he liked to
play chess.
At least McCarthy's personal preferences did help to build a foundation of AI on
theory, like I say in my comment above, so, for a while at least, it could have
been less fuzzy and throw-stuff-at-the-wall-to-see-what-sticks-y than (most of)
it is currently.
I don't understand how Gödel showed limits of AI, or even addressed AI in its modern incarnation, which has little to do with logic. Perhaps some more knowledgeable reader can elucidate?
If your AI uses a classical computer, then everything it's doing is reducible to the Boolean algebra of its component logic gates.
All of the training you've done on the GPU; all of the coordination done by the CPU; everything… it's all one (absurdly long) boolean expression.
Like all algebraic structures, Boolean algebra is built from an initial set of axioms. Under Gödel's incompleteness theorem, it is therefore true that if Boolean algebra isn't inherently broken (inconsistent), then we can craft questions that cannot be answered using Boolean algebra alone, but this is all your classical computer is capable of doing!
Now, if you're training AI using an analog system or a non-classical computer such as a quantum computer, then this opens up a totally different discussion.
> we can craft questions that cannot be answered using Boolean algebra alone
Sure, that's trivial, the travelling salesman problem for example can't be solved exactly, but approximative methods come very close. Given that brains are pattern recognition machines generating probabilities we're already in the approximative regime. Nothing is certain, even science has its revisions.
So does it matter if we can only approximate? For living things the main thing that matters is life, self reproduction, not exactness. As long as they reproduce they are still valid, with all imperfections and approximations. I think the only questions that matter are those related to life, there are plenty of uncomputable things outside self reproduction. This is for biological agents, an AI would be using evolutionary methods to reproduce their digital genes instead.
It's exponential, you can solve it for N=100, but not for N=1,000,000. You can't say "it just takes a lot of time" if that time is > age of the universe.
Actually, you can. There's a huge gulf between things that we know we could compute given enough time, even if that time is impossible to actually be given, and things we know can never be computed given infinite computing power.
Something like the perfect security of either hiding or binding in a commitment protocol can never be broken given computers of infinite capacity and speed because it's mathematically impossible. That is opposed to computational security, which only states that the property can't be broken within any usable time frame. This is often in the order of hundreds of millions of years in practical cryptographic systems. The point is, it can be technically be done.
Breaking RSA can factored just by brute forcing. It might take a lot of time, but actually doing it is not hard for sufficiently small keys. We just increase the key size until it takes a long time.
> You can't say "it just takes a lot of time" if that time is > age of the universe.
1. The difference matters a WHOLE LOT when you're studying computability. If the question you're trying to answer is "Can this be computed exactly?", then the distinction is paramount.
2. Your argument here is functionally equivalent to stating that there exists a largest number since it's trivial to construct a number which would require more digits (in any base you'd like) than subatomic particles in the entire universe to represent in its fully-expanded form… so let's pick this one as the largest and stop worrying about all of that infinity hub-bub!
There is a fundamental difference between "We don't know how to solve this practically since the best solution we have would take too much time." and "It is a problem without solution. It can not be solved at all not matter how you approach it, or even how an imaginary alien specie with technology beyond our understanding would approach it".
Also, the equivalence between the different models of computation for Turing completeness (turing machine, lambda calculus, ...) isn't about complexity classes of the problems, only wether or not the model allow to solve the same set of problems. For example searching in a sorted array is O(log n) on a Von Neumann machine but O(n) on a Turing machine.
Isn't his theorem proving that any mathematical system will be either incomplete or inconsistent ? And quantum computing is still a mathematical system so the same applies ?
Yes, but I didn't cite QC as an exception to the rule. I only mentioned it because it's governed by a different set of axioms and would result in a different discussion.
Without getting too technical, if AI must run on a computer, and there are
programs that computers cannot compute, then there are limits to what AI can
compute, or in other words think (because it's an AI; so to think, it
computes).
As a very crude example of this kind of limitation, I'm sure I've read a hundred
Sci-Fi stories where the humans defeat the AI by posing to it an unsolvable
problem, usually a variation of the "liar's paradox", e.g. "this statement is
false" (which btw is used by Gödel in his proof). So a human traipses over to
the superintelligent AI and blurts "I always lie". Then the AI spends an aeon
processing the sentence and then blows up.
Even worse, intelligence itself may turn out to be an uncomputable program that
cannot be executed on a computer. In that case- no AI.
The scenarios you described are not realistic. There's no AI getting stuck into a loop by a tricky question, not even GPT-3. Any solution would take into consideration its computation cost. AlphaGO for example would evaluate 50K board states per move, it won't go into a 3^361 recursion.
In general when the problem is so hard evolutionary methods are suitable. They naturally blend the notion of cost with that of search and cope better with deceptive objectives.
Your turn of phrase "there's no AI getting stuck..." has me stuck. Are you
assuming that there exist "AIs", as in artificially intelligent entities, like
the kind imagined by science fiction writers and (some) AI researchers, alike?
To clarify, there are no such systems. "AI" is the name of a research field, not
any ability that characterises a class of systems currently known.
This suffices to explain why there is, indeed "no AI getting stuck inot a loop
by a tricky question". Because there is "no AI" at all, certainly not of the
kind that can understand a "tricky question" sufficiently well to stumble on the
paradox inside it.
For example, GPT-3 has no ability to process "this sentence is false" in such a
way as to decide its truth. AlphaGo is not capable of processing language at
all, it is only capable of playing board games and it isn't even capable of
playing board games by reasoning, only by search. AlphaGo searches a game tree
structured as a directed graph, without loops so it's hard to see how it could
get stuck on recursive paradoxes anyway.
In general, such systems as exist today do not have the mathematical properties
of the formal systems described by Gödel, Church and Turing. They don't even
have memories. So they are, let's say immune to incompleteness, because they're
not even incomplete.
It's important though to note that we know of no problem so far that a human mind can solve that couldn't be solved by a Turing Machine. Godel's incompleteness theorems may very well apply to human minds as well, and so far this seems more likely than not (though a lot of human thinking is actually inconsistent, so perhaps it falls to the other side of the incomplete/inconsistent "choice" than formal systems).
Well... maybe human minds are themselves as limited as Turing machines? In that case we may never be able to create a machine that can overcome our, and Turing machines', limitations.
> Thus he identified fundamental limits of algorithmic theorem proving, computing, and any type of computation-based AI
Fundamental limits presuming one has arbitrarily high (but finite) quantities of time and space with which the computations can be performed. Given in real world computation we will never have either, the fundamental limits of real-world computation are a lot less (even infinitely less) than those given by Gödel's work.
Also, demonstrations of the theoretical limits of computation (Gödel, Turing, etc) often make the assumption that we only have finite (even if arbitrarily large) resources, and that true contradictions (dialetheias) are disallowed. If we give up either (or both) of those of two assumptions, we can compute beyond those limits. It may be objected that computations beyond those limits are not physically realisable; but, almost all computations within those limits are not physically realisable either, so how much significance does that objection actually have?
> almost all computations within those limits are not physically realisable either, so how much significance does that objection actually have?
If some thing is impossible with arbitrarily large finite resources, it is still impossible with "practically large" finite resources !
That's why Turing / Gödel results really tell something fundamental about computing/proving; it tells everyone that they do not need to spend time solving an unsolvable problem on computers "as we know them".
But if you prove that something is possible in a your own magic computing framework, it remains practically useless until you implement it in the real world (an example of such a situation is the framework of "quantum computer" trying to solve prime factorization).
> That's why Turing / Gödel results really tell something fundamental about computing/proving; it tells everyone that they do not need to spend time solving an unsolvable problem on computers "as we know them"
Consider something like Turing's proof that the halting problem is undecidable – does that have practical relevance, even just in telling us that there is no point in trying to solve an unsolvable problem?
That's the standard line but I doubt it. The thing is, the halting problem is practically irrelevant; what has practical relevance is the bounded halting problem. Practically, you don't care about the difference between a program that never halts, and a program that halts after a googolplex steps, but that distinction is essential to the halting problem as defined. And, the practically relevant, bounded halting problem, is in fact decidable. Now, it is still intractable, because it is EXPTIME-complete, and we also know (per the time hierarchy theorem) that EXPTIME-complete problems are not in P.
So Turing's proof doesn't actually tell us anything much useful about whether the halting problem is solvable in practice, whereas the proof that the bounded halting problem is EXPTIME-complete does. Suppose (counterpossibly) that the bounded halting problem was in P instead of EXPTIME-complete; then the halting problem might actually be tractable in practice, even though Turing's proof of the unbounded halting problem's undecidability would still hold.
The undecidability of a specific formulation of the halting problem is practically irrelevant, but you should never take published results too literally. If the halting problem is undecidable in one formalism, it is also undecidable in many other formalisms, and so is a wide class of other problems that can be reduced to the halting problem.
Rice's theorem is one immediate consequence of the undecidability of the halting problem. It effectively states that if you are writing code that analyzes (certain external behavior of) other code, there will always be edge cases your code cannot handle. It is useful to understand that the perfect algorithm is not out there, just waiting to be discovered. Instead of searching for it, you should start thinking about the trade-offs between resource usage and the ability to handle increasingly rare edge cases.
I'm not a fan of resource-bounded variants of decision problems, because the standard complexity classes are ugly beasts. Reasoning about them is so difficult that many fundamental problems remain unsolved. And even if you manage to prove something, the excessive use of quantifiers means that you only learn something about the worst-case behavior with some arbitrarily large inputs.
The resource-bounded problems also miss the point. Impossibility results are not really about the impossibility of proving something with realistic or unrealistic resources. The proof ideas are often more important than the results. One idea is particularly simple: if you have already written your code, I can write an input that fools it.
> Rice's theorem is one immediate consequence of the undecidability of the halting problem.
I wonder if one could construct a bounded analogue of Rice's theorem? Something like "all non-trivial, semantic properties of programs that use maximum resources R are undecidable by programs using maximum resources R".
> I'm not a fan of resource-bounded variants of decision problems, because the standard complexity classes are ugly beasts. Reasoning about them is so difficult that many fundamental problems remain unsolved.
It seems to be that, reasoning about actual computers is harder than reasoning about physically impossible theoretical ones, so people would prefer to study the later than the former. Fair enough, but surely the former study is more practically relevant – even if harder – than the later; and I don't know why in the later the study of one particular class of physically impossible machines (Turing-equivalent machines) gets privileged over the study of other more powerful classes of physically impossible machines (such as oracle machines, or supertask machines). All physically impossible machines are equally physically impossible.
(Privileged not in the sense of being ignored – of course there is a great deal of theoretical work done on super-Turing computation; but privileged in the sense that super-Turing computation is often presented as something only of interest to specialists, whereas work on Turing-equivalent computation gets promoted as something of practical relevance and general interest.)
> One idea is particularly simple: if you have already written your code, I can write an input that fools it.
Here's a very similar idea about computation with bounded resources: If you had a perfect program analysis program that only worked for programs consuming up to some resource limit R, then it would require more than R to apply the same analysis to itself.
And either idea may not be true for dialetheic machines [0]. Dialetheic machines are, as far as we know, not physically realisable; but Turing machines aren't either.
You are saying that the decision problem of "Is this algorithm A in P?" is undecidable for arbitrary algorithms A. We don't need a general solution to that decision problem to be able to prove that some particular algorithm is or isn't in P; just like how, we don't need a general solution to the halting problem to prove some particular program does or doesn't halt. The halting problem only means we can't have a computable procedure which will generate such a proof in every case.
The fact that we've actually proven the bounded halting problem is not in P doesn't contradict this. The undecidability of determining whether any arbitrary problem is or isn't in P doesn't prevent us from having a proof that some particular problem is or isn't.
And I said counterpossibly that if the bounded halting problem were in P not EXPTIME-complete, then the halting problem might be practically solvable even if Turing's proof of the undecidability of the unbounded halting problem still held. In this counterpossible, we don't need to be able to solve the decision problem of determining whether an algorithm is in P, all we need is a proof that this one particular algorithm is in P. The undecidability of the decision problem in general tells us nothing about whether we could know its answer for individual cases.
It's a harsh, but true reality that with only have finite quantities of time and space, especially in real world computation. The assumption of finite resources is also a harsh, but true reality.
There are many things that fundamentally defy any form of computation. Things which cannot create mathematical formulae to represent, nor create algorithms for.
For a major example, we cannot translate the mental activities of conscious, aware living beings into any sort of computable form, as these activities are not algorithmic in nature. They aren't random, nor deterministic ~ rather, they are indeterministic, following no rigid patterns.
Even the most complex, complicated computer program follows an algorithm, which is ultimately deterministic. Even if you throw in some random inputs at some parts, the algorithm still acts deterministically. Perhaps the only really indeterministic part in an algorithm's behaviour might be hardware level bugs and errata which interfere with the otherwise very predictable algorithm.
> For a major example, we cannot translate the mental activities of conscious, aware living beings into any sort of computable form, as these activities are not algorithmic in nature. They aren't random, nor deterministic ~ rather, they are indeterministic, following no rigid patterns.
How could you possibly know that? Following no known patterns, maybe.
Trivially, for any finite string (in a finite alphabet), there is a finite program (Turing machine, whatever) which outputs that string given empty input. Hence, "the mental activities of conscious, aware living beings" are trivially computable if there exists a finite string which describes them with perfect accuracy.
Furthermore, there obviously are some computable patterns in those activities, so the shortest possible computable program to generate that description will actually be shorter than the length of the description itself.
One could respond that a string describing those mental activities, no matter how accurately, is a different thing from the mental activities themselves. I think that is indeed the correct response, but it has nothing to do with any questions of computability.
(Another response would be to claim that those mental activities cannot be finitely described because they are actually infinite. Few however will want to claim that human minds are infinite.)
I guess you're right. But I was trying to defend Valmar's statement which I still think is basically right in this context. Computers can't really implement non-deterministic algorithms.
Well they can. Quantum computers are inherently non-deterministic.
Even for a classical computer, access to a true randomness source (such as a sufficiently good hardware RNG) is enough to make a classical computer non-deterministic, and hence programs that rely on that true randomness source are classified as non-deterministic.
In practice, we sometimes classify a program as nondeterministic even if it only has access to pseudorandomness, provided that pseudorandomness is "random enough". If a program uses a high quality PRNG seeded with the current time, that might be practically considered non-deterministic, even though strictly speaking the program's behaviour is a deterministic function of the current time (and other inputs).
Artificial neural networks can be either deterministic or non-deterministic.
Biological neural networks are non-deterministic.
(You can debate philosophically whether apparent non-determinism is actually fundamentally non-deterministic or ultimately reduces to some hidden determinism. It depends on one's choice of interpretation of quantum theory – many worlds and hidden variables both claim that reality is fundamentally deterministic, other interpretations do not. But, whether or not non-determinism ultimately exists, it certainly apparently exists, both in biological and technological systems.)
Yes I was asserting that algorithms are by definition deterministic. But I guess you can technically consider non-deterministic algorithms but those aren't really relevant here. Computers don't behave non-deterministically. Every program in your computer describes a deterministic process.
If a machine is deterministic, all programs for the machine must be deterministic. Gödel’s theorems don’t contradict that. If you think they do, you’ve misunderstood them.
Fun fact, on the PDP-7, there was a rare case of completeness, where the instruction "-0" not only encodes itself, but also operationally loads itself.
For me it just means that math is just like the code. If you start defining stuff and cross-calling it without special care you will end up in trouble in form of infite loop or recursion or in case of math, a paradox.
From what I understand it is that you might reason about but not prove that statement "this statement is not provable" must be true but thus unprovable.
Godel's result shows that if you are willing to assume axioms that let you do arithemetics, statement equivalent to this paradoxical one just crops up.
For me it just means that you need to put additional restrictions of what you can do in math so that kind of statement is excluded from math.
Same way that "set of all sets that are not their own elements" got excluded from consideration by more carefully defining what set theory is involved with.
>> For me it just means that you need to put additional restrictions of what you can do in math so that kind of statement is excluded from math.
The thing about Godel's proof is that he didn't just come up with one example of a "this statement is not provable" sentence; he proved that any sufficiently complex math system will have unprovable sentences (some true, some false, some you'll never be able to know). And "sufficently complex" is not all that complex! So if you want to be able to do calculus, or even most arithmetic, you can't just "put additional restrictions ... so that kind of statement is excluded from math."
He proved that any sufficiently complex math system will have unprovable sentences.
That's mostly because "sufficiently complex" is defined by mathematicians, not computer scientists. In particular, it includes infinities, which do not exist in the physical universe. Godel's proof requires allowing unlimited depth recursion, which does not exist in the physical world.
The halting problem is decidable for any deterministic system with finite memory. Either you halt, or you repeat a state. This covers most real-world computer programs.
There's a useful subset of arithmetic and logic which is completely decidable. It contains integer addition, subtraction, multiplication by constants, and all the logic operators. This covers most subscript checking in programs, which is quite useful. You can probably add modular arithmetic with a fixed upper bound to that subset.
Now, some systems may require very large amounts of work to prove, but that's different from being undecidable. "Very large" is not infinite. That heads us off into the theory of lower bounds, P = NP, and all that, where there are still many open questions.
This knocks out many of the sillier claims about undecidability making something impossible in the real world.
>In particular, it includes infinities, which do not exist in the physical universe.
Bold claim. As far as I know the centers of black holes are indeed non-removable singularities of (some) solutions to GR equations.
What's funny is I had a debate with a cosmologist where I was on your side of the debate because he was a mathematical realist (and therefore forced to reconcile the same divergences with the "reality" of math).
> The halting problem is decidable for any deterministic system with finite memory. Either you halt, or you repeat a state. This covers most real-world computer programs.
Problem being that you cannot decide whether a system with such finite memory halts without using another different system that has even more memory. And if that larger system doesn't "real-world" exist, can you truly say that the halting problem is decidable?
You can determine if a program halts by running two copies in lockstep, one at half the speed of the other. If their states are ever the same after the first step, they're in an infinite loop.
That was actually used in an early batch system for running student jobs in an interpreter. A successful student job ran for under a second; one in a loop ran for 30 seconds until it was killed for taking too long. So detecting simple loops, even with substantial extra overhead, was worth it.
Yes, but the point is, if you have an amount of memory M, there exists an amount of memory N for which there is no program that fits into M that can successfully predict whether an arbitrary that fits into N will halt. No infinites required.
(since your example is using two copies, you're essentially using 2M memory)
For the same reason, when someone uses halting problem to 'explain' why their program freezes, they are wrong. They are wrong on multiple counts actually.
For pure mathematics, this stuff is relevant. For all practical purposes, it is not (as far as infinite-precision analog memories are not involved).
Sure you can. Just ban the concept of provability from your math language (or some elements that lead to it). Same way like they banned weird kinds of sets and started talking about set families instead of sets of sets.
If you declare that set being its own element is nonsensical statement you no longer have a paradox with set of sets that are not their own elements.
Base assumption of math is that correctly constructed mathematical statement is not nonsensical. It must be true or false or not determined by current set of axioms. And I think that assumption of sensibility of any math statement leads to paradoxes.
In the case of set theory, you can exclude sets of sets and still have an extremely rich and deep mathematical theory of sets. (But it is definitely still incomplete - for example ZFC can't prove the continuum hypothesis, and if you take the continuum hypothesis or its negation as an axiom, there are still infinitely more unprovable statements in that mathematical theory)
In the case of Godel's incompleteness theorem, excluding the amount of math you need to in order to avoid unprovable sentences leaves you with only very simple axiom systems. If your mathematics is complete enough to even do simple arithmetic (i.e. Peano arithmetic), then there are unprovable statements.
It's not as easy a fix as Russell's set theory paradox, which is why it's an important and foundational aspect of nearly all mathematics even today.
A set of elements that lead to the concept of provability[1] are: 0, 1, addition, multiplication, quotient, remainder and inequality.
Quotient and remainder are definable implicitly in the language of number theory (the language of Peano Arithmetic). Sometimes inequality is explicitly defined, and sometimes it is implicitly defined using exists and addition.
From these elements you can go on to define the Goedel beta function, which allows you to encode lists of numbers and extract the nth element from such an encoding. From there you can go on to define arbitrary primitive recursion (and even more). From there provability can be defined as a primitive recursive function.
I'm not sure which element there you want to ban. Maybe you want to ban multiplication?
You can ban the way how you combine the elements. When you reach the step where you define primitive recursion you could say that statements that involve recursion of depth of more than hundred are not true or false or undecided but just meaningless and excluded from mathematical consideration.
I know it sounds silly but the statements that prevent infinite recursion in programming languages often do look silly. They look like a hackish stopgap that doesn't fit the pristine recursive algorithm. Yet they work and protect you at the cost of the recursion not to be able to correctly deal wit stuff that would need a deeper recursion.
These statements are not phrased in terms of recursion.
Once you inline all the definitions then they are phrased in terms of arithmetic. That's the whole point of all this!
Have a look at http://tachyos.org/godel/Godel_statement.html and tell me exactly why that formula is banned? Is the formula too long? To many uses of multiplication? Give me a decision procedure.
Yes. To avoid paradoxes you need to limit nesting, inlined or not. This statement reeks of uncontrolled infinities with all the quantifiers and I don't even know what some symbols mean like 0''' Because of my ignorance I can't point the exact problem here but I don't think it is the multiplication.
Alternatively you might just redefine the concept of something being true such that you only provable things are true and unprovable ones are either false, undecidable or nonsensical.
And nonsensical thing is defined as a statement thats unprovable but seemingly true.
I think there many ways to fix this just by restricting yourself with how you define things. And it's not about restricting arithmetc because that's not the core of the issue, that's just the (simplest?) example.
So your problem is with the use of unbounded quantifiers that range over all natural numbers?
So for example you would consider "∀x. ∀y. x + y = y + x" a nonsense statement because we are quantifying over all natural numbers, and there are an infinite number of natural numbers, so we cannot quantify over them?
(For the record the ' in 0' or x1' is a post-fix notation for the successor operation. See http://tachyos.org/godel/proof.html for details).
I don't know what the problem with that particular long statement is.
You might just say that a thing is nonsensical if it's not provable but is not false either.
This might be sensible 'stack overflow' exception if we really are unable to provide reasonable limits on self reference reference and reasoning relying on infinities.
Nonsensical if it's not provable with respect to what theory exactly? Elementary function Arithemetic? Primitive Recursive Arithemetic? Peano Arithemtic? Martin-Löf type theory? ZF set theory? ZFC+"there exist an infinite number of Woodin cardinals"? "The set of true statements of number theory"?
Each of these logical theories are each able to prove an increasing number of arithmetic propositions. What is or is not provable is relative the deduction system or selection of axioms.
For example, that big expression that I linked to is designed so that isn't provable in Peano Arithmetic, but it will be provable Martin-Löf type theory, ZF set theory, etc.
I'm sorry that no I don't get the point. You keep talking about [Gödel 1931] proposition I'mUnprovable, but today is 2021 not 1931, and I'm talking about a specific proposition that I've linked to which is, what appears to me to be a clearly well defined first-order logical proposition involving classical logic, zero, successor, plus and times.
I want to know if you object to the existence of that formula that you can see on your screen with your own eyes, and if you do object to it why you do.
Because I contend that that formula is of exactly the same character as Goedel's statement, with the difference being that with a computer, and modern encoding functions, we can actually strip away all the definitions and compute the fixed point and literally print it out onto your screen. It is right there in front of your eyes. Look at it!
Yet you keep on insisting that fixpoints don't exist despite the fact that I have pointed you directly to a formula that has been specifically computed to satisfy a logical fixed point equation.
It's like saying that there cannot be a fixed point of the function f(x) := 3x-10 by waiving your hands and claiming functions don't have fixed point because numbers need to be ordered. But that is daft. All you have to do is compute f(5) = 3*5-10 = 5 to see that 5 is a fixed point of f.
Take a look at this. The proposition 0=0 is a fixed point of phi(X) = ¬(¬X). Just do the logical deduction to see 0=0 ⇔ ¬(¬ 0=0) is a true arithmetic statement. See even fixed points for propositions sometimes exist!
Perhaps a better explanation - it's similar to the halting problem in computer science. There are programs p where "Program p halts on input x" is unprovable (equivalently there is no computable way to determine which p,x halt), and the key is that we have no way of knowing which programs p and x this is true for, of the ones we don't yet have proofs/halt-prediction-programs about.
You may want to argue that this can be "solved" by never letting programs take other programs as input, which isn't wrong, exactly, but you're left with not very much you can compute in that paradigm for computation.
I think it's important to point out that you can do calculus and arithmetic and only construct proofs that are provable. The possibility of deriving unprovable but true statements in an axiomatic system doesn't mean they're common or you can't prove all the satatements one has encountered thus far.
Schmidhuber has done a lot of important, fundamental work, at the time when none of it was “cool”, but whenever I stumble upon his writing it always reads like a bad history book, where when and who are more important than what or why. I get (and moderately support) his obsession with proper attribution, but I wish he first talked about ideas, and then delved into the intricate historical details.
What most ppl don't notice is that Gödels incompleteness theorems are themselves expressible only in a logic system capable of expressing Peano arithmetic. Now this means that they apply to themselves which means that we cannot know if they are made up from a axiomatic system that can prove anything.
This is not quite true, Gödel's incompleteness theorems can luckily be formalized in extremely weak fragments of Peano arithmetic such as primitive recursive arithmetic (PRA) with its very limited induction principle. :-)
The only position on the philosophy of mathematics I know which does not accept PRA is ultrafinitism.
It's ok to use logic to prove theorems in logic. We also use thinking to deduce some facts about our thinking. Self-reference is not always paradoxical. In a higher-order logic some self-referential definitions are legitimate.
Yes and then you can do self-referential statements again which leads you to the conclusion that Gödels theorems are provable only in a system that cannot prove its own consistency.
True, but it's not a logical paradox. The fact that a system can't prove its own consistency doesn't imply that this system is inconsistent. The fact that Godel's theorems are provable only in such systems also doesn't imply that they are wrong.
From the common sense the whole situation looks paradoxical, indeed. To prove consistency of some theory we have to use some stronger theory, to prove consistency of that theory we need even stronger theory and so on.
Perhaps, the best way to realize why there is no paradox is the following:
Our memory is finite. As well as our thinking. But the total number of true facts about mathematics is infinite. By constructing theories we are trying to compress the infinite number of true facts into some finite form. Godel's theorem says it's impossible. And it looks quite natural from this perspective.
I never said it is a paradox. I never said that PA or Q is inconsistent. I said that they cannot prove their own consistency by Gödels theorem. Hence we don't know if Gödels theorem was formalized in an inconsistent system.
Honestly it would be weird if the incompleteness theorems don't apply to themselves.
It's not Peano arithmetic, it is basically 'enough of' arithmetic for Goedel's methods to apply. Robinson arithmetic is weaker than PA but Goedel still applies. Goedel's argument is basically a meta-argument about any mathematical system which is rich enough to describe useful mathematics, it does not rely on any particular axiomatisation, rather it applies to all axiomatisations with a few simple features.
While it is true that Goedel's theorem applies to weak systems such as Robinson Arithmetic (and any decidable extensions there of), The proof of Goedel's result itself requires at least some amount of induction.
As a consequence the minimum system that Goedel's second incompleteness applies to is stronger than the minimum system that the first incompleteness theorem applies to.
Q cannot prove its own consistency. Which means there is no way of telling that Gödels theorems are proved in a theory that is inconsistent (where everything is true).
I’m not too knowledgeable on the topic, but Gödel’s proof uses a smaller logic system — on which a meta-language can be used to prove consistency/completeness. It is precisely about not being able to prove these properties “from within”.
Schmidhuber certainly can't be accused of not being consistent ... article is basically yet another long rant about people not getting the credit they deserve.
At this point, it looks like this theme has become a very central piece of his mental world.
Veritasium missed that existence of the [Gödel 1931]
proposition I'mUnprovable leads to inconsistency in
foundations by the following simple proof:
Ever since Euclid, it has been a fundamental principle
that a theorem can be used in a proof (the principle of
TheoremUse), that is, {⊢((⊢Ψ)⇒Ψ)} [cf. Artemov and
Fitting 2019]. However, by [Gödel 1931],
⊢(¬I’mUnprovable⇔⊢I’mUnprovable). Consequently,
⊢(¬I’mUnprovable⇒I’mUnprovable) by TheoremUse.
• Therefore ⊢I’mUnprovable using
ProofBySelfContradiction {⊢((⊢(¬Ψ⊢Ψ)) ⇒ ⊢Ψ)} with
Ψ being I’mUnprovable.
• Thus, I’mUnprovable using TheoremUse {⊢((⊢Ψ)⇒Ψ)}
with Ψ being I’mUnprovable. Consequently,
⊬I’mUnprovable using ⊢(I’mUnprovable⇔⊬I’mUnprovable)
Having both ⊢I’mUnprovable and ⊬I’mUnprovable is a
contradiction in foundations.
I’m sorry for my comment at https://news.ycombinator.com/item?id=27537500. In hindsight, I realize now that it sounded like I was making fun of you. I didn’t intend to do that at all. I think your concerns about Wikipedia were surprising, and I only meant to express surprise.
Thank you for taking the time to patiently explain to everyone your ideas about improbability. It takes a lot of courage to present an argument the way you’ve been doing in these threads. And you present your proofs in mathematical form here, which is rare, and worthy.
(I’m posting this here simply because it’s your most recent comment, so that you’re likely to see it. I wasn’t quite sure where to put it.)
(I meant “Your ideas about provability,” not “Improbability.” My phone changed it, so it probably sounded like nonsense. Point is, it seems to me that your ideas about the incompleteness theorem might be true, and I am surprised and impressed that you’re in here explaining to everyone mathematically why they might be true. I don’t have the expertise to judge, but I recognize excellence when I see it. And your explanations are clearly excellent, because if they were mistaken, someone could check your math and point out the mistake. But no one has done that, so it seems to me you might be correct, even though everyone seems to think it’s guaranteed you’re wrong.)
If there’s somewhere I can follow your work in this area, I would be interested to see how it turns out. For example, if you’ve written a paper on this subject, or if there’s some place online where you post updates related to this.
For the record, Gödel proposition can be constructed[1] in the same manner that Quines[2] can be constructed. I myself have formalized[3] the first incompleteness theorem in the Coq proof assistant, and many others have as well in various proof assistants.
You can find specific and concrete instance of a Gödel proposition written by Stephen Lee at <http://tachyos.org/godel/Godel_statement.html>. It really is not that large of a proposition. I'm not sure how you can claim that proposition that I can see with my own eyes does not exist.
Since you allege the existence of I'mUnprovable propositions renders Peano Arithmetic (see <http://tachyos.org/godel/Godel_statement.html> cited above), Coq, HOL Light, Isabelle and every other proof assistant that has proven the incompleteness theorem inconsistent, I look forward to you facilitating the development of a formal contradiction in any one of these systems. In particular a proof of False in Coq[1] would greatly aid me in getting through some of my troublesome proofs.
Very good. Then surely the paper proof can be translated into an inconsistency of Coq, Isabelle or HOL-light, since all of these proof assistants already prove the existance of such an "I'mUnprovable" proposition.
Nice to hear from you again, Carl. Your comments about inconsistent foundations brings to mind the Univalent Foundations approach to mathematics--can you comment on how your research relates to univalent foundations and homotopy type theory?
I have relied on Douglas Hofstadter and Roger Penrose for explanations of Gödel's theorem. Hofstadter is easier - Gödel Escher and Bach. Penrose's explanation is dryer, but more direct - The Emperor's New Mind.
I understand the idea of Gödel numbering; I understand Cantor's diagonal slash; but I struggle to assemble the components into a proof that convinces me. I think I just lack the cerebral capacity.
I'm not sure about "claims to". I have misplaced my copy of the Penrose book. Hofstadter's book is humorous, despite its weight. It doesn't purport to be a canonical explanation of the theorem.
Both books are ultimately focused on the nature of consciousness, although from quite different perspectives. In both cases, the theorem is presented as one element of an argument.
It's interesting that two sophisticated writers with wildly different opinions about consciousness have both leaned on Gödel to back their arguments. There's nothing in the theorem that shouts "This is about consciousness!"
A strong AI would need a theory of incompleteness in order to save time when solving problems, and that's not as simple as saying "this program is taking more than a second to finish, I quit!". There must be some intuition to make the decision.
I'm puzzled as to why we'd somehow expect so much more of artificial intelligence than we expect of natural intelligence, or that we somehow don't expect to be able to develop artificial intuition systems.
My own intuition ;-) is that intuition seems to derive more from pattern matching than from logic.
Tangentially related: the book "Gödel, Escher, Bach: an Eternal Golden Braid" by Douglas Hofstadter is a computer science classic that deals with incompleteness among other things (mostly self-references in formal systems) in a fun way.
I appeciate how Schmidhuber puts focus on non-US-centric history of Deep Learning and modern AI.
He sometimes overdoes these. Even when something is not related to Deep Learning and modern AI. He tries to push very narrow and one-sided views without much room for nuance.
For example- "Turing used his (quite inefficient) model only to rephrase the results of Gödel and Church on the limits of computability."
He is trying to say that Turing did nothing new, his work is a mere extension of Godel's (!), and he only accounted for practical computational reality. That is not true.
He goes on further by saying, "Later, however, the simplicity of these machines made them a convenient tool for theoretical studies of complexity." Like, Turing and Church's work was only good for that. I digress. I believe that Turing's work is more seminal and paved the way for modern general computers.
Another bit here says "The formal models of Gödel (1931-34), Church (1935), Turing (1936), and Post (1936) were theoretical pen & paper constructs that cannot directly serve as a foundation for practical computers."
It is putting the works of Turing, Post, Church, and Godel all together. Like they are the same. I see the logic as they are all impractical, but they are different levels of impractical. Putting them all under one bracket like that makes no sense.
I respect Schmidhuber a lot, but some of his behaviour is borderline childish which are more pronounced in his claims about AI history. You should be skeptical about what he says (and what anyone says).
Despite all this, I did not know many things written in this article. I learned a lot and had fun reading it.
I first read about Zuse in Isaacson's Innovators book. Did not know that he used a high-level language for a chess program. Fascinating. He is very underrated.
This is classic Schmidhubris, a mixture of truth and exaggeration.
Turing's paper is a landmark, and includes even an equivalence proof of the power of the lambda calculus and his universal machine. In doing so, he came up with a fixed-point combinator, an applicative combinator, now called the Turing fixed point combinator [1]. Especially appropriate on a website run by the Y combinator.
Turing's paper should run as a TV ad "But wait, there's more..." Many applications of topology (he argues why the alphabet should be finite, rather than simply saying that it is finite, by using the Bolzano-Weierstrass theorem, essentially), combinatorics, logic, functional programming - when FP was hardly a decade old and largely unknown outside Princeton - and all from an undergraduate course project report!
It's an honor to receive a response from you. I agree on all points, especially the lack of combinators for strongly typed systems. (As I understand it, it is impossible to assign types to such terms.) My main point was that disrespect for Turing's paper is largely unjustified.
I see neither hubris nor exaggeration. Before Turing, it was Church who proved the equivalence of the power of his lambda calculus and Gödel's universal model.
Agree, the quote lumping the models of computation together is very misleading. Turing’s machines are immediately obvious how to mechanize and automate, even with very old technology. That was the key difference. Then proving them equivalent to mu-recursive functions and lambda calculus showed that these models of computation actually deserved to be called such. It proved that they could be implemented on a machine with no human ingenuity or creativity required to carry out the steps, which is the heart of what it means to be computable.
Neither did Gödel's automatic theorem prover require "human ingenuity or creativity required to carry out the steps" to enumerate all the theorems from the axioms, which are also enumerable. Gödel really described "the heart of what it means to be computable."
The problem is that if you don't like Schmidhuber's interpretation of the history of logic, computer science and AI, and you want to propose a better interpretation, you need to know that history at least as well as he does.
And "know that history as well as he does" approach is not suited for knowledge fields. The notion that you need to know as much as a creator to critic their works is childish and immature.
I might not know as much as him about the field, but I am sufficiently well versed about the topics discussed in this article to point out his biases and fallacies.
I am a fan of people like LeCun and Schmidhuber, but I don't worship them.
I have read many articles and listened to many talks by JS, and the kind of biases present in this article is nothing new.
>> The notion that you need to know as much as a creator to critic their works
is childish and immature.
Schmidhuber is not a "creator" in the sense that Pekinpah, or Coppola, are
"creators" and his works are not works of art that can be experienced and
criticised by anyone based on aesthetics alone (even though aesthetics also can
take a lot of honing).
In any case my comment was that, to propose a better interpretation of the work
described by Schmidhuber (not "to point out their biases and fallacies", as you
say), you need to know that work at least as well as him. Otherwise, you find
yourself criticising the work of someone who knows more than you know.
I don't know about you, but I've done that in the past and it made me look and
feel like a fool. You may think you are "sufficiently well versed about the
topics discussed in this article" but if you so dismiss the need for general
knowledge, beyond the contents of one article, then there may easily be any
amount of knowledge that you're missing and that makes you think you know all
you need, simply because you don't know what you don't know.
Remember what Socrates, the wisest of men, said: "I know one thing, that I know
nothing, Jon Snow".
I may have garbled that a bit there. But you get the picture. Arrogance cannot
replace knowledge. Never assume that you know more than you need to know to
prove wrong an expert in a field of knowledge as deep and with as long a history
as AI, just because you have an internet connection.
Finally, nobody "worships" LeCun or Schmidhuber. What people admire is their
knowledge and the hard work they put into acquiring that knowledge. And the
reason they admire it is because anyone who's ever try to get to grips with a
scientific subject understands how much hard work it takes to acquire
knowledge.
You write that he is "trying to say that Turing did nothing new, his work is a mere extension of Godel's (!), and he only accounted for practical computational reality. That is not true." However, Schmidhuber's text on Gödel/Church/Turing/Post/Zuse is much more nuanced than that:
"In 1935, Alonzo Church derived a corollary / extension of Gödel's result by showing that Hilbert & Ackermann's famous Entscheidungsproblem (decision problem) does not have a general solution.[CHU] To do this, he used his alternative universal coding language called Untyped Lambda Calculus, which forms the basis of the highly influential programming language LISP.
In 1936, Alan Turing introduced yet another universal model which has become perhaps the most well-known of them all (at least in computer science): the Turing Machine.[TUR] He rederived the above-mentioned result.[T20](Sec. IV) Of course, he cited both Gödel and Church in his 1936 paper[TUR] (whose corrections appeared in 1937). In the same year of 1936, Emil Post published yet another independent universal model of computing,[POS] also citing Gödel and Church. Today we know many such models. Nevertheless, according to Wang,[WA74-96] it was Turing's work (1936) that convinced Gödel of the universality of both his own approach (1931-34) and Church's (1935).
What exactly did Post[POS] and Turing[TUR] do in 1936 that hadn't been done earlier by Gödel[GOD][GOD34] (1931-34) and Church[CHU] (1935)? There is a seemingly minor difference whose significance emerged only later. Many of Gödel's instruction sequences were series of multiplications of number-coded storage contents by integers. Gödel did not care that the computational complexity of such multiplications tends to increase with storage size. Similarly, Church also ignored the spatio-temporal complexity of the basic instructions in his algorithms. Turing and Post, however, adopted a traditional, reductionist, minimalist, binary view of computing—just like Konrad Zuse (1936).[ZU36] Their machine models permitted only very simple elementary instructions with constant complexity, like the early binary machine model of Leibniz (1679).[L79][LA14][HO66] They did not exploit this back then—for example, in 1936, Turing used his (quite inefficient) model only to rephrase the results of Gödel and Church on the limits of computability. Later, however, the simplicity of these machines made them a convenient tool for theoretical studies of complexity. (I also happily used and generalized them for the case of never-ending computations.[ALL2])"
You also write that Turing's work "paved the way for modern general computers." According to the text, however, this practical part really started with Konrad Zuse's patent application of 1936:
"The formal models of Gödel (1931-34), Church (1935), Turing (1936), and Post (1936) were theoretical pen & paper constructs that cannot directly serve as a foundation for practical computers. Remarkably, Konrad Zuse's patent application[ZU36-38][Z36][RO98] for the first practical general-purpose program-controlled computer also dates back to 1936. It describes general digital circuits (and predates Claude Shannon's 1937 thesis on digital circuit design[SHA37]). Then, in 1941, Zuse completed Z3, the world's first practical, working, programmable computer (based on the 1936 application). Ignoring the inevitable storage limitations of any physical computer, the physical hardware of Z3 was indeed universal in the "modern" sense of Gödel, Church, Turing, and Post—simple arithmetic tricks can compensate for Z3's lack of an explicit conditional jump instruction.[RO98]"
So you don't agree with the text but can you point out a concrete error instead of referring to "Many of the child parents to my main address to that"? I could not really find anything convincing in those child parents.
I agree with much of what he says, I just pointed out things I don't agree with.
And what's up with the wall of quote? I read the article in full.
On top of what I already said, I add that crediting Lambda Calculus as only the basis of a high-level programming language does not do it justice.
He is giving Turing and Post credit in one paragraph, and taking it away in another. As if like he cannot decide.
More importantly, Turing's approach is completely reduced to adopting "a traditional, reductionist, minimalist, binary view of computing—just like Konrad Zuse". It is not giving even close to the credit deserved. Many of the child parents to my main address to that.
I could not really find anything convincing in those child parents. What exactly is the important contribution of Turing that is not mentioned in the text? Furthermore, the text does not credit "Lambda Calculus as only the basis of a high-level programming language." It says that in 1935, Alonzo Church used it to derive "a corollary / extension of Gödel's result by showing that Hilbert & Ackermann's famous Entscheidungsproblem (decision problem) does not have a general solution." That was the important contribution of Church before Turing proved the same thing in a different way.
Both the discovery of Lambda Calculus and the Turing Machine as models of computation are quite important and immediately useful, I'd say. Though very formal, just a little bit of syntactic sugar and eye-squinting make for useful programming constructs and more or less direct lines to early as well as modern programming languages.
Only true for nondeterministic systems. Which all computers are as there is physical nondeterminism. However, it can (mostly) be ignored in any practical application of these theories, because the chances of you ending up in a nondeterministic computation at all are minute, never mind an unbounded one. And we are talking about practical applications here.
Many-core and networked computer systems fight indeterminacy. My networking algorithms would be a lot simpler if I could guarantee that everything operated in lockstep; the fact that I have to discard lockstep to get performance is because lockstep is a high-cost abstraction over a high-entropy (so, basically non-deterministic) underlying reality, not because indeterminacy is somehow inherently better.
For networking, yes. For multicore, it's merely a concession to the fact that instructions on my architecture are variable-length, and there's a transparent cache mechanism (requiring knowledge of memory access patterns, which requires knowing the result of the computation ahead of time).
Not necessarily? If you have a real-time OS and you write your program well, you can synchronise timings and have cores send messages to each other without queues. It's hard to write fast code that does that, but in narrow circumstances, it's possible (I'm thinking embedded applications) – and when it is possible, it's faster than the equivalent algorithm with indeterminacy.
Indeterminacy slows things down. It's a concession.
And natural determinacy, if you can get it, speeds up processing more than indeterminacy. The useful question is “what's the lowest-level model I can usefully use?”, not “can I do it with indeterminacy?”.
Do you think the above statement is wrong? If so, why?
I've tried really hard to understand which part of that paper is an example of a computation that a Turing machine couldn't implement, and I have been completely unable to do so. I will say that the writing in the paper is in serious need of some basic copy editing - it is full of typos, repeated phrases and entire paragraphs (especially in the abstract).
The repeated paragraphs are in the abstract on the site, not in the actual pdf. Here is a sample:
> “Monster” is a term introduced in [Lakatos 1976] for a mathematical construct that introduces inconsistencies and paradoxes. Euclid, Richard Dedekind, Gottlob Frege, Bertrand Russell, Kurt Gödel, Ludwig Wittgenstein, Alonzo Church, Alan Turing, and Stanisław Jaśkowski all had issues with mathematical monsters as discussed in this article. Monsters can lurk long undiscovered. For example, that “theorems are provably computational enumerable” [Euclid approximately 300 BC] is a monster was only discovered after millennia when [Church 1934] used it to identify fundamental inconsistency in the foundations of mathematics that is resolved in this article.
> Euclid, Richard Dedekind, Gottlob Frege, Bertrand Russell, Kurt Gödel, Ludwig Wittgenstein, Alonzo Church, Alan Turing, and Stanisław Jaśkowski all had issues with mathematical monsters as discussed in this article. This article explains how the theories Actors and Ordinals recraft classical foundations without limiting mathematical power.
> Euclid, Richard Dedekind, Gottlob Frege, Bertrand Russell, Kurt Gödel, Ludwig Wittgenstein, Alonzo Church, Alan Turing, and Stanisław Jaśkowski all had issues with mathematical monsters as discussed in this article. Computer Science brings new concerns and considerations to foundations beyond earlier work requiring new machinery. This article explains how the theories Actors and Ordinals recraft classical foundations without limiting mathematical power.
> “Monster” is a term introduced in [Lakatos 1976] for a mathematical construct that introduces inconsistencies and paradoxes. Since the very beginning, monsters have been endemic in foundations. They can lurk long undiscovered. For example, that "theorems are provably computational enumerable" [Euclid approximately 300 BC] is a monster was only discovered after millennia when [Church 1934] used it to identify fundamental
For typos in the actual PDF, here are a few :
- teams of Wrens (female
Naval Officers) operated large-scale
simulations in [sic!] that discovered ways
to defeat U-boat attacks that were
crippling Britain.
- The reason that in practice that [sic!] an Actor can be
hundreds of times faster is that in order to carry out a concurrent
computation, the parallel [...]
- close request when received, send [...] [not a typo, just strange phrasing?]
- if it is not recorded as that the engine is [...]
- Actor event induction (cf. [Turing 1949]) can used to
prove [...]
- Suppose to obtain a contradiction that there is [...]
De-emphasizing Turing as an instance of focusing on non-US-centric history? Turing was not American! Furthermore, Church was American, but that quote is aggrandizing to Church while diminishing of Turing. I don't think that quote is an example of what you claim.
I'm puzzled as to why the author seems to think that the incompleteness theorem says anything important about AI (rather than just pointing out that self-contradictions can be a potentially irritating problem in logical systems - something that was known since antiquity.)
If you want to learn thoroughly about Godel's Incompleteness Theorem, and all its implications, nothing is better than Godel, Escher, Bach: An Eternal Golden Braid by Hofstadter.
This book is quite outdated in both approaches and ideas about computers, and computer-aided approaches to Artificial Intelligence, this book still has to offer a lot. It is one of the most impactful books I have read.
If you are mathematically-inclined and are looking forward to understanding the proof itself, the book won't do much better than the following video [1] taken from a comment in this thread [2]. Disappointed, I attempted to read Godel's original papers. That has it's own challenge as it refers to Principia Mathematica. I started reading that. Only to fimd that it uses older mathematical notation not in use today.
In short, I haven't found any source that explains the proof in reasonable detail.
If you want to understand the thought process of axiomatic reasoning (and with less extrapolations to consciousness et al) it is a good book.
Note: I had read a long time back, so do not remember the details.
I feel like GEB can't even properly be described as being "about" anything in particular. It seems to evade capture in the same way that the formal systems that it describes evade capture. It is almost "about" what "meaning" itself is... Nothing quite like it.
Fascinating book indeed. I read it when I was a student some twenty years ago, I had been astonished by simple and clever trick Godel used to prove his theorem - to digitize everything, then operate with just numbers.
Also, I feel like Schmidhuber tends to be German-speaking-centric.
And overall, acknowledging that this is his signature shtick and it's good to have a voice like this too, he does a lot of anachronistic reinterpretations of early discoveries in light of later results. It streamlines the story as if it was always headed towards the today, culling aspects that didn't pan out or magnifying aspects that were at the time more minor and not considered the central issue or angle.