What’s wild to me is that Donald Hoffman is also proposing a similar foundation for his metaphysical theory of consciousness, i.e. that consciousness is a fundamental property, that it exists outside of spacetime, and that it leads, via a Markov chain of conscious agents (in a network as described above), to everything else.
That is, everything that exists may be the result of some kind of uber-network existing outside of space and time.
It’s a wild theory, but the fact that these networks keep popping up and recurring at level upon level wherever agency and intelligence are needed is crazy.
I don't see how that follows. Almost nothing about what we've discovered is human-centric; we've had to build machines to sense beyond our human perceptual limits. In what ways are gravitational waves detected by LIGO human-centric?
In that humans built LIGO to bring forth phenomena for them. There is no science without an observer. The universe might exist without us, but without an observer there is no-one to describe it.
I don't see how this has any bearing on the original point. Why would a universe where everything can be empirically tested be conveniently and suspiciously human-centered?
That was my reply to you, not the original point. If it helps:
I think the GP's statement implied that we should subscribe to some non-empirical theories ("outside spacetime") because the endeavour is not supposed to be human-centered.
1. You said that the observations are not human-centered since we need to make machines, so the GP does not need to be suspicious and should stick to empiricism.
2. Both you and the GP seem to assume that the endeavour is not supposed to be human-centered
3. I say it is human-centered because we make the machines for us as observers at the center, so your argument doesn't quite work
4. I think our science is entangled with us as observers on many levels, and it should be. So the GP's statement should actually be pushed through suspicion towards necessity.
> You said that the observations are not human-centered since we need to make machines
I didn't say LIGO was not human-centered, but that the gravitational waves which LIGO detected are not human-centered.
I just don't think your take is equivalent to the OP's claim. Yes, our instruments are human-centered because humans are the observers, and so phenomena outside of our perceptual range have to be projected into our perceptual range. That doesn't imply that the underlying phenomena are human-centered, or that the theories formed from those observations are semantically human-centered (syntactically they are, because humans have to be able to read them).
I frankly don't even understand the OP's claim: how does the proposition "this is a universe where everything can be empirically tested" logically entail "this universe is human-centered"? I can agree with being suspicious of the claim that we can empirically test everything, I just don't get how that entails human-centeredness.
Well, my take is not equivalent to OP, in that I actually don't agree :) I am a human-centered empiricist.
I think OP's logic flows the other way: skeptical of the human-centeredness position (like you), and from that, skeptical of going purely empirical (not like you/me).
I'd say LIGO described phenomena as gravitational waves through the interaction of the experiment. They didn't detect anything, because that would go beyond empiricism into assuming the existence of a thing beyond the interaction.
I was going to write something up, but honestly the top two answers in this Physics Stack Exchange post do it more competently and comprehensively than I'm able to.
My takeaway/summary is: 'AdS/CFT lets you temporarily change your space/model to make the math easier and then map it back into the original model. Although we don't currently have such a model specified for our particular observed universe, it still allows us to study functionally equivalent behavior and make some determinations over what is/isn't/could be possible.'
> Ok, so if I understand correctly, AdS spaces aren't meant to directly model the universe, but rather are used as a tool to make certain calculations in CFT easier?
GP's comment was modelling the universe as AdS/CFT, though, which is why my question addressed precisely that part:
> How is AdS/CFT (as it pertains to describing spacetime & the Standard Model)
Sorry, my point was to (almost facetiously) interject the topological concept of "boundary/bulk" correspondence (as also exists with holograms & AdS/CFT) rather than suggesting the correctness of AdS/CFT.
The idea that "outside" becomes available via a change in perspective from boundary to bulk.
For those who, like me, are not familiar with AdS/CFT, it's "anti-de Sitter/conformal field theory correspondence" which is "a conjectured relationship between two kinds of physical theories", allowing quantum field theories to be made more mathematically tractable.
2. Consider that (according to science anyways) there is no central broadcaster of reality (it is at least plausible)
3. Consider each agent (often/usually) "knows" all of reality, or at least any point you query them about (for sure: all agents claim to know the unknowable, regularly; I have yet to encounter one who can stop a "powerful" invocation of #3 (or even try: the option seems literally unavailable), though minor ones can be overridden fairly trivially (I can think of two contrasting paths of interesting consideration based on this detail, one of them being extremely optimistic, and trivially plausible))
Simplified: what is known to be, is (locally).
4. Consider the possibility (or assume as a premise of a thought experiment) that reality and the universe are not exactly the very same thing ("it exists outside of spacetime"), though it may appear that they are (see #3)
Is it not fairly straightforward what is going on?
A big part of the problem is that #3 is ~inevitably[1] invoked if such things are analyzed, screwing up the analysis, thus rendering the theory necessarily "false" (it "is" false...though, it will typically not be asserted as such explicitly, and direct questions will be ignored/dodged).
[1] which is...weird (the inevitable part...like, it is as if consciousness is ~hardwired to disallow certain inspection (highly predictable evasive actions are invoked in response), something which can easily be tested/demonstrated).
In #2 you are claiming there is no objective reality, or no 'broadcaster' of reality.
We must assume some things as being objective such as a rational universe in order to make any claims at all.
If you are saying in #3 that humans as conscious agents make subjective claims about reality, but that those claims are in fact 'the reality' for that agent or person, that is a subjective claim. (I'm not saying that that subjective reality isn't true for that person.)
Also, Hoffman doesn't make a 'supernatural' claim per se. His claim is simply that reality as 'we all see it' is NOT the whole story, and that it is in fact only the projection of a vast, infinitely complex network of conscious agents, which creates what we perceive as the material universe and time. He starts with the assumption that consciousness as a property is fundamental, existing outside of space and time, and then argues via reasoning and mathematics that networks of agents, acting in a sense as UANs, project that material universe into being, and that this extrapolates to our entire universe.
I'm not sure I (or anyone, for that matter) am really qualified to answer that claim... it's so big that it verges on mysticism. That's why I said it's such a wild idea, but I found the article above another interesting piece of evidence for Hoffman, because it talks about a general theory underlying such networks:
namely the "repeated and recursive evolution of Universal Activation Networks (UANs). These networks consist of nodes (Universal Activators) that integrate weighted inputs from other units or environmental interactions and activate at a threshold, resulting in an action or an intentional broadcast"
i.e. this is very similar to Hoffman's system of conscious agents, which is an extreme theory of such networks, as I described above.
These are good questions; I am on mobile at the moment, so I won't be able to make a response that does them justice for 2 days or so.
I'd think my other post provides some relevant content though?
In the meantime it may help...an important axiom in my model/theory is that the universe exists independent of us, but reality is downstream of us. I think Donald's theory is based on Idealism maybe, where he disagrees and thinks reality is downstream of us, and the universe is downstream of reality? But that raises some very tricky paradoxes, more so than the one main paradox/problem that all models have (I think? Maybe not, maybe I just lack adequate imagination! And it doesn't make him necessarily wrong, but it puts it into the same category as God(s) imho: anything is possible, including the "impossible". Which is fine, but please acknowledge it explicitly, Donald.)
I'm not terribly hung up on which model one subscribes to (or has been subscribed to) in general, but I am extremely hung up on logical inconsistencies and paradoxes within them, that are not explicitly acknowledged in a non-dismissive manner...this is fundamentally important to my model, as mine has an opinionated ~ethical component (Utopianism), and an extremely strong dislike for "imposters" in this regard.
"knowledge" = belief (possibly true but not necessarily, but sincerely perceived as "true")
(I'm considering this from an abstract / autistic / "That's pedantic! [so stop doing it]" perspective, so I include quotation marks to note the technical distinction...in phenomenological analysis, perhaps they'd be left out, to better illustrate the local experience of reality, the true "is-ness" as it is. In normative discussions ("anything that good hackers would find interesting"), these things are generally rather taboo.)
There's lots of nuance I'm leaving out, but that's the general idea.
A popular though terminating description for the phenomenon is "that's just people expressing their opinion, everyone does it, that's what everything boils down to" (which can make it not only not obvious, but damn near invisible)...but consider the semantic differences of that with and without the inclusion of the word "just". (Also: watch out for #3, it's recursively self-referential, and has substantial cloaking / shape-shifting abilities. It is almost always and everywhere.)
An alternate perspective: consider what an uneducated person "sees" in "reality" (aka: what "is", and "is not") as they go about their day, compared to highly educated (as opposed to knowledgeable) people from very distinct disciplines.
We've already invalidated hidden variable theories in Physics so I find it hard to believe consciousness has some separate class of hidden effects still undiscovered and allowable in our universe.
The only theories not ruled out by Bell are non-local. You have to accept such a mountain of nonsense for any non-local theory to be valid that I don't think anyone takes them seriously.
Aside from non-local theories, which have been known since the 1950s and are regularly used in quantum chemistry, superdeterministic theories have seen virtually no development. Claims that either of these approaches entails a mountain of nonsense are based on no evidence.
I think you are misunderstanding. These are typically mathematical tricks used for computation and don't lead to actual non-local interpretations of QM. I don't think any major work is done without standard QFT.
I doubt you can find any serious Theoretical Physicist who believes retrocausality or FTL information transmission is compatible with the Universe we observe.
Anyway we've diverged off the actual crux which is that Quantum Conscious Woowoo theories require types of hidden variable theories which are not coherent with this Universe.
> These are typically mathematical tricks used for computation and don't lead to actual non-local interpretations of QM.
Bohmian mechanics is an explicitly non-local theory of QM, not merely a computational trick. It actually makes predictions that differ from QM in non-equilibrium regimes, but is consistent with QM in all regimes currently accessible to us.
> I don't think any major work is done without standard QFT.
I don't know what "major work" means. Work in quantum foundations is done without QFT all of the time. Yes, QFT is a useful effective theory, but that's a far cry from the original claim about the nature of reality, i.e. that there are necessarily no hidden variables. Literally no experiment has ruled out hidden variables wholesale, only certain types of hidden variables.
> Anyway we've diverged off the actual crux which is that Quantum Conscious Woowoo theories require types of hidden variable theories which are not coherent with this Universe.
Maybe consciousness would require hidden variables, but your conclusion is still based on a fundamental error, which is that we have observations that are inconsistent with hidden variables. Since this claim is in error, then the conclusion does not follow.
I think Hoffman's path and the idea of consciousness being fundamental come from a few conceptual leaps. Let me go through the high level of each one:
1. Current physics shows via quantum mechanics that spacetime has a definite limit in measurement (Planck scale)
2. Relativity also places a similar limit on our ability to measure time and space (infinite energy/black holes)
3. The latest work in high-energy physics (in the last 10 years or so) has led to some interesting new findings regarding an approach to calculating particle scattering amplitudes in supercolliders. That is: when you apply nonlocal assumptions and certain mathematical simplifications, the new approach simplifies the scattering-amplitude calculations and ALSO just happens to map onto a new conceptual framework where you think “outside of space and time” and arrive at a geometric “structure” of immense complexity (let’s call one conception of that geometry the ‘amplituhedron’), which is static and immense, encoding the universe itself; this polytope encodes the scattering amplitudes.
4. Given the hard problem of consciousness, i.e. “only awareness is aware” (i.e. we cannot break down the qualia of awareness), this is where Hoffman goes wild. Hoffman says: “OK, well, given that spacetime as we know it is doomed (not fundamental; again, see point 3 above), let’s propose that consciousness IS defined to be FUNDAMENTAL and that it exists as a ‘network of conscious agents’.”
I.e. he proposes a “formal model of consciousness based on a mathematical structure called conscious agents”, and then he proposes how time and space emerge from the interactions of conscious agents via the structure mentioned in point 3 above.
Hoffman then claims his math for these models implies that we are in a universe that emerged out of fundamental consciousness, and that he is working on a mathematical model he hopes can be tied to the new physics that emerges out of the amplituhedron through networks of these agents.
5. Finally, it was my observation that the general theory of neural networks in the article had some interesting similarities with all of this (i.e. maybe Nature uses such networks at all scales to represent intelligence).
Feel free to be skeptical; I am, but I get all sorts of weird feelings that he's onto something here…
I don’t know if this exists outside of spacetime, but I have a suspicion that UANs didn’t begin with gene regulatory networks, but are a more fundamental part of a computational-universe hypothesis.
More generally there are graph neural networks, for instance, but now you’re including many dynamic networks that are not open-ended or evolvable. The idea is to identify common dynamics and add constraints on the types of networks that are included, to find general principles within that class. Loosen the constraints and you make the class too broad and can’t identify common principles.
Weren't the Sophia/gnosis, emanations, and aeons from Greek philosophy? Also any philosophy/hot takes that stress duality (what's seen here and what's out there causing what's seen here), such as Manichaean, Advaita, etc.
This sits in a larger field of complexity theory and complex adaptive systems. There was also some interesting work on “Artificial Life”, although that research program seems to have fallen out of favor. My introduction in 1995 was the book Chaos and then Stuart Kauffman’s At Home in the Universe. Wolfram’s A New Kind of Science was also interesting.
The existence of a universal function approximator or function representation is not particularly unique to neural networks. Fourier transforms can represent any function as a (potentially) infinite vector on an orthonormal basis.
What would be particularly interesting is if there were a proof that some universal approximators are more parameter-efficient than others. The simplicity of the neural representation would suggest that it may be a particularly useful, if inscrutable, approximator.
I'm not arguing that this approximator is necessary (not sufficient) for this class of networks. I've proposed some conjectures on what we might expect to see, but there are certainly other salient ingredients and common principles that we haven't discovered, and I think it's important to hunt for them.
Oh absolutely, the article gave me quite a bit to think about. It wasn't until I sat down and tried swapping a fourier transform/representation into the conjectures that I was able to think critically on the topic.
I suspect that the pruning operation is useful to consider mathematically. A Fourier transform is a universal approximator, but it only has useful approximation power when the basis vectors have eigenvalues which are significant for the problem at hand (PCA). If NNs replace that condition with a topological sense of utility, then that is a major win (if formalized).
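To make that concrete, here's a tiny NumPy sketch of the Fourier case (my own toy example, not from the article or the parent comments): represent a square wave on the Fourier basis, then "prune" every coefficient below a magnitude threshold and see how much approximation power survives. The target function and the cutoff are arbitrary choices for illustration.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 1024, endpoint=False)
f = np.sign(np.sin(x))                    # target function: a square wave

coeffs = np.fft.rfft(f) / len(x)          # coefficients on the Fourier basis
threshold = 0.01                          # arbitrary significance cutoff
pruned = np.where(np.abs(coeffs) > threshold, coeffs, 0)

f_hat = np.fft.irfft(pruned * len(x), n=len(x))
kept = np.count_nonzero(pruned)
rms = np.sqrt(np.mean((f - f_hat) ** 2))
print(f"kept {kept} of {len(coeffs)} coefficients, rms error ~ {rms:.3f}")
```

Most coefficients can be dropped with little loss, which is the "only the significant basis vectors matter" point; the open question above is whether NN pruning plays the analogous role topologically.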
The author is extrapolating way too much. The simplest model of X is similar to the simplest model of Y, therefore the common element is deep and insightful, rather than mathematical modelers simply being rationally parsimonious.
Nice list and history of common activation units used today.
Small note though: the Heaviside function used in the perceptron is non-linear (it can tell you which side of a plane the input point lies on), and a multi-layer perceptron could classify the red and blue dots in your example. But it cannot be used with back-propagation, because its derivative is zero everywhere except at 0, where it's non-differentiable.
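A quick sketch of both halves of that note, assuming a single perceptron with a hand-picked decision boundary (nothing here comes from the article):

```python
import numpy as np

# The step (Heaviside) activation is non-linear: it reports which side of a
# plane the input lies on. But its derivative is zero (almost) everywhere,
# so gradient-based training gets no signal from it.
def heaviside(z):
    return np.where(z >= 0.0, 1.0, 0.0)

def perceptron(x, w, b):
    return heaviside(w @ x + b)

w, b = np.array([1.0, -1.0]), 0.0                 # decision boundary: x1 = x2
print(perceptron(np.array([0.9, 0.1]), w, b))     # 1.0 -> one side of the plane
print(perceptron(np.array([0.1, 0.9]), w, b))     # 0.0 -> the other side

# A finite-difference "gradient" of the output w.r.t. a weight is 0 unless the
# tiny perturbation happens to flip the activation, which is why
# back-propagation can't be used with this activation.
eps = 1e-6
x = np.array([0.9, 0.1])
print((perceptron(x, w + np.array([eps, 0.0]), b) - perceptron(x, w, b)) / eps)  # 0.0
```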
I think I should clarify...
A multilayer perceptron can classify the red and blue dots if it uses a non-linear activation function for some or most of its layers, correct?
If it's perceptrons all the way down, it will fundamentally reduce to a linear function or a single linear layer and will not be able to classify the dots.
So there's the downside of not being able to linearly separate certain datasets, and the inability to adjust weights or thresholds based on differences between expected and observed data (e.g. using backpropagation).
You're right that if the activation function is linear, like the identity function, then it doesn't matter how many layers you have. But with the step function two layers is enough.
We can manually derive a network that can classify the sample data using the step function:
The four nodes in the first layer define four lines, tangents to the square 0.2 < x1 < 0.8 and 0.2 < x2 < 0.8, and the step function effectively checks which side of the line the point lies. The second layer just counts the number of "successful" line checks and yields True if all four pass. If the square is too rough of a shape then we can add more lines to the first layer to approximate any convex shape.
If the regions are concave then we can split them up into convex parts and add nodes to the second layer, one for each convex region. A third layer could then check if any of the convex region neurons activate. While in theory two layers with a non-linear activation function is enough to approximate this function, its structure would be harder to interpret.
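Here's the two-layer convex-square version of that construction written out as a small NumPy sketch (my own transcription of the description above; the sample points are arbitrary):

```python
import numpy as np

def step(z):
    return (z >= 0).astype(float)

# Layer 1: four line checks, one per side of the square 0.2 < x1, x2 < 0.8.
#   rows encode: x1 > 0.2, x1 < 0.8, x2 > 0.2, x2 < 0.8
W1 = np.array([[ 1.0,  0.0],
               [-1.0,  0.0],
               [ 0.0,  1.0],
               [ 0.0, -1.0]])
b1 = np.array([-0.2, 0.8, -0.2, 0.8])

# Layer 2: count the "successful" line checks and fire only if all four pass.
W2 = np.array([[1.0, 1.0, 1.0, 1.0]])
b2 = np.array([-3.5])

def inside_square(x):
    h = step(W1 @ x + b1)            # which side of each line the point lies on
    return step(W2 @ h + b2)[0]      # 1.0 iff all four checks passed

print(inside_square(np.array([0.5, 0.5])))   # 1.0: inside the square
print(inside_square(np.array([0.1, 0.5])))   # 0.0: outside the square
```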
But how do you find the right parameters without back propagation? The reason we don't use the step function is because its derivative is zero.
Despite vast implementation constraints spanning diverse biological systems, a clear pattern emerges: the repeated and recursive evolution of Universal Activation Networks (UANs). These networks consist of nodes (Universal Activators) that integrate weighted inputs from other units or environmental interactions and activate at a threshold, resulting in an action or an intentional broadcast. Minimally, Universal Activator Networks include gene regulatory networks, cell networks, neural networks, cooperative social networks, and sufficiently advanced artificial neural networks.
Evolvability and generative open-endedness define Universal Activation Networks, setting them apart from other dynamic networks, complex systems or replicators. Evolvability implies robustness and plasticity in both structure and function, differentiable performance, inheritable replication, and selective mechanisms. They evolve, they learn, they adapt, they get better, and their open-endedness lies in their capacity to form higher-order networks subject to a new level of selection.
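For readers who want the quoted definition in concrete terms, a minimal sketch of a "universal activator" might look like the following; the class name, weights, and thresholds are illustrative inventions, not anything specified in the essay.

```python
import numpy as np

# A minimal "universal activator" as described in the quote above: integrate
# weighted inputs and fire (broadcast) when they cross a threshold.
class Activator:
    def __init__(self, weights, threshold):
        self.w = np.asarray(weights, dtype=float)
        self.threshold = float(threshold)

    def activate(self, inputs):
        # Weighted integration of inputs from other units / the environment,
        # followed by an all-or-nothing "broadcast".
        return float(self.w @ np.asarray(inputs, dtype=float) >= self.threshold)

# Tiny two-node chain: the second node only broadcasts if the first fires
# together with a strong enough environmental signal.
n1 = Activator(weights=[0.6, 0.6], threshold=1.0)
n2 = Activator(weights=[1.0, 0.5], threshold=1.2)

env = [1.0, 1.0]
out1 = n1.activate(env)          # 1.0: 0.6 + 0.6 >= 1.0
print(n2.activate([out1, 0.9]))  # 1.0: 1.0*1.0 + 0.5*0.9 = 1.45 >= 1.2
```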
> 2-UANs operate according to either computational principles or magic.
Given that quantum effects do exist, does this mean that the result of quantum activity is still just another physical input into the UAN and does not change the analysis of what the UAN computes? It seems difficult to think that what a UAN computes is not impacted by those lower level details (meaning specifically quantum effects, I'm not thinking of just alternate implementations).
> 4-A UAN's critical topology, and its implied gating logic, dictate its function, not the implementation details.
Dynamic/short term networks in brain:
Neurons in the brain are dynamically inhibited+excited due to various factors including brain waves, which seems like they are dynamically shifting between different networks on the fly. I assume when you say topology, you're not really thinking in terms of static physical topology, but more of the current logical topology that may be layered on top of the physical?
Accounting for Analog:
A neuron's function is heavily influenced by current analog state; how is that accounted for in the formula for the UAN?
For example, activation at the same synapse can trigger either an excitatory or an inhibitory post-synaptic potential, depending on the concentration of permeant ions inside and outside the cell at that moment.
I'm assuming a couple possible responses might be:
1-Even though our brain has analog activity that influences the operation of cells, there is still an equivalent UAN that does not make use of analog.
or
2-Analog activity is just a lower level UAN (e.g. atom/molecule level)
I don't think either of those are strong responses. The first triggers the question: "How do you know and how do you find that UAN?". The second one seems to push the problem down to just needing to simulate physics within +/- some error.
> Given that quantum effects do exist, does this mean that the result of quantum activity is still just another physical input into the UAN
Yeah, it could be a spurious input though. My understanding is that quantum mechanics doesn't really matter at biological scales, and that kinda makes sense, right? Like, if this whole claim about biology being reducible to the topology of the components of the network is true, then the first thing you'd do is try to evolve components that are robust to quantum noise, or leverage it for some result (i.e. one can imagine some binding site constructed in such a way that it requires a rare event that nonetheless has a very specific probability of occurring).
> and does not change the analysis of what the UAN computes? It seems difficult to think that what a UAN computes is not impacted by those lower level details (meaning specifically quantum effects, I'm not thinking of just alternate implementations).
What the UAN computes is impacted by those lower level details, but it is abstractable given enough simulation data.
I.e., imagine you had a perfect molecular scan of a modern CPU that detailed the position of every atom. While it would be neat to simulate it physically, for the purposes of analysis you'd likely want to abstract it at least to the transistor level. The 'critical topology' is, I guess, the highest level of abstraction at which a CPU tester still can't tell your simulation apart from an atom-level simulation.
Now for CPUs, we designed that model first and then built the CPU. In biology, it evolved on the physical level, but still maps to a 'critical topology'.
"Topology is all that matters" --> bold statement, especially when you read the paper. The original authors were much more reserved in terms of their conclusions.
Yes, on its face it looks like he's saying that you can throw out the weights of any network and still expect the same or similar behaviour, which is obviously false. It's also contradicted in that very section, where he reports from the cited paper that randomized parameters reproduced the desired behaviour in about 1 in 200 cases. All these cases have the same network topology, so while that might be a higher-than-expected probability of retaining function with randomized parameters (over 2-3 orders of magnitude), it's also a clear demonstration that more than topology is significant.
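A rough numerical version of that "same topology, random parameters" experiment, as a toy only: a fixed 2-2-1 step network and XOR stand in for the cited paper's gene-network model, so the success fraction won't match the 1-in-200 figure; it just shows the general shape of the result.

```python
import numpy as np

# Fixed topology, random parameters: a 2-2-1 step network (which CAN compute
# XOR for the right weights) with weights drawn uniformly at random.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def forward(W1, b1, W2, b2):
    h = (X @ W1 + b1) > 0                       # hidden layer, step activation
    return ((h @ W2 + b2) > 0).astype(float)    # output layer, step activation

trials, hits = 20000, 0
for _ in range(trials):
    W1 = rng.uniform(-1, 1, size=(2, 2))
    b1 = rng.uniform(-1, 1, size=2)
    W2 = rng.uniform(-1, 1, size=2)
    b2 = rng.uniform(-1, 1)
    if np.array_equal(forward(W1, b1, W2, b2), y):
        hits += 1

print(f"{hits}/{trials} random parameter draws still compute XOR")
```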
The topology needs to be information-bearing. Weights of 0.0001 are likely spurious, and if other weights are relatively so much bigger, they can effectively make the other fan-in weights spurious as well.
God this grandiose prose style is insufferable. Calm down.
Anyway, this doesn't even try to make the case that that equation is universal, only that "learning" is a general phenomenon of living systems, which can probably be modeled in many different ways.
Personally, I find the writing to be just fine. It is clear and cogent. I don’t have enough background to follow all the details, but I certainly hope you are not discouraged from pursuing big ideas by negative comments on style!
The excitement of new horizons is necessary for innovation, and a substack article is a safe way to express that excitement. It's clearly understood by the choice of medium that this is meant to be speculation, so there aren't any significant risks in engaging with the text on its own terms.
architecture astronauts let loose on unified field theories..
talking warm and fuzzy - big bold ideas.
let them, i say, until the tide shifts to something else tomorrow and a new generation of big-picture thought leaders takes over, dumping their insufferable text on the populace.
How does the attention operator in transformers, in which input data is multiplied by input data (as opposed to other neural network operations, in which input data is multiplied by model weights), fit into the notion of a universal activator?
This is a great question, and I don't yet have an answer. I'm going to butcher this description, so please be charitable, but functionally, the attention mechanism reduces the dimensions and uses the coincidence between the Q and K linear layers to narrow down to a subset of the input, and then the softmax amplifies the signal.
One unsatisfying argument might be that this might fall into implementation details for this particular class. Another prediction might be that an attention mechanism is an essential element of these networks that appears in other networks of this class. Another is that this is a decent approximation, but has limitations, and we'll figure out how the brain does it and replace it with that.
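For reference, the standard scaled dot-product attention being discussed looks roughly like this in NumPy (textbook form, not anything specific to the essay; the shapes are arbitrary):

```python
import numpy as np

# Scaled dot-product attention: the scores come from multiplying (projected)
# input with (projected) input, which is the data-times-data step discussed
# above; the softmax then sharpens the mix over the values.
def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # data x data, unlike data x weights
    return softmax(scores) @ V                # weighted average of the values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                   # 5 tokens, 8-dim embeddings
Wq, Wk, Wv = [rng.normal(size=(8, 8)) for _ in range(3)]
print(attention(X, Wq, Wk, Wv).shape)         # (5, 8)
```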
People seem to be obsessed with finding fundamental properties in neural networks, but why not simply marvel at the more basic incredible operations of addition and multiplication, and stop there?
Evolutionary pressure is such that, generally speaking, individuals who "stop there" are less successful than ones who always crave more. We are all descendants of those who were "obsessed" with: mating, hoarding, conquering and yes, finding patterns and fundamental properties.
There are some interesting parallels between ideas in this article and IIT. The focus on parsimony in networks, and on pruning connections that are redundant to reveal the minimum topology (and the underlying computation), is reminiscent of parts of IIT: I’m thinking of the computation of the maximally irreducible concept structure via searching for a network partition which minimizes the integrated cause-effect information in the system. Such redundant connections are necessarily severed by the partition.
The scientist's ambition is a grand unifying theory of minimalistic, reductionist, and elegant principles that explains everything. Some even argue that we are already there.
But the truth is: when it comes to neurons, all those theories are effectively inferior to what evolution has achieved. They can explain some of what is going on, but they cannot reproduce the results of the biological counterparts.
The artificial results either require orders of magnitude more power or examples, or have to be hardwired or trained in advance, or require a billion-dollar facility to manufacture the hardware involved.
Biological neurons get trained as they do inference, require fewer examples, use less power and the agent can get drunk and high and lose millions of neurons and synaptic connections and their brain will either keep working as usual, or everything will get rewired after a while.
We don't yet understand as much as we claim to; if we did, we would at least have the same results.
Those neurons have been trained since the day we were born. Reality corresponds to about 11 million bits per second. What I suspect is happening is that we train higher and higher levels of abstraction, and we get to a point where new knowledge involves training a new permutation of a few high-level neurons.
"Prokaryotes emerged 3.5 billion years ago, their gene networks acting like rudimentary brains. These networks controlled chemical reactions and cellular processes, laying the foundation for complexity."
... for which there is no evidence at all. Pseudo-science, aka Fantasy.
I could have bogged the essay down with qualifiers to address all the potential straw man objections, but that didn't seem productive. It's easy to take an uncharitable view on this, but I do explain more about GRNs later in the essay. I worked with them for 8 years, and yes, they do act like the rudimentary brains of the cell, and that's the reason this system is selected again and again by evolution.
If there is a 'god equation' it will almost certainly include a+b=c because we use it all the time to describe "diverse biological systems with vast implementation constraints".
This article is lacking originality and insight to such a degree that I suspect it is patentable.
I love your hot take, but you forgot the nonlinear transformation which lets the "god equation" represent literally everything.
The post makes a nice point but it's not really surprising that everything can be modeled by an equation capable of universal approximation.
What I don't get is how genetic systems relate to this. They don't hook into it cleanly and the author just jumps right past them even though they're the most fundamental (biological) system of all those described.
https://youtu.be/yqOVu263OSk?si=SH_LvAZSMwhWqp5Q