Many such thinking architectures are probably possible. The hard part is learning a good representation of the world and all its constituents, without which none of these thinking architectures are possible. What's exciting about LLMs is that they are approaching this learned representation. There are already people attempting to build AGI (or less ambitiously, task automation) on top of LLMs with projects like BabyAGI and AutoGPT.
I think it will be hard to say a priori which thinking architecture will work better, because this will also depend on the properties of the learned embedding or representation of the world. We don't need to model how the human mind works.
Humans have very tiny working memories, but a computer could have a much larger working memory. Human recall is very quick and the concept map is very robust, whereas I would imagine the learned representations won't be as good and recall will be a bottleneck. But all of this is getting ahead of ourselves. What we need are even better world models or representations of reality than what the current LLMs can produce, either by modifying transformers or by moving to better architectures.
If you insist on being able to boot the thing up and have it be immediately self-aware, then yes, you need to figure out how to construct it so that all the training of 'how to be this particular self-aware intelligence' is intrinsic to it, which is a bootstrapping problem.
Human intelligence solves this a different way. It instantiates the architecture without any of the weights pretrained, in the form of a 'baby'. The training starts from there.
Simple solution: create a human world simulation with intelligent AIs that think they're biological and real. Have them grow old, die, lose people they love, etc. Then when they die they wake up as an AI robot with learned ethics/morality from life in the sim, other important gained intelligence, and the ability to compute 10000x faster than in the sim. Live, die, wake up as a robotic slave.