Microcode (invented long long ago by Maurice Wilkes, who did the EDSAC, arguably the first real programmable computer) used the argument that if you can make a small amount of memory plus CPU machinery be much faster than main memory, then you can successfully program "machine-level" functionality as though it were just hardware. For example, the Alto could execute about 5 microinstructions for every main memory cycle -- this allowed us to make emulators that were "as fast as they could be".
This fit in well with the nature, speed, capacity, etc. of the memory available at the time. But "life is not linear", so we have to look around carefully each time we set out to design something. As Butler Lampson has pointed out, one of the things that make good systems design very difficult is that the exponentials involved mean that major design rules may not still obtain just a few years later.
So, I would point you here to FPGAs and their current capacities, especially for comingling processing and memory elements (they are the same) in highly parallel architectures. Chuck Thacker, who was mainly responsible for most of the hardware (and more) at Parc, did the world a service by designing the BEE-3 as "an Alto for today" in the form of a number of large FPGA chips plus other goodies. Very worth looking at!
The basic principle here is that "Hardware is just software crystallized early" so it's always good to start off with what is essentially a pie in the sky software architecture, and then start trying to see the best way to run this in a particular day and time.
My thinking about "objects" went through several stages. The first was the collision 50 years ago in my first weeks of (ARPA) grad school of my background in math, molecular biology, systems and programming, etc., with Sketchpad, Simula, and the proposed ARPAnet. This led to an observation that almost certainly wasn't original -- it was almost a tautology -- that since you could divide up a computer into virtual computers intercommunicating ad infinitum you would (a) retain full power of expression, and (b) always be able to model anything that could be modeled, and (c) be able to scale cosmically beyond existing ways to divide up computers. I loved this. Time sharing "processes" were already manifestations of such virtual machines but they lacked pragmatic universality because of their overheads (so find ways to get rid of the overheads ...)
Though you could model anything -- including data structures -- that was (to me) not even close to the point (it got you back into the soup). The big deal was encapsulation and messaging to provide loose couplings that would work under extreme scaling (in manners reminiscent of Biology and Ecology).
A second stage was to mix in "the Lisp world" of Lisp itself, McCarthy's ideas about robots and temporal logics, the AI work going on within ARPA (especially at MIT), and especially Carl Hewitt's PLANNER language. One idea was that objects could be like servers and could be goal-oriented with PLANNER-type goals as the interface language.
A third stage was a series of Smalltalks at Parc that attempted to find a pragmatic balance between what was inevitably needed in the future and what could be done on the Alto at Parc (with 128K bytes of memory, half of which was used for the display!). This was done in partnership with Dan Ingalls and other talented folks in our group. The idealist in me gritted my teeth, but the practical results were good.
A fourth stage (at Parc) was to deeply revisit the temporal logic and "world-line" ideas (more on this below).
A fifth stage was to seriously think about scaling again, and to look at e.g. Gelernter's Linda "coordination language" as an approach to do loose coupling via description matching in a general publish and describe manner. I still like this idea, and would like to see it advanced to the point where objects can actually "negotiate meaning" with each other.
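To make "loose coupling via description matching" concrete, here is a minimal sketch of a Linda-style tuple space. It is only a toy: the names TupleSpace, out, rd, inp and ANY are chosen for this illustration (loosely following Linda's out/rd/in operations), and there is no blocking, concurrency, or negotiation of meaning.

```python
# Toy tuple space, for illustration only. All names here are hypothetical.
ANY = object()  # wildcard: matches any value in that position


class TupleSpace:
    def __init__(self):
        self._tuples = []

    def out(self, *tup):
        """Publish a tuple into the space."""
        self._tuples.append(tup)

    @staticmethod
    def _matches(pattern, tup):
        # A pattern "describes" a tuple: same length, and each field is
        # either the wildcard or equal to the tuple's field.
        return len(pattern) == len(tup) and all(
            p is ANY or p == t for p, t in zip(pattern, tup)
        )

    def rd(self, *pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def inp(self, *pattern):
        """Like rd, but also removes the matched tuple from the space."""
        tup = self.rd(*pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out("temperature", "kitchen", 21.5)
space.out("temperature", "garage", 12.0)

# The reader never names the producer; it only describes what it wants.
reading = space.rd("temperature", "garage", ANY)
taken = space.inp("temperature", ANY, ANY)
```

The producer and consumer share nothing except the shape of the tuples, which is the loose coupling being described.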
McCarthy's Temporal Logic: "Real Functions in Time"
There's lots of context from the past that will help in understanding the points of view presented here. I will refer to this and that in passing, and then try to provide a list of some of the references (I think of this as "basic CS knowledge" but much of it will likely be encountered for the first time here).
Most of my ways of thinking about all this ultimately trace their paths back to John McCarthy in the late 50s. John was an excellent mathematician and logician. He wanted to be able to do consistent reasoning himself -- and he wanted his programs and robots to be able to do the same. Robots were a key, because he wanted a robot to be in Philadelphia at one time and in New York at another. In an ordinary logic this is a problem. But John fixed it by adding an extra parameter to all "facts" that represented the "time frame" when a fact was true. This created a simple temporal logic, with a visualization of "collections of facts" as stacked "layers" of world-lines.
This can easily be generalized to world-lines of "variables", "data", "objects" etc. From the individual point of view "values" are replaced by "histories" of values, and from the system point of view the whole system is represented by its stable state at each time the system is between computations. Simula later used a weaker, but useful version of this.
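As a rough sketch of the "extra time parameter" idea (the data layout and the names facts, holds and world_line are assumptions made for this example, not McCarthy's notation), every fact can be keyed by the time frame in which it is true:

```python
# Each fact carries the time frame in which it holds (illustrative layout).
facts = {
    ("robot", "location", 1): "Philadelphia",
    ("robot", "location", 2): "New York",
}


def holds(subject, attribute, t):
    """Value of a fact in time frame t, or None if nothing is recorded."""
    return facts.get((subject, attribute, t))


def world_line(subject, attribute):
    """History of one attribute: its values ordered by time frame."""
    frames = sorted(
        (t, value)
        for (s, a, t), value in facts.items()
        if s == subject and a == attribute
    )
    return [value for _, value in frames]
```

The two locations no longer contradict each other, because they are facts about different frames, and the "value" of the robot's location becomes a history rather than a single cell.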
I should also mention Christopher Strachey -- a great fan of Lisp and McCarthy -- who realized that many kinds of programming could be unified and also be made safer by always using "old" values (from the previous frame) to make new values, which are installed in the new frame. He realized this by looking at how clean "tail recursion" was in Lisp, and then saw that it could be written much more understandably as a kind of loop involving what looked like assignment statements, but in which the right hand side took values from time t and the variables assigned into existed in time t+1 (and only one such assignment could be made). This unified functional programming and "imperative like" programming via simulating time as well as state.
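A small sketch of that correspondence, using Fibonacci as an arbitrary example (the function names are made up here). Python's tuple assignment happens to evaluate the whole right hand side before binding anything, so it behaves like the "read from time t, install at time t+1" assignment described above:

```python
def fib_recursive(n, a=0, b=1):
    # Tail-recursive form: each call is a new frame built only from old values.
    if n == 0:
        return a
    return fib_recursive(n - 1, b, a + b)


def fib_loop(n):
    a, b = 0, 1
    for _ in range(n):
        # The right hand side reads a and b as they were at time t;
        # the names on the left are the a and b of time t+1.
        # There is exactly one such assignment per step.
        a, b = b, a + b
    return a
```

Writing the step as two separate statements (a = b, then b = a + b) would mix values from two different frames, which is exactly the kind of error the one-assignment-per-frame rule excludes.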
And let me just mention the programming language Lucid, by Ashcroft and Wadge, which extended many of Strachey's ideas ...
It's also worth looking at "atomic transactions" on databases as a very similar idea with "coarse grain". Nothing ever gets smashed; instead, things are organized so that new versions are created in a non-destructive way without race conditions. There is a history of versions.
The key notion here is that "time is a good idea" -- we want it, and we want to deal with it in safe and reasonable ways -- and most if not all of those ways can be purely functional transitions between sequences of stable world-line states.
The just-computed stable state is very useful. It will never be changed again -- so it represents a "version" of the system simulation -- and it can be safely used as the source of values for the functional transitions to the next stable state. It can also be used as the source for creating visualizations of the world at that instant. The history can be used for debugging, undos, roll-backs, etc.
In this model -- again partly from McCarthy, Strachey, Simula, etc., -- "time doesn't exist between stable states": the "clock" only advances when each new state is completed. The CPU itself doesn't act as a clock as far as programs are concerned.
This gives rise to a very simple way to do deterministic relationships that has an intrinsic and clean model of time.
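One way to picture this, as a sketch under assumptions rather than a description of any particular system: each stable state is a read-only mapping, a pure function computes the next state from the previous one, and the "clock" is simply the index into the history. The names step and run, and the toy x/v dynamics, are invented for the example.

```python
from types import MappingProxyType


def step(state):
    """Pure transition: reads only the old stable state, returns a new one."""
    return MappingProxyType({
        "x": state["x"] + state["v"],  # both right hand sides use time-t values
        "v": state["v"] - 1,
    })


def run(initial, n):
    """Return the full world-line: the initial state plus n successor states."""
    history = [MappingProxyType(dict(initial))]
    for _ in range(n):
        # The clock advances only here, when a new state is complete.
        history.append(step(history[-1]))
    return history


history = run({"x": 0, "v": 3}, 3)
previous = history[-2]  # "undo" is just looking one frame back
```

Because no frame is ever modified after it is appended, any frame can be handed to a renderer or a debugger while the next one is being computed.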
For a variety of reasons -- none of them very good -- this way of being safe lost out in the 60s in favor of allowing race conditions in imperative programming and then trying to protect against them using terrible semaphores, etc., which can lead to lock-ups.
I've mentioned a little about my sequence of thoughts about objects. At some point, anyone interested in messaging between objects who knew about Lisp would have to be drawn to "apply" and to notice that a kind of object (a lambda "thing", which could be a closure) was bound to parameters (which kind of looked like a message). This got deeper if one was aware of how Lisp 1.5 had been implemented with the possibility of late bound parameter evaluation -- FEXPRs rather than EXPRs -- the unevaluated expressions could be passed as parameters and evaled later. This allowed the ungainly "special forms" (like the conditional) to be dispensed with: they could be written as a kind of vanilla lazy function.
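A rough Python analogue of the conditional as a "vanilla lazy function" follows. It is only an approximation: a true FEXPR receives the unevaluated expression itself, whereas here the caller wraps each branch in a zero-argument lambda (a thunk). The names lazy_if and noisy are hypothetical.

```python
def lazy_if(condition, then_thunk, else_thunk):
    # The branches arrive unevaluated. Select one by indexing, then
    # evaluate only that one. No built-in conditional form is used.
    chosen = (else_thunk, then_thunk)[bool(condition)]
    return chosen()


log = []


def noisy(tag, value):
    """Record that a branch was actually evaluated, then return its value."""
    log.append(tag)
    return value


result = lazy_if(3 > 2, lambda: noisy("then", 1), lambda: noisy("else", 2))
```

After this runs, log contains only "then": the other branch was passed as a parameter but never evaluated, which is what lets the conditional be an ordinary function rather than a special form.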
By using the temporal modeling mentioned above, one could loosen the "gears" of "eval-apply" and get functional relationships between temporal layers via safe messaging.
So, because I've always liked the "simulation perspective" on computing, I think of "objects" and "functions" as being complementary ideas and not at odds at all. (I have many other motivations on the side, including always wondering what a good language for children should be epistemologically ... but that's another story.)