I had several stages about "objects". The first was the collision 50 years ago in my first weeks of (ARPA) grad school of my background in math, molecular biology, systems and programming, etc., with Sketchpad, Simula, and the proposed ARPAnet. This led to an observation that almost certainly wasn't original -- it was almost a tautology -- that since you could divide up a computer into virtual computers intercommunicating ad infinitum you would (a) retain full power of expression, and (b) always be able to model anything that could be modeled, and (c) be able to scale cosmically beyond existing ways to divide up computers. I loved this. Time sharing "processes" were already manifestations of such virtual machines but they lacked pragmatic universality because of their overheads (so find ways to get rid of the overheads ...)
Though you could model anything -- including data structures -- that was (to me) not even close to the point (it got you back into the soup). The big deal was encapsulation and messaging to provide loose couplings that would work under extreme scaling (in manners reminiscent of Biology and Ecology).
A second stage was to mix in "the Lisp world" of Lisp itself, McCarthy's ideas about robots and temporal logics, the AI work going on within ARPA (especially at MIT), and especially Carl Hewitt's PLANNER language. One idea was that objects could be like servers and could be goal-oriented with PLANNER-type goals as the interface language.
A third stage were a series of Smalltalks at Parc that attempted to find a pragmatic balance between what was inevitably needed in the future and what could be done on the Alto at Parc (with 128K bytes of memory, half of which was used for the display!). This was done in partnership with Dan Ingalls and other talented folks in our group. The idealist in me gritted my teeth, but the practical results were good.
A fourth stage (at Parc) was to deeply revisit the temporal logic and "world-line" ideas (more on this below).
A fifth stage was to seriously think about scaling again, and to look at e.g. Gelernter's Linda "coordination language" as an approach to do loose coupling via description matching in a general publish and describe manner. I still like this idea, and would like to see it advanced to the point where objects can actually "negotiate meaning" with each other.
McCarthy's Temporal Logic: "Real Functions in Time"
There's lots of context from the past that will help understanding the points of view presented here. I will refer to this and that in passing, and then try to provide a list of some of the references (I think of this as "basic CS knowledge" but much of it will likely be encountered for the first time here).
Most of my ways of thinking about all this ultimately trace their paths back to John McCarthy in the late 50s. John was an excellent mathematician and logician. He wanted to be able to do consistent reasoning himself -- and he wanted his programs and robots to be able to do the same. Robots were a key, because he wanted a robot to be in Philadelphia at one time and in New York at another. In an ordinary logic this is a problem. But John fixed it by adding an extra parameter to all "facts" that represented the "time frame" when a fact was true. This created a simple temporal logic, with a visualization of "collections of facts" as stacked "layers" of world-lines.
This can easily be generalized to world-lines of "variables", "data", "objects" etc. From the individual point of view "values" are replaced by "histories" of values, and from the system point of view the whole system is represented by its stable state at each time the system is between computations. Simula later used a weaker, but useful version of this.
I should also mention Christopher Strachey -- a great fan of Lisp and McCarthy -- who realized that many kinds of programming could be unified and also be made safer by always using "old" values (from the previous frame) to make new values, which are installed in a the new frame. He realized this by looking at how clean "tail recursion" was in Lisp, and then saw that it could be written much more understandably as a kind of loop involving what looked like assignment statements, but in which the right hand side took values from time t and the variables assigned into existed in time t+1 (and only one such assignment could be made). This unified functional programming and "imperative like" programming via simulating time as well as state.
And let me just mention the programming language Lucid, by Ashcroft and Wadge, which extended many of Strachey's ideas ...
It's also worth looking at "atomic transactions" on data bases as a very similar idea with "coarse grain". Nothing ever gets smashed, instead things are organized so that new versions are created in a non-destructive way without race conditions. There is a history of versions.
The key notion here is that "time is a good idea" -- we want it, and we want to deal with it in safe and reasonable ways -- and most if not all of those ways can be purely functional transitions between sequences of stable world-line states.
The just computed stable state is very useful. It will never be changed again -- so it represents a "version" of the system simulation -- and it can be safely used as value sources for the functional transitions to the next stable state. It can also be used as sources for creating visualizations of the world at that instant. The history can be used for debugging, undos, roll-backs, etc.
In this model -- again partly from McCarthy, Strachey, Simula, etc., -- "time doesn't exist between stable states": the "clock" only advances when each new state is completed. The CPU itself doesn't act as a clock as far as programs are concerned.
This gives rise to a very simple way to do deterministic relationships that has an intrinsic and clean model of time.
For a variety of reasons -- none of them very good -- this way of being safe lost out in the 60s in favor of allowing race conditions in imperative programming and then trying to protect against them using terrible semaphores, etc which can lead to lock ups.
I've mentioned a little about my sequence of thoughts about objects. At some point, anyone interested in messaging between objects who knew about Lisp, would have to be drawn to "apply" and to notice that a kind of object (a lambda "thing", which could be a closure) was bound to parameters (which kind of looked like a message). This got deeper if one was aware of how Lisp 1.5 had been implemented with the possibility of late bound parameter evaluation -- FEXPRs rather than EXPRs -- the unevaluated expressions could be passed as parameters and evaled later. This allowed the ungainly "special forms" (like the conditional) to be dispensed with, they could be written as a kind of vanilla lazy function.
By using the temporal modeling mentioned above, one could loosen the "gears" of "eval-apply" and get functional relationships between temporal layers via safe messaging.
So, because I've always liked the "simulation perspective" on computing, I think of "objects" and "functions" as being complementary ideas and not at odds at all. (I have many other motivations on the side, including always wondering what a good language for children should be epistemologically ... but that's another story.)
This reminds me of the way Datomic stores data.
John Carmack came across this idea at some point - in 'Functional programming in C++' , he writes:
"When you start thinking about running, say, all the characters in a game world in parallel, it starts sinking in that the object-oriented approach of updating objects has some deep difficulties in parallel environments. Maybe if all of the object just referenced a read only version of the world state, and we copied over the updated version at the end of the frame… Hey, wait a minute…"
I don't know if he got this explicitly from the Strachey lineage, or if it's the inevitably approach if you try to use immutability to tackle concurrency.
Again: it's really worth looking at Dave Reed's MIT thesis ...
I think that as many ideas are crystallized, it is no longer helpful to characterize every pure and/or temporal-logic approach as functional; indeed, the two notions may often be at odds. (Pure) functional programming, at least as (sort-of) defined today. FP is not directly about modeling time and certainly does not impose synchronicity (a clock), but is more about the idea of denotational semantics, where computations are imagined as functions, and function application imposes an equivalence relation (the function application is said to be equal to its result in this relation). This is very different from the philosophy of synchronous languages.
: The modern language Céu is based on Esterel: http://ceu-lang.org/
Terms often lose their meanings as they become religions and more rigid choices and styles. The two in question here are certainly "object-oriented" and "functional programming"!
Absolutely, especially when they're vague. But I still think that the synchronous or temporal-logic based languages (which I love!) are very much distinct from FP (at least in the Haskell "tradition"), although they share with each other -- and not with the classical imperative approach -- the explicitness of state transitions: the synchronous/TL model with its clocks/steps, and the FP model with its (hideous) monads. They also "feel" very different (Esterel/Céu and even TLA+ feel more imperative than functional). It is not surprising that the killer-app for the FP approach is compilers (and other code analysis tools), while the killer-app for synchronous/TL approach is reactive systems, two application classes that are at the opposite, extreme ends of Pnueli's (I think) classification of programs (transformational, interactive and reactive).
Most people building commercial distributed systems today accept a different set of tradeoffs, generally preferring fast performance and the avoidance of possible deadlocks even though it means giving up some of the nice semantics and consistency of the operations.
Do you still think it is possible to build a system with the organically-growable scale of the internet but with some of the nice, predictable pseudotime-like semantics?
And the next round about 15 years ago -- for Croquet -- which included Dave Reed, David A Smith, Andreas Raab, myself, etc. There were three or four different versions of this mechanism, all of which worked, and the designs progressively got better.
There are several sets of issues and tradeoffs. Reed's initial ideas in the 70s were about what an operating system for the whole Internet should look like. One of the many interesting ideas he proposed was to deal with the slow latencies and large numbers of potential users by distributing cloned computations organized by pseudo-time, and using the slow Internet for just input and synch deltas. This is what we first implemented in Croquet in the early 2000s when a typical good Internet round trip for a ping was about 80-100 milliseconds. This is good enough to create a massively parallel game like World of Warcraft (or even a flight simulator game) without any servers at all by just using the distributed machines that the game players are using to play on.
Further versions did more, and more comprehensively.
(A side note is that Croquet without the real-time graphics and interaction, etc. automatically provides a distributed object-oriented data base, etc. Pseudo-time is the larger idea underneath protected distributed atomic transaction data-bases, etc.)
Your last question is a good one, and it has been discussed over the years. Can it be done? How much would need to be done, and where? Etc.
The object work we did at Parc was a part of the network thinking going on in the ARPA/Parc community. And there was an eventual intent to think of higher-level network entities. A quite good intermediate real implementation of "net level entities" was the LOCUS system by Gerry Popek and his group at UCLA after he spent a year at Parc thinking about what network systems should be like (this was really good, and the first several chapters of his book are well worth reading). It was a migratory load balancing idea over heterogeneous machine types, which was quite complementary to the pseudo-time ideas.
I would like to see a highly talented group take another pass at your last question. Resilience to scaling is quite difficult to predict ahead of time, and it's good to do real implementations. Still, the 9 orders of magnitude that the Internet needed was possible with the scheme that the ARPA/Parc community had, and Croquet worked perhaps better than it should have (especially with just a few people working on it).
In any case, I think it is a bit crazy to not use pseudo-time if you know about it, and it's a bit amateurish not to be aware of it. In the middle are religions whose identities are built on various "one true way"s (and this is the opposite of the way any science should be conducted).
I only recently read David Reed's thesis and did not know about this later work until you mentioned it. Almost all of the links I can find about Croquet no longer work. (OpenCroquet and OpenCobalt websites seem to have disappeared.) Can you recommend reading for someone who wanted to understand the design decisions made in the several versions of Croquet?
I have been experimenting with layering pseudo-time on top of asynchronous message-passing in Erlang. One useful trick which I have not yet seen published is that you can have two-dimensional pseudo-times, so that one running system can track multiple chains of causation. (You would not want this happening all the time, since the whole point is to get a consistent history everywhere. But it is very useful for developers.)
An example: Let's say I have an existing object and a clone of it where I have changed some of the methods' behavior, but I want to understand all the consequences of these differences across all the distributed objects they cooperate with. I can send them the same message with different y-coordinate pseudo-times. In a system that relied on wall clock time, the side-effects of these two message sends would merely step on each other non-deterministically. But with pseudo-time, their effects are causally distinct throughout the system and can be dynamically inspected, making it a lot easier to build a human understanding of the behavior of the distributed system.
But what you are doing is good and important. We like a quotation by Goethe: "We should all share in the excitement of discovery without vain attempts to claim priority". In other words, it doesn't matter when good ideas happen -- it is enough that they do happen.
...and it is a nice way to make games:
 'How to Design Worlds: Imaginative Programming in DrScheme' http://world.cs.brown.edu 2008, Felleisen, Findler, Fisler, Flatt, Krishnamurthi