While it's always dangerous to criticize a computer legend, I would add one more thing that I believe Alan Kay "got wrong about objects": objects are extremely bad at representing most types of problems in business software.
I mean this with regard to the "true" message-passing type of actor objects, and not OO as it came to be post-C++ and Java. The reason is that actors 1) fundamentally involve themselves with state and time, and 2) connect subject and recipient together directly.
In relation to 1), most business problems involve _data_ processing, and the thing about data is it is immutable. It's essentially a solid state type of activity, and bears no resemblance to the organic, messy world of biology where things are changing. For problems like these, you really want to avoid state in your program, and persist state after data transformation in an actual database. Concepts like stateful actor supervision hierarchies, and what to do with clearing mailbox state when something goes wrong are fundamentally complex. Those concepts are also broadly unnecessary (and arguably harmful) for most request/response or async pub-sub type software. I can't count the amount of time I've wasted redoing systems on an actor model to be a simple "stateless" request model. On all of those occasions, the model that didn't persist state in-app was more reliable, easier to understand, less code, and more maintainable.
With respect to 2), actors basically violate all concepts of dependency injection. Actors need to know exactly where they're going (or you can add a proxy, and now your state graph is even more complicated), and even with message passing, program flow is dependent upon _time_ and the order in which a program is run. This is a large step backwards in my opinion.
I feel that languages like Elixir that pride themselves on stateless, immutable functions, have a deep contradiction internally embracing the actor model, because it is none of those things. Sadly, I also predict that this generation of Elixir programmers will learn the lessons of the Waldo paper the hard way: you can't just throw an actor model over the network and have it work great. (I think it's especially interesting that most of the success stories around modern day actor systems involve problems that can withstand data loss: telephones, IOT, content, media streaming, games, etc.)
> most business problems involve _data_ processing
To paraphrase two other computer legends, Fred Brooks and Linus Torvalds: data, not behavior, is the more crucial part of programming.
Actors focus on the behavior, whereas the data is the part that is really important.
What `git log` does isn't all that important relative to how the git object model is structured, as the latter dictates the former.
> I can't count the amount of time I've wasted redoing systems on an actor model to be a simple "stateless" request model.
So much this. So often people forget that "it's just code". It can run on a toaster, or a mainframe, or a VPS, or a Docker container, or a Lambda function. The lion's share of code should be structured as input => output in the most straightforward way possible.
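A minimal sketch of that "input => output" shape, in Python. The `Invoice` type, the tax rule, and both adapter functions are hypothetical names invented for illustration; the point is only that the business rule is a plain function and each host (a function-as-a-service handler, a command-line tool, and so on) gets a thin wrapper around it.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    quantity: int
    unit_price_cents: int


def invoice_total_cents(invoice: Invoice, tax_percent: int) -> int:
    """Pure input => output: no I/O, no clock, no shared state."""
    subtotal = invoice.quantity * invoice.unit_price_cents
    return subtotal + subtotal * tax_percent // 100


def handle_event(event: dict) -> dict:
    """Hypothetical function-as-a-service style adapter: dict in, dict out."""
    invoice = Invoice(event["quantity"], event["unit_price_cents"])
    return {"total_cents": invoice_total_cents(invoice, event["tax_percent"])}


def handle_line(line: str) -> str:
    """Hypothetical command-line or socket adapter: JSON text in, JSON text out."""
    return json.dumps(handle_event(json.loads(line)))
```

Where the code runs only changes which adapter is called; `invoice_total_cents` itself never needs to know.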
You're actually advocating for an "actor model" yourself, you know :)
The actor model of "actors as services" and "messages as API calls" is the one that has practically won in the real world. It also has the advantage of not being tied to one particular VM, and it's polyglot by nature. It also isn't something that can be "scaled down" to language-level constructs, and maybe that's for the best. Cells aren't made up of organs or of other smaller cells (mitochondria are quite the exception, and the pattern goes only one level deep).
I think the deeper truth here is that truly good architectures are emergent/evolved, not designed/architected... only problem is that they tend to emerge/evolve on time scales longer than the lifetime of most businesses, so you can't take the approach of "let's evolve the right architecture" and actually be able to ship anything on time. Don't know a "solution" for this other than staying humble and mentally flexible.
It's quite zen-ishly ironic to think that the deepest wisdom we've got so far in distributed systems may be the "PHP + MySQL" model of statelessness, short-lived isolated processes, and business-valuable state kept in specialized storage systems :) The language you use for message processing is largely irrelevant as long as you keep the processing of messages isolated and transient (e.g. throw away everything, including leaked/overflowed memory, after processing a message and replying; most systems besides PHP still fail at this!). The data store is also irrelevant as long as it supports some basic transactional guarantees and you can easily back it up and import/export data from it to other systems better suited for other tasks.
> the deepest wisdom we've got so far in distributed systems may be the "PHP + MySQL" model of statelessness, short-lived isolated processes, and business-valuable state kept in specialized storage systems
PHP + MySQL was necessitated for a class of computer programs that process durable, broadly distributed information.
And the "deepest wisdom" is to ignore the distributed nature as much as possible. Yes there is a primitive actor-like model but that is because we more or less have to meet business requirements. If it were possible to scrap the distributed nature of it, we would.
I think you're very confused. Systems like Elixir are powerful because, with care, they make it easy to design reliable systems in which state, the source of the trickiest errors, is confined to well-defined, manageable places, and process faults are segmented and do not cascade.
You talk about supervision trees as if they are an avoidable source of complexity. Well, if you're going to design a stateless request system, you'll still need systems to manage DNS lookups, elastic load balancing, and what is going to happen when a backhoe cuts the fibre to the data center where your code resides. That complexity will exist no matter what; it's just that when you use the BEAM you handle it mostly in one language instead of having to wrangle reams of YAML with a T-square.
Have you ever deployed anything in Elixir? Your commentary makes it seem like you only have a superficial and incorrect understanding of the relationship Elixir has with statefulness and statelessness.
> If you don't need a process, then you don't need a process. Use processes only to model runtime properties, such as mutable state, concurrency and failures, never for code organization.
I strongly recommend you watch "Boundaries" by Gary Bernhardt. This talk (which precedes Elixir) exemplifies the programming philosophy that Elixir has attained, though I don't know if that evolved organically or not. The point is, you have to accept that some parts of the universe are going to be stateful. You model those parts carefully, and then encapsulate them in a stateless core which precisely does only data transformations.
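A minimal sketch of that split, written in Python rather than Elixir to keep it short. Everything here is hypothetical: `Connection`, `step`, and `ConnectionHolder` are illustration names, and `ConnectionHolder` merely plays the role an Elixir process would play. The transition logic is a pure function over immutable data, and the single stateful object does nothing except hold the current value and delegate to it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Connection:
    status: str = "closed"
    retries: int = 0


def step(conn: Connection, event: str) -> Connection:
    """Pure transition: old state + event -> new state, no side effects."""
    if event == "connect":
        return replace(conn, status="open")
    if event == "drop":
        return replace(conn, status="closed", retries=conn.retries + 1)
    return conn  # unknown events leave the state unchanged


class ConnectionHolder:
    """The one stateful place; stands in for a process that owns the state."""

    def __init__(self) -> None:
        self._state = Connection()

    def send(self, event: str) -> Connection:
        # All decisions live in step(); this method only stores the result.
        self._state = step(self._state, event)
        return self._state
```

Because `step` takes and returns plain values, it can be tested without starting anything stateful at all.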
I've used Akka for a decade (basically as soon as it came out) and am quite familiar with systems designed by the Erlang/Elixir community.
While actors definitely are a powerful mechanism for reasoning about stateful systems, most business systems are better off without having to design their own _technical_ distributed systems or in-app persistent state model. There is a reason that dedicated systems like Zookeeper, Redis, squid, relational databases, etc. exist, and that's because those things are insanely hard. Those systems also do not need to know about your business problem.
The actor model, and Elixir w/ OTP subtly encourage coupling of technical, distributed systems problems in the same application scope as the business problem. This is where that thorny state comes into play, and I believe it's a fundamentally wrong instinct. (It is also an instinct that is relatively vocal within the Elixir community: "throw OTP at it!", "once you need to scale, you can just add more machines to your problem!", etc.).
I have a rule on design reviews. I ask the team "is this system a technical system or a business system?" If they say "both", I ask them to make it into two systems, one technical and one business. Most likely, a "simple" technical system that does what they want already exists, and we can use it. While OTP and actors are good at designing interfaces or protocols for distributed systems, writing bespoke error handling for your business code with actors is going to be buggy and wrong. Make the business code pawn off state coordination or storage problems to systems with known, provable properties and a remote interface. (That interface will also be usable by other languages, and not tied to the Erlang VM).
> The actor model, and Elixir w/ OTP subtly encourage coupling of technical, distributed systems problems in the same application scope as the business problem. This is where that thorny state comes into play, and I believe it's a fundamentally wrong instinct.
I disagree that Elixir encourages coupling those systems. While erlang comes with no such help, in Elixir, you should be decoupling your distribution problems from your business logic by putting them into different applications within the same umbrella. Sure, there will be a coupling in the sense that they will live in the same VM, but that is abstractly similar to "our microservices all live in {AWS/GCP/K8s}", especially given the level of process isolation that the BEAM affords you.
But for all intents and purposes (program design, testing, heck you can even pick and choose components for deployment) they are decoupleable.
> Sure, there will be a coupling in the sense that they will live in the same VM, but that is abstractly similar to "our microservices all live in {AWS/GCP/K8s}", especially given the level of process isolation that the BEAM affords you.
I disagree that these are conceptually similar. If all my microservices are in a certain cloud, it does not imply that any important state is maintained in those services and not a database. It also does not require me to have my services in Erlang or whatever the actor model is written in. Actor supervision systems (in whatever language) are essentially narcissistic: they have to be in control over the state management lifecycle.
Actors fundamentally are of the philosophy where some state lives in the application, and it's mutated by these incoming messages. This is totally different from systems that scale, where any application "state" is just transient from a request or message in order to facilitate a mutation in a dedicated store outside of the app. There are many reliability benefits to that: language independence, knowing that the operational characteristics of state management don't change out from under you when you release a new version of code, and the ability to query state operationally without having to run map-reduce across code in your cluster (how fun would it be to write "get me the relevant state of my system" for a custom-written actor hierarchy instead of SQL?).
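A rough sketch of the "transient state, dedicated store" shape described here, assuming Python's standard-library `sqlite3` as a stand-in for a real external database. The `accounts` table, the `handle_withdraw` handler, and the helper names are all hypothetical. The handler keeps nothing between calls: the balance check and the mutation happen in one transactional `UPDATE`, and operational questions are answered with plain SQL.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the dedicated store (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance_cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES ('alice', 1000)")
    conn.commit()
    return conn


def handle_withdraw(store: sqlite3.Connection, account_id: str, amount_cents: int) -> bool:
    """Stateless handler: all it knows arrives as arguments or lives in the store."""
    with store:  # commits on success, rolls back if an exception is raised
        cur = store.execute(
            "UPDATE accounts SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (amount_cents, account_id, amount_cents),
        )
        return cur.rowcount == 1  # False means unknown account or insufficient funds


def balance(store: sqlite3.Connection, account_id: str) -> int:
    """Operational query: state is inspected with SQL, not by asking processes."""
    row = store.execute(
        "SELECT balance_cents FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Any number of handler instances, in any language, could run the same `UPDATE` against a shared store; nothing in the application process has to survive a restart.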
> Actor supervision systems (in whatever language) are essentially narcissistic: they have to be in control over the state management lifecycle
I'm not sure why you think Elixir discourages using databases. Phoenix by default ships with Ecto bindings, and you're encouraged to use Postgres or Amazon RDS or whatever for your bulk state management. Phoenix PubSub ships with Redis bindings as an option if you'd rather not use distributed Erlang.
Fwiw, I have written a multinode graphql query handler that pulls results from actors instead of a database (in this case using a database as a central source of truth is a bad idea because the data being queried lives in "its own databases", libvirt, if you are curious, distributed across nodes and having competing sources of truth can be very hard to manage). Handling these queries was not difficult. And your actors, btw, don't typically live in a very deep hierarchy anyways, you use something like Registry, which lets you query your actors in a flat fashion (the only hierarchical components are supervisors, which don't store much in the way of business logic state)
I think the belief is that if all your state is held in a transactional database, then your actors don't have any state in them and at that point it's not clear what they're doing with messages that the sender couldn't just do directly themselves by talking to the database.
My own experience with Akka style actor designs has been pretty poor. The problem is you lose the notion of a cross-machine call stack. Actors just have mailboxes and process inbound messages/outbound messages, which means you have to maintain state machines yourself and/or handle re-entrancy. Lots and lots of bugs in designs like that.
On the other hand, simply doing blocking RPCs across services, where the RPC framework handles re-entrancy and stack consistency for you, doesn't have that issue. That's how services like Google search are built. They shard across many machines vs talking to a database directly primarily for performance or to enforce team boundaries, not as part of an over-arching architectural design pattern.
Yeah probably 90% of actors in elixir hold state that is relevant to a connection status. Think, tcp state machine or http connection state. It is veeery nice to have a state machine back your inbound http long poll or websockets connections if you want to do serverside rendering or serverside content streaming etc.
The other case is when you're doing something CQRS-ey where you can have competing load-balanced requests and you need to make sure that writes are logged, committed, then executed, and reads are cached. You should use an actor for that, but ultimately your data will reside in a database.
> This is totally different from systems that scale, where any application "state" is just transient from a request or message in order to facilitate a mutation in a dedicated store outside of the app.
It's notable that this is the model on which the world is built. This is the model that runs almost all the biggest systems in the world.
Please tell me if I am wrong, but I get the feeling that you conflate the idea of "bad programming and design of a system" with the actor model in general.
You can do just as badly with an object oriented model as you can with the actor model. Bad applications will be bad.
Business software must deal with change! Even something as simple as a financial account changes behavior because the balance in the account changes.
Of course, many business Actors never change their behavior. For example, a quarterly report to the SEC never changes although additional reports to the SEC may correct information in previous reports.
Of course, you know that Factorial is an Actor that never changes behavior :-) Also, the (infinite) list of prime numbers is an Actor that never changes behavior.
Thanks for clearly articulating why the actor model is poorly suited to model data and control flow. In practice, the complexity of the control flow also amplifies the challenge of managing schema and control flow evolution. Particularly challenging when mixed with persisted messages, for example when using https://doc.akka.io/docs/akka/current/persistence.html