Thinking in Actors – Challenging your software modelling to be simpler (jeremycarterau.substack.com)
107 points by jeremycarter 21 hours ago | 40 comments





This is where erlang/elixir really shines! Using actor processes (aka genservers) to model business workflows and logic helps to align the programmer's and stakeholders' shared understanding and language of a feature, which leads to the implementation and expectations about it being much more "correct".

Too many of us jump straight to modeling the domain objects as database tables without formalizing the data model of the actual business need. Explicit state changes and considering "time" are way more important than database layout at that point.

And please use explicit operation types and/or finite state machines when modeling the main domain objects.

My last three jobs started with untangling a bunch of "status" fields into explicit and hierarchical structures so that the whole company could agree and define what each of them actually meant. Common language, yo! It's the secret sauce!
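As a minimal Elixir sketch of that idea (the states and transitions here are invented for illustration), an explicit state machine pins down what each "status" means and which transitions are legal:

    defmodule Order.Status do
      # Explicit states instead of a free-form "status" column.
      @states [:draft, :submitted, :paid, :shipped, :cancelled]

      # Legal transitions, written down where the whole company can see them.
      @transitions %{
        draft:     [:submitted, :cancelled],
        submitted: [:paid, :cancelled],
        paid:      [:shipped],
        shipped:   [],
        cancelled: []
      }

      def transition(from, to) when from in @states and to in @states do
        if to in Map.fetch!(@transitions, from),
          do: {:ok, to},
          else: {:error, {:invalid_transition, from, to}}
      end
    end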


> State can be persisted (e.g., to storage, database or event log) between messages. If an actor crashes, it can recover its state on another node and resume processing.

And that's the part I never managed to solve, personally. The state of actors tends to look "a lot like" your domain objects, but you don't want to store it until the very end.

Do you have a data model for each actor to store their own snapshots? When do you snapshot? When do you sync with the "ground truth"? Do you have different tech for each kind of state (e.g. "term storage" for actors, then a relational DB for ground truth)?


> When do you snapshot?

In Durable PHP (an actor system for PHP), it tries to achieve an at-most-once processing guarantee (though more often than not, it is exactly-once):

1. commit the current state

2. send outgoing messages

3. ack the original event

If we fail before 1, we simply retry

If we fail between 1-3, we simply rewind and retry
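Sketched in Elixir rather than PHP (Store, Outbox and Queue below are hypothetical stand-ins, not Durable PHP's actual API), the ordering looks roughly like:

    defmodule EventLoop do
      def handle_event(actor_id, event, state) do
        new_state = apply_event(state, event)

        :ok = Store.commit(actor_id, new_state)    # 1. commit the current state
        :ok = Outbox.send_all(new_state.outgoing)  # 2. send outgoing messages
        :ok = Queue.ack(event)                     # 3. ack the original event

        # A crash before step 1 retries against the old committed state; a
        # crash between steps 1 and 3 rewinds to the committed state and
        # retries the remaining steps.
        new_state
      end

      # Domain logic goes here; must be deterministic and side-effect free.
      defp apply_event(state, _event), do: state
    end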


Then, it assumes that your actors don't have any side effects. (Otherwise, "retrying" is not an option.)

I found that surprisingly hard in most applications - and I think that's a limitation of the actor model _in practice_ that is often overlooked, because it's only an implementation detail that does not really exist _in theory_.


Durable PHP has a way to guarantee exactly-once side effects through "activities" (I assume you are talking about outside-world side effects, not internal ones). Trying to do the same activity twice will simply result in merging with the previous execution. This isn't integrated with actors -- mostly because I hadn't thought about it before. It wouldn't be hard to do though.

However, in most async systems, you can only guarantee at-most-once or at-least-once; exactly-once is quite hard (which is why there is a dedicated way to do it in this framework).

Internally, messaging is set up to guarantee at-least-once and duplicate messages are ignored (as well as dedicated ways to ensure deterministic execution).
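A receive-side dedup of that sort can be sketched in Elixir (names invented; the real framework handles this internally):

    defmodule Dedup do
      # Delivery is at-least-once, but replayed message ids are dropped,
      # so processing is effectively exactly-once.
      def receive_message(%{id: id} = msg, %{seen: seen} = state) do
        if MapSet.member?(seen, id) do
          state                                  # duplicate delivery: ignore
        else
          state
          |> handle(msg)
          |> Map.update!(:seen, &MapSet.put(&1, id))
        end
      end

      # Actual message handling goes here.
      defp handle(state, _msg), do: state
    end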


This is what I’ve struggled with when trying to learn erlang/elixir/gleam:

If I go all in, then modelling as actors makes sense. The domain model is modelled as in-memory actors, using erlang features for persistence and what not.

But most of the time this isn’t what we want in modern software: instead our state is in a database and we are executing in a request-response setup. This seems mismatched with actors since, as far as I can tell, you’re mapping the database state into your actors on each request and then back on response, at which point, what do actors actually buy you?

The same mismatch exists with OO too, of course, but with actors there are a bunch of benefits that you get by going all in, which it seems to me are lost if you’re simply building a database backed request-response system.

I mean, many of the OP’s pros of actors rely on actors being long lived things, rather than short lived within a request.

Maybe I’m just missing something. But maybe it’s also just not suited to typical web api backend?


a few years back i had a side project in elixir/phoenix that felt like a good fit specifically because it had a lot of state that needed to live outside of the request/resp cycle but also didn't live in a db

Specifically the project was an app that ran terraform commands inside an actor and streamed the logs to the browser

each actor had a terraform config and each message was a command to run on that config

so in that case the actor model felt like it aligned really well with how the program needed to work
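the rough shape was something like this (module and message names invented, not the actual project code):

    defmodule TerraformRunner do
      use GenServer

      def start_link(config), do: GenServer.start_link(__MODULE__, config)

      def init(config), do: {:ok, %{config: config, port: nil, subscriber: nil}}

      # Each message is a command to run against this actor's config.
      def handle_cast({:run, command, subscriber}, %{config: config} = state) do
        port = Port.open({:spawn, "terraform #{command}"}, [:binary, {:cd, config.dir}])
        {:noreply, %{state | port: port, subscriber: subscriber}}
      end

      # Port output arrives as messages; stream each chunk to the
      # browser-facing process (e.g. a Phoenix channel or LiveView pid).
      def handle_info({port, {:data, chunk}}, %{port: port} = state) do
        send(state.subscriber, {:terraform_log, chunk})
        {:noreply, state}
      end
    end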


I suspect that a lot of what can be done with actors can be done in Go using the "Don't communicate by sharing memory, share memory by communicating". Basically a consumer of a channel can act like an actor. What we don't get for free is the activation and resiliency. A closer one would be Erlang/Elixir on the BEAM. On the more advanced end would be using F# or Pony.

Curious to see what the follow-up parts will post. I am interested in actor model programming and want to see different ways of understanding and using them.


I view actors like I view "object orientation" in general. In fact they're not even all that dissimilar, being respectively "store and manipulate state attached to a single runtime context" (be it "thread" or "green thread" or async context or whatever) and "store and manipulate state attached to a single highly-structured value". (They go together pretty well, actually.) The primary value comes from the idea, and it can and should be managed and manipulated for the local context.

Many people will come along and insist that you need every last thing according to some particular definition (e.g., in OO, "you need multiple inheritance and all of public, private, and protected, and you need virtual methods and you need and you need and you need OR YOU DON'T HAVE TRUE OBJECT ORIENTATION AND EVERYTHING WILL FALL APART"), and will insist that everything at every layer of your program will need to be structured in this paradigm OR YOU DON'T HAVE TRUE WHATEVER AND EVERYTHING WILL FALL APART, but both claims are observably false.

I use actors in nearly every program I write, in almost any language I write in anymore, if there's any concurrency at all. They are a wonderful tool to drop in somewhere to encapsulate some particularly tricky bit of concurrent logic, like an in-memory concurrent cache, without having to restructure everything to be All Actors All The Time. I spent a lot of years in Erlang and still sort of consider that all-in approach a mistake, up there with Java trying to "force" everything into OO by forcing everything to be in a class, even if it has no business being there. You don't need to go down someone's checklist and dot every i and cross every t "or you don't have true actors". It's very much a 90/10 situation where the vast bulk of the benefit is obtained as soon as you have anything even actor-ish, and the additional benefit of going all the way down to some very precise definition is marginal and may even be negative if it requires you to contort your code in some unnatural way just to fit into some framework that isn't benefiting you on this particular task (e.g., Erlang has a lot of nice tools, but if you're building a purely-local command-line app with it for whatever reason you probably don't have a great need for clustering support or OTP in general).


> a lot of what can be done with actors can be done in Go using the "Don't communicate by sharing memory, share memory by communicating"

Yes, it is logically correct, since CSP and Actors are solutions for the same problem (concurrency).

And I do think CSP is easier to use compared with Actors as a programming model, but when it comes to distributed and business applications, the comparative advantage is reversed.


Why F# there?

I was thinking of Mailbox Processor and available Actor libraries. Here's a pretty good rundown I found for some[0]:

> Actor Model in .NET

> In the .NET world, there are at least a few battle-tested frameworks that allow you to build stateful systems using the actor model. The famous ones are Akka.NET[1], Orleans[2], Proto.Actor[3], and Dapr[4]. All of these systems support distributed mode, as well as a bunch of other features like actor supervision, virtual actor, parent-child hierarchies, reentrancy, distributed transactions, dead letter queue, routers, streams, etc. In addition to these frameworks, .NET has several lightweight libs that do not have many of the listed features but still allow the use of the actor model. For example, F# has built-in support via Mailbox Processor[5] and there are also the toolkits: TPL Dataflow[6] and Channel[7]. TPL Dataflow and Channel are not actor model implementations but rather foundations for writing concurrent programs with the ability to use the actor model design pattern.

[0] https://medium.com/draftkings-engineering/entering-actor-mod...

[1] https://getakka.net/

[2] https://dotnet.github.io/orleans/index.html

[3] https://proto.actor/

[4] https://docs.microsoft.com/en-us/dotnet/architecture/dapr-fo...

[5] https://www.codemag.com/Article/1707051/Writing-Concurrent-P...

[6] https://docs.microsoft.com/en-us/dotnet/standard/parallel-pr...

[7] https://devblogs.microsoft.com/dotnet/an-introduction-to-sys...


Clarification: Dapr's actor system is not limited to .NET[0]

It currently supports .NET, Java, and Python

[0] https://docs.dapr.io/developing-applications/building-blocks...


I love actors as a concept, and I've heard some large companies (Expedia) implemented large parts of their systems using them.

But I also saw how hard it is to understand a large system that was built using actors. It is just hard to comprehend all the communication pathways and what happens in the system.


When the design closely aligns with the real-world problem it solves, communication pathways are natural and you don't really have to care much about them. What matters is the Actor's role and making sure it represents a strong domain concept. The rest follows naturally.

But to be fair, it's never that simple and you always end up with some part of a system that's less "well-designed". In that case, figuring out who talks to who can quickly become a nightmare.

Actors are great on paper, but to benefit from them you need a great understanding of your domain. I tend to use them later in the development process, on specific parts where the domain is rich and understood.


> But I also saw how hard it is to understand a large system that was built using actors.

Indeed, it can be just as much of a spaghetti mess as any other code, but it becomes easier if actors are the preferred abstraction for a platform already, for instance as it is for Erlang/Elixir on the BEAM VM.

The platform comes with a few benefits such as:

  1) Immutable data: inside each actor the state is explicitly evolved from one message to the next. It's passed as an explicit argument to functions. Erlang is even better as the variable binding itself is immutable.

  2) Isolated heaps: actors all have isolated heaps. You can have millions of them per OS process and they can't reach in and modify each other's memory. They have to send and receive a message.

  3) Supervision trees: actors that work together can be grouped into a tree hierarchy so that if one starts, it starts the others, and they have "links" between them. If some crash, others crash with them. After the crash they can be restarted safely. It can be done safely because they have isolated heaps. Restarting a bunch of OS threads in a regular C/Java/etc program usually cannot be done safely. These supervision hierarchies are how the system can be organized. A top-level actor might serve as the API endpoint for its children so messages go through it.

  4) Tracing/live debugging: every message that is sent, or function call made, can be traced dynamically by connecting to a live system. That can be helpful for making sense of the mess when debugging.

There are many "actor" systems out there. It's not a big deal to write a function that sends a message to a lockless "mailbox" to be received by a thread in pretty much any modern language/platform. Doing that seems like it gets you 90% of the way to "actors", but without the 4 points above it only gets you 10% of the way. You can build a quick demo, but it would become a nightmare in a production system.
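For reference, point 3 in its minimal Elixir form (the child modules are hypothetical):

    defmodule MyApp.Supervisor do
      use Supervisor

      def start_link(arg),
        do: Supervisor.start_link(__MODULE__, arg, name: __MODULE__)

      def init(_arg) do
        children = [
          MyApp.Cache,        # hypothetical worker actors
          MyApp.OrderRegistry
        ]

        # :one_for_one restarts only the crashed child; :one_for_all takes
        # the whole linked group down and restarts it together.
        Supervisor.init(children, strategy: :one_for_one)
      end
    end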

> It is just hard to comprehend all the communication pathways and what happens in the system.

Having worked on large scale actor-based systems before, I'll attest this is quite true. However, what often gets lost in these conversations is that this is also true of large scale OOP based systems as well.

If one takes a few steps back and squints, there's really not much difference between Objects and Actors: in both cases you have a namespaced entity (object, actor) that receives signals via some defined mechanism (methods, messages) which lead it to perform some action.


Huge fan of building systems on top of Orleans Actors (one of the tools discussed in the article)

Is this AI generated? I flagged it because it seemed very low on substance and just generic bullet points.

It's just part 1. I have just published part 2 which goes into more detail.

Does anyone care to share an example of the OrderService that is not anemic?

Some actor frameworks (Erlang, Orleans) would model each order as an actor instance, and each order line as an actor instance. To make that clearer: each order row would _logically_ have an actor running somewhere in your cluster. This would mean that an order actor would, by nature, have an "add order line" receiver (i.e. method). Each is in charge of how its data is stored. In reality, you'd _probably_ only do this stuff on the command end: queries would hit tables+joins+whatever directly.

If you work this back to DDD, then the Order entity would have an AddOrderLine method. The central service would be responsible for handing out Order entities and saving them once modified.
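A hypothetical Elixir rendering of that one-actor-per-order idea (registration and message names invented):

    defmodule OrderActor do
      use GenServer

      def start_link(order_id),
        do: GenServer.start_link(__MODULE__, order_id, name: via(order_id))

      # The "add order line" receiver lives on the order itself.
      def add_order_line(order_id, line),
        do: GenServer.call(via(order_id), {:add_order_line, line})

      def init(order_id), do: {:ok, %{id: order_id, lines: []}}

      def handle_call({:add_order_line, line}, _from, state) do
        # The actor owns its state and decides how/when it is persisted.
        {:reply, :ok, %{state | lines: [line | state.lines]}}
      end

      defp via(order_id), do: {:global, {:order, order_id}}
    end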

Up until a few years ago (prior to my previous job) I was a hypothetical believer in DDD and actors as two equally viable alternatives. I am now strongly against DDD, but still agree with the article on a hypothetical basis about actors.


What did you learn about (by using?) DDD that made you decide strongly against it?

DDD advocates for creating a shared lingo across everyone involved, including customers if they are highly embedded in the design process. This I agree with.

Ultimately DDD attempts to create real-world analogies in code; you know, dog inherits from pet inherits from mammal, etc. In my opinion, this approach to OOP easily ends up creating code that is difficult to reason about, probably because real-world things often have many responsibilities. Code becomes especially confusing when you have dozens of methods on domain objects that interact across several domains: the system-wide control flow becomes extremely complex. Now add outlier code/hacks, likely written to meet unrealistic deadlines, and things rapidly become completely incomprehensible.

And there's more that's hard to put into words. I code for the love of it, and I truly hated every moment working in DDD code. That was a completely novel experience for me: I'm fine with boring work (it has to happen), but DDD just hit very differently.


This sounds like DDD done wrong. Just because two concepts have the same name doesn't mean that they are the same thing. Drawing the boundaries of the bounded contexts is hard though, which is why shops often struggle with DDD.

For example, if I'm building a pharmacy system, a prescription means something to a patient, but also means something different (but similar) to a fulfillment team member. The prescription might have a prescriber, and it's important for the patient to know the name, address and contact information of the prescriber. But for fulfillment purposes I don't care about the address or phone number, just the NPI, full name and title for labeling purposes. This doesn't just extend to data but to actions: a patient can't "ship" a prescription and fulfillment can't "renew" a prescription. In a DDD model these should be two separate objects.
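Sketching that split in Elixir (field names invented), the "same" prescription becomes two context-specific models, each with only its own actions:

    defmodule Patient.Prescription do
      defstruct [:id, :drug, :prescriber_name, :prescriber_address, :prescriber_phone]

      def renew(%__MODULE__{} = rx), do: {:ok, rx}   # patients can renew
    end

    defmodule Fulfillment.Prescription do
      defstruct [:id, :drug, :prescriber_npi, :prescriber_name, :prescriber_title]

      def ship(%__MODULE__{} = rx), do: {:ok, rx}    # fulfillment can ship
    end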


I agree. I'm not advocating doing a purist DDD solution but rather using similar modelling techniques and just making naive actors.

It's tricky to provide a really good example right now. I notice in the .NET world there are these DDD "starter packs", and my god, they're just layers of maintainability hell. If you look at older OOP/DDD books you'll notice that the domain object has real-world methods on it, just as if it were from a UML diagram.

What you should end up with are plain OOPy objects that mirror the real world. They're not skewed or constrained by their database model. They shouldn't have any dependencies on your infrastructure layer. The object should encapsulate state, behaviour, validity and consistency.

An example (which is probably overkill) would be https://learn.microsoft.com/en-us/dotnet/architecture/micros...

The next post is about modelling an actor and it might provide more insight for you.


It's not clear from the article whether actors offer significant benefits (or disadvantages) for data modeling versus the traditional OO paradigm. The article reads more like an introduction that describes the problem and teases a solution rather than a complete article that offers a solution and an evaluation of it.

That's fair feedback. I wanted to post it in 3 parts, but I see now I probably should have just made one large post.

You could always post the 3 parts simultaneously, so anyone who wants to dig deeper can continue into the series.


Better link: https://en.wikipedia.org/wiki/Actor_model

CSP is a bit different from the actor model (see the comparison section in the page you linked).

The biggest difference is that CSP communication is synchronous.


The article seems to be smashing together two (seemingly) unrelated topics and doesn't offer much in the way of a solution. What alternative design does the author propose? Is it possible to solve the problem with traditional object-oriented design techniques? It's not clear that the issues presented require or substantially benefit from the actor model without seeing a best-in-class OO example.

As I mentioned there's nothing novel in this post, especially for a senior. This is more about getting some context out of the way so that I can show some techniques in a future post.

    Isolation: Since actors process messages they receive sequentially, there are no concurrency issues within an actor. This simplifies reasoning about state mutation and transitions.
    ...
    Fault Tolerance: State can be persisted (e.g., to storage, database or event log) between messages. If an actor crashes, it can recover its state on another node and resume processing.

The system model in which each actor instance is single-threaded, processes received requests individually and sequentially, and can "crash" in a way that affects only the in-flight request is a total anachronism, irrelevant for more than a decade at any meaningful scale.

I don't see how this isn't still both possible and entirely relevant to modern systems. An actor is not necessarily a system process/thread. It has more to do with scope/context than any particular execution model. Think more like a request handling context in an async language.

It's true that an actor is not necessarily a system process. It can be a process, or a thread, or a coroutine, etc. But "crashing" isn't anywhere near so ambiguous. Crashing doesn't mean the request fails, or the thread gets killed -- crashing means the underlying process terminates.

That may not be how some folks understand the concept of crashing, but it's definitely how most folks understand it, at least insofar as they write software.

1 service instance needs to be able to handle O(1k+) concurrent requests at scale, and a failure in any given request can't impact other in-flight requests. Those failures aren't crashes, and that software isn't crash-only -- using those terms just obfuscates things, and makes everything harder for everyone.


Any time you are communicating between multiple programming language communities, it is important to understand that they will have differing definitions for things, to extend grace to people trying to communicate across those barriers, and to not apply dogmatic definitions of terms that apply to the contexts you happen to be familiar with but are used differently elsewhere.

"Crash" is not a universally defined term and you will find there are plenty of communities that do not agree that "crash == OS process terminating".


For sure, yes. But terms of art, like "crash", generally have commonly-understood definitions. And the commonly-understood definition of the term of art "crash" is that it means OS process terminating. Not always! Not all the time. But in general, yes, that's what it means, to most people, most of the time.

It sounds like you may know a lot about this; but each actor operating in a "single-threaded", "do one thing at a time on your own" paradigm is what I'm familiar with, especially with Akka.NET, where my experience lies.

Please tell me more about its irrelevance, etc, I'd love to learn!



