While making the state immutable made it more visible what code was affecting the state, the end result is still multiple pieces of code directly affecting what is essentially a global state. Sure, a copy is passed from one function to the next, but there is still one 'current' state and everything is messing with it directly.
The real problem here is not the mutability of the state, it is the ownership of it. Who is responsible for keeping the state internally consistent? In this code the answer is: no one.
To solve this problem, there needs to be a clear owner of the state: that code should be the only code directly affecting the state, and it should be responsible for keeping the state internally consistent.
Whether this 'owner' is a collection of functions that operate on a global state in a language like C, functions that take a state and return a new one, an object in an OO language, or something else doesn't really matter.
For example, moving the character and collision detection should not be two separate functions that each affect the state and that can be called separately (or in the wrong order), leaving the system in an incorrect state. Only the code responsible for modifying the state should do so, and it should guarantee to leave it in a correct state on returning. Moving without collision detection can leave the system in an incorrect state and thus should not even be a function that exists.
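To make that concrete, here is a minimal sketch in Python of a single move function that owns both the move and the collision check. The `State` fields, `is_valid` and `try_move` are names made up for this illustration, not taken from the article's code.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class State:
    """Hypothetical game state: one player on a grid with walls."""
    player: Cell
    walls: FrozenSet[Cell]
    width: int
    height: int


def is_valid(state: State) -> bool:
    """The invariant: the player is on the grid and not inside a wall."""
    x, y = state.player
    in_bounds = 0 <= x < state.width and 0 <= y < state.height
    return in_bounds and state.player not in state.walls


def try_move(state: State, dx: int, dy: int) -> State:
    """Attempt a move; return the unchanged state if the target is invalid."""
    x, y = state.player
    candidate = replace(state, player=(x + dx, y + dy))
    return candidate if is_valid(candidate) else state


start = State(player=(0, 0), walls=frozenset({(1, 0)}), width=3, height=3)
blocked = try_move(start, 1, 0)  # wall at (1, 0): the player stays put
moved = try_move(start, 0, 1)    # free cell: the player ends up at (0, 1)
```

There is deliberately no public "move without checking" function here, so a caller cannot run the two steps separately or in the wrong order.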
When designing a system this is always something I keep in the back of my head: who is responsible for what? Once you have that clear, things become much easier.
There are two (good) ways I know how to wrangle ownership of information: where, or when. But in all of the sane systems I know, at the end of the day it’s really all “when”.
If there is no “where” for state alterations and they can happen any time, then you are in full global shared state anarchy mode, which some people seem to be perfectly fine with.
If the system of record is the source of authority then “when” is at write time, regardless of who does the write.
If you know when the data was last altered, you can reason about every interaction that happens “after” because what you see is what you get.
The smartest thing about Angular was that there was a layer of the code - the services - that was expected to do all state transformations on data from the server. Anything in your app was “after” so you could trace the interactions by reading the code.
Plus, it was easier to convince the REST endpoint to do the transforms for you because you had a contiguous block of working code that explained the difference between what you got and what you wanted. A few sniffs at the data to determine if the modifications had already been made was all the migration strategy you needed. If the transform was cacheable upstream, or found its way into the database, so much the better.
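A rough sketch of what such a "sniff" can look like, written in Python rather than Angular's TypeScript. The record shape and the field names (`first_name`, `last_name`, `full_name`) are invented for the example.

```python
def normalize_user(record: dict) -> dict:
    """Return a copy of the record that is guaranteed to have 'full_name'.

    Hypothetical service-layer transform: if the server already sends
    'full_name', the sniff below detects it and the transform is skipped.
    """
    if "full_name" in record:  # the sniff: upstream already did the work
        return dict(record)
    out = dict(record)
    out["full_name"] = f"{record['first_name']} {record['last_name']}"
    return out


old_shape = {"first_name": "Ada", "last_name": "Lovelace"}
new_shape = {"full_name": "Ada Lovelace"}
```

Because the function is idempotent, the client keeps working while the endpoint is migrated, and everything downstream only ever sees the "after" shape.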
The upshot is that if you don’t know a priori what information a unit of work requires, then you don’t have an information architecture. And if you don’t solve that problem then you’re going to fall into a concurrency tarpit that often gets called Cache Invalidation Hell, but that’s just the dominant symptom.
What if the database itself contained all business logic? (but no plumbing)
Every table has a function that is the only function that can write/insert into that table. The functions themselves are just records in a specific table, which the database schedules to run based on global access activity (it prefers to prioritize functions that consume more records than they produce, to keep space tight).
But what language constructs are the most universally efficient at expressing the distribution of that ownership responsibility in a way that everyone can both understand and agree upon?
Objects with getters and setters?
Constant self-reflection?
Serialization and schema?
Complex query engines in smart databases?
A political process of trust management?
Or maybe just the soothing chaos of an evolving bio-electro-mechanical planetary supersystem of expanding consciousness.
> Who is responsible for keeping the state internally consistent ?
What does this even mean?
> The real problem here is not the mutability of the state, it is the ownership of it.
1. Start with an initial state
2. Pass a copy of that initial state into a function, let it return a slightly modified version of that initial state
3. Pass that slightly modified version into another function, wash rinse repeat
The only ownership that is happening that needs to be worried about is the parent function passing a copy of state to a child function for the duration of that call, right?
It means that you need to concentrate the actual code that manipulates the state and perform all state changes through this code to ensure that the state is always correct. OO solves this through the concept of encapsulation, but that is just one way of doing it.
If you have code all over the place that can manipulate the state, then it becomes extremely hard to ensure that the state remains valid.
I'm not talking about the ownership of the particular instance, I'm talking about what code is allowed to, and responsible for, making alterations to state instances in general.
If I ask you what code can make changes to state, you should be able to point to a small-ish part of your codebase and say: only these functions can make alterations. Each of those functions should guarantee that the state they return is a valid state, that is: they are responsible owners.
All functions that operate on the state have the main responsibility to keep the state valid (e.g. no player character inside an object). Each specific function has additional responsibilities (e.g. move the character if possible).
In the example, the move function can take an existing valid state and turn it into an invalid one: the player can be moved inside an object. So when you think about what that function's responsibility should be, it is to attempt to move the main character to a new, valid position. If you think of it like this, you quickly realise that the move and collision detection functions should be combined.
> The only ownership that is happening that needs to be worried about is the parent function passing a copy of state to a child function for the duration of that call, right?
If you're passing a copy of state from one function to the next, and every function can just modify it in whatever way it wants to, then you basically have globals but with more copying.
I think the parent post is suggesting a lack of cohesion. E.g. the things changing state are sprinkled around in too many different places, which makes it hard to reason about.
The solution really depends on the design pattern so the message tends to be fairly vague. From an OOP perspective, maybe more state modifiers should be instance methods, or at least belong in the same namespace.
I feel like what is confusing you is that the parent is speaking in broader, more general terms rather than about specifics, i.e. the state as an object which you can pass into functions. His point, I believe, is that what causes problems/bugs is often when partially invalid state is shared, not so much specific implementation details like "are the modifications to the state visible because of mutation or because a copy of the state is passed around explicitly".
I am super bad with terminology so I'll apologize beforehand.
I've found that a good way to avoid this ownership conflict in OO is to categorically prohibit any public accessors to _inherited_ variables, be it at construction phase or later, be it passively (via setters) or actively (via observers).
And there should be only ONE provider of said value. I've also found it is sometimes better to have a hot spot where all nodes converge and use it as a nursing node, and JUST THEN fork this nursing node into every, let's say, "logic gate requirement" node (with a cached state each).
This is a good approach IMO as long as these smaller nodes are required by more than 2 observers, if not, then a simple specialized observer is the way to go.
In statically typed languages you declare x an unsigned 8-bit integer. Everybody may write to x, and still you'd never anxiously expect x to be anything but an unsigned 8-bit integer.
There is nothing wrong with "multiple pieces of code" "directly affecting global state" if your business rules are encoded in such a way.
That works because an unsigned 8-bit integer is one piece of data.
But the problem we're kind of discussing is a group of objects and/or characters moving in a shared environment. You can't represent that as one 8-bit integer. Instead, it's a bunch of pieces of data: x- and y-locations (and maybe z, as well), extents, x- and y- (and maybe z-) velocities. You can easily get that into an inconsistent state: two objects occupying the same space. This is the point of "not many places write the data": To keep the data in a consistent state, you just have to get a little bit of code working right. To debug it when the data is in an inconsistent state, the problem can only be in a few places.
If you're in a multithreading situation, it gets even worse. Yes, it still works for your unsigned 8-bit situation, because it can be written in one assembly instruction. But if your data takes more than one assembly instruction to write, you have to worry about threading. If there are only a few places that write the data, you only have to protect a few places to keep the data from being corrupted by threading issues. (You might also have to protect the readers, so that they can't read it halfway through a series of writes...)
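As a sketch of the "only a few places to protect" idea, here it is in Python. The `Body` class and its fields are hypothetical, and Python's threading model differs from the assembly-level picture above, so take it as an illustration of the pattern only: one writer method and one reader method, both guarded by the same lock.

```python
import threading
from typing import List, Tuple


class Body:
    """Hypothetical object whose position and velocity must change together."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._x = 0
        self._vx = 0

    def set_motion(self, x: int, vx: int) -> None:
        """The only writer: both fields change under the lock."""
        with self._lock:
            self._x = x
            self._vx = vx

    def snapshot(self) -> Tuple[int, int]:
        """The only reader: never observes a half-finished write."""
        with self._lock:
            return (self._x, self._vx)


body = Body()
torn: List[Tuple[int, int]] = []  # reads where the two fields disagreed


def writer(n: int) -> None:
    for i in range(1000):
        value = n * 1000 + i
        body.set_motion(value, value)  # invariant for this demo: x == vx


def reader() -> None:
    for _ in range(1000):
        x, vx = body.snapshot()
        if x != vx:
            torn.append((x, vx))


threads = [threading.Thread(target=writer, args=(n,)) for n in range(2)]
threads.append(threading.Thread(target=reader))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Since every access goes through those two methods, the lock only has to be right in two places.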
> That works because an unsigned 8-bit integer is one piece of data.
No, the same holds true for any struct as well. If you declare x to be {a: uint8, b: uint8} you cannot magically turn it into {x: string, y: string, z: string}.
The real problem is that in most languages the expressiveness of this system is severely limited. You cannot do it in JavaScript at all, but TypeScript gets you very far.
> Instead, it's a bunch of pieces of data: x- and y-locations (and maybe z, as well), extents, x-and y- (and maybe z-) velocities. You can easily get that into an inconsistent state - two objects occupying the same space.
I think that's precisely the OP's point. You model the data (with types) in a way that this invalid state becomes unrepresentable:
Map[(x,y) -> object]
Now you simply can't have two objects at the same coordinate in your game. Of course you can now come up with more constraints and you will then have to refine your types (and create new ones) to match these constraints.
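A small sketch of that idea in Python, which enforces it through the data structure at runtime rather than through a compiler. `Board`, `place` and `at` are names invented here.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]


class Board:
    """Hypothetical board backed by a mapping from coordinate to object name."""

    def __init__(self) -> None:
        # One value per key: two objects on the same cell cannot be stored.
        self._cells: Dict[Cell, str] = {}

    def place(self, cell: Cell, name: str) -> bool:
        """Put an object on a free cell; refuse instead of silently overwriting."""
        if cell in self._cells:
            return False
        self._cells[cell] = name
        return True

    def at(self, cell: Cell) -> Optional[str]:
        """Return the object on the cell, or None if the cell is empty."""
        return self._cells.get(cell)


board = Board()
first = board.place((1, 2), "rock")     # True: the cell was free
second = board.place((1, 2), "player")  # False: the cell is already taken
```

Note that this shape only rules out one kind of invalid state. It does not stop the same object from appearing under two coordinates, which is the sort of further constraint that needs a refined model.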
Certain constraints might be hard to express as types in many languages (most mainstream languages are especially lacking here), but that's not a general problem of the approach but rather a problem of specific languages, for which you then have to find alternatives.
> If you're in a multithreading situation, it gets even worse. Yes, it still works for your unsigned 8-bit situation, because it can be written in one assembly instruction. But if your data takes more than one assembly instruction to write, you have to worry about threading.
Not with immutable data and functional programming (and I mean that's what the article is all about). This style forces you to make state changes explicit by only making copies and, if copies are made concurrently, explicitly specifying how to merge these copies.
Compare:
var map = Map(...)
fill_randomly1(map)
fill_randomly2(map)
do_something_cool(map)
Now if the fill_randomly functions run in parallel or concurrently, you have to somehow ensure that they are called in the right way/order and that do_something_cool is called after they have finished. Or worse: if they are supposed to run one after the other, then you have to be super careful about how your whole code is executed to ensure sequential execution of the parts of your code that call these functions. Which brings us to your valid point that the places that have to be checked to understand the execution should be as limited as possible.
But compare it to the pure functional style where mutation is not possible:
immutable var map = Map(...)
immutable var map1 = fill_randomly1(map)
immutable var map2 = fill_randomly2(map)
do_something_cool(???) // What do we put inside?
The compiler here will force you to give it a map - but which one? It explicitly makes you aware that you have to somehow merge the results. So let's do that:
immutable var map = Map(...)
immutable var map1 = fill_randomly1(map)
immutable var map2 = fill_randomly2(map)
immutable var map3 = merge(map1, map2)
do_something_cool(map3)
The important thing here is that it does not matter if fill_randomly1 and fill_randomly2 are run one after the other, in parallel or somehow concurrently: it is explicitly specified how the results are merged and the result will always be the same. Also, do_something_cool is guaranteed to run after the other functions, simply because it refers to a variable that relies on the output of the previous functions, so you simply cannot run it "at the wrong time".
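For the curious, the pseudocode above could look roughly like this in Python. Python has no compiler-enforced immutability, so `MappingProxyType` stands in for an immutable map, the seeds are fixed so the run is repeatable, and the merge rule (on a clash, the first map wins) is an assumption picked for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def fill_randomly(m, seed, label, count=5):
    """Return a NEW read-only map with a few labelled cells added."""
    rng = random.Random(seed)  # fixed seed: same cells on every run
    added = {(rng.randrange(8), rng.randrange(8)): label for _ in range(count)}
    return MappingProxyType({**m, **added})


def fill_randomly1(m):
    return fill_randomly(m, seed=1, label="tree")


def fill_randomly2(m):
    return fill_randomly(m, seed=2, label="rock")


def merge(a, b):
    """Explicit merge rule (an assumption): on a key clash, `a` wins."""
    return MappingProxyType({**b, **a})


base = MappingProxyType({})

# Sequential: one after the other.
sequential = merge(fill_randomly1(base), fill_randomly2(base))

# Concurrent: submitted in the opposite order, merged by the same rule.
with ThreadPoolExecutor() as pool:
    future2 = pool.submit(fill_randomly2, base)
    future1 = pool.submit(fill_randomly1, base)
    concurrent_result = merge(future1.result(), future2.result())
```

Both paths give the same map, because neither fill function can see or disturb the other's output and the merge rule does not depend on which one finished first.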
More to the point for the kind of software I write: What if map3 isn't immutable? What if it's something like the state of a router for a TV station, something that keeps changing? Then it can't be immutable, and you need a way to keep other threads from reading it while it's changing (or, worse, writing it while it's changing).
Or else, you say that it's immutable, and when some thread changes the state, it produces a new map3, and the old one doesn't change. But then you have the problem of getting all the other threads updated to see the new version of map3.
(And, by the way, I've worked on that router for TV stations. Modeling that matrix in a way that invalid states are unrepresentable is, um, extremely non-trivial...)
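One common way to handle the "getting all the other threads updated" part is a single holder that hands out the current immutable snapshot and swaps in a new one under a lock. A minimal Python sketch follows. `Published`, `current` and `update` are invented names, and the routing keys are placeholders, not anything from a real router.

```python
import threading
from types import MappingProxyType


class Published:
    """Hypothetical holder for the latest immutable snapshot of some state."""

    def __init__(self, initial) -> None:
        self._lock = threading.Lock()
        self._current = MappingProxyType(dict(initial))

    def current(self):
        """Return the latest snapshot; it never changes after being handed out."""
        with self._lock:
            return self._current

    def update(self, change):
        """Build a new snapshot from the old one and swap it in under the lock."""
        with self._lock:
            new = MappingProxyType(dict(change(self._current)))
            self._current = new
            return new


routes = Published({"out1": "camA"})
before = routes.current()
routes.update(lambda old: {**old, "out1": "camB"})
after = routes.current()
```

This is pull-based: a thread holding `before` keeps a consistent but stale view until it calls `current()` again. Whether that staleness is acceptable is exactly the design question raised above.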
Not sure if I understand what you mean. It would be possible both to call "merge" in parallel and to implement merge so that it does its work in parallel as well.
> Or else, you say that it's immutable, and when some thread changes the state, it produces a new map3, and the old one doesn't change. But then you have the problem of getting all the other threads updated to see the new version of map3.
Yes, but this "problem" is exactly the beauty of this style of programming. It forces you to make your data-flows explicit and hence easy to discover, understand and manipulate/change by other developers.
> And, by the way, I've worked on that router for TV stations. Modeling that matrix in a way that invalid states are unrepresentable is, um, extremely non-trivial...
I'm not saying it is trivial. But it is possible and depending on the programming language it is actually surprisingly easy. With a good type system, you not only prevent invalid state from occurring, you even get a lot of support from your IDE due to the extra information it has.
Here's an example of what that can look like for matrices and functions to manipulate them (like transpose): https://youtu.be/DRq2NgeFcO0?t=1137
The syntax might be alien to you, but I hope it still shows what's possible at compile-time before even running the program with modern languages today.