
Local State Is Poison (2012) - jamii
http://awelonblue.wordpress.com/2012/10/21/local-state-is-poison/
======
im3w1l
I don't understand anything this article is saying. What is RDP? What are
"live programming, open extension, and metacircular staged programming"?

What, and why would you want to upgrade during a loop?

>Those concepts are replaced by discovery – potentially in an infinite graph
of stateful resources

What?

>To achieve large scale, robust, resilient, maintainable, extensible, eternal
systems,

So many buzzwords...

Could someone give an example with before/after pseudocode? I understand
neither what the problem is nor how the proposed solution is expected to
solve it.

~~~
arethuza
To be fair, there is an "About RDP" link on the page:

[http://awelonblue.wordpress.com/about/](http://awelonblue.wordpress.com/about/)

~~~
mattmanser
I personally find it extremely telling that there isn't a single line of code
in the buzzword filled page.

The entire tree concept he is talking about seems like a misplaced faith that
you can achieve that kind of separation.

My gut says that in moderate to large sized programs so much state would end
up under the root, just out of laziness, time constraints, over-complexity or
lack of programmer skill, that you'd have the worst of all worlds.

I just caught the end of the global variables era and have worked on a few
programs where the state of things can get modified anywhere.

It is not pretty.

~~~
kbenson
I've worked on similar code. Global state which is sometimes twiddled directly
in the main loop, other times functions are called that take no arguments and
return nothing, but they compute something and set global variables.

Not pretty is an understatement. It's ugly at best, and rage-inducing most of
the time.

~~~
dllthomas
It helps when you can use the type system to restrain what can touch what.
I've been doing this in some of my high performance C, where some global
mutable state is unavoidable - passing around empty structs to indicate
context. It doesn't guarantee I don't violate my rules, but it makes it more
likely I notice when I do.

~~~
kbenson
It's almost never as bad when you design it yourself. Then the logic and
reasons are apparent. Coming into such a system from the outside can be tough
though. An explanation of the rationale and pre and post conditions is _very_
helpful, even if all it does is keep a future programmer from trying to
circumvent the system and causing the problems it was implemented to avoid.

------
Udo
The alternating aversion and draw of global state seems cyclical to me, and in
many cases the distinction is in name only. For example, JavaScript's scoping -
even in code without "global" variables - is global-like in that the parent
context is always inherited by sub-objects. Also, you couldn't invoke any of
the built-in objects and functions without them being global in some fashion.
It's in fact one of the reasons why it's so much fun to work with function
objects. Yet, many JS programmers balk at the notion of globality whenever
it's expressed explicitly.

One of the reasons global state gets a bad rap is because we're always trying
to minimize side effects, and that's a worthy goal. The idea not only makes
sure the program's components test well individually, it also enables better
component re-use (although I'm still convinced that re-use without refactoring
is mostly a myth in practice). However, in striving to ban explicit global
state we have developed an astonishing array of cruft and complexity which
counteracts the effects we wanted to achieve in the first place.

I agree with the article that well-defined and clean global data makes a lot
of code easier to handle and it also eliminates unnecessary work, both on the
human and the machine side. Of course, this idea breaks down again when the
global substrate becomes muddled and structurally broken. At which point we've
come full circle.

In my opinion a mixed approach is probably advisable for most projects and in
fact, that's the way we already do things in many cases, even if we're
forbidden to use the actual phrase "global variables". Judicious use of both
paradigms yields the best results in my opinion. Maybe it's time to actually
start calling global state by its real name, without expecting to be
stigmatized for it.

~~~
jamii
I think that a big part of the problem is pervasive access to global state.
Given a block of code, it's impossible to tell what effect it (or its sub-
calls) might have on the global state.

A common design pattern in clojure is to have a single massive data-structure
that represents all application state, but to take it apart recursively whilst
updating so that each function is only passed the pieces of state that are
relevant. That way you can look at a function and immediately know what it can
and can't read/write out of the global state.

A nice way of doing this in an imperative language might be to pass bits of
the global data structure by reference (ala lenses) so that a given function
can only read/write state contained in its arguments. The data is still global
in the sense that it is all accessible from the root object and is not
encapsulated, but access can be restricted on a per-function basis.
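
A minimal sketch of that pattern, in Python rather than Clojure. The state shape and the names `update_in`, `add_item` and `rename_user` are invented for illustration; the point is only that each function receives the one piece of the root structure it is allowed to touch.

```python
from typing import Any, Callable

# Hypothetical application state: one root structure, nothing hidden.
app_state = {
    "user": {"name": "ada"},
    "cart": {"items": []},
}


def update_in(state: dict, key: str, fn: Callable[[Any], Any]) -> dict:
    """Return a copy of `state` with `state[key]` replaced by `fn(state[key])`."""
    new_state = dict(state)
    new_state[key] = fn(state[key])
    return new_state


def add_item(cart: dict, item: str) -> dict:
    # Only ever sees the cart, so it cannot read or write the user.
    return {**cart, "items": cart["items"] + [item]}


def rename_user(user: dict, name: str) -> dict:
    # Only ever sees the user, so it cannot read or write the cart.
    return {**user, "name": name}


next_state = update_in(app_state, "cart", lambda cart: add_item(cart, "book"))
next_state = update_in(next_state, "user", lambda user: rename_user(user, "grace"))
```

Reading the signature of `add_item` is enough to know it cannot affect `user`, which is the property described above. Nothing here is enforced by Python itself; it is a convention.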

~~~
Udo
> Given a block of code, it's impossible to tell what effect it (or its sub-
> calls) might have on the global state.

That's true. The equivalent effect in the "all local" paradigm is local
objects the state of which can't meaningfully be understood or manipulated by
neither the programmer nor other objects, leading again to unexpected behavior
that is painful to track down.

A lot of the precautions that are common sense when working with global data
are already instinctively followed (or at least understood) by most
programmers I think. In effect the lenses are already working when there's an
understanding when and where manipulation occurs. This mostly coincides with
the notion that complex data should be manipulated by a well-defined model and
wherever sensible there should be only one mechanism for doing it.

~~~
mercurial
> That's true. The equivalent effect in the "all local" paradigm is local
> objects the state of which can't meaningfully be understood or manipulated
> by neither the programmer nor other objects, leading again to unexpected
> behavior that is painful to track down.
    
    
    def foo(x):
        x_plus_one = add_one(x)
        x_plus_one_times_two = times_two(x_plus_one)
        return x_plus_one_times_two
    

Here is a function with some local state. Which part do you feel "can't
meaningfully be understood or manipulated by [...] the programmer"? I'm not
trying to bait you, but I'm struggling to grasp the argument you are making.

~~~
lomnakkus
Immutable local state is (as you've observed in that example) harmless, but
once you start mutating closed-over state, you've given up completely on
referential transparency[1] which gives you a lot of power when reasoning
about what code does (and doesn't!) do.
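
A small illustration of the difference, as a sketch; `make_counter` and `successor` are made-up names. The closure returns a different value on each call with the same (empty) argument list, so a call cannot be replaced by its result, while the pure function can.

```python
def make_counter():
    count = 0  # closed-over state, mutated on every call

    def next_value() -> int:
        nonlocal count
        count += 1
        return count

    return next_value


def successor(n: int) -> int:
    # Pure: the same argument always gives the same result.
    return n + 1


counter = make_counter()
first = counter()   # 1
second = counter()  # 2: same call, different answer
```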

[1]
[https://en.wikipedia.org/wiki/Referential_transparency_%28co...](https://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29)

~~~
davexunit
I think closed-over mutable state certainly has its place. For example, delay
and force in Scheme. Mutable local state is used to memoize the result of the
delayed procedure. Of course, it would be terrible for a large program to keep
all of its state within a closure. You eliminate the benefits of live-coding
from your REPL at that point since you can only directly affect the top-level
environment.
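
A rough Python analogue of that memoization, assuming a simplified interface: `delay` returns a function, and calling that function plays the role of Scheme's `force`. This is a sketch of the idea, not how any Scheme implements promises.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def delay(thunk: Callable[[], T]) -> Callable[[], T]:
    """Wrap `thunk` so it runs at most once; later calls reuse the result."""
    done = False
    value = None

    def force() -> T:
        nonlocal done, value
        if not done:
            value = thunk()  # first call: compute and remember
            done = True
        return value

    return force


calls = []


def expensive() -> int:
    calls.append("ran")  # records how many times the body actually runs
    return 6 * 7


promise = delay(expensive)
first = promise()
second = promise()
```

The mutable `done` and `value` are invisible from outside the closure, which is convenient here and is also exactly why such state cannot be inspected or patched from a REPL.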

------
w_t_payne
I largely agree with the poster, although I would couch the argument in
different terms, since I think the global-vs-local dichotomy might be an
orthogonal matter.

Our brains struggle to reason about how state evolves over time. Add in
concurrency, and the problem easily becomes intractable. On top of this,
testing stateful components is burdensome.

So, the state needs to be kept as separate as possible from the complex
algorithmic logic of the application, so that the state-handling-parts can be
kept as simple as possible, and the complex parts can be kept as easily-
testable as possible. If this means that the state is handled globally, then
fine, but it is not really about where the state is held, but rather about how
easy it is to reason about and test.

My rule of thumb is this: we should be able to test our complex mathematical
and algorithmic components as stateless (pure) functions, independently of any
stateful parts of the application. The remaining stateful parts of the
application should have a simple and well understood lifecycle, preferably
well away from any concurrency, and with tightly controlled and documented
state transitions. (OOP is handy for this, although it must be kept on a tight
leash).
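
One way this rule of thumb might look in practice, as a sketch with invented names (`moving_average`, `Sensor`): the arithmetic is a pure function that can be tested on its own, and the only state lives in a small object with a single transition.

```python
from dataclasses import dataclass, field


def moving_average(samples: list[float], window: int) -> float:
    """Pure: the result depends only on the arguments."""
    recent = samples[-window:]
    return sum(recent) / len(recent)


@dataclass
class Sensor:
    """Stateful shell with one documented transition: `record`."""

    window: int
    samples: list[float] = field(default_factory=list)

    def record(self, value: float) -> float:
        # The only place state changes; the maths is delegated to the pure part.
        self.samples.append(value)
        return moving_average(self.samples, self.window)


sensor = Sensor(window=2)
readings = [sensor.record(v) for v in (1.0, 3.0, 5.0)]
```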

------
hexagonc
This seems kinda interesting. I don't know if it would solve all the problems
of global state but it would definitely make serialization of the program
state simple. It wouldn't even be that difficult to implement in a language
that has a LISP-like syntax.

A first stab at the "tree-shaped resource space" referred to in the article
would be the abstract syntax tree of the program itself. Each node would have
a unique URI, which can be a physical directory path on a filesystem or can be
stored in a database structure. Every local variable would be defined by a
path in a flat namespace. The "parent" directory of the local variable would
be the function it is defined in. Security rules can be created that simulate
many of the features of variable scoping rules. The most basic rule, that
variables are only visible within the scope of their parent function simply
means that the only variables that can be referenced within a function are
those within the same directory. Again, none of this seems too difficult to
implement especially if your language uses a LISP syntax.

I'm tempted to implement a toy version of this, if for no other reason,
because I've been wondering about good ways to serialize the program state of
a DSL that I've been working on. Performance seems to be the big problem with
using a database or filesystem. A global map of URI's (that is easy to
serialize) with some sensible access/permission strategy doesn't seem too bad
and could be transparent to the developer.
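
A toy sketch of such a map, in Python instead of a LISP. The class `ResourceSpace`, the path layout and the single access rule are assumptions made for illustration: a function, identified by its own path, may only touch variables whose parent "directory" is that path.

```python
import json


class ResourceSpace:
    """Flat map from path-like names to values, with a directory-based scope rule."""

    def __init__(self) -> None:
        self._values: dict[str, object] = {}

    @staticmethod
    def _check(scope: str, path: str) -> None:
        # A function may only access variables directly under its own path.
        parent = path.rsplit("/", 1)[0]
        if parent != scope:
            raise PermissionError(f"{scope} may not access {path}")

    def write(self, scope: str, path: str, value: object) -> None:
        self._check(scope, path)
        self._values[path] = value

    def read(self, scope: str, path: str) -> object:
        self._check(scope, path)
        return self._values[path]

    def snapshot(self) -> str:
        # Serializing the whole program state is a single call.
        return json.dumps(self._values, sort_keys=True)


space = ResourceSpace()
space.write("/main/fib", "/main/fib/n", 10)
space.write("/main/fib", "/main/fib/acc", 55)
saved = space.snapshot()
```

An in-memory dict sidesteps the filesystem and database performance concern, at the cost of a string lookup on every variable access.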

------
notacoward
I wrote about the related idea of implicit vs. explicit state back in 2005.

[http://pl.atyp.us/wordpress/index.php/2005/07/explicit-state/](http://pl.atyp.us/wordpress/index.php/2005/07/explicit-state/)

Briefly, the kind of global state that's needed for debugging should be easy
to find "from outside" - which precludes local variables along with other
common idioms. It might still be distributed, and that can still be
problematic, but the key point is that many ways of avoiding global state are
worse than what they avoid. Global state itself is not the problem;
inadequately contained or constrained changes to it are, and there are other
solutions besides elimination.

~~~
jamii
I like that explanation. I may link to your post in future rather than
dmbarbour's as it requires less background understanding.

------
sebastianconcpt
When you invoke an object class to create an instance, you are invoking a
shared global, so all things have their merits.

“local state is good, global shared state is bad” and all those (very easily
over)simplistic kind of thoughts are like pain killers. They might alleviate
you in a moment of affliction but they can also be addictive beyond the
point of benefit. In that regard, yes, something could be poison.

Your line of thought here will make you converge to invigorate some kind of
proceduralism. Sorry I don’t know what your domain problem is but you seem to
be experiencing an object oriented overhead that you feel like starting to
hurt.

You can go ahead and proceduralize things (functions against a remote
datastore) but I wouldn’t be so fast in questioning the object design
fundamentals. I'd try harder* to remove the original painful overheads or
whatever real pain is in your design.

*by harder I don’t mean to be muscular or that you aren’t putting effort into it. Harder could mean doing something as easy as asking hacker friends to use their fresh unbiased view for a problem/code review.

Listen all, pay attention to some, then ignore everybody (including me)

------
judk
He isn't saying that all state should be accessible to all functions, he is
saying that all state (including subtle state like call stacks) should be
colocated in a data store.

The data in the store can still be protected (with access tokens or
existential types or whatever) so that an item can be only accessible by parts
of the code that have the "key".

------
davesims
I think it's a bit of a category mistake to call filesystem or database data
'global state' in this context. Virtually every application has external
persistence of some kind and some way of referencing that persistence.
Technically those references -- variables or classes that manage connection
pools or IO utilities, etc. -- can be called 'global state', but that's
generally _not_ what the CS literature is talking about when it says 'avoid
global state'. The Evil Global State of CS lore generally refers to globally
accessible static or class-level values that refer to in-memory structs,
objects or values of some kind, in the stack or heap, rather than external
persistence. Filesystem and database access is usually taken for granted, and
is ceded as the unavoidable level of 'global state':

[http://c2.com/cgi/wiki?GlobalVariablesAreBad](http://c2.com/cgi/wiki?GlobalVariablesAreBad)

State, in general, at _any_ scope, can make things difficult no doubt. But
global state is classically bad because it's hard to reason about across large
chunks of distributed code, pollutes namespaces, creates concurrency
nightmares, etc. Reducing the scope of state to a manageable range of, say,
less than a dozen lines of code, into short-lived references, is clearly _far_
better than the alternative and a reasonable approach in the vast majority of
cases. Calling it 'poison' ratchets the rhetoric way beyond the gravity of the
problem. No, local state is not _considered harmful_.

What it seems OP is _really_ talking about in practical terms is _pure
stateless_ programming, where the application has no implicit or explicit
references to a value whose authoritative data resolves in _main memory_. If
you were to tell me the only state you have in your application is Filesystem
or database data, "just beyond the edges of our program logic," I'd say you'd
basically achieved the fabled 'stateless' programming ideal, long held as a
kind of Holy Grail of functional application development, and as OP points
out, that's not often achieved even in the strictest functional environments.

I don't want to diminish the points made, the article was instructive to me as
yet another anecdote about the perils of shared mutable state at any scope.
But the fundamental principle, that one should avoid shared mutable state as
much as possible -- which is the upshot of the essay -- has been axiomatic for
quite some time.

~~~
jamii
> If you were to tell me the only state you have in your application is ...
> "just beyond the edges of our program logic," ... that's not often achieved
> even in the strictest functional environments.

It's actually a pretty common design pattern in clojure to keep all
application state in a single datastructure (eg
[http://www.chris-granger.com/2013/01/24/the-ide-as-data/](http://www.chris-granger.com/2013/01/24/the-ide-as-data/)
[http://thinkrelevance.com/blog/2013/06/04/clojure-workflow-reloaded](http://thinkrelevance.com/blog/2013/06/04/clojure-workflow-reloaded)
[http://channel9.msdn.com/posts/Rich-Hickey-The-Database-as-a-Value](http://channel9.msdn.com/posts/Rich-Hickey-The-Database-as-a-Value)).

> But the fundamental principle, that one should avoid shared mutable state as
> much as possible -- which is the upshot of the essay...

I think you missed the point. The OP is arguing that even non-shared mutable
state should not be encapsulated away but should be accessible from some root
data-structure. That way you can eg serialise the whole state of your program
and restart it elsewhere or traverse the state with debugging and monitoring
tools.

He points out that the traditional evils of global state (unrestrained
mutation, non-reentrant code) have been solved in filesystems and databases
and that those solutions could equally be applied to keeping state in-memory.

In other words, separate data from logic and keep all of your data in one
place (whether that be a database, file-system or some well-controlled in-
memory structure).
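
A sketch of why that separation makes "serialise and restart elsewhere" cheap, assuming the whole state is plain JSON-compatible data. The `step` function and the state fields are hypothetical.

```python
import json


def step(state: dict) -> dict:
    """Pure transition: takes the whole state, returns the next one."""
    return {**state, "tick": state["tick"] + 1}


state = {"tick": 0, "log": []}
state = step(state)

saved = json.dumps(state)          # could be written to disk or sent to another machine
resumed = step(json.loads(saved))  # logic picks up from the restored data
```

Because `step` holds no state of its own, nothing is lost in the round trip; state hidden in closures or on the call stack would not survive it.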

------
viraptor
Sounds like what lots of telcos do with DAP/LDAP already. And there are some
insanely fast in-memory implementations of it.

Maybe not to eliminate the local state itself, but definitely for the
organisation / state sharing / layered security / persistence and many other
things he listed.

~~~
jamii
It's also the standard way to architect web apps, with stateless logic in the
servers connecting to the stateful database.

What's interesting about the OP is the idea of making that the _only_ source
of state, so that every other language construct is a pure function of its
inputs.

------
dllthomas
An interesting approach might be keeping state global, but providing
projections through which certain parts of the code must access that state.

