Case against OOP is understated, not overstated (2020) (boxbase.org)
460 points by tejohnso on Feb 10, 2022 | 528 comments



In my mind, state is the real enemy: it hurts comprehension, makes changes brittle, and widens the surface area exposed to potential bugs. OOP as frequently implemented, while claiming to encapsulate state, ends up creating so much more of it.

In accordance with this view, I think project architecture should be approached with an emphasis on how much state is necessary for it to run. This is why simulations, say a game or SimCity with relatively independent entities that map to something in real life, use OOP. If you're writing a service handling requests, you want as little state as possible. Singletons are state. Initialized/non-static objects are state. The less you have, the easier the system is to reason about.

As I write this however, I worry a little that my view is overly simplistic, or maybe applicable only to domains that I have worked in. If anyone wouldn't mind poking holes in this argument or offering examples I would appreciate it.


There's a really wonderful talk that I've recommended to almost everyone I've ever worked with, called Simple Made Easy[1] by Rich Hickey. I also struggled to explain why I hated state so much. You can talk about races with shared mutable state, but I found I couldn't stand it even in single-threaded code; it made things harder to reason about and change. It's because state is complex, in the sense Rich discusses in the talk: state intertwines "value" and "time", so that to reason about the value of a piece of state you have to reason about time (like the interleaving of operations that could mutate the state).
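
To make that concrete, here's a tiny Python sketch of the value/time point (made-up names, not from the talk):

    balance = 100

    def withdraw(amount):
        # The result depends on *when* this runs: you have to reason about
        # every earlier call that mutated `balance`.
        global balance
        balance -= amount
        return balance

    def withdraw_pure(balance, amount):
        # Pure version: the "time" dimension is gone; the result depends
        # only on the arguments, whenever you call it.
        return balance - amount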

I don't know if it's just me but I watched that talk a couple years into my career and it was like something clicked into place in my brain. It changed the way I think about software.

[1] https://www.infoq.com/presentations/Simple-Made-Easy/


The problem I have with talks like this is that they sound fantastic on the surface. They almost sound self-evident! "Duh! I want to make simple things, not easy things! That was great!"

But where are the examples? Not a single example of something easy versus simple, or how something "easy" would resist change or be harder to debug. All of these concepts sound fantastic until you begin to write code. How do I apply it? It's a great notion to carry around, but I often wonder if this is just someone's experience/opinion boiled down to a really well done talk, and not much else.


The point of simple vs easy is they exist on completely different dimensions. There's simple/complex, and there's easy/hard. Something can be simple+easy, simple+hard, complex+easy, or complex+hard. Obviously there's a sliding scale in each dimension.

Simplicity in a vacuum isn't a good thing. Ideally your solution targets the exact level of simplicity vs complexity required for your problem. Obviously you won't always hit or know the target.

The value in simplicity is greater composability. It's especially important for the building blocks of our systems, of which programming languages make up a huge portion. It doesn't sound too controversial to say that it's easier to take multiple simple things and make a more complex thing than it is to take a complex thing and distill it down to the simpler thing you need. I say this because regardless of what programming paradigm you adhere to, the "kitchen sink" unit of code is universally derided, be it god modules or god classes that do shit you don't need.

It's not that Clojure is all simple, all the time. There is mutable state in Clojure - atoms, refs, etc. They also have interfaces. And multimethods. And so on.

But the simplicity floor is lower in Clojure than most other languages I've used. More than those other languages, you can target the level of simplicity you need. And it provides for more complex elements if you need them. And in my experience, a lot of the time, you don't need those more complex elements.


If you want functioning, robust, maintainable software (or even better, software that doesn't require maintenance), then spend a long time modeling the problem domain. Build it as a system of types, a protocol, perhaps even a language (or at least an AST with semantics). Prove things about this model, particularly some useful things about soundness, consistency and (in)completeness. Learn all the funky symbols people use in the literature, learn about the strange tools you weren't told about in undergrad like dependent typing or higher-order contracts or CRDTs and lattices. Spend a lot of time doing this. Then, when you have determined the essential shape of the domain and nothing more, implement the software. At that point, the code almost writes itself.

I submit that if we did that, we would have excellent, elegant, simple software, but following the process would be incredibly hard. So hard, in fact, that it couldn't possibly be distilled into a conference talk.


What sort of domains do you see as sufficiently well-understood and stable where this process is even achievable? A lot of my career has been in domains where we are exploring problems by building and shipping things to see what really works for users and customers. And other times there's domain volatility driven by changes in technology and competitive landscape.

Even for domains that are stable and knowable, I have to wonder what businesses can afford that kind of up-front investment before the first feature ships.


I've had largely the same experience as you, but I have seen some hints that real simplicity could be possible. If the domain is technology itself, there may be no underlying simplicity.

Ultimately, I think we have to make a trade-off between simplicity and easiness. The approach I outlined would be incredibly expensive because the tooling for that approach isn't quite good enough yet, and stakeholders wouldn't even understand it. They wouldn't realize that you were building a pitch for your product not as a PowerPoint deck, but as executable code!

A lot of our complexity today comes from constructing software over layer upon layer of previous complex software (CSS, I'm looking at you), not from the intrinsic "business cases" our software is meant to solve. Some of that complexity cannot be avoided, and some of it could be, but at significant cost. To use an analogy, a traffic-light-controlled intersection is cheaper to build, but an overpass is simpler.

Coincidentally, almost all of the tools I've seen that try to make simplicity cheaper come either from the Scheme/Racket/Lisp world that Hickey himself hails from or from Alan Kay and his sphere of influence. (The two groups have quite a bit of overlap, both in terms of ideas and even people.)


Could you please elaborate on Hickey’s and Kay’s key ideas and how to try them hands on?

I know about Smalltalk (Squeak), so I guess that is the playground for Kay’s. Would just playing with Clojure do the same for Hickey’s?


Sorry, I'm still not seeing how/when the approach you're hinting toward is practically valuable. So far it seems to me like you're pursuing one dimension of quality to the exclusion of others. Which is an interesting theoretical exercise, so if that's your jam, have at it. But it sounded to me like you were proposing something people could actually do.


Compilers maybe?


Ooh, interesting! You're right, there's a class of domain where one can just push the real-world change to the edges of the system and ignore it. E.g., there's surely software that's mainly about complying with laws.

But even there, I suspect adaptation has to happen. Python's had how many versions over the years? Indeed, I could argue that it's one of the world's most successful languages precisely because it keeps responding to user need. Or look at tax software, which is going to change at least every year, and more often in emergencies.

So I suspect at best these other domains have a slower iteration clock. Which might be slow enough for the sort of formal modelling that is described. But then I think there's an open question: do other methods also work just as well with slow iteration clocks?


Speaking as someone with experience with many of those things (PL theory/formal verification background), I don't think they're even close to being a silver bullet.

Coming up with the right abstractions and the right domain model is difficult (especially if you just sit down and try to come up with stuff, you're likely to get it wrong the first time around). Knowing about some of those things could help you come up with better abstractions, but it's neither necessary nor sufficient to ensure that you will.

Take dependent types for example. They allow you to express more program invariants or correctness properties in your types. But actually using them requires you to write proofs (at least, if you're using them to their full potential). And I do think that, in general, System F-like type systems hit a nice sweet spot and are generally good enough for the stuff that you might actually want to handle at the type system level.

I've also run into similar "proof-like" situations with much simpler type systems like those of Haskell and Rust, where I was structuring my types to "make illegal states unrepresentable", but in the process ended up complicating my program by having to match its structure to the expected structure of the types. Sometimes it is nice _not_ to have the type system enforce some of your invariants. (Such things are also doable with dependent types of course, but this is just an example of some of the tradeoffs involved.)

You can also still have a shitty domain model even if you use all of those fancy tools. They just allow you to be very formal/precise about the domain model (and do perhaps encourage some more uniformity by making it more annoying to express ugly or complicated things).


Domain knowledge is very important. In the real world, however, by the time you finish this type of process the competition will already have their product out. It may not be that perfect castle in the sky, but it will work, and if you have revenue you will have the time and means to improve.


Our customers don't even want to pay for something that bespoke. They have margins to worry about.

So instead we've had to make a system which makes it less painful when bugs occur.

For us that means making it trivial to run older major and minor versions of our software, and an automated update mechanism which delivers new builds to customers on-premise in less than an hour, updating the DB schema as well.


I don't think this excludes what the GP said, but this is super important as well. I think of it as second-order reliability: design your software not only so that bugs don't occur, but also so that the user can take practical steps to remedy bugs if they do occur.

(Also, as one of my past companies enshrined as an engineering axiom: "write software to be debugged". Most programmers write waaay too few logs. You know the print statements you add to your code when it's buggy, to track down what's going wrong? Well, do that all the time, and if there are too many then fix that problem with adequate tooling. If it's running on your customers' computers - whether servers or PCs or phones - then store them locally for N days / N logs and allow them to be submitted when a bug occurs. Stack traces - even good ones - are not nearly enough.)
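
A minimal sketch of that local-log idea in Python (hypothetical and single-process; a real implementation would persist across restarts and rotate files):

    import collections, time

    # Keep the last N log lines in memory, ready to dump when a bug occurs.
    _recent = collections.deque(maxlen=10_000)

    def log(msg):
        _recent.append("%.3f %s" % (time.time(), msg))

    def dump_logs(path):
        # Call this from an error handler or a "report a bug" action.
        with open(path, "w") as f:
            f.write("\n".join(_recent))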


100% agree. It's a trade-off. Get product-market fit first and learn what you can about the domain. Spend enough time on architecture up front so you can easily pivot. That's all the simplicity you should care about at that point.

Once you get traction, you can start to afford the crazy vision. IMO, at that point it's easily worth the risk. A decent research team will probably discover something, potentially extremely valuable knowledge.

If you were James Clerk Maxwell before he published his equations, how much would they be worth to you, especially if you had paying customers?


By the time you're 20% into that process, your competitor has already overtaken the market.


To quote Thiel, "competition is for losers."


Counterpoint is that the Big Design Up-Front utopia didn't win in software, giving rise to Agile (for better or worse).


Easy things work until you have to extend them or do anything the least bit complicated. Think of SQL or most "easy" declarative APIs. Or even worse, ORM engines. Simple things are normally also easy to use, but you may have to write some more boilerplate and there's less "magic".

Steve wrote a simple CRUD API that gets some data and returns it. Bob tried to be clever and wrote a loosely typed declarative cluster fuck that nobody understands, but it's "easy" if you don't do anything interesting or useful with it.


A bit like haiku, wonderful when you read it, extremely hard to maintain conversations in haiku.

Or like an improv exercise where you have to improvise a dialogue, but only by using questions, no affirmations.

Can it be done? Sure, but not by most people, not in real time. Again, wonderful when you see it done right.


Talking in haiku: Wonderful when you read it. Too hard to maintain.

Improvisation. A constrained dialogue. Affirm? No. Question.

Can it be done? Sure. Most people struggle slowly. When right? Wonderful.


It's easy to stop calling a now-unused function when some behaviour is no longer needed.

The system is made more simple if you remove the function, though.

This is more so if only part of the behaviour of a function is no longer desired - the function becomes easier to understand when it's trimmed down, but it's harder to make that change.


The presenter is Rich Hickey. He is the guy who created Clojure. He basically designed the language around this principle (it is a very opinionated language). If you want examples, look at Clojure and its ecosystem, where the ideas of Rich Hickey are held in high regard.


The Clojure language is the example. Basic data structures vs classes/objects, immutable vs mutable, lisp vs other languages, etc.


> They almost sound self-evident!

I think it's hard to provide examples since they would all be implementation dependent.

Simple, to me, is a stage of the thought process that becomes apparent only after putting in the extra work. It's not just applying "this 1 trick". Making it simple is its own unique challenge. E.g. my first iteration of an idea is always a mess. Then I rework it enough times to make it presentable (a state where it "works" and I can reason about it with others). But on the job nobody pays me to make things simple, because that means spending another 10-30% of the budget on it. Making things "simple" at work is nearly impossible to sell, because people quickly throw arguments at you like "perfect is the enemy of good", and few jobs give you a "definition of done" where making things simple is part of it.

Another reason why it's impossible is that the best time to rewrite a greenfield project or an MVP is before you add additional features. But at that point people will not allow it, because the expectation usually is to build on top of what you (they) invested in previously.


That time part is what you are wrestling with when you are battling state, so it's natural to think about it that way. But there's also this somewhat dumbed-down version of the argument: every piece of state a method reads is like an additional function argument, and every piece it writes is like an additional return value. What a mess.


This is insightful.

In some sense, the only distinction a "pure" function has over "non-pure" is that it declares all its inputs/outputs (as function parameters and result). We say that a non-pure function has "side effects", but all that actually means is that we don't readily see all its inputs/outputs.

Even a function that depends on time could be converted to a pure function which accepts a time parameter - this is conceptually the same as a function which accepts a file, or an HTTP request or anything else from the "outside world".

The trouble, of course, comes from the tendency of the outside world to change outside of our program's control. What do we do when time changes (which is all the time!) or file, or when the HTTP request comes and goes never to be seen again?

Or when the user clicks on something in the UI? Can we politely ask the outside world for the history of all past clicks and then "replay" the UI from scratch? Of course not. We cache the result of all these clicks (and file reads and network communications and database queries...) and call it "state". When the new click comes, we calculate new state based on the previous state and the characteristics of the click itself (e.g. which button was clicked on). This is a form of caching and keeping a cache consistent is hard, no matter what paradigm we choose to implement on top of it.
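
In code, that "state as a cache of past events" view is just an incremental fold. A Python sketch with made-up event names:

    from functools import reduce

    def update(state, click):
        # New state computed purely from the previous state plus the event.
        return {"count": state["count"] + (1 if click == "inc" else -1)}

    # Conceptually, the UI state *is* this fold over all past clicks...
    state = reduce(update, ["inc", "inc", "dec"], {"count": 0})

    # ...but since we can't replay the world, we keep `state` around as a
    # cache and apply `update` as each new click arrives.
    state = update(state, "inc")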

The real-world example of this would be React. It helps us implement the `UI = f(state)` paradigm beautifully, but doesn't do all that much for the `state` part of that equation which is where the real complexity lies.


There's no such thing as UI = f(state) in React. You may know that already, but it's UI = f(allStatesStartingFromInitialState). That way all state transitions are captured and all state changes are handled accordingly inside components, taking into account each component's internal state.


This made me think: if we wrote object-oriented methods where all the members we access are passed explicitly as parameters, as well as all the members we modify (as out references), then we would at least immediately see the real complexity of some methods! I'll try to do this; I'm curious to see what that would look like.
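
For what it's worth, a sketch of what that might look like in Python (a contrived class; Python has no out references, so written members come back as return values):

    # Before: the method silently reads two members and writes two.
    class Account:
        def settle(self):
            self.balance -= self.pending_fees
            self.pending_fees = 0

    # After: every member read is a parameter, every member written is a
    # return value. The method's real complexity is now in its signature.
    def settle(balance, pending_fees):
        return balance - pending_fees, 0  # (new balance, new pending_fees)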


> I'll try to do this; I'm curious to see what that would look like.

That looks like a terrible mess.

The problem is not state, but messy access to it.


Everybody agrees that OOP was killed by getters and setters. But I don't think that there is much consensus about how long it would have survived without.

(I'm not saying that OOP doesn't have its place, but it has clearly turned from a way of structuring code to universally strive for into something to avoid if possible)


At some point you get too many parameters, so you pass a struct, which basically means that struct turned into an object. (One interesting difference is that you can pass more than one different struct to that function, which is the equivalent of subclassing, but with more permutations possible. That's actually interesting.)
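
E.g., sketched in Python with a made-up struct:

    from dataclasses import dataclass

    @dataclass
    class RenderOpts:  # the struct that grew out of too many parameters
        width: int
        height: int
        dpi: int

    def render(doc, opts):
        # A free function taking the struct: morally the same as a
        # RenderOpts method, i.e. the struct is an object in all but name.
        return "%s at %dx%d@%ddpi" % (doc, opts.width, opts.height, opts.dpi)

    # Unlike a method pinned to one class, render() could also accept a
    # second, unrelated struct argument, hence the extra permutations.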


That's not a bad way of putting it. It reminds me of "It is the user who should parameterize procedures, not their creators."


> State intertwines "value" and "time", so that to reason about the value of a piece of state you have to reason about time (like the interleaving of operations that could mutate the state)

Chapter 3 of SICP deals with this topic in great detail.



I think I was at that talk. If I remember right the Sussmans were there as well and Gerry was the first to his feet giving Rich a standing ovation after that talk.


This is one of my favorite talks. It also helped things click for me regarding state. I try to use immutability wherever I can now and when there are unavoidable state changes, I try to understand and constrain the factors that could lead to such a state change. It's simplified things so much for me.


I enjoyed the talk and agree with it in many ways, but perhaps a contrarian stance will stimulate some interesting discussion. Here's the steelman I can think of against that talk.

Hickey's fundamental contention is that whether something is easy is an extrinsic property whereas whether something is simple is an intrinsic property. Whether something is easy is dictated often by whether it is familiar, whereas simplicity lends us the more ultimately useful property of being understandable.

To which I'll counter with Von Neumann's famous quote about mathematics : "You don't understand things [simple]. You just get used to them [easy]."

There is no fundamental difference between ease and simplicity. Simplicity (of finite systems) is ultimately a function of familiarity. There's a formal version of this argument (which is, effectively, that most properties of Kolmogorov complexity when applied to finite strings are determined by your choice of complexity function, even in the presence of an asymptotically optimal universal language; in particular, there is not a unique asymptotically optimal universal language, that is, the Invariance Theorem is overhyped), but the informal version is that both simplicity and easiness arise from familiarity.

Indeed the fact that there is "ramp-up" speed for simplicity suggests that in fact what is going on is familiarity. E.g. splitting state into "value" and "time" is one way of thinking about it. But I could easily claim that in fact "time" complects "cause" and "state." Rather, state machines where the essential primitives are "cause" and "effect" are the proper foundations from which "value" and "time" then flow (you can think of "effect" nondeterministically, a la infinite universes, and then "value" and "time" fall out as a way of identifying a single path among a set of infinite universes). Likewise Hickey claims that syntax mixes together "meaning" and "order", whereas I could just as easily say that "order" complects syntax and semantics!

What of the idea of "being bogged down?" That "simple" systems allow you to continue composing and building whereas merely "easy" systems collapse and are impossible to make progress on past a certain threshold? I claim that these are not intrinsic properties of a system. They are rather extrinsic properties that demonstrate that the system no longer aligns well with the mental organization of a human programmer. However this is dependent on the human! A different human might have no problem scaling it.

Now hold on: perhaps, while simplicity is dependent on the human mind, humans all more or less have the same mental faculties. Perhaps we can't find a truly intrinsic property that we call simplicity, but perhaps there's one that's "intrinsic enough" and relies only on the mental faculties common to all humans. That is, returning to the idea of "being bogged down," there are systems whose complexity puts them beyond the reach of all, or at least most, humans. We can then use that as our differentiator between "simple" and "easy."

To which I would reply that this is probably true in broad strokes. There are probably systems which are so arcane as to be un-understandable by any human even after a lifetime of study. But at a more specific level, the way humans think is very varied. The ways we learn, the ways we develop are hugely different from person to person. Hence I find this criterion of "bogging down" far too weak to support Hickey's more concrete theses, e.g. that queues are simpler than loops or folds.

When you're talking about things like love, hate, and fear, sure maybe those are universal enough among humans to be called "objective" or to have associated "intrinsic properties," but when you're talking about whether a programming language should have a built-in switch statement, I don't buy it.

For the purposes of programming languages, simple is not made easy. Simple is easy. Easy is simple. The search for the Platonic ideal of software, one that relies on a notion of intrinsic simplicity, is a false god. Code is an artifact made for consumption by humans and execution by machines and therefore any measure of its quality must be extrinsic to the humans that consume it.

Sometimes X is simple. Sometimes it's not. It all depends on the person.

As empirical evidence of this I leave this final exchange between Alan Kay and Rich Hickey where the two keep talking past each other, no matter how simple their own system is: https://news.ycombinator.com/item?id=11945722


I appreciate the thought process here, and I'd want to spend more time thinking it over before a full response, though I think it leans a little too much on etymology for my taste! My immediate comment is that working memory is a measurable, finite resource that developers have to use. The more entities they have to track in order to model the part of the system they're working on, the more working memory they use.

Every bit of state creates potentially exponentially more possible entity states: ten independent boolean flags already give 2^10 = 1024 reachable combinations. So limiting potential changes in state limits the amount of working memory necessary to understand the system. It's starting with "can't" and then building a "can" when necessary, which is a lot better for memory, comprehension, and feeling safe/secure to make changes than starting with a collection of 10^n "can"s and adding in "can't"s.


First off, I don't think this is quite the way Hickey thinks about the issue (though I suspect he would agree about the working memory part), especially with the comment about etymology! (/s: it's a meme in Clojureland that every Hickey presentation and library must contain at least one slide on, or mention of, etymology.) In particular, Clojure as a whole embraces an ideology of "open systems" vs "closed systems", where we start with an infinite sea of "can"s and then add "can't"s as needed.

But that's immaterial to your main point, which is that adding state into the mix of things makes things hard. Which I agree with, but again to steelman the point, I could turn around and say that values allow for exponentially more possible values as well! When I see a map passed into a Clojure function I have no idea what could be in that map!

I think the main objection here which you are alluding to is one of "global" vs "local" reasoning. With a value I just need to worry about the body of my function, whereas with (global) state I need to worry about every function everywhere! But what if that's just a problem with our tools rather than an intrinsic issue? What if I had a tool that could automatically present all the mutable state of your system that is publicly accessible as a single screen and automatically link to different procedures that link to different parts of it? At that point I don't see much of a difference between state strewn everywhere and nice orderly values plumbed everywhere. In fact maybe it's nicer to have that implicit state strewn everywhere instead of having to carry around values which are irrelevant for the bulk of a function body and only relevant for a single part of a subfunction. What if it's all just a matter of not having the right IDE?

Working memory is definitely a hard limitation and universal enough among humans, but it's not clear to me it's a specific enough concern to convincingly justify certain programming language features which may just be crutches for inadequate visualizations or different educational backgrounds.


> But what if that's just a problem with our tools rather than an intrinsic issue? What if I had a tool that could automatically present all the mutable state of your system that is publicly accessible as a single screen and automatically link to different procedures that link to different parts of it?

The world needs this. I think Pernosco has a workable technical foundation, but the GUI is a debugger and I need a code exploration tool to "find my way" in big unfamiliar codebases. Encouraging developers to pick up and hack around in others' codebases is the only way to get enough eyeballs to make all bugs shallow.

> maybe it's nicer to have that implicit state strewn everywhere instead of having to carry around values which are irrelevant for the bulk of a function body and only relevant for a single part of a subfunction.

I think global state (which is unusually bad) or shared mutable state (which is omnipresent outside of Rust) is a mental overhead (more things to keep in mind). I don't think tooling can eliminate the overhead of worrying about moving parts, only make it faster to look up (and hopefully document) what touches each bit of state.


There's a lot to think about in your comments in this thread but I have a nitpick about functional programming style here.

> In fact maybe it's nicer to have that implicit state strewn everywhere instead of having to carry around values which are irrelevant for the bulk of a function body and only relevant for a single part of a subfunction.

I would call this an anti-pattern in FP. It's often a symptom of trying to replicate more imperative styles like OOP in a pure language. Threading mostly-irrelevant state through a bunch of different functions is a sign that your program is under-abstracted. If you think of all the function calls in your functional application as a tree, state should stay as close to the root of the tree as possible, kept in nodes it's relevant to, and the children and especially leaves of these nodes should be decoupled from it to the greatest extent possible.
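
A toy illustration of that shape in Python (hypothetical names):

    def total_price(items, tax_rate):
        # Leaf: pure, decoupled from any state, trivially testable.
        return sum(items.values()) * (1 + tax_rate)

    def main():
        # Root: the only place state lives and transitions happen.
        cart = {}
        cart["book"] = 12.0
        cart["pen"] = 3.0
        print(total_price(cart, 0.08))  # leaves only ever see values

    if __name__ == "__main__":
        main()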


> Threading mostly-irrelevant state through a bunch of different functions is a sign that your program is under-abstracted.

The problem is that often you do want fairly complex state in the leaves of the tree, but want very little of it in anything else. Web browsers are a classic example of this. Pure FP solutions such as Elm that completely eschew the idea of local mutable state require a lot more ceremony to implement something like a form (the classic thorn for Elm users). By forcibly moving up the state to the root, you sometimes end up needing to pull some fairly severe contortions.

E.g. the usual answer to move the state back up to the root in the land of statically-typed, pure FP is to express it in a return type (e.g. a reader or state monad, culminating in the famous ReaderT handler strategy in Haskell) or in the limit bolt on an effect system instead. The usual answer in impure FP is to accept some amount of mutable state and just rely on programmers not to "overdo" it.

But from a certain point of view, writing an elaborate effect system whose very elaborateness might cause performance issues and inscrutable error messages sounds suspiciously like trying to work around a problem in visualization with an over-engineered code solution. And from another perspective it feels a bit like a trick. If some function has a lot of state, then I would hope by opening up the definition of the function I'd see how it all works, but with an effect system all of a sudden I've split things up into an interpreter that actually performs the mutation and an interface that merely marks what mutation is to be done. It feels like I've strewn logic around in even more places than if I just had direct stateful, mutable calls there!


I will say plainly that I think there are situations in which mutability offers more elegant solutions than immutability, but I think most languages that offer it do it badly. I’m most experienced programming the Erlang platform via Elixir, and I think it offers a really nice midpoint between locality of state and purity. Within a process everything is immutable, and mutation requires sending a message to a process that will have a function specifying an explicit, pure state transformation from that message. Just about the only thing I don’t love about Elixir is the lack of real types.

I’m also very pragmatic and to the example of a web browser I would say, most applications are not web browsers. The overwhelming majority aren’t, in fact. I’ve chosen at this point in my career to mostly focus on enterprise software development, which I believe was Rich’s original field as well, and I’ve seen an enormous number of solutions with too much state cast about everywhere that benefit massively from centralizing the state high in the tree and really thinking through the data model carefully. So I stand by the principle I advocated originally, but it’s not universally applicable. It’s my belief that one of the core virtues of software development is knowing when to apply which principles.


> to the example of a web browser I would say, most applications are not web browsers.

I should've clarified. I meant developing a web page to run on a web browser, hence the form example.


It’s a good point. UI is a situation where the classic OOP-style frameworks work really well when they’re carefully designed. I think we’re still waiting on a model for doing that with FP that doesn’t rely on passing state deep down into an expression tree like React and its descendants encourage you to do. There’s stuff like Redux but it has its own problems.


You can "solve" global mutable state with an IDE until you bring concurrency plus parallelism into the mix. Then all bets are off for mutable global state.

In the case of Clojure, the map that you pass to a function is a value. It is guaranteed not to change underneath you and it can be freely shared with anybody.


Well to keep my contrarian hat on...

> concurrency plus parallelism into the mix

The hard part of concurrency is writing or writing+reading, not just reading, so an immutable map isn't going to solve everything. Instead the hope is that you confine the mutability to one place with various transactional guarantees (in Clojure's case, this is usually atoms) and then everywhere else you don't have to worry about it.

But then again why couldn't the same analysis be performed on mutable state? How are we sure this isn't just a tooling issue? If we knew exactly what parts of mutable state were being touched by what we could identify what critical sections needed various guards.

Taking my hat off and going back closer to my own views, I actually think Clojure's combo of maps+atoms are an arguable case where Clojure has in fact complected things together in a way that e.g. STM doesn't (and Clojure's implementation and use of STM has its own problems). Namely it's complected committing a transaction with modifying an element in a transaction.

To illustrate the problem, right now Clojure atoms basically give up parallelism entirely. If you have a map in an atom with two threads modifying different keys, then those threads have to come one after another. It's actually kind of a waste of resources compared to the single thread case because work done in one thread will be thrown away and retried if the other thread wins.

So if you want true parallelism when modifying different keys, you can use a ConcurrentHashMap. But that then gives up atomic updates of multiple keys at once! (Or you can have nested atoms, but that has its own problems and doesn't solve the inter-key atomicity issue.)

It looks like an all or nothing proposition where you either get non-parallel but fully atomic map updates or parallel per-key updates but nothing in-between. These kinds of false dilemmas are a classic symptom of complection.
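
The same dilemma is easy to see with plain locks. A Python sketch (not Clojure, just to illustrate the granularity trade-off):

    import threading

    shared = {"a": 0, "b": 0}

    # Option 1: one lock for the whole map, like a map in an atom.
    # Multi-key updates are atomic, but writers to different keys serialize.
    whole_map_lock = threading.Lock()

    def update_both(a, b):
        with whole_map_lock:
            shared["a"], shared["b"] = a, b

    # Option 2: one lock per key, like a ConcurrentHashMap.
    # Independent keys update in parallel, but there is no way to update
    # "a" and "b" together as one atomic step.
    key_locks = {k: threading.Lock() for k in shared}

    def update_one(key, value):
        with key_locks[key]:
            shared[key] = value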

The way other languages with an STM system deal with this is to build concurrent maps out of STM refs. That way you get exactly the amount of parallelism you can have relative to the amount of atomicity you need. If you have a transaction that touches two keys at once, then both of those keys are atomically updated together and those two keys form one unit of parallelism. If you have a transaction that only touches one key, then you have per-key parallelism. If you have a transaction that touches all the keys at once, then you just collapse to the normal case of a map inside an atom.

As far as I can tell the reason Clojure doesn't do this (but other languages have) is that its STM API is a bit clunky and missing some interesting combinators.

All this is to say that maybe indeed simplicity and ease aren't all that different if from one perspective atoms are simple and from another merely easy.


Those are well reasoned points.

I'm not going to delve into STM because that can be a whole book worth of discussion :). It's a fascinating universe, I've spent many hours (weeks, months?) exploring it, and I don't consider myself even close to an expert.

You are absolutely correct about the trade-off about atoms in Clojure.

Practically speaking, to start seeing retries you'd need a large number of updates going on at the same time. You can push a huge number of updates through a single thread. If you do need big throughput, you can explore not-so-idiomatic options like atoms-in-atoms, as you said.

IMO, the biggest unique benefit of combining atoms with immutable persistent data structures comes from the fact that you get an unlimited number of consistent readers virtually for free. Any thread can look at (i.e., deref) an atom while the state/world keeps moving forward. I don't think any amount of tooling can solve that case for mutable data. A snapshot of a mutable data structure would require copying the whole data structure while using some sort of locking strategy to stop writers while the read is taking place.
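
To spell out that read path, here's a toy atom in Python (a sketch only; Clojure's real atoms use compare-and-swap rather than a lock):

    import threading

    class Atom:
        # Holds a reference to an *immutable* value; values are replaced,
        # never mutated in place.
        def __init__(self, value):
            self._value = value
            self._lock = threading.Lock()

        def deref(self):
            return self._value  # a consistent snapshot, no lock needed

        def swap(self, f):
            with self._lock:    # writers serialize
                self._value = f(self._value)

    world = Atom((1, 2, 3))
    snapshot = world.deref()        # any thread, any time, for free
    world.swap(lambda t: t + (4,))  # readers of `snapshot` are unaffected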


In production, I may only want one connection pool to a DB, and in that case global state is pretty much equivalent to passing state as an argument. Development in a Clojure REPL is a different story. I have one connection pool for the dev server, and a separate pool to run tests against. The test db is re-created from a template between each test run, without affecting the dev db at all. I can trivially have multiple test pools if I want to run tests concurrently.

I also have a separate service that the server makes calls to, which doesn't run on this server in production (it has its own production server), but does run in dev and test. Each dev/test system runs a separate instance of this service, which has its own separate connection pool(s), and setting this up was trivial.

Needless to say, failures are reproducible and meaningful. There is no mocking -- we test against real local services with real local DBs. (There are still some remote service calls which I'm slowly replacing, and some flakey, unavoidable remote dependencies in a few browser tests).

I didn't do anything special to make this possible other than naming the config files "service-name-config" instead of just "config". It is just the natural result of passing state in explicit arguments. The same is not true of global state.
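
Sketched in Python (made-up names), the whole trick is just that the pool is a parameter:

    def make_pool(db_url):
        # Stand-in for a real connection pool.
        return {"url": db_url}

    def fetch_user(pool, user_id):
        # Every query takes its pool explicitly; nothing global to patch.
        return "query user %s on %s" % (user_id, pool["url"])

    dev_pool = make_pool("postgres://localhost/dev")
    test_pool = make_pool("postgres://localhost/test_from_template")
    # The dev server and tests run side by side, each with its own DB.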


To continue with my devil's advocacy...

> It is just the natural result of passing state as explicit arguments.

But nothing you've mentioned here is intrinsic to mutable state. It seems like all that's happened is you identified a part of your program that you wanted to be configurable and exposed a configuration knob. If, for example, you wanted a test mode where you prefix "test-" to every string written to the DB, that would also probably involve a new argument somewhere. There's nothing here special about the mutable state part of it.


> To which I'll counter with Von Neumann's famous quote about mathematics

I’m fairly sure this great quote is about mathematical “objects” in that you will never be able to truly “understand” or have a “real feeling” for more complex ones, like higher dimensions. Yet, by applying some simpler rules we can use and transform them, and after a bit of practice that will make it feel “close to us”, or “real”.

> Simplicity (of finite systems) is ultimately a function of familiarity.

I really don’t believe that’s true. Maybe I’m misunderstanding, but no matter how familiar I am with a given CRUD program vs JIT compiler technology, the latter will always be complex. But, as you later refer to, I’m sure you know the difference between essential and accidental complexity. In this view I would rather say that simple things are ones with minimal accidental complexity, while the easy-hard axis is about the essential part, which is irreducible.


>>> the way humans think is very varied

>>> It all depends on the person.

Based on what I've recently learned about neuroscience and optogenetics, I don't think there's much evidence to support this sort of relativism. On the contrary, many processes in mammalian brains have common mechanisms.

To explore more, this is a great podcast https://peterattiamd.com/karldeisseroth/

Disclaimer: I am a complete layman on the topic, so please correct me if I'm wrong.


There is more to how we think than the underlying mechanisms, just as varying programs can be run on the same hardware.


This concept of "used to" vs "understand" reminds me of an interview with Feynman where, IIRC, he explains to a layperson how magnetism can work at a distance. He discusses the "why" questions and how you keep getting deeper and deeper each time you ask "why". He concludes that his explanations won’t be satisfying for the other person, saying "I can’t explain this to you in terms you are more familiar with". I thought it was interesting and related. I’ll try to find that video.


It's the Feynman "Fun to Imagine" video/series.

This bit is where he says that about magnets: https://youtu.be/P1ww1IXRfTA?t=1300


I want to add to this that physics aims at this 'simplicity', i.e. being able to derive mathematical models ab initio, with the least amount of assumptions.

While the 'simplest' (in the physics sense) description of something is elegant, it can also be extremely hard to understand and work with. Maxwell's equations are used in engineering for a reason - and not their simpler theoretical physics underpinnings.


If you're going to reference a Rich Hickey take-down of OOP, I think "Are We There Yet?" is the most pertinent: https://www.youtube.com/watch?v=ScEPu1cs4l0

Of course, Simple Made Easy is excellent too, probably his most influential talk.


Time does not go away from the concept of value when you remove state.

What state takes away is access to a given value at any other time but now.

It's always now; every value is the current value and no other version of that value exists.


Not just you, I had the same experience. I rewatched it several times over the years and understood something new every time.


> State intertwines "value" and "time"

Reminds me of deterministic finite automaton. Is that what you mean?


Me as well but I was already sold on Clojure by then.


> ends up creating so much more.

This is primarily because of inheritance, which seems counter-intuitive. In a meta-analysis of OOP-based designs, inheritance is used as the primary form of composition with other strategies being either last-resort or added later when the inheritance is already deeply embedded as part of the design.

Inheritance is a brittle form of composition (no re-inherit) that nests state in a deep tree-like type system, rather than isolating it into attachable modules. Most OOP-based languages have slowly had to adopt additional forms of composition, as inheritance is not well suited to cross-cutting concerns. Ironically, almost anything added after the base class (and maybe some abstracts above that) is a cross-cutting concern added after the core functionality is established.


> In a meta-analysis of OOP-based designs, inheritance is used as the primary form of composition

Whose meta-analysis came up with that? Like to see that.

> Ironically, almost anything added after the base class (and maybe some abstracts above that) is a cross cutting concern added after the core functionality is established.

That's a bold statement (unless you have a novel definition of "cross cutting concerns") and actually backwards: The super provides the generalization and subs specialize. A cross cutting concern is a 'general' concern. AFAIK, cross cutting concern is a term originated by the inventor of AOP, and the typical garden variety CCC deals with matters that rarely have anything to do with the types to which it is applied. (Debug log in-args is a garden variety example.)


> Whose meta-analysis came up with that? Like to see that.

You'll have to dig into each language that has expanded its composition capability and the reasoning, but the outcome is self-evident. Many languages started with simple inheritance (e.g. PHP, Java, VB, C++, et al.) and expanded composability mechanisms over time.

> That's a bold statement (unless you have a novel definition of "cross cutting concerns") and actually backwards: ... A cross cutting concern is a 'general' concern.

I'm not going to argue about how you wish to redefine things.

Good luck with whatever.


> This is primarily because of inheritance, which seems counter-intuitive

I agree that inheritance creates a lot more problems, but the usage of non-static methods and internal state, even in classes with no inheritance, can feel just as bad when you have a high-level method utilizing instantiated objects. Internal state as a whole can be avoided fairly often.


I don’t see much difference between

    some_state.do_stuff()
    do_stuff(some_state)


They're just different syntaxes for the same thing. I think what OP is driving at isn't the syntactic difference but making immutable what doesn't need to be mutable. You could do that with either syntax.


Yes, but I think the person he/she was replying to is right, the biggest problem is inheritance.


The second one scares me. It implies some_state is mutated (or not, we may just be logging something) by the do_stuff function while the first makes it very clear that some_state is in charge of doing stuff and that the implementation is aware of how some_state is implemented itself.

OTOH, the second one would be much better (and imply immutability) if it were written as

  new_state = do_stuff(some_state)
But it'd also allocate a new state.


This is a symptom of seeing everything through OOP-and-state glasses.

Let's have do_stuff=square and some_state=2.

What does square(2) imply?


Better yet, what does 2.square() imply.


Immutability is independent of whether something is a method call vs a function call.


I would say a big contributor is also reference semantics being the default behaviour for classes in many languages. You end up sharing the state, and with every pass-by-reference in the code base you increase the surface area of code which can touch that state.

I know there are mechanisms to avoid this, but many times they are opt-in rather than opt-out, and so it encourages this access-to-state propagation through the codebase, with far-reaching consequences.


If only we could completely eliminate state! Thankfully, I am working on a plan for this. It should take around 10^106 years... give or take.

The serious comment here is that the real world imposes a minimum floor on the amount of mutable state that you have to model. Databases are giant piles of mutable state. Maybe we should start talking about "essential state" and "accidental state" the way we talk about complexity.


I agree and would add that the UI is also a pile of mutable state. Even if you model the DOM using a pure function, there's still scroll position, selection, animation, history, and so on.

At the end of the day, users interact with state. We need languages and techniques that manage it well.


Yes. This is one of the core problems of FP style analysis of what's "wrong" with software development. Sometimes those people act like state is some sort of bad habit, like chewing tobacco. But the way they say to get rid of state really just hides it behind large state-management engines, like browsers and databases. It boils down to "state for thee but not for me". Well, great. But some of us have to deal with the fact that computers are often used to model the real world, which isn't a pure function.


There is a common pattern in OOP land where you use ooo.setXXX(yyy) to duplicate state across objects/fields, instead of using some kind of getter to map between them (probably due to the difficulty in some languages of linking between objects?).

You end up with state that should be a single thing living in multiple places.

And this is one of the biggest sources of bugs, because you WILL do it wrong. As the code grows, the number of places where you need to manually synchronize data grows. You end up missing one and creating a lot of bugs.

In most FP languages, on the other hand, it is pointless to duplicate state most of the time, so this kind of problem doesn't happen much in the first place.

BTW: Personally I like the idea of the [computed](https://v3.cn.vuejs.org/api/computed-watch-api.html#computed) primitive of Vue, because it makes getters/setters first-party and encouraged. And getters/setters never add state. If you use it properly, state can be reduced a lot, with much shorter code. It looks the same on the surface as manually duplicating the state, so you don't need to refactor everything just to use it.
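
The same idea outside Vue, sketched as a Python property (hypothetical class): the derived value is computed from the one real piece of state, so there is no second copy to keep in sync:

    class Order:
        def __init__(self, items):
            self.items = items  # the one real piece of state

        @property
        def total(self):
            # Derived on demand: no duplicated `total` field that a
            # forgotten setXXX call could leave stale.
            return sum(price for _name, price in self.items)

    order = Order([("book", 12), ("pen", 3)])
    order.items.append(("mug", 8))
    assert order.total == 23  # always consistent, nothing to synchronize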


I like the term accidental state. This is also the type of state you see in a lot of OOP code, as referred to by parent.

Beginners want to keep functions short, and the way to do that is to chunk up a bigger method into several smaller ones, and then realize, oops, that you needed that variable in both functions. Store it in "this" and now, instead of one decoupled function, you have two coupled functions.

Contrived example written on a phone, but code like the below is extremely common, especially from Java coders who have been misled into making classes for everything and haven't learned the static keyword yet. Here, obviously, self.stuff is the accidental state creating coupling between the functions, which now have to be called carefully in the correct order, and any of their mutations to self can impact the other.

    class Worker:
        def __init__(self):
            self.setup()
            self.foo()
            self.bar()
        def foo(self):
            self.stuff = fluff
        def bar(self):
            do_work(self.stuff)
Rather than just do_work(bar(foo(fluff))).


I agree with this - and of course without any state whatsoever the program is unlikely to be useful. A database, a network connection pool, and initialized configurations are required pieces of state for just about any backend service, and you can't really get rid of them. But having clear lines around how state is stored and utilized and minimizing it in business logic to me creates a much more sane program.


Isn’t exactly these boundaries what OOP gives us?


This is true, and another thing about state is also true: relational databases are much better suited than programming languages (except maybe Prolog) to handling state cleanly and minimizing the amount of it. OOP languages especially are bad at state minimization. For example, there's no commonly used equivalent of normal forms in OO. There are no indexes and no materialized views.

The funny effect is, if you want to minimize state and you're serious about it, keep everything you can in the database and make a two-layered (relatively thin) client <-> relational DB architecture, with stored procedures. We were there in the 90s, and we moved away because the web and OOP became fashionable.

So we took our clean, normalized, minimal state from db and made it messy and complicated with ORMs to satisfy OOP gurus :)


I don’t think ORMs are a fair choice as prototypical examples of (good) OOP. But I agree that having data (record) primitives is very important in some domains (and that relational databases are really powerful). There are other cases as well which are not as data-oriented, though.


ORMs were the OOP way in 00s and 10s.


People also moved away from that paradigm because databases are slow. I work in the world of optimizing TLP level communications over PCI-e buses. To me, a database access is already in the world of "why bother?".


We moved from

    DB <-> Front end
to

    DB <-(ORM)-> APP SERVER <-> FRONT END
I don't think it's any faster :)


Simple reason - add a local or distributed cache to the app server, scale it horizontally along with the front end, and you can handle several orders of magnitude more traffic.


>project architecture should be approached with an emphasis around how much state is necessary for it to run. This is why simulations like say someone making a game or simcity with like relatively independent entities that map to something in real life use OOP.

In the beginning of my career I did a lot of engineering simulations (Simulink), to me signal flow diagrams have always been a very obvious way to model programs. All the state is explicit in that it becomes a delayed output->input mapping of the signal flow graph. Each block behaves the same, because it has no internal state.

I always thought about programs in the same way. What goes in, what goes out, what goes back in (if multiple iterations). Only later did I find out about functional programming, which basically is the same idea, and that instantly clicked.

Except for the auto-completion after typing the . on an object, I've never really seen OOP (in the Java way, not the Erlang way) be intuitive or simple. Always keeping state in mind, class hierarchies spanning tens of files where the only way to know what your object really does is to step through with a debugger, interfaces for everything because otherwise you can't mock the classes for the tests; the list goes on.


I think in a lot of ways you're correct. FP and imperative code tend to make state explicit; OOP hides state. The latter MAY make things "easy"; it never makes them simple.


To continue my hot take from earlier, OO wasn't supposed to "hide" state any more than FP was. Rather, OO was supposed to be about changing the metaphor of the program as you write it.

This is most easily seen if you consider a TON of the domain specific languages out there. Logo, PostScript, DVI, GCode, etc. Many of these are "move to X" "put down pen", "move to Y", "pick up pen", etc. Very imperative and how you would talk to someone on how to do something.

So, if your objects give meaningful verbs to control the state that they maintain, it works rather naturally to reduce the code that you have.

Now, most OO today, that I see, embraces objects as records, and goes out of its way not to encode any language of behavior in the code it lets you write. But I don't think that is enough of an argument to say that abstracting some active objects into an OO paradigm is a waste and can never help.


I think you are exactly right. There is a huge difference between data-oriented and behavior-oriented parts of a program. OOP is only a great tool for the latter but it does allow for immutability besides it, it is not either-or.

Wrap the behavior in classes and have data as data. Hopefully this will be indeed the direction taken by Java with its records and other new TBD features.


"Hiding" state is necessary to endow it with well-defined invariants. This can be done in many FP languages, too. The semantics-side implications of "encapsulated" state w/ proper invariants have yet to be explored, though, and this is where newer PL formalisms like "homotopy types" might end up being quite helpful.


> The semantics-side implications of "encapsulated" state w/ proper invariants have yet to be explored, though

It seems that that's called a state machine, and OOP objects should come with state charts, but they don't.

> and this is where newer PL formalisms like "homotopy types" might end up being quite helpful.

PL research would actually get adopted if they didn't insist on using the worst possible names for everything. If they're not calling something an "intuitionistic type theory in the calculus of constructions" they're calling it a "pi-calculus".


> It seems that that's called a state machine, and OOP objects should come with state charts, but they don't.

That is because the state chart would so quickly explode into uncountable states, or into transitions too difficult to understand, that such a diagram would become instantly useless. Which only goes to show how unrealistic the idea is that you can really fully understand such a system, and that shows what the problem is. Granted, there may be certain areas of related state that are separated from other areas, but when the program becomes non-trivial the lines usually blur, unless some kind of approach is used which reminds me very much of FP, only that it uselessly wraps functions in classes and objects instead of making use of modules and functions only.

In an FP style, ideally each function would be a thing you can look at separated from the whole system, if you know what its input can be (which may be difficult). That makes for testable code. I should be able to test every function separately, without having to use ten other classes to make instances to set up an environment, in which I just hope that what I wanted to test, can actually be tested.
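
i.e., a test is then just a call (a Python sketch with a made-up function):

    def apply_discount(price, rate):
        # Pure: the whole "environment" is the argument list.
        return round(price * (1 - rate), 2)

    def test_apply_discount():
        assert apply_discount(100.0, 0.2) == 80.0
    # No object graph to construct, no mutations to replay, no mocks to wire up.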


> In an FP style, ideally each function would be a thing you can look at separated from the whole system

That is theoretically impossible - complexity is fundamentally not decomposable into parts in the general case. That is, given a complex function, you may not be able to extract one more meaningfully separate part, leaving the core function still too complex.

OOP’s general idea is to encapsulate just enough of the complexity to make it possible to reason about its outside API, while the complexity will live inside, allowing the class to enforce some of its invariants.

Don’t get me wrong, I’m not saying that FP is bad, hell, I think that both paradigms are essential, they are not either-or choices.


I think you are slightly misunderstanding me, or I did not put it very clearly. Of course there will be complexity inside functions in an FP style; I never said there would not be.

What I want to express is that I can call every function of the program separately. I might have to put effort into preparing the call's arguments, of course, but in the end I can look at its inputs and outputs in a unit test, separate from the setup of an environment. The environment is basically in the arguments of the function.

The FP paradigm encourages people to avoid global state and state mutation, which helps with reducing the setup effort required to make the arguments for the function call.

In an OOP-style program, I cannot simply call and test every part separately. I will have to create a kind of landscape of objects, one which has experienced the right set of state mutations, and which hopefully sets up an environment in which I can test one specific case of a method doing what it should do. That is the moment when the state diagram has already exploded into uncountable states, usually impossible to keep in your head. It might also be the case that the constructor of an object interferes with the actual setup that you want to have. Then you will need to apply mutations to change the state to get there, doing more work than would ideally be necessary.
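As a toy illustration of the testing difference (Python; all the names here are invented):

    # FP style: the "environment" is entirely in the arguments
    def apply_discount(price_cents: int, percent: int) -> int:
        return price_cents * (100 - percent) // 100

    assert apply_discount(1000, 25) == 750   # testable in one line

    # OOP style often needs a landscape of objects first, e.g.:
    #   cart = Cart(Catalog(...), Session(User(...)), PricingRules(...))
    #   cart.apply_discount(25)
    #   assert cart.total() == 750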

I still see a place for OOP in things like GUIs. People are trying to get declarative or functional there as well, but there it seems like a normal thing to have a widget really change state, to avoid the overhead of creating a new widget and re-displaying it. But maybe in the future FP will invade this territory as well somehow.

And yes, you can combine FP and OOP, but many common practices used in OOP are detrimental to the advantages FP can bring. I think it would be best to limit OOP to the parts of the system where it makes sense, and then wrap it in an API which protects the rest of the system from having to use mutation all the time. The question becomes again "What is OOP?". Is it still OOP if I work with structs and functions working on structs, instead of objects? Do we use Alan Kay's definition, with message passing and each object being its own little machine? I think in Erlang we have some combination of it. Like Joe Armstrong said in a talk with Alan Kay, it is either the most or the least OO language. Well, maybe nowadays we have different candidates for that as well.


> if they didn't insist on using the worst possible names for everything

If the alternative is stuff like "FactoryFactoryFactory", I'm not sure that's better.


It might not be better, but at least I don't need a dictionary to understand what it is all about; I just need to read it again and sort out the words in my head.


Absolutely. Why make the const char * data private in a string class otherwise? I have to know it's in a valid state so I can get it to the next valid state when the caller runs an operation on it.

But then a lot of proving (small-p proving) that an object in C++ is in a valid state amounts to Hoare predicates and other SPARK/Ada-like expressions.

Certainly FP could do the same, and, like C++, define that away when callers and callees assume undefined behavior is gone?


What would homotopy types bring to the table?


They seem to be necessary if you want a notion of "equivalence" (for both values and types) that enables you to make functions, operations, constructs etc. independent of any notion of "underlying representation" as well as seamlessly applicable across equivalent 'representations'. This is desirable in both higher mathematics (where homotopy types were first developed) and software engineering, for much the same reasons.


Could you explain this to a layman with an interest in FP?


I read about this a while ago; I'm not an expert, but this is my take on it: In homotopy type theory an equivalence of types is a first-class value that you can manipulate, and you can separate types from their 'implementations'. The classic example is that you have a Nat type with a Peano construction (a Nat is zero or the successor of a Nat). This is not very efficient, but you write functions with it, prove things, etc. Then it's time to optimize, and you change your Nat implementation to something more efficient (e.g. a Nat is zero, or twice a Nat, or twice a Nat plus 1). The functions and proofs that you wrote with the previous implementation will still work, and your type signatures won't have to change.


Isn't this what a homotopy category is for though?


Sorry, but state is everything. If you don’t have state, then you’re essentially doing useless work computing an answer that is already known. Computation is only useful because of state.


Here's a pure computation:

    import Data.List (nubBy)

    refuteGoldbach :: Integer
    refuteGoldbach = head $ [ n
                            | n <- [4,6..]
                            , not $ n `elem` [ p1 + p2 | p1 <- primesTo n, p2 <- primesTo n ]
                            ]
      where primesTo n = takeWhile (< n) $ nubBy isMultiple [2..]
            isMultiple m n = n `rem` m == 0
If you think you already know the answer to this computation, get yourself a Fields Medal.

And then there are pure functions. Every time you compute a function using an input no-one has tried before, you are probably computing something that is not already known. You do this routinely even with a calculator.


And if you don't store the answer in some kind of state, it's lost and computing the pure function was useless.


In typical functional languages state is managed so that it doesn't hurt, not completely absent.

For example, a REPL keeps state in the messages it prints to the terminal and this state isn't visible in the pure function definitions and expression the program is concerned with.


You don't have to throw away the result. That's what State monads are for. Immutable state is OK. Memoization is OK.
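For instance, in Python, memoization keeps results around without the caller ever observing mutation (a minimal sketch):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n: int) -> int:
        # the cache is internal mutable state, but fib stays
        # referentially transparent: same argument, same answer
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(90))   # instant, despite the naive recursion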


> get yourself a Fields Medal.

Unfortunately I'm over 40, otherwise I would try


This is a really poor take and you know it. State can exist in functional systems - as results of computations passed to other computations. Recursion where the new argument is the state.

No one was saying "don't use state" they were saying we need to adjust how we use it.


This is what I don't understand about mutable vs immutable: my functions only exist to mutate state.


> only exist to mutate state

Which part of the state, and when?

Is this your program?

   // change anything, anywhere, in the entire database, the biggest state
   execute_sql(user_input)
You might want controls around that state so not anyone can change it. It might be read only for some users - that is, immutable.

Is this your program?

    log_to_database(financial_event for $5.00)
You probably want your financial logs to be immutable. Nobody should be able to mutate that $5.00 event to be $500.00.

Is this your program?

    o.foo = complex_function(...)
    ... 1000 lines later ...
    o.foo = null
    ... 1000 lines later in another file ...
    o.foo.do_stuff() // oops! foo was set to null somehow - but where?
    
The above scenario has shared state with "foo". It was set to null somewhere; it could have been set to null anywhere in your program. Good luck tracking down the bug. If "foo" was immutable, you would know immediately where the null came from, because it can only be initialized one time. It lowers the cognitive load; knowing that certain actions are impossible makes it easier to focus on what matters.
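In Python terms (a minimal sketch; Holder is made up), that whole class of bug disappears:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Holder:
        foo: str

    h = Holder(foo="result of complex_function")
    try:
        h.foo = None                 # any attempt to rebind foo...
    except Exception as e:
        print(type(e).__name__)      # ...fails loudly: FrozenInstanceError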

Is this your program?

    thing = computation()
    return thing + 2
Many programs are a series of computations - fresh state is created on each line, nothing is mutated. There is no reason not to be using immutability. In fact, if immutability were used throughout, a compiler wouldn't have to worry about things like aliasing.

This is all the tip of the iceberg, there are many reasons to enjoy using immutability.

https://en.wikipedia.org/wiki/Aliasing_(computing)


But in your foo example, presumably complex_function() returned some data that was important, and foo was set to null for a reason. So when you call do_stuff() you need to know: was it supposed to be called on the result of complex_function, or should it not run at all because there is a missing null check?

I guess what you are saying is that foo should not have been changed, but a new variable created called "can_do_stuff" and it should have been checked before the call do_stuff()

I think the old Carmack article linked somewhere below makes a very good case for why pure functions are a good thing, and I see the value of making small atomic changes to an app's state, but ultimately, almost every application's job is to mutate data.


>Is this your program?

You, sir, should write flyers and product copy :)


Try this instead for a pure function: a thread-safe radix trie supporting insert, delete, and find. Purpose: excluding blacklisted IP addresses. Go!


This question was not snark at FP, pure functions or otherwise; no ill will! It needs a legit answer, maybe with a small lecture on why radix tries can be FP but not purely FP. Hey, I can read and learn.

Other sorts of streaming operations --- message in, transform to a new object which is logged or put into a data store --- most certainly do not need mutation, so long as they stay cache friendly and performant. FP and friends might even argue: yeah, OK, it's some 10 percent slower, but there are no race conditions and no weird synchronization, and formal methods are easier to deploy there. Those secondary arguments can be effective too; I'd bite.


With mutable state, your function mutates state somewhere in the environment (typically a global variable or the class in which the function resides). With immutable state, your function returns an immutable value, never modifying anything. I program Java for the most part, and this is how I do most things. Mutable state is the enemy, not OOP. Some people think mutable state and OOP are inseparable, but they are just looking at a definition that is easy to criticize. If you make every object immutable, OOP actually becomes a much nicer model to work with.
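Concretely, the style looks something like this (a Python sketch of the same idea; in Java you would use final fields, and Account is an invented example):

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Account:
        balance_cents: int

        def deposit(self, amount: int) -> "Account":
            # no mutation: every operation returns a fresh value
            return replace(self, balance_cents=self.balance_cents + amount)

    a = Account(100)
    b = a.deposit(50)    # a is untouched; b.balance_cents == 150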


I have come to the same conclusion. State is the problem. State should be:

- minimal (amount and lifetime)

- well conceptualized (~= easy to understand the organization)

- well named

- minimally exposed

- coherent by construction (make inconsistency impossible by design of the format or by offering updating functions that ensure the invariants)

OOP can actually help with some of these things! I develop mainly in C++, which doesn't encourage a purely OOP style like Java.


I like your "bullet points" and agree with them all.

What are your thoughts on (a super simplified example):

* 1 state-var with 3 values?

* 2 state-vars with 2 values each?

Sometimes I steer my design too much toward the first example and other times toward the last. Both extremes can make things ugly.


This is a typical conflict, and I think my main problem is that I spend too much time worrying about it. The important thing is that you make sure that they cannot become inconsistent (you can do this by always going through a function that ensures that when updating them). A thing I have done somewhat recently is:

  enum AuthConnectionState 
  {
      WaitingForConfig = 0,
      Disabled,
      Connecting,
      Connected,
      TimedOut
  };
where the value of the corresponding variable is derived (in just one place, which is called when any input changes!) from many inputs, and it's the authoritative source of information. If you want to know whether the current state allows proceeding with login (which can happen locally if so configured, or if the connection has definitely failed), call:

  static bool connectionStateAllowsLogin(AuthConnectionState state)
  {
      return state == Disabled || state == Connected || state == TimedOut;
  }
(Note for people who don't know C++: this is a file-static function, which is basically as private as it gets in C++, and it's also a pure function, not by any language feature though. It could access globals.)

It has a couple of sister functions like isWaitingForWhatever() or isLocalLogin().

The naive alternative is a very nasty and error-prone forest of booleans, each of which you must remember to update when something about the connection changes, and to make sure it's all consistent. It's almost impossible to get right without exhaustive testing.


All well and good, but where do you put the damn state?


Somehow I often end up with classes containing lists or hash tables containing structs, more often than other people apparently. A technique that is IMO underused is getting creative with the key in a hash table or an ordered map - it does not have to be a primitive type, and even an integer can be divided into ranges or an integer plus a few bit-flags.

I also like to use enums, but these are widely used anyway.

It's hard to say something general because the answer is "it depends".
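For example (Python; the domain here is invented), the key of a map can carry structure:

    # the key doesn't have to be a primitive: a tuple works fine
    sessions = {}
    sessions[("auth-service", 42)] = {"state": "connected"}
    sessions[("billing", 42)]      = {"state": "idle"}

    # or pack an id plus a few bit-flags into one integer key
    ARCHIVED = 1 << 30
    records = {7: "current entry", 7 | ARCHIVED: "old entry"}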


> OOP as frequently implemented

This is the real issue with OOP. Abusing a type system, using inheritance when composition or interfaces would be more appropriate, etc.

I've seen pretty much every programming silver bullet implemented in the most horrifying ways by people who didn't understand the reasoning behind each approach.

You can write great FORTH and terrible LISP. You can write readable FORTRAN or APL (I'm stretching it a bit here) and elegant 6502 assembly. You can even write resilient and reusable JavaScript and PHP if you have the discipline to do it.

> If you're writing a service doing requests, you want as minimal state as possible.

The service can have a lot of state. What you really don't want is your client trying to keep track of it. When tempted to do so, you need to change the service.


I like the Elmish way of explicitly managing state. You have one model that changes. Any model state can be rendered. The state is obvious and clear. If you want to test something, just create that state and test it. No need to click 10 buttons just to get the UI into the state where your bug was found.
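A bare-bones sketch of that loop (Python; the Model type and the messages are invented):

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Model:
        count: int = 0

    def update(msg: str, model: Model) -> Model:
        # the ONLY place state changes, and it returns a new model
        if msg == "increment":
            return replace(model, count=model.count + 1)
        return model

    def view(model: Model) -> str:
        return f"count = {model.count}"

    # to reproduce a bug, construct the exact state directly:
    assert view(update("increment", Model(count=9))) == "count = 10"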


Elegant Objects by Yegor Bugayenko actually argues against mutation in OOP for the same reasons that FP advocates do (bad for concurrency, hard to test, hidden state is hard to keep in your head, et cetera), but then all you get are namespaces with functions that act on a (usually) single data type (i.e. the class itself and its properties).

OOP itself could be good for problems where you need state machines. The Erlang Actor model is successful for a reason, but I wouldn't apply OTP to general programming.


Nothing that the typical junior engineer does with OOP can't be done with functions and well organized files. If they can't be trusted to do that well they shouldn't be writing OOP code.


When everything's a function, junior programmers treat the entire set of available functions as reasonable things to call at any time. With OO, they at least have to think about how to organize things by which interfaces are available.

People love to hate on Java, but a Spring app with everything set up into a graph of interacting interfaces is about the most well-organized code you'll ever see, and it encourages modifications that maintain thoughtful organization.


Your view isn't simplistic; I think a vast majority of developers would agree with it. State, however, is necessary for creating useful and, in some cases, performant programs. Furthermore, in some cases state can make a program quite a bit easier to write and even read.

I think there are much more interesting things to be said about state than just to minimize it. For example, you can limit stateful computations inside a function in such a way that the function itself is still referentially transparent (it behaves as if it had no state). In this way, you can still do a for-loop, or do a quick-sort on a copy of the input data, without losing the benefits of pure functions. In the D programming language, this can be expressed in the type system.
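For instance (a Python sketch; D can express this guarantee in its type system, Python only by convention):

    def sort_events(events: list) -> list:
        out = list(events)    # private copy of the input
        out.sort()            # mutation, but only of local state
        return out            # the caller's list is untouched, so the
                              # function is referentially transparent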

We need to 'deal with' all the risks that state involves; reducing it is the first thing to do, but then there are a lot of other options as well.


Wow, now we just need to stop interacting with anything stateful, like the real world!


> This is why simulations like say someone making a game or simcity with like relatively independent entities that map to something in real life use OOP.

I don't think this is the case, or at least it hasn't been for quite a while.

Any gamedev I've known in the last decade or so would reach for an ECS[0] if they wanted to clone SimCity; in other words, they would design it somewhat like a normalized database.

Each character, building, zone, or whatever in the game would be an Entity, represented by a unique ID.

Then there would be collections of Components, which are basic structs like Position or Sprite. A set of components that are all tied to the same ID would represent a single Entity the same as if it were an instance of an Entity class. These components together hold all of the game data, to the point where a naive save game system could just serialize all the (non-pointer) component data and be done. How these are stored varies by implementation and configuration, but a table in a database is a reasonable mental model to understand the core concept.

The game logic is executed in Systems, which are functions that read and write component data on some schedule. The simplest example is a velocity system, that would find all the entities with both a Position component and a Velocity component and update the Position accordingly.

In the case of a SimCity style game, the ECS approach is much more cache friendly, for both instructions and data, because you're handling all of the same work at the same time, instead of updating each entity one at a time which leads to cache miss after cache miss. This can bump the max number of agents in your simulation by multiple orders of magnitude.
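A toy sketch of the core idea (Python; real ECS libraries add storage layouts and query engines on top of this):

    # components: one table per component type, keyed by entity id
    positions  = {1: [0.0, 0.0], 2: [5.0, 5.0]}
    velocities = {1: [1.0, 0.5]}             # entity 2 is static
    sprites    = {2: "tower.png"}

    def movement_system(dt: float) -> None:
        # operate on every entity that has BOTH components
        for eid in positions.keys() & velocities.keys():
            positions[eid][0] += velocities[eid][0] * dt
            positions[eid][1] += velocities[eid][1] * dt

    movement_system(1.0 / 60)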

Some other benefits are:

* Empower designers to iterate more quickly by giving them an editor where they can change components out without changing code. Say you have Players and Enemies and they both have Health, and Walls which are simply props with Collision. If you want to try destructible environments you can simply add a Health component to your Walls in the editor rather than try to move Walls into your LivingEntity inheritance chain or modify everywhere that does damage to check for WallEntity in addition to LivingEntity.

* Easier to parallelize. If you're using objects and you want to start multithreading you quickly start feeling like mutexes are the only answer. But with an ECS if your Systems only operate on their arguments, then you can run any systems in parallel that you want, as long as anything you mutate is not referenced in any other currently running systems. For instance every Rust-based ECS I've ever seen does this out of the box, because they can tell what fields are mutable from the function signature.

* Easier to test. If all your movement system cares about are entities with both Position and Velocity, then that's all you need to setup to perform a test. No MockPlayerInput or headless rendering required, except where those are actually the thing under test.

[0]: https://en.wikipedia.org/wiki/Entity_component_system


I appreciate that info! It's an abstraction I had used in the past to demonstrate to other engineers what I thought seemed somewhat useful about OOP, but as I was typing I thought to myself: I'd bet that modern performant games wouldn't use active mutable entities, but rather abstract into systems that change state at certain ticks, so state can be much more easily managed, reasoned about and optimized.

This sort of begs the question: where does classical OOP, the one taught to all undergrad CS majors in programs that use Java or C++, really fit in nowadays?


> This sort of begs the question: where does classical OOP, the one taught to all undergrad CS majors in programs that use Java or C++, really fit in nowadays?

I have no idea. The smalltalk ideal of OOP lives on in all sorts of ways, but the deep inheritance chain version that gets taught in classes? I don't think it has a place apart from maintaining existing code.


I like to call that style "modeling a taxonomy of the world". Even real-world carefully studied taxonomies change all the time, good luck adapting your code base when an employee is also a customer. It was a crap style from the very beginning.

Smalltalk style OOP has its own issues, but there's lots of good aspects to it and you really have to experience Smalltalk as a complete system to really get it, it's nothing like Simula/C++/Java-style OOP.

That said, best thing you can learn in your programming career is to get rid of "Customer" classes (or whatever the equivalent is for your particular problem). A customer is a unique entity represented by an ID, a private key if you will, which links data in various systems.

Need to split some part of your codebase that handles authentication into a separate microservice for whatever reason? No problem, the same key can be used to refer to data in that system.

Your program magically becomes more modular, more maintainable, easier to understand, and more performant, all in one go.

Use OOP when you need an abstract interface to something that can be implemented in various different ways, e.g. data structures or plugin systems, stuff like that. Any time I see "PODs" full of getters and setters I cry on the inside (and sometimes on the outside).


Would you mind explaining this part a little more - I do not get the difference between a Customer class and a customer as a unique entity represented by an ID?

"That said, best thing you can learn in your programming career is to get rid of "Customer" classes (or whatever the equivalent is for your particular problem). A customer is a unique entity represented by an ID, a private key if you will, which links data in various systems."


Hi q-base, it all comes down to the single responsibility principle. What is the single responsibility of the Customer class?

What methods should it have? What does a Customer do exactly? Or is it just a POD which simply stores customer data? If so which data? Does it mix authentication information with the user's purchase history? Maybe not that but what about the user's little Avatar in the UI?

The answer is none of the above: the single responsibility of the Customer class is to identify a user, that's it. In the end that's just a number; no need for a class. The purchase history of a user is only relevant to the system that manages the purchase history. The avatar is only relevant to the UI. The authentication information to the authentication service.

A Customer is not in and of itself an "entity" of some sort with associated behavior in the OO sense. But you see this all the time with companies trying to model these taxonomies of their business as class hierarchies.

It's an unhelpful practice that imposes a structure on your code that's not relevant in any way to the actual functionality of your software.
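In code the difference looks roughly like this (Python; the tables are invented):

    customer_id = 42    # the "customer" is just a key

    # each system owns only the data it cares about
    auth_credentials = {42: "pbkdf2:ab12..."}
    purchase_history = {42: ["order-1001", "order-1002"]}
    avatars          = {42: "cat.png"}

    # splitting authentication into its own service just moves one
    # table elsewhere; the key keeps linking everything together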


Thank you very much for the detailed response. Now I understand where you are going with it. It seems very powerful and would enable endless expandability. Because I have not thought about modelling at such an abstract level before, I am left with some questions as to how complex a model this leaves. I do not know if I am too locked into OOP-land, but I like that with this thinking you can have, for instance, an SSN table that references this ID and limit access to it at that level, instead of having to scramble parts of tables. I can also see it being useful for endless additions of tables that can relate to that ID.

But I cannot fully comprehend in which scenarios this would be a clear winner and in which it would add too much overhead. There is something to it though, so thanks for opening my eyes.


What is a POD?


"Plain Old Data". An OO term for a simple record/struct with getters and setters and no associated behavior.


GUIs actually map quite well to that OOP model. Different sorts of widgets where only the behavior is changed in children.


So does any hardware.


The deepest inheritance chains I saw in practice were the OOP GUI programming toolkits; they'd go six or seven deep and invariably started needing to clone logic to avoid diamond issues (they were always single inheritance, it seemed).

Honestly OOP did a lot better in those than the procedural toolkits they replaced in many ways. It was generally "better".

So if we're going to dump OOP for ??? I would say that a really amazing GUI toolkit that is pretty clearly better than the classical OOP inheritance model would really underscore the point. I can see a compositional interface GUI toolkit that is great.

I have since those days not done a native UI, so all of the modern ones, from KDE/Qt, GNOME, whatever the hell replaced MFC on windows, etc I am clueless about.

Plus all UIs are kind of dominated by HTML/CSS/javascript style of code, which is its own special evolutionary tree at this point.


> The smalltalk ideal of OOP lives on in all sorts of ways, but the deep inheritance chain version that gets taught in classes?

What is the deepest chain we find in Smalltalk itself? I can't check it right now, but I'm sure it's not as deep as some classroom examples I've seen. Inheritance makes a lot of sense when you are modelling the world, but most structures we see in the real world aren't that deep.


Just looked in the current Squeak trunk image as an example. The max depth of a Class from Object (rather than from the metaclasses like Behavior, ProtoObject, etc -- these to me are irrelevant to the count) is 8. There are very few 8, 7, or even 6 deep classes. The vast majority are between the 1-3 range.


> This sort of begs the question: where does classical OOP, the one taught to all undergrad CS majors in programs that use Java or C++, really fit in nowadays?

Hot take: nowhere. I mostly kid. But really I don't know of any domain in which strict inheritance and data hiding beats mixins/composition, interfaces, and dataclasses. It's so much easier to reason about the behavior of an interface in a given role, than an "Object" which spans all kinds of scopes.


> This sort of begs the question: where does classical OOP, the one taught to all undergrad CS majors in programs that use Java or C++, really fit in nowadays?

It is not clear that such strawman OOP was ever really taught, let alone ever practiced. Mainly it seems to exist as something to be disparaged.


I have worked on several very large codebases, for companies you've heard of, in which mutable state, OOP and inheritance have been used very heavily over the years. It's not a strawman: it has, despite everything I find wrong with it, generated working systems. Granted, I've found them very bug-prone and difficult to change compared to others, but there are very large sets of code and coders out there who use it all the time. It's also literally taught as if it's a fundamental building block, and that's what people will be using all the time.


Oh you sweet summer child :). Have an upvote just because I'm happy someone managed to avoid it. Did you know Facebook's iOS app has around 18000 classes? Yeah.

Edit: It's actually the iOS app, not the Android app, crazy either way. But yeah in university I was taught the whole "a cat is a feline which is a mammal which is an animal" shtick. Completely useless in the real world.


There is nothing wrong with ontological reasoning. A cat is a feline, how you express that relationship, or if it is even worth expressing, is another matter. We also learned “Cat(x) :- Feline(x)”, but never used that either (and no one ever derides the use of predicate logic in programming). I think we spent most of our smalltalk time covering metaobjects (I did my CS program before Java took over).


The problem with the OO-style taxonomy is that it doesn't just model the taxonomy but also structures your code. That's usually where the problem is, and it leads to issues like: what is the proper superclass, Square or Rectangle?

A square is a kind of rectangle, but the "API" of a Square is more restrictive than that of a Rectangle (where both width and height can change independently). I've seen this issue discussed at either this year's or last year's CppCon...


> The problem with the OO-style taxonomy is that it doesn't just model the taxonomy but also structures your code. That's usually where the problem is, and it leads to issues like: what is the proper superclass, Square or Rectangle?

“Rectangle” is a superclass of “Square”, if you are referring to the entities in geometry.

> A square is a kind of rectangle, but the "API" of a Square is more restrictive than that of a Rectangle (where both width and height can change independently)

Neither the width nor the height of either can change without changing the identity, just as neither the whole nor the fractional part of a real number can change without changing which number it is.

If you've got something with mutable side lengths, it's no longer a Square or Rectangle; it's probably some kind of Drawable that might have a mutable state variable for position and a mutable state variable for a shape. But geometric shapes themselves, like numbers, are immutable values (they can be in a mutable container, but then you are changing which one is in the container, not changing the shape while retaining its identity).


The Square vs Rectangle issue is just an example. They shouldn't even be modeled as a hierarchy; they should just be simple structs. It's just an example of how OO taxonomies are a silly (and detrimental) exercise.


> The Square vs Rectangle issue is just an example.

Yes, it's an example of a problem that has nothing to do with OOP and everything to do with applying model concepts from one domain to a different domain.

> They shouldn't even be modeled as a hierarchy; they should just be simple structs.

Whether they are structs or objects with state and attached methods is an orthogonal concern to whether they form a type hierarchy. It's true that geometric squares and rectangles make sense as value types like structs. They also form part of a natural hierarchy that it is perfectly useful to leverage in code.

> It's just an example of how OO taxonomies are a silly (and detrimental) exercise.

Except that isn't what it is an example of. It's an example of why you can't apply a hierarchy from geometry to things which don't represent the concepts that hierarchy applies to. It doesn't show that OO taxonomies are silly or detrimental.


We will just have to agree to disagree then. For me, the Square vs Rectangle inheritance issue is the simplest possible example of the problem, where trying to model an inheritance hierarchy "the right way" directly impacts the structure of your code in a detrimental way, and the solution is to avoid OOP silliness entirely.

Even in the immutable case if you have a Square inherit from Rectangle it still stores 2 separate fields for width and height (that it inherited), which are unnecessary. The only winning move is not to play.


You are building a straw man from your misunderstanding of the OOP technique. As 'dragonwriter' tried to explain to you, the mathematical model of a square being a kind of rectangle has nothing to do with OOP modeling. Liskov substitution principle tells you that a rectangle cannot be a super class of a square.


If modeling taxonomies of the world is not OOP modeling, what is exactly? The very reason Barbara Liskov had to point out the damn principle is that people were doing dumb stuff like this.

They still do btw. Universities still teach that Mammals have a walk() method that you implement for Dog and Cat, except then you need to add a Whale and now you're screwed. Silly example, but real world cases are much more subtle. I have seen plenty of NotImplementedExceptions strung across various codebases.

If you haven't done Domain-Driven Design, with cute UML diagrams and everything, then we aren't talking about the same thing (and I have no idea what you are talking about).

Reflecting the business domain into class hierarchies and structuring your codebase around that structure has ruined more codebases than NULL ever did. Code structure should not be dictated by business taxonomy concerns, only by concrete business data.


> If modeling taxonomies of the world is not OOP modeling, what is exactly?

It is, but modelling the taxonomy of a different domain than the one you are operating in is not. “A mutable object whose current state will always correspond to some square and can be mutated to correspond to any square” and “a mutable object whose current state will always correspond to some rectangle and can be mutated to represent any rectangle” (which are the entities in the domain being discussed) are not the same things as “a square” and “a rectangle” (entities in the domain of geometry), and taking the is-a relationship that holds between the latter and trying to apply it to the former is bad modelling at a level prior to how it reduces to implementation in any particular programming paradigm.

> Universities still teach that Mammals have a walk() method that you implement for Dog and Cat, except then you need to add a Whale and now you're screwed.

To the extent this is true, it's a pedagogical problem, not a paradigmatic one.

> If you haven't done Domain-Driven Design, with cute UML diagrams and everything

DDD is at least 3 decades more recent than OOP, and is not equivalent to it.


> It is, but modelling the taxonomy of a different domain than the one you are operating in is not.

What I'm trying to get across is that this happens all the time. All the freaking time. I will concede that attributing this to a flaw in OOP might be unfair to OOP, but this comment thread started from a comment on how this style of OOP modelling is a strawman. It's not a strawman because I've seen this happen all the time. Universities still teach it really poorly. It is still discussed in conferences.

Now, you can do proper OOP modelling that is actually useful. My approach is to use inheritance exclusively for the purpose of code reuse and even then only when it is trivially correct (if you need to think about it, that's enough of a sign to not use it). Interfaces are for is-a relationships so you can have a DAG instead of a tree (which is too limiting in practice). The other 3 pillars of OOP are very useful, meaning abstraction, subtype polymorphism and encapsulation.

But I only use OOP to model software entities that have actual behavior + data, not business concerns (I will rant about how "Customer" classes with associated methods are a bad idea till the cows come home, they break the single responsibility principle by default).

But, and this may be more of DDD problem than a OOP problem, I've seen people invest tons of effort into modeling UML diagrams where every class corresponds to some business concept, and then they think of all the little methods these classes should have, and then this gets turned into code.

The design is *always* wrong because the behavior is associated to the wrong data, and then you end up with the single responsibility principle broken, horrifically, everytime. The performance is abysmal because state is spread across RAM like my cats spread sand all over the house.

Maintenance is also a pain because you feel like you need to maintain this utterly suboptimal class hierarchy to fit the "business taxonomy" even if it doesn't help in any way with transforming data A into data B, so you add all these additional classes and abstractions and dependency injection to make it somewhat usable in practice.

Things get a lot easier when you focus on using paradigms as tools rather than ideals.


>If modeling taxonomies of the world is not OOP modeling, what is exactly?

You are still focused on a mathematical property of a square being a special case of a rectangle. Inheritance is not the tool to model such a relationship. Again it has nothing to do with taxonomy.

>Universities still teach,...

The same people teach functional programming with silly recursion examples and linked lists being the most important data structures.

OOP code can be convoluted, but a large imperative mess is worse. I haven't seen any large collaborative project written in a purely functional language, so I can't comment about that.


I'm not suggesting getting rid of OOP or switching to functional code. I've seen far worse horrors on "functional" codebases. Rather I'm merely complaining of something I've seen happen in practice, from companies that are "proudly OOP". The moment you turn a tool into an ideal, you end up with a mess. Doesn't matter what it is.

You're too hung up on the square vs rectangle example. It's just an example. Replace it with a big tree-shaped UML model encoding business concepts, and it's the same problem. Business concept relationships are a DAG or even an undirected graph and no edge in that graph should be given "preferential treatment" (as is the case for inheritance relationships, since they form a tree). The moment you do, you screwed up, and people do all the time.


> Liskov substitution principle tells you that a rectangle cannot be a super class of a square.

It does not. The Liskov substitution principle states, as it applies here, that a property that holds for all rectangles has to hold for all squares. This is true, as every square is a rectangle.


A constant rectangle can be a superclass of a constant square; a mutable rectangle cannot be a superclass of a mutable square (without some kind of type changing mutation?).

I think this generalizes to any "this is a more constrained that".


If the square and rectangles are not mutable, you can indeed treat a square as a rectangle (however you want to express that type). If they are mutable, then a square is of course not a rectangle.

In OOP systems that support predicate-style dynamic inheritance, rectangles pick up the square trait only when their width is equal to their height. We don't talk about such systems very much today (JavaScript does not have a notion of type, and updating an object's type is a mutable operation rather than one driven by a rule), but there were experimental systems from the 90s that looked at this.


Pretty cool to hear about those, didn't know there was such a dynamic approach to the problem.

Either way, the problem in my view is that the question is silly to begin with, Squares and Rectangles should be PODs (or, even better, simple structures), and not have any associated behavior. The decision of whether they are mutable or not then depends solely on the needs of the software, and not on irrelevant modelling constraints unrelated to the problem.

There's no need for them to have a taxonomy relation unless the problem being solved involves taxonomies of shapes.


What have you solved by having squares and rectangles as PODs? Now you won't be able to treat them as same in some way (e.g. IShape, IDrawable, IPrintable, IArea,...) and you will need twice as many functions for doing same things.


If you make a square and a rectangle into IShapes (with no inheritance hierarchy) you still need 2 separate implementations for GetWidth().

Either way, it's simply not true that you can't treat them the same way because you've made them PODs, there are more kinds of polymorphism beyond subtype polymorphism. Even for the latter modern languages split the data (structs) from the behavior (traits/protocols), see Rust or Swift for example.

If you explicitly need to treat squares and rectangles as a single entity to draw them (alongside other shapes), rather than thinking of each of them having a "draw()" function (likely breaking the single responsibility principle), you should instead have a function square_to_drawable, rectangle_to_drawable etc, where a drawable is also a POD but one that has the relevant drawing data (textures, triangles, whatever). Then a drawing system is responsible for actually rendering these drawables.

That's how it's done in modern game engines. It takes a while to wrap your head around, but it works really well, and it is used so heavily because performance is way better with this style.
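A rough sketch of that separation (Python; what a Drawable contains is invented here):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Square:
        side: float

    @dataclass(frozen=True)
    class Drawable:
        vertices: tuple    # whatever the renderer actually needs

    def square_to_drawable(s: Square) -> Drawable:
        return Drawable(vertices=((0, 0), (s.side, 0),
                                  (s.side, s.side), (0, s.side)))

    def render(drawables: list) -> None:
        for d in drawables:
            print("drawing", d.vertices)   # stand-in for the real renderer

    render([square_to_drawable(Square(side=2.0))])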


Everything you are describing is still OOP. Again, inheritance is not necessarily the best tool to model all relationships.


The most basic principle of OOP is the bundling of behavior and data. If you don't have that, you don't have OOP.

PODs + independent functions is not OOP. ECS-style designs are not OOP because ECS-style designs are PODs + independent functions (it's a relational model). OOP and the relational model are not equivalent, if they were the object-relational impedance mismatch wouldn't exist.

Polymorphism exists independently of OOP. OOP is not the only way to model relationships. OOP is not the only way to encapsulate data. OOP is not the only way to abstract things. Sometimes it is by far the best way to do all three. That's when you use it, not because of some silly attempt at purity of style.

People have a really distorted view of procedural-style programming where it's all global variables and copy pasted code. Couldn't be further from the truth. Even within OO codebases you have tons of procedural-style code that just happens to be stored in a class with the little "static" keyword in front. Many "procedural" codebases (i.e. stuff written in C) have lots of OO-style code with structs full of function pointers.

But it's wrong to say that it's all just OOP in disguise. If behavior and data are not bundled it's not OOP, and most code doesn't need that, not even when you need abstraction + polymorphism. Encapsulation is the main one where OOP is often the best approach.


> This sort of begs the question: where does classical OOP, the one taught to all undergrad CS majors in programs that use Java or C++, really fit in nowadays?

As everyone is telling you, it’s a poor paradigm that makes the developer’s job harder compared to both non-OOP approaches and better approaches (mixins / multiple inheritance).

That said, one should be cognizant of the reason it was created in the first place - single inheritance allows C++ to implement fast polymorphism via virtual function tables.


> single inheritance allows C++ to implement fast polymorphism via virtual function

Multiple inheritance can be implemented to be as fast as single inheritance regarding virtual function calls, and typically is in C++, by keeping multiple virtual table pointers per object. The problem this brings is that we can have pointers/references to different parts of the same object. A Child* may point to the start of the object, but a Parent* may point to somewhere in the middle (because that's where the parent's vtable pointer happens to be). This also means that casting a pointer can change it. This can introduce some "interesting" bugs, as you can imagine.

OTOH, there are languages like C# that don't do that, but require 2 "hops" when calling an interface method, causing a slight performance penalty (a class can inherit from at most one other class, but can implement multiple interfaces). But a reference to an object always points to the object's start, which is very important for the garbage collector.


C++ supports multiple inheritance though. And it still does it with (multiple) vtables. Did you mean single dispatch?


No, I just had an incorrect memory / mental model of how C++ vtables work.


Yes, but multiple dispatch is much slower than single dispatch.


Even if you're not a game dev, I urge you to look into ECS, just as a mental exercise. There are some excellent YouTube talks on the subject.

Many of us are 'stuck' writing the same old CRUD-variation apps. Learning about ECS is a great way to get those "original-programming-excitement juices" flowing :)

The gaming industry is renowned for some fantastic programming solutions, to eke out every bit of performance.

Anywhoo - makes for a nice change from arguing with colleagues over ORMs and Fat-vs-Thin models :P


I doubt any professional gamedev would reach for ECS. No major game engine has a finished ECS system yet. You would have to roll it all yourself and shoehorn it into the engine somehow.

Unity has not shipped DOTS. They actually removed it from the Package Manager last year. Joachim says it has a bright future, but I suspect it will never ship in Unity itself.

Epic has nothing in Unreal yet, though apparently a few months ago somebody spotted some changes in the repository that suggest one may be on the way.


> No major game engine has a finished ECS system yet.

The major engines may not have ECS built in, but some of them are supported by ECS systems that are readily available. Your implicit restriction of professional gamedevs to only people who use a major game engine and don't use third-party components not supplied by the engine is, I think, overly restrictive.


I expect the programmer who liked working with an ECS system would have a hard time making a business case for taking on the additional risk of a third party ECS foundation.

A traditional OO architecture will provide all the performance you need to create a modern Simcity style game without having to take on the additional problems of potential bugs in the underlying code, training everybody to understand and use it, and not even really knowing if the end result will be significantly faster or not.

Where I work we think long and hard about adding any third-party packages to the project. The benefits have to be very real, for the budget or for the player.


There are a tonne of open source, MIT licensed ECS libraries in pretty much every language you can write a game in. And, at the end of the day, ECS frameworks aren't all that complicated at their core, it's not hard to roll your own, though libraries will have better querying features and optimisations.

The benefit is to the developer, arguably the most important part of the game development process. It can save a lot of headaches that OO hierarchies can create, makes things more easily concurrent, allows for more flexibility in behaviour and emergent gameplay.

ECS helps to prevent bugs in game object logic, by keeping state and behaviour separate from its corresponding entity. If I'm looking for some behavior, I don't have to start thinking about which ring of the inheritance chain it ended up on, I just find the component or system that does that thing. It encourages you to write state and behaviour in a way that works with any entity and can be attached to anything at any time without crashes and state mutation problems or race conditions.


Dungeon Siege[0] shipped in 2002. Unity began their ECS journey by sniping talent from Insomniac. ECS is not an unproven concept in someone's head.

That Unity and Unreal are lacking (even after Unity's public efforts in the area) is because they are licensed en masse and retrofitting each with an ECS would be no small feat. And is that refactor worth losing revenue from obsoleted marketplace content? Or does the ECS need to interoperate perfectly with existing code from endless numbers of existing projects while still providing the benefits of a dedicated solution?

Unity and Unreal are not the only engines. In house engines are shaped by the needs of the immediate users, and not the sales pitch for hobbyists or third party studios.

[0]: https://www.gamedevs.org/uploads/data-driven-game-object-sys...


Does Freeciv use this approach? Or any other significant open source game or roguelike?

ECS sounds somewhat like how I was imagining a civ-like game engine while falling asleep a couple of months ago, wondering why Civ scaled so poorly under some circumstances: traditional games used to have everything in memory, but civ-type stuff could just have a database with some indexes and spatial indexes for quick lookups, and otherwise just sweep the various tables. Thus the size of your game would be constrained more by disk space than RAM.

Per your discussions with cache conservation/thrash avoidance, does ECS work well for mapping entities to specific processors so that cache-hopping doesn't occur in modern multicore processors and NUMA stuff?


what about when the abstractions age and a new feature for a system needs the data from the components from another system?

the long term answer is probably a refactor, but what's the common quick fix? copying/duplicating data between component types? systems that examine other components as well as their native ones? merging systems?

asking the hard question... how does it stand up to the unpleasant cases?


> asking the hard question... how does it stand up to the unpleasant cases?

It's the best tool I know of for a certain class of problem, that's all I'm claiming. Not evangelizing it as a magical cure-all for all domains, just explaining it in enough detail that it can be understood by someone outside the domain of gaming.

> what about when the abstractions age and a new feature for a system needs the data from the components from another system?

> the long term answer is probably a refactor, but what's the common quick fix? copying/duplicating data between component types? systems that examine other components as well as their native ones? merging systems?

I apologize if I gave the impression that things are so tightly coupled that a system owns a component or vice versa. (Though if I had to pick one, it'd be that components have associated systems)

In reality your components are just plain structs, and any systems that want access can query for entities based on any combination of components. (and good ECS impls allow for exclusion as well)

For instance I mentioned a Position component. This would be used by a movement system that checks for (Position, Velocity). It would also be used by a rendering system, which could query for (Position, Mesh) or (Position, Sprite) as appropriate. It could be used for collision by querying for (Position, Bounds).

If later you want to unload entities that are too far away from a player, you could perform two queries: (Player, Position) and (Position), filter out entities in the second query that are within n units of an entity in the first query, and then despawn what remains.

No existing data or systems need to change to allow multiple systems access to the same data. The only thing that might change is if the engine provides automatic parallelization, and you have two systems that mutate the same data, you may need to define an explicit ordering for them and they would not run simultaneously. If you don't need explicit ordering you may not even need a code change in this case.

***

In the spirit of what you're asking though, let's say you've been making a tile-based game where the player can move between discrete spaces like in chess, but you decide you'd rather have free movement like Mario. Your Position component uses integers instead of floating point values and you don't want to change your world's scaling. Here you have a few options:

0. You could just bite the bullet and change the types on the Position component to floats instead of ints, and let your type system guide you to any errors. Then you run your test suite again and make sure that everything is still behaving as expected. I'd also plan on creating several new tests based on analyzing existing uses of Position. And of course play your game again to make sure it feels right.

1. You can store the fractional position in a separate component, PositionFraction, and create/update systems as needed. Movement would need to be updated to look for (Position, PositionFraction, Velocity) and rendering would need to be updated to look for (Position, PositionFraction, Sprite). Meanwhile pathfinding could still just look for (Position, Goal).

2. You can create a second component that holds the full float value called PositionFine. Like above you update the systems that care about fine-grained positioning to use PositionFine instead of Position. Then you create a system to update Position based on PositionFine's value or vice versa, and log anytime there's a discrepancy. Once you're confident that you can drop Position, you replace every use with PositionFine. Rename afterward as desired.

If I'm changing the meaning of an existing component like this, #0 would be my strong choice, but if the game is already in production and the migration needs to go over super smoothly I'd consider #2 as well. In particular you can treat the existing logic as the source of truth, but run the new logic side-by-side and log out any variation between the two for analysis before flipping the switch and preferring the new logic.

However if the new data is purely additional and makes sense being split off from the existing data, then #1 is the solution to use. Think migrating from displaying usernames over players' heads to letting them choose a custom name. You still need the username for authentication, save games, friends lists, and the like. But a new component for the display name lets you reference that when it makes sense.


thanks for humoring me! i'm writing because i'm genuinely interested.

so it seems, then, that the discipline in building a system in this fashion revolves around responsibility for updates (which system is the sole updater of a given type of component) and the sequencing of those systems (ensuring that all the systems that update components used by a given system have completed their updates, possibly with some sort of record-level update indication that allows downstream systems to begin processing before upstream systems have completed all their records).

do any of the ecs libraries provide facilities for these problems, or are they typically built into a game engine framework?


That’s a very good question, because sequencing tends to be the place where implementations vary the most.

Usually there will be some way to sort systems into steps and define the order of those steps. Some require you to hardcode the order, others define constraints like “after input” or “before physics” and attempt to solve those constraints, and some let you define groups that can run together and you define the order of the groups. Explicit signaling is not typically built in, but you may be able to implement it yourself if desired.

In environments with an expressive type system, most tend to favor the constraints model. Otherwise ordering tends toward the hardcoded approach, typically implicitly in registration order.
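As a tiny sketch of the constraints model (Python; the system names are invented):

    from graphlib import TopologicalSorter

    # each system declares which systems must run before it
    constraints = {
        "physics":   {"input"},
        "animation": {"input"},
        "collision": {"physics"},
        "render":    {"collision", "animation"},
    }

    # the scheduler solves the constraints into a valid order
    print(list(TopologicalSorter(constraints).static_order()))
    # e.g. ['input', 'physics', 'animation', 'collision', 'render']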

As to managing what systems are responsible for updating a given component’s state, that tends to be left to the game developer.

Sometimes there is a concept of events and if it’s not obvious who should mutate something then systems that want to mutate will send events that later systems can consume. For instance an input system and an AI system might both send a Move(entity, direction) event that a later system validates and applies.

And because it’s cute to do so, often times those events will be implemented as components themselves, with convenience wrappers to make it feel more natural to end user developers. This can come in handy for networked games and debugging in editor. You could also use it in game, such as displaying a unit’s next planned move in a turn-based game.


interesting. this is really cool!

in the automatic constraint solver variants (which is more interesting to me) are the schedules static (precomputed at compile time) or do they run as a dynamic scheduler of sorts and are the constraints statically checked for deadlock ahead of time?

this architecture is exciting!


Compile time would be pretty sweet. I've never looked into it so I don't know for sure.

Don't get too caught up in the hype, it never ends well no matter what the architecture. But it's certainly useful and fun to play with.


Hot take: OO is more powerful when you embrace stateful objects. As long as you are dealing with stateless objects, many other techniques have plenty of advantages.

But, consider, OO grew in a time when the likes of Logo was strong. How do you draw a square in Logo? Usually, some form of:

    pen down
    repeat 4
        straight 10
        right 90 degrees
But this /only/ works if you keep track of the state of the system in your mind while you are figuring it out. Which works really well if you are being taught that your program doesn't exist in and of itself, but exists to manipulate something else.

Functional advocates lose many learners because they don't acknowledge that writing a local function can be done with the more global state in mind.


But then you're mixing up the state of the system with the shape you want to draw. If you were now working with 2 pens, you'd have to rewrite your shape from scratch, not just your rendering, to speed up the output.

Better to separate the shape data, which is immutable (and basically declarative), and the rendering method, which does need to know about the previous work which was already completed and what it is doing right now.
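
A minimal sketch of that separation (in Rust; Line, Pen, and the round-robin renderer are all made up for illustration):

    // The shape is immutable, declarative data; only the renderer knows about pens.
    struct Line { from: (f32, f32), to: (f32, f32) }

    fn square(size: f32) -> Vec<Line> {
        let pts = [(0.0, 0.0), (size, 0.0), (size, size), (0.0, size)];
        (0..4).map(|i| Line { from: pts[i], to: pts[(i + 1) % 4] }).collect()
    }

    trait Pen { fn draw_line(&mut self, line: &Line); }

    struct ConsolePen(&'static str);
    impl Pen for ConsolePen {
        fn draw_line(&mut self, line: &Line) {
            println!("{}: {:?} -> {:?}", self.0, line.from, line.to);
        }
    }

    // Speeding up output with two pens changes only this function,
    // never the shape itself.
    fn render(shape: &[Line], pens: &mut [&mut dyn Pen]) {
        for (i, line) in shape.iter().enumerate() {
            pens[i % pens.len()].draw_line(line); // round-robin across pens
        }
    }

    fn main() {
        let shape = square(10.0);
        let (mut a, mut b) = (ConsolePen("pen A"), ConsolePen("pen B"));
        render(&shape, &mut [&mut a as &mut dyn Pen, &mut b]);
    }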


Maybe. There is a reason gcode exists. Sometimes you really are controlling a single pen.


Gcode exists because hardware abstractions are even leakier than software abstractions. When pixels go onto your screen, you don't assume to know better than the guy who wrote the driver for the graphics card how they should get there. When plastic gets deposited on a 3D printer, the way in which it is deposited actually affects the properties of the resulting object. Same for a CNC lathe or milling machine, although to a lesser degree.

There are of course also historical reasons, when it would be a central mainframe that would generate the gcode, and then it could be executed many times by cheaper computers attached to the machines. There was even a point where a lot of gcode was written, or at least edited, by hand. In these modern days of compute excess, gcode probably wouldn't have developed to the extent it did, and we'd be distributing STLs with some metadata around tolerances, materials and primary stress directions and the machines would figure it out themselves. The equivalent of gcode would just be used as a communication protocol between the interface and the motor controllers.


A polygon is a set of lines

A square is a polygon with four lines of the same length, forming 4 equal angles of 90°

A drawing can be made given a pen and a shape

The output is a drawing of a square made with a pen


And a line is a set of points. Your point?

That is, yes, to an extent you can expand your vocabulary to include more shapes and find ways to compose them. Or you could peel back and literally write point values. Both have uses. And both are used.

Consider, we aren't forcing all art assets to be created and described using elementary shapes. For many of those, we are closer to literally writing the canvas by hand. And then retroactively glueing to program.

It is a shame we often force a single paradigm with our code bases.


I personally think "encapsulation", as used in OOP, is a misnomer. State is usually not encapsulated, it is just hidden. Proper state encapsulation would be to use mutable state internally for efficiency, but for that state to be unobservable externally.

OOP does unfortunately encourage introducing mutable state into the domain model. The canonical example being the bank account, with a mutable bank balance!

The good parts of OOP are interfaces and first-class modules. Obviously we should try and keep those.
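
A tiny sketch of what I mean by proper encapsulation (Rust; the function is arbitrary): mutation is used internally for efficiency, but callers can only observe a pure, value-in value-out function.

    fn sorted_unique(input: &[i32]) -> Vec<i32> {
        let mut v = input.to_vec(); // private mutable state, never escapes
        v.sort_unstable();
        v.dedup();
        v
    }

    fn main() {
        assert_eq!(sorted_unique(&[3, 1, 2, 3]), vec![1, 2, 3]);
    }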


> Proper state encapsulation would be to use mutable state internally for efficiency, but for that state to be unobservable externally.

This is literally how private/public keywords work, so I think your criticism is unfounded. However, I do agree with the overall sentiment that OOP implementations tend to "leak" way more state than they need to.


I think you misunderstood my point. Private may protect direct access to mutable state, but the object may still have mutable state that is observable externally and must be reasoned about. In which case, the mutable state is not truly encapsulated.


That is not hidden by immutable data either - the state is just global in the latter case, in a way.

There is no getting away from essential state, from a theoretical point of view. In my opinion the often leaky, partial encapsulation of state is still one of the better ways to deal with it. And immutability is another axis, so of course in the frequent case where it makes sense it should be adhered to, but not religiously.


It is much easier to reason about immutable state than mutable state, both for humans and compilers, especially in a distributed setting (any modern piece of hardware).

The problem with "leaky" encapsulation, as you put it, is the combinatorial explosion of the state space as many stateful objects are composed.

Most mainstream programming languages are still unfortunately not well geared up for working with immutable data; they lack even persistent collections. Functional languages are of course ideal for this.


> The problem with "leaky" encapsulation, as you put it, is the combinatorial explosion of the state space as many stateful objects are composed.

I don’t think it necessarily has to be more than with an immutable approach. Like, there is an essential amount of state you will have to have in either case, and it is not clear to me that immutability is always the better choice.

But don’t get me wrong, I also default to immutable data structures, I’m just saying that 1) OOP is not incompatible with FP 2) not every problem is solved better with FP-idioms. It’s not accidental that haskell has state monads as well.


Ah, I see what you mean, though I do think that immutability purists like yourself have their own monsters to contend with (even apart from the obvious performance hit).


Absolutely we do! For example, garbage collectors are complex beasts that are hard to tune and tough to scale. But of course some OOP languages have started using GC anyway, in which case, I honestly believe they might as well be functional programmers :)


if you're writing something like a wrapper service around a database, there should be no state at all. (i'd argue it's high time that databases moved forward with respect to security and hardening such that they can be accessed nearly directly or... directly)

if you're building a thing where state is required, then yes, it should be minimized. but i'd argue against dogmatic use of immutable data records and really think about what that sort of design is attempting to achieve: reduction of sprawl in places in the code where a piece of data is updated.

the goal is to put all that stuff in one easy to find place. that can be manifested or violated (sometimes with great acrobatics) regardless of whatever rules are followed. (even oop!)


> it's high time that databases moved forward with respect to security and hardening such that they can be accessed nearly directly or... directly

Check out Hasura, Postgraphile and Postgrest.


> I worry a little that my view is overly simplistic, or maybe applicable only to domains that I have worked in. If anyone wouldn't mind poking holes in this argument or offering examples I would appreciate it.

I think this line alone is proof that you're doing this right.


I used to think that the solution to mutable state was to prohibit it and code with immutable structures all the time, but after a few years of Rust, I think it's the wrong approach.

The right way to handle mutable state is not to pretend it doesn't exist but to accept it as a reality of complex systems and to encode its management in the type system of the language. And that's exactly what Rust does.

With Rust, I no longer feel dirty whenever I have mutable state and I trust Rust to not just keep my code bug free but also to make me think carefully about mutable state and how to design my code with it in mind.


Rust prohibits shared mutable state as part of the basic language (or rather its 'safe' subset), relegating it to special-cased "interior mutability" constructs. This is essentially "as immutable as you can get" in a low-level, systems programming language. (Other languages can thread state mutations explicitly as part of a generally "immutable" design, but that doesn't give you support for the expected low-level features, so instead it's part of the language in Rust.)


It's definitely that, but to be fair the problem is caused by class-based programming, not OOP itself.

Put state on an object only if there is a hard requirement for it. The occurrence is incredibly rare; state is mostly introduced to save re-typing method arguments...


Games are actually moving away from OOP by separating out state into a data oriented system.


But aren't they doing that mostly for performance, via better memory layout and thus better cache locality (i.e. arrays of objects vs. objects of arrays)? It's a sacrifice of code architecture for performance. I feel like games are actually a good example where OOP makes sense, since there is inherently so much state, and encapsulation is useful.


> I feel like games are actually a good example where OOP makes sense

Intuitively, yes.

In practice, I think once you actually use an ECS for a game you won't want to program them in the traditional OOP way.


It's every game dev's first mistake to start adding, say, weapon types to their game by first creating an Item class, then creating a Weapon subclass, Sword inherits that, Shortsword inherits that... And so on. Makes sense, right? It's exactly what OO is for.

Except games rarely have that rigid a structure, and maybe you want a sword that also has laser gun features, or an axe that's also a magic staff. That's a lot of subclassing and further dividing up your game objects into rigid hierarchies that become difficult to modify later without unintentionally breaking stuff (was it the logic in the Sword class that grants that damage bonus, or in the Weapon class?)

It's nicer to be able to say "I have a weapon entity, I'm going to attach the SliceDamage component, the MagicAttack component, and so on, and call it a magic axe." It's easier to maintain, it can change at runtime, and the behaviour code only exists in a component file that only does that behaviour, rather than nestled in some lowest common denominator of a hierarchy of classes.
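
A rough sketch of that composition style (Rust, with plain per-type component stores standing in for a real ECS library; all names made up):

    use std::collections::HashMap;

    type Entity = u32;

    struct SliceDamage { amount: u32 }
    struct MagicAttack { mana_cost: u32 }

    // Components live in per-type stores, not in a class hierarchy.
    #[derive(Default)]
    struct Components {
        slice: HashMap<Entity, SliceDamage>,
        magic: HashMap<Entity, MagicAttack>,
    }

    fn main() {
        let mut c = Components::default();
        let magic_axe: Entity = 1;
        // A "magic axe" is nothing but this combination of components,
        // and it can gain or lose them at runtime.
        c.slice.insert(magic_axe, SliceDamage { amount: 12 });
        c.magic.insert(magic_axe, MagicAttack { mana_cost: 5 });

        // Systems query whichever entities have the relevant component,
        // with no idea what "kind" of weapon they are.
        for (entity, dmg) in &c.slice {
            println!("entity {} slices for {}", entity, dmg.amount);
        }
        for (entity, spell) in &c.magic {
            println!("entity {} casts for {} mana", entity, spell.mana_cost);
        }
    }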


Bevy is an ECS based engine, and it’s actually very nice to use. It adds a lot of ergonomics to that experience.


I think I need help understanding this.

My understanding of data-oriented systems is that there's a desire to put related data physically close in memory. It doesn't remove state, it just packs it differently.

To be honest, I'm confused by most of the comments here.


One of the benefits of a data oriented ECS setup is that it's much more cache optimal due to the way data is packed, but that's not necessarily the primary reason to use it. In fact, on the data packing implementation side, there are actually differing ways in which the data can be packed, with differing performance pros and cons depending on type of data / access pattern.

Really, the primary benefit is that you get a much cleaner paradigm to work with. Here's a good basic overview of the point of an ECS system: https://iolivia.me/posts/entity-component-system-explained/

In particular I'll point to the key quote in that post that distinguishes ECS from OOP encapsulation: "The whole idea is separating behaviour from logic, so all the data goes in components and all the behaviour goes into systems."


I bet save / load is a lot easier too! :)


In my experience many game companies were never big on OOP in the first place.


Weren’t games already written with entity component systems?


I’m happy to be corrected as I’ve only really dabbled with Indie game development on and off but my general impression is that the paradigms have gone procedural -> object oriented -> ECS.

For example, Unity is designed around GameObjects but is now building out the DOTS framework stack, which is more of a first-party ECS system.


Nobody has ever been able to convince me that these informal object systems (eg ECS) aren’t really just OOP with different extensibility mechanisms and/or a different place to stow away object state (oh, and let’s call the objects entities instead).

The only real diversion from objects in games that I’ve seen is the work in Andrew Kennedy’s thesis on FRP for games.


Yes and no.

The EC part of ECS is definitely object oriented - more so than mainstream object-oriented systems / languages, in that it favors composition over inheritance (i.e. entities are compositions of components).

But the S in ECS breaks the most fundamental tenet of object-oriented programming - encapsulation of state. And this separation between entities/components and systems is what makes data oriented programming so much cleaner and more powerful than OOP.


If the state is encapsulated somewhere else, why is it no longer OOP? I don’t think Alan Kay would say that having each object simply be a handle into a table of values meant they could no longer be called objects. Heck, if we take the FactoryFactory example as being the bad thing about OOP, and consider that this comes from the GoF design pattern book, there is already a pattern in that book that closely resembles ripping the state out of an object and putting it somewhere else. Again, it is definitely not strawman OOP, but not very many programming systems are, especially modern ones.


Encapsulation in OOP terms means the state is encapsulated with the logic that fetches / mutates it. ECS explicitly is designed to do the opposite. It’s really that simple.

The issue isn’t about how you store the objects but about whether the framework logically encapsulates object data with its associated logic.


> Encapsulation in OOP terms means the state is encapsulated with the logic that fetches / mutates it.

No it doesn't. There are plenty of examples in OO where that isn't true, or no physics engine that's been built in the last two decades would work at all. The only core thing needed for OO is that objects have unique identities that then enable a notion of associated state at all. A pure system would not allow for an unbounded number of unique identities at all; you would not really be able to talk about objects at all (or entities or whatever object synonym is preferred).


> No it doesn't.

Yes, yes it does.

> There are plenty of examples in OO where that isn't true, or no physics engine that's been built in the last two decades would work at all.

It would be surprising if most physics engines were built in an object-oriented way. The only one I'm familiar with, ODE, certainly isn't object-oriented.

> The only core thing needed for OO is that objects have unique identities that then enable a notion of associated state at all. A pure system would not allow for an unbounded number of unique identities at all; you would not really be able to talk about objects at all (or entities or whatever object synonym is preferred).

No, objects in OOP have not just their own unique identities but have their own behavior. Encapsulation is the most fundamental pillar of object-oriented programming. Without that, any procedural C code that makes use of structs could be called object-oriented, since structs would then be synonymous with "objects".


ECS is a type of relational model, it has nothing to do with OO.

EDIT: just to make it clearer, both Simula Style OO and Smalltalk Style OO bundle data and behavior together. The Entity owns the Components and the System.

ECSs are much more like relational databases where an entity is a primary key, components are tables and systems are queries (that also mutate the database). The entity is subservient to the systems, not the other way around.

If somehow a relational database is in any way equivalent to OO then we've reached peak clown world as far as terminology goes.


I'm no expert, but here are my 2cents.

One of my "ahhh moments" with ECS was when someone pointed out that ECS has a very 'strict' boundary between DATA (attributes, entities, components) and BEHAVIOR (methods, systems, functions).

The 'traditional OOP' usually 'smashes/hides/encapsulates' these two conflicting powers into one 'Entity'.

I think it's 'easier' to write a ball-of-mud in OOP for games than it is to write a ball-of-mud with ECS.

*NOTICE I said 'easier' not 'impossible'

I love the distinctions between BEHAVIOR and DATA, from an organizational point of view. The bigger the whole system is the more value I think a good programmer can get with ECS.

Imagine having to "implement" gravity for 100 different types of 'Objects'. The ECS way would be to have a System (Gravity) that executes the same computation on all entities with, for example, 'Mass' and 'Pos' components (sketched below, after the TL;DR). The entity would usually not need to know ANYTHING about gravity. Sure, the OOP way one can inherit, but eventually you inherit yourself into a corner (the diamond problem).

My 2cents - YMMV

TL;DR (just squint hard enough)

ECS ~= (Data | Behavior)

OOP ~ (Data + Behavior)
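
Here's the gravity example as a rough sketch (Rust; the field names and slice-based storage are just for illustration):

    struct Mass(f32);
    struct Pos { y: f32, vy: f32 }

    // One system, applied to every entity that has both components.
    // The entities never need to know gravity exists.
    fn gravity_system(masses: &[Option<Mass>], positions: &mut [Option<Pos>], dt: f32) {
        for (m, p) in masses.iter().zip(positions.iter_mut()) {
            if let (Some(_), Some(pos)) = (m, p) {
                pos.vy -= 9.81 * dt;
                pos.y += pos.vy * dt;
            }
        }
    }

    fn main() {
        let masses = vec![Some(Mass(1.0)), None];
        let mut positions = vec![
            Some(Pos { y: 100.0, vy: 0.0 }),
            Some(Pos { y: 5.0, vy: 0.0 }),
        ];
        gravity_system(&masses, &mut positions, 1.0 / 60.0);
        // entity 0 fell; entity 1 has no Mass component and was skipped
    }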


I believe that we're interacting on a server that was written to these principles.


What is state? Every local variable is state. I can convert any global variable to a local variable or function argument with trivial changes. Does that mean every local variable is the real enemy? Only an empty program is perfect. But useless.

I'd say it's not just that state is the real enemy; it's all about the lifetime of particular bytes. `(x, y) => (x + y)` is a good function. Its state is discarded pretty fast. `(x, y) => (z) => (x + y * z)` is probably a bad function; its state is preserved for a long time. A global variable's state is preserved for the whole program run. Database state is preserved for the whole software lifecycle, or maybe even longer.

And it's not even that state is the enemy. It's about how much time you need to spend designing a particular piece of code. When you're thinking about database structure, that's a real deal: take a lot of time, because that's important and the decisions will have impact for years or even decades. When you're designing a pure function which does not leak any state, you don't need to think much at all; just slap something together and move on, since you can easily replace it later if the need arises.

TLDR: prefer state with a short lifetime; be very careful with code that works with long-lived state.
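
In Rust-flavored terms, the two functions above look roughly like this (a sketch; `make_op` is a made-up name):

    fn add(x: i32, y: i32) -> i32 {
        x + y // state dies as soon as the call returns
    }

    fn make_op(x: i32, y: i32) -> impl Fn(i32) -> i32 {
        move |z| x + y * z // x and y now live as long as the returned closure
    }

    fn main() {
        assert_eq!(add(2, 3), 5);
        let op = make_op(2, 3); // the captured state outlives the call that made it
        assert_eq!(op(4), 14);
    }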


Mutable state, you mean, yes?


I think so, though I had in mind internal (hidden/abstracted) vs explicit state. And yeah, only in very few cases would you want mutable state: abstracting naturally transient entities like network connections, times where memory or performance constraints demand entity reuse, etc. Though if you end up in that situation with no guardrails around that state (i.e. it should usually be a finite state machine), you're definitely going to have to work a lot harder.


Preach it brother.


I have not read the article but I've seen other blog posts on the subject. The issue with "OOP is bad" is that OOP means different things to different people.

Abstract Data Types are sort of a subset of OOP and are massively useful; I certainly don't think it's a good idea to expose the internal implementation of a data structure most of the time. Any sort of plugin system works in an OO manner. It is a useful tool, no question.

Even things like Actors like in Erlang (which are much closer to the Alan Kay style OO) are massively useful in a distributed setting where state is naturally distributed.

Where you get into trouble is when you go the whole "lets model a taxonomy of the world" style with inheritance hierarchies or go ham on design patterns, layers upon layers of abstraction and other nonsense like that.

The saddest thing about the "object–relational impedance mismatch" is that the focus went totally to the wrong side. The relational model is a much nicer way to model relations than a graph of objects (that's the whole point after all). SQL sucks but that's a separate issue, Entity Component Systems are a form of relational modeling for example and work really well, or even better Datalog.


The problem with the "inheritance isn't part of OO" take is that every single one of the languages we call object-oriented that gained mass adoption has inheritance, including Alan Kay's own Smalltalk. There have been entire books written about object-oriented design describing ways to use (or not use) inheritance in program design. So whether or not it was intended to be important by the author of the term is immaterial; it is, for all intents and purposes, a crucial part of object-oriented languages.

Inheritance is not the only problem with OO either; many things are straight up awkward to express when you must couple data and code together. Many 2+-argument functions with disparate parameter types create confusion about which class "owns" the definition of the function.

You also don't need to marry data and code to get the benefits of encapsulation. Many functional languages in the ML family have developed clever solutions to encapsulate the definition of a datatype while exposing it to module-local functions. There's no need for the datatype to carry a method around with it to support this.


"The saddest thing about the "object–relational impedance mismatch" is that the focus went totally to the wrong side. The relational model is a much nicer way to model relations than a graph of objects (that's the whole point after all)."

I found the same thing. When I was using ORMs I always found them clunky for all but the simplest tasks, where I would long for an easy way to use SQL and have it "just work" for objects, so I created this:

https://github.com/iaindooley/PluSQL

It's obviously not been maintained, but I think it's a model that has legs: that is, simply the creation of SQL with some convenience methods, allowing the use of completely arbitrary SQL, and then intuiting the object mapping automatically without loading the entire result set into memory.


There are many such micro-ORMs; C# has a few extensions for Dapper, as well as several newer projects that directly provide different query builders and convenience methods.


Do relational models support sum types? I find them an essential feature in programming languages, nearly as important as structs or rows.


Unfortunately many don't. Standard Datalog for example doesn't have disjunction which you need to model sum types as relations.

My preference is to have sum types modeled separately from relations as just a complex type that can be related as well. This is the approach taken by Souffle, a C++ Datalog implementation which supports sum types.


For some reason I cannot edit the post above, but I misspoke on Datalog, since it has disjunctions, just not in the head of rules. You can go pretty far with disjunctions in the body which suffices for most cases of modelling "sum types".

For example:

    Stakeholder(X) :- Shareholder(X); Customer(X); CEO(X).

It's still a lot better to model them separately from the relations, for the same reason that you want to model relations between points as actual relations between points not between X, Y and X, Y.


My issue with OOP is: Design Patterns: Elements of Reusable Object-Oriented Software.

I don't take issue with the authors, with their insights, or anything related to the content of the book. It's that the book exists at all: it's a book filled with solutions to imaginary problems.

When using a procedural language the first thing you do is start implementing a solution. When using OOP, you first have to solve the imaginary problems created by OOP, only then can you start down the path of typing out a solution.

It's significantly more difficult to refactor OOP code (even with great tools) than it is procedural code.

If you could spend 20% less brain power, possibly 40% fewer keystrokes to describe the exact same solution, why wouldn't you?


>solutions to imaginary problems

This is a fundamental misunderstanding of what patterns are. The GoF book is used to this, though.

A design pattern is something that will naturally crop up if you adhere to certain design principles. If you follow a principle of separating instantiation logic from other logic, then you will start to see factories. If you combine multiple complex parts of your code into simpler ones, then you will see facades.

GOF is a reference book for some patterns that have been observed as being common, with examples that are essentially academic. It is not a how-to guide for OOP.


But many programmers use it as a how-to guide. I've had people tell me "but that's how it's done in GOF", even if the design pattern was a poor fit for the problem at hand, just because there was some superficial resemblance to one of the examples in the book.


Ask them to re-read chapter 1 or 2, where it says in black and white that this is not a set of mandatory rules. (Same for Clean Code, by the way, which is often taken as gospel against the wishes of its own author!)


I remember, as a junior dev, trying really hard to find cases where I could use a pattern in my code.


but this is not the fault of the book but of bad teaching.


That exact sentiment exists deep in my comment history here: I think I used the word "blueprint", if you care enough to fact-check that. That is why I said I don't take issue with the content of the book.

My issue is that the book is useful. It helps solve the artificial complexity introduced by OOP.


Agreed that it's never good to have new social problems created by whatever you did to solve the previous problem. But that's the nature of doing stuff.

I own a car. Great, now I have maintenance concerns. But it's still a net positive, which is why we do it.

If OOP creates more work than the value it brings, then we should scrap it. But I've seen some pretty bad procedural code, so I don't think that's a given.


I see. It was the ambiguity over your usage of 'imaginary problems'. But yes - your previous comments seem to suggest we're actually on the same page regarding patterns.


But even factories and all these terms are very vacuous, and only present because objects give bad solutions.


I put it to you that the OOP concept of a factory exists because in OOP your factory is often (though not always) a type, and types tend to need names.

In e.g. FP there are no objects and therefore no instantiation but you can separate this kind of logic all the same, except you wouldn't call it anything, or would name it like any other function.

For the record I've been doing mainly OOP and have dabbled in F# so I'm happy to learn something today.


That's the whole point. OO gives you a class as the base, forcing you to re-extract limited instantiation logic into another thing, when it's just functions from a to b. And people get to waste time on this.


No - that's not what factories are at all.

If you want a simple function to create objects, you have those in OOP languages. They're called constructors. They don't even have names independent of the thing they construct, so they're as lightweight as you can get. And if you do want names, you have static methods or top level named functions (depending on language).

Factories exist for a few different reasons:

1. When there are more things to configure about the newly created objects than makes sense to pass as function parameters.

2. When an API designer wishes to provide backwards compatibility to his clients, which adding function parameters doesn't do.

3. When there's a need to separate how something is constructed from where it is constructed, i.e. you create a "factory" or "builder", configure what you want it to do, but then don't directly use it. Instead you pass it off to some other code that uses it when it needs to.

These are fundamental concepts. They aren't some side effect of how OO languages work. I think the disdain for them comes largely from programmers who haven't actually done many different types of programming. If all you've ever done is write web apps, then yeah, they're sometimes going to seem a bit useless. If you've shipped a type-safe library API whose compatibility you've needed to maintain over a period of many years, maybe one that will be used in ways you didn't anticipate so you can't just arbitrarily refactor yourself out of a hole, suddenly these sorts of abstractions start to look pretty good.
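
A rough sketch of point 3 in closure form (Rust, with made-up Widget/build_ui names): the factory is configured in one place and used in another, and whether it's spelled as a class or a closure is incidental.

    struct Widget { color: String, size: u32 }

    // This code decides *when* to construct widgets, not *how*.
    fn build_ui(make_widget: impl Fn() -> Widget) {
        let w = make_widget();
        println!("{} widget of size {}", w.color, w.size);
    }

    fn main() {
        let color = "red".to_string();
        // Configure how widgets are constructed here...
        let factory = move || Widget { color: color.clone(), size: 16 };
        // ...and hand it off to code that uses it later.
        build_ui(factory);
    }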


point 3 is exactly what I'm saying


But the concept of a factory is independent from whether you name it. In Java you can write a factory like this:

    () -> new Thing(foo, bar);
Behind the scenes it's a class, but you don't write it as such and you don't name it. Nonetheless, it's still a factory.


It's a vacuous concept.

Also, you're leveraging anonymous function syntax, which is interesting, to say the least. In pre-Java-8 code you'd have to write a boilerplate class named after a pattern. Lastly, creating something from outside information has been done since forever; it's trivial when you don't invent ceremony around it. Hence my conclusion: it's a waste of time and effort.


No, because you still need to name the thing the function is assigned to. What exactly should the user of the function call it? thingMakingFunction? You need a consistent way to refer to "a function call that creates an object in a particular state" and factory is as good a word as any.

Yes, in Java 7 or below it'd have required more verbose syntax. So what? That came out 8 years ago and Java isn't the only OO language with a notion of factories. C# had delegates much longer than Java had lambdas, and it also needs a way to talk about "a bit of code that produces objects complying with a contract", so also talks about factories.


Factories really exist because classes are not programmable in many popular OO languages as they are in Smalltalk. Languages inspired by C++ (Java, C#) tend to have factories. Microsoft used the unfortunate term "ClassFactory" in COM/OLE when they really meant one or the other, as it is a factory of objects just like a Smalltalk class.

Other OO languages don't have lots of explicit factories. I'm thinking Objective-C (abstract classes can choose the appropriate subclass for new objects), Python (packages tend to have instantiation functions), Ruby (very flexible), CLOS, etc. So the factory is really a design pattern that popped up to address a specific language deficiency caused by losing a lesson discovered in the late 70s (metaclasses).

Other design patterns are really quite useful and I don't see how they can be called deficiencies. Facades, for example, are widely used in OO and non-OO systems. Template method is duplicated similarly in procedural languages that allow dynamic dispatch, in any place where you need a policy that delegates parts of its fulfillment to a more dynamically selectable part. Iterators, strategies, these things have widespread use. Builder and Prototype are frequently used outside of OO. Adapter, Bridge and Proxy also appear in various ways without OO. Many of the other patterns are OO specific mainly because there is no desire to send a message to an object in other languages. For example, Flyweight is unnecessary if you don't send messages to objects.

Without OO, a lot of domains would end up with some sort of implementation of the same thing. OO itself is a pattern in this sense. You can write Javascript in a functional way, but it makes sense to encapsulate the HTML AST and not have its memory manipulated directly by the language. Does it make a lot of difference whether you put the node as the first parameter to a function or whether you use it as the target of a message? As some functions apply to only certain kinds of nodes, does it make more sense to document those functions by the list of known nodes that they apply to, or as a hierarchy of nodes?

People often say that inheritance is the problem, but you look at a successful (in terms of utility) object system like Smalltalk and it has quite a lot of inheritance. Refactoring leads to that. Sometimes a parent class might exist just to provide one implementation of a single method for two different kinds of object. Whether this is accomplished through inheritance, mix-ins, monkey-patching, protocols, multiple dispatch, or whatever, is somewhat immaterial. But when a language makes that difficult, and you have to copy-paste to get the same functionality in multiple places, that can be a problem.


Fair point; I limited my view to Java. Other systems have various degrees of freedom.


I'm sorry, but this is absolutely ridiculous. I sometimes teach programming to uni students; we make toy software like paint programs with a GUI in Java, networked games in C, that kind of thing.

I always have a good proportion of the class coming up with many patterns entirely on their own, without any prior exposure, and I love the looks on their faces when I show them that the nebulous concept they came up with actually has a name and is well defined (and is the best solution to the problems they encounter, given the tools they have).


If a good proportion of an undergrad class is intuitively coming up with solutions that have some commonality, is that commonality really valuable? Observing it is neat; I just take issue with people treating design patterns as a north star for quality software development.


Design patterns as a concept exist without OOP. Even in procedural or functional code you have concepts like the strategy pattern, or the question of how to get version X vs. Y of object A. These things arise in any code of sufficient complexity.


My experience with the book and its patterns has been that there's an issue and some developers recognize that it could best be resolved with the x, y, or z pattern. Then the developers go to extreme lengths to make sure the implementation adheres to the 'principles' or 'spirit' of those patterns, so much so that it creates unnecessary complexity (e.g., unneeded layers of abstraction) and friction in other components (to accommodate the 'purity' of that implementation). What's worse, it's very difficult to challenge such an implementation in code reviews, since it originated from GoF, which automatically makes it legitimate. The worst part, though, is that you have to maintain the implementation; any deviation from those patterns is seen as corner-cutting.


> it's a book filled with solutions to imaginary problems.

Have you even read it? I'd say it has more deep real world examples than almost any other SW engineering book I've read.


IME it often happens that these categories overlap, when software gets entangled in incidental complexity.


This is basically a survey of a bunch of posts, and doesn't do much to provide a consistent critique.

Regardless, the true weak point of OOP is arguably implementation inheritance, which just doesn't leave you with a consistent semantics that's open to extension and changes in the base/derived classes (that is, the well-known "fragile base class" problem is still a showstopper). But that has always been a pretty ad-hoc feature anyway. The other components of what people call "OOP", including encapsulation and interface inheritance, are all pretty well defined and not as prone to misuse.


I still don't understand why it's bad! Feels like spaghetti sentences tied together as a single article.

Why is OOP so bad, anybody? With scenarios, code samples or alternate implementations?


Inheritance in itself conflicts with encapsulation. First, because it gives access to protected members of the superclass, so it breaks encapsulation to the letter. Secondly, because it hurts code locality so much, to such an extent that it is difficult for the average programmer to determine what is going on by simply looking at the program. We have a recruitment test on this. Although candidates usually have a good intuition for what the test program does, it is amusing how easy it is to make them doubt themselves by asking simple questions. The dynamic dispatch principle in itself is well understood, but it opens so many questions about member access, non-virtual method access, and overloaded method access, to which people usually have no firm answer.


I think it's hard to explain because the very definition of OOP varies per person. Often the people who hate it worked with a codebase that took some OOP principles to the extreme. E.g. I worked in a code base with extreme levels of abstraction where literally everything was a class.

That said, any principle, including FP, taken to an extreme can produce a very tiring codebase (I'm guilty of doing this :p). So it's not really OOP's fault.


In my opinion it is just the current anti-hype.

A few years ago OOP was supposed to be the magic bullet for everything, which it apparently was not. And now people criticize OOP for not delivering magic bullets. At the same time, everybody seems to mean different things by OOP.

So no, it is not bad. Certain practices from OOP, like deep inheritance, turned out to be worse in reality than expected in theory. But nothing dramatic. And by my definition of OOP, it is still a major part of every major language.


OOP is not bad. Also, the entire discussion is pointless. Problems require solutions. You do the solution in OOP or not; nobody's gonna care, just that the solution solves the problem. The rest is just whispers in the wind.


There is a famous paper called 'Out of the Tar Pit' which may be somewhat related.


I stopped reading after they said they don't really know much about what unit tests are. It's clearly not a well-informed opinion, even if it happens to be right, which, from what I did read, it very well may not be.


I'm not sure that it is so bad.

But I still don't understand why it's good! Why is it so good? It is said to be better than (non-OOP) alternatives. Where's the evidence for that?


I would argue (my opinion) that OOP smashes together state and behavior. The original reason for this was good, i.e. the user of the class (object) should not need to know ALL the internal details, etc.

Something like ECS (data-oriented) very explicitly separates DATA (state) and BEHAVIOR (systems, functions, methods, etc.).

For me at least it makes it easier and more reliable to "reason" and have a "mental model" about a big, complex system. For example, in a game project I know that ANYTHING to do with "gravity" goes through the ONE Gravity System (which runs once every 1/60th of a second, possibly in a separate thread) and that it operates (changes state) on ALL entities with the Mass + Pos components.


I always think the best argument for OOP is avoiding namespace collisions: with procedural programs you have a bunch of functions that could be named the same, so how do you separate them? OOP gives you the ability to wrap things together (variables along with the methods that make sense for the objects) into little collections of things that belong together. Then once you have an object, sometimes you'll want a method to work a bit differently in different circumstances; inheritance solves this. Sometimes you'll want a bunch of objects that aren't really in an inheritance hierarchy to share some features, and so interfaces (protocols) were born.


Isn't that just polymorphism? There are lots of ways to get that.



I think the major problem is that base classes are really two different interfaces (public and protected) combined with a default implementation, all exposed as a public symbol that anyone can reference.

So if you have a Vehicle base class, with Car and Truck inheriting from it, people will naturally do things externally like pass around List<Vehicle>, and will extend functionality with Lorry : Vehicle and start using it. This creates a problem: you cannot change the base class of a Car or else it will no longer fit in a List<Vehicle>, but if you modify Vehicle you may break people who inherit from Vehicle.

If you wrote it out explicitly without using inheritance, though, you'd have something a bit more sane, split up into the three different concerns:

    class Car : IPublicVehicle {
      IProtectedVehicle _vehicle = new Vehicle();

      public void Drive() {
        _vehicle.Drive();
      }
    }
Now you have a public interface where you'd create List<IPublicVehicle>s and toss them around, but you would be free to write a VehicleV2 class that could be injected into _vehicle (you could even do it dynamically and go nuts with dependency injection). Then other people using Vehicle() wouldn't have their behavior changed, and everyone would still live happily inside of List<IPublicVehicle>s next to each other.

At the same time, by having to write down the interface IProtectedVehicle (not shown) you'd be more likely to do actual design of the interfaces between your subclasses and your base class. This is the real problem: people do lazy design across the base class interface, use it just to shove shared code into the base, don't consider it a private interface, and then go and break behavior as they mess around with more crap in there.

And the "DRY" principle pushes software devs to ALWAYS push shit into their base classes, without even thinking beyond "because DRY", which is guaranteed to be bad design. With good design you may have to tolerate some level of necessary repetition in your base classes. Otherwise you wind up realizing that you need derived classes that do not share behavior (e.g. consider adding a Boat subclass of Vehicle when previously you had taken the SinksInWater() implementation shared by Car, Truck and Lorry and pushed it into the Vehicle base class). Oops. Now your boats all SinksInWater() unless you override that. And if you override it, you may be heading down the road of violating Liskov.

So that's the source of the objection that OO programmers will just tell you to write interfaces everywhere for loose coupling and only depend on interfaces not types (not abstract/virtual base classes).

But so far I don't know what Go or Rust programmers do that is any different. Go has a really fairly slick system for delegating interfaces to contained structs by composition, which is just the same model I outlined above but where the two interfaces are the same. I'm still searching for where other programming languages produce a fundamentally better model. As far as I've been able to ascertain, Go's model is to just replace inheritance with interfaces and delegation, which you can do in any OO language simply by not using inheritance and instead writing interfaces and delegating. But then the complaint against OO is that proponents will just tell you to write interfaces, and for some reason that is bad. IDK if I'm missing something here...


> so far I don't know what Go or Rust programmers do that is any different

Rust doesn't have object inheritance, so Car and Truck can't inherit from a class named Vehicle. However, its traits do have inheritance, so for Car and Truck you could write implementations of a trait named Vehicle, or of a trait MotorTransport which inherits from Vehicle (in which case you'd need to implement Vehicle too, as MotorTransport relies on it).

Rust only allows the author of Boat or the author of the trait Vehicle to implement Vehicle for Boat, so if you were to add a sinks_in_water() predicate to Vehicle, either:

* As author of Vehicle you're responsible for implementing sinks_in_water() for Boats OR

* You've made a backwards incompatible change, anybody who uses the existing Boat can't upgrade to your revised Vehicle OR

* inside your own system where nothing external can force the version requirement, now it doesn't compile because Boat claims to implement Vehicle but it lacks sinks_in_water()
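
In code, the shape described above looks roughly like this (a sketch; Vehicle/MotorTransport are the hypothetical names from this thread):

    trait Vehicle {
        fn wheels(&self) -> u32;
    }

    // A supertrait: implementing MotorTransport requires implementing Vehicle.
    trait MotorTransport: Vehicle {
        fn horsepower(&self) -> u32;
    }

    struct Car;

    impl Vehicle for Car {
        fn wheels(&self) -> u32 { 4 }
    }

    impl MotorTransport for Car {
        fn horsepower(&self) -> u32 { 120 }
    }

    // Anything that needs only Vehicle accepts a Car, but no fields or
    // method bodies are inherited from anywhere.
    fn describe(v: &dyn Vehicle) {
        println!("{} wheels", v.wheels());
    }

    fn main() {
        let car = Car;
        describe(&car);
        println!("{} hp", car.horsepower());
    }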


If traits map to interfaces, and you have concrete implementations of those which can be composed into Car, then Rust is still just subtractive, i.e. eliminating inheritance. There's nothing you can do in Rust that you couldn't fundamentally do in OO languages by avoiding inheritance.

Which gets to the point of whether we should be talking about avoiding inheritance specifically, because "OO" is somewhat ill-defined.


Inheritance isn't gone from Rust; it just only exists for traits (akin to interfaces), as I explained.

For example std::cmp::Eq inherits from std::cmp::PartialEq - if there's an Equivalence relation between things of this type then necessarily there is also a Partial Equivalence relation between some of those things (specifically: all of them) so you implement std::cmp::PartialEq and then just say actually it's also Eq (ie this equivalence applies to the whole type).

If you make some types which you implement std::cmp::Eq for, and all I need is std::cmp::PartialEq, I can use your types, because of the inheritance. But the fact that your types have std::cmp::Eq (and thus also std::cmp::PartialEq) does not prevent them from being quite different in every other respect to other types; nothing about the types themselves is inherited.

So this means thinking about inheritance in a different way but it doesn't mean the concept is gone from the language. A typical toy Rust type might implement half a dozen or more Traits, some of them inherited from others and some not, "eagerly" implementing common Traits is encouraged.

As to just not using subtype inheritance in languages which have that, you're likely to immediately run into an existential crisis when the language - not unreasonably - depends upon this feature in its own design. In Java for example you can't go anywhere without tripping over Object, the supertype of all user-defined types. Java expects you to use inheritance so avoiding it comes with needless penalties.


After quite some time with OOP, I'd rather just use it without inheritance (at least in business logic; inheritance is still useful in libraries).

In which case, it would be nice if the language supported ADTs.


I've come around to Julia's point of view that it makes more sense for the methods to be separate from the object they're acting on.

I just wish Julia had a succinct way of saying "this object needs to have implementations of functions X, Y, Z", rather than duck typing everything and just seeing if it works. Maybe it isn't too bad in practice; I just don't like it when a function can fail because the implementation changed even though the type signature didn't.

Anyway, at the very least the approach of keeping methods separate helps to prevent objects that do not have any state and do not in fact represent any entity at all. Seriously, who came up with naming a class "MyObjectHelper"? What on earth is it? It could represent literally anything. Does it even have state? And why?


There was a presentation at PyCon 2021 describing how protocols enable better typing in several cases: https://m.youtube.com/watch?v=kDDCKwP7QgQ&list=PL2Uw4_HvXqvY.... They sound similar to what you describe, in that a protocol allows you to describe the features of the thing you need, and the type checker can then help you determine that the thing you have has those features. Doing this kind of feature detection allows you to check ahead of time whether a call is likely to succeed, but especially for a language like Python you still need runtime guards of some kind to limit the impact of unexpected cases.


This is known as "multiple dispatch" or "multimethods":

https://en.wikipedia.org/wiki/Multiple_dispatch


Who is Julia?



> Performance approaching that of statically-typed languages like C

I might have to give this a try. Is there any language other than assembly that can beat the performance of C?


If you're programming an early 1980s minicomputer, almost certainly not.

On modern hardware, you will struggle to write large programs that have decent performance in C. It lacks a bunch of obvious intrinsics (things the hardware can trivially do, but which you can't really express in C), so you end up writing a bunch of macros to try to persuade your C compiler to emit the desired machine instructions. Now your code is harder to maintain: either you encapsulate all this, losing performance, or it gets very difficult to write more of the software, and while your notional "performance" is good for the parts that work, the system as a whole doesn't work, so you don't have any performance.

Languages like C++ and Rust provide better intrinsics which means that actual human programmers can write the more sophisticated program that would technically be possible and just as fast in C except you'd never have written it.

As an extreme example of what's possible, WUFFS has much faster image codecs than are available as C libraries. But WUFFS is, under the hood, just a transpiler: the output of WUFFS-the-language is horrible spaghetti C. So in theory a human programmer could have written, say, a C PNG decoder that's just as fast as the one in WUFFS-the-library; after all, the C code is in theory code a human (a completely insane human) could write. But humans wouldn't do that, because unlike the machine they can't keep a thousand-step proof of correctness in their heads and be quite sure that a variable can't overflow; they'd chicken out, write the overflow check, and lose performance.


Other languages with zero-cost abstractions, like Rust and Ada, have performance comparable to C/C++. Fortran in particular, which has a lot of high-performance code written in it, is sometimes faster than C due to optimization assumptions the compiler can make about Fortran code that it can't about C code (notably that arguments don't alias). Julia can be slower than these languages due to having a garbage collector and not being a statically compiled, zero-cost-abstraction language (though it is designed to dynamically compile down to overhead-free code), but for some tasks it can actually be way faster than typical implementations in C or assembly, due to its ability to dynamically compile code specialized to the particular invocation of a function.


GC overhead only happens if you allocate, and Julia makes it really easy not to allocate. Also, 1.8 has some features coming that can delete some allocations when escape analysis shows the data doesn't leak.


Fortran, C, C++, Rust, ... are supposed to give you something as good as writing assembly manually. Generally, there's no reason for a statically typed and compiled language to be slower than manually written assembly. In practice it depends on how good the compiler's optimizations are and how skilled the assembly programmer is.


It's not about the programming language per se, but about compiler/linker optimizations. C/C++ have this magic aura of being fast because they have the most worked-on compilers and linkers. Once other programming languages come along that attract more programmers, their compilers/linkers will become the best from an optimization point of view. Also, CPUs and architectures change all the time, so even if C/C++ have the best optimizations on classic x86-64 today, that can change if a better CPU with a different instruction set comes along. I believe this will happen within 10 years, when we'll see quantum CPUs with massive parallelism, at which point the entire 50-ish-year history of C/C++ will simply become obsolete and every language will have to start from zero in terms of transforming human-readable code into CPU opcodes.


The programming language certainly has something to do with it. If your programming language has things like garbage collection or dynamic dispatch, it's very hard to generate code that's as fast, for the simple reason that the language is trying to do more things at runtime.


> It's not about the programming language per se, but the compiler/linker optimization(s).

As a wild example of that - just the other day I heard about BOLT in a podcast. It optimises the performance of LLVM/GCC binaries by moving around the object code after compilation. https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Lan...


General purpose quantum processors are likely more than a decade away (assuming they're popular in the first place).

But vanilla C/C++ is already obsolete for performance - if you're writing high performance code it needs to be in CUDA, or some language / library that compiles down to CUDA.


Julia is quite hot right now.


I'm convinced the big win with OOP was more about modularity and encapsulation than object orientation itself. The object.method() or object.variable model was pretty nice after spending years dealing with global soup or relying on naming conventions. The not-so-good part happened when you inherited one too many times: very easy to write, and very hard to debug, maintain, and sometimes test.

A lot of OOP's success was timing: OO showed up right around the time we started building large GUI apps. Now we have a lot of languages that do a great job of encapsulation at the module level, which seems to be "good enough", and functional and procedural code seems to work pretty well and be pretty easy to maintain at the module level.


This is my thought as well. There are still good reasons for using inheritance. In the right use cases, overloading methods can be a life saver. But generally, I have gravitated away from using those patterns, and toward things like composition, which are simpler and seem to be much less fragile.

Recently, I've enjoyed the simple utility of Python's functools library. Small-inheritance-like functionality seems to find a happy medium in a lot of cases, while avoiding a lot of unnecessary abstraction and boilerplate.


"Why isn't functional programming the norm" discusses your points. If you haven't seen it, it makes a great in-depth argument for why you're right :)


Thank you for confirming my biases :-)


Back in the 90s OOP was touted as revolutionary: the next big thing that would completely change programming. If something wasn't object-oriented, it was looked down upon. SQL even got on the bandwagon. It was said that very complex inheritance structures, operator overloading, and all this other stuff would (somehow) make it far easier to write and understand complex projects. Many seemed to have taken and repeated this on faith and not much else. I never saw any real justification for these assertions; it was never explained to me with sound reasoning why those things would perform as claimed, let alone demonstrated with actual data.

Were there ever any real objective [hah] studies done about how much it improved software development? And did they show a significant improvement? Even if you're still pro-OOP today, you would have to admit it fell vastly short of its promises even if it does help a little bit.

Today it seems like there's been very little accountability or learning from all this. Some people have sheepishly climbed down off the bandwagon, but there's been very little overall reflection. I'm not talking about witch hunts (there will always be more snake oil salesmen); I mean learning, as individuals and as an industry, to demand data and reason rather than handwaving and assertions. The sad thing is that a lot of the baseless hype came out of academia as well (microkernels are another one that comes to mind).

I still see this today. The new languages and language features. New database concepts like NoSQL. "AI". Blockchain. All the way down to the CPU (transactional memory, various "security" features, etc). Proponents can make extremely compelling-sounding cases for these things, and make it sound like they'll solve all the world's problems. And some may well turn out to be a net win in the end. But the only thing that actually matters is the real world results, and you can only evaluate that by studying the data.

In general, if something sounds too good to be true, it usually is. Maybe the incredible trajectory of the computing industry has dulled peoples' common sense when it comes to detecting this kind of hype. It's absolutely rife in the computing industry and academia.


Wait, what’s wrong with NoSQL? It’s not good for shoving relational paradigms into, but it’s basically infinitely horizontally scalable, which, as far as I’m aware, isn’t possible with relational DBs, not at the same performance at massive scale, anyways.

A bit annoying when people shove a relational DB into a NoSQL schema though.


Why do you think it's impossible to scale relations (aka tables) to infinite size? It is totally possible; just look at the various analytical SQL-ish DB-likes (Apache Hive, Presto, BigQuery, Snowflake, etc).

Now, what's harder is to provide some of the stronger ACID guarantees, say, fully atomic distributed commits. Most of the time it's just a question of the time it takes to reach full consensus in a distributed context.

But this has nothing to do with the relational data model itself, which is just tables of uniform rows referencing each other. Say what you like about SQL, but the core model is perfectly fine.


> Wait, what’s wrong with NoSQL?

For a few years back there, it was going to take over the world and we were all going to throw away 'old fashioned' DBMSs because they were slow, clunky and overcomplicated.

Like many of these overhyped technologies, when the dust cleared about 5 years down the line, we are left with something useful that definitely has its place, but isn't like wow huge it's taken over everything maaaaan. Meanwhile SQL is still with us and still good at what it does too.


I don't believe I said anything was wrong with it or anything else there. Most of the things I listed have their uses. That was completely not my intention to say they're bad, I hope that's not the point people are getting from my post.

The point is how uncritically some of these things get taken, and how easily people will believe fantastic, unfounded claims. And not just a few gullible idiots, but huge swaths of academia and industry.


Nothing is wrong with NoSQL except for how it (often) gets used. NoSQL is just a dumping ground for less-structured data that allows startups to accumulate tech debt more rapidly, while providing enough functionality to be useful.

Where I've seen it used is to delay the decision of adding structure to data, or as a prototype database before you are certain what your application's needs are. For simple disconnected data in low performance applications, it provides a low barrier to entry. But eventually people start embedding foreign keys into documents and the whole thing goes south.


> A bit annoying when people shove a relational DB into a NoSQL schema though

This is what is annoying with NoSQL the same as it was with OOP and now with FP.

People learn this as the new, better way of doing something, mostly because they heard a FAANG dev sharing it at a conference, and then everything should be built with it.

I saw a lot of projects where the developer(s) used NoSQL just because it was available, or it was hot, or it was what they learned in a bootcamp/article. But then they added relations, so now a User has Projects, each Project has Categories, there are constraints on the relations, and more... and everything is glued together in application code, so suddenly they are reimplementing relational DB logic by hand, with NoSQL being only a pure data storage layer.


> Were there ever any real objective [hah] studies done about how much it improved software development? And did they show a significant improvement?

I think years of hard experience across the industry found out that, for example, multiple inheritance and operator overloading caused more problems than they solved. Both features were taught and advocated back in the day, and now "there be dragons" signs have sprung up and most of the literature today warns the journeyman programmer to avoid them.
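
For anyone who never hit it, here's a minimal sketch of the classic diamond problem (the class names are hypothetical), in Python since it permits multiple inheritance:

    class Storage:
        def save(self):
            print("generic save")

    class DbStorage(Storage):
        def save(self):
            print("save to database")

    class FileStorage(Storage):
        def save(self):
            print("save to file")

    # The diamond: which save() wins is decided by the language's method
    # resolution order, which is easy to misread in a large codebase.
    class Document(DbStorage, FileStorage):
        pass

    Document().save()  # "save to database" -- surprising if you expected FileStorage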


I've actually never really encountered issues with operator overloading. Is it just ADL, or are there any other canonical operator overloading issues?


It was abused a lot in C++, mostly because of weaknesses in other parts of the language.


Bullshit, it was the weakness in those developers' minds that screwed their use of this perfectly fine language feature.

This is a propaganda war, people. We are being told we are too dumb to handle knives. And the truth is, our industry lets incompetents play our roles, and we (those smart enough to use knives) must suffer the ramifications of those who stab themselves repeatedly and they cry out "it's the language!"


Is this parody?


Right. What I want to know is, what was the basis for claiming all this would be so great in the first place? It appears to have been almost entirely free of any evidence, as far as I've been able to tell.

It's mind boggling to me when we see the kinds of people in the industry and their demands for data and evidence when it comes to other subjects.


> operator overloading caused more problems than they solved

[citation needed]

Just because operator overloading can be abused doesn't mean that it isn't a massive boon in certain problem spaces (e.g. math libraries, SIMD libraries, etc.)
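
A quick sketch of the upside (the Vec2 type is invented for illustration), in Python, which also supports operator overloading:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Vec2:
        x: float
        y: float

        def __add__(self, other: "Vec2") -> "Vec2":
            return Vec2(self.x + other.x, self.y + other.y)

        def __mul__(self, scalar: float) -> "Vec2":
            return Vec2(self.x * scalar, self.y * scalar)

    # With overloading, the math reads like math:
    v = (Vec2(1.0, 2.0) + Vec2(3.0, 4.0)) * 0.5
    # Without it, the same expression becomes: scale(add(a, b), 0.5)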


I mean, the standard way to do IO in C++ involves spamming the left shift operator (<<), I can only assume because it looks like an arrow? This is definitely a shallow thing, but I'd argue that for this single reason operator overloading definitely causes more problems than it solves (in C++), due to things like:

1. translating format strings to other languages is extremely difficult because the position of expressions in the message is fixed.

2. modifying how things are printed requires modifying global state, and it's easy to forget to reset the flags on std::cout after setting the precision of floats or something.

There's also the famous question of "what does the multiplication operator do on vectors?" problem, but that's something that could be solved by simply having a standard "vector" interface that defines it in a particular way. Overall I don't fully disagree, but seeing as it happened once with C++, I can imagine it happening again in some equally widespread language (JavaScript with its + operator on strings, maybe?).


> 2. modifying how things are printed requires modifying global state

This only applies to std::cout and std::cerr. Other stream interfaces, like std::fstream or std::stringstream, don't have this problem. Also, it is orthogonal to operator overloading.


I can pretty much guarantee that whatever the "OOP is bad" guys think people should do instead will be viewed as "bad" in a few years. I was around when OOP got into the mainstream and it started out as a very nice and practical approach. Then the ideologues came in and complained loudly "this is not OOP" so suddenly you had to wrap simple functions into meaningless objects. Then you had to inherit a lot and create these huge inheritance trees. And so on. Complaints that things got too complex were squashed.

The same will happen with every technique or process that becomes popular and consultants, authors and mediocre but loud people take over. Happened with OOP, Agile and will happen with other things too.

When I look at the Kubernetes, microservices, "need to scale just in case we may grow by 100000% soon" monstrosities we are building to deploy simple CRUD apps I don't think we have learned much.

People still take useful techniques, make them into a religion and push the techniques to the point where they are becoming a liability.


Interesting collection of point of views.

A language like Erlang indeed seems to align more closely with the spirit of the OOP goals - especially regarding encapsulation and avoiding shared state.

I would argue one of the best implementations of OOP is found in OCaml, where a nominal type system lives side-by-side with a structural type system, the latter of which is used for objects and classes [1]. You can still use shared/mutable state, but functional styles are usually preferred -- including a pattern known as "functional objects" [2], which allows the use of OOP but avoids shared mutability and its associated shortcomings.

[1] https://en.wikipedia.org/wiki/Structural_type_system#Example

[2] https://ocaml.org/manual/objectexamples.html#s%3Afunctional-...


> A language like Erlang indeed seems to align more closely with the spirit of the OOP goals - especially regarding encapsulation and avoiding shared state.

Indeed, Alan Kay himself said, to Joe Armstrong, that Erlang was closer to his idea of OOP than any contemporary language that claims to be OOP.


This is one of the things I love about OCaml. Even though the objects aren't used frequently, they were designed well, and people don't mind reaching for them when there is a need. It's a good way to show the "pragmatic, but careful" approach that OCaml has for a lot of things.


Yes, OO is about message passing. Erlang places a lot of emphasis on this, and in my experience one can use it to build huge systems that are easy to understand and to parallelize.


Conjecture: programming is hard, especially as the scope of the system becomes larger and the problem domain more complex. This means that all programming tools and techniques have at least some problems and downsides. It also means that any solution involves trading off various factors, e.g. code is fast or easy to read but perhaps not both. All this means that you can pretty much point to anything in our field and laugh heartily at how piss poor it is. You can do this with FP too. But FP isn't used so much for real world solutions <ducks> and therefore statistically you will find many more things to laugh about in OOP which is very commonly employed against real world problems.


>But FP isn't used so much for real world solutions

It's taken over the front end. The react paradigm is FP.

SQL read queries are FP.


> The react paradigm is FP.

The https://reactjs.org/ homepage says "Build encapsulated components that manage their own state". That's a textbook definition of OOP.


I use the term colloquially. The typical React pattern people use does not put state in the component. It uses a reactive pattern called Redux, and the state comes from somewhere else.


And, in fact, much of the front end ecosystem is constantly getting ideas from Elm which is pure FP… and a pleasure to use (I think).


I work with Elm and yes I can confirm this


This mirrors my thinking. Writing modern applications is hard, so we argue about _how_ instead.


> Conjecture: programming is hard, especially as the scope of the system becomes larger and the problem domain more complex. This means that all programming tools and techniques have at least some problems and downsides.

Yes.

> You can do this with FP too. But FP isn't used so much...

So close....

The problem is single paradigm tools. No single paradigm fits all of a problem. And in my experience, OOP is particularly bad at it. Not Prolog levels of bad, but also not much better. We have a huge number of very popular single paradigm OOP languages, because in the '90s the programming gods set forth the decree that OOP is the One True Way to design a system.


> No single paradigm fits all of a problem. And in my experience, OOP is particularly bad at it.

The only reason OOP wins is because it doesn't enforce a single paradigm. You can write 100% or 99% procedural code, you can write a functional program and encapsulate it. You can write a shell script and pop a GUI on top (it's marvelous in a horrible sort of way). OOP's lack of constraints is what lets it be used in all sorts of places.


What are these very popular single paradigm OOP languages? The closest thing I can think of is Java.


Java isn't even single paradigm.


> Not Prolog levels of bad

Prolog is homoiconic like Lisp and supports metaprogramming, so in theory there's nothing stopping Prolog from being great at most or all of a problem.


That is a wall of text that I admit I could not fully read or understand. It seems to argue that OOP is bad, but what I gathered from it is that it argues that complexity is bad and that we should make things simple.

The problem is that complexity is unavoidable, and even if we make every part individually as simple as possible, the final system will still be complex and hard to understand and test. OOP was supposed to help build the simple parts so that the system would be easier to understand. It failed, not because it is inherently bad but because it enabled people to build more complex systems, at which point the end result became hard to understand and test. FP might help a bit by forcing a certain style of development which could make current systems easier to understand, but the end result will be the same: it will reach the point where the systems built in FP style are hard to understand and test.

What we should be developing is software to help us understand complex systems -- both software systems and real world systems -- and then the style in which software is developed will become less important, and we will be able to understand how the system is working as a whole and drill into how individual parts are interacting. A good version of this software would allow more complex systems to be created, but it should also help with understanding them regardless of the level of complexity.


> I'm not going to go there much more because I still don't unit test my stuff. I'm currently not against it. I just don't know about it much.

Ouch, not even unit tests to catch regressions?

So many of the bugs that I deal with in $BIG_FOSS_PROJECT are regressions where a bug fixed in version X.Y.5 was reintroduced in X.Y+1.0 by someone's cool shiny new feature.

And yes, there's a significant lack of unit tests behind some of these errors, plus some hard-to-test code. (While I'm happy I can use Mockito to mock static methods now, the fact that I have to is a very definite code smell.)

Which is why I recommend never installing version A.B.0 of $BIG_FOSS_PROJECT. I always wait until everyone else finds the new bugs for you, then install A.B.1.


I’m not of any opinion on the matter, but with decades in the industry I’ve not seen a lot of projects where unit testing made a net profit for the company.

I feel that it’s important for me to note that this is because almost all the projects I’ve been around have been built in imperfect scenarios. Sometimes projects had years worth of terrible unit tests, terrible because they were written by people who didn’t really know how to test correctly. Sometimes because they were added waaaaaaay late in the process. Sometimes because they were skipped due to time pressure. Sometimes because the pipelines weren’t really effective or set up correctly. And so on.

The issue I personally have with them is the same issue that I have with a lot of other dogmas around software development. All our theoretic tools are nice, if people actually adhere to them and know how to utilise them.

But as soon as things like Test Driven Development, Agile, SCRUM, Enterprise Architecture or even Unit Testing meet reality, they break in most cases because the theories aren’t fascist enough. By that I mean that you can implement things in so many different ways that almost nobody manages to make them work, not really.

This is anecdotal of course, and I’m sure there are many more talented people, teams and companies that derive a benefit from these things than what I’ve seen, but the only thing that’s ever really worked for me is to keep things as simple and single-responsibility as possible.

That sometimes requires OOP or Unit Testing though. We wrote our general OData API with inheritance, as an example. We did so because it lets us have a unified way of handling auditing, code-first SQL database IDs and conventions, and because it lets us write a single API controller and then use it in every other controller instead of writing the same 90 lines of code 10,000 times.

You can likely do that without OOP but it’s just easy with OOP in C#.

So I see a lot of these things as, do what is necessary, what works, and, what is easily maintainable. If that’s Unit Testing for you, then do it, but I can assure you that it won’t be unit testing for a lot of people out there for whatever reason.

At least with single responsibility you still have a somewhat general idea of what went wrong, simply by where it happens, if it isn’t caught by automated tests.


> Ouch, not even unit tests to catch regressions?

I also don't like unit tests and very rarely unit test. I think they provide a false sense of security and were invented by corporate software shops to better quantify "units of work" (oh, how many times I've had unit tests assigned to me in tickets!).

If you can't formally prove something doesn't break (in your head or with pen & paper or via pseudocode), your code is too complex. There are a few exceptions to this, but they are highly technical (e.g. FFT, cryptographic, physics/math implementations, tricky pointer arithmetic, regular expressions, etc.) where a hard-to-see typo can actually break things in non-obvious ways.

Most code is not that -- it's just written poorly (because code standards aren't enforced). Linux is a great example of a project where coding standards are annoyingly enforced (and there's no real "unit" testing) and lo and behold, the code is of exceptional quality.


I'd rather have an existing suite of tests that I can use to verify that, when working with other people's code.

And more importantly, if JIRAISSUE-15295 is reproducible, then a unit test that a) reproduces it and b) verifies that the bug no longer occurs, is invaluable to prevent someone bringing JIRAISSUE-15295 back from the dead.

Of course, if the unit test is too tightly coupled, and has too many insights into code it shouldn't, then it's worthless as JIRAISSUE-15295 will most likely reoccur via a different code path.

But, poorly written unit tests aside, I have found significant value in unit tests when maintaining a rapidly changing code base.
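
As a hedged sketch of what such a regression test can look like in Python (the parser and the failing input are hypothetical; only the issue number comes from the comment above):

    def parse_price(text: str) -> int:
        """Parse a price like "$1,234" into cents."""
        return int(text.strip().lstrip("$").replace(",", "")) * 100

    # Regression test for JIRAISSUE-15295: inputs with thousands
    # separators used to crash the parser. Pinning the exact reported
    # input keeps the bug from coming back via the same path.
    def test_jiraissue_15295_thousands_separator():
        assert parse_price("$1,234") == 123400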


A rapidly changing codebase is where unit tests are least useful though. Most of the changes are going to be because of new requirements which just means the test has to be updated. It's just busywork at that point.

If you have discovered a way to write unit tests that can tell the difference between a regression and an enhancement, please let me know.


This is probably a bad habit, but I've been writing tests so that I don't have to actually navigate through my app, put it in the correct state, and push a button to see if the feature works or not. Writing the test is just faster.


This is one of the major sources of bugs in my experience. People write unit tests that are perfect and pass, but only pass because you give them exactly the right input to make them pass in the first place.

Then when running the actual app, the data is not exactly like on the unit test and you have bugs.

To cover those scenarios, we use integration/acceptance tests.

So I always come back to the same question: then why should I even bother with unit tests? Especially the unit tests that people normalized (test per class/public method).

You end up with unit tests that are extremely coupled to the implementation, and without proper integration tests, you can't guarantee it will all work anyway. It only works in a bubble.

I believe tests should be much more about the broader behaviors than the implementation details, but TDD and evangelists of today will have you writing tests for every small class you create. I personally still use unit tests from time to time when there are a ton of edge cases on a single behavior that I want to test, but that's the exception not the rule.

By avoiding unit tests, or, to put it differently, making the units tested larger, you get more space to refactor, less coupling between tests and implementation, as well as more meaningful tests.

Somewhere along the way we lost the meaning of "unit" and it became "class/methods", when it was originally supposed to be more at a module level.


I use unit tests primarily for a) verifying my edge cases (like the old joke goes, "a tester walks into a bar and orders -1, 0, jkhkhkhjkh...") and b) preventing regressions.


> This is probably a bad habit, but I've been writing tests so that I don't have to actually navigate through my app, put it in the correct state, and push a button to see if the feature works or not. Writing the test is just faster.

This is a clever way of solving this problem, and I've run into it before as well. Applications where state is very deep (like a game), where you need to verify certain operations on that state, are quite tricky. IMO, I'd probably call these integration tests, not unit tests.


I've seen multiple people pointing this out, but completely ignoring the next sentence.

> I find it much easier to formally verify things correct than to test that they're correct.

I don't have anything else to say but I wanted to put it here for context.


Tests are also useful as a form of documentation. They explicitly exercise all plausible interfaces the developers expect to be used.

I'm not sure how formally verifying things continue to be correct after a change (which could be to a dependency) can be automated and scaled.


Based on my own experience, OO works best for small, self contained concepts whereas procedures work best for the overall program architecture. Notice when OO is evangelized it's always done using self contained concepts, like data structures or GUI widgets rather than architecture. When you do use OO for program architecture, like in Java, you'll end up with lots of pointless indirection and brittle, over engineered design pattern soup. Procedures are simply more "fluid" because they are smaller, self contained and easily refactorable which is exactly what you want when it comes to program modifications.


The PersonnelRecord class isn't OOP and exhibits a widespread misunderstanding:

    class PersonnelRecord {
    public:
      char* employeeName() const;
      int   employeeSocialSecurityNumber() const;
      char* employeeDepartment() const;
    protected:
      char  name[100];
      int   socialSecurityNumber;
      char  department[10];
      float salary;
    };
As written, the PersonnelRecord class will inevitably lead to code duplication, tightly coupled classes, and other maintainability issues. An improvement that's still not OOP, but exposes a more flexible contract, resembles:

    class Employee {
    public:
      Name name() const;
      SocialSecurityNumber socialSecurityNumber() const;
      Department department() const;
    private:
      Name name;
      SocialSecurityNumber socialSecurityNumber;
      Department department;
      Salary salary;
    };
OOP is more about the actionable messages that objects understand to carry out tasks on behalf of other objects. Wrapping immutable data exposed via accessors reaps few benefits. Rather, OOP strives to model behaviours that relate to the problem domain:

    class Employee {
    public:
      void hire();
      void fire();
      void kill();
      void raise( float percentage );
      void promote( Position position );
      void transfer( Department department );
    private:
      Name name;
      SocialSecurityNumber socialSecurityNumber;
      Department department;
      Salary salary;
    };
This allows for writing the following code:

    employee.transfer( department );
I don't know how to "transfer" an employee given the code from the article, but it would not be nearly as elegant.


I dunno man... I think that giving an entity the ability to perform operations upon itself, and thereby change itself, looks neat in these examples but scales poorly in a sufficiently complex codebase. You could express exactly the same thing, but get a lot more bulletproof code, if you separate out the thing doing the action from the thing you are acting upon. You could have the thing doing the action return the updated version of the thing you wanted that operation done to, and it's unambiguous what has happened. So something like (pseudocode):

    transferredEmployee = employeeService.transfer(employee, department);
IMO this forces a certain style of coding that ages better and requires keeping less stuff in your head.

My two cents. Have a good one!


Sure. Depends on the problem domain. One could equally write:

    xferEmployee = employee.transfer( department )
    xferEmployee = company.transfer( employee, department )
    xferEmployee = department.transfer( employee )
    exEmployee = humanResources.fire( employee )
Or, using an event-based architecture:

    new EmployeeTransferEvent( employee, department ).publish()


Your event approach reminded me of Eric Lippert’s blog post: https://ericlippert.com/2015/04/27/wizards-and-warriors-part...

He also ends up with the commands/actions/events/rules (basically emphasize relation over object) as main types in an interesting[0] example of a rule-based game.

One of my issues with “mainstream OOP” (at this point I’m not even sure what “real” OOP is) is that it apparently tempts people to base their models around objects and subjects not verbs.

I’m not sure if it’s due to prevalent nominal type systems or due to how we traditionally teach class hierarchies (dogs/cats <- animals) but I think centering the model around predicates and relations works much better. [1]

[0] Interesting because while it is a toy example it is one that could be real; not like examples with cars and animals.

[1] Just reminded me a bit of Wittgenstein’s Tractatus; didn’t want to get too philosophical but I think the ontological views we have influence this kind of modelling a lot


Why not department.transfer(employee) or workplace.transfer(employee, department_from, department_to)? I think that sometimes the most OOP way is just to write

    struct PersonnelRecord {
      char  name[100];
      int   socialSecurityNumber;
      char  department[10];
      float salary;
    };
and be done with it.


Classes are fine so long as:

- They're small and single-purposed

- They don't inherit (or at the very least, extremely light inheritance)

- There's a very clear relationship between the state and the methods (e.g. a counter class, or a future/promise)

For example in JavaScript, I think you can still use classes and end up with functional-ish code: https://bluepnume.medium.com/functional-ish-javascript-205c0...


Here is a different take: OOP is not evil. State is not evil. They exist for a good reason:

Procedural is "linear" while OOP is "layered". One could write code either way but OOP better resembles the physical world and since brains have grown up in the physical world, layered code is easier to grok. It can hide/show pieces such that you can inspect functionality in a controlled way. Of course procedural can do the same with functions but it's coarser.

State is also a property of the physical world. Things persist, stuff accumulates, that's just how it is. Since applications ultimately model things in the physical world, they also have to persist things and accumulate things. State is inevitable.

OOP is a sharp tool, and of course it's possible to use it well or to cut oneself really badly.


> OOP better resembles the physical world and since brains have grown up in the physical world

The part about OOP which doesn't resemble the physical world though is encapsulation. Encapsulation is like pretending the world functions in passive voice, with objects doing things to themselves.


OOP is a great fit for UI frameworks. OOP is a bad fit for many other things.

The guy who hammers nails all day thinks screwdrivers are worthless.


John Carmack came to the opposite view in the end (https://web.archive.org/web/20130819160454/http://www.altdev...), favouring pure functions and immutability above all else.


OOP and functional programming are orthogonal -- you can do both ...


That's the exact premise of Scala, by the way: more object-oriented than most object-oriented languages (including Java), and yet enabling functional programming.


OO gets you a long way. Still, there are interesting approaches being tried with data-driven programming, through either composing lenses or ECS. I have a hard time understanding lenses; it seems like a lot of work to reproduce the most straightforward concept in OO (getters and setters). ECS does away with objects and gives you almost a relational database, but it is cumbersome to encode hierarchies; on the other hand, ECS does offer a way for non-code tools to interface with code, which is important too.


Lenses solve a problem you don't have in OOP. Since OOP allows mutation, to change a value within a nested structure, just mutate it. If you don't want to mutate, then getters and setters don't help and changing nested structures is awkward.

Lenses solve the nested mutation problem for immutable structures but then go past what you can do with getters and setters. For example, you can compose bigger lenses out of smaller ones.
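
A minimal sketch of that idea in Python (all names invented for illustration): a lens is just a get/put pair over immutable data, and composing two lenses yields an updater for a nested structure:

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Lens:
        get: callable          # whole -> part
        put: callable          # (whole, part) -> new whole

        def compose(self, inner: "Lens") -> "Lens":
            # Focus through self first, then through inner.
            return Lens(
                get=lambda w: inner.get(self.get(w)),
                put=lambda w, p: self.put(w, inner.put(self.get(w), p)),
            )

    @dataclass(frozen=True)
    class Address:
        city: str

    @dataclass(frozen=True)
    class Person:
        address: Address

    address = Lens(lambda p: p.address, lambda p, a: replace(p, address=a))
    city = Lens(lambda a: a.city, lambda a, c: replace(a, city=c))

    person_city = address.compose(city)
    p = Person(Address("Oslo"))
    print(person_city.put(p, "Bergen"))  # Person(address=Address(city='Bergen'))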


If ECS' Systems are querying Entities / Components the way one would use a SQL query in a relational database, what's needed for hierarchies is the ECS equivalent of a graph database.


The good parts of AppKit (which most OOP frameworks are based on) weren't just that it came with nouny objects representing controls; it's that it used the "hard and soft layer" pattern with messaging and notifications. Which is not C++-style OOP.

Most of the even more OO frameworks around the same time, like Taligent, failed.

Of course it did win over procedural frameworks like classic MacOS, but some of those are still popular in games or embedded systems.


> OOP is a great fit for UI frameworks.

The biggest UI framework of the past decade is decisively anti-OOP in its philosophy


I assume you are talking about React. It started fairly OOP. Now it's... something else? It's not functional. IMHO hooks and effects are a crazy form of OOP. It's like some weird dynamic scope data definition.


I'm not exactly sure how hooks are related to OOP. Their design was inspired by 'Algebraic Effects' which are more functional in nature than OO.

This gives a good breakdown: https://overreacted.io/algebraic-effects-for-the-rest-of-us/


Ah, that does make more sense, and is a language feature I've often thought about. It's usually just a pattern and not a first class feature, but with many problems as a result (many of which React itself encounters).

Lots of uses of context managers in Python are for accomplishing this. Smalltalk error handlers are like this (different from try/catch). Or Scheme's with-output-to-file... I got the sense that variables that start and end with stars in Common Lisp are used for this, but I was never clear if it was convention or an actual language feature.

I think I'd like it in React too if they just hadn't tried to be so clever.


Isn't this basically the same tradeoff as Contexts vs Prop-Drilling?

You could achieve the same effects by passing down callbacks.

I thought the reason why this is good for Errors, Themes and Loading (Suspense), is because they're so common, that you can always expect someone up in the hierarchy to consider it, and therefore sacrifice some type safety for less verbosity.

The article mentions that "usually the fact that a function can perform an effect would be encoded into its type signature", which is why I don't understand the need in this example.


Honest question, I’m very curious. Do you still classify React as closer to an OO paradigm than functional?

Agreed hooks are… something else.


Running a function and then looking for magical side effects to effectively get multiple return values from that function sure isn't functional. And it shows in how unintuitive it is to reason about hooks and the leakiness of the whole thing.

I think it looks functional because it's only easy to reason about if you write functional code.

I suppose it tries to solve some of the same problems as OO, like related state and private transitions. But it feels like React itself is the God Object.


What is the name?


React, presumably, with its approach that a UI is well described by nested functions operating on state. It's a good point. I remember long discussions on mailing lists for various FP languages conceding that UI toolkits were the one area where OOP seemed the natural fit. Haven't seen many of those discussions since React emerged.


>" nested functions operating on state."

Switch it to a state with the methods and you got your object. Polymorphism here is not a requirement. It is a feature to be used when needed. Some time in the 80s I was doing exactly the same - operating on state with functions. Same shit different color.

Programmers just love going on crusades. I find it a waste of time and intolerance breeding ground. Instead of heating the air use whatever the fuck suits one. Just do not stalk other people who happen to have different preferences.


Ah, that one. Sorry, but in my book React is an abomination. The speed of developing a decent UI in React vs something like the ages-old Delphi is incomparable. The latter wins hands down.



React


I'm currently working on an issue that's taking me longer than it should (I'm slow). It's a change to a JSON structure that impacts multiple files, in lines that consist of 5+ method chains that I have to track down in other classes and custom libraries to see what they're doing (Java).

It could have been a single Python file using Pandas to build a DataFrame and this change would have been so much easier. And I'm saying that as someone with at least 10x more Java development than Python development.


A fool with a tool is still a fool ... to build quality software, apart from understanding the problem to solve and putting one's heart on it, there is no other way.


People spend way too much time arguing about this. If you're a good programmer you will be able to do excellent maintainable work in any language you're experienced with. If you're bad you will make a mess in any language.

It'd be like carpenters saying "spruce is terrible, if you make anything from spruce it's fucked, you have to use oak." A good carpenter will be able to make something amazing from spruce.

Like carpentry, metalworking etc. the quality of the product of programming is heavily based on the skill of the tradesman.


Experience is not highly valued in articles about programming but often it comes down to that. I'm using tools with roughly the same level of expressiveness as I have for the last 20 years but I develop much better code now.

I've been guilty of over-abstraction and I've been guilty of under-abstraction.

There's a lot of blaming the tool for what amounts to lack of experience. Everyone has lacked that experience at some point in their career.


Yeah pretty much this.

If a paradigm (whether functional, OOP, or whatever label makes you feel better) isn't working for you when it has worked for many others, maybe it's not because the paradigm sucks... it's because you might just suck at the paradigm?

Obviously there are cases where things were pigeonholed into the wrong paradigm for the job. Then it's a case of the person choosing the paradigm sucking at choosing the paradigm. Whatever happened to personal responsibility?


Absolutely. I really want to love FP style programming, but I just can't get it to gel in my head. I'm totally on board with the concepts of side effect free programming, minimising state, passing around functions as parameters, etc...

But I struggle with reading it, and I struggle with writing it, and I struggle to build a conceptual framework of how FP style components fit together. I've been banging my head against the FP wall for _years_ now, and I still can't seem to get to a level of fluency where I'm comfortable using it regularly.

That doesn't mean FP sucks, it just means my brain isn't wired that way.


You don't understand. This is not what FP programmers are arguing about.

What FP programmers are arguing is that all code becomes shit if you do OOP. It doesn't matter how good you are.

They argue that if you do FP your code is much less likely (note the word likely) to be shit.

The analogy to carpentry or craftsmanship is bad. Programming is about managing complexity to a degree no one can fully hold in their head or understand. You can't just be a "master" craftsman and write an entire OS with zero technical debt.

The argument here is that if you use FP over OOP, your code will have orders of magnitude less technical debt.


It sounds like you don't understand various crafts. To me they are quite analogous to programming.

Also, code has barely changed at all since C. It's all just some variables, loops, flow control, some math operations and doing stuff with strings sometimes.

One can write functionally in an OO language. No paradigm or design pattern or whatever will save you from making a mess. For me, the more experienced I've become, the less those things matter.


> It sounds like you don't understand various crafts. To me they are quite analogous to programming.

I understand every craft. They are analogous only in the simple-minded way you're thinking about it. However, complexity in these crafts does not rise to the heights it does in programming. That is the key difference.

>No paradigm or design pattern or whatever will save you from making a mess

This is fundamentally backwards. The FP style saves you from all errors related to mutation; that single fact already proves you wrong. You can't mutate a variable, so that's one mess literally taken off the table. There are patterns that can provably prevent errors from happening, and you can create languages that strictly enforce certain patterns.

Elm for example, cannot have runtime errors. The language strictly prevents that "mess" from happening.

Outside of strictly enforced patterns, the FP style promotes better organization. You can still make a mess, but you are less likely to. Additionally, for extremely large programs the proportion of technical debt will be significantly less for an FP-styled program than for an OOP-styled program. See below for one instance of this.

http://wiki.haskell.org/Why_Haskell_just_works

>For me, the more experienced I've become, the less those things matter.

I've found that bragging about experience doesn't make people smart or good programmers. You have limited intelligence, and no matter your level of experience there are design mistakes you make all the time in extremely large programs. Technical debt is an unsolved issue, though the claim here is that FP programs tend to have less debt than non-FP programs. The keyword is "tend", as there are exceptions, but the generality is mostly true.


Sure, your first proposition is true. But as many people realize, you more than likely will not have a team of good-to-great programmers.

OOP is a massive footgun for many programmers who are okay at best. And that effect grows exponentially as the project and the team grow.


What isn't a footgun in the terms that you declared?

I believe people dislike OOP because the vast majority of code is written using OOP. If everything were written using data oriented programming or whatever, people would have the same reaction as they have regarding OOP.


>> What isn't a footgun in the terms that you declared?

There are definite degrees to this. But something like FP, where you declare f(x: X) => Y, versus something like OOP, where given a context of x: X and y: Y you have f() => void (with a hidden effect on y) -- those are massively different scopes of problems.

It's not that you cannot do the former in the latter, but that the latter is what people -- especially those who are inexperienced and haven't dealt with the hell of stateful programming -- routinely reach for as a solution to their problem.

>> I believe people dislike OOP because the vast majority of code is written using OOP.

Why do you believe that's the reason? There's no implication here -- just genuine curiosity as to how you reached that conclusion.

>> If everything were written using data oriented programming or whatever, people would have the same reaction as they have regarding OOP.

DOP, especially in a static language, forces you to move from one type to the next to solve your problem and to handle your edge cases.

From experience, DOP and FP can be intimidating but usually end up being easier, and the development time, along with the time to stability, is hugely reduced -- because the code is inherently dumb.


So, just to clarify before people think I'm an OOP advocate who considers it perfect: I'm not, but I think it has its place.

> There are definite degrees to this.

Agree. The question is, does OOP lead to more buggy code? Maybe yes, so we ban OOP? Because that's the sentiment I get whenever someone is against it. In your example, is it ok to allocate a new object every time? If you can't afford to do it, what do you do?

> Why do you believe that's the reason?

I think in terms of proportions. For example, if 80% of all code written is OOP, the chance that you'll find bad OOP code is greater than the chance of finding bad code in any other paradigm.

What if people learned programming using FP or DOP? Would there be less buggy code? I don't know, but I think in the end OOP would become the popular choice because, in my opinion, it fits very well with the way we think about our world, and we would end up having this same conversation.


The problem with OOP is not just with poor implementations and inheritance models (e.g. Java) but that state encapsulation is actually bad and cements complexity.

The real goal should be to untangle state and decouple it from computations by putting it all in global shared state (e.g. SQL, Redux, ECS frameworks, etc.) while using stateless functional computations to compute changes to the global state. Pure functions are composable in a way that objects simply aren’t.
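
A minimal Python sketch of that shape (the action format is made up): one global state value, changed only by a pure function, Redux-style:

    # A reducer is a pure function: (state, action) -> new state.
    # All state lives in one place; the computation never mutates it.
    def reducer(state: dict, action: dict) -> dict:
        if action["type"] == "hire":
            return {**state, "employees": state["employees"] + [action["name"]]}
        if action["type"] == "fire":
            return {**state,
                    "employees": [e for e in state["employees"] if e != action["name"]]}
        return state

    state = {"employees": []}
    for action in ({"type": "hire", "name": "Ada"}, {"type": "hire", "name": "Alan"}):
        state = reducer(state, action)
    print(state)  # {'employees': ['Ada', 'Alan']}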


The opposite end of OOP is the functional purist world where "color = blue when hovering over button" becomes 10 levels of boilerplate indirection in order to update global state and receive changes using actions/reducers/containers/memoizers. It's just substituting one type of needless, brittle complexity for another.


Exactly and the functional way is harder for humans to grok


That’s fair, and as someone who shares that frustration, I realize Redux might not be the best example, precisely for this reason. I’d argue though that the problem here is the premature abstraction of encapsulating state in actions/reducers/memoizers. This sort of thing would be way more natural in an ECS framework.


Functional purist here (mostly Elm), and I have very few problems that arise from the paradigm. Also fewer bugs than pretty much any software I use or deal with.


How is state encapsulation bad? Encapsulated state means that only a very few functions can access/change that state. Non-encapsulated state means that any code in the entire executable can change that state. How is that better?

> The real goal should be to untangle state and decouple it from computations by putting it all in global shared state (e.g. SQL, Redux, data oriented game frameworks, etc.) while using stateless functional computations to compute changes to the global state.

I kind of presume that you're in a single-threaded world. Shared mutable state plus multithreading is a recipe for disaster.


In functional languages you can still ensure that only certain functions can change certain parts of the state. The big idea is that state is isolated. You can actually write functional code that looks a lot like OO code (essentially just calling `foo(thing)` instead of `thing.foo()`) but because state is isolated, you know that nothing is changing behind the scenes. It may not seem much, but it's actually really freeing to not have to think about that stuff anymore. Same thing with immutability.

I was skeptical of functional for years and was actually a big OO zealot. I finally brought myself to give functional a serious go over 1.5 years ago. I now write it professionally and don't miss OO even a tiny bit. It's really just something you have to try and have an open mind about. I actually still find myself sometimes thinking, "Crap, maybe I should make a copy of this in case something else tries to touch... OH WAIT! IMMUTABILITY! I'M SAFE!" It's a really nice feeling :)
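
A tiny sketch of the foo(thing) style with immutability, in Python (names invented):

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Thing:
        value: int

    # foo(thing) instead of thing.foo(): the input can't change behind
    # your back, and the result is an explicit new value.
    def foo(thing: Thing) -> Thing:
        return replace(thing, value=thing.value + 1)

    t1 = Thing(1)
    t2 = foo(t1)
    print(t1, t2)  # Thing(value=1) Thing(value=2) -- t1 is untouched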


Let me separate out immutability for a moment.

If I don't have immutability, then if I can call foo(thing), then I can write bar(), and then call bar(thing). Bar can now alter thing. So my encapsulation can be broken by someone just writing a function. (This is the same problem that C had with structs - anyone could write a function to alter the data in your struct, and thereby put it in an inconsistent state.)

Now, with something like C++, you can still do that. You have to go to thing, though, and write a new thing.bar(). It therefore becomes much clearer that bar() may break the consistency guarantees of thing.

So I don't think that just functional gives you the guarantees that OO encapsulation does.

Now, immutability changes things... a little bit. But with immutability, the problem only moves, it doesn't go away. Someone can now write a bar() that returns a new/altered thing, and then pass it to me. I can still get a thing that is in a state that violates the rules for what a thing is supposed to be.

And, once again, the same thing can happen with OO. It's just that, if it happens, you have a lot less code to look through to try to figure out how and where it happened.

You may have noticed that I care a lot about data being in a consistent, valid state. If you don't care about that, then my arguments may not resonate with you.


You can’t separate out immutability because without immutability it’s not functional code, it’s just procedural code.

To the question of consistency though, OOP gets you very little because consistency rarely maps directly to objects. So if you end up in a situation where object A is inconsistent with object B, you still have to trace out all the locations where object A and object B might have been mutated and figure out what combination of code causes the issue.

At least in the global state system you can get runtime consistency by running consistency checks on each attempted transaction.


There are two types of "data" in a properly encapsulated class [1]:

1. Internal/private data -- things like a pointer to a string's contents, its length, and the capacity of the content buffer.

2. Public "data" (API) -- best provided as accessor methods/properties (and public methods) to ensure that the class' invariants hold if these can be used to mutate state. You don't care if the object has been mutated through this public API as this is the contract for how the class/object instances should be used.

A properly encapsulated string class is free to change its internal state/data (e.g. start+length+capacity, or start+end+endOfBuffer), as long as it keeps the contract defined by the public API intact. The same applies to data structures and mathematical objects like complex and rational numbers.

You could store a complex number in polar (r, theta) or cartesian (x, y) form and provide public accessors for all of those values. If you had setters for those, the native representation would be a simple assignment, while the other representation would do the necessary polar <=> cartesian conversion. This would maintain the invariant that the two representations are equivalent, such that if you set r then the angle (theta) does not change, but the magnitude changes such that it is equal to r.
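
A minimal sketch of that invariant, in Python for brevity, with cartesian chosen as the native representation:

    import math

    class Complex:
        """Stores (x, y) natively; the polar accessors keep both
        representations consistent."""

        def __init__(self, x: float, y: float):
            self._x, self._y = x, y

        @property
        def x(self) -> float:
            return self._x

        @property
        def y(self) -> float:
            return self._y

        @property
        def r(self) -> float:
            return math.hypot(self._x, self._y)

        @r.setter
        def r(self, r: float) -> None:
            # Invariant: setting the magnitude leaves the angle unchanged.
            theta = math.atan2(self._y, self._x)
            self._x, self._y = r * math.cos(theta), r * math.sin(theta)

    c = Complex(3.0, 4.0)
    c.r = 10.0
    print(round(c.x, 6), round(c.y, 6))  # 6.0 8.0 -- same angle, scaled magnitude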

Note that if the complex number (or string) was modelled in a procedural or functional language, you would have to choose and stick to one or the other representation. That structure or type definition is leaking the state, such that a program could modify the length of the string without updating its contents.

Q: Can you give an example of where having object A inconsistent with object B is an issue? That would help me to understand the issue/problem you are referencing.

[1] Many OOP languages also allow protected data, which can be seen by implementations but not any other code. These are risky, as they can allow the derived class to break any invariants on that protected data like you described in your second paragraph. As such, I try to avoid them wherever possible in my own code, but will run into them in thirdparty or system classes.


Your last sentence there is pretty unnecessary. Moving along...

Echoing my sibling comment: immutability is a defining feature of FP, so there is no way to put it aside if we're talking about FP.

As for the rest of your argument, I'm a bit of a meathead and maybe haven't followed it fully. My thought is that in OO, an instance of an object can be passed some data that it processes along with its own internal state. If that data is a reference to another object, then that object could be changed while the receiving object is doing its thing. The receiver would then go and store its results within itself.

In functional, functions receive everything as parameters, and all that data is guaranteed not to change for the duration that the function runs. Once it returns, it will update the global state, and then, again as my sibling comment points out, it's far easier to check the validity of the whole shebang as it's in one place.

As for "you have a lot less code to look through", I feel this is a common thing said by folks who have not really grokked how FP works. I say this because I used to make this argument ;P While FP can be slightly more verbose than OO, I've found FP way easier to figure out where things came from.

All in all, I don't have a vendetta against OO. I was really just trying to get across that I was a big OO enthusiast and went in FP perhaps even with a closed mind and it didn't take much to win me over. Having said that, I'm in the Elixir/Erlang world, which is a bit of its own beast.

(Edit: your/you’re/yore)


I also want to say that I make no claim that FP is a silver bullet. All in all, programming is terrible, lol (naw, I love it). I've just found myself carrying around fewer footguns since I converted, and I'm generally less stressed and confused by what's going on... even when reading less-than-ideal legacy code.


> Now, with something like C++, you can still do that. You have to go to thing, though, and write a new thing.bar(). It therefore becomes much clearer that bar() may break the consistency guarantees of thing.

Well, no, it doesn't. I've worked with too many codebases that have a declaration of `void foo (sometype_t &instance);`. Then when you read the code and see a function call of the form `someFunc(localInstance);` you have no idea if someFunc can change localInstance.


Encapsulated state is bad when it encapsulates the wrong state. The wrongness becomes less visible but appears more correct, because by mere assertion the encapsulation can boldly define the correctness of something wrong. In novice hands that's very dangerous and long-lasting, and is the unsafe-pointer equivalent of architectural design.


Are you saying we should remove local state? If you aren't then why is that better organized than object state?


Few people are organized and use encapsulation to store things where they are easily retrieved. Most people are messy and encapsulate to hide junk under the couch.

It's the code monkeys, stupid. Given powerful enough machinery, most code will look like junk. That's why some "beautiful" languages today tie your hands and dumb everything down so you aren't free to do as you like, whereas assembly was just fine 40 years ago.


We’re past this turf war. The original “case against oop is overstated” is an attempt to go beyond the gap. It was an article that said “sure oop has problems, but it’s not all bad, maybe we don’t need to throw it all away”.

And then there is this article, which just goes through more articles trying to re-ignite the debate. It's silly.


The truth is that neither FP or OOP is the silver bullet.

Both have good techniques, that shine for certain types of problems.

The more important thing is how you do it, how good you are at it.

It's the same thing with e.g. Agile: some people make a mess of anything, including Agile; some people do it well; and the vast majority of average people get average results with Agile.

So don't waste time getting stuck on one particular school of evangelism, try a range of techniques and get good enough at all so you can match the ideal technique to the specific solution you're creating.


I've come to like the "service-oriented programming" style we use at work (I don't know if there's an official name for it). Basically, your code is essentially procedural, but instead of free-standing functions you use stateless service classes hidden behind interfaces: it gives you very granular namespacing, and implementations are swappable, composable, decoratable and testable.

In this system, the models (records) in the domain model are more like classic OOP (objects like "car", "banana") and they have encapsulated state, but only to have a centralized place where you make sure object invariants are upheld whenever their state is mutated (by services), and nothing else.

The hard part is interoperation/invariant validation between multiple objects (aggregates or just related data) and we haven't solved it yet in a consistent manner (everyone comes up with their own approach, for example state propagation via events).
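
A minimal Python sketch of that shape (all names invented; the "interface" becomes a Protocol):

    from typing import Protocol

    class Mailer(Protocol):
        def send(self, to: str, body: str) -> None: ...

    # A stateless service hidden behind an interface: swappable,
    # decoratable, and trivial to fake in tests.
    class SmtpMailer:
        def send(self, to: str, body: str) -> None:
            print(f"SMTP -> {to}: {body}")

    class SignupService:
        def __init__(self, mailer: Mailer):
            self._mailer = mailer  # a dependency, not mutable state

        def register(self, email: str) -> None:
            self._mailer.send(email, "Welcome!")

    SignupService(SmtpMailer()).register("a@example.com")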


Isn't this Domain-Driven Design? I think you understated how easy it is to test in your example. Tests can move from CD to CI because of your architecture. I would love to see more discussion about the tension between scale and testability.


We tend to judge OOP by the worst way we've seen it used. To really be fair, we need to find the best usages of OOP and see how those have fared over time. I'd suggest looking at NextStep, as it seemed to be created by a team with a very high degree of expertise in OO as well as a strong design focus.


I have no idea about the points that post is trying to make.

OOP, like everything else, will lead to trouble when used incorrectly, and it has its merits when done right. Plus, it's used widely in practice, which is really the best evidence that it's not bad at all.

Using FP all the way will IMHO lead to more spaghetti code; using it to complement OOP could be great, however.

Last, what's the alternative? If you don't have a better alternative, you don't solve any existing problem.


We have data and complex systems. No one, with the exception of Alan Kay, has convinced me that Object Oriented Programming is or was a good idea.

"Because the problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle." --Joe Armstrong, creator of the Erlang programming language


But that's exactly the reality of the problem, and exactly what you want. I work with robots, so much of it is physical. There are no floating bananas in the world. They're all held up by something.


The point that Joe was trying to make, I think, was that the jungle and the gorilla have to be explicit:

    State2 = update(State1, ArgX),
    State3 = update(State2, ArgY),
As opposed to, say:

    obj.update(arg_x);
    obj.update(arg_y);
obj holds a gorilla and the jungle, and it may be hard to know how the update method works because there is a complicated diamond-shaped class hierarchy, and then a thread may concurrently modify parts of it behind our back. You're just trying to add a method or fix a bug, but it's almost impossible because you don't know how it affects the gorilla and the whole jungle.


The problem is that setters are usually void functions. Instead, they could return the new state/object:

State2 = obj.update(argx)

State3 = State2.update(argy)

But if you do this people will argue this is inefficient, because you are copying a lot of objects.


> But if you do this people will argue this is inefficient, because you are copying a lot of objects.

Syntactically it is annoying to some extent. But consider that the most important data gets saved to a database with lots of "ceremony" involved -- separate protocols, transactions, SQL statements, etc. -- because it's pretty important to track updates carefully. Here it's a bit like that, but on a smaller scale: in-memory program state is also important and has to be tracked explicitly, and for that some "ceremony" is acceptable.

Implementation-wise, because of immutability, there is copying, but it is often not 100% duplication. The updated version and the previous one might share a lot of common structure behind the scenes (in the heap). For example, if we have a 1000-element list L and we update it with a new element at the front, L1 = [H | L], then L1 is not a complete copy of L, but instead is just the element H and a pointer to the shared tail L. For dictionary data structures (maps) something similar happens, but there it's on the order of O(log n) per update, with everything else being shared. Definitely not as efficient as in-place updates in, say, C++ or Java, but that's a price I'd happily pay.
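
You can see the sharing with a toy cons-list in Python (tuples standing in for immutable cells):

    # Build a 1000-element cons-list out of nested tuples.
    L = None
    for h in reversed(range(1000)):
        L = (h, L)

    # "Prepending" allocates exactly one new cell; the tail is shared.
    L1 = (-1, L)
    print(L1[1] is L)  # True -- the 1000-element tail is not copied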


I don't really understand arguments for or against OOP. OOP, to me, has always been about blackboxing some data entity and interacting with it through an API. Where it got wild and wacky was inheritance, but there's nothing fundamentally bad with having a process or object with an internal state you interact with via an API, e.g. a microservice.

Most computer programs have an internal state that you interact with through some API. And sometimes you need to compartmentalize a process and have it work independent of another process. There's literally no way to get away from programs with an internal state and API access since that's how computers, and electronics in general, work.


There's a reason most complex software uses OOP. Functional programming is nice, sometimes, but as projects grow it ends up using patterns that do essentially the same thing as OOP. A monad is basically an OOP object.

To see a language that is purely OOP but also forces you to be very explicit about accessing state, check out Pony language.


> A monad is basically an OOP object.

I think it's the other way around: an OOP object is a poor imitation of a monadic algebra, one that cannot be generalized with mathematical laws to guarantee certain properties.


How I've seen OOP work at big tech is the following:

1. New code base is needed

2. Some developer comes up with the master oop abstraction to solve the problem

3. Years of dev effort spent with the abstraction at the core

4. Dev from step 2 is now the expert in an overly convoluted, complex system

5. Is deemed a genius since newcomers can't easily grasp the complexity; gets promoted to a high-ranking dev job

My number one rule of coding politics:

He who creates the OOP abstraction rules the team


My grading criteria for a codebase is "Can a hungover junior engineer comprehend this and work on it successfully?"

When we are getting to "only a Principal engineer who is an expert in Scala with a history of strong mathematics can work on this" is about when a codebase gets a fail.


That's a broad generalization; many niche codebases cannot be high quality and still let a hungover junior become productive within a week. People need to accept that it's OK for some teams to have discussions at a level which requires serious experience and education.


Indeed. A more nuanced interpretation of the question might be: “Can the most junior (or least tenured) engineer on my team understand the code in time to deliver the feature they’re working on?”

To be fair, OP’s original phrasing is still a good rule of thumb. And nowhere does the OP imply it’s an absolute.


This does a massive long-term disservice to junior engineers, especially because they're most likely just out of school and still very eager to learn.


Merely being interesting or rewarding to work on doesn't justify complexity though. I think it's a reasonable mindset to approach things with in an attempt to simplify things whenever possible.

Your entire team could be highly experienced and have PhDs in exotic subjects and I still think it would be a good approach to take.


> Merely being interesting or rewarding to work on doesn't justify complexity though. I think it's a reasonable mindset to approach things with in an attempt to simplify things whenever possible.

Yes.

The simplest code isn't always the most immediately approachable though, at least due to the way universities and bootcamps teach things. OOP only smears complexity around and should be avoided at all costs, but the "simple" alternatives that most people have in mind (usually procedural Python/Go) aren't much better. Simplicity shouldn't be synonymous with repetitive, error-prone nonsense like for loops and null checks just because they are familiar. Junior engineers should be schooled up on mapping, folding, optional types, ADTs in general, etc., because these things make code radically simpler.
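
A small Java sketch of the contrast (my own toy example): the familiar loop makes null the "not found" signal, while the mapping/optional version puts absence into the type.

    import java.util.List;
    import java.util.Objects;
    import java.util.Optional;

    class FirstBigSquare {
        // Familiar but error-prone: null is the "not found" signal.
        static Integer loopVersion(List<Integer> xs) {
            for (Integer x : xs) {
                if (x != null && x * x > 100) return x * x;
            }
            return null; // every caller must remember to check this
        }

        // Mapping/filtering with an optional type: absence is explicit.
        static Optional<Integer> streamVersion(List<Integer> xs) {
            return xs.stream()
                     .filter(Objects::nonNull)
                     .map(x -> x * x)
                     .filter(sq -> sq > 100)
                     .findFirst();
        }
    }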


> with a history of strong mathematics

TBF that one is really going to depend on the domain addressed by the code.


This pretty much happened at one of my jobs. Principal engineer was airdropped into team and decided everybody would develop all new microservices with his new pet framework (literally, it was named after his dog) with Spring style DI cruft that was in many ways more difficult to work with than Spring. He enjoyed rulership of the team for a while before leaving for greener pa$ture$, and on his last day of work, he changed the README on the open source repository of the framework to point to his own, personal, repo as the locus for all future development. A move widely frowned on, especially by management and legal, which soured his reputation, and higher-up architects had different ideas anyway, so the pet framework was deprecated for future development.

Lesson: If you're going to code your way into being the only developer who understands the core abstraction behind your employer's critical code, do not overplay your hand.


I agree, but I could also say the same thing about functional programming. If you let the dev that won't stop talking about monads touch your javascript code base it will soon become a tangled => web => of => arrow => functions


Regular non-functional JavaScript code is riddled with non-arrow freestanding functions, is that any better?


Probably, because you can step through it in a debugger without bouncing all over the code base for each expression.


I don't have a problem with arrow functions, just saying you can make a huge mess with OOP or FP.


What does OOP have to do with any of that? I was at a dev shop that used no OOP, and everything happened the same way.

:-/


Those who hate OOP have never used OOP correctly.

None of these articles mention 'cohesion' or 'coupling'... It's not possible to critique OOP unless you understand the principle of loose coupling (ease of substitution) and high cohesion (clear separation of concerns/responsibilities).

Without loose coupling and high cohesion, your classes will not be composable which is the whole point of OOP...

These characteristics ensure that state is fully encapsulated and does not leak between multiple components (that's when it gets ugly).


Ah yes, the classic "No True Scotsman" OOP defense.


I've provided a very specific definition of what good OOP requires so it's not fair to suggest that my argument is pointing to some vague characteristics or is evasive.

I can look at any project's code and assign it a score in terms of cohesion and coupling of the classes/modules/components. Other people who are experienced with OOP can look at the same code and they will come up with a similar score.


The point of "No True Scotsman" is that you have your own definition of high quality OOP, which is not universal. Maybe yours is the right way of doing things, IDK. But I think others would probably prefer different definitions. For a lot of people, this variation in opinion of how OOP should be used can lead toward a conclusion that OOP in and of itself is a confusing concept and difficult to get "right".

Some people, myself included, just try to avoid this complication altogether, by separating data from logic.


FP solves some issues but simultaneously introduces a new set of issues which creates new 'No true Scotsman' debates around ways to address those new issues... It's like the joke that there were too many competing standards and so somebody decided to invent a new standard to make all the other standards redundant... The net result of this is that we end up with n + 1 competing standards.

IMO, the biggest problem I often see with FP code bases is poor separation of concerns, which leads to spaghetti code that is hard to read and maintain. When some state is not co-located with the logic which is supposed to be operating on it, you're already throwing high cohesion out the window... And when you do that, it makes it harder to separate the responsibilities of different components, because there is no clear ownership relationship between the logic and various bits of state... With FP, state can end up being mutated all over the place and it's hard to know who did what.


Every complaint I’ve ever seen about OOP (and other tools) amounts to “we misused and abused the tool/language/whatever so therefore the tool itself is bad “


But don't worry, someone will accuse you of the No True Scotsman fallacy shortly.


Articles like this miss the question of what purposes OOP serves in practice, and so fail to offer alternatives for those purposes.

In practice, OOP basically exists to take some horribly messy program, put an interface around it, and make it slightly less terrible to deal with. Similarly, it also involves creating an interface to a thing a programmer barely understands and letting the programmer do some things with it. Which is to say, it's about letting the programmer be wrong and only fail moderately - as opposed to making sure the programmer is right.

Inheritance, one of the worst features of OOP, has wormed its way in because it's also incredibly convenient.

Everything that OOP is compared to (notably functional programming) is based on a "green fields" paradigm where everything can be controlled. And maybe those approaches work great (though you can't point to much where they replaced OO in the trenches and made things great).

It's got the tone of "this suspension bridge is far superior to your roll of duct tape" - yeah, maybe you're right but that won't make the duct tape go away even slightly.

I wish someone would come up with a good and immediately applicable alternative to OOP, but I can't see this sort of critique leading there.


> In practice, OOP basically exists to take some horribly messy program, put an interface around it, and make it slightly less terrible to deal with. Similarly, it also involves creating an interface to a thing a programmer barely understands and letting the programmer do some things with it. Which is to say, it's about letting the programmer be wrong and only fail moderately - as opposed to making sure the programmer is right.

I don't think any of this is unique to OOP; rather, this applies generally to the concept of abstraction.


I was referring to things like using member functions to do bounds checking and to make parameters consistent.

That's not really "the right way" to do programming or something generic abstraction gets you. The calling function should just know what it's doing. The Dijkstra quote "Object oriented programs are offered as alternatives to correct ones..." is correct. If you set a wrong parameter, it's ignored and your program keeps working - but it may now have a subtler bug than the one your wrong parameter would otherwise have caused.
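
A toy Java sketch of what I mean (the class and range are made up): the defensive setter keeps the program running while silently swallowing the caller's mistake.

    class Volume {
        private int level; // valid range: 0..100

        // Bounds-checking setter: a caller passing 1000 by mistake
        // gets 100, the program keeps working, and the bug hides.
        void setLevel(int requested) {
            this.level = Math.max(0, Math.min(100, requested));
        }

        int getLevel() {
            return level;
        }
    }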

But this is still useful for programs produced by large teams where some people's knowledge is limited. It's a mess, but no one has put forward an alternative to the mess.


I think Dijkstra's joke is about how object oriented programs tend to eschew correctness rather easily. But I don't think he was saying that's a valid approach to solving problems.

Adding new methods to an object to tame complexity seems paradoxical to me. When something becomes too large, I generally prefer to investigate other means of abstraction/extraction/organization available in a given language.


> In practice, OOP basically exists to take some horribly messy program, put an interface around it, and make it slightly less terrible to deal with.

This is the sort of hype that has followed OOP around everywhere though. Well that's great, of course we would want to take a horribly messy program and make it less messy! Who wouldn't?

But where is the evidence that it actually performs as advertised? Does it make software easier to deal with? Is it superior to alternative ways to achieve that?


> But where is the evidence that it actually performs as advertised? Does it make software easier to deal with? Is it superior to alternative ways to achieve that?

What are those alternative ways to make a huge system manageable you allude to? As I said, none of the paradigms put forward as alternatives even claim to operate like this. It's a serious question: what are the alternatives in this kind of situation?

And I'll admit, a "horrible messy situation" that we'd look at today is going to be an earlier OOP system.


I'm not making the claim though. I want to know what the data is for your assertion that OOP "makes a huge system manageable" in a way that other approaches or language features can not.

Mind you there are countless millions of lines of COBOL, C, etc out there in production so existence alone does not provide any evidence one way or another.


I wholly agree with this. The problem with OOP is that software consists of tools that are about processes and workflows. They do things. Processes and workflows are notoriously difficult to model using OOP methods (one reason why design patterns were invented) but actually quite easy to model using imperative or functional programming. I use this software to listen to songs in mp3 format. It plays songs through my computer's audio subsystem. Trying to model this using classes and objects that interact with each other - one for the player, one for songs, one for user input events like pressing forward and pause, one for the audio subsystem, one for the playlists, one for timing events, etc. - will not make the software less complex.

My view is that software is like a fractal. That is, self-similar and there is no difference between the micro and macro scales. What is right on the micro scale is right on the macro scale and vice versa. What is wrong on the macro scale is wrong on the micro scale and vice versa. Is it "wrong" to implement sorting algorithms using OOP? Yes. Then it is also "wrong" to implement music players using OOP.


> That is, self-similar and there is no difference between the micro and macro scales.

That's quite an interesting way of framing it, I've never considered whether or not software has a "you shouldn't mix your micro- and macroeconomics" problem or not.

Is your conclusion that they are the same at scale based on gut feeling/experience (which I don't mean in a dismissive way, experience is a valuable source of insight) or do you have some concrete examples to elaborate why you think that?


It's just my gut feeling based on experience. I have never seen software problems that are fundamentally different on the macro scale from the micro scale. I have never met anyone who was great at debugging segfaults in C code that was not also an amazing architect. I have never met a great architect that wasn't also amazing at debugging segfaults. To me software development is like math and there is no material difference between "high-level" math and "low-level" math.


Data Oriented Programming. In short: don't entangle your code with data, and the data should be immutable.

https://blog.klipse.tech/databook/2020/09/25/data-book-chap0...


Data Oriented Programming originated from video game development and is all about knowing your hardware and making sure you transform data A into data B in the most efficient way possible.

This tends to have other benefits besides performance as the simplification of code that results from focusing on the right thing also tends to help with architecture (until you go ham on optimization).

What it most certainly is not about is immutable data since that's counter to efficiency. It's hard to beat a big fat buffer you poke at directly. Not entangling code with data and focusing on immutable data is Functional Programming, pretty much its definition, not Data Oriented Programming.


A program with only immutable data cannot do anything interesting, almost by definition. Some data has to be mutable.


Only has to be at the root.


Please allow me to decide how I structure my code. I do not need help.


Inheritance in OOP is what I like to think of as "structured patching".

When you create a .patch-file with `git diff`, you are essentially creating an inheritance hierarchy. Imagine having 5 patch files that are applied to the same code file in sequence. This is analogous to an inheritance hierarchy with 5 levels.

Patches/inheritance is a very useful tool when you want to make some changes to third-party code you depend on before using it - while minimizing the maintenance burden of those changes.

Inheritance/patches is also a very useful tool for describing/representing changes to your own code, but only as a temporary measure - you want to flatten out the levels of inheritance/patching for readability. With .patch-files, this can be done automatically. With OOP/inheritance, this is more of a manual (although still straight forward) process.

A git repository with 1000 commits is essentially an inheritance hierarchy with 1000 levels. Note that git will automatically flatten out these patches when you check out a certain commit - which is what makes git such a great/usable tool. Imagine if git could not do this flattening automatically. Code bases would quickly become unmaintainable if they wanted to retain their change history.
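
A tiny Java sketch of the analogy (a toy example of my own): each subclass patches the one below it, and the flattened class is what you would actually want to read.

    // Each level "patches" the one below it, like patch files applied in sequence.
    class V1 { String greet() { return "hello"; } }
    class V2 extends V1 { @Override String greet() { return super.greet() + ", world"; } }
    class V3 extends V2 { @Override String greet() { return super.greet() + "!"; } }

    // The flattened equivalent a reader actually wants:
    class Greeter { String greet() { return "hello, world!"; } }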


Haskell gives me headaches. I write a lot of Rust, which has elements of both. I think their approach is pretty good, though sometimes regular inheritance would just be easier.

I don't hate OOP. I don't love it either. I tend to write my code in a mix of procedural and diet-OOP paradigms. It tends to produce less state, less inter-dependencies, and doesn't require abandoning stuff like mutable state.

Why do I want mutable state so bad? Because it's the easiest, least abstracted solution to the problem much of the time, and closer to what the compiler will actually want to generate.

I think trying hard to stick to a single paradigm is folly. I like mixing and matching according to the task I'm attempting to accomplish, and I have a firm conviction that my code benefits from this ideology. Encapsulation I do value, but not as a "hard no". I use access specifiers to tell you "you probably don't want to mess with the guts of this thing, it can take care of itself better than you can."

As for inheritance, I don't use it very often. I like how C++ has no base "Object" class. A lot of the time I'm basically writing structs with methods, and I think that's often the right approach.


> diet-OOP paradigms

Ha nice, I love the idea of calling it diet-OOP. "Diet-OOP, same great productivity with 99% less inheritance!"


So are there any resources on simpler ways of writing code for bigger projects (specifically, more complexity than toy programs or scripts), or do I only have well-written codebases in a similar style as a target to study?

I'm aware that you can write good OO code, that inheritance can be useful (e.g. in shallow hierarchies), etc. but I'm interested in exploring this other side of programming to improve my code


Take any Object-Oriented language. Then modify its definition so that any class can have just a single method (as opposed to any number of methods).

You now have a new language which you could call a procedural- or functional language based on the set of features that were present in the language you started from. But it is no longer an Object-Oriented language.

To make it more concrete start with Java compiler but modify it so it only allows max one method per class.

You now have NOOP-Java. (Non Object-Oriented Java).

Are you happy? Is this NOOP-Java somehow better than plain old Java? If not, then removing the OOP-ness did not really help, did it? OOP is good. You want to keep OOP, not remove it.

OOP lets you have multiple functions associated with a data-structure the details of which you hide behind those functions (a.k.a. methods). I think THAT is the essence of OOP. You can have multiple functions attached to the same data-structure and only those specific functions can read and write that data-structure.

Isn't that something that really is so useful that getting rid of that feature, getting rid of "OOPness", would be crazy?
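
In a minimal Java sketch (my own toy example), that essence is just this: several functions attached to one data structure, and only they can touch it.

    final class Counter {
        private int count; // only the methods below can read or write this

        void increment() { count++; }
        void reset()     { count = 0; }
        int  value()     { return count; }
    }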


Put shackles on your feet then try to run. Now, cut off your feet completely and try to run again. See? Shackles were good for you.


I was able to hold a job at a big company with a large C++ codebase, in the millions of lines of code, for 2 months. I was amazed at how people there were able to deal with the amount of BS in that code. I honestly don't have the motivation to thrive with such work; it's deeply nonsensical.

It really shook the confidence I had in C++. The code structure had so much technical debt and nonsense that most developers never really dared say how awful it was. I was impressed by the blind humility of all of this; it seemed like a very elaborate game of hot-potato-passing and it's-not-my-fault. Of course this software team was thriving, because it was in a monopolistic domain that would always rain money on this software, even if it was bad. The manager was a highly motivated guy, very humble, but he also seemed a bit delusional, because fixing this software was going to take at least 5 or 10 more years.

OOP sounds like a utopian abstraction, some kind of "smartest guy in the room" stuff. It's like pure mathematics trying to do practical physics and engineering. It just doesn't work.

It's really time academics started to highlight the dangers of abstraction, and started writing some philosophical approach to software engineering and how to solve problems IN PRACTICE.

Personally, I mostly use namespaces to "encapsulate" data, behavior and code. I just write functions and use data-oriented programming, even in Python. It's funny how people find this inadequate and bad practice. I never share code because I'm afraid of getting into an argument because of it.

Code that uses classes, inheritance, and private/protected access is just impossible to read and follow. It's just obfuscation to me. Unless you're writing a library that is used by many, meaning it requires well-structured inheritance, 99% of developers should not write OOP.

I will never forget the Hacker News comment mocking people who were presenting their project while praising how complex it was. OOP seems like the main method used by "hostage takers" to make their code readable only to them.

In any kind of engineering, simplicity MUST BE PURSUED AT ALL COSTS. Only use complexity as a last resort, and FIRE developers who are unknowingly becoming hostage takers, and who keep innocently arguing that if they can understand their code, anyone can. Advocates of OOP will cause a lot of pain to future developers.


That's because what people call OOP isn't OOP.

> “I invented the term object oriented, and I can tell you that C++ wasn't what I had in mind.” —Alan Kay.

Read here for more info: http://xahlee.info/comp/Alan_Kay_on_object_oriented_programi...


What people call OOP is OOP, literally by definition. Alan Kay may have coined the term, but his version of OOP has nothing to do with what is commonly understood today, and is therefore irrelevant.


> What people call OOP is OOP

That's silly and not how definitions work.

definition noun def· i· ni· tion | \ ˌde-fə-ˈni-shən

an explanation of the meaning of a word, phrase, etc. : a statement that defines a word, phrase, etc.

In other words: words or phrases have actual meanings, regardless of what people call them.

To give you an example, people use "their" instead of "they're" all the time. It doesn't mean that "their" means "they're".


> To see where the problem is, just think about this: If int/float were separate modules that do not depend on each other, where the (int,float) and (float,int) pairs should be defined? Nowhere? Somewhere? In either one of the modules but why?

I'm confused by this argument against multiple dispatch. Can't there be a third module that references both?


Of course there can. The objection makes little sense.
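
For example (a Java-flavored sketch with made-up names; Java's overloading is resolved statically, so this only approximates true multiple dispatch): a third module can depend on both types and own the mixed-type cases.

    // Hypothetical third module depending on both the int-based
    // and float-based code; it owns the (int,float) and (float,int) pairs.
    final class MixedArith {
        static double add(int a, float b) { return a + b; }
        static double add(float a, int b) { return a + b; }
    }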


There can be, though what would you call this 3rd module?


So I only do data engineering, backend APIs, data science, and machine learning. And so many times I learned something about OOP, got excited, implemented it, and ended up with a solution that was really complex and that I re-implemented using functions and procedural code.

However, I do love a library like pandas, with its data frames and series etc., which are two really cool and powerful abstractions / classes.

I also noticed this trend in Python / data developers (including myself): they start out writing mostly procedural code, and at some point they feel like they "have to" move to classes to be more sophisticated.

Maybe I am also just bad at finding good and easy-to-understand abstractions that you can re-use like a data frame. But maybe when it comes to data this is just quite hard? Would love to hear other data people's perspectives on OOP.

Also, the thing with data engineering is that the state lives in a DB or a file and not really in my class (because it's so large).


I don't quite understand what this is trying to say. It's a summary review of some reviews? Or something. It's not even clear if the author of this post agrees or disagrees with the claim in the YC News title. It's peppered with sentences like:

"That you don't understand something doesn't mean it's flawed or bad."

Precisely. Many of the arguments against OO are from academics that don't write real-world, large-scale software.

More arguments are from beginner programmers with a mere 2-5 years of experience that have never personally encountered the mess that programming at scale was before OO made it manageable.

Even more arguments are from people that assume that "precisely what language X happens to do" is exactly what OO is or isn't, and then argues from that straw man position.

Those same people then cheerfully embrace microservices or K8s, even though they are object-oriented! In case this is not clear, let me define OO for you:

"Object oriented design is all about encapsulated private state with abstract interfaces that have varying implementations that clients don't need to know the specifics of -- even the type names -- ahead of time."

Sounds like microservices? Or K8s? Or REST? Or service bus? Or event streams? Or any large-scale programming pattern? Yes. That's because abstract interfaces and decoupled components are critical to enabling large teams to collaborate on software too big to hold in the head of any one person.

OO is not intended to solve manager-human-employee-customer categorisations of real-world entities. That's a stupid straw-man argument born of toy textbook examples that aren't representative of real-world programs to begin with.

OO is intended to allow programming at the scale where simple procedural programming no longer works. I didn't understand the need for OO either until I worked at that scale myself. It seemed overcomplicated and unnecessary.

This is why it cracks me up to see language designers (and random Internet bloggers) talk about how "they don't need OO". Yes... you. Singular. One person doesn't ever need OO. Many people do. You're not many people!

Has anyone seen large scale development done successfully in an "anti-OO" language like Rust? I haven't. The single largest codebase is probably Mozilla's Servo, which is still a toy compared to what's been written in C++ or Java! It was written by a few people, mostly in one building. Notably, development of it slowed down and was abandoned. It was never shipped as intended.

How would you even envision a Rust project working with, say, 1000 developers? What happens when there's some new functionality added? Does developer #587 have to go notify the other 999 developers to update their switch statements, pattern matching code, etc... ?

Right now... generally speaking, yes. You'd have to go tell other people to go update their code. As you can imagine, 1000 developers telling 1000 developers to update things is a 1M-level organisational scaling problem.

That's why automated tooling was invented to handle this problem automatically, even at runtime, let alone compile time.

If you think automation is bad, then we no longer agree on the most basic concepts and this conversation is over.

If you have a better method for automating this problem, then I'd love to hear about it! Just make sure it's not OO with a different name...


I need to counter the argument that you can not write complex systems without OOP.

SAP's ERP system is arguably one of the biggest, if not the biggest, software systems on the planet. It is written in ABAP, and ABAP is a procedural/imperative language. SAP is open source and the code is fairly easy to read. Reusability of the functions and procedures provided is fairly easy, and it is the groundwork for every developer working with SAP to modify or enhance functionality or write add-ons.

Or more specifically it was. ABAP got OOP extensions about 10 years ago and since then the code became more and more inflexible and less maintainable for people other than the original authors.

But maybe the main point about ABAP is that despite its procedural nature it allows data centric programming. You take a line of data, transform this line and return the transformed line. No need for abstractions, inheritances and complex design patterns, concerns with mutability etc. Just plain and simple solving domain problems instead of problems caused from abstraction.


> Those same people then cheerfully embrace microservices or K8s, even though they are object-oriented! In case this is not clear, let me define OO for you:

> "Object oriented design is all about encapsulated private state with abstract interfaces that have varying implementations that clients don't need to know the specifics of -- even the type names -- ahead of time."

"OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things." - Alan Kay

You're only using 1/3 there. First-class messages were more important than encapsulation to him.


First-class messaging implies "encapsulation". A message cannot directly modify its target. Its target must INTERPRET the message somehow and then decide what to do about it. Only the recipient who receives the message can access its own data, which means its data is "encapsulated".

It is as if you sent me a letter by "first class mail". Great I can read your letter. But your letter can not directly alter the arrangement of furniture in my house. Only I can do that. And perhaps when I read your letter I decide to take such action. Or maybe I decide not to.
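
A toy Java sketch of the letter analogy (recent Java, names mine): the message is inert data, and only the recipient's own code decides what, if anything, changes.

    import java.util.ArrayList;
    import java.util.List;

    record Letter(String request) {} // the message itself is inert data

    class House {
        private final List<String> furniture =
            new ArrayList<>(List.of("sofa", "table"));

        // The only way in: the recipient interprets the message
        // and decides whether to act on it.
        void receive(Letter letter) {
            if (letter.request().equals("move the sofa to the window")) {
                furniture.remove("sofa");
                furniture.add("sofa (by the window)");
            }
            // Any other request is simply ignored.
        }
    }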


> Has anyone seen large scale development done successfully in an "anti-OO" language like Rust?

Taking "anti-OO" to mean languages that specifically don't have OO capabilities (rather than languages where you can choose not to use OO). Any of the large C code bases?


Very often in code like that you see "struct" types filled with function pointers. Those are literal OO "vtables", just implemented manually instead of being generated by the compiler.

The Linux kernel is huge, yes, and is written in a "pre-OO" language. It's also full of OO paradigms.

For example, the core kernel developers don't want to have to write huge "switch" statements to handle the thousands of different drivers written by tens of thousands of third-party developers.

What do you expect to see at this API boundary? Perhaps... an OO API with vtables and everything?

Yup: https://www.kernel.org/doc/html/v4.11/driver-api/infrastruct...

Function pointers in structs as far as the eye can see...


> Very often in code like that you see "struct" types filled with function pointers.

Fair point. Though the explicitness that's required to do this in C (you have to pass the actual struct pointer that would normally be the receiver, along with a general lack of inheritance) results in code that's much easier to understand than the patterns that the traditional OO languages have created for themselves. I suppose that's less of a criticism of OO itself and more of a criticism of the people who build languages that emphasize OO, and the general ecosystems that crop up around them.

> written in a "pre-OO" language

I would be remiss if I didn't point out that the "OO languages" go all the way back to Simula67, which predates C. Though at that point OO was still in its "academic" phase the way functional programming is today.


For the record, the Linux kernel is not huge, just large. Just saying that because many people here show Linux as a model of a "huge" software project and yet it absolutely isn't. E.g., project-wide refactors/API changes are still performed multiple times per release and done usually by a single-person team, something which would just be unthinkable in huge software projects.

Agreed with the point, though.


What is your distinction between "huge" and "large"? The Linux kernel is over 30 million SLOC now (though a lot of that is drivers). That's "just" large?


I am mostly talking about concurrent contributors, but yes, even 30 MSLOC is just "large", not "huge". Stories of commercial software projects having to build "overnight" on server farms are common.


Are you claiming Rust is incapable of writing microservices, being run on K8s, using service buses or event streams, etc.?

Why focus on building OOP functionality into the language, when it should be the infrastructure and API frameworks that should be built for OOP?


A1) No.

A2) Because it's about 1,000x to 10,000x more efficient.

An awful lot of the "ills" of modern development practices boil down to the lack of ingrained rules of thumb related to performance. The difference between a local function call -- virtual or not -- and a network call can easily be a factor of a million.

This just isn't in the mental model of most developers. The terms "nanoseconds" or "clocks" are not in their vocabulary.

I grew up and learnt programming in an era where OO was considered extravagantly wasteful because virtual function calls had an extra indirection! Those precious instructions -- and more importantly -- the lost opportunity for inlining or CPU pipelining were considered brutal performance hits.

These days, people throw Python into Docker containers and run them remotely on the network to invoke what amounts to a page of code. They call this "modern".

Then they go on Y Combinator News and complain about how OO is "bad" somehow. Quite a few of these people have probably never written a class hierarchy from scratch themselves.

I literally just spent a day talking to some full-time developers with years of experience, explaining how to implement a simple "storage abstraction" OO hierarchy. You know, you have a base interface or abstract class with a bunch of implementations like "S3BucketStorage", "ZipFileStorage", "LocalFilesStorage", or whatever... and then you have the meta-implementations that combine them, such as "UnionStorage", "CacheStorage", and "RetryStorage", each of which take the abstract interface as input parameters during construction. So you can have local files act as a cache for S3 buckets (with retry) that override a local zip file of static content. Or whatever! Combine implementations to suit your whims.
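
To make that concrete, here's a rough sketch of the shape (simplified Java of my own; the interface and method set are assumptions, only the class names come from the paragraph above):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    interface Storage {
        Optional<byte[]> read(String key);
        void write(String key, byte[] data);
    }

    // A leaf implementation; S3BucketStorage, ZipFileStorage, etc.
    // would implement the same interface.
    class LocalFilesStorage implements Storage {
        private final Map<String, byte[]> files = new HashMap<>(); // simplified
        public Optional<byte[]> read(String key) { return Optional.ofNullable(files.get(key)); }
        public void write(String key, byte[] data) { files.put(key, data); }
    }

    // A meta-implementation: try the cache first, fall back to the
    // backing store, and populate the cache on a miss.
    class CacheStorage implements Storage {
        private final Storage cache, backing;
        CacheStorage(Storage cache, Storage backing) { this.cache = cache; this.backing = backing; }
        public Optional<byte[]> read(String key) {
            Optional<byte[]> hit = cache.read(key);
            if (hit.isPresent()) return hit;
            Optional<byte[]> fetched = backing.read(key);
            fetched.ifPresent(data -> cache.write(key, data));
            return fetched;
        }
        public void write(String key, byte[] data) { backing.write(key, data); cache.write(key, data); }
    }

    // Another meta-implementation: retry a flaky store a few times.
    class RetryStorage implements Storage {
        private final Storage inner;
        private final int attempts; // assumed >= 1
        RetryStorage(Storage inner, int attempts) { this.inner = inner; this.attempts = attempts; }
        public Optional<byte[]> read(String key) {
            RuntimeException last = null;
            for (int i = 0; i < attempts; i++) {
                try { return inner.read(key); } catch (RuntimeException e) { last = e; }
            }
            throw last;
        }
        public void write(String key, byte[] data) { inner.write(key, data); }
    }

Composing is then just construction: `new CacheStorage(new LocalFilesStorage(), new RetryStorage(s3Storage, 3))` makes local files act as a cache for a retried S3 store (where `s3Storage` is some S3-backed implementation of the same interface), and none of the pieces know about each other.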

They looked at me like I had grown a second head that started speaking Greek while the other spoke Latin.

Then they wrote some spaghetti code of functions with hard-coded parameters, checked that garbage in to the repo, and then dutifully sent out an email to management saying "job done".

Is OO bad, or are most developers bad? I suspect the latter...


I mean, if you really care about performance, object-oriented is a poor choice as it doesn't map well onto the GPU.


OO runtime dispatch as implemented by C++ via "vtables" of function pointers doesn't even map well to CPUs! The indirection via a data pointer that can change unpredictably is terrible for pipelined architectures. Similarly, this approach generally prevents inlining, especially across dynamic library boundaries.

However, many languages and even C++ with modern compilers can pull tricks to mitigate this. For example, static analysis can often be used to replace virtual calls with direct ones. Similarly, functions can be inlined in many cases, such as a "leaf" class in a hierarchy calling itself.

Languages with "virtual machine" runtimes such as Java and C# can potentially optimise even dynamic scenarios. Java certainly does in some cases.

I think the ideal OO framework would actually be more like what Rust does with traits. Have static dispatch as the default at runtime, but with the traditional OO model of interfaces, classes, derived classes, etc... Dynamic dispatch should be an option, but not the default. Ideally, dynamic dispatch should be used only on the boundary of binary modules such as DLL files or kernel-to-user-mode ABIs.

Note that OO was also designed to reduce compilation times by decoupling implementation from use. So if developer A updates an implementation (class/struct) of an interface/trait, then developer B using that interface can use incremental compilation without having to recompile the usages of the interface! This saves a lot of time for large code bases.

One reason Rust is notoriously slow to compile is because it always recompiles everything -- both implementation and usage of interfaces.

Again, a hybrid approach could work: dynamic dispatch by default for debug builds to enable efficient workflows, and static dispatch by default for release builds for runtime performance at the cost of longer build times.


Rust is not an "anti-OO" language. It has a basically complete featureset for OO programming with the one exception of implementation inheritance. It is fully usable for programming "in the large".


You say that it is fully usable for programming in the large, but other than Servo I can't think of any large Rust projects off the top of my head.

Certainly there are none with more than 1K developers collaborating.

I'm aware that you can have "dyn Trait" in Rust and a Box of a trait also provides runtime polymorphism. However, it just has too much friction due to explicit "no more OO!" decisions by the Rust core language team to enable truly large-scale programming, at least in my humble opinion.

I think such decisions stem from logic like this:

"I saw OO used extensively at <insert large project> and the <project> was a huge mess and nobody was happy, hence OO is bad and should be avoided."

Meanwhile the logic is more like:

"<large project> needed OO because it was large, and it was unpleasant to work on because it was large, not because it was OO. It would have been worse if it wasn't OO."


You can say this for ANY language. Literally any language that can assign a function to be part of some struct could be "OO"... which is basically any language including C.

So given the pervading reality, colloquially when we say the term "Object Oriented Language," we are not referring to Lisp or C or Haskell, even though all of those languages are technically OO.

The same can be said for Rust. Rust is NOT OO.


There is some very large scale Rust stuff happening at Amazon; whether it's OO or not I do not know.


> I find it much easier to formally verify things correct than to test that they're correct.

This brings to mind the famous joke by Knuth: "Beware of bugs in the above code; I have only proved it correct, not tried it."

I wonder which tools he uses to prove correctness, and in which language. I know about Frama-C, Coq, and things like that, but I never had the opportunity to use them outside of university. On the same topic, in a recent ACM publication, Ian Joyner compares Frama-C to lipstick on a pig.

https://cacm.acm.org/magazines/2021/12/256941-common-ails/fu...


In that case I think Knuth referenced proving it by hand, e.g. with Hoare triples.

You could argue that hand proof is useless, but by forcing yourself to go through it in tiny steps you can actually sort out a lot of oversights.


C is a pig of a language, so tools that show just how horrible it is, how many bugs are unwittingly written, and how hard it is to make its code safe, could also help pushing people towards better languages.

These tools are more like putting on X-ray glasses and seeing how your fast-food nuggets are made. You will still eat them if you have to, or if you're in a hurry, but you'll be more inclined to take the time to learn cooking, so you can eat healthier food later.


Whenever I hear complaints about a popular programming method, I have some real doubts. Modern software gets better every year.

There may be better ways, but what we have works.

I don't know if the modern world would be possible without OOP. If the only languages in the world were Haskell, C, and SQL... would I have actually learned them, or would I have chosen a different career? Would there be enough devs to have as much software as we do now?

It's hard to tell, because almost everything big is written using OOP. Few things on the scale of LibreOffice or Blink are purely functional.

And Java seems to work exceptionally well for large scale projects.


For my part, I've tried to be as overstated as possible, as you can tell from the title "Object Oriented Programming Is An Expensive Disaster Which Must End." Written in 2014, and discussed several times here on Hacker News, it remains the most popular technical essay that I've written:

http://www.smashcompany.com/technology/object-oriented-progr...


Every time I see this kind of post I remember the good old "Qc Na" koan[0]

> The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."

> Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

> On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened.

[0] http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/m...


The Rust book has a great section about OOP, and I think it solves a lot of the issues discussed in these articles at length.

http://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/sha...


Pretty tangled read. This seems to be an example of write-only code, but applied to essays. It appears the author is in favor of OOP but it's a little hard to tell.


> On claims of unit testing and OOP, I'm not going to go there much more because I still don't unit test my stuff. I'm currently not against it. I just don't know about it much. I find it much easier to formally verify things correct than to test that they're correct.

Ah, I see, OP has never written nontrivial code.


- phillips screw drivers are obsolete you should never use them just use torx and throw away anything that uses phillips screws because everybody is doing it. There never was any merit to using one vs the other either, just that people who are still using phillips screws need to get with the times and use torx.


Microsoft word is the most sophisticated word processor, people who use nano and vim are just behind the times... you can do anything with microsoft word. you can do anything with microsoft excel for that matter it's turing complete https://www.infoq.com/articles/excel-lambda-turing-complete/

so its realistically the only programming language you'll ever need and it's the right tool for every job because its point and click and it interfaces with the real world so that people can relate to it better and the learning curve isn't as steep.


You know what is even more valuable? Good documentation outside the code, and good comments in the code.


OTOH:

1) Most of the time GUIs use OOP, and it's the "other designs" which have a hard time being successful.

2) The rejection of multiple dispatch is very "hand wavy", a lot of Julia's success is linked to its multiple dispatch feature.


Meh. Is this a hot debate still?

No mention of Ruby or Smalltalk in this post, which I think of as "true" OO languages, down to the runtime. The Ruby object model has its merits! Sandi Metz's POODR is a fantastic intro to OO _and_ a compositional approach to design.

FP vs OO is always a false dichotomy for sure. Actors and messaging appear in FP languages. Inheritance surely has nothing to do with OO. Inheritance means nothing for data, and it's almost a bug to extend Record types.

In short, this post seems to rage against inheritance and blind use of design patterns, not the spirit of OO. But the post also qualifies that "that is what an OO advocate would say".

Consider TypeScript. The same program can be written with `class`es, or as a module of "loose" types and functions. Really, let's pick the mix that best represents the problem we're solving. I think OO can be a _useful complement_ to FP and other paradigms.


It is really weird that in 2020 we are still seeing these largely ideological essays against OOP.

The solution is simple: there are tons of languages to use today in a single service/application. Just create your own and put it on the market for validation; if OOP (or its absence) is really at the center of productivity, then it will be addressed.

Otherwise, I really don't understand the point of such rant posts, feels like marketing and personal flexing.


I only played with Smalltalk in school, as it was before my time, but it was great. As I am working at a large Ruby shop currently, Ruby is a lovely language. Ruby really does solve many of the issues I had with Java. It's a very elegant language. I think the behavior of any object taking messages is great, though that's not used much in professional code it seems. I think more people should give Ruby a shot.

I do wish it were a bit faster, but hopefully TruffleRuby, Sorbet, YJIT, etc. can put in the work to fix that.


I completely agree with this. I love writing functional Java. The issue is mutation, not OOP.


Would you have any resources on functional Java? My experience with Java 8 and onward is better, for sure. I find myself writing more robust filtering / map-reduce logic. But that's kind of it.

For instance, I don't feel like functions are first-class citizens still. (I almost never use higher-order functions in Java, even if it's possible.)


Effective Java [0] is the single most important book I can recommend for any Java programmer. This article [1] gives an overview of some of the primitives you can use for writing functional Java. Lombok [2] is a very common library that makes writing functional Java much more ergonomic with its `@Value` annotation, but that might not be needed anymore with Java 14's Record types [3].

The book Clean Code [4] helped me a lot to really learn how to write clean Java, and many of these ideas directly translate into writing good functional code. One of the key takeaways was just how small functions should be which incidentally is a great thing to learn when functions are your main unit of composition.

Java isn't a purely functional language, so you obviously will always have some impurity regarding state/mutation. I personally try to do the following:

1. Keep all state in some top-level class and let everything else be immutable

2. Nearly every class I write is immutable

3. Follow common OOP principles like SOLID

4. Write reactive code with heavy use of Optional and Streams
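
As a sketch of point 2 (my own example, using the record types mentioned above, so Java 16+): an immutable value class whose "updates" return new copies.

    // A record gives you an immutable value class for free.
    record Point(int x, int y) {
        // Instead of a void setter, return a new value ("wither" style).
        Point withX(int newX) { return new Point(newX, y); }
        Point withY(int newY) { return new Point(x, newY); }
    }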

Here's [5] an example repo of board game written using those ideas.

[0] https://www.oreilly.com/library/view/effective-java/97801346... [1] https://www.baeldung.com/java-functional-programming [2] https://projectlombok.org/ [3] https://www.baeldung.com/java-record-keyword [4] https://smile.amazon.com/Clean-Code-Handbook-Software-Crafts... [5] https://github.com/harding-capstone/logic


Thanks for your answer. I think I'm stuck with Java for the time being, but at least I now cater to a modern codebase. I think it's... alright. But I miss functions being truly first-class citizens.

That advice rings familiar or interesting. I need to give Effective Java a second look. It's been years and I'm a different dev now.


God. Those examples of how to achieve currying or composition are painful to read. No wonder I never write code this way in Java. Still, I want to be more diligent in that domain, because I know it pays off down the line.


As a professional dev who has made a career out of working in oop languages and codebases, I agree, and it took me far too long to realize that when it comes to oop, the emperor has no clothes.

To this day, oop advocates can't even agree on what oop even is or means.

Apparently oop as envisioned by Alan Kay was supposed to work like cells in the body that pass messages between each other and take actions independently.

Why? Who knows! It was never really explained why literally one of the most complex systems imaginable, one that we still really have very little idea how it even works, should be the model for what could and should probably be a lot simpler.

Today's modern oop languages are probably very far from what Kay envisioned (whatever that was), but it remains unclear why classes and objects are "better" than the alternatives.

And before anyone goes and comments aksully code organization blabla like yes but code organization can be great or shit in oop or fp or procedural codebases, it has nothing to do with the "paradigm".

Let alone that the entrenched, canonical, idiomatic coding styles of most modern oop languages encourage state, mutability, nulls, exceptions and god knows how many trivially preventable entire classes of errors.

Granted, most have now started to come around and are adopting more fp features and ideas every year, but still.

---

EDIT tacking on other old comments:

---

Don't get me wrong, writing programs like cells in the body that pass messages between each other and take actions independently is an interesting idea which deserves pursuing, if nothing else but to satisfy our curiosity and seeing to what if anything it's applicable and suited. (and even if the answer turns out to be "nothing", we've still learned something!)

But going from there to making strong claims about it being a more or less universally superior paradigm for computing and writing code, with little to zero evidence, that's a huge, huge stretch.

To the degree Erlang and Actors work, I think that's kind of a happy coincidence, and not due to any rigorous work on Alan Kay's part.

---

Just look at all the "OOP" languages where both the language developers and the user community are coming around to the facts that

* immutability and absence of state is preferable to mutation and statefulness

* Option/Maybe types (or "nullable types" which are a shoddy implementation of the same thing) are better than null

* Either/Result types are better than exceptions

* making things implement map, filter etc and sending in a function that describes what you want to do is better than manually eg looping through lists etc

etc etc etc

Anyone who doubts how endorsed this is, just read what Brian Goetz and Josh Bloch have to say about how to code in Java.

Just imagine if these languages had been implemented with these ideas in mind from scratch instead of the current situation of trying to adopt and retrofit this style when the core libraries fundamentally don't support it.

The current trend of "OOP" languages is basically inexorably heading towards FP and abandoning the old school "OOP" style. Eventually they will only be nominally "OOP", mostly in order to please people who have irrational attachments to labels like that, but be way more FP in nature and in all but name.

For what it's worth, people shouldn't be irrationally attached to the "FP" label either. Labels aren't important - what matters is the code, how easy or hard it is to reason about it, how well it avoids entire categories of defects from even being possible etc etc.

https://proandroiddev.com/kotlin-avoids-entire-categories-of...


I agree that both the OO and FP labels are overloaded with multiple indiscernible meanings.

I mean, Kay's OO, inheritance, classes, polymorphism, abstraction - those are a mix of paradigms, design patterns, and language features.

So is FP, with its monads, sum types, and referential transparency.

I guess what can make a better discussion is to take apart the label into these individual items and examine them one by one.

Inheritance - bad - get languages to safely discourage its user from using it

Actor and messages - good - get people to understand the concept and how to implement it in each languages

Immutability - good - let's announce how it helps reduce errors

Functional domain modelling - good - let's make everyone know how to do it

Foreign jargon of FP traits - bad - let's make a more walkable learning curve to those jargons

And so on and so on


> Apparently oop as envisioned by Alan Kay was supposed to work like cells in the body that pass messages between each other and take actions independently.

> Why? Who knows! It was never really explained why literally one of the most complex systems imaginable, one that we still really have very little idea how it even works, should be the model for what could and should probably be a lot simpler.

If you're curious, I think it'd help to read his Early History of Smalltalk; in it he illuminates what he means. http://worrydream.com/EarlyHistoryOfSmalltalk/


Awful writing. Rambling and incoherent. I do not know what the author was trying to say.

> I just recently figured out myself the pieces I was missing to writing effortlessly skimmable texts.

No; no, you didn't.


-- (Even if you don't read this whole comment, I urge you to watch this fantastic video which illustrates procedural, OOP(lite), and FP approaches to a sample problem: https://www.youtube.com/watch?v=vK1DazRK_a0 (Despite the name, it illustrates JavaScript, not Clojure). If you follow this approach, you don't even have to tell people you are doing FP. You'll just end up with better, more understandable, and more easily testable code.) --

The early (early as in C++ days) benefit of OOP was probably that it made programmers stop and plan before coding.

Having learned FP 20 years after learning OOP, I feel certain that I can do the same things in less and more understandable (and MUCH more easily testable) code in FP.

OOP was awesome for a small set of cases and for academic scenarios. But like REST, it doesn't map well to all needs. Then it becomes awkward and unnecessarily complicated.

FP, simplified as immutable data transformations with necessary mutations pushed to the edges, works everywhere, all the time. In FP you can still choose to model your data in hierarchies, keeping some of the useful bits of OOP. But it's still data in, results out.
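
To use a Java-flavored toy example (names made up): the transformation is a pure function, and the only I/O sits at the edge.

    import java.util.List;

    class Report {
        // Pure core: data in, data out; trivial to test.
        static List<String> formatTotals(List<Integer> sales) {
            return sales.stream()
                        .map(s -> "total: " + s)
                        .toList();
        }

        // Impure edge: the one place that touches the outside world.
        public static void main(String[] args) {
            formatTotals(List.of(10, 20, 30)).forEach(System.out::println);
        }
    }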

Until you've spent time on real projects in both, you cannot appreciate why FP is superior.

(And to be clear, I'm not even talking about Haskell and type-obsessed FP. Perhaps because I'm not a Haskell guy, I can't appreciate it. But it smells to me like an extreme of a good thing (and therefore not usually the best thing for the situation); but I digress.)

Concretely, I recently worked for a client that needed Ruby on Rails development done on a production product. It was fairly OOPish (in the Ruby way, which is to say much less insane than old-school Java OOP). Even so, the OOP at the business logic level was unnecessary. Testing was complicated and full of code to mock and fake.

When we were given the task to build a completely new solution to the same problem, greenfield, I immediately began building modules with almost entirely pure functions. Because Ruby isn't designed for this approach, it does involve a bit of care to respect memory and object-copy time costs. But in many business cases, the volume of data being processed isn't significant or isn't fast-repetitive.

The new product had many fewer lines of code, and the test cases were as close to beautiful as maybe is possible for tests. That's a different subject for debate...

The less experienced devs took a bit of time to accept and adjust, but the mid-level engineers became big fans of the approach. The juniors just accepted it (so nice :D). Test coverage went up, and the cost of adding new features went way down.

My FP languages of choice are Clojure and Elixir in that order. Elixir does offer some pretty great features that Clojure doesn't have (extensive pattern matching), but the syntax is imo very noisy compared to the utter simplicity of s-expressions. Either is fine for me. Ruby is fine, with some care. Python too. Vanilla Javascript can be fine, and some libraries can improve this.

If you've read this far and are not convinced, I urge you to follow some Elixir tutorials and reach the point of grokking it. Then, if you're a web dev, Phoenix is a fantastic framework. Also, the Erlang (BEAM) VM provides so many useful structures and utilities that let you build big distributed things easily compared to other languages.

Be warned though: once you do this, you will forever be frustrated by OOP codebases.


hah, is this the return of "topmind"? I miss that guy, he was the anti-OO fanatic of the 2000s on usenet and c2.com.


What's the pattern for implementing a cache without state? How about a command buffer? Not trolling, seriously curious.


> If I were smarter than I am, I might have went deeper on this regard.

Can't tell if this is deliberately ironic...


I never understand all the hate for certain tools. OOP is a tool, FP is a tool, and other programming paradigms are different tools. If you're bad at software architecture and you tend to write convoluted OOP software, you're probably also going to make a mess using FP. Applied correctly, they can both be great tools for different problem sets.

Writing something that inherently has a lot of state, like a simulation, game, or user interface? OOP might be a good choice. Working on a backend API that's basically a database wrapper, a data processing pipeline, or something else with less inherent state? FP may be a better tool. Choose the right tools, design good architectures, and stop blaming the language paradigm.


I do feel OO is the default at almost all companies, though, and it takes a lot of debate before you can start using other tools. I remember working for a C# shop where I had to do some pretty simple data ingestion and cleaning from the NHS API, which would have been fantastic in F# with its composition patterns, pattern matching, and type system. However, because it wasn't OO I couldn't get support, and so I had to write a lot of factories instead.


I generally feel the same, although as a pragmatic matter, the biggest problems I have ever faced in programming have by far been in dealing with poorly structured inheritance systems and object hierarchies in other people's libraries. Bugs are inevitable, but problems with object hierarchies seem to screw up everything unnecessarily in a way that doesn't happen in less object-oriented functional and procedural languages. Sometimes it feels like an extra layer of ways to shoot yourself in the foot, and the feet of everyone around you.

Strong typing can be a blessing and a curse, and screwing up type signatures through poor foresight can be a problem regardless. But get into elaborate object systems and it can be a different level. When they're done well it's no problem, but when they're done poorly, it's maddening.


Correct. Complaining about OOP is like complaining that a hammer isn't suitable for a task where you should have picked up a wrench in the first place.


The complaints against OOP imply a different analogy: that OOP is a bad hammer even when you do need a hammer.


Tl;dr: [nothing, really.]

"(Anything)-oriented programming" is obviously stupid: the world doesn't conform to your (anything), whichever you choose. QED, an (anything)-oriented programming language will be the wrong thing most of the time.

A general-purpose programming language needs to provide support for solving actual problems, whatever they are. Any program big enough to be interesting addresses sub-problems of different kinds, that each need their own treatment. Most problems benefit from a mix of approaches.

Once in a while objects are exactly the right thing. Then, if your language has stuff for that, lucky you; otherwise you need to cobble something together. Likewise, when any other formalism matches.

Almost always when people complain about OOP, it is because they want their (other-thing)-oriented language to get respect. That does not end well: the same arguments against OOP apply equally well to their thing, given trivial adjustments.

So, nothing to see here.


People who hate "OOP" are usually the ones who fail to understand how to use certain language features. A prime example is the Circle–ellipse problem, which demonstrates how people fail to realize that inheritance is about code reuse, not about representing hierarchies of abstract ideas (and I wonder where they got that wrong idea from; perhaps they extrapolated it from the name "inheritance"?).

The same goes for all the other language features.
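For anyone who hasn't run into it, here's a minimal Kotlin sketch of the circle–ellipse trap (the classes are invented for illustration): modeling the "is-a" taxonomy works right up until the supertype exposes mutation.

    // The taxonomy modeling: a circle "is an" ellipse... until mutation arrives.
    open class Ellipse(var width: Double, var height: Double)

    class Circle(radius: Double) : Ellipse(radius, radius)

    fun stretch(e: Ellipse) {
        e.width *= 2.0   // perfectly legal for any Ellipse...
    }

    fun main() {
        val c = Circle(1.0)
        stretch(c)                    // ...but it silently breaks the subclass
        println(c.width == c.height)  // false: c is no longer a circle
    }

Inheriting Ellipse to reuse its fields and code is fine; inheriting it to assert "a circle is an ellipse" is what breaks the invariant.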


The author of this article thinks that OOP basically means adding vtable pointers to the heads of structures, and shallowly dismisses multiple dispatch.


The author has a point. Yet most mainstream languages provide OO features, especially many recent ones. Even languages like Rust and Go have some limited OO features. And many languages also add features common to functional programming languages. So I guess the consensus among language designers is that it's not all bad.

I would say early OO is very different from what is practiced today. I use Kotlin mostly. It's obviously an OO language but it puts some interesting twists on it relative to earlier languages (like Java):

- classes are closed by default and you must define them as open to even be able to create a subclass. When you do, you must explicitly label the things in classes that you override. This prevents unintentional abuse of inheritance, which is common in many Java frameworks. E.g. Spring has ridiculously deep inheritance hierarchies. When I was still using Java I had a simple rule: any form of class extension is probably something I need to get rid of. Delegation is just preferable in my opinion. I almost always end up regretting class extension, to the point where I rarely consider using it.

- Speaking of delegation, the Kotlin language designers obviously agree with this and added interface and property delegation to the language. This is just syntactic sugar, but it's awesome. I can take any class, pass myMap: Map<Foo,Bar> into the constructor, and then declare : Map<Foo,Bar> by myMap in the class header. And just like that you have extended a class without actually extending it. I can even override some of the methods (because the class implements the interface). They basically provided syntactic sugar for a common design pattern, delegation, which you should almost always favor over inheritance IMHO (see the first sketch after this list). Property delegation is equally powerful; you can use it to e.g. lazily initialize a property: val foo: String by lazy { someFunctionThatReturnsAString() }

- it encourages the use of val variables that cannot be reassigned. If you want reassignment, you need to use var. If you define a var and don't reassign it, the compiler will warn you to use a val instead. Immutability by default is encouraged, and it helps with e.g. asynchronous code and a few other things.

- it has sealed classes (and interfaces) as of a few versions ago. The advantage of those is that the hierarchy is closed after compilation. You can't add more subclasses, which lets the compiler check exhaustiveness (e.g. in when expressions) and do some optimizations (also shown in the first sketch below). Likewise, value classes are now a thing. In the rare cases where I do use inheritance, I use sealed classes.

- it actually discourages class extension in favor of extension functions and properties. This is much cleaner and does not suffer from a lot of the problems associated with inheritance. For example, you can't actually override anything this way. Extension functions are surprisingly useful and I use them a lot. They even work on type aliases and on nullable types, so you can call a function on a null value, e.g. for a nullable generic type T, and it's not going to trigger a null pointer exception (see the second sketch after this list). More languages should add this. It removes a lot of the use cases where you might have used inheritance or interfaces in the past, and it also removes a lot of the need for multiple inheritance, which is problematic in the few languages that still support it.

- having default values for function parameters means that you rarely need more than one constructor for a class. You can add more constructors, but it's just not something you need often, and certainly not to support different combinations of properties. The primary constructor has no body either; all it does is assign properties. This enforces the sane rule that constructors must do no work. If you need to do work on class creation, you add an init block (see the second sketch below).
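Since a couple of these features are easier to show than to describe, here is a first minimal sketch of interface delegation and a sealed hierarchy (all names invented for illustration):

    // Interface delegation: "extend" Map behavior without inheriting an implementation.
    class Settings(private val backing: Map<String, String>) :
        Map<String, String> by backing {
        // You can still override selected members of the delegated interface.
        override fun get(key: String): String? = backing[key.lowercase()]
    }

    // A sealed hierarchy: closed after compilation, so `when` can be exhaustive.
    sealed interface Shape
    data class Circle(val radius: Double) : Shape
    data class Rect(val w: Double, val h: Double) : Shape

    fun area(s: Shape): Double = when (s) {
        is Circle -> Math.PI * s.radius * s.radius
        is Rect -> s.w * s.h
        // no else branch needed: the compiler knows the hierarchy is closed
    }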
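And a second sketch covering extension functions on a nullable type, default parameter values, and an init block (again, all names are made up):

    // An extension function on a nullable type: safe to call on null.
    fun String?.orPlaceholder(): String = this ?: "(none)"

    // Default parameter values: one constructor covers many call shapes.
    class ClientConfig(
        val baseUrl: String,
        val timeoutMillis: Long = 30_000,
        val retries: Int = 3,
    ) {
        init {
            // The constructor does no work; validation lives in init.
            require(retries >= 0) { "retries must be non-negative" }
        }
    }

    fun main() {
        val name: String? = null
        println(name.orPlaceholder())                  // "(none)", no NPE
        val cfg = ClientConfig("https://example.org", retries = 5)
        println(cfg.timeoutMillis)                     // 30000, from the default
    }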

I'm sure some language designers have plenty of nits to pick with Kotlin. But for me it's a very pragmatic language that mostly manages to nudge people to do the right things while providing them with a lot of convenience. Scala people tend to look down on it, for example, and that language does have some interesting features. But then I find most Scala code to be utterly unreadable. Purity has a price, I guess. And of course many Scala coders apparently consider its OO legacy to be somewhat of a mistake.


Should be marked (2020).


OOP hate is evergreen


Added. Thanks!


Isn't OOP mostly just a buzzword?



