Depending Less on Structure (lmatteis.github.io)
62 points by sktrdie 8 months ago | 51 comments

1. Example is too simple, functions could modify anything in a model depending on context.

2. Good luck trying to optimise this (just merging all behaviours together at the end does not equal optimisation).

3. You're just pushing the problem elsewhere - now I need to understand all behaviours from a black box, then wrap them, all while dealing with all the other wrappers people have already put in place. Imagine dealing with that complexity.

Yeah. I like it from the point of view of "think new thoughts". It's an interesting thought. Let me highlight that with a few more words here to indicate that I don't just mean that as a polite sidebar; it really is an interesting thought. There may be something there.

But as presented, the code complexity scales like a nightmare; it makes the worst Ruby monkeypatching look sane. Something would have to be done about that before even a toy implementation could be attempted, because this will fall apart very quickly.

When you give away structured programming, you give away its benefits too. One of them is that in a structured program you can freeze a thread at any point and have a clean mechanism for determining its stack trace, and all the relevant scopes it has access to that could be affecting it. In real code this process may produce a very large number of variables and stack levels, but it will still be a fraction of the program's possibilities. This is how structured programming helps us approach programs as structured slices of the code base, instead of as a holistic view of the entire code base at once (this is the true reason goto-based code was so evil back in the day; you could never do this analysis). This approach throws that away. That is not intrinsically wrong. But I'm not convinced at this time that the compensating features make up for losing that clarity, especially because the benefits are going to have terrible scaling problems as presented.

If someone were going to pursue this, this is the angle I'd take: how do I freeze my program and determine what is affecting the behavior of the current execution trace? Assume an arbitrarily powerful debugger attached to your running process, as long as it doesn't "magic" anything into existence. I can see sketches of ideas in my head; I am by no means saying this is unsolvable. But I think it's something I'd need to see laid out a bit more before I got too excited. To anyone thinking about this, I'd advise you to also not forget scale. Anything works for 100 lines of code. I need you to present me with something that can make sense of running, let's say, thousands of these advisory functions at a time on a single question. Structured programming has an answer to that; being a thousand layers deep in a stack trace may be a lot for a human to take in, but nevertheless, the procedures that structured programming provides will still give you answers. (And I'm not asking for the solution to work any better than structured programming does in that case. Thousands of anything is irreducibly complicated. But it needs to at least give me understanding in the ballpark of what I have in a thousand-deep stack trace.)

If you want to continue thinking along these lines, then look at how humans do it. For example, when writing a Word document, the person has a continuous stream of ideas of how to write about a subject. But do they continually append only to the document? No, they delete, rewrite, restructure etc. It's just that English is more flexible than our programming languages (and thus more ambiguous), so these operations are easier to do.

I would think if continually layering over the top of existing ideas was somehow better, then evolution would have gone that route instead.

You're comparing writing a document with how we evolved, which in my opinion is a false analogy. Our brains interact with complex systems in a much more intricate way than by "writing a structured document". And in fact the entire attitude of this approach is that we're not dealing with a statically structured Word document, but with a much more intelligible system: a computer. Hence, why not imagine programming differently: where we discuss with the system and query it, similarly to how the modules written in the article illustrate.

Perhaps, but we've had this problem since we've been able to formulate speech, where we have no way to 'undo' what we said in the physical world. So our language adapted to allow us to say things such as 'let me rephrase that', or 'ignore what I said, I'm wrong' etc. It's interesting that when we moved to computers, the preference was almost immediately to be able to completely delete/rewrite our thoughts instead.

Indeed! And that's exactly why I think programming feels so unnatural. Sure, it feels natural to us, as we have years of experience modifying the structure of programs, where the structure is dictated by how computers mechanically work with memory, storage and other parts of the processor. Programming is this way because the actual computer structure demands it to be that way.

Hinting at solutions that are closer to how we generally think, or at least closer to how "non-coders" interact with computer systems, should be more mainstream. For instance, this approach is much more similar to how non-coders write down requirements: when we discuss with people "clicking the button should stop the door from opening", we don't open a bunch of other documents and modify them accordingly to satisfy that statement. Instead we just "pile on top" such a fact, which might overwrite, contradict or replace other pieces of requirements.

The whole point of this article is to make programming similar to the way we write requirements.

It will make it as much of a mess as any legal document, making it impossible to work with for any sane human not paid to do so.

If you literally cannot guess at a glance how it will work as a programmer, how can you expect a user to do so?

Even this trivial example is rather wrong. People do not think of message passing, synchronization or locks. We generally work on unordered, timesliced but fuzzy logic.

When I press a button, it will immediately light up a red lamp - nothing in this is remotely near to how a machine will implement it. There's no message passing in the transistor that sits on a wire. Immediately does not exist, but you can consider some things "perceptually immediate", putting a hard time bound. Yet our programming does not deal with time explicitly. Pressing a button means a contact, but it does not say for how long or whether any bounce has to be handled, because we think in terms of stable states when really they don't exist either. Yet there's immutability and hard state everywhere in our programming.

A human will handle a hardware error with a workaround, but software will fail in any unexpected scenario. And there's tons of these. We simplify programming by relying on the intelligence of the operator in many cases, but do not provide guidance. Literally not even whom or where to ask. It's good if the software even attempts to collect error info.

How do we handle errors? Race conditions? Re-tries? Anything beyond the boundary of our executing process is veering into distributed systems.

From what I can tell that's exactly what a 'module' is. By weakening the contracts between modules and turning synchronous execution into asynchronous calls, you've taken on creating a framework / language that takes the difficulties of distributed systems and made them local, so every bit of important logic is a distributed systems call.

> However I think that this approach comes with some cost. For instance if (isMultipleOfThree(x) && endsWithDigitFive(x)) is very easy to understand, but isMultipleOfThree(x) and endsWithDigitFive(x) in different modules is a lot harder. Personally I see this more as a trade-off: trading control flow for extensibility.

I think this is a massive understatement. Imagine replacing most of the ifs in your program with new top-level constructs that run in parallel and exchange events. The maintenance overhead of this idea seems astronomical, for relatively little benefit. I would be terrified to add new modules to such a beast, knowing that they could affect the behavior of any other module, in complicated cascades.

The difference is that at any point in time you can add things that "must not occur", e.g. forbidden scenarios. If you don't understand a specific execution you can block it and work on replacing it with another one that is more appropriate.

How would I know if blocking an event would not cause cascading behavior wrecking other use cases?

How do you know when adding a new requirement that the forbidding of something won't cause a cascading of other requirements to conflict with each other?

In essence we are used to restricting our understanding of a piece of software through the structure and the code (and maybe some documentation). Instead this approach aims at a much more dynamic and feedback oriented view on debugging: if blocking an event causes other things to break one should be able to run tests with the new blocking module in place and figure out why things break; just like you do in normal code.

I would argue that also a small change to a strictly-structured program might cause the wreckage of other use cases.

When doing things properly, we are not restricting our understanding of a piece of software through the structure and the code; we are restricting what we need to take into consideration in order to achieve a necessary understanding of our code, and confidence in its correctness.

In badly-designed conventional code, changes can have far-reaching consequences, which makes it hard to understand, but under this proposal, it would seem to be the default. Reasoning about asynchronous behavior is hard. (These points are demonstrated by how much had to be written to explain the example.)

The author says that merging is the problem, and that this approach obviates it. With conventional programming, the pain of merging can be mitigated with the careful separation of concerns, while this approach seems to turn every aspect of programming into merging.

Everything I have said here about conventional programming depends on doing it well, but at least you have the option to do so, and simplify things through the separation of concerns, while I suspect the approach here makes combinatorial explosions of complexity almost unavoidable. The goal is not to make complex software, but to solve complex problems with software that is as simple as possible to produce and verify.

> trading control flow for extensibility.


Nevertheless, taking ideas to the extreme might yield interesting insights.

This looks similar to tuple spaces, which were popular when I was a kid. At least the promise is similar: programs that work together without knowing about each other, by adding rows to a data store or waiting for rows that meet a specific condition. https://en.wikipedia.org/wiki/Tuple_space
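For reference, the coordination style tuple spaces offer can be sketched in a few lines of Python. The `TupleSpace` class below is a toy invention for illustration, not an API from the article or from any Linda implementation:

```python
import threading

class TupleSpace:
    """Toy tuple space: programs coordinate by adding tuples and
    waiting for tuples matching a predicate, without knowing
    about each other."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        # Add a tuple and wake up any waiting readers.
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, match):
        # Block until some tuple satisfies `match`, then remove and return it.
        with self._cond:
            while True:
                for tup in self._tuples:
                    if match(tup):
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()

# One program publishes a result; another, written independently,
# waits for any tuple tagged "result".
space = TupleSpace()
space.put(("result", 42))
tag, value = space.take(lambda t: t[0] == "result")
```

The promise is the same as in the article: the producer and consumer share only a data convention (the tuple shape), not any structural knowledge of each other.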

This 'layering' of new functionality reminds me of the Subsumption robots architecture[0].

My main takeaway was that you could build complex behavior by layering together reflexes. You wouldn't actually disable an existing layer, just add an overlay to disable it in certain circumstances.

I heard in a lecture once that our hands are wired similarly; we have a nerve that triggers a grasping motion involving all fingers, then we have overlay nerves that can disable the signal on a per-finger basis.

It may be that this is a fine way to build complex systems that's not optimal for _human_ reasoning patterns, but may be for evolution (or other hypothetical non-human reasoning processes).

[0] - https://en.wikipedia.org/wiki/Subsumption_architecture

At the top level it seems this is and-ing together multiple requirements. You can add more requirements and it will return true less often.

This sounds similar to the Cue configuration language? I haven't used it but I understand that you can add new constraints without necessarily understanding all the other constraints.

You probably do need to test the results to make sure you haven't overconstrained things, though?

This reminds me of a project I worked on right after college, doing the initial implementation and research of something now called FLX (https://books.google.com/books?id=DiOr9P2oA6YC&pg=PA115&lpg=...)

One of the issues when the number of modules grows large is that they may all have different conditions to keep track of. It's then possible that the modules all contradict each other and never yield an executable state. Imagine trying to debug a system like that.

The next layer is then to ask, with all the modules and their output state, will their combinations ever be satisfiable. At that point, we turn the states into a large predicate logic equation and can use a SAT solver to figure that out.
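The satisfiability check described above can be sketched by brute-force enumeration in Python (a real system would hand the formula to a SAT solver; the three constraints and event names below are invented for illustration):

```python
from itertools import product

def satisfiable(constraints, variables):
    """Return True if some assignment of the boolean `variables`
    satisfies every module's constraint simultaneously."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            return True
    return False

# Three hypothetical modules expressed as constraints over events a, b:
# one requests a, one forbids a and b together, one requests b.
constraints = [
    lambda e: e["a"],
    lambda e: not (e["a"] and e["b"]),
    lambda e: e["b"],
]
# These modules contradict each other: no assignment satisfies all three.
print(satisfiable(constraints, ["a", "b"]))  # False
```

Dropping the middle constraint makes the system satisfiable again, which is exactly the kind of diagnosis a solver-backed runtime could automate.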

I left off there in 2005, but the project is still being researched. If you’re interested I can get you in contact with the professor.

Very interesting insight! Indeed the authors are working (as of 2019) on making the sync points resolve based on constraint solvers:

- https://arxiv.org/abs/1909.00408

- https://www.researchgate.net/publication/338358976_Executing...

I am currently reading through your research and will surely get back to you about this.

The concept is interesting - adding new behaviors to existing code without fully understanding it. The new code also looks like it could execute concurrently, which is a bonus.

But there are times you do want to understand how things work. With lack of clear "connections" to follow in the code, it seems like it would become harder to reason about what is going on in the first place.

I think there could be 'views' of data flow instead of a static data flow, i.e. you could somehow query the data flow for a specific output to see which behaviors had an effect on it.

Forgive me my ignorance, but isn't this semantically how branch prediction and pipelining in CPUs work? As in, when you are writing code that looks like 'isMultipleOfThree(n) && isMultipleOfFive(n)' in a lower level language, the compiler and CPU work together to produce logic that computes both and sync()s on the answer? I do not see the benefit of pulling this complexity upwards into the problem domain or the manner we model solutions to real-world problems.

OTOH, if one were to think about it, we do anyway build software in the manner described -- the only difference being that we 'sync()' on a different layer of abstraction -- the OS, the network, the libraries, the APIs etc. Building software to be modular, pluggable, extensible, swappable at those layers has always been considered good practice. Is the benefit of creating additional abstraction layers within the software itself really worth the complexity? I'm not convinced.

The example with the number is poorly chosen. It is overly simplistic and can easily be implemented as something like

    def satisfies_all(value, functions):
        # every predicate must hold for the value
        return all(f(value) for f in functions)

I get that the author wanted to go with a very basic example to keep it easy to understand, but it seems like that's the main problem with the proposed mechanism: It gets very hard to understand very quickly, leading to spaghetti code.

<quote> In the common integration-style way we’d have to alter and somewhat complicate the modules that are responsible for these changes.

In this new Behavioral Programming style we can instead map these changes quite naturally to new modules that can be swapped into the program without touching or even seeing how the system works. </quote>

If it's so natural, where's the example?

I gave an example of how a requirement is mapped to an actual module in the next section.

This sounds a lot like lazy evaluation. And the cohesion/coupling graphics look a lot like a category diagram. Haskell is lazily evaluated, and that laziness is one of the reasons programs written in it compose so well. The benefit of it is that you don't have to write all of this extra code.

An interesting way to think about it.

To me it's a bit like event-based programming: separating the event (when to do) and the eventhandler (what to do) where handlers depend only on events and not on other handlers (or state), so the logic the handlers depend on remain separate.
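That separation can be made concrete with a minimal sketch (the `EventBus` class is a hypothetical stand-in, not from the article): handlers register against event names and never reference each other.

```python
class EventBus:
    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        # Handlers depend only on the event name, never on other handlers.
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, payload):
        # Invoke every handler registered for this event, in order.
        return [h(payload) for h in self._handlers.get(event, [])]

bus = EventBus()
bus.on("button-pressed", lambda p: f"lamp on for {p}")
bus.on("button-pressed", lambda p: f"logged {p}")
results = bus.emit("button-pressed", "door")
```

Each handler can be added or removed without touching the others; the coupling lives entirely in the event name.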

Yes, although a key difference seems to be that steps in the pipeline are allowed to interfere with arbitrary other steps.

Traditional event-driven programming often has event priority values or rank, and an optional "end event processing" operation associated with events, doesn't it? MS-Outlook's "rule engine" has these, for example. That seems the same thing as "allowed to interfere with arbitrary other steps".

Sure, you can "lock out" using/modifying such operations, but one must know when and where to lock someone/something out. Programming is still a persnickety endeavor with or without using events.

At least events themselves are processed sequentially, so having a trace of them is meaningful. Here, anything goes, as there's a multitude of queues.

> Intuitively we just introduced a change to a program, albeit a simple one, without having to do any integration work.

Rather, "while leaving integration work undone, pretending it doesn't exist".

Does it mean that instead of spending time on defining functions, we need to spend time on defining modules? What's the difference between modules in this context and functions in the `traditional` context?

As far as I understand, a "module" doesn't take parameters or return a value, but instead subscribes and publishes to some queues. That can be seen abstractly as a function (topics to subscribe to) => (messages written to topics). I think they are named "modules" to make it clear that they aren't conventional functions, but instead blocks of logic that consume from and have effects on some message-passing data structure.

Just my understanding after reading the article ¯\_(ツ)_/¯

I wish there were more context here. I have no idea what `sync` is. I gather this is js. Searching "javascript sync function" yields nothing obvious. Is it a framework thing?

There is no JS in this post, it is pseudo-code for a wholly different kind of language/runtime system:

> Let’s rewrite the program above using a sort of new “language” with different execution semantics.

The semantics of it seem to be "wait for an event with some tags".

It’s not; it's just a name the author gives to a fictional function that is supposed to block, like Node.js's fs.readSync(). At least that’s how I understand it.

You probably overlooked this in the text:

> Let’s rewrite the program above using a sort of new “language” with different execution semantics.

It's in the text:

sync() calls that allow a module to peek at other modules and control their execution.

    return isMultipleOfThree(readInput());
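For what it's worth, the semantics sync() seems to imply (the request / wait-for / block idiom from the behavioral programming literature) can be sketched with Python generators standing in for modules. This is my own reading of the article, so the dict shape and event names below are assumptions:

```python
def run(modules):
    """Toy behavioral-programming scheduler. Each module is a generator
    that yields a sync point: the events it requests, waits for, and
    blocks. The scheduler picks a requested event that no module blocks,
    then resumes every module requesting or waiting for that event."""
    sync_points = {m: next(m) for m in modules}
    log = []
    while True:
        blocked = set().union(*(sp.get("block", set()) for sp in sync_points.values()))
        requested = set().union(*(sp.get("request", set()) for sp in sync_points.values()))
        candidates = requested - blocked
        if not candidates:
            return log  # no event can be selected: the program halts
        event = sorted(candidates)[0]  # deterministic choice for the sketch
        log.append(event)
        for m, sp in list(sync_points.items()):
            if event in sp.get("request", set()) | sp.get("wait_for", set()):
                try:
                    sync_points[m] = m.send(event)
                except StopIteration:
                    del sync_points[m]

def add_hot():
    for _ in range(2):
        yield {"request": {"hot"}}

def add_cold():
    for _ in range(2):
        yield {"request": {"cold"}}

def interleave():
    # Added later, without touching the modules above:
    # forbid two "hot" events in a row by alternating blocks.
    while True:
        yield {"wait_for": {"hot"}, "block": {"cold"}}
        yield {"wait_for": {"cold"}, "block": {"hot"}}

print(run([add_hot(), add_cold(), interleave()]))  # → ['hot', 'cold', 'hot', 'cold']
```

The `interleave` module changes the program's behavior purely by blocking and waiting; the other two modules never see it, which is the extensibility the article is selling, and also the opacity the critics above are worried about.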

I might be missing something but from a quick skim did he just reinvent (a specific instance of) monads?

Well, Haskell's IO is "a specific instance of monads", but it was still an important discovery, even though monads were a well known concept.

The idea would be to encapsulate your whole program in a very specific monad, one that:

- runs all functions in parallel

- implements sync points between functions

- implements an event system with named events and attached values

This does not at all reduce to "reinventing monads".

Note: I think the idea is horrible, any moderately complex program written this way would be entirely unmaintainable in my opinion. The author touches on this towards the end, but I think they are completely minimizing the down-side.

> Note: I think the idea is horrible, any moderately complex program written this way would be entirely unmaintainable in my opinion. The author touches on this towards the end, but I think they are completely minimizing the down-side.

I agree so much. I can't help but think that modifying code you don't understand by injecting more code via hooks is a recipe for disaster, and the next person maintaining the composite product has to chase down all (ab)users of these events like we once had to hunt down abusers of global variables to understand what the hell is going on.

I don't even agree with the premise that this thing is less dependent on structure. Arguably, it is more dependent. It's just that instead of structure at the syntax level, we're now dealing with structure at a higher level, structure that is opaque to syntax but nevertheless present. Those named events and synchronization mechanisms make up their own structure.

Well, good point. I think what the author is talking about can be implemented as a monad, and he / she is describing a flatmap operation.

However, I’d hardly describe it as “just”, it’s a fairly interesting idea.

> However, I’d hardly describe it as “just”, it’s a fairly interesting idea.

I'd argue that it's an absolutely amazing idea; however, this idea was brought into CS 31 years ago and has been utilised heavily ever since, so an article like this should imho at least mention it so that people can go away and learn about these concepts.

PS: My GP comment got heavily downvoted, which means that I'm missing something here.

> PS: My GP comment got heavily downvoted, which means that I'm missing something here.

I'm also wondering why this is; is there some obvious innovation described in the article that I missed?

The initial comment (and the follow-up) are completely dismissive. The article is absolutely not a reinvention of the concept of monads, or of any commonly used monad (though what they are describing could probably be implemented as a monad).

If I'm missing something, could you point to some monad that can be used to `wait until an event named "good" is generated, and abort if an event named "bad" is generated`?

I'm not much of a functional programmer, so I don't deal much with monads. What I can tell you, is that for simple use-cases this looks an awful lot like map-reduce, except you're not necessarily mapping a list of inputs, but evaluating a series of checks on one input. Let's call it eval-reduce.

But even from a non-functional perspective, I think the article is reinventing the wheel. We have a name for that sort of behavior: Object Orientation. Not in the UML/Java/C++ sense, but the original idea that centered around message-passing. Think Erlang or Elixir; those work in a very similar way already, and it's why people love them so much.

There is certainly some innovative thinking in the article, but that's mostly related to the syntax and the specific semantics, not the overall idea of evaluating conditions concurrently and joining them in the end.

> What I can tell you, is that for simple use-cases this looks an awful lot like map-reduce

I really don't think this is like map-reduce, map-reduce usually focuses on pure functions applied in a series and/or in parallel, with the entire pipeline acting essentially like one large pure function that can easily be distributed.

In contrast, this idea seems to focus on persistent side-effectful procedures that react to a string of events, with constant global syncing between an unbounded number of modules, which makes it impossible to distribute and highly non-pure.

Also, this seems like a much more specific idea than Objects or even than Erlang. It specifically focuses on extremely small "objects"/"processes", it specifies rigid semantics for their communication, and a frankly insane composition model for the whole program.

I'm pretty sure you could implement this in Erlang, but also pretty sure that this is not how most Erlang programs are written. Similarly, this is within the paradigm of Object Orientation, but it is a very specific example, and there are other (saner, imho) ways of composing Object Oriented programs that conform to the original Alan Kay ideas.

Pretty much, or at least reinventing functional interfaces which encode behavior at the type level.

The point of structure is to aid comprehension. If you throw out the requirement for the system to be comprehensible, you can write it any way you wish.

Treating code as a black box with regard to making behavioural changes is essentially taking the worst aspects of TDD to extremes. It is not enough that it gives the right answers for your test cases, since you can only test a vanishingly tiny subset of all cases which need to be handled. It must be possible to reason about all possible outcomes, and that means removing unnecessary logic, not plastering over it.

Alternatively it involves actually mathematically proving that the function is used correctly, which is rather hard in sequential programming languages, and nigh impossible in parallel. Just proving the alleged "sync" works is a nightmare involving some proof or assumptions on underlying hardware. Exhaustive testing of parallel behaviors is near impossible too.

Removing logic can only go as far as the essentials that are required for solving the problem.
