Language constraints aside, the real world is not something that can be cleanly modeled without the notion of circular dependencies between things. Not very many real, practical activities can be truly isolated from other closely-related activities and wrapped up in some leak-proof contract.
Consider briefly the domain model of a bank. Customers rely on Accounts (i.e. have one or many). Accounts rely on Customers (i.e. have one or many). This is a simple kind of test you can apply to your language and architecture to see if you have what it takes to attack the problem domain. Most approaches lauded on HN would be a messy clusterfuck when attempting to deal with this. Now, if I can simply call CustomerService from AccountService and vice versa, there is no frustration anymore. This is the power of reflection. It certainly has other caveats, but there are advantages when it is used responsibly.
If you want to understand why functional-only business applications are not taking the world by storm, this is the reason. If it weren't for a few "messy" concepts like reflection, we would never get anything done. Having one rigid graph of functions and a global ordering of how we have to compile these things... My co-workers would laugh me off the conference call if I proposed these constraints be imposed upon us today.
Sometimes I read conversations about software engineering concepts by career developers and I get this sense that there is this entire hidden world of other engineers that I have somehow managed never to encounter.
Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard-won experience dealing with real-world problems they encountered. And yet comments like yours reveal there are other engineers working in places where apparently the opposite is true. Whenever I see that I wonder how this sort of self-sorting manages to occur.
> Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard-won experience dealing with real-world problems they encountered.
They may not realize they have them - at runtime, in the object graph, possibly indirectly. There's little difference in principle between cycles in "static" code vs. runtime state, but we often can't express them in the former, because our languages don't specify the concept of dependency at enough granularity (and some rely on a linear compilation pass).
The fact that computing is shot through with people who think that putting things in-band makes them go away, and that because special cases have nice properties therefore King Canute can abolish the general case with a wave of his hand, is a very large part of the problem with computing.
I've worked with very high-level engineers at Google who thought cyclic dependencies are fine. Sometimes they are just the simplest solution, and trying to design abstractions to avoid them just creates a huge mess.
If you've worked with Java at Google, Guice makes you learn to hate hiding circular dependencies behind dependency injection, since your binary gets injected with some huge tree of thousands of dependencies that can all create errors or interfere with each other. And without static checking, trying to reason about and fix those issues becomes really hard.
I strongly prefer using the language's actual type system to create circular dependencies that you can inspect using well-known tools over that any day; better to have a problem you can see than to hide the same problem to satisfy some tool's requirement.
I have to say that in my whole career of doing Java / Clojure, Google’s SDKs are some of the worst in terms of dependency hell. They’re incredibly unpleasant to work with: lots of breaking incompatibilities with libraries, etcetera.
I may sound very snarky, but it really feels like an over-engineered mess, and I suspect the reason they need all these dependency tricks is the fact that it’s over-engineered, not necessarily that the problem they’re solving is so unique that they need circular dependencies.
You never truly need circular dependencies, but many times letting functions at the same abstraction level call each other greatly simplifies the code.
I think something has been lost along the way with injection and IoC, which makes using a fully functional programming language for object wiring a compelling alternative for most engineers.
To me, the parent is describing a functional circular dependency that we all have to live with, but when we design the system it will be masked behind a many-to-many association in a third object/table, and we’ll proudly say “we have no circular dependencies”.
To me both are technically true, reality is messy, and we hide the mess under imaginary constructs.
Possible reason: selection bias. If everyone else is fine with circular dependencies, or at least sees them as a necessary evil... are they going to be extremely vocal about the current status quo when they have no need to defend it?
Some problem domains have unique characteristics. I think the closer a problem resembles nature, the more it breaks the structures we want to impose on it.
It's not clear the parent commenter actually needs a cyclic reference there, so I wouldn't draw any conclusions from it. Some of us don't believe it's necessary in that example.
In F#, you would naturally approach this by defining a Customer independent of the Account (e.g. just containing a name and address), and an Account independent of the Customer (e.g. just containing an ID), and then a Bank which is a mapping of Account to Set<Customer>. What you see as a cyclic dependency, I see as a data type that you haven't reified.
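A rough sketch of that shape, written in Rust for concreteness rather than F# (all of the names are illustrative, not anyone's real API):

use std::collections::{HashMap, HashSet};

// Ids are hashable so they can key the maps below.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct CustomerId(u64);

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct AccountId(u64);

// Customer knows nothing about Account...
struct Customer {
    id: CustomerId,
    name: String,
    address: String,
}

// ...and Account knows nothing about Customer.
struct Account {
    id: AccountId,
    balance_cents: i64,
}

// The relationship itself is reified as its own piece of data.
struct Bank {
    customers: HashMap<CustomerId, Customer>,
    accounts: HashMap<AccountId, Account>,
    holders: HashMap<AccountId, HashSet<CustomerId>>,
}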
This came up in an application I had modeling college football. A Team is a member of a Conference and a Conference is made up of Teams. But during realignment a few years ago, when teams were switching all over the place, it became apparent that Teams and Conferences were truly independent of each other and that their association was itself a first-class object, which also needed to capture the Season/Year.
Not really rocket science, but sometimes your model is telling you something if you’re willing to listen.
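The shape that fell out was roughly this (a Rust-flavoured sketch with invented names, not the actual code):

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct TeamId(u32);

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ConferenceId(u32);

struct Team { id: TeamId, name: String }
struct Conference { id: ConferenceId, name: String }

// The association is a first-class object and carries the season,
// so realignment is just a different set of membership rows per year.
struct ConferenceMembership {
    team: TeamId,
    conference: ConferenceId,
    season: u16,
}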
This is almost exactly the motivational example that Codd used when originally describing relational algebra. He described 5 different data organization schemes for a single problem and designed relational algebra to work with all of them. This ability for the same program logic to work with many different data storage layouts should make changes like you describe less painful to implement.
(see page 2)
E. F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377–387. DOI:https://doi.org/10.1145/362384.362685
Mine too. My main gripe with object graphs is that often you need to make arbitrary decisions about what comes "first" -- should I represent my customers as a list of `(name, address)` records or a record `(names, addresses)` of lists? This is a silly decision to have to make, and an even sillier one to have to deal with when later you need to transpose the structure to more naturally fit some other task. Relational tables sidestep this issue.
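Concretely, the arbitrary choice is between two layouts like these (a Rust sketch, field names made up):

// Layout 1: a list of records.
struct CustomerRow {
    name: String,
    address: String,
}
type CustomersAsRows = Vec<CustomerRow>;

// Layout 2: a record of lists -- the "transposed" form.
// names[i] and addresses[i] describe the same customer.
struct CustomersAsColumns {
    names: Vec<String>,
    addresses: Vec<String>,
}

Both hold exactly the same information, but code written against one layout has to be rewritten to work with the other.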
Over the years I've found that these transposition problems are a huge drag on programs, and once you recognize them you start seeing them all over the place. They make simple and obvious relational tasks into dozens of lines of nested data structure traversal and manipulation.
The notion of transposition comes from the array programming world, where a multi-dimensional array/tensor can be seen as a tree structure, whose levels may be easily reordered by transposition. I sometimes use numpy arrays for general-purpose programming; it can be very powerful with a well-chosen representation. Unfortunately array concepts basically require the data to be rectangular -- `x[i]` has the same shape as `x[j]` -- which can be hard to adapt to general problems.
My dream then is a general-purpose data structure that takes the best of both the relational and array-programming worlds: the freedom and flatness of relational tables, enriched with tacit ideas like broadcasting that make array programming so ergonomic.
It's not quite the same, but how do you feel about Datalog or Prolog? Having a set of facts and being able to make arbitrary relations in them makes for some interesting programming. Especially if you limit your data to a more rigid normal form like RDF tuples. Datafun is also a cool upcoming language in this space https://github.com/rntz/datafun
Generic foreign keys / Polymorphic relations. I.e. table Foo can either point to table Bar or table Xuul.
This type of relation shows up quite a bit, and either leads to application-level workarounds, or having to add many duplicate tables for the same "thing".
Relational algebra isn’t that hard to implement if you’re willing to sequential-scan all of the time. The massive complexity comes in the query planner: The point of adding an index is to change the space-performance tradeoff of certain operations. The system needs to be able to take advantage of these data layout changes, or there’s little benefit over storing everything in flat arrays.
This resembles SQL a lot. Btw I'm surprised that few languages seem to have in their standard library a many-to-many container which supports efficient lookups in either direction.
Actually yeah, that is kinda interesting. The couple of times I've had to do it, I always ended up hand-rolling a 'class' with hashmaps in both directions. I wonder why few standard libraries have a two-way lookup structure.
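The hand-rolled version usually ends up looking something like this (a minimal Rust sketch, nothing standard about the names):

use std::collections::{HashMap, HashSet};
use std::hash::Hash;

// A many-to-many relation with lookups in both directions.
struct ManyToMany<L: Eq + Hash + Clone, R: Eq + Hash + Clone> {
    left_to_right: HashMap<L, HashSet<R>>,
    right_to_left: HashMap<R, HashSet<L>>,
}

impl<L: Eq + Hash + Clone, R: Eq + Hash + Clone> ManyToMany<L, R> {
    fn new() -> Self {
        Self { left_to_right: HashMap::new(), right_to_left: HashMap::new() }
    }

    // The only mutator updates both maps, so they can never drift apart.
    fn insert(&mut self, l: L, r: R) {
        self.left_to_right.entry(l.clone()).or_default().insert(r.clone());
        self.right_to_left.entry(r).or_default().insert(l);
    }

    fn rights_of(&self, l: &L) -> Option<&HashSet<R>> {
        self.left_to_right.get(l)
    }

    fn lefts_of(&self, r: &R) -> Option<&HashSet<L>> {
        self.right_to_left.get(r)
    }
}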
Insightful. Once you learn to see dependencies as implicitly revealing abstract data types you haven't yet teased out into existence, you start seeing all kinds of places where behavior that ought to be explicit and handled by an abstraction is smeared across multiple other concepts.
I've got a half-cooked blog post about this in the oven: "reify the state machine". I've seen plenty of code with lots of communicating parts, which is really a distributed implementation of a state machine. Rewriting to make the states explicit as data, and perhaps creating a central "state machine" type for that machine, can really help clarify the domain model and let you see what you're missing.
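Very roughly what I mean, with a made-up order flow in Rust (not from any real codebase):

// Instead of booleans and flags smeared across several structs,
// the states become one explicit type...
enum OrderState {
    Draft,
    Submitted { at: u64 },
    Shipped { tracking: String },
    Cancelled { reason: String },
}

// ...and the legal transitions live in one place.
struct Order {
    state: OrderState,
}

impl Order {
    fn submit(&mut self, now: u64) -> Result<(), &'static str> {
        match self.state {
            OrderState::Draft => {
                self.state = OrderState::Submitted { at: now };
                Ok(())
            }
            _ => Err("only a draft order can be submitted"),
        }
    }
}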
> and then a Bank which is a mapping of Account to Set<Customer>
The problem is that this only lets you get customers from accounts, and not vice-versa.
And if you add a Map<Customer, Set<Account>> to the Bank, you now have to ensure the mappings are synchronized at all times. Which, to be fair, is very straightforward as long as Bank is immutable, but still kind of annoying.
But considering how incredibly common many-to-many relationships are in business logic and especially in SQL, I'm constantly surprised that there isn't a de facto standard, efficient data structure to represent them. That speaks to how entrenched cyclic relationships are in software engineering, I suppose.
In the in-storage form (database), the mappings are synchronized at all times.
A many-to-many relationship in its purest form is described as records of ItemAId/ItemBId pairs: `List<{ account: Account, customer: Customer}>`. There isn't any cyclical relationship in this form.
From `List<{ account: Account, customer: Customer}>` one can derive `Map<Customer, Set<Account>>` and `Map<Account, Set<Customer>>`. Bank can keep a copy of both mappings and use them as a double index, synchronizing both mappings as operations go by, while at the same time synchronizing both mappings with the in-storage form. Or not!
We're talking mostly about data modelling in a programming language, which applies to the in-memory form. `Customer { accounts: Set<Account> }` and `Account { customers: Set<Customer> }` are its in-memory form, a layer of indirection from its actual form, the in-storage `List<{ account: Account, customer: Customer}>`.
If the information about Accounts, Customers, and the many-to-many relationship between them is laid flat in its purest, source-of-truth form, the in-storage database entries, why are people suggesting that there is or should be a cyclical relationship in its representative form, the in-memory objects?
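To make that concrete, deriving both in-memory indexes from the flat source-of-truth rows could look something like this (a Rust sketch; the types are placeholders):

use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct AccountId(u64);

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct CustomerId(u64);

// The flat, normalized truth: one row per account/customer pair.
struct AccountCustomer {
    account: AccountId,
    customer: CustomerId,
}

// Both "directions" are just derived views over the same rows.
fn index(rows: &[AccountCustomer]) -> (
    HashMap<AccountId, HashSet<CustomerId>>,
    HashMap<CustomerId, HashSet<AccountId>>,
) {
    let mut by_account: HashMap<AccountId, HashSet<CustomerId>> = HashMap::new();
    let mut by_customer: HashMap<CustomerId, HashSet<AccountId>> = HashMap::new();
    for row in rows {
        by_account.entry(row.account).or_default().insert(row.customer);
        by_customer.entry(row.customer).or_default().insert(row.account);
    }
    (by_account, by_customer)
}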
This is the right answer and I wish more languages would make this dramatically more ergonomic to do. Store the most basic normalized truth and query/derive what you need.
Sure, but you wrap that up if you want. A Bank can contain both maps, if you want, and you hide the methods that would update individual maps, only exposing your wrappers that update everything together.
Yeah, I'm not saying it's difficult, I'm just saying that it's weird that there isn't a standard ManyToMany<T1, T2> class in most programming languages, even though a functional (if perhaps not performance-optimal) implementation is straightforward and should be very useful at least for DB mappings.
Which makes it easy to answer the question “what customers are on this account” and hard to answer the question “what accounts does this customer have”. As a customer, I usually want the latter.
I don’t quite understand what you’re trying to convey with your example in regards to FP.
First of all, both data and functions are first-class concepts in the functional paradigm and are not glued together as classes. So you’re not modeling your domain as “account service” and “customer service”, but rather have data representations of these concepts and functions that query/reduce etc. on those. In your example that would be a relational model, described as sets of tuples/maps, and relational functions to derive new data. There is no need for circular dependence, because your operations are generic over relational data.
I agree with the notion that the purely functional approach doesn’t dominate, for good reason. But more and more languages have been incorporating FP concepts for about a decade or longer; at least the basic building blocks like closures, immutability and function composition have gained massive traction and eliminate the need for many of the complex patterns in traditional, class-based OO.
I actually don't follow your example; I don't see why this is cyclic. In your example what exactly would CustomerService be responsible for that prevents it from functioning unless it has a reference to AccountService? AccountService is pretty obvious (add/remove money, link other accounts, close account, etc.) but I'm struggling to see what CustomerService would do on a per-customer basis that it would need a reference to AccountService for. The only thing I can imagine is a get/add/remove accounts operation, but (a) you need nothing but an account ID for that (not the actual AccountService class) and the caller can look up the actual accounts using the IDs if they want, and (b) you can (and I think perhaps should?) model this as a database table managed by the Account class where you're associating customer IDs with account IDs.
Another way to think about this is, how would you do this in C++? You have header files, and they shouldn't be cyclic. Who #includes who? Well in this case it seems to me AccountService.hpp should #include CustomerService.hpp, but not the reverse. Maybe CustomerService.hpp would #include AccountID.hpp? Whatever I imagine I just don't see why you'd need cyclic #includes.
One can have Customer in one header file, possibly forward declaring Account, and Account in another header file, possibly forward declaring Customer. The definitions go in the two .cpp files, which can include both header files. When writing such a thing one should ask oneself whether one really needs to do it that way, though. Not that it is too much of a mess, but asking such a question is always good.
In a relational database you would have a CustomerAccount table keeping track of which accounts belong to which customers. Customer would not know about Account, and vice versa.
Reflection allows you to put the interfaces to Account and Customer into a shared module and the business logic implementations into their own mutually independent modules, thus removing the cyclic compile time dependency.
You "wire up" the implementations at runtime, using reflection.
I still don't get it. I can put the interface into a shared module and the implementations in independent modules without reflection. Why exactly would we need reflection to do that?
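For example, in Rust (made-up names), the interfaces can live in a shared module, the implementations in modules that don't know about each other, and the wiring is just plain code at the composition root, no reflection anywhere:

// shared module: only the interfaces (traits), no implementations
mod api {
    pub trait AccountService {
        fn balance_cents(&self, account_id: u64) -> i64;
    }
    pub trait CustomerService {
        fn display_name(&self, customer_id: u64) -> String;
    }
}

// one module per implementation; neither imports the other
mod accounts {
    pub struct InMemoryAccounts;
    impl crate::api::AccountService for InMemoryAccounts {
        fn balance_cents(&self, _account_id: u64) -> i64 { 0 }
    }
}

mod customers {
    pub struct InMemoryCustomers;
    impl crate::api::CustomerService for InMemoryCustomers {
        fn display_name(&self, _customer_id: u64) -> String { "example".to_string() }
    }
}

// the wiring happens here, statically, at the composition root
fn main() {
    let accounts: Box<dyn api::AccountService> = Box::new(accounts::InMemoryAccounts);
    let customers: Box<dyn api::CustomerService> = Box::new(customers::InMemoryCustomers);
    let _ = (accounts, customers);
}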
It's called the "dependency inversion principle". People don't like to call it "reflection", and heap various layers like XML or dependency injection annotations on it, but technically, it comes down to reflection at runtime, as far as I've seen.
It certainly has its cost in additional complexity through indirection, but it's better than creating cyclic dependencies or giant balls of mud.
The question was "what has reflection got to do with it". I've used reflection for dependency inversion, so I think that's what it's got to do with it.
If you can do dependency inversion without reflection, more power to you :-) We can't do classpath scanning in the project I'm working on because of the size of the classpath, and compile time configuration using direct imports would introduce cycles, so reflection it is for us, in one form or another.
My heart sinks whenever I need to spend ten minutes trying to work out which of the three different implementors is actually going to be called on a particular code path. For about a week of December, my work was solely spelunking to track down which implementations of an interface in C# were dead code and so on.
Right. Often there are 2 types of interfaces in a project. The first are "natural" interfaces, that you have put some design into and are meant to be reusable. Things like Streams or Collections. When you write a function that uses one of these, you really are expecting to be able to use any implementation. If you want to go to the implementation of Stream.Read, obviously the IDE isn't going to be able to do it, it's abstract and there are any number of implementations.
You get the other kind of interface when you want to loosely-couple your code, and so you define interfaces for many classes. Often, there is only a single class that implements the interface in your project, though there may be mock implementations in your test code. Even though there is a single "real" implementation, the IDE can't/won't jump to that implementation in the same way it won't in the first case. This is frustrating though, because it would have worked if you hadn't extracted the interface for improved testability.
It seems like an IDE could know if there is a single implementation and could go to it, especially if the interface is not visible outside the project ("internal" in C#). In practice, the interface is in the same file as the primary implementation, so it's not as bad as it seems. More of an annoyance.
JetBrains’ Rider as well as their ReSharper for Visual Studio have a navigate-to-implementation feature, which I use the shortcut key for all the time.
The fact our solution has an interface for just about everything for testing reasons doesn’t slow me down even in the slightest when I’m looking through the code.
I've seen people learning about dependency inversion from a book called "Clean Architecture", then proceeding to apply it to every bit of code they write, to make it "clean". It makes code difficult to trace by reading alone. Indirection may be cheap, but it adds up.
But functional-only business applications already did take the world by storm! Relational DBs are the de facto way data is organized, and a junction table is the way you model the data so that a functional query (generally in SQL) can be made against it.
Yes, there is a single transactional global state.
There is always state somewhere, the question is where it should be.
Global state is the input to the program, and a new global state is the output. Given identical inputs, it will always produce the same output.
This is the entire principle behind ACID, and mutation of global state is a common approach for functional programmers regardless of whether they are using a DB or not.
If a DB were not fully transactional, or if your queries of the DB are not deterministic themselves, then we’ve moved out of functional territory.
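In miniature, that view of a transaction is just a pure function from the old state to the new one (a sketch with placeholder types, not how any particular DB is implemented):

#[derive(Clone)]
struct DbState { /* all tables, conceptually */ }

struct Command { /* e.g. "move 10 from account A to account B" */ }

// A transaction, viewed functionally: old global state in, new global state out.
// Given the same state and the same command, the result is always the same.
fn apply(state: DbState, _cmd: &Command) -> DbState {
    // purely a computation here; the engine swaps in the new state atomically
    state
}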
I think the OP is overly dogmatic, and I agree with you that sometimes you just need circular dependencies.
That said, I don't see what circular dependencies have to do with reflection, much less a functional style? For a trivial example, Rust lacks reflection but the following (functional-ish) code works just fine:
mod ModA {
    use crate::ModB::B;

    pub struct A {
        ref_to_b: Option<Box<B>>,
    }

    pub fn into_b(a: A) -> Option<Box<B>> { a.ref_to_b }
}

mod ModB {
    use crate::ModA::A;

    pub struct B {
        ref_to_a: Option<Box<A>>,
    }

    pub fn into_a(b: B) -> Option<Box<A>> { b.ref_to_a }
}
Could you give an example of a domain that naturally requires circular dependencies? I have been trying lately to practice seeing outside the functional-programming-tinted lenses, but I need a bit of help to do so :)
I think the GP's example is pretty reasonable. And like I said, I still don't really see what this has to do with FP. Clearly F# has trouble with it, but so does C++
In the functional paradigm, language constraints aside, modeling this "story" would need these:
- The definition of Account
- The definition of Customer
- A ledger consisting of:
  - List of Account,
  - List of Customer,
  - List of Account-Customer relations
A clerk works on the records on the ledger with one hand and one pen, a metaphor for the service process working on the data in the database.
An act, such as money transfer, would be described as a function.
A customer creating a new account for himself is written as `fn createAccount(Customer, NewAccountData)`, because from the perspective of the clerk/bank manager/service, the customer, the newAccountData, and the existing data in the ledger are objects which the clerk/bank manager/service must move around in a precise way.
The module which has `fn createAccount` depends on the types `Account` and `Customer`.
In English, it roughly sounds like:
"the success of writing the rule of creating an account for a customer depends on knowing the definition of Account and Customer."
The function is not written as `customer.accountService.createAccount(newAccountData)`, because the clerk doesn't schizophrenically pretend to be the customer and create an account for himself. The clerk just receives a request from the actual customer and writes a new account entry and a customer-account-relation entry into the ledger, and that's it.
There's simply no need to call CustomerService from AccountService and vice versa. There's no need for reflection because the data types are all available at compile time.
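A minimal sketch of that ledger shape in Rust (all types invented for illustration; the point is that only the functions need to know about both Account and Customer):

struct Customer { id: u64, name: String }
struct Account { id: u64, balance_cents: i64 }
struct AccountCustomer { account_id: u64, customer_id: u64 }

// The ledger: accounts, customers, and the relation between them, all as plain data.
struct Ledger {
    customers: Vec<Customer>,
    accounts: Vec<Account>,
    relations: Vec<AccountCustomer>,
}

struct NewAccountData { opening_balance_cents: i64 }

// The "clerk" is a function that knows about both Account and Customer;
// neither type has to know about the other.
fn create_account(mut ledger: Ledger, customer_id: u64, data: NewAccountData) -> Ledger {
    let account_id = ledger.accounts.len() as u64 + 1;
    ledger.accounts.push(Account { id: account_id, balance_cents: data.opening_balance_cents });
    ledger.relations.push(AccountCustomer { account_id, customer_id });
    ledger
}

// No customer.getAccounts(); the query is just a lookup over the relation.
fn get_accounts_where_customer_is(ledger: &Ledger, customer_id: u64) -> Vec<&Account> {
    let ids: Vec<u64> = ledger.relations.iter()
        .filter(|r| r.customer_id == customer_id)
        .map(|r| r.account_id)
        .collect();
    ledger.accounts.iter().filter(|a| ids.contains(&a.id)).collect()
}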
Oh, I meant to put the emphasis on that "customerService depends on accountService and vice versa" thingy. Meanwhile, in my writing, "accountService is a member of customer" is a form of customer depending on accountService.
But my main point is that it should not be written that way at all.
Semantically, if we must include the subject, it should be something like `theService.createAccount(customer, newAccountData)`.
Or replace theService with CustomerAccountService or whatever.
Writing that way, with functions depending on types, avoids getting tangled up in the so-called account-customer cyclical dependency, because Account and Customer don't need to know about each other until a function needs to know both of them. There's no such thing as `customer.getAccounts()`, because in the end the query would roughly look like `getAccountsWhereCustomerIs(CustomerId)`.
You're not dumb. It's just me mis-writing due to the fact that I'm writing this at 4 in the morning, lol. Or maybe I misinterpreted the parent comment and that confused you. In that case, I'm the dumb one.
Banks think the other way around - the account is important, the customer less so. Multiple entities, or no entities, might be authorized to access or control the account at various times.
It reads as an extremist take on Chesterton’s Fence: “the fence exists, therefore it is probably the only thing that could’ve worked.”
There is nothing in this contrived example that F# or Haskell (gasp, pure functional) wouldn’t handle with ease. It just would be modeled differently from traditional langs. I’d argue the alternative modeling forced by FP langs would be superior.
" I’d argue the alternative modeling forced by FP langs would be superior."
Then I'd like to see that. The other functional example someone gave further up was not convincing to me. Much more complicated than the straightforward, simple, but evil cycle approach.
Functional-only can actually model this kind of real-world circular dependency easily, without having circular dependencies at the language level. Not sure what it even has to do with functional style in general, though.