Hacker News new | past | comments | ask | show | jobs | submit login

Language constraints aside, the real world is not something that can be cleanly modeled without the notion of circular dependencies between things. Not very many real, practical activities can be truly isolated from other closely-related activities and wrapped up in some leak-proof contract.

Consider briefly the domain model of a bank. Customers rely on Accounts (I.e. have one or many). Accounts rely on Customers (i.e. have one or many). This is a simple kind of test you can apply to your language and architecture to see if you have what it takes to attack the problem domain. Most approaches lauded on HN would be a messy clusterfuck when attempting to deal with this. Now, if I can simply call CustomerService from AccountService and vice versa, there is no frustration anymore. This is the power of reflection. It certainly has other caveats, but there are advantages when it is used responsibly.

If you want to understand why functional-only business applications are not taking the world by storm, this is the reason. If it weren't for a few "messy" concepts like reflection, we would never get anything done. Having 1 rigid graph of functions and a global ordering of how we have to compile these things... My co workers would laugh me off the conference call if I proposed these constraints be imposed upon us today.

Sometimes I read conversations about software engineering concepts by career developers and I get this sense that there is this entire hidden world of other engineers that I have somehow managed never to encounter.

Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard won experience dealing with real world problems they encountered. And yet comments like yours reveal there are other engineers working in places where apparently the opposite is true. Whenever I see that I wonder how this sort of self sorting manages to occur.

> Every engineer I've ever met who has been doing it for more than a couple years universally decries circular dependencies. Every one of them has come to that opinion via hard won experience dealing with real world problems they encountered.

They may not realize they have them - at runtime, in the object graph, possibly indirectly. There's little difference in principle between cycles in "static" code vs. runtime state, but we often can't express them in the former, because our languages don't specify the concept of dependency at enough granularity (and some rely on a linear compilation pass).

The fact that computing is shot through with people who think that putting things in-band makes them go away, and that because special cases have nice properties therefore King Canute can abolish the general case with a wave of his hand, is a very large part of the problem with computing.

I've worked with very high level engineers at Google that thought cyclic dependencies are fine. Sometimes they are just the simplest solution and trying to design abstractions to avoid it just creates a huge mess.

If you've working with Java at Google (Guice) makes you learn to hate hiding circular dependencies behind dependency injection, since your binary gets injected by some huge tree of thousands of dependencies all that can create errors or interfere with each other. And without static checking trying to reason and fix those issues becomes really hard.

I strongly prefer using the languages actual type system to create circular dependencies that you can inspect using well known tools over that any day, better have a problem you can see than hide the same problem to satisfy some tools requirement.

I have to say that in my whole career of doing Java / Clojure, Google’s SDK are one of the worst in terms of dependency hell. It’s incredibly unpleasant to work with, lots of breaking incompatibilities with libraries, etcetera.

I may sound very snarky, but it really feels like an over engineered mess and I suspect the reason they need all these dependency tricks is the fact that it’s over engineered, not necessarily that the problem they’re solving is so unique that they need circular dependencies.

You never truly need circular dependencies, however many times letting functions at the same abstraction level call each other greatly simplifies code.

I think something has been lost along the way with injection and IoC that makes using a fully functional programming language for object wiring a compelling alternative for most engineers.

To me, the parent is describing a functional circular dependency that we all have to live with, but when we’ll design the system it will be masqueraded in a many to many association in a third object/table and we’ll proudly say “we have no circular dependencies”.

To me both are technically true, reality is messy, and we hide the mess under imaginary constructs.

Ah, but in the functional version, the messiness is both totally explicit and trivially inspectable.

Possible reason: selection bias. If everyone else is fine with circular dependencies or at least sees them as a necessary evil... are they going to extremely vocal about the current status quo when there they have no need to defend it?

In my experience, it’s generally less experienced engineers that will happily make a clusterfuck of circular dependendies.

If a more experienced engineer does, it’s generally considered a necessary evil, but not a choice taken lightly.

Some problem domains have unique characteristics. I think the closer a problem resembles nature, the more it breaks our structures we want to impose on it.

It's not clear the parent commenter actually needs a cyclic reference there, so I wouldn't draw any conclusions from it. Some of us don't believe it's necessary in that example.

Manifolds? Locally we want everything to be a straight line, but when you zoom out far enough you see it was a circle all long.

In F#, you would naturally approach this by defining a Customer independent of the Account (e.g. just containing a name and address), and an Account independent of the Customer (e.g. just containing an ID), and then a Bank which is a mapping of Account to Set<Customer>. What you see as a cyclic dependency, I see as a data type that you haven't reified.

This came up in an application I had modeling college football. A Team is a member of a Conference and a Conference is made up of Teams. But during realignment a few years ago when teams were switching all over the place it became apparent that Teams and Conferences were truly independent of each other and that their association itself was a first class object which also needed to capture the Season/Year.

Not really rocket science but some times your model is telling you something if you’re willing to listen.

This is almost exactly the motivational example that Codd used when originally describing relational algebra. He described 5 different data organization schemes for a single problem and designed relational algebra to work with all of them. This ability for the same program logic to work with many different data storage layouts should make changes like you describe less painful to implement.

(see page 2)

E. F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377–387. DOI:https://doi.org/10.1145/362384.362685

PDF: https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

My dream, when I'll have a few years of free time, is to design a language were the relational table is the primary data structure abstraction.

Mine too. My main gripe with object graphs is that often you need to make arbitrary decisions about what comes "first" -- should I represent my customers as a list of `(name, address)` records or a record `(names, addresses)` of lists? This is a silly decision to have to make, and an even sillier one to have to deal with when later you need to transpose the structure to more naturally fit some other task. Relational tables sidestep this issue.

Over the years I've found that these transposition problems are a huge drag on programs, and once you recognize them you start seeing them all over the place. They make simple and obvious relational tasks into dozens of lines of nested data structure traversal and manipulation.

The notion of transposition comes from the array programming world, where a multi-dimensional array/tensor can be seen as a tree structure, whose levels may be easily reordered by transposition. I sometimes use numpy arrays for general-purpose programming; it can be very powerful with a well-chosen representation. Unfortunately array concepts basically require the data to be rectangular -- `x[i]` has the same shape as `x[j]` -- which can be hard to adapt to general problems.

My dream then is a general-purpose data structure that takes the best of both the relational and array-programming worlds: the freedom and flatness of relational tables, enriched with tacit ideas like broadcasting that make array programming so ergonomic.

I’m working on a Rust library for this as my MS project. It calculates the queries inside the type system, so there’s minimal runtime cost.

It's not quite the same, but how do you feel about Datalog or Prolog? Having a set of facts and being able to make arbitrary relations in them makes for some interesting programming. Especially if you limit your data to a more rigid normal form like RDF tuples. Datafun is also a cool upcoming language in this space https://github.com/rntz/datafun

Starting from the other end, what is missing in SQL to make it more general purpose?

Generic foreign keys / Polymorphic relations. I.e. table Foo can either point to table Bar or table Xuul.

This type of relation shows up quite a bit, and either leads to application-level workarounds, or having to add many duplicate tables for the same "thing".

This is a cool idea. What would you need to be able to try this in 2 weeks, instead of a few years? What's the lean slice?

Relational algebra isn’t that hard to implement if you’re willing to sequential-scan all of the time. The massive complexity comes in the query planner: The point of adding an index is to change the space-performance tradeoff of certain operations. The system needs to be able to take advantage of these data layout changes, or there’s little benefit over storing everything in flat arrays.

This resembles sql a lot. Btw I'm surprised that few languages seem to have in their standard library a many-to-many container which supports efficient lookups in either direction.

Actually yeah that is kinda interesting. The couple times I've had to do it, always ended up hand-rolling a 'class' with hashmaps in both directions. I wonder why few standard libraries have a both way look up structure.

Which languages support this at all? Are you talking about ORM?

I'm talking about something like this

    std::relation<Foo, Bar> foobars;
    auto& bars = foobars.forward(foo);
    auto& foos = foobars.backward(bar);
Edit: Boost apparently has it, see e.g. https://stackoverflow.com/questions/1128144/c-maps-for-many-...

Boost.Bimap is a special case (and built on top) of Boost.Multiindex that allows indexing a logical table with an arbitrary subset of fields.

> a data type that you haven't reified

Insightful. Once you learn to see dependencies as implicitly revealing abstract data types you haven't yet teased out into existence, you start seeing all kinds of places where behavior that ought to be explicit and handled by an abstraction is smeared across multiple other concepts.

I've got a half-cooked blog post about this in the oven: "reify the state machine". I've seen plenty of code with lots of communicating parts, which is really a distributed implementation of a state machine. Rewriting to make the states explicit as data, and perhaps creating a central "state machine" type for that machine, can really help clarify the domain model and let you see what you're missing.

> and then a Bank which is a mapping of Account to Set<Customer>

The problem is that this only lets you get customers from accounts, and not vice-versa.

And if you add a Map<Customer, Set<Account>> to the Bank, you now have to ensure the mappings are synchronized at all times. Which, to be fair, is very straightforward as long as Bank is immutable, but still kind of annoying.

But considering how incredibly common many-to-many relationships are in business logic and especially in SQL, I'm constantly surprised that there isn't a de facto standard, efficient data structure to represent them. That speaks to how entrenched cyclic relationships are in software engineering, I suppose.

In the in-storage form (database), the mappings is synchronized at all times.

Many-to-many relationships in the purest form is described as records of ItemAId/ItemBId pairs. `List<{ account: Account, customer: Customer}>`. There isn't any cyclical relationship in this form.

From `List<{ account: Account, customer: Customer}>` one can derive `Map<Customer, Set<Account>>` and `Map<Account, Set<Customer>>`. Bank can have copy of both mappings and use them as double index, synchronizing both mappings as operations goes by, while at the same synchronizing both mappings to the in-storage form. Or not!

We're talking mostly about data-modelling in a programming language, which applies to the in-memory form. `Customer { accounts: Set<Account> }` and A `Account { customers: Set<Customer> }` is its in-memory form, a layer of indirection from its actual form, the in-storage `List<{ account: Account, customer: Customer}>`.

If informations of Accounts, Customers, and the many-to-many relationship of both are laid flat in its purest, source-of-truth form, the in-storage database entries, why are people suggesting that there is/should be a cyclical relationship in its representative form, the in-memory objects?

This is the right answer and I wish more languages would make this dramatically more ergonomic to do. Store the most basic normalized truth and query/derive what you need.

Sure, but you wrap that up if you want. A Bank can contain both maps, if you want, and you hide the methods that would update individual maps, only exposing your wrappers that update everything together.

Yeah, I'm not saying it's difficult, I'm just saying that it's weird that there isn't a standard ManyToMany<T1, T2> class in most programming languages, even though a functional (if perhaps not performance-optimal) implementation is straightforward and should be very useful at least for DB mappings.

True, I have implemented it myself multiple times, getting a bit annoyed at its absence without stopping to ask why.

Which makes it easy to answer the question “what customers are on this account” and hard to answer the question “what accounts does this customer have”. As a customer, I usually want the latter.

I don’t quite understand what you’re trying to convey with your example in regards to FP.

First of all, both data and functions are first class concepts in the functional paradigm and are not glued together as classes. So you’re not modeling your domain as „account service“ and „customer service“, but rather have data representation of these concepts and functions that query/reduce etc on those. In your example that would be a relational model, described as sets of tuples/maps, and relational functions do derive new data. There is no need for circular dependence, because your operations are generic over relational data.

I agree with the notion that the purely functional approach doesn’t dominate, for good reason. But more and more languages are incorporating FP concepts since about a decade or longer, at least the basic building blocks like closures, immutability and function composition have gained massive traction and eliminate the need for many of the complex patterns in traditional, class based OO.

I actually don't follow your example; I don't see why this is cyclic. In your example what exactly would CustomerService be responsible for that prevents it from functioning unless it has a reference to AccountService? AccountService is pretty obvious (add/remove money, link other accounts, close account, etc.) but I'm struggling to see what CustomerService would do on a per-customer basis that it would need a reference to AccountService for. The only thing I can imagine is a get/add/remove accounts operation, but (a) you need nothing but an account ID for that (not the actual AccountService class) and the caller can look up the actual accounts using the IDs if they want, and (b) you can (and I think perhaps should?) model this as a database table managed by the Account class where you're associating customer IDs with account IDs.

Another way to think about this is, how would you do this in C++? You have header files, and they shouldn't be cyclic. Who #includes who? Well in this case it seems to me AccountService.hpp should #include CustomerService.hpp, but not the reverse. Maybe CustomerService.hpp would #include AccountID.hpp? Whatever I imagine I just don't see why you'd need cyclic #includes.

One can have Customer in one header file, possibly forward declaring Account. Account in another header file possibly forward declaring Customer. Definitions in the two cpps which can include both header files. When writing such a thing one should ask oneself whether one really needs to do it that way, though. Not that it is too much of a mess but asking such question is always good.

Funny that we both think in C++. It was what popped into my head too.

AccountMgr has many Account(s). Ditto for CustomerMgr. Interact with both to taste.

In a relational database you would have a CustomerAccount table keeping track of which accounts that belongs to which customers. Customer would not know about Account, and vice versa.

And what has reflection to do with this?

Reflection allows you to put the interfaces to Account and Customer into a shared module and the business logic implementations into their own mutually independent modules, thus removing the cyclic compile time dependency.

You "wire up" the implementations at runtime, using reflection.

I still don't get it. I put the interface into a shared module and the implementations in independent modules without reflection. Why exactly would we need reflection to do that?

If I may, that sounds horrifying. Please, for the sake of my sanity and the IDE, always do things at compile time rather than runtime if you can!

It's called the "dependency inversion principle". People don't like to call it "reflection", and heap various layers like XML or dependency injection annotions on it, but technically, it comes down to reflection at runtime, as far as I've seen.

It certainly has its cost in additional complexity through indirection, but it's better than creating cyclic dependencies or giant balls of mud.


No, you don’t need reflection or annotations to use dependency inversion. Why do you think so?

The question was "what has reflection got to do with it". I've used reflection for dependency inversion, so I think that's what it's got to do with it.

If you can do dependency inversion without reflection, more power to you :-) We can't do classpath scanning in the project I'm working on because of the size of the classpath, and compile time configuration using direct imports would introduce cycles, so reflection it is for us, in one form or another.

void createAccount(ICustomer c) { ... } // No dependency to concrete Customer

class Customer implements ICustomer { ... } // Customer Class can be created after Account above

createAccount(new Customer()) // Use of both, with dependency injection

Scanning for classes dynamically using reflection has nothing to do with above. And you certainly don't need any xml or framework to do it either.

If the code is implementing against interfaces, why would late binding mess up your IDE experience at all?

My heart sinks whenever I need to spend ten minutes trying to work out which of the three different implementors is actually going to be called on a particular code path. For about a week of December, my work was solely spelunking to track down which implementations of an interface in C# were dead code and so on.

Right. Often there are 2 types of interfaces in a project. The first are "natural" interfaces, that you have put some design into and are meant to be reusable. Things like Streams or Collections. When you write a function that uses one of these, you really are expecting to be able to use any implementation. If you want to go to the implementation of Stream.Read, obviously the IDE isn't going to be able to do it, it's abstract and there are any number of implementations.

You get the other kind of interface when you want to loosely-couple your code, and so you define interfaces for many classes. Often, there is only a single class that implements the interface in your project, though there may be mock implementations in your test code. Even though there is a single "real" implementation, the IDE can't/won't jump to that implementation in the same way it won't in the first case. This is frustrating though, because it would have worked if you hadn't extracted the interface for improved testability.

Is there any solution to this?

It seems like an IDE could know if there is a single implementation and could go to it, especially if the interface is not visible outside the project ("internal" in C#). In practice, the interface is in the same file as the primary implementation, so it's not as bad as it seems. More of an annoyance.

Jetbrain’s Rider as well as their ReSharper for Visual Studio have a navigate to implementation feature which I use the shortcut key for all the time.

The fact our solution has an interface for just about everything for testing reasons doesn’t slow me down even in the slightest when I’m looking through the code.

I've seen people learning about dependency inversion from a book called "Clean Architecture", then proceeding to apply it to every bit of code they write, to make it "clean". It makes code difficult to trace by reading alone. Indirection may be cheap, but it adds up.

This is perfectly possible without reflection. The interfaces already make it possible.

But functional-only business applications already did take the world by storm! Relational DBs is the defacto way data is organized, and a Junction table is the way you model the data so that a functional query (generally in SQL) can be made against it.

SQL is not functional, it mutates global state.

Yes, there is a single transactional global state.

There is always state somewhere, the question is where it should be.

Global state is the input to the program, and a new global state is the output. Given identical inputs, it will always produce the same output.

This is the entire principle behind ACID, and mutation of global state is a common approach for functional programmers regardless of whether they they are using a DB or not.

If a DB were not fully transactional, or if your queries of the DB are not deterministic themselves, then we’ve moved out of functional territory.


I think the OP is overly dogmatic and I agree with you that sometimes you just need circular dependencies

That said, I don't see what circular dependencies have to do with reflection, much less a functional style? For a trivial example, Rust lacks reflection but the following (functional-ish) code works just fine:

    mod ModA {
        use crate::ModB::B;
        pub struct A {
            ref_to_b: Option<Box<B>>
        pub fn into_b(a: A) -> Option<Box<B>> { a.ref_to_b }
    mod ModB {
        use crate::ModA::A;
        pub struct B {
            ref_to_a: Option<Box<A>>
        pub fn into_a(b: B) -> Option<Box<A>> { b.ref_to_a }
Am I missing something?

Could you give an example of a domain that naturally requires circular dependencies? I have been trying lately to practice seeing outside the functional-programming-tinted lenses, but I need a bit of help to do so :)

I think the GP's example is pretty reasonable. And like I said, I still don't really see what this has to do with FP. Clearly F# has trouble with it, but so does C++

I don't think the GP's example is reasonable, and nor would an SQL-writer. Could you give something I will find it harder to wriggle out of?

I don't understand how reflection is involved here? Don't you just have a data model (schema) that can be interacted with?

In the functional paradigm, language constraints aside, to model this "story" would need these:

- The definition of Account

- The definition of Customer

- A ledger consisting of: List of Account, List of Customer, List of Account-Customer relations

A clerk works on the records on the ledger with one hand and one pen, a metaphor for the service process working on the data in the database.

An act, such as money transfer, would be described as a function.

A customer creating a new account for himself is written as `fn createAccount(Customer, NewAccountData)` because from the perspective of the clerk/bank manager/service the customer, newAccountData, and the existing data in the ledger as objects which the clerk/bank manager/service must move around in a precise way.

The module which has 'fn createAccount` depends on the types `Account` and `Customer`.

In english it roughly sounds like,

"the success of writing the rule of creating an account for a customer depends on knowing the definition of Account and Customer."

The function is not written as `customer.accountService.createAccount(newAccountData)`, because the clerk doesn't schizophrenically pretend to be the customer and create an account for himself. The clerk just receive request from the actual customer, writes new account entry and customer-account-relation entry into the ledger, that's it.

There's simply no need to call CustomerService from AccountService and vice versa. There's no need for reflections because data types are all available at compile time.

> `customer.accountService.createAccount(newAccountData)`

I believe that most OO implementations would read

    accountService.createAccount(customer, newAccountData)
Care to elaborate if that was the main point of the clerk's schizophrenia criticism? Or if I'm misinterpreting just call me dumb, haha.

Oh, I meant to emphasis on that "customerService depends on accountService and vice versa" thingy. While in my writings "accountService is a member of customer" is a form of customer depending on accountService.

But my main point is that it should not be written that way at all.

Semantically, if we must include the subject, it should be

`theService.createAccount(Customer, NewAccountData)`

Or replace theService with CustomerAccountService or whatever.

Writing that way, with functions depending on types, avoid getting tangled from the so-called account-customer cyclical dependencies. Because account and customer don't need to know each other until a function needs to know both of them. There's no such thing as `customer.getAccounts()` because in the end the query would roughly look like `getAccountsWhereCustomerIs(CustomerId)`.

You're not dumb. It's just me mis-writing due to the fact that I'm writing this at 4 in the morning lol. Or maybe I'm misinterpreted parent comment and that confuses you. In that case, I'm the dumb.

Customers and Accounts are not cyclic.

A customer has an account. The customer is the domain root. Hence there would be no accounts if there were no customers.

A customer does need to to hve an account. An account must have a customer.

It's not cyclic in any way

Banks think the other way around - the account is important, the customer less so. Multiple entities, or no entities, might be authorized to access or control the account at various times.

I’m having a hard time understanding why you think F# is unable to handle the domain model presented.

It reads as an extremist take on Chesterton’s Fence: “the fence exists, therefore it is probably the only thing that could’ve worked.”

There is nothing in this contrived example that F# or Haskell (gasp, pure functional) wouldn’t handle with ease. It just would be modeled differently from traditional langs. I’d argue the alternative modeling forced by FP langs would be superior.

" I’d argue the alternative modeling forced by FP langs would be superior."

Then I'd like to see that. The other example of the functional someone gave more above, was not convincing to me. Much more complicated then the straight-forward simple, but evil cycle approach.

“Customers rely on Accounts (I.e. have one or many)”

Alternative model: the Customer relation couples a Person to an Account.

I think that’s a better model, as it allows Person P to be a Customer at bank B but not at Bank C, Person Q to be a Customer at both B and C, etc.

It also allows you to model Persons before you model Accounts, breaking the circularity, allows Companies to become Customers, etc.

Functional only actually can model this kind of real world circulair dependencies easily without having circular dependencies in the language level. Not sure what it even has to do with functional style in general though.

circular dependencies at runtime are not really a problem and often necessary. circular dependencies at build time/between modules are evil.

Your build time is someone else’s runtime.

You would forget everything about a customer if they closed their last account?

No, of course you would maintain a Set<Customer> as well (possibly even a Map<Customer, Account Set> for efficiency).

... if you model it that way you're not using cyclic dependencies at all and this is in fact exactly how you would model it in a functional way.

Their whole premise is that

> Customers rely on Accounts (I.e. have one or many).

I'm giving a domain relevant counter example to that premise.

And interestingly enough, that leads you to a clean way of modeling it without the cyclic dependency as you've just done.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact