Lay Out Your Code Like You'd Lay Out Your House (frederikcreemers.be)
86 points by kroltan 3 months ago | 45 comments

How we store our documents; how we structure our work: This is a constant background channel of communication to the entire team, working tirelessly every moment of every day. It strongly influences how we think about our work, and is one of the best ways we have of influencing the engineering culture of our organisation.

Melvin Conway: "Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."

The way that we organise our filing system is nothing less than the (asynchronous) communications structure of the organisation, and needs to be architected with the same care and attention to detail.

By the same token, no single scheme will fit all organisations, and each team should seek to find a way of structuring their work that plays to their strengths and gives them a unique competitive advantage.

Here are my thoughts from over the years:

https://softwareengineering.stackexchange.com/questions/1458...
https://softwareengineering.stackexchange.com/questions/8189...
https://github.com/wtpayne/hiai/tree/master/a3_src

Whenever we impose a structure onto a set of objects, we also impose our values.

Nowhere is this clearer than when the structure is hierarchical. Divisions close to the root of the tree implicitly carry a greater weight of importance than those close to the leaves.

This property means that it is difficult to find compromise within a single hierarchy; a driving force behind the prevalence of flat 'tagging' mechanisms.

In spite of these drawbacks, the very rigidity of hierarchies offers us an unparalleled communications opportunity: The distinctions that we make at the root of the tree are highly visible and send a clear message about the divisions and items that we consider important.

Is it important that tests be kept separate from product logic? Is it important that projects be kept distinct from one another? Is it important that each component or service be kept isolated? How about maturity? Is it important that research code be kept distinct from production code?

We have the opportunity to form a strong opinion, to make a lasting choice, and to shape the culture and the attitudes of those who follow.

Or, alternately, lay your code out like you'd lay out your workshop.

In our houses we tend to use the same things in the same ways each day. There's just the one razor, keep it next to the shaving cream.

In a workshop, we need to cooperate with other people, and even for ourselves, it's important to be able to find a tool when we need it. All the X-acto blades go in the same drawer, next to the drill bits.

Which one is more like your codebase? If it's a blog, might be the house. I think more code is workshop-like, but YMMV.

What if the workshop grows into a factory? Then you have people walking into the gigantic tool closet, overwhelmed by all the tools they'll never use when they just want to find their wrench. I don't think metaphors really help here and it's mostly a matter of consistency and tooling. Instead of having to guess who uses a function or where it originates from, my editor gives me "go to definition" and "find all references". I also have "go to symbol", which allows me to search for classes or functions instead of having to browse the folder structure manually. I think it's best to optimize for the advanced tools we have to inspect a codebase instead of manual browsing and aesthetics.

The article was basically "what if struct of arrays instead of array of structs?"

This is pretty similar to how I like to structure my code - basically by feature rather than by type. I find there's a lot less mental overhead to working this way, but it can bite you if you end up with a lot of code that is shared between features. Plus what a 'feature' is can be more subjective than what a type of class is and some teams might struggle because of strong, contrasting opinions on this.

That said, when a project gets big enough I tend to end up relying on memory and search (e.g. IntelliJ's find class) to find files anyway - at that point file name conventions are more important than anything else.

When you say "file name conventions" at the end there, are you simply thinking of the importance of being consistent in their usage, or is this a reference to something more specific?

Consistent in their usage, of course - it helps if, for instance, your tests have something referring to the fact that they're a test in the name. It's common for them to have 'UT' on the end so that when you do a search for "MyFancySomethingServiceUT" you can be reasonably confident it'll find your test.
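A convention like this is also easy to check mechanically. Here's a minimal sketch, assuming the 'UT' suffix mentioned above and a flat list of file paths; the function name `missing_tests` and the example paths are made up for illustration.

```python
from pathlib import PurePosixPath


def missing_tests(paths: list[str], suffix: str = "UT") -> list[str]:
    """Return stems of source files that have no '<stem><suffix>' counterpart."""
    stems = {PurePosixPath(p).stem for p in paths}
    # Anything not ending in the suffix is treated as a source file.
    sources = sorted(s for s in stems if not s.endswith(suffix))
    return [s for s in sources if s + suffix not in stems]


# Hypothetical project listing, for illustration only.
files = [
    "src/MyFancySomethingService.java",
    "test/MyFancySomethingServiceUT.java",
    "src/OrderService.java",
]
print(missing_tests(files))
```

It only compares file names, so a class whose own name happens to end in "UT" would be misread as a test. That is arguably one more reason to pick a suffix that can't collide.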

> But since the file is not located near the change, you need to search through the entire code base to see where the changed function is called in order to update the calls.

Regardless of whether the changed method is in the same file or in another directory, my IDE will find it for me right away.

This is an age-old discussion but I think it has been made somewhat moot by the speed and power of our IDEs. Essentially there have been two approaches...

1) Put things that are the same together. Tools go in the tool-shed, food in the kitchen.

2) Put things that you need to use at the same time together. Some tools can go in the kitchen because you need a few tools there. Maybe a small fridge in the tool-shed is not a bad idea.

A new third approach opens up as our IDEs get more powerful: put the files anywhere. This is akin to having a robot in your house that, regardless of where you are, will make a tool or food appear for you on request. This is similar to how I no longer spend time organizing old emails because I just rely on the email system's powerful searching features to find what I need.

For example, my IDE (NetBeans) has Ctrl-O, which lets me go to any class file directly without going to any directories. It also has usage search and implementation search. If you're digging around directories, then you're doing it wrong. I am still not sold on putting your files in random places, but I just feel where you put them matters less for most purposes.

In data retrieval systems, there are typically two ways you find things: searching and browsing.

What you're describing are the facilities that modern IDEs and editors provide to support searching. Those are great, and like you, I use those search facilities to jump directly to a specific point of interest a lot of the time when I'm developing.

However, searching is useful only if you already know what you want to search for. If you're trying to explore the overall structure of a large program, having the files (and the contents within those files) organised systematically still makes browsing easier.

Another option is to store code in a database so that you can access it however you want.

I love this concept when you are dealing with a blog that has posts and comments, or a todo app.

However, in my experience thus far, this type of cohesion becomes increasingly complex and open to individual interpretation when you are dealing with more complex business domains. Say, for example, several different types of customers engaging in several different types of transactions that have just enough differences in implementation to be several distinct models. Often there are several reasonable ways to interpret "functionality", and you end up with a situation where you are actively building what can be a fairly tricky form of technical debt over the course of years as developers come and go. From what I have experienced, the best way to avoid this is, of course, more formality on the development team about structuring what goes where, but then you just end up with a different but still just as rigidly enforced structure as the first example from this article. It also has the downside of being non-standardized.

If I were going into some creature's "house", when they may very well be an alien creature whose domain and existence I understand little of, I personally feel most safe in the house that had the more "absurd" structure. I may have to walk down a few more hallways, but I won't blast myself with a ray gun looking for a faucet.

In this situation the best thing to do might be to make a choice, go with it for a while, and see how well the subsequent changes cluster. If all the changes in patches implementing features tend to be in the same place, you got it right. If not, time to reshuffle.
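One rough way to put a number on "how well the changes cluster" is sketched below. It assumes you already have, per patch, the list of files it touched (something like `git log --name-only` could supply that), and it only looks at top-level directories. The function names and the two sample histories are invented for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath


def top_level_dir(path: str) -> str:
    """Return the first path component, or '.' for files at the repository root."""
    parts = PurePosixPath(path).parts
    return parts[0] if len(parts) > 1 else "."


def clustering_score(commits: list[list[str]]) -> float:
    """Average share of each patch's files that sit in its most-touched directory."""
    scores = []
    for files in commits:
        if not files:
            continue  # ignore empty patches
        counts = Counter(top_level_dir(f) for f in files)
        scores.append(max(counts.values()) / len(files))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical histories: each inner list is the set of files changed by one patch.
by_layer = [
    ["controllers/posts.py", "templates/posts.html", "validators/posts.py"],
    ["controllers/users.py", "storage/users.py"],
]
by_feature = [
    ["posts/controllers.py", "posts/template.html", "posts/validators.py"],
    ["users/controllers.py", "users/storage.py"],
]
print(clustering_score(by_layer), clustering_score(by_feature))
```

A score near 1.0 means most patches stay inside one directory; a low score suggests the layout is cutting across the way the code actually changes. It is a crude heuristic, not a verdict.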

Making a code layout like this is effectively making a prediction about what future changes are likely to entail. Predictions can be wrong, so we want to be able to change anyway.

I think that if changes are small, you can just update the structure in small increments as needed, just like you'd do any other refactoring.

The main argument I’ve come across against this is that text editors commonly put great weight on file names—opening by file name, showing the file name without its containing directory structure or with it harshly abridged (e.g. /h/c/B/p/controllers.py). It would be nice for this to be addressed better in editors, but it is a bit of a tricky matter to handle across varying projects.

Of course, as presented, having lots of files named “posts.py” is also fairly terrible in text editors.

One solution is for the filename to include more, e.g. posts/posts-controllers.py. This helps some tools at the cost of others and greater verbosity.

This summarizes my struggle when getting my feet wet with custom Ansible playbooks. Having a bunch of `main.yml` files open in your editor and trying to figure out what exactly it is you're currently looking at greatly increases cognitive load and distracts from the actual task at hand.

I like how emacs does it: when there's one file open with a particular name, show just the name; when there's more than one, there's a legend with as much of the path as needed to disambiguate.

* unique filename: filename.ext

* common filename, different directories: filename.ext<dirname>

* common filename and immediate parent directory name: filename.ext<grandparent/parent>

* etc
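The scheme in that list can be sketched in a few lines. This is not Emacs's own implementation, just an illustration of the idea: keep adding parent directories until the label is unique among files with the same name. The function name `display_names` and the Ansible-style paths are assumptions for the example.

```python
from pathlib import PurePosixPath


def display_names(paths: list[str]) -> dict[str, str]:
    """Map each path to the shortest label that tells it apart from the others."""
    labels = {}
    for path in paths:
        parts = PurePosixPath(path).parts
        name, parents = parts[-1], parts[:-1]
        # Other open files that share this file name.
        rivals = [
            PurePosixPath(p).parts
            for p in paths
            if p != path and PurePosixPath(p).parts[-1] == name
        ]
        if not rivals:
            labels[path] = name
            continue
        suffix = parents  # fall back to the full parent path
        for depth in range(1, len(parents) + 1):
            # Grow the parent suffix until no rival shares it.
            if all(r[:-1][-depth:] != parents[-depth:] for r in rivals):
                suffix = parents[-depth:]
                break
        labels[path] = f"{name}<{'/'.join(suffix)}>" if suffix else name
    return labels


# Hypothetical set of open files.
open_files = [
    "roles/web/tasks/main.yml",
    "roles/db/tasks/main.yml",
    "site.yml",
]
for path, label in display_names(open_files).items():
    print(f"{label:25} {path}")
```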

A hierarchical structure is simply not sufficient to express the various overlapping relations that exist in code. Having to apply a single tree structure is always going to be painful somewhere, even with an optimal solution. You're likely to get more payoff from reducing dependence on file system structure than from optimising said structure.

> But since the file is not located near the change, you need to search through the entire code base to see where the changed function is called in order to update the calls.

You can't really rely on all change being locally restricted, so you have to check the wider code base, anyway. Search through the entire code base should be easy, simple, and fast. If it isn't, that's a bigger problem than the directory structure of the project not being ideal.

I agree with your initial point: A single explicit hierarchical structure will never please everyone all of the time: There are too many implicit relations that could (and arguably should) be captured.

It might be tempting to try to maintain multiple directory structures, making use of hard or soft links into the code base, but this sounds like it might be fragile, not to mention hard work to maintain.
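For what it's worth, generating such a second view is not much code. Below is a minimal sketch that assumes the real files live at `src/<feature>/<layer>.py` and that the view directory sits outside `src`; both the layout and the name `build_layer_view` are assumptions for illustration. The fragility is real, though: every rename or new file means regenerating the links.

```python
import os
from pathlib import Path


def build_layer_view(src: Path, view: Path) -> list[Path]:
    """Mirror src/<feature>/<layer>.py as symlinks at view/<layer>/<feature>.py."""
    created = []
    for module in sorted(src.glob("*/*.py")):
        feature, layer = module.parent.name, module.stem
        link = view / layer / f"{feature}.py"
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            link.unlink()  # regenerate stale links rather than fail
        # Relative targets keep the view valid if the whole tree is moved.
        link.symlink_to(os.path.relpath(module, link.parent))
        created.append(link)
    return created
```

Calling `build_layer_view(Path("src"), Path("by_layer"))` would then give a by-layer tree of links next to the by-feature tree of real files. It needs a filesystem that supports symlinks.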

I am sure others have also tried to abandon hierarchical file systems altogether, placing the design into a database instead. This also seems like a lot of work, although I can see the attraction.

My solution is to embrace the conflict and take an opinionated approach to the hierarchy -- one that enhances and supports the development process -- but to supplement that with generated HTML documentation with multiple different hyperlink-based navigation schemes.

At some point, I also plan to integrate a javascript text editor into these HTML reports, at which point the whole contraption will have transmogrified into a sort of Kentucky IDE.

> It might be tempting to try to maintain multiple directory structures, making use of hard or soft links into the code base, but this sounds like it might be fragile, not to mention hard work to maintain.

There are a lot of projects organised around automating this maintenance—though they tend to pop in and out of existence, so I'd agree we're not at the state of having a reliable solution that'll be around next year, or next month. TagFS (https://github.com/marook/tagfs) is the system with which I've been familiar most recently, but it appears to have been dormant for four years.

I'm a fan of grouping by cohesion for the reasons the author lays out, but like any principle you need to understand how and why you're applying it. The author suggests grouping reducers, actions, and components by functional cohesion in a React/Redux app, but the point of grouping by layer in this case is to uncouple the layers in your app (UI vs. store vs. actions). In this situation, those layers represent the functions around which your code coheres.

We're beginning to learn and refactor towards this approach.

We've already layered the app between state and UI. UI components are grouped by features. We've not done this with state but it makes sense to. But we can't see a benefit to merging state and UI layers. There are multiple UI features that make use of a set of actions and selectors.

Ironically, the page layout is broken on my iPhone using Safari. One can debug this with desktop Safari by going into responsive design mode (forget what it’s called).

I used Reader mode :) Reader mode saves awful websites all the time

Yep, same on Android. Weird, being a mobile design.

I find it pretty bad on the laptop too. Why people want large blocks of junk down the left of the page I don't know. I think it's half the reason G+ failed to do very well.

Hi everyone, author of the blog post here.

Thank you @kroltan for sharing this.

I know the mobile experience and accessibility of my site have some issues. I have a change coming to fix it, but feel free to report any issues you encounter.

First off I hate these types of analogies as I feel they cloud the issue. You can really think of this as two different semantic domains: the solution domain and the problem domain. You can break them down as (controllers, errors, storage, templates, validators) vs (posts, comments, users), respectively. His layout is pretty consistent with standard Java/Spring/Hibernate project structures. In my experience when you work on a project like this you are working on something in the problem domain like users and when you make a change you have to go hunt down each solution domain component in a separate package. It has always seemed to me that it is better to organize using the problem domain as it clusters functionality that you are likely to work on concurrently.
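To make the two groupings concrete, here is a small sketch that takes paths laid out by solution domain (`controllers/posts.py`) and regroups them by problem domain (`posts/controllers.py`). The `<layer>/<feature>.py` naming is an assumption taken from the examples in this thread, and `group_by_feature` is a made-up name.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_feature(paths: list[str]) -> dict[str, list[str]]:
    """Regroup '<layer>/<feature>.ext' paths as '<feature>/<layer>.ext', keyed by feature."""
    grouped = defaultdict(list)
    for path in paths:
        p = PurePosixPath(path)
        layer, feature = p.parent.name, p.stem
        grouped[feature].append(f"{feature}/{layer}{p.suffix}")
    return {feature: sorted(files) for feature, files in grouped.items()}


# Hypothetical layer-first layout.
layered = [
    "controllers/posts.py",
    "storage/posts.py",
    "validators/posts.py",
    "controllers/users.py",
    "storage/users.py",
]
for feature, files in group_by_feature(layered).items():
    print(feature, files)
```

Everything you would touch for a change to users ends up under one key, which is the clustering argument in miniature.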

This reminds me of the Ducks pattern[1] that I've been excited about recently. And also Vue's single-file components.

[1] https://github.com/erikras/ducks-modular-redux/blob/master/R...

This type of layout works well for big apps too. The OP doesn't mention how to deal with sharing code, but it's easy.

This layout is similar to how I organize a fairly large real world project in my Flask course[0].

For example, here's some folders that house various functionality:







Each one has templates, views (route definitions), background tasks and models that are associated to that specific thing.

Sharing code is super simple between the 2. If you find yourself creating a general function that could be used in both, then you drop it in a lib/ folder that lives outside of those folders.

Overall this makes it very easy to skim a code base and see what it does. I've scaled this same pattern to having about 15 of those folders for distinct app functionality and hundreds of models, etc.

[0]: https://buildasaasappwithflask.com/

This sounds really good on the face of it, but something in me suspects that this approach will lead to even worse confusion in a real project. Maybe I've just been working in a certain way for too long and have gotten stuck in my ways.

It works well in the projects I've worked on, but I'd love to hear specific cases where it breaks down for you, since I've never worked in code bases developed by a large team.

Well, you might put your knives in the kitchen, but then you also need to have knives in the crafts area, or in the toolshed.

So if you lay out your code like you'd lay out your house, your code will not be very DRY.

There are cross-cutting (unintended knife pun) concerns, and I usually do give them their own directory.

I’m all for cohesion, which it seems was the main point the author was trying to make.

But I found the first motivating example (refactoring is hard) to be a poor argument in favor of cohesion. When making a breaking public API change, types and the ensuing type errors are going to be a way more powerful guiding force for identifying affected usage sites than cohesion. Of course, his examples were in Python, so maybe the author doesn’t have such luxury.

Nerves that fire together, wire together. Inputs that happen together end up mapped to nerves close to each other, so the neural maps in the brain end up in the same order as the parts of our body. Thumb and index finger input end up next to each other in the brain's neural map.

We could perhaps set up our code to self-organize under similar principles.

Actually, I recall a time when I made a mistake in code placement that would have been avoided if I had followed this principle.

I had made an API controller, but then we needed to expand what our API covered to cover everything the app did (should have gone API first but again, my mistake, also a partially inherited code base).

So of course I should have put the parts of the API that handled users into the user controller and the parts that handled images into the image controller and so on and so forth, but instead I thought (and here I can point to overwork dulling one's thinking processes) 'I have an API controller, this stuff now needs to be moved over from these other controllers into the API controller', and in the end it became a horrible mishmash with some duplicated functionality I needed to fix later.

This article is horrible to read on mobile. It has a forced size which is bigger than what fits on a 5 inch display and I can't resize it.

I'm sorry, I was working on the mobile-friendliness of my site last night before I went to bed, but decided to prioritise sleep over pushing out the change. I wasn't expecting this post to get posted here, and now I'm kind of frustrated that so many people had a sub-par experience with my site. I'm pushing out an improved layout ASAP.

For content, I agree, but houses also have plumbing, an electricity network, heating, etc.

A house where all the wire was bundled up in a single room would be weird.

On the other hand, sane architects line up all of the bathrooms and kitchens and laundry rooms into a three dimensional cluster, to collocate the plumbing in the smallest area possible.

Someone read Uncle Bob's Clean Architecture.

Possibly not. It's not like Uncle Bob's ideas in the book are completely original.

Closed the tab after this bullshit phrase: "As we all know, Wikipedia is the source of all truth on the internet".
