Hacker News
Write code. Not too much. Mostly functions. (brandons.me)
825 points by brundolf on Dec 21, 2020 | 352 comments

I like the sentiment of this article. It's a great analogy.

Might be a little off topic, but it reminds me how I am happy that the Go programming language and its philosophies gained popularity even though I don't use the language regularly. Watching Go talks made me appreciate simplicity and clarity.

It made me accept that I don't always need to use every design pattern in the book. It made me think about the readers of my code, who might not always be experienced enough, or might not always have time to understand the brilliant architecture I came up with. I can have some repeated code sprinkled around in the codebase. I don't always need to have n+1 layers in my architecture where all the layers just call the next layer anyway. It might be better to use functions over a complicated hierarchy of classes. It made me appreciate simple tools and widely accepted conventions that result in codebases that feel familiar the second you dive in.

Of course, Go is not the only community where these ideas are prevalent, and it's good to know your design patterns and architecture, etc... Finding the balance is not always easy, but it's good to have a popular, successful "counter force" community.

>I don't always need to have n+1 layers in my architecture where all the layers just call the next layer anyway.

This is by far the most common thing I’ve seen in especially difficult-to-maintain codebases. Anecdotal for sure, but number 2 on that list is way behind. Extra abstractions for a future that has yet to happen, and abstractions because the IDE makes it easy to click through the layers, are by far the number 1 reason I’ve seen codebases become very difficult to maintain.

If you just keep in your head, “Can I see exactly enough on this page to know what it does? Not more, not less?“ It’s an impossible ideal but that concept is a fantastic mental guideline for maintainable codebases.

Hard same.

The best organizational level technique I've found so far is to add the rule of three to code review checklists. An abstraction requires at least three users. Not three callsites, but three distinct clients with different requirements of the abstraction.

Obviously it's not a hard rule, and we allow someone to give a reason why they think that it's still a good idea, but forcing a conversation starting with "why is this even necessary" I feel has been a great addition.

I’m curious why three call sites isn’t sufficient. Any time I find I have three instances of the same non-trivial logic, I immediately think of whether there’s a sensible function boundary around it, and whether I can name it. If I can, it’s a good candidate.

Obviously for trivial logic that’s less appealing. And obviously all the usual abstraction caveats (too many options or parameters are a bad sign, etc) apply.

The risk with so much duplication is that if the logic is expected to remain the same, even tests won’t catch where they diverge. To me that’s just as risky if not more with internal call sites than with clients, as at least client drift will be apparent to other users.

Abstraction here likely means more than a function - maybe something like an interface base class?

Probably. When talking about object oriented programs, "abstraction" is oftentimes used as a placeholder for "abstract class" as opposed to a "concrete class". You can see this at play when talking about the SOLID principles and when you get to the "D" part people want to turn every class into an interface because it says you must "depend upon abstractions, not concretions".

I think this is where I’ve been most at odds with common OOP approaches (apart from the common practice of widespread mutability). An interface should be an abstraction defining what a given operation (function, module) needs from input to operate on it and produce output, and nothing more. Mirroring concrete types with an interface isn’t abstraction, it’s just putting an IPrefix on concrete types to check a design pattern box.
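To make that concrete, here's a small Go sketch (all names hypothetical) of the difference: the consumer defines a narrow interface stating only what it needs, rather than mirroring the concrete type's whole method set with an IPrefix twin:

```go
package main

import "fmt"

// BalanceReporter is defined by the consumer: it names only
// what the operation below needs from its input, nothing more.
type BalanceReporter interface {
	Balance() float64
}

// Account is a concrete type with more behavior than the
// interface exposes.
type Account struct {
	Owner string
	funds float64
}

func (a Account) Balance() float64   { return a.funds }
func (a *Account) Deposit(x float64) { a.funds += x }

// PrintBalance depends only on the narrow interface, so any
// type with a Balance method works here, test doubles included.
func PrintBalance(b BalanceReporter) {
	fmt.Printf("balance: %.2f\n", b.Balance())
}

func main() {
	acct := Account{Owner: "alice", funds: 42.50}
	PrintBalance(acct) // prints "balance: 42.50"
}
```

Go's implicit interface satisfaction makes this style natural: Account never declares that it implements BalanceReporter, so the abstraction lives with the code that needs it.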

A function is an abstraction as well. In the case of a function, a 'client' of the function is the call site.

That’s not what I took from it, but even if that’s what was meant, I think I’d have the same reaction. In terms of abstraction implementations, a class is just a different expression of the same idea of encapsulation.

Given the context of the posts that it was replying to, my impression was that they meant the "rule of three" applied to an entire abstraction layer.

I still don’t think I’d react differently. A function is an abstraction layer. Maybe this is just me being unintentionally obtuse because I’ve worked so long in environments where functions or collections/modules of functions are the primary organizing principle, but when I encounter “premature abstraction” arguments I don’t generally understand them to mean “sure take those three repetitions of the same logic and write a function, but think really hard about writing a module/namespace/package/class/etc”. Am I misunderstanding this?

I agree with the sentiment. A pure function with one or two parameters is going to attract a lot less scrutiny than a whole module with multiple classes.

I recently ran into the very same thing.

Instead of creating a Spring service to call a repository for data retrieval, I instead called the repository directly, because there was just a single method that needed to be implemented for read-only access of some data.

And yet, a colleague said that there should "always" be a service, for consistency with the existing codebase (~1.5M SLoC project). Seeing as the project is about 5 years old, I didn't entirely agree. I even linked the rule of three, but the coworker remained adamant that consistency trumps everything.

I'm not sure, maybe they have a good point? However, jumping through extra hoops just because the software is a large enterprise mess doesn't seem that comfortable either, just because someone decided to do things a particular way 5 years ago. It feels like it'd be easier to just switch projects than try to "solve" "issues" like that (both in quotes, given that there is no absolute truth).

I think it's a judgement call. Being consistent with a deliberate architectural decision that is actually useful is important. Otherwise you could have a broken-window effect where more and more calls leak out of the service layer, with the justification being: if it was OK in one place, why not others? Putting it in the service means it is ready for any new calls that might be added, and future collaborators know there's generally only one place to look for these calls. Maybe in this situation it would be overkill, but the bigger and longer-lived the project, the more consistency pays dividends.

Well, consistency in itself is a good rule to follow. The problem is, if a bad decision was made at the beginning of the project, maintaining consistency despite that is madness.

Hey, it wouldn't become a 1.5M sloc codebase if these rules weren't followed! ;)

Yep, consistency truly matters, since it's likely this won't be the only need for data retrieval, and everyone doing their own special thing means the code becomes an unreadable, inconsistent mess that cannot fit in anyone's head, and development velocity slows to a crawl.

From where I am, APIs are coded in three layers: the topmost API layer, then the business logic, and third, the DAO layer. Even though there is only ever one implementation of each layer, this structuring alone has made maintaining the code so much easier, as everyone, even across teams, follows this structure when defining any API. I can't imagine just coding functions in large codebases without a pre-defined structure; it can become brittle over time.
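A minimal sketch of that kind of layering in Go (the names are made up, not from the project described): each layer only calls the one directly below it, so everyone knows where a given kind of code lives:

```go
package main

import "fmt"

// DAO layer: data access only, no business rules.
type UserDAO struct{ db map[int]string }

func (d UserDAO) FindName(id int) (string, bool) {
	name, ok := d.db[id]
	return name, ok
}

// Business-logic layer: rules live here, not in the DAO or API.
type UserService struct{ dao UserDAO }

func (s UserService) Greeting(id int) string {
	name, ok := s.dao.FindName(id)
	if !ok {
		return "hello, stranger"
	}
	return "hello, " + name
}

// API layer: translates a request into a service call.
func handleGreet(s UserService, id int) string { return s.Greeting(id) }

func main() {
	svc := UserService{dao: UserDAO{db: map[int]string{1: "ada"}}}
	fmt.Println(handleGreet(svc, 1)) // prints "hello, ada"
	fmt.Println(handleGreet(svc, 2)) // prints "hello, stranger"
}
```

Whether three layers with single implementations pays off is exactly the judgement call this thread is debating; the sketch just shows the shape being described.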

Large functional codebases have their degree of organization too, be it modules, namespaces or something similar.

Does this also apply to UI? I think a lot of front end libraries entice developers to fall for those early abstractions.

Why 3?

Rule of three sounds catchy, but logically it's just an arbitrary number.

Similar to SOLID and KISS: why pick some arbitrary (and also obvious) qualitative features, put them into an acronym, and declare them to be core design principles?

Did the core design principles just happen to spell out SOLID and KISS? Did it just happen to be three?

Either way, in my opinion, designing an abstraction for 3 clients is actually quite complex.

The reason the OP advocates pure functions is that pure functions are abstractions designed for N clients. When things are done for N clients using pure functions, the code becomes much simpler and more modular than when you do it for three specific clients.

This is a good question, and I haven’t yet seen anyone reply with (I think) the real answer: it’s not the rule of 3 so much as “not 2”.

When you start adding a new feature, and notice it’s very similar to some existing code, the temptation is to reuse and generalize that existing code then and there -- to abstract from two use cases.

The rule of 3 just says, no, hold off from generalizing immediately from just two examples. Wait until you hit one more, then generalize.

“Once is happenstance, twice is coincidence; three times is enemy action” (Ian Fleming IIRC)

I think setting hard limits on design is a good thing. Creativity needs limits. If your limits imply something about your desired design goals, that's good synergy. It also forces engineers to think more about design rather than fall back on a go-to pattern that may or may not fit the problem. Junior and mid-level engineers especially might not have good heuristics on whether their design is any good or whether it's just following whatever cargo cult they were brought up in.

Like, one engineer on my team implemented this crazy overkill logger, and when I asked a few questions about why it was done like this, the answer was that they had implemented it in another language at another company. After that I told them not to have more abstraction layers than concrete implementations when adding a new feature.

I think a good programmer should have an "intuition" whether it is worth to build an abstraction for something or not. If in doubt don't do it.

If in hindsight your intuition fooled you constantly, adjust it.

I agree but it's kind of too vague to have as a company/team-wide policy

Sure, but I wouldn't implement something like that as a policy, but as a guideline. So when someone really goes overboard in one direction or the other, you can point them to the guideline, but there is still some freedom in deciding on the spot.

If the need / opportunity to abstract something is highly subjective then it is best left to the team lead / senior architect. For all other obvious cases having a policy as outlined above strikes a healthy balance between autonomy and uniformity.

While I usually like the zero-one-infinity rule as a go-to when there aren't any other constraints, when trying to build an abstraction it can be fairly tricky to suss out the parts that are actually shared vs. what is actually different. Two unique and independent users could share a lot of process &c randomly; with 3 that's a little less likely.

> designing an abstraction for 3 clients is actually quite complex.

That tells you how abstract "abstraction" is (to a limited extent).

It has been shown over and over to be a good number for this purpose.

You don't design the abstraction for 3 different clients as often as you abstract it from code used by 3 different clients.

The rule of 3 is catchy like you say, which means programmers will have a better chance of remembering when it's needed.

But a catchy name serves only to be catchy it doesn't serve as justification for the rule actually being correct.

Yeah, I don't like these overly specific rules of thumb either. It's superstition that gets invoked during code reviews to avoid having to explain or justify your arbitrary nagging on the reviewing side, or to defend a bad layout on the other.

Stuff should be analyzed in its context.

> Can I see exactly enough on this page to know what it does? Not more, not less

Is there some book/website/SO post that tries to drive this point home? Basically some web resource I can link other programmers to, to explain the value of coding this way.

This article is from a personal blog on a website with a URL $someguysname.ninja.

Maybe you should be the one to write the article you seek! Believe in yourself. If you find yourself with steadfast values that you find tedious to repeatedly communicate, but that you think others ought to know about, why not write them down? Who knows - if it's good, resonates with others, and is sound advice, one day it may end up on HN too.

Not everything worth doing has already been done before!

One of the best products I've had to maintain recently was a CGI app with very few abstractions. Many of the pages in the app didn't even have functions: just construct SQL, read it, and spit out HTML. If someone had a problem, all the code was right there in a single file, and the error could be found, patched, and deployed in minutes.

Over the years there were a couple of attempts at replacing this legacy system with a "well-architected" .NET one, but all the architecture made things harder to maintain, and it only ever got to a fraction of the functionality. When there was a bug in those ones, we had to not only find it but also go through every other bit of calling code to ensure there were no unwanted side effects, because everything was tied together. Often the bug was in some complicated dependency, because spitting out HTML or connecting to a database wasn't enterprisey enough. Deployment was complicated enough that it had to be done overnight, because the .NET world has a fetish for physically separating tiers even though it makes many things less scalable.

90% of the corporate/enterprise code I've seen would be much better off being more like that cgi app.

Counterpoint - code like that is OK if the project is small and tidy, but over a certain size, changes become horrible refactoring efforts and adding multiple developers to the mix compounds the problem. The 'enterprisey' rework that you describe sounds badly architected, rather than an example of why architecture is bad. Good architecture is hard to do but I don't agree that means we're better off not bothering.

My first programming job was with a firm that never had money for paying developers, let alone tools. It was also a few years before Visual Studio Code was a serious thing. So I used "programmer's editors" -- those cute things like Notepad++ which had syntax highlighting and on some days autocomplete but no real code understanding. There was no middleware, no dependency-injection, and things like the database instance were globals. More or less, the things you needed to know were in a single file or could be inferred from a common-libraries file.

My second job, they splashed the cash for full-scale professional IDEs, and they couldn't get enough abstraction. I suspect the convenience of "oh, the tools will let us control-click our way to that class buried on the opposite side of the filesystem" made it feasible.

I wonder if there's some sort of "defeatured" mode for IDEs which could remind people of the cognitive cost of these choices.

> Extra abstractions for a future that has yet to happen, and abstractions because the IDE makes it easy to click through the layers, are by far the number 1 reason I've seen codebases become very difficult to maintain.

This is always tempting. A good argument against it is to realise that future developers (us included!) will know their requirements far better than we can guess them; if code needs writing, they should do it (as an extra bonus, we don't waste effort on things which aren't needed). The best way to help them is to avoid introducing unnecessary restrictions.

> The best way to help them is to avoid introducing unnecessary restrictions.

But that's the other side of the exact same coin. How do you know if a restriction today is good or bad for the future? Restrictions prevent misuse and unexpected behavior, in the good case.

Incidentally one of the biggest benefits I see of using a text editor like vim / emacs is that it really encourages good code management.

It's not to save the ~10 minutes per year in faster key strokes to manipulate your code. It's about the way it shapes your thinking about how you code.

I agree to some extent.

After using Intellij for about 5 years I switched to a less batteries-included code editor (currently doom emacs). I figure if I need an IDE to navigate our code as a senior developer on the project then less experienced ones don't stand much of a chance.

I still use Intellij for refactoring.

Without a doubt this is my biggest issue within the software industry. Massive amounts of indirection & abstraction under the guise of 'DRY'.

Golang and simplicity in the same sentence does not quite reflect my daily experience.

Want a Set? Golang does not have one; create a map[T]bool instead.

Want an Enum? Golang does not have one; create a bunch of constants yourself that are not tied together by a type, or create your own type that won't quite match what an Enum is.

If simplicity means feeling like you are programming in the 80's, that is what Golang meant for me with simplicity.

Not having basic stuff such as a Set, and having to work around it with a map of booleans, is not simplicity: you have to build it yourself, turning the code into a more complex blob just to represent the same kind of data structure.

I could go on and on with the list of things that lack instead of things that are simple. </rant>

I recently had to write some Go code and coming from Java/Scala world, actually the "err != nil" thing didn't bother me as much as I thought it would. In fact I liked the explicitness of error handling. However, lack of enums really puzzled me; how is having to go through "iota" hoops simpler than "enum Name { ... choices ... }"? I did like the batteries included approach though - I could build the entire component purely using standard library - not having to deal with importing requests and creating virtualenv etc was refreshing.

FWIW the set equivalent in Go would be map[Thing]struct{}; you don’t need a bool map value.

That being said I’m looking forward to more collection types now that they have a plan for generics.
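For reference, a small Go sketch of both workarounds discussed above: a struct{}-valued map as a set, and iota constants as a not-quite-enum. Note the last line of main, which shows the usual complaint about iota "enums": nothing stops you from comparing or mixing them with plain integers:

```go
package main

import "fmt"

// A not-quite-enum: a named type plus iota constants.
type Color int

const (
	Red Color = iota // 0
	Green            // 1
	Blue             // 2
)

func main() {
	// A set emulated with map[T]struct{}; the empty struct
	// values occupy no space, unlike bool values.
	seen := map[string]struct{}{}
	seen["a"] = struct{}{}
	seen["b"] = struct{}{}

	_, ok := seen["a"]
	fmt.Println(ok)        // prints "true": membership test
	fmt.Println(len(seen)) // prints "2": set size

	// Unlike a real enum, a Color is just an int underneath:
	fmt.Println(green() == 1) // prints "true"
}

func green() Color { return Green }
```

The compiler offers no exhaustiveness checking either, so a switch over Color that misses a case compiles silently; that's the gap people mean when they say iota isn't an enum.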

As I get older my code gets a little more verbose and a little less idiomatic to the language I am writing. I’ve been writing code, starting with C, since '95. Mostly Python these days, but I try to make it clear and easy to read. Mostly for myself. Future me is always happy when I take the time to comment my overarching goals for a piece of code and make it clean and well composed with enough, but not too many, functions.

> well composed with enough, but not too many, functions

In my experience, code with too many functions is more difficult to grok than spaghetti code. It's like trying to read a book with each sentence referencing a different page. So, I try to code like I would write: in digestible chunks.

> As I get older my code gets a little more verbose

I've seen too many of my previous projects die right when I moved on. Now I tend to write code as if it were written by a beginner: verbose and boring, with no magic.

On the other hand, no abstractions is like reading a book where each and every thing is spelled out in the utmost detail. Instead of telling you “I’m fuelling the car”, I’ll tell you: “I’m walking to the entrance hall. I’m picking up the car keys. I’m putting on my shoes. I’m putting on my jacket. I’m unlocking the front door ...”. You see where this is going. And here we already assumed that things like “putting on shoes” are provided by a standard library.

There seem to be two types of programmers: one that can read a line of code like theCar.fuel() and trust that you, in the current context, understand enough of what the call does that you can continue reading at the current level of code. This type of programmer doesn’t mind abstractions even if a function is called in only one place.

The other type of programmer must immediately dig into the car.fuel code and make sure she understands that functionality before she can continue. And of course then each and every call becomes a misdirection from understanding the code, and of course for them it is better if everything is spelled out at the same level.

I’ve seen quite a bit of code written by the second type of programmers, and if you don’t mind scrolling and don’t mind reading the comment chapter headers (/* here we fuel the car */) instead of all the code itself, it can be reasonably readable. But there’s never comprehensive testing coverage for this kind of code, and there’s usually code for fuelling the car in four different places because programmers 2-4 didn’t have time to read all the old code to see if there was something they could reuse, and just assumed that no one had to fuel the car before since there wasn’t any car.fuel() method.

I have had the good fortune to have never worked in a codebase with the characteristics you describe. But I have seen some issues with theCar.fuel(), and that’s generally around mutability and crazy side-effects. I think most of these are pragmatically overcome by adoption of functional paradigms and function composition over inheritance or instance methods.

We're still lacking good tools in our own toolbox. If IDEs could expand function calls inline (not a header in a glassbox, but right in the code), both worlds could benefit. Expand all calls to depth 2 and there is a detailed picture. Collapse branches at edit time based on passed flags/literals and there is your specific path.

Hmm. There’s a vim sequence to accomplish this that you could macro. But even so, don’t most IDEs give you somewhat more than a glassbox header? I’m almost certain I’ve seen people scrolling and even editing code in the “glassbox” preview pane in VSCode.

AFAIR, it doesn’t inline and overlaps with the code behind it. If that is not true, it may be closer, but my experiments somehow failed to show its benefits over “just open to the right pane”. Maybe I should check its config thoroughly. As a vim user, I’m interested in the method you described: is it a handmade :read/%v%yp-like thing or an existing plugin?

Then you have an electric car, and you use the fuel method and add a special case for isElectric handling inside. And some other dev uses lamp.fuel since it already handled isElectric internally. But later, we have to differentiate between different types of charging and battery vs constant AC and DC power. Then someone helpfully reorganizes the code and breaks car.fuel because the car does have a battery too. And then ....

No, you don’t. The alternative implementation is that you either go through all code where car is used and add conditionals for all the cases where the kind of fuel matters, or, as is very common for this kind of programming, just copy the whole car.roadTrip() where fuel is called into the method electricCar.roadTrip and change a few lines. Then of course all requirement changes or bug fixes must be done in several places thereafter.

My feeling about people that can’t handle abstractions is that they just haven’t had to create or maintain anything complex. Very few real-world systems can be kept in one’s mind in full.

I agree. This whole discussion preferring long functions seems like advocacy for bad code to me.

It is just ... I have seen both types of code, and if written by someone else, code that at least attempts to segment things into chunks that clearly don't influence each other (functions with local variables) is massively easier to read.

I think it’s honestly just folks talking past each other because these situations are isolated judgement calls, and some folks feel that

    // #1, in essence
    result = a => map => reduce => transform
is easier to read and understand, while others feel that

    // #2, in essence
    aThing = a => map
    aggregation = aThing => reduce
    result = aggregation => transform
is easier to read and understand. Folks in camp #1 think camp #2 is creating too much abstraction by naming all the data each step of the way, and camp #2 thinks camp #1 is creating too much abstraction by naming all the functions each step of the way.

Really it’s just these two mental modalities butting up against each other, because you will separate your layers in different ways for increased clarity depending on which camp you fall into. What makes things clearer for camp #1 makes things less clear for camp #2, and vice versa.

That’s my suspicion anyway: the premise of the discussion is just a little off.
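A toy Go version of the two camps, using a made-up word-length pipeline as the stand-in computation. Both compute the same thing; the only difference is whether the steps are composed in one expression or named along the way:

```go
package main

import (
	"fmt"
	"strings"
)

// mapLens maps each word to its length.
func mapLens(words []string) []int {
	lens := make([]int, len(words))
	for i, w := range words {
		lens[i] = len(w)
	}
	return lens
}

// sum reduces a slice of ints to their total.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	input := "write code not too much"

	// Camp #1: compose the steps in one expression;
	// the functions carry the names.
	result1 := sum(mapLens(strings.Fields(input)))

	// Camp #2: name the data at each step of the way.
	words := strings.Fields(input)
	lengths := mapLens(words)
	result2 := sum(lengths)

	fmt.Println(result1, result2) // prints "19 19"
}
```

Seen side by side, it's plausible that neither is objectively clearer; each camp is just naming a different axis of the computation.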

The one caveat is that I want to easily be able to find out what fuel() is doing. Preferably nothing like car.getService('engine').run('fueling'). Code navigation is very important, preferably doable via ctrl+f since that makes review easier. Most people just use the browser tools for reviewing code and don't actually pull the branch into their IDE.

You're asking for code that passes The Grep Test:


Not sure why you are downvoted; I completely agree.

So many levels of indirection is the recipe for the modern goto.

> In my experience, code with too many functions is more difficult to grok than spaghetti code. It's like trying to read a book with each sentence referencing a different page. So, I try to code like I would write, in digestible chunks.

This is so true. The worst code that I've dealt with is the code that requires jumping to a ton of different files to figure out what is going on. It's usually easier to decompose a pile of spaghetti code than to figure out how to unwrap code that has been overly abstracted.

My experience has been that real-world spaghetti is almost always made of overly abstract and poorly thought out abstractions. You know: you get a stack trace and you end up on a journey in the debugger for 5 hours trying to find any actual concrete functionality.

Compared to someone writing inline functions that do too much, the wasted brain hours don’t even come close

It's also often very deeply nested and follows different paths based on variables that were set higher up in the code, also depending on deeply nested criteria being met. Bugs, weird states, bad error handling and resource leaks hide easily in such code.

In my experience refactoring out anything nested >3 levels immediately makes the code more readable and easier to follow - I'm talking about c++ code that I recently worked on.

Decomposing into functions, and passing the required variables (as const or not) to functions that then do some useful work, makes it clear what's mutated by the sub-functions. Make the error-handling policy clear and consistent.

Enforce early return and RAII vigorously to ensure that no resources (malloc,file handles,db connections, mutexes, ...) are leaked on error or an exception being thrown.

And suddenly you have a code base that's performant, reliable and comprehensible where people feel confident making changes.

I disagree. I think the central thesis of Clean Code still holds up. You should never mix layers of abstraction in a single function.

That more than anything is what kills readability, because context switching imposes a huge cognitive load. Isolating layers of abstraction almost always means small, isolated, single-purpose functions.

> I think the central thesis of Clean Code still holds up. You should never mix layers of abstraction in a single function.

I agree up to a point, but I find this kind of separation a little… idealistic? I prefer the principle that any abstraction should hide significantly more complexity than it introduces.

At the level of system design, there probably are some clearly defined layers of abstraction. I’d agree that mixing those is rarely a good idea.

But at the level of individual functions, I have too often seen numerous small functions broken out for dogmatic reasons, even though they hid relatively little complexity. That coding style tends to result in low cohesion, and I think the cost of low cohesion in large programs is often underestimated and can easily outweigh the benefit of making any individual function marginally simpler. If you’re not careful, you end up trading a little reduction in complexity locally for a big increase in complexity globally.

Here’s some counter-argument pseudo-code:

    // v1, mixing layers of abstraction
    x = a if exists, else first()
    y = b if exists, else second()
    result = third(x,y)

    // v2, abstraction
    result = getResult(a,b)
In v1, we have the semantics of x and y, so we understand that a “result” is obtained through the acquisition of x and y. Whether we need to understand this is a judgement call. But v2 opens a different “failure to understand” modality: “getResult” is so blackboxed that the only thing it really accomplished is indirection, without improving readability.

I love Clean Code, but I think it sometimes prematurely favors naming a new function and the resultant indirection.
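For concreteness, here is one possible Go rendering of the two versions. Everything here is a hypothetical stand-in: first, second, and third are placeholder functions, and pointers stand in for the "if exists" checks:

```go
package main

import "fmt"

func first() int         { return 10 }
func second() int        { return 20 }
func third(x, y int) int { return x + y }

// v1: the caller sees how x and y are chosen, so the
// semantics of the inputs are visible at the call site.
func resultV1(a, b *int) int {
	x := first()
	if a != nil {
		x = *a // a if exists, else first()
	}
	y := second()
	if b != nil {
		y = *b // b if exists, else second()
	}
	return third(x, y)
}

// v2: the same logic boxed behind one name. The indirection
// hides the x/y semantics without hiding much complexity.
func getResult(a, b *int) int {
	return resultV1(a, b)
}

func main() {
	a := 1
	fmt.Println(resultV1(&a, nil))  // prints "21" (x=1, y=20)
	fmt.Println(getResult(&a, nil)) // prints "21", but how is opaque
}
```

The point of the counter-argument survives translation: getResult hides barely any complexity, so the abstraction's cost (one more hop) is not obviously paid for.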

Yes context switching is a huge cognitive load. Abstractions enforce context switching.

The primary motivating reason to have abstractions in the first place is to prevent context switching - i.e. you shouldn't have to think about networking code while you're writing business logic.

I’d say that’s a sign that either it’s the wrong abstraction, there’s implicit coupling (a distinct variant of the wrong abstraction), or both sides of the abstraction are in so much flux that the context switching is inevitable until one or both layers settle down.

> In my experience, code with too many functions is more difficult to grok than spaghetti code.

Because it is, just on multiple plates.

> It's like trying to read a book with each sentence referencing a different page.

Yes!!! I've been trying to teach juniors that if the function itself has 4 levels of abstraction, even if the names are readableFunctionThatDoesXwithYSideEffect, it is harder to understand, Ctrl+clicking downwards into each little mini-rabbit-hole. Just keep the function as an 80-liner, not a 20-liner with 4 levels of indirection, with functions that are only used once (inside the parent function). Ugh.

The key concept that always helps me is to minimize side effects per function. One thing goes in, one thing comes out (in an abstract sense). Multiple side effects start getting dodgy, as they make the function harder to predict and reason about. I do err toward longer, easier-to-read functions. And don't decompose into functions until it's clear you will actually reuse the code, or you actually need to :) DRY is good, but premature composition is just as annoying as premature optimization.
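A toy Go illustration of that "one thing in, one thing out" idea, contrasting a function that mutates shared state with a pure one (both functions are made-up examples):

```go
package main

import "fmt"

// Side-effecting version: mutates a package-level variable,
// so a caller can't predict the outcome from arguments alone.
var total int

func addToTotal(x int) { total += x }

// Pure version: the result depends only on the arguments,
// so every call is predictable and trivially testable.
func add(total, x int) int { return total + x }

func main() {
	addToTotal(3)
	addToTotal(4)
	fmt.Println(total) // prints "7", but only because nothing else touched it

	t := add(add(0, 3), 4)
	fmt.Println(t) // prints "7", always
}
```

The pure version also composes without ordering hazards: two goroutines calling add never race, while two calling addToTotal would need a lock.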

Yes! "AHA" (Avoid Hasty Abstractions) is the remedy to too much "DRY".

They all probably read Clean Code; the discussion on functions in that book may be the most harmful/costly to programming in the last 20 years.

Is there a good blog post/writeup on this idea? I've seen it mentioned before in other threads... and I agree 100%.

That's for a specific case of high-performance code and removing duplicated work.

Referencing Carmack always has to be in the context of high-performance code.

Even Carmack himself has started to like functional code, which normally leads you down the path of small pure functions.

No, that is not for a specific case of high performance... It's for the non-specific case of keeping the code clear, understandable, and bug-free. The style was chosen for these reasons, not because it is more performant. It just happens to also be more performant than the layers of indirection that also harm understandability.

For a procedural code base, avoid subprocedures that are only called once.

For a pure functional codebase, e.g. Haskell, locally-scoped single-use functions can be beneficial.

Clean Code is still a great read in 2020, but I think you’re right about some of the specific advice about functions.

I think you may find it difficult to test an 80-liner... There is way too much happening.

You still have to test all the 80 lines if they're broken down into multiple functions, so it's something that you have to evaluate on a case-by-case basis.

It might even make it harder to test: if you break a function wrong, then you might end up with a group of functions that only work together anyway.

For example: when you break a big function into 3 smaller ones. If the first acquires a resource (transaction, file) and the third releases it, then it might be simpler to test the whole group rather than each one separately.
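A minimal Python sketch of that situation (all names invented): a big function split around a shared resource into acquire / work / release, where the middle piece only has a sensible contract inside the group.

```python
def open_resource(state):
    state["open"] = True
    return state

def do_work(state):
    # Only valid between open_resource and close_resource.
    if not state["open"]:
        raise RuntimeError("resource must be open")
    state["result"] = 42
    return state

def close_resource(state):
    state["open"] = False
    return state

def process():
    # The group as a whole has the clear contract; testing do_work in
    # isolation needs artificial setup that mirrors open_resource.
    state = close_resource(do_work(open_resource({"open": False})))
    return state["result"]
```

Here a single test of `process` exercises the real contract, while per-piece tests would mostly re-encode the coupling between the three functions.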

Breaking an 80 line function into to 8x 10 line functions does not necessarily make it easier to test. Most of the time it just adds unit testing busy work, for no clear benefit. This becomes more clear if you imagine you wanted to test every possible input. Splitting the function in 8ths introduces roughly 8x the work, if each new function has the same number of possible input states. The math is more complicated in the general case, so you have to evaluate it on a case-by-case basis. Also, if you're trying to isolate a known bug, it might be beneficial to split the function and test each part in isolation.

Depends on the language. In general I find the way many unit tests are written to be very brittle. There is a balance here. If the 80 lines are clear and easy to understand they will likely be easy to test also. It’s very situational though. An 80 line function isn’t that bad. Check out the SQLite code base, which is extremely well tested, or the linux kernel. C code tends to push out the line count. Whereas 80 in Python is probably a bit much. Some libraries, especially GUI code tend to take a lot of lines, mostly just handling events and laying things out and there you often see big functions as well.

It's Java.

Perhaps we just imagine different things, but I like when code is a list of human-readable calls to functions. The implementation of these functions isn't so important to understanding the code you're reading.

This works really well as long as you use pure functions, because their impact on behaviour is clearly restricted.

John Ousterhout's "A Philosophy of Software Design" is an interesting alternative to Clean Code.

"I've seen too many of my previous projects die right when I moved on. Now I tend to write code as if it were written by a beginner: verbose and boring, with no magic."

There's nothing wrong with charming magic in your code, if it really does something special and is not just used for its own sake. It only turns into dark magic when you forget, or are too lazy, to add proper documentation in the end. Which... happened to me too many times.

But otherwise, very much yes. Clarity and simplicity should always be goal number one. But since simplicity is hard to reach at times and time is short, it is always about balance.

I once read a quote, possibly here on HN that said:

"Code first for the machines, then for others that will maintain your code and lastly for yourself."

And that I think for me nicely strikes the balance.

Hm, coding for the machine would, to me, mean always writing processor-optimized code.

And I rather have clear, maintainable code - which is easier to work with and therefore less filled with bugs.

If you are comparing reading code with reading books, then surely you have read books with unfamiliar words whose definitions you had to look up, and then you might have had to recursively look up unfamiliar words in those definitions as well. Then, once you internalized the sub-definitions, you returned to what you were reading with a better understanding.

The difference between code and books is that programmers can freely and naturally define functions. I wonder if some people complaining about too many functions never actually learned how to read code in the first place.

> In my experience, code with too many functions is more difficult to grok than spaghetti code.

In a way, it kind of _is_ spaghetti code. Even if there's no back references, it turns a single train of thought into a string of entrances and exits.

The best label I've heard for the excessive layers anti-pattern is "lasagna code."

Lasagna code isn't meant to be derogatory, just a description.

https://wiki.c2.com/?LasagnaCode https://en.wikipedia.org/wiki/Spaghetti_code#Lasagna_code https://matthiasnoback.nl/2018/02/lasagna-code-too-many-laye... https://dev.to/mortoray/what-is-your-tale-of-lasagne-code-co...

Generally used with a negative connotation. C2 also discusses how the layers can become entangled/stuck with one another and difficult to replace, which seems to fit the metaphor.

For describing layered code in a non-negative fashion, just saying "layered (or "modular") seems most typical.

It's easy to say ugh, but we juniors are more than willing to learn "the right way". This is the hardest part for me. I get anxiety about it and it slows me down.

How do I apply this to taking over someone else's 4 year old Magento project? We're out here doing our best, and sometimes our learning environments are in that context.

"the right way"

I would say, don't stress about it too much. There is no perfect way. Everyone makes mistakes. Knowing when to make abstractions and when not to is mostly a matter of experience. Some modules are worth optimizing and abstracting; others are not. You will definitely make wrong decisions and only later find out that this optimization was a waste of time, or that that quick-and-dirty approach really cost you much later on. We all did that, and still do.

Much worse than making a wrong (design) decision is making no decision at all - because mostly you have to decide for something and then just go with it. Overthinking things seldom helps. What helps me sometimes is, putting a special hard problem to the side if I am stuck and solve something easier first. Then after some time, when I get back to it, things are much more clear.

But I also wasted too much time thinking about the right approach in a never-ending, never-progressing loop to achieve perfection.

Now my question is not, is it perfect or shiny, but: Is it good enough?

What matters is, that shit gets done in a way that works.

> wasted too much time thinking about the right approach in a neverending, neverprogressing loop to achieve perfection

A CEO from my past often muttered that "perfect software comes at infinite cost". It's key, imo, to identify which components of what you are building _must_ be perfect. The rest can have warts.

"to identify which components of what you are building _must_ be perfect"

Well, but by the words of your former CEO (and in my opinion), those parts would then have infinite costs too... if they really need to be perfect. I mean, it is awesome when you make a big feature change and it all just runs smoothly without problems because your design was well thought out, but you cannot think of every future change to come. And when you still try, chances are you get stuck, waste your time, and risk the flow of the whole project. I tend to think about the current needs first and the immediate future second, but beyond that I don't spend much thought.

> risk the flow of the whole project

Agreed. What I mean by "perfect" is: for a given part/component/decision/etc, take the time (an always-limited resource) to learn as much as possible and contemplate more than just the seemingly obvious path forward. Take security for example. I'd rather 'waste time' now making sure I'm covering any gaps in that realm before shipping.

OTOH, maybe some jacked-up abstraction/incorrect tool choice/ugly UI/etc. is something that can wait a few sprints or longer. At least you can plan when to deal with those. Security breaches tend to plan your day on your behalf. :)

I am a junior developer too. Questions like this are better suited to your manager. Mine gives me constructive feedback at regular intervals, and I also reflect on my own work and look at other people's work.

"Lasagna code" might be a better term.

I meant to write “not too many” functions of course :)

> Mostly Python these days, but I try to make it clear and easy to read.

Which is why I enjoy languages that let me do this without getting too hung up on performance. It's curious that you bring up Python, because idiomatic Python (especially where math libs are concerned) seems to vastly favor brevity/one-liners over all else. It's nice to hear that a veteran is favoring clarity.

> idiomatic Python (especially where math libs are concerned) seems to vastly favor brevity/one-liners over all else

I can’t speak to math libs, but in my experience with server-side development, Python devs tend to (often even religiously) cite PEP style guides favoring explicitness and verbosity. I think there may have been a shift as Python got a lot of uptake in scientific and ML communities, and I hope that hasn’t seriously impacted the rest of the Python community because, while I don’t especially love the language/environment, I deeply appreciated the consistency of valuing clear and readable code.

> cite PEP style guides favoring explicitness and verbosity.

Explicit is better than implicit, always has been, always will be. Granted, I've been writing backend/server-side Python code for 15 years now, so that might be one of the reasons.

For what it’s worth, having spent the last few years writing server-side TypeScript, I’ve evangelized “explicit is better than implicit” fairly aggressively. A lot of even seasoned TS developers are still mainly accustomed to JS interfaces, and fairly often their first instinct is to cram a lot of semantics into a single variable or config flag. I’m glad I spent a few years working with PEP-8 fanatics. It made me much better at thinking about and designing interfaces.
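A small Python sketch of the contrast (the function and flag names are made up): the same behaviour hidden behind one overloaded "mode" flag versus spelled out as explicit parameters.

```python
# Implicit: one 'mode' string secretly controls retries AND caching.
def fetch_implicit(url, mode="fast"):
    retries = 0 if mode == "fast" else 3
    use_cache = mode != "fresh"
    return {"url": url, "retries": retries, "cache": use_cache}

# Explicit: each behaviour is its own named parameter, so the call
# site documents itself.
def fetch_explicit(url, retries=0, use_cache=True):
    return {"url": url, "retries": retries, "cache": use_cache}
```

The implicit version forces every reader to memorize what each mode string implies; the explicit version needs no such lookup table.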

I'm someone who came to the server side of things from the scientific Python community. IMO, that community is still learning how to incorporate Python's best practices to cater their very specific needs.

For example, if you're writing a plotting library geared towards data scientists, you're almost forced to pick brevity over verbosity, even if that means violating some of Python's core philosophies. Data scientists usually come from non-compsci backgrounds, and almost 90% of the code they write doesn't go to production. So they usually prefer tools that help them get the job done quickly, and they write tools following the same philosophy.

Right. And a lot have come from other languages like R where that’s more common.

If I were building a library for something like that, I’d build the core idiomatically, then expose an idiomatic API with aliases for brevity. I’d make sure the alias is documented in the docstring, and types refer to the idiomatic names. I know TIMTOWTDI isn’t entirely “pythonic”, but it’s a small compromise for probably a good maintainability boost.
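A tiny Python sketch of that approach (library and function names invented): a documented, idiomatically named entry point plus a short alias that delegates to it and points back at the canonical name.

```python
def scatter_plot(x_values, y_values, point_size=10):
    """Canonical, explicitly named entry point."""
    return {"kind": "scatter", "x": x_values, "y": y_values,
            "size": point_size}

def sp(*args, **kwargs):
    """Brief alias of scatter_plot for interactive use.

    See scatter_plot for full documentation; keeping the docstring
    pointed at the canonical name preserves discoverability in help().
    """
    return scatter_plot(*args, **kwargs)
```

Defining the alias as a thin wrapper (rather than `sp = scatter_plot`) lets it carry its own docstring without clobbering the canonical one.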

There is a point where fitting a little more code on one screen actually helps. Usually not though. Our brains can only see a screen at a time. There is some optimal mix of terseness, especially when you know your reader (probably you in a few months!) will grok it, vs verbosity. If I find myself untangling a single line down the road in my brain it was too complex. Python is already so expressive! We all find our style, but generally I know I did it right if I look back at code at think “wow that’s easy to understand” vs “hmmn, what was I thinking there?” Heh.
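A small Python illustration of that trade-off (contrived data): a dense one-liner versus the same logic given a name and a docstring.

```python
data = [("alice", 3), ("bob", 0), ("carol", 7)]

# Terse: correct, but it takes a second read to untangle.
top_terse = sorted((n for n, s in data if s > 0), key=str.lower)

# Spelled out: the intent is named and documented.
def active_names(records):
    """Names of entries with a positive score, sorted case-insensitively."""
    names = [name for name, score in records if score > 0]
    return sorted(names, key=str.lower)

top_clear = active_names(data)
```

Both are fine Python; which one "reads right" in a few months is the test the commenter describes.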

The bold utilitarian approach of Go might face some valid criticisms from seasoned programmers. I myself had to empty my cup (mostly Java) to get on board with Go, and I'm glad that I did.

After a spine surgery my programming time got severely limited and so I decided to code my future projects with utility focused languages. I had used Python in the past, but the performance tuning once the application scales is counterproductive and expensive to say the least.

I wanted a language which has predictable performance, decent standard library and most importantly not waste my time; time I can focus on my health. Go was the answer, even if it meant that I had to let go of some of my decade long programming patterns and practices.

Now my only wish w.r.t. Go's future is for it to stick with its utilitarian philosophy and not succumb to pressure to include features which might compromise it, leading to several forks of Go.

How do you feel about coding without generics now, and what do you think about Go's ambition to add them?

I'm coming from a C# mindset and thinking of learning Go, but I'm so used to generics...

That's what I meant when I said that I had to empty my cup, and it's unnecessary for most if they're happy with their current language.

As for the inclusion of generics, I'm divided. I'm eager to use generics again in my current go-to language, but on the other hand I'm worried: if this is the direction the Go language design team is going to take, then where will it end?

>>It made me think about the readers of my code (...) I don't always need to have n+1 layers in my architecture where all the layers just call the next layer anyway.

Your assertion doesn't make sense. N-tier architectures are primarily motivated by the needs of said reader of the code, because they provide a clear understanding of how the overall code is organized.

More importantly, it provides a clear idea of what code is expected to call which code, and makes it clear that dependencies only go one way.

I have no idea what leads people to believe that ad-hoc solutions improvised on the spot are helpful to the reader instead of clear architectures where all the responsibilities and relationships are lined up clearly from the start.

In practice, it rarely turns out that way. I have to deal with large, mature Java codebases for some of my work. The good thing is that the code does just about everything well and rarely breaks. The downside comes when something does break and I have to debug the code. At some point in the Java world, best practice became building abstraction on top of abstraction on top of abstraction. And often these abstractions just call the next abstraction. Well, that makes finding the offending line of code extremely difficult and time-consuming unless you are an expert in the codebase. Had the exact same code been written with fewer abstractions, debugging would be a lot easier.

I am not against abstractions, but I think they lead to hard-to-read, hard-to-debug code when overused. I think they need to be used wisely rather than by default.

> At some point in the Java world, best practice became building abstraction on top of abstraction on top of abstraction.

It really doesn't. There is nothing intrinsic to Java that forces developers to needlessly add abstractions.

If your codebase has too many unwarranted abstractions to the point it adds a toll to your maintenance, it's up to you to refactor your code into maintainability.

And no, n-tier architectures do not add abstractions. They never do. At most, you add an interface to invert the dependencies between outer and inner layers, which does not create an abstraction. Instead it lifts the interface that was always there, and ensures that you don't have to touch your inner layers when you need to fix issues in your outer layers.

It's hard to stay simple when the number of users grow. Go will probably not stay simple for much longer (with generics and whatnot).

One thing that I don't understand about the ecosystem is the hate towards GOPATH. Why introduce a complex dependency system for a package manager when you can just pin submodules with git and reap the same benefits? :)

GOPATH is hated because it's poorly thought-out. It's poorly thought-out because Go is designed by Google, who uses Bazel for dependency management. GOPATH is only there because you can't expect everyone to adopt Bazel in order to adopt Go, so some half-assed solution gets designed to get the language out the door.

In simpler terms, the people who designed the language don't use GOPATH at all. That's why it's terrible.

I don't think GOPATH is poorly thought out at all. Dependency-environment-locating is a PITA. Off the top of my head, I can't think of a single package management system that doesn't use universal installs, FOO_PATH or "giant local dump per project".


Team universal install:

- apt, yum, brew

Team PATH:

- PYTHONPATH (which Conda, virtualenv, etc modify)

Team redundant local blob:

- Node

- pipenv

Rust is probably the least-half-assed (most full-assed?) model, with both a sane user-wide default for the cache (~/.cargo), a way to change that default, and project location flexibility.

But I actually love the Go notation so much that I've opted to organize most of my code around the ~/namespace/src/domain/repo scheme. I never lose track of where a folder is :)

> I've opted to organize most of my code around the ~/namespace/src/domain/repo scheme. I never lose track of where a folder is

Yes, I do the same! I don't write any Go lately, but I really appreciate organizing things this way.

Nix isn’t any of these? It installs each replicable version of a package once, but it’s not visible outside of the project that uses it.

Never worked with Nix, though it looks interesting.

It has rough edges, but I find it one of the better developer experiences.

Is your home dir chock full of namespace folders?

Nope, just two or three. Most lives in ~/ao (easy to type on dvorak), some is in ~/rd (random), some is in ~/tmp. I don't really work on enough variety of projects to deal with collisions.

How does pinning submodules give you the same benefits as a package manager?

I take issue with some of the decisions that went into Go, but I definitely respect the overarching philosophy of keeping things simple and not giving teams enough rope to hang themselves with

Instead you only give them string, so if they want a rope, for any reason, they have to make it themselves.

Every time.

I think functions are a good enough abstraction for many things. A few years ago I tended to make everything a class in Python. Nowadays I rarely need more than functions. Learning Rust made me realize just how arbitrary my aesthetic ideas about code were. When I tried to go the class-based, object-oriented route in Rust, it failed spectacularly because I quickly found myself unable to navigate the maze of ownership. Once I let go of those ideas, everything became incredibly straightforward. The spell has been broken.

That being said, I think module borders have become more important to me. Keep separated what is meant to be separated.

This can be summarized as

"It's better to repeat yourself than use the wrong abstraction."

It often happens that, in the attempt to be DRY, we add a parameter or some condition to handle a new variation of what seems like a universal logical construct in the code. Do this enough times and the code is no longer comprehensible to any of the people who wrote each variation, let alone a newcomer. We mistake some commonalities for universality. We become zealots.
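A hypothetical Python illustration of that parameter accretion: one "DRY" helper that grew a flag per caller, versus a little harmless repetition.

```python
# The "universal" helper after a few rounds of new variations: every
# reader must now trace every flag combination.
def format_name_dry(first, last, *, legal=False, initials=False, upper=False):
    if initials:
        name = f"{first[0]}.{last[0]}."
    elif legal:
        name = f"{last}, {first}"
    else:
        name = f"{first} {last}"
    return name.upper() if upper else name

# The "repeat yourself" alternative: each caller's need is obvious,
# at the cost of a couple of near-duplicate one-liners.
def display_name(first, last):
    return f"{first} {last}"

def legal_name(first, last):
    return f"{last}, {first}"
```

The flags version is "DRYer" by line count, yet every new flag multiplies the paths a reader (and a test suite) must cover.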

> By "functions" here I mean "pure functions".

After programming in Clojure for quite a long time (~2 years), I fully share this sentiment. Using pure functions for business logic (and simple data structures instead of, say, classes and encapsulation) seems to generally make the code simpler and more maintainable in the long run.

> Of course the qualifier is "mostly": this isn't a dogma. Writing a 100% functional system ("going vegan", if you will) often requires you to jump through a bunch of extra hoops to get all the functionality you need.

Also this. Sometimes going fully functional makes things much more difficult, so a little of "impurity" is also fine.

For years now I've felt the same way about functional vs. imperative programming, where functional languages go 'wrong', and what they get right.

There are exceptions of course, but I personally feel that there's three main 'kinds' of code:

    1. functions that define some input/output relation  
    2. query methods on data structures  
    3. modification methods on data structures  
    4. the bodies of the above functions
In a purely functional language all four are purely functional, but this is (IMO) needlessly restrictive. It leads to recursion where iteration is more natural, awkward choices of data structures or even plain impossibility of certain algorithms/data structures (ask a functional programming zealot to implement an O(1) hash map in a pure way—they will usually stammer, try to move goal posts, before finally admitting it's not possible).

Personally I feel that 1 & 2 should be 'pure' and not modify (observable) data and have the same results, but 3 & 4 are perfectly fine if not natural to be imperative and have mutable state.
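A rough Python sketch of that split (names invented): a pure input/output function and a pure query method, with mutation confined to an explicit modification method.

```python
class Inventory:
    def __init__(self):
        self._counts = {}

    # Kind 3: modification method. Imperative, mutates internal state.
    def add(self, item, n=1):
        self._counts[item] = self._counts.get(item, 0) + n

    # Kind 2: query method. Pure with respect to observable state.
    def count(self, item):
        return self._counts.get(item, 0)

# Kind 1: a free function defining an input/output relation;
# it touches no state at all.
def total_items(counts):
    return sum(counts.values())
```

Callers can treat `count` and `total_items` as referentially transparent, while all mutation is funnelled through `add`.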

>It leads to recursion where iteration is more natural

Viewing iteration as more 'natural' than a fold seems mostly down to taste. And hell, if you really want iteration, you can easily get it in both effectful and non-effectful variants through monads.

>ask a functional programming zealot to implement an O(1) hash map in a pure way—they will usually stammer, try to move goal posts, before finally admitting it's not possible

Except that no one - not even Haskell zealots - will argue that you never need effects, but simply that effects should be encapsulated. In Haskell, nothing prevents you from using mutable state if you really need it and mutable hash tables can be easily implemented using something called functional state threads [1][2].

[1] https://www.microsoft.com/en-us/research/wp-content/uploads/...

[2] http://hackage.haskell.org/package/hashtables-

> In a purely functional language all four are purely functional, but this is (IMO) needlessly restrictive.

I concur. My two favorite languages nowadays (Elixir and Clojure) are far from being purely functional. They are functional enough that mutating state is awkward, but if you do need it it is there.

I also think having immutable data structures by default is a saner choice IMO.

> It leads to recursion where iteration is more natural, awkward choices of data structures or even plain impossibility of certain algorithms/data structures

Anecdotal, but I rarely use recursion or for loops, opting instead to define auxiliary functions that work on one element and use map/reduce. Really, I don't remember the last time I needed recursion to solve a problem (for example, I know recur exists in Clojure, but I never see it in the code I work with every day).
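In Python, the same habit might look like this (a hedged sketch, not anyone's production code): helpers that each handle exactly one element or one pair, lifted over a collection with map/reduce instead of an explicit loop.

```python
from functools import reduce

def normalize(word):
    # Works on exactly one element.
    return word.strip().lower()

def longest(a, b):
    # Combines exactly two elements; reduce lifts it over the list.
    return a if len(a) >= len(b) else b

words = ["  Apple ", "fig", " Banana"]
cleaned = list(map(normalize, words))
winner = reduce(longest, cleaned)
```

Each helper is trivially testable on one element; the looping is delegated entirely to map/reduce.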

> I rarely use recursion/for loops

Same here. I use C# and TypeScript mostly. In C# you can use Select for map and Aggregate for reduce, and there's a huge selection of other list-processing operations that make use of lambda expressions. In TypeScript/JavaScript there are map, reduce, etc.

I often write a loop/recur that I later turn into a map/reduce, as you say.

(can't edit anymore but s/three/four/)

I used Clojure (and ClojureScript) in production for several years, and more or less share your sentiments on this. The only strong objection I have is that the lack of types made even simple data structures an occasional nightmare. Especially when most of the composite types can easily be substituted for many functions, it’s trivial to accidentally recurse a string or a keyword where a data structure should be. And the runtime errors (especially in cljs) can be utterly baffling.

I’ve taken what I learned in clojure (and I can just say the article, even just its title, adequately sum it up) and use the same approach in TypeScript. And it’s a breath of fresh air.

I didn't say anything about types, but we use plumatic/schema as a "type system". It is not ideal (for one, it is checked at runtime instead of build time, so we only check types during tests and requests; otherwise the impact would be too great), but it works and keeps our sanity.

Yeah something like that is essential in a dynamic language. To be honest though I don’t think I could go back to runtime-only type checking and having to write those tests where a compiler tests so much automatically.

> where a compiler tests so much automatically

Compilers don't necessarily check anything. Static analysis tools (including static type checkers) check things, and some compilers also include static analysis tools. But static analysis tools are often available beyond whatever a compiler provides.

And, some compilers (sbcl) can do fairly sophisticated static analysis on dynamically typed code.

Yes, you’re right. It’s often but not always the case that type checking and compilation are part of the same tool and flow. Thanks for adding that.

Have you considered Clojure Protocols for this?

I used them. But they’re still runtime-only.

This nugget stood out for me: "In my experience most codebases have a pure functional subset, and I believe writing that subset in a pure-functional style is nearly always a win for the long-term health of the project."

This is probably often true. In high-level applications, the idea of classing or abstracting your I/O out from the core functionality is appealing from a security and reasoning perspective, but I'm not sure if serious developers think that way.

Would you need to understand the rationale behind the codebase from a functional perspective, and even the economics of the business logic behind the features? It's kind of an architects view of "this thing essentially reduces to a queue and if I optimize for this, I get more value."

IMO it's useful for code which is heavy on logic/decision-making. Code which is mainly about hooking systems together and managing state benefits from it much less.
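As a hedged Python sketch of that division (all names hypothetical): the decision-making lives in a pure function, while hooking systems together and doing I/O happens in a thin impure shell.

```python
def decide_discount(total, is_member):
    """Pure business logic: trivial to test exhaustively."""
    if is_member and total > 100:
        return 0.15
    if total > 100:
        return 0.05
    return 0.0

def checkout(order, membership_db, emit):
    """Impure shell: reads state, calls the pure core, performs I/O."""
    is_member = membership_db.get(order["customer"], False)
    rate = decide_discount(order["total"], is_member)
    emit(f"discount: {rate:.0%}")
    return order["total"] * (1 - rate)
```

All the branching that can go wrong is in `decide_discount`, which needs no mocks; the shell is short enough to verify by inspection or one integration test.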

I also find that people who are used to a certain kind of project (e.g. heavily logical) and spend years on it have this tendency to assume all code is the same way.

> code only those things that people at a junior level would recognize for what they do

Couldn’t agree more. But my colleagues make a point of keeping cryptic code cryptic, because comments explaining context and reasoning are for noobs. I guess it’s a mid-level developer fallacy, and until they’ve had enough of pulling their own hair out over this kind of code, they won’t change their minds.

> Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

— Linus Torvalds

That remark from L. Torvalds is, I think, about how to approach problem solving. I'm sure he does have some opinions about code style and organization, but I feel that they would be shelved under a different conversation.

Programming is ultimately about reading and transforming data. When presented a problem, the bad programmer (presumably not well versed in DS theory) thinks of it first in terms of "steps to reach a solution". Whereas the more skilled programmer is able to identify which set of DS is a good match to a specific problem. That is, which combination of structures allow for an efficient access and manipulation of the data in the context of the problem at hand. The implementation then stems from that insight.

Obviously this skill is mostly valuable in performance critical code, which the bulk of our trade generally doesn't intersect with. And in an age of fast processors and abundant memory, it's common to see O(N) data structures applied to O(1) problems and barely anyone notices the cost. Thus keeping us in the comforting illusion that we're better programmers than we actually are.

I'm not sure what exactly Linus means by this, but I read this as: for good programmers, coding is about managing data structures and their relationships whereas bad programmers are more concerned with just getting the thing to work.

There are two kinds of programmer: one thinks code is an artifact and that you can produce quality code; the other thinks code is excrement and only exists as a waste product necessitated by the immaturity of our information modeling tools.

I think it's more that bad programmers focus on surface details like code style. All too often I've seen a code review of a complex feature get derailed by nitpicking of inconsequential things like variable names, formatting, etc.

I rarely comment on code style for this reason. I want the review to focus on functionality, not style. I don't really believe the notion "imperfect code style is a code smell" anyway.

> I don't really believe the notion "imperfect code style is a code smell" anyway.

Blaming developers for something that can be automated? Yeah, something is off here, though not necessarily the review.

> nitpicking of inconsequential things like variable names

To me it sounds like a good thing when there's nothing else to comment on. Making code more readable is a win for everybody.

I have a contrary view on code style. If there's no semblance of consistency to the code you write, how can you possibly formulate a consistent and sensible architecture?

The style itself is inconsequential, but no consistent style is a red flag to me.

> how can you possibly formulate a consistent and sensible architecture?

Similar arguments could be made about actual architecture: "If you can't have well-designed and consistent houses, how can you possibly have a well-designed city?"

The best architectures I've seen, by far, had the worst code. Code is not a systemic level issue. The way the pieces fit together is.

A sad fact of good architectures is they actually enable bad code to exist without severe consequences, because that badness is localized.

Yes. For example, a distributed system composed of various microservices, where the microservices can use different code styles, different programming languages, be maintained by different teams, etc.

Variable names are the nouns of programming though. It’s not really related but I think having an “editor” try to read your text and give you improvement ideas is not “shallow”. The code is there for programmers to read and reason about so optimizing for “easy to load into your brain” code is important too.

let c = ... versus let customer_address = ... can save a lot of sanity throughout the years

I check both when doing reviews. Style and "cleanliness" of the code is important as other people are also expected to work with it. Also I follow the saying that code is more often read than (re)written, so that should not be neglected.

Structure of the code, as in relationship of classes, data flow and high level design is also important and checked during review.

Variable names are the only thing that can reasonably guide you in time of need. Thinking that variable names are not all that important is not a good take.

useful variable names are absolutely a win though

My take is that bad programmers represent information as data as an afterthought.

> what exactly Linus means by this

“I’m a good programmer”, I imagine. I wouldn’t disagree :p

> In my experience most codebases have a pure functional subset, and I believe writing that subset in a pure-functional style is nearly always a win for the long-term health of the project.

I came to the same conclusion ~1 year ago when writing Rust.

It's pretty common to have some "high level" method in Rust which you then split up: some small parts go into other, maybe private, methods, but a lot goes into split-out free functions (though this depends on the task, and some could be made into methods if Rust supported partial self-borrows).

The only thing which sometimes bothers me with this approach is where to place these split-out functions. As long as I don't need their functionality elsewhere, I want them kept close to the function because of which they exist. But they are functions and not methods, so placing them in the `impl` block isn't right. Alternatively, making them free functions in the module that uses them seems better, but that also isn't quite right, as it "blows up" the module...

This is a problem you will face regardless. It is fundamentally a code organization problem and faces all the problems that other organizational problems face. Which is to say, the best you can hope for is to have a reliable way of generating the report you need when you need it.

Which for me suggests that it doesn’t really matter which way you do it, as long as you 1) dogmatically adhere to doing it the same way every time and 2) have the tooling to effectively manage the downsides of your choice.

Hah, I wrote it in a comment myself a while back :) (I'm the author of the article): https://news.ycombinator.com/item?id=24919615

Oh, nice! Seems like we got the same thing out of the article.

Oh yeah! I didn't even notice our comments were on the same article haha

Write posts. Not too plagiarized, and only from yourself.

Phew, I remembered tweeting that comment (https://twitter.com/watware/status/1323610182560161792) and was about to flame this post for lack of attribution. Glad I read on! Good stuff.

I agree with the sentiment that things should be made as simple as possible. The sine function really has no reason to be anything besides a function. I am not sure, though, that writing the simplest thing possible results in mostly pure functions. In my experience programming is mostly about managing state. My programming jobs have generally been about tracking what the state of some other piece of hardware and/or software is. It seems hard to escape state in that case.

In my spare time I have lately been writing a compiler-like thing. That seems to be, among other things, about maintaining a stack of all the stuff that has been defined/declared previously. If so many things are about maintaining state, how practical is this 'mostly functions' thing actually?

I would claim that the benefits of 'mostly functions' strongly depend on the task at hand.

For the field of compilers, I can for example see value in making program analyses pure functions that just compute information about the program and separate them from the program transformations that use this information to (impurely) manipulate code. This makes the analyses more reusable and probably makes reasoning about correctness easier.

For other tasks in the compiler, pure functions can be a pain. My favorite anecdote for this is that of a group of students in a compilers course who insisted on writing the project (a compiler for a subset of C) in Haskell and who, when discussing their implementation in the final code review, cited a recent paper [1] that describes how you can attach type information to an abstract syntax tree (which is an obvious no-brainer in the imperative world).


[1] http://www.jucs.org/jucs_23_1/trees_that_grow/jucs_23_01_004...

An ad hoc solution is also a no-brainer in Haskell. They didn't need to read a paper to solve this issue, they did because they wanted the fanciest solution that is extensible in all dimensions.

I recommend watching this talk:

“The Value of Values” https://www.infoq.com/presentations/Value-Values/

It explains what the difference is between state and value and why most (almost all) programs actually have very little state and can be written mostly stateless. It was a big eye opener for me.

I would say functions are important, but absolutely pale in comparison to modeling[1].

If you are unable to describe, on paper, what your problem domain actually is, then you have no business opening up an IDE and typing out a single line of code.

I will take that further. If, with your domain model, you are unable to craft a query that projects a needed fact from an instance of the model, then you should probably start over. Dimensionality and normalization are a big deal with your model. You have to be really careful about nesting collections of things within other things if you want to allow for higher-order logic to project required views. This is something we struggled with for a really long time. Every rewrite conversation began something like "well, customer and account need each other in equal measure...". And it took us god knows how many iterations to figure out the relationship should not be an explicit property either way.

Put shortly, Modeling is the core of it all. Start with simple top-level collections of things. E.g. instead of modeling Customer.Transactions[], model Customers, Transactions and CustomerTransactions. This fundamental principle of keeping your model dimensions separated by way of relational types can help to un-fuck the most problematic of domains. These will start to look a lot like EF POCOs for managing SQL tables...

At the end of the day, data dominates, and SQL is the king of managing data. If you embrace these realities, you might be encouraged to embed SQL a lot deeper into your applications. I spent a long time writing a conditional expression parser by hand. Feels like a really bad use of time in retrospect, but I did learn some things. Now I am looking at using SQLite to do all of this heavy lifting for me. All I have to do is define a schema matching my domain model, insert data per instance of the model, and run SQL against it to evaluate conditionals and produce complex string output as desired.

[1] https://users.ece.utexas.edu/~adnan/pike.html

What if code is needed to explore the problem domain? There is utility in discovery, especially for analysts/data scientists who tend to write a surprising amount of code.

I think you’re both right. If you frame the original comment as “don’t write final production code without thorough modeling” it works both ways. If you want to counter with “well our non-final code always goes to production anyway!” You have a cultural problem that needs addressing.

To do that, you need to be willing to delete code you worked hard on. Lots of people aren't good at that.

And some think asking your boss if it's okay is a good idea, (spoilers: they'll say no). That is just a way to pass blame for a decision you can't stomach.

It's an incremental process.

You model a certain experimental ("discovery") thing, then you implement it, then you analyze the result, then you change the model, etc.

And sure, in many cases in practice people may not skip the modeling, but they don't properly write it down in any later-usable way, especially during initial experimental discovery phases. Which isn't good, but it's understandable and can be fully OK. Honestly, for boring simple web APIs this is pretty common. It's just important to know when to stop ;=)

This is a fair argument for taking a stab in the dark. That's certainly how we started out.

The thing about modelling is that it often works at a higher abstraction level than the programming language provides, and that a bunch of (often performance-related) things are (preferably) not represented in the model.

This IMHO makes most tools that generate code from models painful to use.

But I still agree that you have no business programming something which you can't somewhat model at a higher abstraction level.

> SQL is the king of managing data

Hm, not so much IMHO. SQL is terribly bad at it in some contexts, because it's inherently made for a 2d table projection which is (more or less) only joined into larger 2d table projections, and data often is in its nature neither 2d nor maps well to 2d representations. And while you can extend SQL to support that, or work around it with e.g. recursive queries, it's not very nice to do at all.

I like this talk: "Domain Modeling Made Functional" by Scott Wlaschin https://www.youtube.com/watch?v=Up7LcbGZFuo He's working in F# but the concepts map.

There's a fascinating book "Data Model Patterns: A Metadata Map" by David C. Hay that's pretty much a Pattern Language or catalog for data models. You can just implement the subset of Hay's patterns that make sense for your application.

Relying primarily on relational modeling reminds me of Out of the Tar Pit[0]. The well known paper suggests a combination of functional programming and relational data modeling.

[0] http://curtclifton.net/papers/MoseleyMarks06a.pdf

I definitely agree with this. Some of the fastest implementations I have ever done were after spending a bit of time modelling the problem solution. Basic data flows, classes (using verb/noun parsing of the requirements doc) and system architecture were all decided before I wrote any code.

The implementation itself just flowed, allowing me to focus on smaller details that can't be modelled (e.g. error handling). By copying the design of classes and function names, I didn't have to backtrack and redo anything, and I didn't have to think about names of things - which were pre-decided and so consistent throughout the codebase - and my code dovetailed nicely with parts that other people implemented.

Writing pure functions decouples processing from data. I find this encourages developing well-defined models naturally, as well as making the model much more agile and malleable.

With enough stack space you don't need to decouple your data from your pure functions at all ;-)

I definitely agree, I've realized that most programming languages store data in a hierarchy (structs within structs) which you then "query" in a very static way with the "." operator. Normalizing data and storing it relationally seems way more flexible for a lot of use cases which is why most databases are relational and not hierarchical.

However, I've tried to figure out how to actually store and query data relationally in a language like C++ and haven't been able to figure out a good way. I don't want to use SQLite because of performance, this needs to be close to real time (like a game or something similar). I just want a way to store data relationally in memory in C++. I'm still trying to figure out the right approach here. C++ is very static which makes it difficult. I've been able to figure it out in Javascript though.
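One hedged sketch of what that can look like (in JavaScript, since that's where the parent got it working; all the record shapes and helper names here are made up for illustration) is to keep flat arrays of records related only by ids, query them with filters/joins, and add an index for the hot path:

```javascript
// Normalized "tables": flat arrays of records, related only by ids,
// instead of nesting transactions inside each customer struct.
const customers = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
const transactions = [
  { id: 10, customerId: 1, amount: 25 },
  { id: 11, customerId: 2, amount: 40 },
  { id: 12, customerId: 1, amount: 5 },
];

// A tiny "query": join transactions to customers by id.
function transactionsFor(customerId) {
  return transactions.filter((t) => t.customerId === customerId);
}

function nameFor(customerId) {
  return customers.find((c) => c.id === customerId).name;
}

// For near-real-time use, a prebuilt index replaces the linear scan.
const byCustomer = new Map();
for (const t of transactions) {
  if (!byCustomer.has(t.customerId)) byCustomer.set(t.customerId, []);
  byCustomer.get(t.customerId).push(t);
}

console.log(transactionsFor(1).length); // 2
console.log(nameFor(2)); // "Grace"
console.log(byCustomer.get(2)[0].amount); // 40
```

The same shape ports to C++ with a `std::vector` of structs per "table" plus a `std::unordered_map` index, though keeping the schema flexible there is harder, which may be exactly the static-ness being run into.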

Could try something like Apache Arrow https://arrow.apache.org/docs/cpp/tables.html

>"If you are unable to describe, on paper, what your problem domain actually is, then you have no business opening up an IDE and typing out a single line of code."

I agree with this part assuming that graphics and video along with the words are allowed.

>"I will take that further. If, with your domain model, you are unable to craft a query that projects a needed fact from an instance of the model, the you should probably start over."

This is a very narrow-minded approach that will only work for a very limited set of possible domains. Simplest example to the contrary: the domain is the creation of an efficient way of solving some math-related problem. What query?

I think you may underestimate the potential scope of a domain model and the capabilities of SQL. It is certainly math. That is actually the incredible part: that it's all just math underneath 20 different joins which express a very complex and meaningful view of the domain facts.

I challenge anyone to present a problem domain which cannot be meaningfully represented in terms of tables in a database. I would prefer if this were bounded by the set of problems you would use any software development strategy upon, but I welcome a more difficult problem as well.

>"I think you may underestimate the potential scope of a domain model and the capabilities of SQL."

I am being practical. There are many languages that are Turing complete but ill-suited for a particular domain. It's been proven that SQL is Turing complete as well. However, should you propose using SQL to write an implementation of, say, an FFT, you are not likely to find much sympathy.

I would not use SQL to implement the actual algorithm, but I would certainly consider using it to hold all of the data around such an operation as required. For instance, tables like Samples, Spectrograms, etc.

Hierarchies are possible with an RDBMS, but I'd suggest it's the wrong tool to model them with, unless they're static of course. Really any graph that can't be encoded in the table relationships themselves.

What are the good learning resources to get started with this? Thanks

I have trouble understanding this kind of talk. What's a problem domain, what's its dimension and normalization, and what's the high order logic all about? Can we use plain words people from our grandfather generation can recognize?

Problem domain = Shopping, Banking, Coffee Shop, Factory, Airplane

Dimension = Customers, Accounts, Users, Widgets, Inventory

Higher-Order Logic = combining basic functional building blocks in order to compose more complex functionality. SQL enables direct, declarative access to the whole space of higher-order functions. E.g. You want the list of widgets made 3 quarters ago but scoped to one factory line, and only when a certain rotation of employees was on the factory floor? You got it. That's like 10-15 lines of SQL.

A higher order function is a function that takes another function as a parameter or returns a function as its result. Famous higher order functions include "map" and "filter", for example. Javascript, for example, uses higher order functions all the time. See [1].

The term "higher order logic" typically means program logic that uses higher order functions. An object-oriented programming style is inherently higher order because objects typically contain functions and are passed to methods.

The term "higher order logic" can also mean a system of logic that allows statements about logical statements. [2]

[1] https://en.wikipedia.org/wiki/Higher-order_function [2] https://en.wikipedia.org/wiki/Higher-order_logic

Thanks. That is clear.

BTW all those terms are like a century old.

I only saw them occasionally, never tried to understand them before.

I recommend using more than sql tables for data modeling

'observe due measure; moderation is best in all things' -- Greek poet Hesiod (c. 700 bc)

The quotes on nutrition remind me of arguments about the health implications of MSG, in that the argument tends to devolve to "is MSG good or bad" rather than "how much MSG is good or bad."

> At the risk of stretching the analogy, maybe the equivalent is "code only those things that people at a junior level would recognize for what they do".

Spot on. And I would add: Keeping in mind that junior level person could be under excessive stress. Perhaps something has failed. Perhaps they are looking at that section of code for the first time.

Context matters. It affects readability, comprehension, and understanding.

Straightforward > Cleverness

> ... a practical tip is to eat only those things that people of his grandmother's generation would have recognized as food.

So, Lisp?

I find it interesting that React ignores this advice about pure functions.

From the react docs

    import React, { useState } from 'react';

    function Example() {
      // Declare a new state variable, which we'll call "count"
      const [count, setCount] = useState(0);

      return (
        <div>
          <p>You clicked {count} times</p>
          <button onClick={() => setCount(count + 1)}>
            Click me
          </button>
        </div>
      );
    }
It's clear "Example" will be called every time it's rendered yet "useState" is NOT "pure". Some magic state is being kept because the first time it's called "count" will be initialized to 0 but after that it won't. Same arguments, different results = not "pure"

The React model is actually superb at managing state. It makes you explicitly acknowledge it directly, and codifies it either with properties or useState. In practice it's a fantastic model, as it does let you write "mostly functions" as the article suggests, but you can't have a UI without state, so it's great that they make it easy to handle.

Yeah, I tend to agree (and this is why I don't like hooks). I'll write pure functional components all day, but if I need to introduce state, I'm going to use a class component and just acknowledge that state as state and not try to pretend it's still just a function.

I use hooks because they cut down on boilerplate, not because I'm trying to pretend a component is just a function.

Correct, `useState` (and `useEffect` and `useContext`) are not pure. They are ways of managing impurities while writing mostly pure functions.
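
As a toy illustration of where that impurity lives — this is emphatically not React's real implementation (React tracks hook state per component instance, in render order), just a sketch of the idea that the "magic state" sits in storage outside the component function:

```javascript
// Hidden storage outside the component, keyed by hook call order.
const hookState = [];
let hookIndex = 0;

function useState(initial) {
  const i = hookIndex++;
  if (hookState[i] === undefined) hookState[i] = initial;
  const setState = (value) => { hookState[i] = value; };
  return [hookState[i], setState];
}

// A "component": same arguments, different results across renders —
// which is exactly why useState is not pure.
function Example() {
  const [count, setCount] = useState(0);
  return { count, increment: () => setCount(count + 1) };
}

// Each "render" resets the hook cursor, as React does per component.
function render() {
  hookIndex = 0;
  return Example();
}

const first = render();  // count is 0
first.increment();
const second = render(); // count is now 1
```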

I know my request might be off-topic ... the original article is from a blog that does not export an RSS/Atom feed. I am looking for recommendations for a service which can let me scrape a feed by guessing the structure of the articles.

I've thought about implementing RSS for it at some point, good to know somebody has interest in that. In the meantime I tried to keep the HTML semantic, so hopefully you're able to find a way to scrape it!

I did exactly that [0]. Check out the live version [1].

[0] https://github.com/damoeb/rss-proxy

[1] https://rssproxy.migor.org/

What's the difference between writing OO code that depends on internal state and writing a pure function that expects an argument that is a data structure of a specific type (and thus has internal data that could be different)? Is the pure function no longer pure if the argument is a complex data structure and the values within it dictate the outcome of the function? Or is it a pure function because, if you pass the same data structure with the same internal values, the function will return the same value?

Purity is a question of mutability, nothing more. If the function mutates its arguments (or its closure, or its global environment), it is impure. Any useful program will of course need to do these things at some point, but there's a lot of logic that just goes from A -> B (or A, B, C -> D, or whatever), that doesn't need to concern itself with these things, and should be insulated from them. There's nothing inherently impure about taking large data structures as arguments, though it does make it trickier to enforce immutability in most languages (compared to primitive values).
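
A minimal JavaScript illustration of that distinction (hypothetical functions, invented for this example):

```javascript
// Impure: mutates its argument, so callers' data changes underneath them.
function addTagInPlace(post, tag) {
  post.tags.push(tag);
  return post;
}

// Pure: same inputs always give the same output, and nothing is mutated.
function withTag(post, tag) {
  return { ...post, tags: [...post.tags, tag] };
}

const original = { title: "Hello", tags: ["intro"] };
const updated = withTag(original, "draft");

console.log(original.tags.length); // still 1: the pure version left it alone
console.log(updated.tags.length);  // 2
```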

It's worth noting that it's entirely possible to write "pure methods". Unfortunately most languages don't really let you a) have mutable structures, and b) write enforced-immutable methods on them. Rust is the only one I know of: a method can take a &self instead of a &mut self, which prevents it from mutating self (recursively, which requires knowledge about ownership unless your language is 100% immutable like Haskell or Clojure, which is why this feature is so rare). What I tend to do in other languages like JavaScript, C#, or Python is to use property-getters as a convention that strongly suggests purity; unfortunately that's about the best you can do.

In multi-paradigm languages the decision on whether to make something a pure "getter" method or a standalone function is mostly one of aesthetics. Standalone functions give you a bit more flexibility in use, but sometimes the foo.prop syntax is more readable.
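
The getter-as-purity-convention mentioned above might look like this in JavaScript (illustrative names only — nothing enforces the purity, it's just a signal to readers):

```javascript
class Order {
  constructor(lines) {
    this.lines = lines;
  }
  // Property getter used as a purity convention:
  // reads state, computes a value, mutates nothing.
  get total() {
    return this.lines.reduce((sum, l) => sum + l.qty * l.price, 0);
  }
}

const order = new Order([
  { qty: 2, price: 3 },
  { qty: 1, price: 4 },
]);
console.log(order.total); // 10
```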

Interior mutability allows you to mutate behind &self.

Yeah true, though I think of that as a bit of a trap-door along the lines of unsafe { }. There's still a reasonable guarantee under "normal" circumstances.

Passing a big structure to a function doesn't make that function impure. But passing a structure containing objects with impure methods does make the called function impure.

Passing a big structure to a function is a bad idea in any paradigm because it's a big dependency. Instead, you should write the function to operate on the data that it needs and pass just that when you call it.

OOP is merely a way to arrange your code. There's no magic sauce, it's just topological distortion without deep semantic significance. In other words, there isn't really a difference. Maybe the code is easier to understand in the OOP style, maybe not, but that's in one's head.

It's the difference between c++ and c.

He's not suggesting pure functional programming - and beyond a certain point the simplicity associated with functional programming will be completely negated by the complexity of the arguments.

From my understanding of what he said, I doubt that he'd advocate for pure functional in circumstances of significant necessary complexity.

So, I have a question. I have a kind of serial number, and different systems expect slightly different formats. One system likes dashes, one doesn't, one system likes an extra two digits, while another system likes an extra four numbers. Should I write around n(n-1)/2 pure function converters between n systems? Or one class with n methods? (If you're curious, I'm talking about oil well API numbers. It's not rocket science, I'm just curious what y'all think.)

If you REALLY have to use all these different formats, I'd have one canonical one used everywhere in your app, and 2n methods to convert to different ones at api boundaries.

I'd have one canonical one used everywhere, and 2n pure functions to convert at API boundaries. By making them pure functions you guarantee (or at least suggest by convention; depending on your language's type system) that those conversions will have no side-effects. As methods, you can never be sure as the caller whether there will be ramifications to calling them in new places.

Another advantage of that approach is that if you make sure you always persist the serial number in the canonical format, and immediately parse the incoming value into the canonical form, then comparing 2 serial numbers is easy and consistent - including e.g. when constructing db queries.

I encountered a similar problem where an identity number could be represented in different formats, and the solution had been to store the string representation in all its glorious permutations. Doing db queries to find anything by that key was then impossible until the representation had been made consistent.

Then when e.g. creating reports for different systems we could format the output as per that systems expectation.
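
A sketch of the canonical-format-plus-pure-converters approach in JavaScript — the format rules here are invented for illustration, not real API well-number rules:

```javascript
// Canonical form: digits only (hypothetical; real rules vary).
function parseSerial(input) {
  return input.replace(/-/g, "");
}

// Pure converters out of the canonical form, one per external system.
function toDashed(serial) {
  // e.g. 42-123-45678 — illustrative grouping only
  return `${serial.slice(0, 2)}-${serial.slice(2, 5)}-${serial.slice(5)}`;
}

function toPadded(serial) {
  // some systems want two extra trailing digits
  return serial + "00";
}

// Parse at the boundary, store/compare canonically, format on the way out.
const canonical = parseSerial("42-123-45678");
console.log(canonical);            // "4212345678"
console.log(toDashed(canonical));  // "42-123-45678"
console.log(toPadded(canonical));  // "421234567800"
```

Because the converters are pure, comparing two serials is just comparing their canonical strings, including in db queries.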

Ah! Thank you, that is a very good middle road.

This sounds like a graph problem similar to the google unit conversion problem (https://alexgolec.dev/ratio-finder/). You could probably do a data driven solution, but the gist is convert to and from one normalized form. Then it's n*2 functions.

This is probably what you're implicitly doing with the class solution you're talking about.

Why n(n-1) pure functions? It makes no sense that you would only need n methods in a class but suddenly n(n-1) pure functions. You will always need n(n-1) functions if you want to specify ALL the conversion logic. Think about it. If you're not writing it in the method, then that logic must be written down somewhere.

I think I get where you're coming from though. The class hides a type representation that is used as the "internal" representation of the serial number, so you only need to write 2n functions: n functions to convert to the special serial types, and n functions' worth of logic to convert the special serial types to the internal serial type. The latter n functions are placed in the constructor, not as functions but as a series of procedures that deduce the type of the parameter and do the conversion internally. (The effect of this is identical to overloading the constructor.)

Let's say A represents that internal serial type, with all other letters in the alphabet representing the serial types of your oil well. You're essentially doing the same as writing:

  BtoA :: B -> A
  AtoB :: A -> B
  CtoA :: C -> A
  AtoC :: A -> C
If the goal was to convert B to C, with classes you do this:

  (new Serial(b)).toC()

with functions you do this:

  AtoC(BtoA(b))

Your BtoA logic is simply hidden in the constructor of Serial. But basically the exact same amount of written logic is occurring here.

I think your question was a trap. One class with n methods is obviously better than n(n-1)/2. I think you were just unable to see that it's basically all functions and expressions in the end. When you use classes you are simply tying these functions to structure and internal variables, making them less modular, but the amount of logic is exactly the same.

But overall, if you want to know which methodology is logically better and more resistant to technical debt then I will tell you.

The functional approach is better.

Because the functional approach modularized BtoA. BtoA can be reused in other contexts in the functional approach but in the Object Oriented approach the logic of BtoA is tied together with CtoA, DtoA, EtoA and all of that in the constructor. Likely if you needed that logic as a one off... say to print the serials in internal receipts... you would likely be copying and pasting that logic from the constructor and duplicating it in another class when you follow the object oriented approach.

This is the main reason why the author of the post promotes pure functions. Greater modularity and greater resistance to technical debt.

How big is n and how often does it change?

Major deja-vu moment for me. I've been carrying that headline around (as a quote) for a few months at least as an expression of a sentiment I've had for years, and thought I've seen it in a few places (and on HN), yet Google only shows me this blog as a source (which obviously isn't possible and almost feels like gaslighting). Does someone have an older reference for this?

I (the author) and another person independently thought of it in the comments of an article a couple months ago: https://news.ycombinator.com/item?id=25501263

The quote is based on one from Michael Pollan’s book In Defense of Food, around 2007. The original is "Eat food, not too much, mostly plants." It’s been reworded into at least a few things since then.

Let's go full FP and demand immutable data too.

Making a concrete argument like that is inconsistent with the passive principle-free bromide of the post, so you’ll have to find demands like that elsewhere.

I think there's going to be many different Venn diagrams of combining OOP and FP. Someone should identify and name them. Immutable data is hard in languages that don't support it unless you're happy with a 80/20 solution.

My beef with mutable data is that it's often very hard to see when it's supposed to be modified and by what.

I don't get upset by a local var i in a while loop, although I often don't see the point.

The problem is the dark mutable data. All those objects being passed around, are they just being read?, are they being modified by this method call?

So if mutable data is used responsibly, it's not a problem. But there's no guarantee that it's used responsibly. Unless all data is immutable. Then it's guaranteed
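
In JavaScript, for instance, `Object.freeze` gives a (shallow) version of that guarantee — a sketch, with made-up names:

```javascript
// One way to get (shallow) immutability guarantees in JavaScript.
const config = Object.freeze({ retries: 3, timeoutMs: 500 });

function riskyCaller(cfg) {
  // A buggy callee tries to mutate "dark" shared data...
  try {
    cfg.retries = 99; // throws in strict mode, silently ignored otherwise
  } catch (e) {
    // mutation rejected either way
  }
  return cfg.retries;
}

console.log(riskyCaller(config));     // 3: the caller's data is safe
console.log(Object.isFrozen(config)); // true

// Caveat: freezing is shallow; nested objects stay mutable unless
// frozen recursively.
const nested = Object.freeze({ inner: { n: 1 } });
nested.inner.n = 2; // succeeds!
console.log(nested.inner.n); // 2
```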

You can rely on a type system that help you to use mutable data responsibly, à la Rust.

I'm not familiar with the type system of Rust but it seems like a really nice language.

Ofc a type system that helps you deal with mutable data responsibly goes a looong way to alleviate the problem but it's not a guarantee. And often times such a type system comes with weird and unexpected quirks. Here are some examples from C#:

For example "readonly" is not immutable, it's a compile time guarantee that a value can only be assigned in the constructor of a type and to that instance of the type only.

This means that for a type Foo with readonly int bar, you can have a constructor Foo(...) : this(...) and the value of bar is mutable in the context of those constructors. For most intents and purposes, however, the readonly field acts immutable enough to give a reasonable degree of immutability.

Then you have something like private setters. They go a long way toward guaranteeing encapsulation of state. But there's no compile-time guarantee that an instance of Bar with int baz { get; private set; } won't mutate the baz value of any other Bar instance. In fact, it's a common misconception that private modifiers make something private to the instance, whereas it's only a compile-time guarantee that it's not visible to any other type.

Furthermore, private modifiers don't actually prevent anything from actually utilizing it in runtime. You can simply use reflection or other techniques and do what you will.

Then you can do other weaker forms of type-checking "immutability", for example only exposing getter functions in a IFooReader interface.

These things all help alleviate "dark mutability" to different degrees but the underlying values are still mutable. I guess the point I'm trying to make is, yes, it massively helps to have a type system that guarantees encapsulation and immutability to different degrees. The caveat is that we are still at the mercy of that type system and the way it enforces immutability is often non-obvious and less immutable than one might expect.

And functions should be no bigger than your head.

This is my least favorite programming advice. Splitting functions for no reason other than "it's too long" is a bad practice. https://news.ycombinator.com/item?id=25263488

Most of the illegible code I've ever written was because I kept splitting functions up, thinking I was following good practice, but was really just making emotional/aesthetic decisions.

When you go back to edit your code, it's like calling a 1-800 number and getting re-routed to 15 different departments to finally find the person you need to talk to.

I really agree with this.

I see this all the time with "business rules" problems.

If you have a situation where you can't make levelled abstractions, you've got some thorny interconnections in your logic (and those may be fundamental!!!). The way I handle this is with a "gauntlet pattern." You still can split your logic up into parts, but you do it by "rejecting" certain logical chunks at a time within a function, with a comment above each stage of the gauntlet.

It looks something like this:

  // marketing told me we should never do this ever under any circumstances
  if (!a) {
    return CASE_1;
  }

  // if the user enrolled like this, we should check if there's this bad property about that user
  if (!b) {
    return CASE_2;
  }

  // oh, you crazy people in finance
  if (c > 4 && d < 0) {
    return CASE_3;
  }

  return CASE_4;

The key thing is not to get hung up on duplication or having exact control flow like how the business thinks about it. You want 1) to return the largest percentage or simplest cases out first, 2) keep everything flat as possible without layers of branching, and 3) be able to associate tests to each one of those early-reject lines.

The nice thing about this is the reduction of cognitive load. By the time you get to the 3rd case or whatever, your mind can already know certain things are true. It makes reasoning about rules much easier in my experience.

I think the trick is to split it the right way. If you do it the wrong way, you end up with functions becoming layer upon layer of indirection that feels like a rabbit hole you need to dive further and further into to understand what is actually going on. But if instead you keep the core control flow in the original function and move sensibly named chunks of it into helper functions, the original function ends up reading like pseudo-code, and you don't really need to actually look at any of the helper functions to understand what is going on.

The "right" way to split, as advocated in the article I linked, is not based on naming but based on state. Have all your split out functions be pure. Retain all the state mutation in one place where you can keep an eye on it.

Use an editor with folding and this is just unnecessary indirection and jumping around to read a linear series of steps that happen one after another. If a function is only called in one place, it should rarely be a function. It may help a small amount when viewing a stack trace, so you can tell at a glance without looking up the line and jumping to the section, if that is a big enough advantage, can make it a lambda and leave it all in place as long as the debugger/stack trace will list the lambda assignment name.

This is the first time I've heard "functions should be no bigger than your head" and I actually really like it. Rather than prescribing some arbitrary function length, it highlights that functions are meant to be understood and the appropriate length should come from that.

I can't really agree with this. Though without question there are always things you should not split out.

Splitting a function which consists of multiple logical steps is usually a good idea, because it makes testing much simpler and tends to show you where you accidentally had subtle cross-cutting concerns or unintended cross-talk between sub-domains (domains in the modelling sense, not HTTP).

What is important is to properly name functions (and if you can't you probably shouldn't write that function).

Also it's important to learn how to live with abstraction, i.e. to reason about code wrt. a specific problem without needing to jump into the implementation of every function/method it calls.

Though the latter point is much easier with a reasonably useful type system. By this I mean useful for abstraction, without making abstraction too hard and without allowing too many unexpected things. Languages like Rust or Scala have such a type system (as long as you don't abuse it), but e.g. C++ fails at this even though it has a powerful type system.

At least this is my opinion.

And without question, if you can't cleanly split something out, then don't split it out. If you can split it out but can't name it reasonably, either your understanding is lacking or it should never have been split out.

That advice should be understood in the same way as "sentences should be no longer than a few dozen words" or "paragraphs should be no longer than a few lines". Of course adding a line break at an arbitrary point in a long paragraph doesn't make it better. But a solid wall of text is a red flag.

I thought I hated small functions too, til I realized I just hated scrolling. The moment it's off my screen, it's out of my head.

Assuming you just need helper functions for f(), compare

  int f(int x) { return g(h(x)); }
  int g(int x) { return ...; }
  int h(int x) { return ...; }

  f = g . h
    where g = ...
          h = ...
Turns out my language choice was the problem, not over-abstraction. If you can fit it all on one screen, then have at it tbh.

The logical conclusion of this line of thinking is APL or K. Some people swear by it. I admit terseness is appealing for solo coding, and I loathe the bloat of Java etc, but when taken to the extreme terseness starts being a problem for collaboration.

I think you're right about those langs and collaboration; all the procedures are right there in front of you, but you gotta keep track of the data shape in your head. Just as much work as scrolling page-spanning functions imo.

That's what makes point-free Haskell such a sweet spot for me: terse, symbolic control flow and nice combinators, but types to guide you. I try to use it as a better J.

I've developed a de facto rule of thumb / intuition that a function should basically be as big as it can be without being inconvenient to test. Which means it's pretty small most of the time.

What if my cranium is extra large? Do I get to write bigger functions?

Python's developers must have enormous heads if that's the case: https://github.com/python/cpython/blob/b8fde8b5418b75d2935d0...

God I love reading code like that. Makes me feel better about the complex mess I produce thinking if it could be done in a clean way.

It's mostly a giant switch case for opcodes though, that's not really what is meant here.

They could have used inline functions or macros if they had wanted to keep the function length down for the table.

Don't write functions bigger than your co-workers' heads, either.

Well this explains why the most difficult people to work with always write the most disgustingly large god functions

Because they have big heads?
