The Software Engineering Rule of 3 (erikbern.com)
95 points by tim_sw on Aug 30, 2017 | 45 comments



Since no one has stated it, this looks like a rip-off of https://blog.codinghorror.com/rule-of-three/ (which comes from https://www.amazon.com/exec/obidos/ASIN/0321117425 ).

Of course there are many other places, such as the mentioned C2: http://wiki.c2.com/?RuleOfThree . My point is just that Erik authoritatively says he is postulating it, but it's already all around the internet.


Author here. It's slightly embarrassing that this turns out to be an old idea – it wasn't my intent to rip it off. I'm fairly sure I must have seen it a long time ago and then forgot about its origin. In retrospect I probably should have googled it.


I was just thinking about this last week because of that old CodingHorror post so I was quite surprised to see it here as a new idea. I was actually just expecting a link or two at the bottom of the article.

Also, I have misused rip-off here; I didn't mean to say you defrauded/cheated/stole anything, just that it's an old idea already going around. I apologize for incorrectly using this verb and accusing you of ripping off this idea.


The article now mentions this at the bottom. I think if he were intentionally trying to rip it off he wouldn't have called it "rule of three" because that makes it too obvious. I think he just forgot having seen it.

It's kind of fortunate that it worked out this way; I like seeing people comment on a well-established idea as if it's newly postulated.


Yes, I misused rip-off and now after a quick search I found that it has a much more negative meaning than I intended.


Perhaps instead of "rip-off of" you meant "riff off of". :-)


I think the real solution here is just in learning more about your problem domain and teaching yourself to think ahead better. To borrow the example from the article, if you know that you're working with a small domain and auth will only differ in the parameter names and URLs, then it's ok to generalize like in the first example.

If you want to be able to build something that works with any bank ever, then you need to think from a higher level: you have a login operation that creates a session, and then a fetch-statement operation (that takes the session as a parameter), and so on, and those are your general building blocks that make up your interface.

No, you may not get it perfectly correct, and you may need to rethink and refactor at some point if you come across something different from what you could imagine. But that's not really a big deal, and you shouldn't be so afraid of refactoring that your go-to strategy is to always copy and paste similar code everywhere.

The awareness of this sort of thing perhaps isn't inherent; it comes with experience, but I don't think the take-away is to shy away from general functions. I've found that it makes things easier in the long run to think in terms of high-level operations and interfaces that you implement. Clean APIs aren't just for customers; you should architect your own "internal" code in the same way as if you intended to make it a public interface that random people could use. There's certainly a pitfall in trying to make things too generic, but after a while you develop an intuition around finding where the best balance lies.
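A rough sketch of the login / fetch-statement building blocks described above, assuming Python. Every name here (`Session`, `BankClient`, `FakeBank`, `download_statement`) is made up for illustration and none of it comes from the article; the fake bank exists only so the sketch runs without a network.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Session:
    """Hypothetical stand-in for whatever an authenticated session holds."""
    bank: str
    token: str


class BankClient(Protocol):
    """The two high-level operations: log in, then fetch a statement."""

    def login(self, username: str, password: str) -> Session: ...

    def fetch_statement(self, session: Session) -> bytes: ...


class FakeBank:
    """In-memory implementation (an assumption) so the sketch needs no network."""

    def __init__(self, name: str) -> None:
        self.name = name

    def login(self, username: str, password: str) -> Session:
        # A real implementation would submit the credentials over HTTPS here.
        return Session(bank=self.name, token=f"{username}-token")

    def fetch_statement(self, session: Session) -> bytes:
        return f"statement for {session.bank} ({session.token})".encode()


def download_statement(client: BankClient, username: str, password: str) -> bytes:
    """Caller code depends only on the interface, not on any one bank."""
    session = client.login(username, password)
    return client.fetch_statement(session)
```

How each bank implements `login` (form, token, something else) stays behind the interface, so a bank that doesn't fit the form-login pattern only needs its own implementation, not a change to callers.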


More specific: being too generic can be solved by downcasting in consumers without affecting the module boundary; being too specific cannot be solved without changing inter-module interfaces.

Having a generic callback interface exec(Map(object,object)) is ugly, but better than changing 76 upstream and downstream modules' interfaces with code ownership and release cycles across different vendors/organizations.
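To make the downcasting point concrete, here is a minimal Python sketch of a deliberately loose boundary, roughly analogous to exec(Map(object,object)). The names `Handler`, `dispatch` and `statement_total` and the payload keys are assumptions for illustration only.

```python
from typing import Any, Callable, Mapping

# Deliberately generic boundary type: any handler takes an untyped mapping.
Handler = Callable[[Mapping[str, Any]], Any]


def dispatch(handler: Handler, payload: Mapping[str, Any]) -> Any:
    """The module boundary: this signature stays fixed as payloads evolve."""
    return handler(payload)


def statement_total(payload: Mapping[str, Any]) -> float:
    """A consumer narrows ("downcasts") the generic payload to what it needs."""
    amounts = payload.get("amounts")
    if not isinstance(amounts, list):
        raise TypeError("expected 'amounts' to be a list of numbers")
    return float(sum(amounts))
```

New fields can be added to the payload without touching `dispatch` or any other consumer. The cost is that type errors surface at runtime inside the consumer rather than at the boundary.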


FYI, you can remove the "amp" from the path:

https://erikbern.com/2017/08/29/the-software-engineering-rul...

to get a site that's a little easier to read in a desktop browser.


Ah, interesting. I actually rather liked the AMP version on my desktop browser, with the full width and the giant pictures. Found it strangely compelling and I assumed it was a neat design touch.


I had the exact opposite reaction. One of the things I miss from Firefox when I use Chrome is the reader feature. It strips away styles and formats all the text in a thin single column so I can actually read it. It's most useful for me in dealing with sites that have interesting content and horrible styles (looking at you, academic folks)


Even a rule of 2 would be helpful in a lot of code I've worked on.

It seems like nobody wants to build a function to do the thing that's needed; they have to build an abstraction or a framework.


While I kind of agree with the rule of 3, in the example given, I would just rename BaseScraper to ScraperWithFormLogin when encountering the 3rd instance and not derive the 3rd from anything (or create an abstract BaseScraper) - there's still a high likelihood there will be another scraper with form login.


Why not just pull it out into a function, "loginWithForm(session, 'user', 'password')"? Why is it interesting on a class level that the scraper is using a form based authentication method? What if a page is switching between two authentication methods just for laughs?


I think you've hit it on the head. Mostly you shouldn't be trying to codify what a scraper is, you should be collecting little utilities that are useful for building scrapers. The more lego-like the better.
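A small sketch of what such a utility could look like, assuming Python. `login_with_form`, `HttpSession` and `RecordingSession` are hypothetical names; the only thing asked of the session is a `post(url, data=...)` method, which a `requests.Session` happens to provide, and the recording session is there so the sketch runs offline.

```python
from typing import Any, List, Mapping, Protocol, Tuple


class HttpSession(Protocol):
    """Anything with a post(url, data=...) method, e.g. a requests.Session."""

    def post(self, url: str, data: Mapping[str, str]) -> Any: ...


def login_with_form(session: HttpSession, url: str, username: str, password: str,
                    user_field: str = "username",
                    pass_field: str = "password") -> Any:
    """Submit a login form; field names vary per site, so they are parameters."""
    return session.post(url, data={user_field: username, pass_field: password})


class RecordingSession:
    """Offline stand-in (an assumption) that records calls instead of sending them."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Mapping[str, str]]] = []

    def post(self, url: str, data: Mapping[str, str]) -> str:
        self.calls.append((url, data))
        return "ok"
```

A scraper that needs form login calls the function; one that doesn't simply never imports it, and a site that switches between two auth methods can call whichever utility applies. No class hierarchy has to encode any of that.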


The C2 wiki has a Rule of Three page: http://wiki.c2.com/?RuleOfThree


> The problem is we’re overfitting massively to a pattern here

Seems like this will be true regardless of the number of units we're breaking down at. There can always be a new outlier later, and trying to predict future use cases is usually a losing battle. It's not like you can't refactor yet again later.


So this is the second system syndrome x2? ;-)

https://en.wikipedia.org/wiki/Second-system_effect


This just seems like a combination of choosing the wrong abstraction and applying the DRY principle in a way that doesn't actually lead to less or clearer code.

In this case almost none of the lines contain any superfluous information. At best you could try to simplify things slightly by writing it as follows:

    import requests

    class ChaseScraper:
        def __init__(self, username, password):
            # Chase's login form expects these field names.
            self.credentials = {'username': username,
                                'password': password}

        def scrape(self):
            session = requests.Session()
            # Log in first; the session carries the cookies to the next request.
            session.get('https://chase.com/rest/login.aspx', data=self.credentials)
            session.get('https://chase.com/rest/download_current_statement.aspx')

    class CitibankScraper:
        def __init__(self, username, password):
            # Citibank uses different field names for the same two values.
            self.credentials = {'user': username,
                                'pass': password}

        def scrape(self):
            session = requests.Session()
            session.get('https://citibank.com/cgi-bin/login.pl', data=self.credentials)
            session.get('https://citibank.com/cgi-bin/download-stmt.pl')


Same goes double/triple/multiplier-of-your-choice in the UI layer. While back end code usually has process and rules behind it, so many things in the UI are:

- actually identical, but by coincidence

- perceptually identical, but not

- perceptually identical, but technically unrelated

- arbitrarily different in special cases


We have this issue with our code quality tool telling me that I have identical html in some places. It's not at all helpful, but I want my jsx files included in the figures for quality.



Agree with respect to code organization, but I can't help but feel that the number three is somewhat arbitrary and that it nudges the reader in the wrong direction.

Instead of waiting until the third, I evaluate by asking myself if the duplication is accidental, how likely is it for the next instance to be different, and what will any extra arguments look like. If it doesn't feel right I leave the repetition.


The problem with this approach is that if there are 10 such cases, each of which could be reduced to a one-liner, the dev will instead let those 10 cases grow to 20 while waiting for a 3rd instance of any of them. Instead of being afraid of the future, do some abstraction. If a 3rd case turns out to be massively different, then depending on the timeline either modify the abstraction or create new functionality.


I only really agree with #3.

#1 – It should be swift and easy to extract code, if not that's a different problem.

#2 – You need to be patient and research ahead. You don't really have an option if you're in a big project. You get used to it.


Being swift and easy makes extracting code a low-cost operation, but the point made in the article is that just because you can do it quickly doesn't mean you should.


Fair enough. I just never allow for duplicate code. I don't wait for duplicate code to be present in three classes; I attack the problem the first time.


I've used a modified version of this. I run a process by hand. I run it a second time by hand. The third time I automate it. By then, I've seen enough to know what I'm actually trying to do.


My 3 rules of software engineering are: 1 - KISS (keep it simple, stupid) 2 - DRY (don't repeat yourself, except to avoid violating rule 1) 3 - there are only 2 rules


@erikbern nice post. Just a quick typo I noticed: in your example code for the CitibankScraper, the class name is still ChaseScraper.


And here I was hoping it would be a reiteration of "Fast, cheap, and good: pick any two".


Premature optimization is the root of all evil in programming. - Donald Knuth 1974


So what's the third, ideal pattern to use here?


A better rule is to wait until you have all the requirements.


You'll never have all requirements


Well, I wait until I have 90 percent, and the remaining 90 percent will usually manifest after delivering the release candidate.


Exactly, and as soon as you do, some external factor will cause them to change.


If there ever existed a project where all the requirements were defined exhaustively and accurately up front, then I haven't heard of it.


It happens in other disciplines all the time. E.g. architecture, automotive industry, ...

The problem is really: we're allowing the requirements to change just because it seems possible (in software). But it leads to crappy code. Managers should understand this.


Not allowing requirements to change like in other engineering disciplines leads to products that nobody wants or components that don't quite do what you need so that you have to implement various workarounds.

Software is so tremendously successful because it's much easier to adapt it to changing requirements.


you say

>seems possible

but in software a better statement is that it is possible. Once you've built a bridge you can't just raise all its base pillars by half a meter, but leave everything else the same. I mean there is literally no physical way to 'change one line of code'. Someone would have to go out there and start tearing it apart or doing something physically.

In software you can certainly do that: that doesn't mean it's a good idea. But if you have good test coverage and a good architecture it's even possible for nothing to break.

So it's not that it seems that way - it is that way. There are people running popular, live applications who SSH into production and change a line of code that changes live behavior. Every day this happens.

So it's not just that it seems possible. It is possible. Software is just different from architecture, or the automotive industry in that regard.


You actually get requirements?!? Luxury! /s


Pretty much everybody uses agile methodologies to handle evolving requirements. Without adapting to changes in requirements projects fail.


This was intended as humor, right?


Funny, until this very moment, this post had exactly 3 comments.



