No, functions should perform one task and perform it well. Not every conditional test inside the function needs to be its own function and have its own unit test. Sometimes a function that does just one thing and one thing well has more than one step to do that thing, and those steps don't necessarily belong in their own functions. Testability and composability are important, but that has to be balanced against locality of reference and context. When I see a colleague write in a code review that pieces of a function should be factored out of a larger function "just in case they'll be reused" I step on it hard. This is related to "over generalization" but not exactly the same.
// func buildPath(pathName): // builds a path between start and end control nodes fetched from the scene graph
// func updateFrame(timeElapsed): // gets the current location and moves to the next location using timeElapsed
Now, both of these functions are not too big, but not too small either. However, a reviewer said I should split them, and I refused, because they each do one thing and do it succinctly.
This useless, dumb fascination with test practices, and the rigidity around them, causes grief for both the code and the coder!!!
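To make the discussion concrete, here is a rough sketch of what the two functions described above might look like. Everything here is invented for illustration (the scene lookup, the node naming, the movement math); the point is only that each function does one thing, in several steps, without those steps being split out.

```python
# Hypothetical sketch of the two functions from the comment above.
# The scene is modeled as a plain dict of control nodes; all names are assumptions.

def build_path(path_name, scene):
    """Builds a path between start and end control nodes fetched from the scene graph."""
    start = scene[path_name + "_start"]   # step 1: fetch the start control node
    end = scene[path_name + "_end"]       # step 2: fetch the end control node
    steps = 10
    # step 3: interpolate intermediate points -- several steps, still one task
    return [
        (start[0] + (end[0] - start[0]) * i / steps,
         start[1] + (end[1] - start[1]) * i / steps)
        for i in range(steps + 1)
    ]

def update_frame(position, target, speed, time_elapsed):
    """Gets the current location and moves toward the next location using time_elapsed."""
    dx = target[0] - position[0]
    dy = target[1] - position[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist == 0:
        return position
    step = min(dist, speed * time_elapsed)  # don't overshoot the target
    return (position[0] + dx / dist * step, position[1] + dy / dist * step)
```

Neither function is tiny, but splitting the interpolation or the overshoot clamp into named helpers would add indirection without adding understanding.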
> that has to be balanced against locality of reference and context.
I'm also against splitting stuff like crazy, even though I tend to write small functions. For me, the primary concern is "how easy is this function to understand?" Sometimes splitting out code into separate functions actually hinders readability - especially if it makes the reader jump around, or uses state passed implicitly between many small functions.
I find good functions to be structured like good explanations - the function body should give you a general and complete understanding of what it does, and low-level details get outsourced to helper functions that you can review recursively if needed.
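The "function as explanation" structure might look something like this sketch (all names invented): the top-level body reads as a complete summary, and each low-level detail lives in a helper you can drill into only if you need to.

```python
# The top-level function reads like an explanation of the whole task;
# helpers hold the details. All names here are invented for illustration.

def process_order(order):
    validate(order)                     # reject malformed orders early
    total = compute_total(order)        # then figure out what it costs
    return format_receipt(order, total) # and describe the result

def validate(order):
    if not order.get("items"):
        raise ValueError("order has no items")

def compute_total(order):
    return sum(price * qty for price, qty in order["items"])

def format_receipt(order, total):
    return f"Order {order['id']}: {len(order['items'])} items, total {total}"
```

Reading `process_order` alone tells you everything it does; the helpers are there for recursive review, not for reuse.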
I think the former is of limited use if any, while the latter is an invaluable tool to managing complexity of a code base. That people may assume the former when the latter is meant is unfortunate.
My email is in my profile. If you have something specific in mind that you'd like to contribute, great - reach out with it. Otherwise just ping me, and we can talk about potential contributions.
If you look at it this way, it is easy to see when a function needs to be sub-divided. When an idea relies on another idea, it should be broken out. Otherwise, if the lines are part of the main "idea", they should remain.
When grading homework for the course I TA for at my college it is evident that this idea is not understood. I have to admit, I don't actually understand it. It's just something I've taken to doing. I can't describe what I mean really and I don't think I've ever seen a good explanation.
I think it's obvious that you practice the same way I do, but I still see a difficulty in articulating these thoughts.
It goes without saying that, basically, we know that some things need to be moved out and some things don't, but it's hard to tell, or to describe to someone, when each applies.
Does anyone know of some place that describes this well?
I know it may not be the popular belief, but I would much rather work with a larger, well-written 50-line function that does what it is supposed to do than have to navigate around a bunch of smaller 5-line functions that are never reused anywhere else.
The problem is that method decomposition is not always done well, and when there is unexpected behavior or implicit state changes, it can make bad code even harder to navigate. I'd also rather read a well-written 50 line function than a poorly decomposed cluster of 10-line functions. With proper care, though, method decomposition is super helpful.
Since libraries have different use cases and characteristics than normal app code they can and should be treated differently.
A 10% cognitive cost for code that is used by thousands of developers is a very high price to pay for an initial 3x speed boost to one developer.
A 10% cognitive cost for code that needs to be understood 3-10x before it effectively expires (like most app code) is a great tradeoff for a 3x initial speed boost.
I'll get my app into my customers' hands in 3 months, wear a little "tech debt" on the dev side, and get crucial feedback. You can take 9 months to deliver a functionally equivalent app with slightly better internals.
5.1. Sandwich Layers
Let's take a concise, closely bound action and split it into 10 or 20 sandwiched layers, where none of the individual layers makes any sense without the whole. Because we want to apply the concept of “Testable code”, or the “Single Responsibility Principle”, or something.
But you have expressed this way better than I did, thanks :)
Tests (unit, integration, system, acceptance or any other kind) should do two things:
- while developing, help check that the code is doing what it should (e.g. fixing the bug)
- once developed, ensure that the code keeps doing it.
(ideally also ensure that it doesn't do anything not supposed to do, but that's much more difficult)
That's why the best tests should aim for weak spots and try to give good assurance that, if there are defects, the test has a good chance of failing.
Those weak spots sometimes will be in single functions, but other times they won't.
Creating tests just for the sake of having them, without improving the chance that they fail when a problem arises, is wasted time. Or worse: coupling the tests to irrelevant implementation details (for example, that a function is divided into two or three pieces just for readability, because it is big) is just pointless.
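The distinction drawn above can be shown with a tiny invented example: one test checks observable behavior and catches real defects, while the other is coupled to an internal split that exists only for readability, and breaks on harmless refactors.

```python
# Invented illustration: a behavior-level test vs. one coupled to an
# implementation detail (the readability-only helper split).

def normalize(name):
    return _strip(_lower(name))

def _lower(s):   # internal helper, extracted purely for readability
    return s.lower()

def _strip(s):   # another readability-only helper
    return s.strip()

def test_behavior():
    # survives any refactor that keeps the observable behavior
    assert normalize("  Alice ") == "alice"

def test_implementation_detail():
    # brittle: fails if _lower is inlined, even though nothing is broken
    assert _lower("Alice") == "alice"
```

The first test guards the weak spot (the contract of `normalize`); the second only guards the current shape of the code.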
Many of the challenges cited in this article become much simpler to navigate when thinking in FP terms (rather than OOP).
FWIW I don't prefer small functions for reusability but for reasonability. It's a lot easier for me to discern what that block of 10-15 lines of code is doing if it has a descriptive name and a type signature.
There's no one right way to do it, really. My objection is mainly toward a zealous adherence to some subjective principle here.
Most of the arguments I've had over the years on this subject were cases where we were both right. To them X was more reasonable and to me Y was, and the two were mutually exclusive.
As I write this, it makes me wonder if this whole topic is a fool's errand. Are we doomed to forever be trying to solve an unsolvable problem?
These function definitions are also typically defined in call order in the same file.
If you need to know what the individual functions do, you can scroll down the few necessary lines, but IMO as long as you name your functions honestly, that generally isn't required to understand, at a glance, both what the code is doing and where you're likely to need to make your change/fix.
This style does require a lot of trust that everyone is following the same strategy - as soon as you have to doubt the accuracy of a function's name, the benefit is lost. So it becomes important to keep them up to date - changing what a function does must also mean changing its name.
My IDE allows me to leave the "documentation" panel open attached to a side of the window. As the cursor/caret touches an identifier, the documentation for that function/method or constant/variable appears automagically. But of course, I type reasonably well and write Javadoc/JSDoc on my stuff, so there is something to see. Others seem to rely on gargantuan "self documenting" (bullshit!) identifier names for the IDE to auto-complete, rather than actual documentation. Pet peeve.
There was no code re-use.
Sounds insane by today's practices but in reality it worked well more often than not. Business functional changes almost always applied to just one or a small number of pages. You could change those pages with impunity and be pretty confident that you would not break anything in any of the hundreds of other pages in the site.
Rarely this approach caused more work when a change did affect dozens of pages. But on balance it made most changes much easier to implement and test.
The trick is to strike the right balance between repeating code and testing it. I've seen codebases become unmaintainable piles of almost repeating code that was never tested beyond a developer opening the page and checking the behavior manually.
Preventing such messes is one of a developer's key responsibilities.
You should do this in a way that doesn't introduce duplicate code, but this method is a really good way of onboarding yourself onto a new project: find a part of the project that almost does what you want and go from there.
A lot of stuff we take for granted are either accidents of history, or powerful counter-reactions to the accidents of history.
There is a practice, and it turns out to be bad. Mild discussion of the virtues and vices would, in a world composed of Asimovian robots, be sufficient to update the practice to something better.
But that's not how humans work! Typically an existing practice is only overturned by the loudest, most persuasive, most energetic voices. And they have to be. Humans don't come to the middle by being shown the middle. They come to the middle after being shown the other fringe.
So a generation changes its mind and moves closer to the new practice. Eventually, that is all the following generation has ever heard of. The original writing transforms its role from mind-shifting advocacy to the undecided to being Holy Writ. The historical context, and with it the chance to understand the middle way that had to be obscured to find the middle way, is lost.
My previous role was as an engineer teaching other engineers an XP-flavoured style of engineering. I often referred to our practices as "dogma", because we are dogmatic. But if we aren't, less learning takes place. Dogma is most instructional when someone later finds its limits.
When I was learning to coach weightlifters, I was told something that has always stuck with me: "As a coach, you will tell trainees a series of increasingly accurate lies". You can't start with nuance. In the beginning, it won't work.
What I taught was a way of working. I didn't deviate from the practices, because the principles are easy to state but hard to truly grok.
Going back to what I said earlier, this is the difference between weightlifting drills for various parts of the movement, versus discussions of physiology, anatomy, anthropometry or physics.
You start with the drills.
Learning when and why to break the rule takes longer. It helps to first go through a bunch of concrete examples.
I sometimes referred to Pivotal Labs as a debating club that produces code as a by-product. Everything is up for debate. "Strong opinions, weakly held" was a frequent motto.
But that didn't mean we started from scratch. Almost all projects start with the core practices and stick with them fairly tenaciously and inflexibly (in the face of the circumstances we have seen before), in order to facilitate the immersion.
Yes. I need a different word. I am not conveying this well at all.
When you strip all your useful tools and concepts away, you're forced to rethink how you can organize with just data types and functions. Surprisingly, you can do pretty well with just these.
It's the sort of thing that helps with recognizing when you're looking at FizzBuzz and when you actually need to use a generic factory.
Having large arrays of const data drive 'C' programs can - emphasis on can - lead to much more manageable 'C' code.
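The table-driven idea described above (and echoed in the Fred Brooks quote downthread) is sketched here in Python rather than C; the commands and handlers are invented. Behavior lives in a const-like table of data instead of a chain of conditionals, so adding a command means adding a row, not editing logic.

```python
# Table-driven dispatch: the "large array of const data" drives the program.
# Command names and handlers are invented for illustration.

def cmd_add(a, b): return a + b
def cmd_sub(a, b): return a - b
def cmd_mul(a, b): return a * b

COMMANDS = {      # the data table that replaces an if/else chain
    "add": cmd_add,
    "sub": cmd_sub,
    "mul": cmd_mul,
}

def run(command, a, b):
    handler = COMMANDS.get(command)
    if handler is None:
        raise ValueError(f"unknown command: {command}")
    return handler(a, b)
```

In C the same shape would be a `static const` array of `{name, function pointer}` structs; either way, the table is obvious at a glance in the way a flowchart of branches is not.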
That said, I wish the "strong types" crowd would take a look at all of the temporal coupling that their OOP designs are causing, with their update-at-will practices. Now that we have working garbage collector software plus adequate hardware support, why not make use of that?
FP needs to become a more widely used practice.
Something as described in this talk:
Some are just lost if autocomplete in the IDE doesn't enter stuff for them, and the more verbosely the IDE spews, the better, cuz it looks like "work".
So, Lua tables? ;)
Thou shalt not index from one.
The problem was that Lua reified 1-indexing into the language when they optimized to make arrays faster and then created length operators. At that point, 1-indexing got baked into the language.
I just did this yesterday... copy/pasted some code from one function to another function, tested that the new function works and moved on, and then when I went back to work more on it, I wrote more generic code that can be called by either function. Don't have to overthink things before writing the first function.
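That workflow might look like this in miniature (all names invented): two copy-pasted functions ship first, and only once the shared shape is clear does the generic version get extracted.

```python
# Step 1: copy/paste, test, ship. The duplication is obvious but harmless.
def total_order_price(order):
    return sum(item["price"] for item in order)

def total_order_weight(order):
    return sum(item["weight"] for item in order)

# Step 2: later, once the shared shape is clear, extract the generic helper
# that either caller can use.
def total_field(order, field):
    return sum(item[field] for item in order)
```

Writing `total_field` first would have meant guessing at an abstraction before seeing the second use; writing it second makes the generalization a mechanical step.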
The former almost always works while the latter almost always fails.
The former gives a little dopamine hit and a boost to the ego because that's being "an architect" while the latter feels more like being a janitor. Ironic really.
I am not quite sure, though, if we can already talk about an abstraction at that point.
What may make sense is something similar to a lemma in proofs. This is strictly for ease of debugging or readability.
De-duplication is "equivalent" to (LZW-style) compression, and LZW compression is a solved problem. But local conditions may insist more on de-duplication.
Point #2... seriously? Keep your code consistent, and if you identify opportunities for reuse for things that are related then do it. Copying and pasting code, which is what this guy is advocating for, is not good.
Well obviously this is why we have variables and functions, etc. And hopefully this helps enforce DRY ("don't repeat yourself.")
But reusing a component leads to coupled code, and that can also spell disaster. Sometimes it really is better just to copy stuff. Maybe this is a matter of taste; some people like to have the perfect design, and if that makes them a happy coder then I guess you should let them do that. I find that, more often than not, the "perfect design" is not worth sweating over. My style is tracer-bullet: get it working today, write some tests maybe, and then refactor it.
Reminds me of an anecdote about getting people onto the moon. When that was their goal, the very first thing NASA did was make sure they could hit the moon with a rocket. Then they iterated from there.
The term is actually from the book Pragmatic Programmer (a good book, though I'd say mostly for beginner-to-intermediate programmers, since a lot of software blogs are essentially just repeating the contents of this book ad nauseam).
I tend to do that too - get something working fast, and iterate over it. Focusing on perfect abstractions makes no sense if then you find out you're aiming in the wrong direction.
I find I have been terrible at obsessing over abstractions I could be proud of. I try to instead keep my eyes on solutions I can talk to.
I think the specific scenario was alluded to when someone comes to read your Foo action, you want to avoid: "Found the entry point to Foo, looks like it first sets up something using Utils A. Now I need to get a rough idea of Utils A. Finally, back to Foo. Only, now it makes use of two library functions from B and C. Now, off to get a rough understanding of B and C. ..."
It is possible, of course, to quickly get through all dependencies. Or, to just ignore the understanding of A, B, and C to get whatever you needed into Foo. Doing this, though, typically muddies whatever purpose A, B, and C originally had so that maintaining them will be near impossible.
Aiming for design that's easy to refactor and/or replace bolsters application longevity at the expense of code longevity. I like that. It's like the ecosystem longevity is achieved at the expense of the longevity of individual organisms...
Now then, who has ideas, or best practices or even just anecdotes? I'm eager to hear those!
For my part, I follow a design philosophy that revolves around persistent data structures. My code is supposed to be just an accessory to the data. I don't think I'm explaining this well; it's just where I put all my focus. Another principle is to try to capture users' intent rather than the outcome of users' actions. This way I can redo the code that derived outcomes from actions, and fix some of the earlier erroneous behavior.
Fred Brooks: "Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious."
It seems that just as in literature, in software engineering too, the essence of writing is rewriting...
No longer have it but such tools definitely influenced my belief languages must make code easy to iterate, change, and throw away. LISP with types, Design-by-Contract, and incremental compilation is still the champ. Think, type, result in 0.2 seconds, test, rethink, retype, etc. Just keep flowing.
The mainstream stuff needs support for these things.
[...] The best quality of a Design today is how well it can be undesigned. [...]
Examples? Grunt, Bower, Hibernate, Apache Commons. How long did it take the Apache Commons project to properly add generics to its libraries? How is backward compatibility and developer availability holding back projects down the line?
Additionally, in order to use a library, you need to have a good knowledge of the problem it tries to tackle. By relying too heavily on open source software, you might blunt the competitive advantages of your business. It's an example of relying exorbitantly on abstraction.
This cost varies a lot: for example, a library to left-pad a string may be easily replaced, but if you choose a library to implement some on-disk format, that can cement things in. If the library becomes unsupported, you may very well end up maintaining it alone, for example.
In the more general case more issues are attached: choosing a deployment stack, a documentation or translation tool etc. can result in decisions that are essentially set in stone. These also tend to be composed from multiple tools, leaving more wiggle room. Deployment for example might be done with AWS and Ansible on the higher levels, but use in-house tooling for figuring out the details.
Similar arguments apply to code de-duplication and DRY: de-duplicating code isn't free, and not always the right thing to do.
See also: http://blog.liw.fi/posts/dependency-cost/
But, presumably, you used that specific format because you needed it. So, it makes sense that you take it over if it gets abandoned.
If the problem is that you didn't need that specific format, then you should rip it out completely when (and only when) that library becomes unsupported.
A company I used to work for was closing its local office, and moving the work we did closer to the home office. One of the lead/architect types asked my friend who was still there at the time "Why didn't you use something like Spring Batch for this?". He accurately responded "Because it didn't exist in 2005."
That said, I'm glad something like Spring Batch (at least for Java) is around now. I've had to implement a partial version of something like that 3 times in the last 20 years :-)
(Both a C version and a Java version for turning client file dumps into mass mailing print output; a Java version for doing ETL type jobs for in/out bound client integration feeds)
That said, I've seen under-engineered batch jobs that mix incremental input and output in a giant spaghetti loop, rather than taking the time to do a bit of partitioning (e.g. - ye ol IPO, Input-Processing-Output), so that you can do things like selectively run certain input units, or recover from bad data in individual input units.
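The Input-Processing-Output partitioning mentioned above might be sketched like this (all names invented): each phase is separate, so you can rerun a single input unit, or skip a bad one, instead of untangling a giant incremental loop.

```python
# Minimal IPO sketch: Input, Processing, and Output are partitioned so that
# individual input units can be retried or skipped. Names are invented.

def read_units(lines):
    """Input: parse the raw stream into discrete units."""
    return [line.strip() for line in lines if line.strip()]

def process_unit(unit):
    """Processing: pure, per-unit, and therefore retryable in isolation."""
    return unit.upper()

def write_units(units):
    """Output: all side effects (here, just joining) gathered in one place."""
    return "\n".join(units)

def run_batch(lines, skip_bad=True):
    out = []
    for unit in read_units(lines):
        try:
            out.append(process_unit(unit))
        except Exception:
            if not skip_bad:
                raise          # selective recovery per input unit
    return write_units(out)
```

The spaghetti-loop alternative interleaves reading, transforming, and writing in one pass, which makes "re-run just unit 37" impossible without replaying everything.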
Oh, luxury, I tell you - Grunt, Bower, Hibernate and Commons are the most obscure, unmaintained third-party "solutions" that you have to deal with? I'm over here dealing with vanity projects like "Dozer" and "XmlBeans".
Or, are you just saying that if you have a complicated problem you will probably need to know specifics of how these tools work? I can buy that. I do question how many people are truly doing something complicated where these tools wouldn't be sufficient.
Also, consider that the main contributor (cowboy) last committed in May 2015. This is just a cursory github-health-scan (tm), but it tells me two things:
- Don't expect any large changes in the grunt code-base in the coming years
- The code-base is probably not easily modified (either because of low code quality or the risk of breaking builds).
A lot of people consider that a good thing.
I could mention some flaws of TeX, and observe how many are related to input, output and environment:
- archaic syntax. Even though it has an interesting design philosophy, it does not adhere to the syntax (input) most programmers are used to (you know if you ever tried to write TeX)
- does not take advantage of modern system architectures. Needs to process a TeX file multiple times.
- does not allow animation (this was not a requirement, nor a possibility back in the 70's)
- not much interactivity (yes, URLs are possible, but it's mostly a hack). Web wasn't available back then.
- output format (DVI) limits possibilities
So, even though I like TeX and its attention to detail, it was already dated when I started using it in 2000. In our landscape, adaptability is an important trait of libraries.
I pretty much agree with all points, except number 9 could probably be "Not challenging the Status Quo" instead of "Following the Status Quo".
Breaking the status quo just for the sake of it would be a mistake, while a healthy challenge of the status quo, with the open-mindedness to accept not changing anything, is probably a better direction.
> Areas of code that don’t see commits for a long time are smells. We are expected to keep every part of the system churning.
Or it could be an indicator of mature features that were well designed and implemented to minimize future headache: they just work and have very few stable dependencies, if any.
Also let's not forget the opposite. I've worked in places where everyone just wrote their own spaghetti, no concept of version control, and every time there's a small change it takes ages for the only coder who wrote it to untangle and modify it. Basically a steaming pile of turd, used to invest real money in the real market. The worst part about it is when you call them out it's YOU who doesn't understand the requirements.
Premature optimization is the developer's version of pixel perfection. They are both the enemy of getting anything done.
I would love to read more practical articles like this. Any URLs people could share?
If just maxing out the computer's RAM and CPU count solves your problem, then it's not big data.
Suck relevant DB entries into memory. There is no step 2.
Step 3 is using a real database.
Circles into square pegs. ;)
Not everyone thinks or codes like this, and those that don't are not all greybeards.
...I don't get it? This sounds like it should be saying we try to "plan ahead" too much, but then the description seems to say we don't do enough.
2. Reusable Business Functionality
I think this is arguing against doing too much design work up front? From what I've seen, incrementally growing a system tends towards the opposite problem unless I'm aggressively looking for refactoring opportunities.
3. Everything is Generic
This isn't a case of too much vs enough, it's a case of correct vs incorrect. If you can guess how the business requirements are likely to change, making things generic in the right way will make that change much easier to do.
4. Shallow Wrappers
Yeah. Unless you have actual advanced knowledge that you'll need to switch out a particular library, this should be done on-demand as a refactoring step before such monkeying around. Except for things where you need a seam to make testing easier.
5. Applying Quality like a Tool and 5.1. Sandwich Layers
...does anyone actually think this way?
6. Overzealous Adopter Syndrome
Maybe, but keep in mind these can also be used to clarify intent or to intentionally constrain future changes.
The examples look like things where pursuit of whichever <X>-ity didn't actually work, rather than cases where it wasn't needed.
8. In House “Inventions”
These tend to be a result of either very old systems that date back to before an appropriate reusable version became available, or organically grown systems that had parts gradually come to resemble some existing reusable thing (that initially would have been overkill and more trouble to use than it was worth).
9. Following the Status Quo
Or in other words, "don't fix what ain't broken" isn't actually good? How is this "over-engineering"?
10. Bad Estimation
How does this fit the theme? I thought the standard way to improve estimates was to put more thought and detail into them, which means the problem here is actually under-engineering (well, that and general noobishness).
Edit to add:
Important Note: Some points below like “Don’t abuse generics” are being misunderstood as “Don’t use generics at all”, “Don’t create unnecessary wrappers” as “Don’t create wrappers at all”, etc. I’m only discussing over-engineering and not advocating cowboy coding.
So... if you disagree you're wrong and misunderstanding the article? If it's that misunderstood, it's the article's fault for failing to communicate effectively.
My view may be biased by my experience, but I understand it as: no matter how beautiful a logical structure you invent that makes all the requirements fit perfectly, the business will quickly come up with a new case that doesn't really fit anything. Business requirements are unpredictable because they make no fucking sense - they're a combination of what managers think customers need, what marketing thinks it needs, what future visions your boss has that he didn't tell you (or that he doesn't himself even understand yet), all with a sprinkle of people's moods and the subtle influence of the phases of the Moon.
See also the motto of American Army - "If we don't know what we're doing, the enemy certainly can't anticipate our future actions!" ;).
Or maybe things can look that way, if the devs don't understand the business.
The business is doing something to make money. The requirements will somehow work to support that something, or will be something that your business contact thinks will support that something. If they don't make sense, that means your mental model of the business is different than your business contact's mental model. And unless your company is unbelievably dysfunctional, that's an opportunity to talk and reconcile those models.
> TL;DR — Duplication is better than the wrong abstraction
Woah, horsey, hold on a moment!
While it's true that an abstraction can get you into trouble, that's not always true.
Over my many years I've heard a number of people say: "We have 100 copies of the same site, but slightly altered for each client, and we don't have time to go back and refactor them, we can't upgrade them, and we're three major versions behind. Want to come work for us? (Silence.)"
I've only heard one person say, "We refactored X and it bit us in the ass, because the developer didn't check when he/she was altering it and accidentally changed behavior for everything."
So instead changes must be made outside the generic shared code and you end up with 100 slightly different sites that can't be upgraded.
* It's vs. not v.s.
* If you want to copy a site 100 times, go for it. But I know very few that ever thought that was a great idea, and each time it was a very specific case. Yes there are client-specific features and multi-tenant sites, but that's not what he said. He said "copy vs. abstraction" which is the opposite of refactoring.
As an aside, I'm really having trouble understanding how people in the HN community could be thinking I'm wrong on this.
I think I need to go to a forum that's more grown-up if this is how things are here now.
The pitfalls of over-abstraction, trying to be more generic than necessary, more "clever" than the business requirements, etc are platform agnostic.
That's what is meant by "modern software" if you just read on-line blogs. On-line tends to focus around on-line (also, it's hipstery and hot, so it gets lots of attention).
That said, the article is rather general and definitely applies to desktop-app programming too.
I have a small bone to pick with 4) Shallow wrappers. My design process involves ignoring existing solutions for problems and then finding things to match my desired design. Often, this requires absolutely no wrappers, sometimes it requires shallow wrappers, sometimes more involved wrappers, sometimes implementing it all myself.
I do agree that you should not blindly wrap anything.
All in all, a great article. I think this should be required reading for aspiring designers/architects.