
I think this highlights a common issue when development is (or feels) rushed. You either end up with developers having only done the first part of each of these pairs (repeating themselves, ball-of-mudding, etc.), without time to clean up as part of each iteration, or you find developers immediately shooting for the latter half of each pair (DRY, modular, etc.) without having done the former, and so you get abstractions that make no sense, overly complex interactions in a shared function as they attempt to be DRY, and so on.

The latter is also, I feel, what informs a lot of the monolithic frameworks used for 'enterprise' development, Spring and the like, where a predetermined architecture and structure for the app is imposed by the framework. Once you get down into the weeds of odd edge cases, that leads either to hackery on the part of the developer, or to framework bloat if the framework attempts to address the most frequent of those cases.




Couldn't agree more. Sandi Metz has an excellent talk touching on this topic: developers' exaggerated willingness to keep things DRY. She also elaborates a bit on why that is: it's one of the easiest things for a not-so-experienced developer to identify, and one of the easiest things to teach.

Edit: One of the best quotes from that talk is (paraphrased): "The wrong abstraction is a lot more expensive than duplicated code". https://youtu.be/OMPfEXIlTVE


Funny you mention monolithic code. The post expressed a very important thought:

> we’re trying to keep the parts that change frequently, away from the parts that are relatively static

Modularization and good code contracts help "sequester" code parts with different rates of change.

And this is my guiding hunch when I am splitting up code and deciding on module surfaces: how confident are we in the subject matter that the given code describes? This doesn't replace actual proper concept modeling, but it's an intuitive feel that helps future iterations go smoothly.


Intuition is great but we often forget that we can use churn stats too.

    find . -name "*.rb" | xargs -n1 -I file sh -c 'echo `git log --oneline file | wc -l`: file' | sort -nr

Applying a time-period window is good too, as some parts of the code churn for a while and then reach stability.
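
A minimal variant of the same one-liner with such a window, using git's --since date filter (the three-month cutoff is an arbitrary choice for illustration):

    find . -name "*.rb" | xargs -n1 -I file sh -c 'echo `git log --oneline --since="3 months ago" file | wc -l`: file' | sort -nr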


What does that command do?


By my read, it prints a list of all Ruby files in your project sorted by number of commits against them. It may take a long time to complete depending on number of files and number of commits.

YMMV, I am not a BASH shell.


We put a variation of the script to music also. Seriously. It's on iTunes and Bandcamp:

https://itunes.apple.com/us/album/snippets-programming-episo...

https://fowlerandfeathers.bandcamp.com/releases


http://explainshell.com/explain?cmd=find+.+-name+%22*.rb%22+... doesn't quite get it either, but I think you're right. I'm not near a machine with Bash to check.


Actually, it finds all Ruby (.rb) files, then for each one counts the lines of its git log output, one line per commit that touched the file. It then prints each count alongside the filename and sorts the list so the files modified most frequently are at the top and those modified least frequently are at the bottom.

In other words, it shows (as the poster says) file churn in a Ruby codebase.


It finds all of the files with the .rb extension in a directory, then feeds each one to git log --oneline. wc -l then counts how many lines git log produced, which tells you how many times each file has been modified (one commit per line). Lastly, these counts are sorted from largest to smallest.

The gist of it is that you figure out which files get modified the most. This lets you know where your most frequently changed code is.


Your first paragraph just nailed it. You were able to mix the post's content with real-world mechanics.

One thing I find really hard is knowing where to draw the line at an uncleanable ball of mud.


For myself, I find the most important thing is to have clear interfaces (contracts). That is, I can write the hackiest code inside of a module, but I will spend time upfront to make sure that what that module exposes is the cleanest I can make it.

Then I can isolate and fix a function by itself. It may have been written to be 200 lines long, filled with hackery and half measures, but the complexity is contained within it and the functions it calls out to (so nothing calling it should need to be substantively changed). Those called functions, in turn, may also need improvements, but because my focus every step of the way was keeping the interfaces clean, I can always get down to some base set of functionality that has no dependencies, fix that up, make the necessary fixes in the functions directly calling it, test, and then move up to those calling functions and repeat, until I get back to the big ugly 200-line monstrosity. Every step of the way I can make sure things are still working, and I don't have to substantively touch anything above the 200-line monstrosity until I've cleaned it up.

So by keeping my interfaces clean, I can figure out how, inductively, I can make progress.

The lack of clear interfaces is the real problem; if you don't put in the effort when writing your code to keep those clean, you end up with circular dependencies, implicit workflows and state shared between functions ("to call X you first have to call Y to get a foo, pass that into X, then call Z to reset the foo value"), and other nightmares that push you toward needing a full rewrite.
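
A minimal sketch of that shape in Python (module and function names invented for illustration):

    # report.py -- the module's public surface is one clean function.
    __all__ = ["monthly_totals"]

    def monthly_totals(records):
        """Return {month: total} for (iso_date_string, amount) pairs."""
        # Clean contract: plain data in, plain data out.
        return _sum_by_month(_drop_bad_rows(records))

    def _drop_bad_rows(records):
        # Hacky internals are tolerable here: the underscore prefix keeps
        # them off the module's public surface, so they can be fixed in
        # isolation later without touching any caller.
        return [(d, a) for d, a in records if a is not None]

    def _sum_by_month(records):
        totals = {}
        for date, amount in records:
            totals[date[:7]] = totals.get(date[:7], 0) + amount  # "YYYY-MM" key
        return totals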


Also, plain-old-data interfaces are very desirable; they decrease coupling. Get the data off the interface and restructure it (renaming, preprocessing...) into a suitable form before implementing the actual functionality. This decrease in coupling makes representation changes in the interface more practicable.

Of course sometimes stateful (procedural) interfaces are a must, but it's surprising how many painful OOP classes can be replaced with a "const struct foo" interface with a clear meaning to the data in it.
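
A rough Python analogue of that "const struct foo" idea, using a frozen dataclass (all names invented for illustration):

    from dataclasses import dataclass

    @dataclass(frozen=True)  # immutable plain data: the interface is just the field list
    class RenderConfig:
        width: int
        height: int
        title: str

    def render(config: RenderConfig) -> str:
        # Reads plain, introspectable data; no hidden state to entangle with.
        return f"{config.title}: {config.width}x{config.height}"

    print(render(RenderConfig(width=640, height=480, title="demo")))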


Higher order functions are one tool to help transform procedural interfaces into data interfaces---after all that's the whole point of treating functions as data.


What exactly is "the whole point of treating functions as data"?

Of course you can declare that everything is data. You don't need higher-order functions for that. But real data is simple and introspectable; computation is not.


Yes, even pure functions are not introspectable. All you can really do with a function is call it on a value and get a return value. (For the sake of simplicity, I am ignoring side effects here.)

Let me try to rephrase with an example. E.g. you might have an API that allows you to open a file, manipulate it, and then close it. That's a very procedural interface.

As an alternative, think of an interface like the following:

    withOpenFile(filename, manipulator)
that opens the file, calls the manipulator function on the contents, and automatically closes it.

Or compare map, filter and reduce vs manually iterating over a collection of items.
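
A minimal Python sketch of that shape (the name mirrors the withOpenFile above; the example file is hypothetical):

    def with_open_file(filename, manipulator):
        # The higher-order function owns the open/close lifecycle, so
        # callers can no longer forget the cleanup step of the
        # procedural open/manipulate/close version.
        f = open(filename)
        try:
            return manipulator(f)
        finally:
            f.close()

    # Behaviour is passed in as data -- a function value:
    # with_open_file("example.txt", lambda f: sum(1 for _ in f))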

I did really like your grandfather comment (https://news.ycombinator.com/item?id=11099224). I hope I've cleared up my point?


Nitpick: in JavaScript you can actually call functionName.toString() and get the source of the implementation. AFAIK Angular used this trick to implement its dependency-injection mechanism.


Yeah, but that's evil. And even with the source code, the only decidable properties of Turing-complete systems are trivial. (E.g. you can't even tell in general from the source whether a given function will eventually return or loop forever.)


That doesn't show the actual values in the closure, which are subject to change anyway.


Thanks :-)


My experience has been that shitty code tends to be shitty because it accesses "magic global state," so it's hard to corral it into a single module almost by definition. This usually happens because of rushed deadlines; it's often hard to plumb through an extra parameter if there are many touchpoints or a lot of tests to fix, and easier to just stuff something into a global variable and ship the code.
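
A tiny illustration of that trade-off in Python (contrived names; a sketch, not a prescription):

    # The rushed version: stuff it into a global and ship.
    CURRENT_USER = None  # magic global state; every caller is now coupled to this

    def audit_log_rushed(message):
        print(f"{CURRENT_USER}: {message}")

    # The plumbed version: one more parameter at every touchpoint,
    # but the dependency is visible in the signature and testable.
    def audit_log(user, message):
        print(f"{user}: {message}")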


That's why a language that forbids magic global state (or at least makes you clearly tag every tiny instance of it) can be so useful.


Then someone invented the singleton...


In e.g. Haskell, a singleton needs to be tagged with IO. And that makes sense, because singletons are clearly evil global state.


+1. Clean, data-driven interfaces also promote meaningful tests, as opposed to cringe-worthy "mock object"-driven tests.


Same here. Write your shitty code in an isolated private part. Expose decent parts and let those be used elsewhere.


Yeah... Spring Batch comes to mind.



