At the time of writing this, this "article" is #1 on HN default sort. It's an incredibly short and fluffy post (self-reported as a 3 minute read), with little more substance than an Urban Dictionary entry. I'm baffled.
On the topic of yak shaving, though: I recently felt like I wanted to do more technical writing. The prerequisites of this idea involved building a static site generator and redesigning my website. While building the SSG, I of course imagined all the features that would be expected: a tagging system, JSON output in addition to markdown rendering, custom syntax highlighting for code blocks, etc. I had that idea at least 1-2 years ago, and I only wrote my first post within the past 2 months. I think I enjoy tinkering with build systems much more than writing.
I think many of us read Hacker News for the comments.
The downside is that it's not so much the article that becomes relevant as the topic. HN felt like talking about yak shaving, and all we needed to get started (you included!) was a headline.
It also creates a vicious cycle where people who do care about article quality are more likely to disengage with the site -- and these are most likely the people who write the highest value comments.
I don't think a culture of 'only reading the comments' is at all a good thing.
Sounds like the path Reddit took. I doubt HN will ever fully turn into it, but I could already see some of that* leaking in over the last 2-3 years, although thankfully it is harshly reined in by moderation and fellow posters alike.
*by that I refer to comment chains just making bland puns, joining in threads for no reason other than saying something, even if it adds nothing and at times is straight up incorrect, etc.
To me, a more important issue is users who read but never comment. Obviously not everyone has something meaningful to add to every topic, but if someone has read 100 articles and never added anything meaningful, something is wrong with HN.
Agreed - it's like watching the emergent complex behavior that an AI spits out after you give it something unique. You're curious: what on earth could the HN crowd have to say about yak shaving? The input doesn't matter much at that point; it's the unique/novel thing that gets output.
You could search Algolia for the last time it was discussed, but it's much more fun to bring up the topic again. Who knows, maybe the emergent behaviour is different from the last time it was discussed.
Do you get different emergent behaviour depending on the time of day/week?
Have demographics of the site altered such that the prevailing opinions are vastly different?
What is the mean time to only tangentially related items being discussed?
I have written comments on articles that were posted before. In many cases the discussions and participants are quite different. And they complement the comments of the original post.
HN is completely random. Yes, there are patterns, but if you looked at 1000 articles that have been posted 5 or more times on HN, you would see it's completely random whether a topic gets attention or not. Same with an emerging news story: multiple users will post it, and frequently only after it's been posted a few times over hours or days does it get traction. And even then, there are obviously a lot of posts that never get attention here but do elsewhere, despite being within the scope of what's on topic on HN.
Agree, though HN also clearly has "Ask/Tell HN" if someone feels like talking about something, and those are down-weighted by default in the vote ranking, which to me says that posts linking to content should have substance.
For the people that read HN but don’t comment, please load/scan a link before upvoting. Not doing so will have a net negative impact on HN if enough people are not checking the links.
> The prerequisites of this idea involved building a static site generator and redesigning my website
To clarify: that is not yak shaving, that is just plain old procrastination. Yak shaving includes tasks that you have to complete in order to do whatever your original intent was. For example:
* I want to add a new feature to my code.
* The feature requires a database migration.
* To apply the migration to the test environment I need to set up AWS credentials.
* To set up credentials I need to log in to the AWS console.
* My password hasn't been rotated in a while, so I need to update it.
* etc.
> Yak shaving includes tasks that you have to complete in order to do whatever your original intent was.
The example (and definition) in the actual article disagree with you:
"You want to bake an apple pie, so you head to the kitchen.
In the hallway, you notice some paint chipping on the wall.
So you walk to the hardware store for some paint."
You obviously don't need to deal with the paint chipping before baking an apple pie.
Wiktionary gives both definitions: a blocking task or a procrastination detour.
Similar thought: I think you could say there's a spectrum of yak shaving. I wouldn't, as the article does, call "going down the rabbit hole" yak shaving, as rabbit holes feel more intentional.
My feeling of things related to yak shaving:
Rabbit holes - intentionally following some thread of associations looking for some insight or unexpected solution
Yak shaving - While working on something important, you iteratively find something more important that you feel you need to deal with first. Ostensibly, but not necessarily, it's something you have to deal with to be able to solve the original problem.
I agree that the article's example of yak shaving is a special case where it isn't very clear that the shaver feels the distraction is more important. But I suppose such a feeling is always a bit implied in the word distraction.
> Yak shaving - While working on something important you iteratively find something more important that you feel you need to deal with first.
No; yak shaving is something you must deal with to continue working on your original "something important". It is solving blockers, not following distractions.
If you want writing to become a habit, you can reduce the friction of writing.
Create a GitHub journal repository and add a markdown heading for each entry. It works.
I'm on 4 pages of 100+ journal entries each. See my profile; I journal computer ideas and algorithms in the open.
Using WordPress or other complicated software takes time away from writing regularly. You need to be able to just open a document and write, not abandon your blog because you forgot how to post.
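For anyone who wants to try it, the whole workflow is a handful of commands. A minimal sketch (file name, entry text, and commit message all invented for illustration):

```shell
# One repo, one markdown file, one heading per dated entry.
mkdir journal && cd journal
git init -q
git config user.email "you@example.com"   # needed in a fresh environment
git config user.name "You"

# Append today's entry as a new markdown heading:
printf '\n## %s\n\nNotes on suffix arrays.\n' "$(date +%F)" >> journal.md

git add journal.md
git commit -qm "journal: $(date +%F)"
```

Pushing to GitHub afterwards is optional; the point is that "open a document and write" is the entire publishing pipeline.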
> I had that idea at least 1-2 years ago, and I've only recently written my first post within the past 2 months. I think I enjoy tinkering with build systems much more than writing.
This is very much an easy trap to fall into! What helped me was not sweating over the small stuff and setting up an instance of Grav, though I think that most of the turnkey blogging solutions out there would work (e.g. Ghost/Bolt as well, maybe not self-hosted Wordpress as a first option due to large surface area): https://getgrav.org/
What I really like about that solution in particular is that it's flat-file based, and its admin web dashboard is a separate plugin that can be enabled/disabled (some might prefer writing text files directly, front matter and all) and lives under a separate URL path that can be put behind basic auth (in addition to the built-in auth), client certificate auth, or anything else.
It's not perfect, of course, and has given me the occasional headaches, but it's also good enough for my blog: https://blog.kronis.dev/
That said, I still struggle with my homepage - instead of going back through 5+ years of projects and describing all of the noteworthy ones, putting up a few galleries of screenshots, listing technologies, ordering them by relevance and also making sure that it doesn't contain too much data... it's just sitting there, on my TODO list. It's been that way for a while now.
The HN new queue is interesting because you often have a sea of articles with absolutely no votes, and then one or two articles with dozens of votes. While votes are probably power law or exponentially distributed, that skew seems a bit too extreme.
When quasi-blogspam articles like this rise to the top so quickly, one has to wonder whether it’s the result of vote manipulation.
Yes, been there, done that: tinkering (or yak shaving) makes it easy to forget about KISS. And sometimes it results in too many open tabs in one's browser ;-0
But even the best of us are prone to fall for it, and sometimes it even produces excellent results, cf. Donald Knuth and the TeX / Metafont ecosystem:
Between 1977 and 1979 computer scientist Donald E. Knuth of Stanford University created the TeX page-formatting language and the Metafont character shape specification language, originally as a way of improving the typography of his own publications. [...] -- https://historyofinformation.com/detail.php?id=3339
My experience is that programmers always try to make things as simple as possible, and when they fail at that, it's because they don't know how, or don't have enough time, or maybe the "simple" solution isn't persuasive enough that it covers all the cases, so a "more complex" version is preferred, or something like that; not because they forgot "whoops" I was supposed to make things simple not complicated.
The only times I have heard "KISS" suggested in earnest have been from bad managers.
> But even the best of us are prone to fall for it, and sometimes they even produce excellent results, cf. Donald Knuth and the TeX / Metafont ecosystem:
I don't see how this has anything to do with failing to keep things simple. Yak shaving, for sure, but this yak (if it's a yak) did need the shave: there weren't any typesetting systems comparable to TeX at the time (and by some opinions there still aren't). What else can you do? Have someone spend hours typesetting and resetting a thousand-page book every time there's an error, or turn the job over to a computer and typeset it automatically?
This is just solving problems. Solving problems you have is good; where people go wrong is when they solve problems they don't have (but perhaps think they might), or underestimate how much of their problem the computer could actually solve.
Emergence at work. There is much to be learned from just the behavior externals of HN.
Uniqueness, mystery, narratives and humor/silliness are probably big drivers of attention on forums and news sites. Like the other commenter suggested, the gold is typically in the comments. I don’t have data to back up this assertion, but it would make for a good study/article.
>At the time of writing this, this "article" is #1 on HN default sort. It's an incredibly short, and fluffy post (self reported as a 3 minute read), with little more substance than an urban dictionary listing. I'm baffled.
That’s not the definition of yak shaving I’ve ever used. I’ve always thought of it more like:
- I find a bug
- I track it down to a library I’m using. The fix is in a later release.
- I go to upgrade the library, but that brings in some transitive dependency upgrades that accidentally break an API used elsewhere.
So now instead of fixing my bug I’m rewriting API call sites across my project.
My understanding of yak shaving was more: you can't get to the thing you were trying to do because of cascading blockers. Less distraction, and more unknown scope and complexity.
I quite like the Malcolm in the Middle[0] example of yak-shaving, though it's a combination of both blockers and distraction.
My understanding of yak shaving is that if you need to create a system for selling ads then you'll start by developing a graph layout algorithm that produces a set of optimal layouts for routing power in a data center (because you don't want to be called out if your system doesn't scale.)
That's a much more correct example than the one in the article. The "real-world example" in the article describes doing something completely different than intended, not out of necessity but by distraction.
> yak shaving - Any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you're working on.
The other side I see to this is that if I don’t refactor this code, who will?
I guess that opens the debate to does the code actually need to be refactored? Sometimes not, but I have a very hard time leaving a 2000 line function alone, or a class with many levels of unnecessary inheritance.
What I’ve started to think is that there are people who get things done, and people who refactor.
People who get things done ship code fast. The good ones know when to make trade offs of purity vs practicality. The bad ones write unmaintainable code. They tend to view programming as a job, and code quality as a burden. They may refactor and make unrelated improvements, but it’s uncommon.
People who refactor still get things done, but most of their work has rabbit holes tacked on. They delivered a bug fix, but they also greatly improved the test suite and tooling. They tend to value purity and best practices, and might view programming as a craft/art. The good refactorers make the high-value changes that are a net benefit to the team; they might care about cleaning up important parts of the code, improving developer experience, or adding test coverage to critical functions. The bad refactorers rewrite a bunch of code because it wasn't pure, and the team, customers, and business see no observable effect from it.
I think teams need both kinds of people. Too many doers will lead to fast progress at first, but eventually grind to a halt. I felt this was a HUGE problem on my team at AWS, and a major factor in why I left.
Too many refactorers is equally problematic. Your code will be of great quality, but velocity will be unacceptable for the business.
I personally am a refactorer. Probably more of a bad refactorer than a good one, but it's something I'm working on.
Maybe this is a rationalization for my behavior so that I believe it’s important, but my time at AWS shipping 3 features a year with a team of 16 says that code quality really does matter.
What you're saying and what the article is saying are two different things. The whole point of the article is not "Don't refactor code"; it's "Never fix a bug and refactor in the same pull request".
The obvious question in any organisation with a layer of project managers is how to explain that you've delivered the fix and you haven't moved on to fixing the next most important thing or adding some other new functionality.
Yes, that's probably a sign of poor project management or organisational brokenness. But it's also probably the reality for most people.
I just mentioned it in this comment[0] but I disagree with this premise too since sometimes it's more work to fix the bug without the refactor:
> It also misses the point that sometimes refactoring makes it _easier_ to fix the bug, and that a large part of fixing a bug is understanding exactly what's happening with the code, which refactoring can also make easier.
Building on this, I see a lot of areas of code that are effectively untouchable because of how much energy would need to be invested to understand the code and its context before making a change. So if a change has to be made to fix a bug or add a required feature, it's done in the most low-impact way, which only makes the code that much harder to understand in the future.
Meanwhile, I can't get refactor-only commits through because nobody wants the code to be touched unless absolutely necessary.
We're in the middle of migrating a massive PHP codebase to Java. The code I'm talking about is newer Java code. In a few years, people will be wondering why the Java code is as much of a mess as the PHP code and, if I happen to still be around, my response will effectively be "I told you so" (in kinder, more productive terms).
I've basically determined that my only option to ensure continued or improving code quality is to combine refactoring and bug fixes in the same merge request, while still keeping them in separate commits. That way whenever I touch a part of code for bug fixing or whatever else, I leave it behind in much better state than it was before.
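A sketch of what that looks like in practice, with file contents and commit messages invented for illustration: one branch, two commits, so each diff stays reviewable on its own.

```shell
# Set up a throwaway repo with some pre-existing code.
git init -q demo && cd demo
git config user.email "you@example.com" && git config user.name "You"
printf 'delay = attempt * 2\n' > retry.py
git add retry.py && git commit -qm 'initial'

# Commit 1: behaviour-preserving refactor only.
printf 'def backoff(attempt):\n    return attempt * 2\n\ndelay = backoff(1)\n' > retry.py
git commit -qam 'refactor: extract backoff() helper'

# Commit 2: the actual bug fix, now a tiny, obvious diff.
printf 'def backoff(attempt):\n    return min(attempt * 2, 30)\n\ndelay = backoff(1)\n' > retry.py
git commit -qam 'fix: clamp backoff at 30 seconds'

git log --oneline   # each change is isolated, but they ship in one MR
```

The reviewer can then look at the refactor commit and the fix commit separately, even though they land together.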
> I've basically determined that my only option to ensure continued or improving code quality is to combine refactoring and bug fixes in the same merge request, while still keeping them in separate commits. That way whenever I touch a part of code for bug fixing or whatever else, I leave it behind in much better state than it was before.
I agree, in practice I think this is the best option, with some kind of timebox for the improvements made.
One thing I'd like to add is setting up some kinds of tests. If nothing else is available, E2E tests for the critical paths of the business are a good start. For (micro)services, getting API contract tests to work is also invaluable.
Refactoring while bug fixing can be good. But depending on your metrics, it might be underappreciated and seen as time wasting.
The guy who goes around blasting bugs like there's no tomorrow looks good to the boss. When he leaves, someone will have to refactor his mess, and that's wasting time again.
I had this conversation yesterday, and the refactorer asked this very question. Here's my answer:
The question is improperly formulated. With your behavioral patterns of immediate refactoring, the question seems to be: "if I don't refactor this code NOW, who will?".
I'd say that if you don't refactor the code now, someone else, maybe future you, may refactor it later.
Most sane companies will protect engineering from impulsive feature requests and will add them to the backlog for future consideration.
I'm trying to instill the same protective mindset in my teams, and help them not do refactorings immediately but rather treat them as potential work, as dedicated items in future sprints.
Doing so, prioritizing becomes a natural part of the refactoring/tech debt/tech risks agenda, because we have an overview of the work to do globally.
In my experience (three years, two companies, so not much), refactors are beyond last place for prioritization. Even my current company, which does a great job of handling tech debt, doesn't track refactoring well. How would you even do it? Imagine how many tickets you would have about the things that need to be improved.
I guess this is delving into queueing theory, and how the work queue grows to infinity because there aren't enough workers.
Personally, I just allocate X% of my time to this kind of work and spend it on the most productive improvements. You don't have to die on every hill you see. And if most people do this, it results in a constant shift toward something better while still achieving business goals.
I tend to think of yak shaving as the annoying but _necessary_ activity of fixing things you need to fix to close your bug. Not frivolous refactors. More like adding decent logging to a complex system, cause otherwise, how will you know why your feature doesn’t work?
Yes, the example is wrong. It's about making an apple pie, then figuring out you have no apples, so you go to the supermarket, etc.
It’s about “necessary” things to achieve a goal, without appropriately weighing costs vs benefits nor considering alternative solutions (perhaps making a chocolate pie instead).
Exactly: what gets mocked as yak shaving is the important stuff. But try telling your non-technical manager why you need to add that logging system, which requires its own testing and deployment.
In this scenario you are the focused one; the non-technical people who shuffle tasks around playing "agile" are the handicap. It could take a couple of hours or a couple of days to get that logging done, but explaining to the non-technical manager (who refuses to learn) why it needs to be done is a task in itself that cannot be underestimated.
Me too. Where I work it's slogging through the process of doing just about anything. A simple task ends up being a to do list that's 10 items long. There's a 60 day use it or lose it policy on just about everything so you may find you need to completely rebuild the dev and testing environments which include automated as well as complicated manual processes before you can even begin to fix the bug.
"Never fix a bug and refactor in the same pull request."
I'm sorry, but this is backwards. Bugs are often caused by badly written code, and you can tell when that's the case. Refactoring the code often fixes the issue without you ever having to figure out where the needle was in the haystack.
I guess in every field there are platitudes and prescriptions. At the end of the day, I try to follow first principles, ignore the platitudes, and just focus on building a great product.
What I see in this article is a disregard for the costs associated with context switching. My argument is, if you think you can handle the rabbit hole, and you think those related tasks will need to be done anyway at some point, head off to Wonderland. Because you have the context of the situation fresh in your temporary memory, so you’ll get it done faster than if you switch contexts and come back later.
It also misses the point that sometimes refactoring makes it _easier_ to fix the bug, and that a large part of fixing a bug is understanding exactly what's happening with the code, which refactoring can also make easier.
Overall I disagree with this article, both in its definition of yak shaving (as being unrelated to what you're doing), and in its assertion to never refactor and fix a bug in the same PR (now, I'm not saying you should refactor things every time you fix a bug, of course).
I'm going to throw an addendum onto this one: if you fix a bug and then do a refactor, I'll agree it's probably best to split the PRs up. But that's just the common sense of keeping changes small and logically contained. I'll also agree that it's best not to interweave the two.
Refactors usually take much longer than the bug fix, and while postponing them accrues technical debt, there may be more urgent things to take care of. The article is about focusing on your initial goal and filing the refactor as a next-step action item, instead of just growing scope endlessly.
I understand where you're coming from. I would argue that if the amount of refactoring required to make the bug clear takes that much longer, then all the more reason it should be prioritized. This is really the purest form of tech debt, because there may be other bugs present in the code that you're unaware of. This is assuming no tests cover the bug, because if they did, it wouldn't have made it into production.

Honestly, you should be doing it all, because you have to understand the full scope of the issue to properly fix the bug, test it to prevent regression, and in order to test it, the code must be testable. So I would say that if there are no tests and the code is not testable, the least amount of refactoring you should do is make it testable. You don't even have to write the test if you really want to cut corners: just applying the single responsibility principle, using dependency injection, and writing code that could be tested is enough to bring it 90% of the way there. You can even break the dependencies, and theoretically, as long as you don't violate the interface, the functions you refactored should hold up.

The simpler and more broken down the code is, the more you get to the point where you say: this function has one if statement and two return statements; writing a test is actually redundant compared to the code. If it's not mission-critical code, you can really cut corners, if you're in a hurry…
I think it should be more like: try to separate "formatting" and other cosmetic changes into their own commits (you know, the ones that change whitespace all around and other layout stuff), so that diffs of the actual fix are easier to read later.
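As a bonus, you can verify that a formatting-only commit really is just whitespace: `git diff -w` ignores whitespace when comparing lines, so it should come back empty. A minimal sketch (file contents invented):

```shell
# Throwaway repo with one committed file.
git init -q wsdemo && cd wsdemo
git config user.email "you@example.com" && git config user.name "You"
printf 'int main(){return 0;}\n' > a.c
git add a.c && git commit -qm 'initial'

# A "formatting only" commit that reflows the spacing.
printf 'int main() { return 0; }\n' > a.c
git commit -qam 'style: reformat a.c'

# Empty output here means nothing but whitespace changed.
git diff -w HEAD~1 HEAD
```

The same flag works during review (`git show -w <commit>`), so even an already-merged formatting commit stays easy to audit.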
I find it slightly insidious how the entire concept of yak shaving is framed in the article as a lack of attention span or directed focus on the part of the developer, instead of a consequence of being required to work in a suboptimal environment. In my experience the majority of yak shaving is spent removing direct impediments to solving the task at hand. Distraction is always an issue, but I feel that falls under the bracket of "doing the wrong task" rather than "the path to complete the task is indirect and convoluted".
With this kind of framing we can look at the cause of yak shaving as a failing of an individual, rather than a lack of an accurate and holistic view of what the task entails given the current state of the system.
https://projects.csail.mit.edu/gsb/old-archive/gsb-archive/g... The OG only specifies that it's "related" to the task, and makes it pretty clear that one could come up with a significantly cleaner path to the original goal - Yak Shaving overlaps heavily with "flimsy rationalization"
Many times I've found myself in situations like 'yak shaving'. To do this I first have to do that, and to do that I have to do some other thing, thus falling into a never ending stream of tasks that make me lose focus of my original goal. It seems to me something very common while performing unexpectedly complex jobs.
My point however is not to discuss this. What bothers me, and only a little, is the label chosen to describe situations such as these. Why 'yak shaving' and not something else?
I mean, the Ren & Stimpy connection is meaningful for the author, and that's ok. But if every time I want to refer to it I have to explain why it's named as such, or even explain who Ren & Stimpy are, maybe I'll be better served by some other, perhaps more appropriate label.
Calling it 'Going down the rabbit hole' is no doubt better, at least to me, but not completely. This expression captures a different mood, namely that of finding ever more connections in an endless stream that does not necessarily make me forget my original goal. For instance, while writing a paper, many times I find myself going down the rabbit hole, ending up with these monstrous footnotes that try to record how deep that rabbit hole goes. But when that is done, I go back to the main topic and move on.
So what could I call it other than 'yak shaving' or 'going down the rabbit hole'? I'm open to suggestions.
> Calling it 'Going down the rabbit hole' is no doubt better
It may be more recognisable, but I wouldn’t say it’s better. The metaphor of going down the rabbit hole has you go deeper into the same subject while yak shaving has you go into increasingly tangential tasks to the point an external observer wouldn’t recognise your primary goal.
In other words: the longer you spend going into the rabbit hole, the more you familiarise yourself with the matter; but the longer you spend yak shaving, the farther away you are from finishing the original task. Yak shaving is antithetical to going down the rabbit hole.
This person has obviously never been in a situation where to get the data to fix the bug you need to modify the instrumentation, which requires a new data format because you are the first to use dicts of dicts of floats, and when adding that you end up with a random seg fault, so you debug that and find out that the stack frame pointers disappeared last week when GCC was upgraded by a vendor patch on your colleague's laptop, but not yours, and so he checked in a "temporary" fix for that. Frame pointers restored, you find an off by one in the serialization of floats which you fix. So you go to collect your data but the data aggregation system, which said it supports your data type in fact does not in the current version, so you upgrade it, but for that you need a different GCC update. But that GCC needs a different libstdc++ which causes compile errors in your app. So you find yourself in the Makefile adding a way to use a local copy of libstdc++, and you can't remember why you were even fixing the bug, because in the meantime the docker image used for the mock database has been updated, adding a new bug that masks the bug you were supposed to be solving (but you won't find out about that for 3 more months).
Your scrum update for 6 days running is "maybe tomorrow", and your cat runs away because you are so preoccupied with the bug you forgot to feed it, and your yak needs shaving because in a daze on the way home from another day wasting your life at this stupid job, you forgot to pick up razors.
Yak shaving. It is weird, because sometimes a serious refactor is virtually indistinguishable from yak shaving: it looks the same from the outside and it feels the same from the inside. The only difference is that the yak shaver often labours under the illusion that what they are doing is important, while the refactorer knows that what they do is important.
I think in programming it is easy to get lost in scale. We all know those videos where you zoom from the universe in to a single atom. In programming you operate at similarly disparate scales. You need to have the high-level perspective ("I want to get to this mountaintop!"), the mid-level perspective ("I choose to scale the mountain via that route") and the detail-level perspective ("I put that finger into that nook of the rock to my left and shift my weight slightly to the right...") and of course there is also a tooling perspective ("What kind of gear do I need to scale that mountain?").
When you operate just at the detail level it can happen that you end up on the wrong mountain, branch off into the wrong route etc.
To make things worse you can also have a meta-perspective ("How will this help me beyond the mountain I am currently on?").
I've often been in the situation that someone left me some code and I have to fix a bug in it or add a feature but in order to even understand it I have to refactor it so it fits into my head.
I thought that I have ADHD, but apparently I just need to stop shaving the yak..
Distractions are real. The Pomodoro technique helps at some points, but when I get interested in or distracted by something else in the code, it's pointless.
I find that one or two hour timers help. I force myself to have a tea break or a snack. It lets you zoom out for a minute, and remember what you set to work on that day.
This brings me to my next point: know what you're supposed to be working on. I like to write it down at the start of the day while I have breakfast, before I touch a computer.
> If you truly need to make all of these other enhancements, do them in separate pull requests. Never fix a bug and refactor in the same pull request.
Somehow, in practice, I've seen the fixing-plus-enhancements approach work better at getting the code into better shape.
Also, why are we not good at doing both at the same time? If it's because it makes things harder to review, I've found that pairing on the review and adding clarifying comments to the PR is a much better way. If it's for the commit history, is it always more important to have a squeaky-clean commit history than to have easier-to-read (I'm presuming this is the intent of the refactor) code? Is there some point after which the code is bad enough to sacrifice some commit history clarity to get better code?
> You may even decide that these enhancements are distracting you from your immediate goal of shipping a feature. In this case, there is nothing wrong with addressing them in a tech debt sprint.
Has anyone actually seen this work? In my experience, once the original bug is out of the way there is no motivation or vision on how or why to make the cleanup and the tech debt tasks are left to rot forever, while the code slowly turns into mud. Does anyone know how to avoid this?
IMO the thing that works best is to do the yak shaving first before the bugfix. But also, timebox the effort for yak shaving - you're only allowed to do a certain amount.
Maybe do a PR with the refactor first, and another PR on top of it with the bug fix. That way you can test that the refactored code doesn't change any behavior, and then you can do the bug fix (hopefully more easily once the refactoring is done). And that way the refactoring doesn't get dropped as not important, since it's literally a dependency of getting the bug fix merged.
In the particular case of refactoring, I don't think this is a good idea. I can understand that multi-file system reorganization refactorings shouldn't be done with simple changes, but small enough refactorings can easily happen when working on a piece of code.
That is because:
- You have the best mental model of a piece of code and its surrounding system when you are working on it, not when you read a ticket description of what should happen. This leads to better overall efficiency and less context switching.
- I think "leave things better than you found them" and "with every not-so-minor change, consider how you would best architect the system at this point in time" are great principles.
- I find that "refactoring sprints" never really work well. They tend to be inefficient and are rarely prioritized by management. As developers we have responsibility for the code, and keeping it in the best possible state is an implicit part of our profession.
Even though it pays to be focused, I think there is merit in exploring the code base, as it helps you discover new ideas and avenues of improvement.
From their example I think what the author means by "Never fix a bug and refactor in the same pull request" is actually "Never fix a bug and do unrelated refactorings in the same pull request." Many times it is much easier to fix a bug or implement a feature if some refactoring is done first. In order to increase clarity in the following code review I recommend using short, descriptive commit messages following some type of standard. Two that I know of are Conventional Commits[1] and Arlo's Commit Notation[2].
> Never fix a bug and refactor in the same pull request.
We’re kinda going the opposite way: If you find something that should be fixed, then fix it.
This is, of course, meant for small things, like renaming or rewriting expressions to make them more readable. But we have had to make this change, because our experience is this is necessary to avoid code rot.
I think that the two ideas are separate. You should fix problems that you see, but you should do this in separate commits (and ideally PRs).
There are a few benefits
If your refactor is large, you can get a bug fix in ASAP in a small PR, and get the large PR reviewed later. Sometimes I’ve done a refactor with a stubborn CI integration test or two that might take a few days to fix due to long feedback cycles.
Aside from that, it lets your reviewer enter a different mindset for each change. The bug fix is more easily understood, and the reviewer expects a change in behavior. For a refactor, the reviewer can usually review in a more relaxed manner, because they know the behavior can/should be the same, and good unit tests should identify any differences in behavior.
What the author describes isn't really the kind of yak shaving I find myself doing as a programmer. Yes, fixing a bug in a single dedicated pull request is one technique that can be helpful in a very specific situation. But it's as if in the apple pie situation, the author said that the way to avoid yak shaving is to always have prebaked apple pies in your fridge. One quickly discovers how bad yak shaving can get when they go to store the apple pies in their fridge and find out that the fridge is full. Or when they go to open a pull request and realize that the CI tests aren't running because their credit card has expired and they can't update the card info because the person with access to that information is on PTO, and they spend the day trying to resolve the billing issue instead of just making a simple pull request to fix the bug.
If you follow the author's advice to its logical conclusion, then all changes to the code base are narrowly focused tweaks. Where does the longer-term thinking come into this?
If I’m implementing a new feature, should I also disregard the need for refactoring?
A more nuanced approach is needed. You need to learn when to make changes additively and when to reshape the code to fit your new use case (and how much reshaping is required).
As an aside: I think tech debt sprints (if needed regularly) are often a sign that you aren’t developing software sustainably day to day.
I disagree. If you realize "we need to refactor this file/module/class/etc.", then it becomes a new task. Or more generally, "we need to refactor the organically grown architecture": it's a new task. Working on a task should not prohibit you from thinking about additional tasks; just add them to the backlog.
I find the 'never fix a bug and refactor in the same commit' advice somewhat questionable.
If the code is a mess and I'm going to refactor it anyway should I spend a lot of time trying to fix the bug in the old code only to refactor it afterwards? Or should I refactor and deliberately leave the bug in there only to fix it afterwards?
To me doing them together often makes sense and I don't remember code reviewers ever complaining. But maybe I'm missing something?
It's about atomic commits: every commit should be the smallest possible self-contained change. But tbh I also often do both at the same time, and if one is hard to untangle from the other when committing, I just commit them together. To me, that's the lesser evil compared to not doing the refactor at all.
I have the opposite problem of yak shaving. If I see something that should be fixed, I do everything in my power to rationalize procrastinating on it. My main strat is simply "I'll pretend I didn't see that", even when it's something that could have catastrophic consequences if left unfixed. If it's something required for the task at hand, I use all my creativity to find an ugly hack to bypass it.
Refactoring is also a process by which the developer builds an understanding of what the code is supposed to do. This, in turn, helps with figuring out bugs as well.
Personally, it relates to the scope of the change/bugfix. Sometimes a nice refactor merge drags on forever because your understanding of the fix/task you're working on keeps changing until it solidifies; if it's an extensive refactor, it's more work to keep up with updates and conflicts.
It says not to combine refactoring with bug fixing. But like any advice, it's relative: a drive-by cleanup can be easier for everyone than reviewing a separate PR. And sometimes a refactor makes the bug (and all related ones!) vanish or trivially caught.
To quote Kernighan and Plauger: "Don't patch bad code: rewrite it".
I think doing a refactor and bug fix in separate PRs makes sense. But I would say do the refactor first, and then do the bug fix on top of it. This solves two problems:
* It can make the bug fix diff much easier to read/understand
* It makes the refactor a dependency of the bug fix, so it doesn't end up getting dropped as "not important enough"
Of course, if the refactor automatically makes the bug go away, then it's fine to do a single PR, because the refactor is the bug fix.
I guess it's an article for programmers who too often have tasks/issues called "Refactor component N" or "Pay down technical debt in module G". The majority of programmers rarely have dedicated paid time to do just refactoring.
Small refactorings and cleaning alongside bugfix are fine. Big refactoring should happen independently, of course.
I agree with the author. However, I need to do things that must be done now.
Because the best code is the enemy of good code. It's easy to fall into refactoring and lose everything in the end.
Good code is code that works and solves a problem right now.
Otherwise, clients will not wait around for a "well-polished", nicely formatted product that is easy for us to read and maintain.
And that is the problem with the whole infrastructure.
We needed solutions for X, Y, and Z just yesterday.
And then, if it takes off, we spend years, like idiots, fixing all the problems we created earlier.
Speed is critical. More important than perfectionism or the wish for well-structured, easy-to-read, maintainable code.
Is that awful? Yes. I want my code to be as good as it can be.
But is it possible in the limited time when an urgent problem must be addressed? Nope.
And over time we increase complexity, multiply complexity, and then leave because we are no longer able to maintain it. That's why we touch different things while fixing everyday problems: because we do not want to fix them later. Balancing complexity for the future. But this is a fiction; that future never happens in 99% of cases.
The biggest problem is that if we do not rush right now, there will be no tomorrow for the product we are developing.
> Speed is critical. More important than perfectionism or wish for well-structured, easy-to-read, maintained code.
The thing is, poorly structured, hard-to-read, unmaintained code is the enemy of speed. I don't think it's ever a good idea to go too far in either direction. If you spend all of your time refactoring/cleaning up code, you never get anywhere in terms of functionality. But if you never refactor, your development speed slows to a crawl, and making meaningful changes becomes impractical.
> The thing is, poorly-structured, hard-to-read, unmaintained code is the enemy of speed.
Sorry for debating with you.
But any modern code editor has tools that solve the problem of "well-structured" code.
As for "easy to read", that depends on the language and the programmer.
I've seen so damn much beautiful code with one-character-long variables.
And such code is extremely hard to understand. The code is compact and beautiful,
but understanding it requires a lot of time.
I think so, and my own experience proves it.
If you are not in a rush to finish things just in time, while they are still relevant, you lose. Always. And there will be no second or third chance.
That's why you need to write the way you are used to writing.
Only experience and practice give you the ability to write good code.
Focusing on rewriting parts of the code while fixing a bug is a big problem.
If the rewriting doesn't change anything beyond "better understanding and better readability", it is a bad waste of time.
If you are not in a rush to develop things and take the right actions at the right time, while the problem or request is still relevant, you lose. Always. Without a second chance.
That's why you need to compromise and write shitty code to win the competition in the short run.
And then, when you have an audience for your product, you can always fix things here and change things there. Nobody will complain about bugs or issues until they leak personal data.
I'm not saying you should just do random refactorings. What I am saying is that if you are fixing something and realize that you can fix it while improving the code at the same time, it is typically worth it. Obviously, if the refactoring is a huge undertaking, it's not going to be a great idea to pair it with a bug fix. But if you are smart about it, you can usually keep the code base clean/well-factored incrementally.
The problem with "quick" fixes is that they often end up being slower, because hacking a fix into a complex code base often creates new bugs/problems, and you end up rolling back your broken "fix".
> Only experience & practice give you ability to write good code.
I agree, which means that always writing shitty code to get it out the door means that you will only practice writing shitty code.
Sometimes rewriting things allows you to fix things in multiple places at once instead of one at a time.
My coworker had copy-pasted a bunch of stuff into each page, so when I actually templated them I could fix the bug he had on every page at once, instead of fixing it each time I stumbled on it.
I think a much more thorough and measured discussion of this was done by Gwern in his piece about socks https://www.gwern.net/Socks in which he describes yak shaving as a failure cascade and talks about ways to avoid it.
I would have loved this more if the plot twist was that the author first decided to recode their perfectly fine static site generator from Perl to the new hotness and along the way wrote a script that automatically rebuilds and deploys the site every time the executive toilet at American Express flushes.
I really like the Mikado Method [0] when you find yourself shaving a yak, at least when coding.
When you see something that bothers you, you've got two choices: either fix it now, or don't fix it now. Don't fix it now is easy: make a note and move on.
But if you need to fix it now: stop what you're doing and _roll back your changes_. This forces you to keep your changes small and self contained. There are other steps involved, such as writing down your steps to build a dependency graph, but the simple act of rolling back your changes is quite powerful. Don't want to throw everything away? Then maybe the thing you're trying to fix is not that urgent, and you can deal with it later.
Tangentially related: I wish there was a way to record my code changes as executable actions on the AST, and attach it as metadata to a commit. As a simple example: I want to change the signature of a method, rearrange the parameters and add a new one. All you get from source control is some text changes, without an understanding of what those changes mean. If you need to reapply those changes you're stuck with the diff tools of a text editor. But if my IDE could understand those changes, it would be trivially easy to redo the refactoring actions. It might not be automatic (what value should this new parameter have if there is no default?), but at least you have some higher level tools available for dealing with that. To take this further: the changes might not even be in your codebase. A library was updated with breaking changes? Just execute the changes you can, and prompt for action when the changes fail.
I feel that recording your changes as actions on the AST is a powerful concept that needs further exploration. Sure, some things don't make sense to record at a higher level. If I add a new method, it's much more readable to just add the code as text. But for anything remotely complex it would be great if I could express my intentions as executable actions instead of (or in addition to) textual changes. Find all calls to functionA where parameterX is-a typeY and parameterY.value2 is not null, and make sure the caller sends the result of functionA to functionB in a new transaction. Kind of like how you can record steps with Selenium IDE and use that to create something with clean API calls.
And the best part: because you've already made the change it's easy to test if your higher level change results in the same change in your source code.
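A minimal sketch of what I mean, using Python's stdlib `ast` module as a stand-in for a real IDE's semantic model. Everything here is hypothetical: the function name `send`, the argument reordering, and the added `timeout` parameter; the point is just that the refactoring is a replayable action, not a text diff.

```python
# Sketch: a signature change recorded as an executable action on the AST.
# The refactoring "swap send()'s two positional args and add timeout=30"
# is hypothetical; any other codebase calling send() could replay it.
import ast

class SwapArgsAddTimeout(ast.NodeTransformer):
    """Rewrite send(host, port) -> send(port, host, timeout=30)."""
    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if (isinstance(node.func, ast.Name) and node.func.id == "send"
                and len(node.args) == 2):
            node.args = [node.args[1], node.args[0]]
            node.keywords.append(ast.keyword(arg="timeout",
                                             value=ast.Constant(30)))
        return node

src = "result = send(host, 8080)"
tree = SwapArgsAddTimeout().visit(ast.parse(src))
print(ast.unparse(tree))  # result = send(8080, host, timeout=30)
```

Because the change is expressed against the tree rather than the text, replaying it on a different checkout (or after a library bump) degrades gracefully: calls that match are rewritten, and anything that doesn't match can be surfaced for manual action.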
The frustrating thing isn't the small problems; it's that in chasing these small problems one might find a big one that isn't as easy to fix: https://www.youtube.com/watch?v=AbSehcT19u0