Compared to the first edition of the book, the second edition has the same functionality but is much better organized, which will make future editions much easier to develop.
No other book improved my coding style as much as the first edition of Refactoring. On a superficial level, it teaches you how to "clean up" code, according to various patterns. But what it really does is give you a feeling for why the code should be improved in the first place. Highly recommended.
Once one adopts the refactoring mindset, its lessons and approaches can also infuse new code, aka "prefactoring". :) Ken Pugh wrote a 2005 book called "Prefactoring: Extreme Abstraction - Extreme Separation - Extreme Reliability", which won a 2006 Dr. Dobb's Jolt award.
I think almost every programmer would benefit from reading "Refactoring". It sharpened my skills quite a bit, and I was already experienced before encountering the book.
The book was an easy read and an efficient use of my time. I expect that the rewrite in JavaScript will further improve the information density.
I bought the book ages ago and remember not really finding anything of value in it. I don't mean to sound like someone from /r/iamverysmart but I recall thinking I would benefit in the way I did from the design patterns book: by giving me terminology for things I already did. But as far as I remember, the names were really obvious ("rename parameter", "extract method", etc.) and the instructions equally so. Can you give an example of something you learned from it?
I remember having exactly the same experience, including the anticipation that I would benefit from it the way I would from Design Patterns. For most of the refactorings in the book, a bullet point in a list would've seemed a bit excessive, never mind a step-by-step sequence of instructions. In the end I literally threw the book away – the only technical book I've ever done that to.
I distinctly remember one of the things in the book, which in retrospect isn't even really a refactoring, called something like "null object". The idea was that if you were going to return null from a function (e.g. get student by ID but that ID doesn't exist) then you could instead return a concrete singleton object that represented null. That way, if you forgot to check for it then at least your program wouldn't crash. Now, the merits of this idea are debatable and I don't think I've ever used it in the years since. But I strongly remember thinking at the time that it was the only idea in the book that wasn't totally obvious.
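For illustration, here is a minimal JavaScript sketch of that idea (all names here are hypothetical, not taken from the book):

```javascript
// Null object pattern: a frozen singleton stands in for a missing Student.
// Lookups return this object instead of null, and its members provide
// safe, "neutral" behavior (empty list, placeholder name).
const NULL_STUDENT = Object.freeze({
  id: -1,
  name: "Unknown Student",
  isNull: true,
  enrolledCourses() { return []; }, // neutral behavior: no courses
});

const students = new Map([
  [1, { id: 1, name: "Ada", isNull: false, enrolledCourses() { return ["CS101"]; } }],
]);

function getStudentById(id) {
  // Map.get returns undefined for a missing key; fall back to the null object.
  return students.get(id) ?? NULL_STUDENT;
}

// A caller that forgets to check for "missing" still gets well-defined behavior:
const names = [1, 2].map((id) => getStudentById(id).name);
// names is ["Ada", "Unknown Student"] — no crash, but also no error signal
```

Whether the silent fallback is a feature or a trap is exactly the debate in this thread.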
There was a good chapter early on about testing, which I think was my first exposure to that idea. It was a bit brief but that's not a surprise given it was a bit of a digression from the main topic. A whole book on that might've been quite a bit more interesting.
> if you forgot to check for it then at least your program wouldn't crash
If I forgot to check for null (or missing data), I would rather have my program raise an exception (to prevent data corruption). Then my error-handling code will email that exception to me, I will analyze the exception message (NPE) and stack trace, and then I will fix my program and deploy it to production.
Depends on the context. In game engines it's quite common to just render a generic pattern when a texture can't be found, instead of crashing the whole game. That's a good use for this pattern; a bank transaction backend would be a less suitable example.
I use this pattern fairly regularly in a variety of ways. Here's a quick piece of example code that might help explain this a bit better. I'm writing this example in C#, but I usually work in other languages (Java / Ruby / Python / JS).
Before:
public Student GetStudentById(int id)
{
var record = db.findStudentById(id);
if (record == null) return null;
return new Student(record.id, record.name, ...);
}
Then in say a student record controller class:
public View Get(int id)
{
var student = GetStudentById(id);
if (student == null) return View("NotFound");
return View("Display", student);
}
After:
[ContractAnnotation("=> notnull")]
public StudentSearchResult GetStudentById(int id)
{
var record = db.findStudentById(id);
if (record == null) return StudentSearchResult.NotFound;
return new StudentSearchResult(record.id, record.name, ...);
}
Then:
public View Get(int id)
{
var student = GetStudentById(id);
return new View(student.ViewName, student); // look, no null check because student.ViewName is always valid here
}
Note: of course this is entirely simplified for the purposes of example. Perhaps your real world example doesn't return null from the db, but a RecordSet with either 0 or 1 elements. Perhaps your db uses exceptions that you translate into results like this.
The cool thing there is that you get tool support (from e.g. ReSharper) telling you that this is entirely safe to do without a null check, and tool support when it's not.
But then the 'Student' object does not represent a student but the result of a search which may or may not be a student, and this result decides its own view. Effectively the model is coupled to a specific view, which seems wrong to me. I would much prefer to have the search return an Option<Student> and then have the controller select the view based on this option. This also makes it explicit that the search may not return a result, which is not really explicit if it always returns a Student instance.
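A quick JavaScript sketch of the alternative being described (names are illustrative only): the lookup returns an explicit found/not-found result, and the controller, not the model, picks the view.

```javascript
// Hypothetical in-memory store for the example.
const students = new Map([[1, { id: 1, name: "Ada" }]]);

// The search returns an explicit result, a poor man's Option<Student>.
function findStudent(id) {
  const record = students.get(id);
  return record === undefined
    ? { found: false }
    : { found: true, student: record };
}

// View selection stays in the controller layer, and the "no student"
// case is explicit at the call site.
function getView(id) {
  const result = findStudent(id);
  return result.found
    ? { view: "Display", model: result.student }
    : { view: "NotFound", model: null };
}
```

This keeps the model ignorant of view names, at the cost of one explicit branch in each controller.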
Thinking more about this, null objects are meant to be neutral with respect to some behaviour, but that behaviour isn't really specified alongside the object. It seems like a weaker reinvention of the idea of an "identity element" from e.g. the `Monoid` typeclass. I say "weaker" because `Monoid` specifies the neutral element (`mempty`) alongside the behaviour (`mappend`), whereas the null object pattern does not.
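To make the analogy concrete, a tiny JavaScript sketch (JS has no `Monoid` typeclass; this just pairs an operation with its identity by hand):

```javascript
// A monoid pairs an associative operation with its identity element.
const concatMonoid = {
  empty: "",                 // the "mempty": neutral for concatenation
  append: (a, b) => a + b,   // the "mappend"
};

// The identity is specified alongside the behaviour it is neutral for,
// which is what makes reduce with an initial value well-defined:
const joined = ["re", "factor", "ing"].reduce(concatMonoid.append, concatMonoid.empty);
// joined === "refactoring"

// A null object is also meant to be "neutral", but nothing in the pattern
// states which operations it must be neutral for.
```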
Not really, it is almost the opposite. The mistake was to have null be a valid value for all object types. Languages where nullability is represented in the type system (e.g. TypeScript) do not have this problem. But the null-object pattern reintroduces the same problem in a new way.
In languages without "everything can be null", the null-object pattern is what is always done.
The problem with "everything can be null" is the type system is prevented from finding very common "you can't do this thing" errors.
Without it, everyone uses null objects (or Maybe, a generic form of this) and there are no segmentation faults/dereferencing errors. (Yes, there can be other sorts of errors, but the problem has been made far better.)
The "Null object pattern" is a particular pattern where missing values are represented with an object with "neutral" behavior. This is not the same as an actual null or an option type.
I've read and enjoyed quite a few of Fowler's articles. The things I've taken from his writings have been concretely valuable so far too. I'm not experienced enough though to evaluate how his ideas stand up in big or complicated projects.
I can't comment on other articles, but I would be careful with some suggestions about mocking everything in automated tests. I've seen a couple of totally botched projects because of that approach.
Yes, the biggest problem was the lack of integration* tests: you can get a false confidence that everything is working, but you only know that the individual cogs are built to spec, not that the machine is working.
Second problem was that refactoring became much harder, because the tests were deeply tied to the implementation (are methods X, Y and Z called with such and such arguments?).
* using the definition of "integration" as "integrating components of the project", not "integrating with external services" (which is the definition I personally prefer).
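A sketch of the kind of implementation-coupled test being described, in JavaScript with a hand-rolled spy rather than a mocking framework (all names hypothetical):

```javascript
// Minimal spy: records the arguments of every call.
function makeSpy(returnValue) {
  const calls = [];
  const fn = (...args) => { calls.push(args); return returnValue; };
  fn.calls = calls;
  return fn;
}

// The unit under test: sums prices plus a per-item tax from a collaborator.
function totalPrice(items, taxFor) {
  return items.reduce((sum, item) => sum + item.price + taxFor(item), 0);
}

// Brittle style: asserts HOW totalPrice works (which collaborator is called,
// how many times). Inlining or restructuring taxFor breaks this test even
// when the observable behavior is unchanged.
const spy = makeSpy(0);
totalPrice([{ price: 10 }, { price: 20 }], spy);
const interactionChecked = spy.calls.length === 2;

// Refactor-friendly style: asserts only the observable result.
const flatTax = () => 1;
const behaviorChecked = totalPrice([{ price: 10 }, { price: 20 }], flatTax) === 32;
```

The second assertion survives refactoring of the internals; the first pins them in place.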
For most people, I think his online catalog of refactorings and blog articles are sufficient. Though I think the first edition is a good book, I wouldn’t need it if I didn’t do research on refactoring tools.
What's the target demographic for something like this? It seems like the trivial part of refactoring.
I've seen a similar, entire book dedicated to database refactoring that followed roughly the same pattern. Chapters with names like "The Change Table Name Refactor" felt kind of bizarre, when what I was hoping to find was advice about refactoring large databases in practice (moving data to new service boundaries, etc.). Practical advice on building systems to move data and such seems to be just missing.
I read the first edition when I was nearly 2 years into my professional career. I understood the basic 'what' and 'how' of software development.
Refactoring helped me start to understand the 'why' and 'when'. Why is design A better than design B? When is it time to tidy up this code? Which choice will be easier to change later? That kind of thing.
By code abstractions, do you mean low level data structures[1]? There are those but they're different from mid level design patterns[2][3] or high level architectural patterns[4].
Well, I can probably mine the top 10 books and come up with a good list of 100 abstractions. But I am sure someone has done a much better job of this than I would.
> When I choose a language for examples in my writing, I think primarily of the reader. I picked Java because I felt that most people would be able to understand the code examples if they were written in Java. That was the case in 1997, but how about in 2017? [...] Such a language needed to be widely popular, among the top half a dozen in language popularity surveys. It really helps to have a C-based syntax, since most programmers would recognize the basic code structure. Given that, two stood out. One was Java, still widely used and easy to understand. But I went for the alternative: __JavaScript__. [...] But the compelling reason for choosing it over Java is that it isn't wholly centered on classes. There are top-level functions, and use of first-class functions is common. This makes it much easier to show refactoring out of the context of classes.
This is a very interesting choice of language. I wonder whether TypeScript could have been a candidate as well. Although not as popular as JS, I believe most of the JS community has migrated to TypeScript (due to obvious benefits), and it is closer to a C-based language.
Whilst a fan of TypeScript myself, I think the last statement is incorrect; I'd be surprised if more than 20-30% of projects started in JS this year are TypeScript.
I'd be shocked if it was higher than 10%, and that's not even the bulk of JS work: the vast majority is brownfield, existing projects, where changing the toolchain is decidedly nontrivial. TS is nice, but it's a long way from ubiquity.
_has_ may be incorrect, but I do feel the switch is happening [0]. Angular, the most popular client-side JS framework, uses TypeScript. React is also moving in the same direction with typescript-react-starter and tsx support.
If you know TypeScript, you're familiar enough with JavaScript to use the book, but if you don't know TypeScript then it's a bit tricky for those readers. It just means fewer people have to dust off their language skills to benefit from the book.
Well, I'm very fluent in Typescript, and often find JS code extremely difficult to understand. The types are a great aid to understanding code. It also seems to be the case that a lot of JS code does not use ES6 polyfills, unlike Typescript code, so it tends to be less clean.
Edit: Upon rereading I see you are talking about the book, not actual projects, so my reply is not all that applicable. Though I'd actually say that Typescript is much more understandable than JS to someone unfamiliar with both.
I am a huge fan of this book. The Pragmatic Programmer and this book are the two that I recommend most.
When I first read Refactoring, IntelliJ hadn't been released yet and refactorings still needed to be done by hand. Automated refactorings are so incredibly useful for improving code quality over time that I don't know how we ever managed without them.
Which way you really want a refactor to go can depend on the presence of a type system, and so does how reliable the refactoring is.
For example, consider the pattern you constantly see in Java where you construct an object and then assign each property through a setter method. The type system will catch it if a property is not set correctly.
In a dynamic language like JavaScript the same is better done with an inline object. The reason being that the type system won't catch it if there is a mistake, and so you want to go for the easiest way to write it. On the theory that less code is less maintenance.
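A quick sketch of the two styles, both transliterated to JavaScript for comparison (names are illustrative only):

```javascript
// Setter-style construction, the style common in Java:
class Point {
  setX(x) { this.x = x; return this; }
  setY(y) { this.y = y; return this; }
}
const a = new Point().setX(1).setY(2);
// Without a type checker, a misspelled setter (say, setZ) isn't caught
// until runtime, so the ceremony buys you little here.

// Inline object literal: less code, so less surface area to maintain,
// which is the trade-off argued for above in a dynamic language.
const b = { x: 1, y: 2 };
```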
However now it is hard to rename a property and be sure that your refactor is correct. Because where does a property come from? It could come from anywhere - over the network, from an API call, the name of a database field - it is easy to miss it.
Which means that not only should the refactor go in a different direction, but how safe you feel in doing refactors will also go in a different direction!
This is a funny subthread considering the first useful refactoring tools were for dynamic languages[0], not static ones. Static types might have some advantages for a refactoring tool, but dynamic languages have their advantages too that might make a refactoring tool easier to create.[1]
Can refactoring tools miss something? Sure, but this holds true for various static languages too. Don't sell the nonsense of 100%. Find me a Java IDE that won't miss reflection calls. (Personally I can point to a tool for Java that moves the bar above 0, but it also isn't 100%.)
Just because you're used to shitty dynamic languages isn't a statement about dynamic languages in general or the possible tooling, anymore than being used to shitty static languages is a statement about static languages in general or the possible tooling. I also see someone conflating "dynamic" with "untyped"/"non-typed" -- there are very few untyped languages, let alone in wide use, "dynamic" doesn't mean the same thing.
I'm looking forward to the JS version of the book just for recommendation purposes because I hope it will put a rest to so many of these threads about refactoring in the absence of a static type system being somehow way harder or impossible.
Ultimately I think the discomfort is from fear, certain programmers practice fear-driven development. "What if the tool (or me, if I'm taking 5 mins instead of 5 seconds to do it) misses something?" It's the same fear with people who can't imagine refactoring without unit tests, but who need to refactor legacy code to get it under test. You can do these things, they're not particularly difficult, but if you're suffering from fear, I'm not going to convince you. Maybe a book + examples would though.
Ralph Johnson: "We originally thought that the lack of static type-checking would make it hard to build a refactoring browser for Smalltalk. Lack of type information is a disadvantage, but the advantages of Smalltalk made it a lot easier to make a refactoring browser for Smalltalk than it would have been for C++ or Java."
Types help eliminate an entire class of unit tests one would need to write before refactoring. Things like newtype also allow enforcing certain constraints while refactoring. So yes, while the end goal is the same, the process is simplified.
I’m not against types. In fact I mostly prefer typed languages.
I’m just stating that there is nothing fundamental about code-refactoring which ties it exclusively to the domain of typed programming-languages, something I consider fairly self-evident.
I mean obviously one can refactor terrible code in any language to help improve its structure right?
That this remark would cause such sub-thread defending types in general (I never said they didn’t have value) has me honestly quite surprised.
Oh man, thanks for posting this. Did not expect to see this coming. I have both of them ("Refactoring" and then the "ruby edition") - it's a part of my vocabulary.
Also thanks to other posters who describe their view of the updates of content.