
Lessons Learned from Software Rewrites - amartinsn
http://alexmartins.me/2016/07/28/lessons-learned-from-software-rewrites.html
======
Terr_
I've been dealing with a corporate ERP system for some time now, and --
whether rewrite or refactor -- the biggest challenge has been that nobody is
willing to assert how it _ought_ to work.

At best, it's "reverse engineer it, tell me what it does, and then I'll tell
you what feature I'd like changed."

~~~
rtpg
This is a pretty eternal problem in software development. I now think that it
is our job as developers to properly identify how things should work.

We're often not experts in the domain we're working on, but by asking
questions about a user's workflow on a more holistic level, we probably have
more insight into what workflow would be the most productive from a systems
standpoint.

Of course, this is a back-and-forth. Sometimes users aren't sure of what they
want. And sometimes we're not sure of what is the "best" system for what the
users want.

Instead of asking the user where the button should be, ask them when they
press the button.

~~~
blowski
Behavioural driven development (BDD) encourages this process. The developers
flesh out specific examples with the business people using readable language,
and that becomes user acceptance criteria.

It's no panacea, but it helps because it tries to prevent the business people
dictating tech specifics, which the developers take too literally.

For example, a business person says "when I press the submit button the
results must be stored in MySQL". What they really mean is that whatever they
entered into a form must be accessible later, but the developers now believe
they must use MySQL and have submit buttons.

By focusing on the "When I complete the form with data x, when I look at the
form again later, then I see data x" you've moved the conversation to the
business requirements and away from the perceived technical requirements.
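
As a rough illustration of that shift, here is a minimal Python sketch of the
"complete the form, look again later, see the same data" criterion written
against behaviour only. Every name in it (`InMemoryFormStore`,
`complete_form`, `view_form`) is hypothetical and not taken from any BDD tool;
the point is only that nothing in the check mentions MySQL or a submit button.

```python
class InMemoryFormStore:
    """Hypothetical backend; MySQL, a file, or anything else could sit here."""

    def __init__(self):
        self._saved = {}

    def save(self, form_id, data):
        self._saved[form_id] = dict(data)

    def load(self, form_id):
        return dict(self._saved[form_id])


def complete_form(store, form_id, data):
    # "When I complete the form with data x"
    store.save(form_id, data)


def view_form(store, form_id):
    # "when I look at the form again later"
    return store.load(form_id)


def form_data_is_kept(store):
    data = {"name": "Ada", "email": "ada@example.com"}
    complete_form(store, "form-1", data)
    # "then I see data x"
    return view_form(store, "form-1") == data
```

Any store that offers `save` and `load` can be passed to `form_data_is_kept`,
so the storage decision stays with the developers.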

~~~
auvrw
> Behavioural driven development (BDD)

[https://cucumber.io/](https://cucumber.io/)

is a particularly awesome idea for specifying behavior. really want to try it
out for a project sometime: the idea is to write the spec in these domain-
specific "gherkin" files.

i find it comparatively difficult to get buy-in on collaborating on
anything machine-readable, however, unless it's from people who are
comfortable with general-purpose programming languages, anyway.

nonetheless, in an industry where it's fairly standard for designers to build
in domain-specific languages, i don't think it's an unreasonable skill for
product managers to consider acquiring.

any user experience stories from cucumber, etc. on rewrites or otherwise?

~~~
lifeisstillgood
In my experience it's a joke. It's a way for devs to write executable document
strings for the tests they also write.

Absolutely no business person who is responsible at the level BDD works (the
head of marketing for example) ever ever ever writes a working BDD description
or frankly bothers reading them.

Maybe just maybe they are useful intermediaries for Business analysts but most
of those have been fired.

FitNesse tests have more traction but then this is just exposing your API for
user tests.

~~~
sanderjd
This matches my experience. I used to somewhat blame the business folks for
being too lazy and/or pompous to bother with their side of the bargain, but
I've come to think it's just generally a solution to the wrong problem. The
hard part is correctly translating from what is asked for to what is
implemented, and executable requirements don't have anything to say about that
step. In cucumber, I can take any arbitrary Given / When / Then statement and
implement it in any way, including ways that are completely wrong and (worse)
ways that are subtly wrong. The translation code just becomes yet more code
that may or may not be correct, and in my experience, it can be a lot of code.
When you find yourself wanting to write unit tests for your acceptance test
pseudo-english translation layer code, you're deep in a rabbit hole.
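
To make the "subtly wrong" case concrete, here is a small Python sketch of a
hand-rolled step registry. It is not Cucumber's API; the `step` decorator, the
`run` function and the basket scenario are all assumptions made up for
illustration. The scenario reads sensibly and passes, yet the `Then` step
quietly accepts a negative item count.

```python
import re

STEPS = []


def step(pattern):
    """Register a function as the implementation of a plain-language step."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


def run(scenario, context):
    for line in scenario.strip().splitlines():
        # Drop the leading Given / When / Then keyword
        text = line.strip().split(" ", 1)[1]
        for pattern, func in STEPS:
            match = pattern.fullmatch(text)
            if match:
                func(context, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {text!r}")
    return context


@step(r"a basket with (\d+) items")
def given_basket(context, count):
    context["items"] = int(count)


@step(r"I remove (\d+) items")
def when_remove(context, count):
    context["items"] -= int(count)


@step(r"the basket is empty")
def then_empty(context):
    # Subtly wrong: "empty" should mean exactly zero, but this also
    # accepts a negative count
    assert context["items"] <= 0


SCENARIO = """
Given a basket with 2 items
When I remove 3 items
Then the basket is empty
"""
```

Calling `run(SCENARIO, {})` raises nothing and leaves the basket at minus one
item, which is the kind of mistake the readable text cannot reveal.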

~~~
auvrw
i think this (the parent post) is truth: i've had conversations a/b behavior
specification where edge cases simply aren't taken into account. so the
specification would say, for example

"it does this when anyone clicks a button."

whereas there's often some implicit question, like

"what happens when two people click the button at the same time?"

and that kind of concern -- the logical implications of requirements -- is not
always handled by top-down requirements management... i'm almost certain i saw
a fowler article on "conversational" requirements management recently, and
that made a lot of sense.

~~~
blowski
In general, BAs should not be writing the stories, but they should be able to
read them together with the developers.

I find that it helps, but it doesn't solve the whole problem. The most useful
bit is the stories with examples, as those are what are so often missing from
specs.

The problem with edge cases is that they are infinite. If there's an edge case
that's likely to happen (ideally based on evidence of past behaviour), then it
can either be in the spec or just in functional tests.

I like to think that BDD is like writing the documentation first. Would you
put the edge case in the documentation? If not, it probably isn't the BA's
responsibility to decide what should happen, and it shouldn't have a Gherkin-
style story.

But it's not a silver bullet. It's a tool that helps solve some problems on
some projects in some teams. Usually, it helps in cases where the domain
experts are not very technical and where the domain problem is a fairly simple
set of business rules.

------
cottonseed
When I hear about rewrites, I always think about this comment:

[https://news.ycombinator.com/item?id=11554288](https://news.ycombinator.com/item?id=11554288)

~~~
iammyIP
I expected some wisdom, but that comment basically says that a rewrite either
isn't needed because programmers are already excellent in the first run, or
the rewrite will end with a similar shit code in paraphrase, because
programmers have learnt nothing. Also it takes time, which costs money...

~~~
hinkley
This is probably going to sound weird, possibly cheesy, but cottonseed got
something out of that post, possibly more than I intended to convey, so I
hesitate to follow up for fear of taking that away.

What I think I was getting at then, and certainly believe now, is that some
people grow with the code, learn from many of their mistakes, and do the long
quiet job of expunging their worst errors as they go. Their code is the best
proof that they could accomplish a rewrite, because it already is a rewrite.
They just didn't need to stop the world to do it, or at least not all at once.
Perhaps a month here and a couple weeks there.

For the rest of the time it has been like the Ship of Theseus. All of the
parts are new, and yet it is the same ship.

Meanwhile the rewrite guys are continuing with their mess and hoping they get
a do-over, putting off things hoping for Some Day. And on that magical day,
all of the bad habits of the entire team (management included) will correct
themselves overnight.

Bad habits are too hard to break. The best of us know that you don't abstain
from bad habits, you crowd them out with new, better ones. But that takes
time, practice and determination.

If you want to condense it down to nothing new under the sun, that's fair, and
I would offer that I wanted to believe the rewrite folks were onto something,
but at the end of the day, breaking down the problems and using relentless
refactoring seems to be the only thing that works, without changing the
definition of success to achieve victory.

~~~
animal531
Additionally, as time passes you get new people working on the system who do
not understand it as well as the original designers/developers, simply because
they already have the system and their knowledge becomes pigeonholed inside of
it. They don't have the original scope.

And then to top it off; if you did rewrite the system you not only need to
reverse engineer N years of code, but also N years of bug fixes and random
miscellaneous snippets that made it into the system and that no one who still
works at the company understands the purpose of any longer.

------
hcarvalhoalves
Sometimes a better approach is to wrap the system you want to deprecate rather
than have a parallel implementation. First, because you might not fully
comprehend the current implementation to replicate it. Second, because in real
life often there's not enough time, budget and coordination between teams to
migrate old clients.

Since their main requirement was having a REST API for new clients, and
apparently the old implementation worked fine, they could have implemented an
EJB -> HTTP layer on top, and have new features talk straight to the DB.
Eventually, they replicate all the features talking straight to the DB (w/ the
added benefit they can test against the codepath going thru the old system).
Finally, death by phagocytosis.
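
A minimal Python sketch of that wrapping idea, under heavy assumptions: the
`LegacyService` class stands in for the old EJB-backed system, `Facade.handle`
stands in for an HTTP layer (no real server is involved), and the dictionary
stands in for the shared database. None of these names come from the article.

```python
class LegacyService:
    """Stand-in for the old system that would sit behind EJB."""

    def __init__(self, db):
        self._db = db

    def findArticleById(self, article_id):
        return self._db["articles"].get(article_id)


class Facade:
    """HTTP-style front door: old features are delegated, new ones hit the DB."""

    def __init__(self, legacy, db):
        self._legacy = legacy
        self._db = db

    def handle(self, method, path):
        parts = path.strip("/").split("/")
        if method == "GET" and len(parts) == 2 and parts[0] == "articles":
            # Existing feature: delegate to the old code path unchanged
            article = self._legacy.findArticleById(parts[1])
            return (200, article) if article is not None else (404, None)
        if method == "GET" and parts == ["articles"]:
            # New feature: implemented straight against the database
            return 200, sorted(self._db["articles"])
        return 404, None
```

Once a delegated branch has a direct implementation, the two code paths can be
compared on the same requests before the legacy call is dropped.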

~~~
allendoerfer
I think the problem was that they had to implement new features for the
clients only supporting the old version, too. At this point you are left with
an HTTP wrapper around your old software, which does not qualify as a rewrite.

~~~
zigzigzag
Well, yeah.

It sounded like an HTTP wrapper is all they actually needed. Or in fact any
well specified wire protocol with lots of implementations. It could have been
protocol buffers over raw sockets also.

------
lgleason
Strangler pattern FTW, well that and never have two systems share the same
database. Always have one source of truth.

~~~
Zardoz84
Could be worse. Imagine a situation where you have to work with an old Java
code base full of workarounds that made sense ten years ago but not any more,
and you are using obsolete library versions like Hibernate 2. Add that you can
find similar functionality implemented in totally different ways. Add that you
can find abuse of some programming patterns. Add that you have your own
XML-based framework where you end up doing XML-based programming, with no sane
way of debugging it except the equivalent of debugging with
"print("CACA"); return -1". Add that for some operations, the Java code must
do the same things as a heavy client written in Visual Basic 6 that works
directly against the same database. Finally, add that the database design
isn't the best (normal forms? What are you talking about?) and that you are
stuck with an internal tool that does the same thing as Liquibase, but worse,
and that is based on some old abandoned Apache libs that nobody really
understands or knows how to modify. At least we know that we have a problem
and we are trying to fix it.

And this is my diary work.

PS: I have a friend who lives in a worse situation, where they have an
application based on Java 1.5 applets, with bad or nonexistent documentation,
and nobody knows how it works internally.

~~~
dasil003
Are you also Brazilian like the OA? I ask because I know diário means daily in
Portuguese, but in English "diary" means a personal journal and doesn't make
sense in this context.

~~~
Zardoz84
Spanish. Thanks! I should have written "daily", no?

~~~
danielvf
Yes, indeed!

------
faragon
In my opinion, asking for "removing unused features" is one of the
consequences of full rewrites. It is also a reason for having to maintain
"dual systems" for years. It is not uncommon to have "dual systems" and then
have to deprecate the "new" one, just because after e.g. 10 years you cannot
get rid of the old one, or even worse: the "new" one performs worse than the
old (e.g. an old simple centralized system working much faster on new hardware
vs. a "new" pseudo-decentralized mess that cannot scale as well because of
inter-node communication overhead).

BTW, every time I read Martin Fowler quoted in a post as full rewrite
justification for the "faith jump", I get scared.

I don't know why the poster didn't implement the new API, use it for the new
features, and eventually rewrite the old API commands (one by one, not all at
once) to call the new API, so that in the end the new API would remain and the
old API would be just some wrapped calls to the new API.
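
A small Python sketch of what "old API as wrapped calls to the new API" could
look like. Both classes and their contracts (a `"name;email"` string for the
old API, dictionaries for the new one) are invented for illustration and are
not the article's actual interfaces.

```python
class NewApi:
    """Hypothetical new implementation that owns the data."""

    def __init__(self):
        self._users = {}

    def create_user(self, name, email):
        user_id = len(self._users) + 1
        self._users[user_id] = {"id": user_id, "name": name, "email": email}
        return self._users[user_id]

    def get_user(self, user_id):
        return self._users.get(user_id)


class OldApi:
    """Keeps the old call signatures and response shapes, delegates to NewApi."""

    def __init__(self, new_api):
        self._new = new_api

    def addUser(self, record):
        # Old contract: a "name;email" string in, a bare id out
        name, email = record.split(";")
        return self._new.create_user(name, email)["id"]

    def fetchUser(self, user_id):
        # Old contract: a "name;email" string out, or None if missing
        user = self._new.get_user(user_id)
        return None if user is None else f"{user['name']};{user['email']}"
```

Old clients keep calling `addUser` and `fetchUser` unchanged, while each
command can be moved over one at a time.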

~~~
amartinsn
That was exactly our approach, for the old HTTP API, we kept the same
contracts (URIs, response body, etc.) For the EJB one, as only one remaining
client was using it, we deprioritized that, but we were going to pick the
features they were using, and make the Java client point to the new API
implementation, also keeping the same contract.

------
Ace17
Joel Spolsky has a very nice writeup about the subject, "Things You Should
Never Do, part 1":

[http://www.joelonsoftware.com/articles/fog0000000069.html](http://www.joelonsoftware.com/articles/fog0000000069.html)

~~~
35bge57dtjku
From the guy who had a new language invented instead of re-writing
something...

~~~
hinkley
I think Joel's articles were probably best when they convinced you of
something you knew but hadn't articulated yet. If you aren't there yet I don't
know that he always delivers.

And if he's selling an idea you've already abandoned, he can leave you flat
for sure. His strength is in introductions, not reconciliations.

------
gbacon
But the siren song of the from-scratch rewrite is so difficult to resist.

~~~
karmajunkie
I think of this as a saner version of the from-scratch rewrite, personally.

------
auvrw
it's difficult to get a picture of all the nuances of this project, of course,
from any quick read.. one thing that the article left me wondering a/b is the
outline of "the new stack" which is "agnostic to any programming language".

i mean, i'm not the biggest Java fan, but it's alright, and there are some
pretty good RESTful frameworks

[https://www.playframework.com/](https://www.playframework.com/)

while the strangler application (always a fowler link, always, in these types
of articles) makes a ton of sense for making sanity out of incoming requests,
i wonder if some effort could have been &/or was saved _behind_ the transport
layer by reusing (possibly with refactor) the existing Java implementation?

whether doing so would make sense would be really situation dependent, of
course, and an "outside" perspective from a quick article isn't going to have
enough information to say anything conclusive. the (lazy) feel is that reuse
is better than duplication of effort.

\------------

aaaaanyway, super-good article. this is one of the kinds of things that's
important to think about as a workaday web dev.

~~~
amartinsn
Hey, thanks for your reply. Your line of thought is very similar to what we
had. We considered refactoring the existing Java implementation. It used a
really old Jersey version, with too many overengineered customizations tightly
coupled to that version. We tried to upgrade it and get rid of some of those
customizations, but the system was so fragile that other parts of the system
stopped working, which made that a really difficult road to drive. We wanted
to simplify the design. We chose Scala and Finagle for the new stack.

------
amelius
> So once a functionality was implemented on the new stack, with all
> characterization tests passing, all we needed to do was to tell the router
> to use this new implementation, instead of the old one, and everyone
> consuming this endpoint would be automatically using the new implementation!

How would that work without also migrating the database and while keeping the
databases of the old and new implementations in perfect sync?

~~~
biot
Migrate the DB in a backwards compatible way. So both version n and n-1 can
use it. The next update (n+1) can then remove support for n-1.
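
One way to picture this is an additive ("expand, then contract") schema
change. The Python sketch below uses the standard `sqlite3` module with an
in-memory database; the `users` table and the `name` / `display_name` columns
are assumptions for illustration, not the schema from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada')")  # written by n-1


def migrate_to_n(conn):
    # Additive change only: the old column stays, so version n-1 keeps working
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.execute("UPDATE users SET display_name = name")


def read_name_old(conn, user_id):
    # Version n-1 only knows about the old column
    row = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,))
    return row.fetchone()[0]


def write_name_new(conn, user_id, value):
    # Version n writes both columns so that n-1 still sees fresh data
    conn.execute(
        "UPDATE users SET name = ?, display_name = ? WHERE id = ?",
        (value, value, user_id),
    )


def read_name_new(conn, user_id):
    row = conn.execute(
        "SELECT display_name FROM users WHERE id = ?", (user_id,)
    )
    return row.fetchone()[0]


migrate_to_n(conn)
write_name_new(conn, 1, "Ada Lovelace")
# Version n+1 would stop writing `name` and only then drop the old column.
```

After the migration both `read_name_old` and `read_name_new` return the same
value, so either version can be routed to without a separate sync step.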

------
ensiferum
This isn't a rewrite in the traditional "big bang" sense of a rewrite but
normal software evolution and maintenance.

Anyway sounds like their approach was quite reasonable, just a bad title for
the blog entry.

~~~
amartinsn
It's an incremental rewrite. Rewrite one feature, switch everyone to use it,
then destroy the old implementation. Do the same for all other features.
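
A rough Python sketch of that loop: a router keeps one handler per endpoint,
and an endpoint is switched to the new implementation only when a
characterization check shows the old and new handlers agree on sample inputs.
The `Router` class, the two handlers and the sample inputs are all
hypothetical stand-ins.

```python
def old_get_article(article_id):
    return {"id": article_id, "title": f"Article {article_id}"}


def new_get_article(article_id):
    return {"id": article_id, "title": "Article " + str(article_id)}


class Router:
    """Maps an endpoint name to whichever implementation currently serves it."""

    def __init__(self):
        self._routes = {}

    def register(self, endpoint, handler):
        self._routes[endpoint] = handler

    def dispatch(self, endpoint, *args):
        return self._routes[endpoint](*args)


def behaves_the_same(old, new, sample_inputs):
    # Characterization check: the old system's output is the expected output
    return all(old(value) == new(value) for value in sample_inputs)


def switch_if_equivalent(router, endpoint, old, new, sample_inputs):
    if behaves_the_same(old, new, sample_inputs):
        router.register(endpoint, new)
        return True
    return False
```

Callers keep using `router.dispatch("get_article", ...)` throughout, so the
switch is invisible to them; the old handler can be deleted once nothing is
routed to it.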

------
gcb0
to grasp how bad that system is: globo is the Brazilian fox news, and only got
on the web 10 yrs ago... and they're already on version 4

------
emdeha
Great article! It's interesting though, how did you convince management to
remove features no one was using?

~~~
tonyedgecombe
If no one was using them why do you need to convince management?

~~~
emdeha
Because management decisions ain't always driven by logic.

The simplest reason may be attachment to the feature--the feature is of no use
but the person who invented it is scared by the possibility of removing it.

