
Ask HN: I just inherited 700K+ lines of bad PHP. Advice? - ohmygord
So, I've just inherited a very large, very badly written monstrosity. Including JavaScript, template files etc., it breaks the 1 million LOC barrier. I'm looking for some advice and strategies that you guys might have used in similar situations, in particular on:

- getting a handle on the code base
- communicating 'progress' to the client
- not losing the will to live

The software is based on vtiger, an open-source CRM that has a (deserved) reputation of being incredibly badly written, and that has since been badly hacked apart by several different companies with wildly differing ideas. My client currently has 150+ installs and 150+ angry clients.

Words fail me trying to describe the state of the software.

- no niceties such as MVC, ORMs, a DBAL, or a modular design
- all DB queries are inline SQL, with tens of inner joins on most queries
- dizzying call stack, yet reams of copy+paste code

The best part: the code will often query the DB and execute PHP code contained in the response, or load and run arbitrary files and modules as dictated by parsing particular DB fields. The one page I have studied in detail generates 105 DB queries in the simple case.

The DB itself is even worse. There are over 600 tables, as well as views, custom functions, cascades and (but of course) triggers. There is no consistent naming scheme, very few explicit foreign key references (despite being heavily, heavily entwined) and I have already discovered several tables that don't have primary keys, but are referenced by exact string matches on things like date stamps.

I won't mention the table-based HTML, JavaScript, lack of version control etc.

I'm not sure if it's even possible to give relevant advice (besides perhaps 'run screaming'), but if anyone here has come through a similar situation and has any advice to share, I would be deeply grateful.

Help me HN - you're my only hope.
(PS. 2K char limit sux)
======
lhnz
1\. Get it onto version control.

2\. Make sure there is some workable strategy for deploying and testing the
code.

3\. Ask somebody to provide you with a list of the changes, or else try to
create some kind of diff against the original version of the code. If you can
see crazy stuff here then find out who did it...

4\. Ask somebody what the biggest bugs are? Which things are causing clients
the most problems?

5\. Try to establish which convention is 'winning' in the codebase. But you
might want to create a more sensible convention which will allow unit testing
(start this immediately!)

6\. At this point, ask if you can hire people to work on this with you as it's
a big problem, and you need to free yourself up for the rewrite.

7\. If that isn't possible then leave. You have done enough to make your CV
better and a company which passes you something like this does not care about
your career.

~~~
mtrimpe
First read Fowler's 'Refactoring' book; it was written just for you. Then:

1\. Identify a small and easily separable piece of code (what you would call a
component in a _normal_ system.)

2\. Write tests covering every (important?) edge-case of the piece of code you
want to rewrite.

3\. Mercilessly refactor until it's nice and squeaky clean.

4\. Lather, rinse and repeat.

And of course, make sure your client acknowledges that it's a giant
clusterf... and is on board with you pulling the system out of the stone age.

Also, if you want to make life a bit more interesting for yourself, get the
PHP code's AST and programmatically rewrite existing code to shared
conventions for kicks.

~~~
masklinn
> First read Fowler's 'Refactoring' book; it was written just for you.

"Refactoring" is not the tool for the job, although it's a nice sidearm.

What OP needs is the big gun, Feathers's "Working Effectively with Legacy
Code": [http://www.amazon.com/Working-Effectively-Legacy-Michael-
Fea...](http://www.amazon.com/Working-Effectively-Legacy-Michael-
Feathers/dp/0131177052)

As the title hints, it was written specifically and expressly for the "I just
got a huge amount of complete shit of a codebase shoved onto me, how do I
survive" situation. Just check the TOC of part 2 (the meat of the book):
[http://my.safaribooksonline.com/book/software-engineering-
an...](http://my.safaribooksonline.com/book/software-engineering-and-
development/0131177052)

> And of course, make sure your client acknowledges that it's a giant
> clusterf...

That's hugely important. Make no promises of delivery, and make sure the
client understands it's not a cakewalk.

------
ck2
I started writing some suggestions but you know what - someone is dumping this
on you because they didn't care and their predecessors didn't care, etc. They
probably make far more than you for doing far less.

The moment you start touching the code, you are going to start being blamed
for the nightmare preceding you. It could even affect your career if a future
employer researches where you worked previously and gets told you made the
mess in the first place.

~~~
pasbesoin
My thoughts are in line with this. Are you (honestly) being hired or promoted
to _fix_ this mess, or to _keep things running_?

If the former, has the incredible scale and scope of this been properly
identified, addressed, and acknowledged? Are you guaranteed anything near the
resources to (try to) accomplish this (including your own time, without
traipsing far into overtime)? Is the current state documented sufficiently to
obviate any and all future attempts to blame you?

If the latter (more likely, I suspect), well... I guess the simplest question
is, do you have an agenda and an exit strategy that leaves your career intact?
(And your health...)

Maybe, given the particulars, this is a real opportunity for you. But that's
not spelled out at all, nor obviously implied, in your post. And given that
this situation was allowed to develop to this extent in the first place, and
that you have angry clients to deal with, right off the bat, it doesn't sound
promising.

Do you like playing the role of unacknowledged hero who falls on his sword and
is cursed by his clueless fellows, while some other protagonist goes on to get
the girl?

There is _a lot_ of downside, here. What's the upside? Do the organization's
goals and commitments match your personal ones?

------
fallous
I've faced similar Augean stables in the past (and present, unfortunately).

I'd suggest that correcting the DB is one of the last things you can do,
especially if queries are scattered throughout the code instead of in
functions. You could attempt to abstract it with an internal API and as you
update the codebase replace with calls to the API. Once fully abstracted, you
can then focus on getting the DB corrected and only need to modify the new API
functionality.
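
A minimal sketch of that "internal API first, schema later" idea, written in Python with an in-memory SQLite database purely so it runs anywhere; the same shape applies in PHP. The `contacts` table, `ContactStore`, and its method names are hypothetical and not taken from vtiger.

```python
import sqlite3


def connect_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO contacts (name, email) VALUES (?, ?)",
        [("Ada", "ada@example.com"), ("Bob", "bob@example.com")],
    )
    return conn


class ContactStore:
    """The one place where contact-related SQL is allowed to live."""

    def __init__(self, conn):
        self._conn = conn

    def find_by_email(self, email):
        # Parameterized query; callers never see the SQL or the table layout.
        row = self._conn.execute(
            "SELECT id, name, email FROM contacts WHERE email = ?", (email,)
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "name": row[1], "email": row[2]}

    def rename(self, contact_id, new_name):
        self._conn.execute(
            "UPDATE contacts SET name = ? WHERE id = ?", (new_name, contact_id)
        )
        self._conn.commit()
```

Once every caller goes through something like `ContactStore`, renaming a column or adding a primary key means changing these few methods rather than hunting for inline SQL across the code base.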

As to the code itself, sit down and map out all the verbs and nouns in the
system. If you have a Contact noun, what is the definition of that role and
what verbs can be applied to it or what verbs could it do. This gives you a
good map for creating functions that can then be used to replace existing
inline stuff.

Triage the worst bugs or performance bottlenecks and see if they are
particular to a noun and/or verb, which should give you an obvious starting
place to begin refactoring. For emergency hotfixes and such, feel free to just
tweak the existing crap code but otherwise try and work on your functional
units to get ahead of the game.

And always remember, pimpin' ain't easy. ;)

~~~
bobfromhuddle
+1 for this. The first job is to stabilise the system by fixing critical bugs.

As you're doing so, move all those queries into one big fat DB class, and when
you spot groups of related queries, split them out into their own classes.

The next priority should be to get rid of the PHP from the DB - if need be
create another huge class with a zillion if-else statements.

You need to modify the code to simplify it. You don't need to _improve_ the
design, you just need to dumb it down until you can understand where all the
parts are. Stabilise, simplify, then refactor.
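
One possible shape for getting executable code out of the DB, sketched in Python for brevity: the database row keeps only an action name, and the code it used to contain moves into ordinary functions behind a lookup table (a tidier cousin of the zillion if-else statements). The action names and record fields below are made up for illustration.

```python
# Hypothetical handlers: in the legacy system these bodies lived as source
# code inside database rows.
def send_welcome_email(record):
    return f"welcome email queued for {record['email']}"


def recalculate_totals(record):
    return f"totals recalculated for account {record['account_id']}"


# Whitelist of actions the application is willing to run.
HANDLERS = {
    "send_welcome_email": send_welcome_email,
    "recalculate_totals": recalculate_totals,
}


def run_action(action_name, record):
    """Look up a named action instead of executing code read from the DB."""
    try:
        handler = HANDLERS[action_name]
    except KeyError:
        raise ValueError(f"unknown action: {action_name!r}") from None
    return handler(record)
```

Anything in the DB that is not in the table fails loudly, which also gives you a list of the code paths that still need to be moved.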

------
citricsquid
What is most important is what are you trying to achieve. Are you trying to
make the system as stable as possible at the lowest cost to the client or are
you trying to bring this system into a future proof state and the client is
willing to pay for that?

Personally if I were in your position I would explain to the client that I'm
happy to temporarily fix some bugs but long term the system needs to be
rewritten. 700k lines of code is a lot, but the way you've described it I get
the feeling most of that code is needless. Depending on what the system
actually _does_ you could conceivably rebuild in a few months.

~~~
mtrimpe
I've spent quite some time in the cross-component spaghetti code large
companies sometimes write.

I've come to believe that the skill of _not rewriting from scratch_ but
forcing yourself to slowly refactor (as per Martin Fowler's definition)
existing systems into a proper state is one of the most important skills you
can develop.

That way, once you've refactored most of the system (which includes adding
tests for all the important functionality) you can indeed confidently rewrite
everything. If you do it any sooner than that though, you're in for a world of
pain.

~~~
seldo
Second this. The temptation to rewrite from scratch is to be avoided; the code
is going to be a mass of edge-cases, and you can't spot them all at once.
Rewriting will take just as long as refactoring and introduce new bugs instead
of killing old ones.

------
burningion
I was in this exact same situation two years ago, only with an eCommerce
platform which shall remain anonymous. The client had gone through 4
companies, trying to get the project built and finished. Nobody had been able
to wrangle it clean.

Reluctantly, I took on the project, and started working through it. What I
initially estimated would take me a month to untangle ended up taking a year.
That's an entire year in the snake pit. And since it was eCommerce, there was
serious money on the line when it came to bugs. And there were hundreds that I
found.

Just understand the commitment you're making. Make sure your client has the
money and the time to make things right. Ultimately, in my project, the client
insisted on quick hacks to keep competition at bay, and the code dissolved
into a mess. I decided I couldn't keep working with a codebase that was never
given a chance.

Understand what you're getting yourself into. Because you're taking over the
responsibility of loading a massive crap ton of software into your head for
diagnosis. How long do you want to have fragile, crappy, lazy code in your
head? Forget about being Superman, you're not going to save a million lines of
code, you're going to become the builder of the hacks that work around
absurdity. Make sure you understand that.

The burden of broken code you're responsible for, that's always broken in
production is like nothing I've ever encountered. Make sure it's worth it.

------
toddmorey
So the cold reality for this client is that the codebase will have to be
replaced over time. You are not trying to escape legacy simply for the lure of
something new, you are trying to escape insanity.

I would talk to the client about focusing your effort on helping them
transition--a small piece at a time--to a sane architecture. If they aren't open
to that, they aren't your client. I've been a business leader in a position
where we had to make really tough and painful decisions about coding projects
gone awry. I don't envy their position, but continuing forward with this
monster does not seem to be in the long-term interests of the company.

~~~
mattmcknight
This is the beauty of the web- the transition to a sane architecture can be
done page by page- without any visibility to the user. Every time you have to
make a change- fix an old feature or add a new one, you replace it with the
new architecture. Even if it's not a whole page- you can load in a partial
with javascript. The challenge is holding back from going too deep on the
refactoring all at once. Functional tests around the system have to be added
to keep your sanity as part of the change process.
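
A rough sketch of the page-by-page idea, in Python for brevity: a thin front dispatcher sends already-migrated paths to new code and everything else to the untouched legacy application. The paths, `new_contact_list`, and `legacy_app` are placeholders, not anything from the real system.

```python
def new_contact_list(request_path):
    # Placeholder for a page rebuilt on the new architecture.
    return "new: contact list"


def legacy_app(request_path):
    # Stand-in for handing the request to the untouched legacy code.
    return f"legacy: {request_path}"


# Only pages that have actually been migrated are listed here.
MIGRATED = {
    "/contacts": new_contact_list,
}


def dispatch(request_path):
    """Serve migrated pages from new code, everything else from legacy."""
    handler = MIGRATED.get(request_path, legacy_app)
    return handler(request_path)
```

Migrating a page is then one new entry in `MIGRATED`, and rolling it back is deleting that entry.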

------
schulz
1\. I'm so sorry.

2\. Set up a development environment and deploy the code there. Get it
working. With code that large (and with the added wrinkle of executing code
out of the database) changing things is going to be a nightmare of unintended
consequences. Getting a testable environment up will let you find those things
and help you understand what it does.

3\. Get it in version control. This should be number 1: Before you make
changes get a baseline of where it was.

4\. Find a bug that exemplifies the nastiness of the whole situation and make
a fuss. Let everybody know why this bug is so bad and what caused it. This
will give your employer a concrete example to look at when you say "this code
is shite". Harp on this bug.

5\. Fix that one bug. Roll it out. Be a hero.

At this point you'll have a good base line, some credibility, and the
organization will understand what a mess they've got. Now you'll have to
figure out what you want to accomplish: keep it limping along? Improve it?
Rewrite? The above steps will get your feet under you.

------
beck5
I would start by spending a week or two wrapping the application in Acceptance
tests (cucumber/capybara) just so when you do make a change you are able to
quickly find out to a semi decent level of confidence things are ok.
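
As a lighter-weight cousin of browser-driven acceptance tests, here is a minimal "record what it does today, then compare" sketch in Python. `legacy_render_invoice` is an invented stand-in for whatever legacy entry point you wrap; the point is that the recorded output, not your understanding of the code, defines correct behaviour.

```python
def legacy_render_invoice(customer, amounts):
    # Stand-in for legacy behaviour we do not fully understand yet.
    total = sum(amounts)
    return f"<td>{customer}</td><td>{total:.2f}</td>"


def snapshot(cases, func):
    """Record func's current output for each named case."""
    return {name: func(*args) for name, args in cases.items()}


def changed_cases(golden, cases, func):
    """Return the names of cases whose output differs from the recording."""
    current = snapshot(cases, func)
    return sorted(name for name in cases if current[name] != golden.get(name))
```

Record the snapshot once before touching anything, keep it in version control, and run `changed_cases` after every change; an empty list means the visible behaviour is unchanged for the cases you captured.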

I would also recommend Working Effectively with Legacy Code By Robert C.
Martin.

Good luck!

~~~
mattmcknight
"Working Effectively with Legacy Code" is by Michael Feathers.
[http://www.amazon.com/Working-Effectively-Legacy-Michael-
Fea...](http://www.amazon.com/Working-Effectively-Legacy-Michael-
Feathers/dp/0131177052)

------
wpietri
My tips:

 _Raise your rates._ If you just took the gig, you'll have to wait a bit. But
you should price your work such that whether they say yes or no, you have no
regrets.

 _Expose the problem._ Start inventorying the issues. Track them in the same
way that you track other work. As you do things that the client has requested,
track actual vs. ideal time. E.g., "This change took 12 hours; if the code base
were clean, it would have taken 1."

 _Estimate the size of the problem._ Talk in terms of technical debt. E.g.,
"Module X needs 120 hours of work to bring the code to commonly accepted
standards of code quality." The clients are thinking, "We have a system we
paid $1m for, so it's an asset worth $1m." Expose the debt and they will have
a better idea of the true value of the code base.
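
The bookkeeping behind those numbers can be as simple as the sketch below (Python, with an invented log format of task, actual hours, and estimated hours on a clean code base). The clean-code estimates are judgment calls, so treat the result as a conversation starter rather than a measurement.

```python
def debt_overhead(entries):
    """Return (hours lost to code quality, overall slowdown factor).

    Each entry is (task, actual_hours, estimated_hours_if_clean).
    """
    actual = sum(actual_hours for _, actual_hours, _ in entries)
    clean = sum(clean_hours for _, _, clean_hours in entries)
    if clean <= 0:
        raise ValueError("clean-code estimates must add up to a positive number")
    return actual - clean, actual / clean
```

For example, a log of `[("change report header", 12, 1), ("add field", 6, 2)]` gives 15 hours lost and a slowdown factor of 6.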

 _Look for opportunities to declare tactical bankruptcy._ Once you have
numbers, you can show that some portions of the code base will be cheaper to
rewrite than to clean up. Help your clients make good financial decisions
about when to just toss and rewrite particular parts of the code.

 _Don't let them make you crazy._ I'd recommend something like a kanban board
to track work and strictly limit work in process. This system is probably a
mess because the client is insane. Develop some very clear, very firm
boundaries that keep them from driving you crazy as well. If you are lucky,
they will, over time, learn from you to behave rationally about software.

------
pfisch
You should quit. Life is short, there is no reason for you to spend it doing
that.

------
jsmartonly
* Rewrite.

* NEVER modify the existing one. Once you change one line of comment, you own all the code and its problems from that point.

* If a rewrite is not allowed, then ask for a huge pay raise for this work. Basically it is not about money, it is about bringing everyone onto the same page on the status of the existing solution.

* If the above does not work out, prepare to switch to another project, or quit the job totally.

~~~
thaumaturgy
There is absolutely no way to rewrite a million lines of business logic
without ending up with an even bigger mess. See also:
<http://www.joelonsoftware.com/articles/fog0000000069.html>

~~~
jsmartonly
I read that article before and totally agree with the point.

But that situation is different from the one we discuss here.

I do not know more about ohmygord's project, but I basically want to point
out that the non-technical side of it deserves consideration. For example,
people on the same team may not be technical, and/or may think maintaining the
existing solution is simple. I was in a similar situation before, and I was
lucky enough to select the right strategy to deal with it.

------
wanderr
This might sound sacrilegious to the many vim fans here, but get a good IDE,
it will help you get a handle on what the code is doing and let you navigate
around faster, which is especially handy if the execution path for
accomplishing any one thing involves dozens of files. A good IDE will also
point out blatant errors, and a really good IDE will point out potential
errors as well. I personally really like PHPStorm by JetBrains, the code
inspection tool is quite good. I was recently able to cut the size of our code
base in half by using it to identify tens of thousands of bugs, a lot of them
on inspection were "this never worked" type bugs, which with a little digging
I was able to confirm could never be called. Eliminating code also makes
refactoring the remaining code easier because you have fewer interdependencies
to worry about.

------
scotty79
Don't touch anything. Run. There's no glory for you in this.

------
stilkov
First of all, make sure the client and you agree on what the long-term
strategy is. Then get buy-in for a first step in this direction.

If the system is as bad as you make it sound, the long-term goal has got to be
a complete replacement of the existing codebase. That will usually require as
much effort, and thus money, as writing the existing version did. (Experience
shows there is no reason to confidently assume different.)

Then, explain how to get there without doing a (hopeless) complete rewrite in
one big bang:

First, you need to make a set of decisions for the new code you're going to
write, i.e. the language, framework(s), architecture, whatever you want the
new code to be based on.

Next, you try to modularize the existing system so that you can replace one
tiny part.

That's going to be really, really hard - modularizing a system after it's in
production always is. Don't do it all at once: If you've isolated some small
piece so that there's a clear interface (based on a programming language
API, a database interface or (my favorite) some RESTful HTTP API), rewrite
that small piece using your new technology stack and integrate it with the
existing monstrosity.

Once you have done that successfully for some small aspect, you have some sort
of proof that this approach can work.

Then, over the next months (or more likely: years), rinse and repeat.

This is a hugely expensive thing to do, but that shouldn't come as a surprise
– after all, you're replacing the organs in a living body while it's running a
marathon on its last breath. An MBA should understand that the additional cost
is because this strategy drastically reduces the risk.

You can explain to the customer that they can try this out using just a small
part, and decide whether or not they want to continue afterwards. Point out
that you're going to start with those parts that produce most of the existing
pain. Explain that you're helping them to return to a situation in which they
have fewer bugs, can introduce new features quickly and easily, and best of
all, that the end result will be a system that's modularized, hopefully
ensuring that they won't run into the same situation again.

If they expect you to do magic, i.e. maintain the mess and magically turn it
into a good piece of software without being allowed to actually change it
significantly, get out of the contract as quickly as you can.

~~~
algermissen
Stefan humbly omitted a reference to his excellent treatment of breaking up a
monolithic giant. It's well worth a watch, so here you go:
<http://www.infoq.com/presentations/Breaking-the-Monolith>

------
beering
You should make sure to set expectations. If your employer only wants the
minimal amount of maintenance done, then don't do any more. You could go to
heroic lengths to repair the codebase, but if that's not what they're asking
for it will be in vain.

Second, I suggest applying as many tools as you can. A modern version control
system, of course, and keeping any version control history that you inherited
(although it sounds unlikely).

A powerful IDE might also let you start cutting out crap immediately, so try
PHPStorm or Eclipse+PHP (or both!) and see what they can tell you.

And start writing tests as you start making changes, because you'll likely
break something seemingly unrelated when you start changing things.

------
RivieraKid
700k+ codebase and a single developer? That sounds crazy.

Run away from this. Trying to work with this code would make you stressed and
frustrated, which will have a significant negative impact on your
productivity.

If the company plans to add features to this software, they should hire more
than one developer and perhaps rewrite it from scratch.

Edit: Also send your boss a link to this discussion :)

------
pmtarantino
A few weeks ago I was committed to something like that. I just quit the job;
I couldn't sleep at night and I was not making any progress in the first days.
I know there is a learning curve, but it had been two weeks and I couldn't do
anything. Be sure you can work with that before accepting, because it gets
really hard afterwards.

~~~
kellishaver
I had a similar experience a few months ago, myself, with a huge, very poorly
written PHP app. I lasted three weeks; three weeks in which I didn't sleep and
felt constantly agitated while I spent nearly every waking moment at the
computer trying to be productive amidst an ocean of stress. My whole family
suffered from this project, because I became very difficult to live with.

------
ScottBurson
Lots of good advice in here. I lean toward the "run screaming" side myself.

The fact that they've burned through several contracting companies and still
think it's possible to get large numbers of bug fixes in the first month
suggests that they're pretty clueless. It's going to take you two or three
months (if not longer) just to get your head into the code enough that you can
fix anything nontrivial.

If I were advising _them_ , I would say they need two teams: one team of just
two or three people (or maybe just one) solely trying to fix bugs in the
existing code, and the other team of three or four people doing a complete
rewrite from scratch. As others have commented, complete rewrites are normally
a bad idea, but this code base is so far gone I don't think it can be
incrementally refactored into sanity. Oh, and they should expect the rewrite
to take two years.

But despite their experience to this point, it sounds like they're still not
ready to hear that. Which leaves you little choice but to run screaming.

EDITED to add: what these people need to understand is that their demand for
results in a hurry _is what got them into this mess in the first place_. Until
they get that I don't think there's any hope.

------
grey-area
The first thing I would do, before doing any work, would be to sit down with
the client/your boss and explain just how bad the situation is, that drastic
solutions are in order, and that it will take years to get this under control
(with 1m LOC and 150 clients presumably all running customised software, this
would take years to sort out even with a large team working on it). Unless
they understand that from the beginning you will never get the backing you
need to sort this out.

This might be one of the few occasions where a complete rewrite is justified
(if you can keep scope limited to reproducing what you have). You've said the
code is incredibly complex, and if the problem domain is incredibly complex
too, you're probably stuck refactoring. If the problem domain is pretty simple
(a CRM without too many extra features might be), you may be better starting
with your smallest client who uses the product the least, asking for all the
pain points, and things they love, about the current software, and writing a
simple CRM to cover their needs which replicates the features of the current
product, then gradually porting other clients over to the new system and
adding new features to it, while keeping the old code-base in maintenance mode
and fixing serious bugs only. If you do a rewrite you'd have to port the 150
clients over 1 by 1, and leave the other code in maintenance mode - your
primary client may not be at all happy with that.

If that's not possible, you'll have to refactor it slowly while keeping the
code in place, so the first step is to get it into version control, sort out a
sane deployment strategy with testing servers, then try improving some small
isolated areas of the code for one of the clients in isolation. Good luck!

------
thaumaturgy
I have actually worked on a code base like you're describing, on a contract
basis, for a client. I _loathe_ maintenance programming, so the relationship
didn't last very long -- just a few months. So:

1\. Make sure you have a rock-solid contract in place with the client that
will ensure that you get paid, get paid well, and get paid often. Receiving a
check in the mail makes it easier to look at the code. If your payment terms
are anything like, "payment-upon-completion of ...", or, "paid net 30 after
invoice", or anything like that, you simply won't want to work on the code.

2\. If "soul-crushing", "depressing", or "makes me want to hang myself" are
phrases you'd use to describe the code or your state of mind when looking at
it, then go into this project knowing that you're not going to last long.
There are people who genuinely enjoy working on stuff like this. You aren't
one of them.

3\. Everybody that says "rewrite" is dreaming. It is _impossible_ to rewrite
something that large without breaking something and spending too much money.
Re-factoring a function is doable. Re-factoring a thousand-line file is
doable. Re-factoring part of a database is doable. Re-factoring all of it all
at once is starry-eyed fiction. Not gonna happen.

4\. But, if taking ownership of this code base is something you want to do,
then add re-factoring time in to your agreement with the client -- something
like, "20% time spent replacing bad code" -- and focus on the tiniest little
ugly thing you can find, and re-factor that. Start on it, don't stop until
it's done. Keep it in small bite-sized chunks.

5\. Make sure you're getting paid for time spent just getting familiar with
the code base. If you work with it long enough you'll actually get pretty
familiar with most of it, but you want to do that on _their_ dime, not yours.

6\. Get help. They surely realize by now that they've got a mess on their
hands. Talk with them about whether or not you can bring on additional help.
If they flat-out refuse, _run_. (That is what killed my work with my client; I
wanted to move into a position where I managed a junior programmer and focused
on code rewrites and higher-level stuff; they refused, I quit. They wanted an
employee, not a contractor.)

7\. Version control and a sane bug tracking system (Mantis isn't horrible) are
must-haves. If they don't have these, again, make sure they pay for it.

Dealing with a code base like this one is as much about state-of-mind as
anything else. Either you can handle it or you can't. No amount of advice here
will make it more palatable to you if you're not the sort of person that's OK
with inheriting a disaster.

Also, even if you've got some kind of agreement in place with the client
already, it sounds like you've just now gotten your first look at the code.
This, in my opinion, makes it totally OK to go back to the client and re-
negotiate. You can open it with, "I'd like to work with you, but now that I've
seen the project that you want me to work on, I can understand why this has
been a problem for you, and I need to make sure that we can come to an
agreement that will work for both of us so that I can fix this for you." (Or
something.)

~~~
robomartin
"Everybody that says "rewrite" is dreaming. It is impossible to rewrite
something that large without breaking something and spending too much money.
Re-factoring a function is doable. Re-factoring a thousand-line file is
doable. Re-factoring part of a database is doable. Re-factoring all of it all
at once is starry-eyed fiction. Not gonna happen."

I don't think so. Yes, the opportunities to do something like this are rare.
It is up to the project lead and the client to decide whether or not this
makes sense.

Also, keep in mind that a "complete re-write" doesn't necessarily literally
mean that every line of code must be re-written. There's often tons that can
be salvaged.

If the code base is an absolute disaster I would not touch it without the
understanding that the project might entail massive re-writing of portions of
the code base as well as significant structural modifications. Maybe I'm
lucky in that I've never really had to go look for work. I would flat-out
reject a project like this without massive client buy-in.

If it is mission critical for the client and they can afford it there is no
reason not to step back, truly evaluate the situation and consider a
significant redo of the app.

For any non-trivial enterprise having a solid and maintainable code base is
nearly priceless. Is it worth investing a year and the corresponding financial
commitment to fix the problem once and for all? For the right business, yes!
The alternative is to live with a patch-work of code for the next ten years or
more.

Because I move across disciplines I have seen this sort of thing in many areas
outside of just software code-bases.

I have, as an example, seen data processing facilities with millions of
dollars in equipment designed in patch-work fashion that bleed money on a
daily basis. In one such case I proposed a complete redoing of the facility
(in staged fashion in order to not affect business). It was very costly, but
the owners were under such pain due to the constant bleed that they saw the
intelligence in investing a lot of money to lay down an infrastructure that
would withstand the test of time, not to mention stopping the bleeding.

Similarly, I have seen this in faulty processes. Process optimization or
redesign can be critical to a business. The most well known example of this is
the automobile industry.

Car manufacturers like Mercedes were devoting fully 20% of their factory floor
space to repairs. Cars would come off the assembly line with defects that
would have to be repaired after the fact. This consumed a tremendous amount of
time, money and resources.

In sharp contrast to this, companies like Toyota were using an approach that
aimed to have cars come off the line with zero defects. They'd stop the
assembly line when a defect was detected. At first they nearly couldn't make
cars. The philosophy was to ensure that detected defects never re-occurred.
With time cars started to come off the line with few, if any, defects. Most
car manufacturers have now adopted these ideas.

The point is that sometimes a "complete rewrite" is warranted and even
necessary. One cannot categorically state that the idea of a re-write is
"fiction" any more than stating that it is an absolute necessity while being
completely detached from the players and their circumstances. I suggest that
it is for the client and consultant to evaluate and decide.

On a personal note. I don't enjoy working with crap. I enjoy my craft. Whether
it is writing code, designing electronics or mechanical. I enjoy doing good
work and working on quality projects. Life is too short to work on shit
projects. You learn nothing and nobody is happy.

~~~
ohmygord
Yes, I have often said that you should never, ever throw (significant amounts
of) code away and start from scratch. I have worked on two bodies of legacy
code before and more or less used most of the techniques discussed in the
comments, but this is unprecedented - for me - in both size and badness. Client
and I have agreed that I will work for a month and then see where we stand...
I'm hoping that after a month I will have a better feel for what's going on and
can outline either a staged or complete rewrite.

Lots of great advice in here, has lifted my spirits a bit. Especially getting
complete client buy in (which I have internalised but I guess haven't
expressed, either to myself or the client).

~~~
robomartin
That's a good approach. Remember, it isn't your problem. It is your client's
problem. You are there to help solve his problem. If he doesn't care enough
you certainly shouldn't.

In a month you'll know a lot more about what you might be walking into. It is
critical that your client also learn what he has to contend with. In other
words: Communicate profusely throughout the process.

------
kristiandupont
My secret weapon is a folding editor called Code Browser.
<http://tibleiz.net/code-browser/> \-- with this, you can do a non-destructive
(well, it only adds comments) folding of the source. This is a very fast way
to get a better view of what's going on. I've used it many times when trying
to make sense of legacy code.

That is, if you choose to go through with it. My real advice, like that of
many others here, would be to avoid it altogether. It's going to be extremely
frustrating no matter how you attack the problem.

------
michielvoo
Make sure the code (including any stored procedures or code that is otherwise
stored in the database) is in a version control repository. Any change in this
code might have unexpected, subtle consequences (i.e. introduce more bugs), in
which case you'd do better rolling back that particular change.

The next step would be to get a grip on deployment. Automate it, so you can
roll out updates and roll back updates to all clients without breaking a
sweat.

Then set up a proper backlog and bug tracking system, where you can prioritize
bugs and work items. (And maybe open it up for bug reports by clients?)

Just like with a real debt, with a technical debt, seeing progress can help to
keep you going. At this point, you should have a grip on it, it's just still
going to be a lot of hard work. There's good advice on how to approach the
refactoring.

Finally, and this is not related to the code, educate stakeholders in your
organization about the concept of technical debt. (Back it up by time tracking
various work items from the bug tracker.) Somehow your organization got into
this situation, so there may be a problem where new features or custom
features for clients get priority before bugfixes, and are written without
much guidance. Joel Spolsky has written on this subject, you may find his
writings help explain the concept, as well as find a way out of this mess
(like the '12 steps to better software').

Good luck!

------
tpolecat
The problem you will face is that you have no way to verify that you haven't
broken something unrelated when you make a change, because the current
behavior of the system is unknowable; you can't write comprehensive tests for
a codebase that large and that bad because you don't even know what it's
supposed to be doing. The chickenshit nature of PHP and the lack of a sensible
type system and refactoring tools will make things even more difficult.

So, run. Seriously. You're doomed.

------
dillinger
1\. Put it under version control, preferably Git. You will need a lot of the
tools that Git and GitHub provide. A private GitHub account will do, but hosted
GitHub is what I'd prefer.

2\. Get a Test System that has enough horsepower.

3\. Create a deploy script.

4\. Deploy until it seems to work.

5\. Start working with CI and static code analysis. You might get lucky when
it comes to copy paste code. Copy-Paste detection and Coding Standards come to
mind at first but there are a lot more helpers

6\. Automatically create some API documentation. The worst code can't hide what
is inheriting from which class etc. Integrate generation of docs into the CI.

7\. Create some basic so-called "smoke tests". I'd prefer some very basic
Selenium tests opening the most important parts of the app. This is
straightforward. Run them against the app with error reporting set to E_ALL
and error logging turned on. This error.log is your scary list.

8\. Set up single builds and try to integrate with more than one version of
vtiger, PHP and MySQL. Since you have 150 customers, chances are good that you
have 150 different setups.

Note: You haven't changed one line of code yet. Sit down with the customer and
discuss all your findings and metrics.

9\. Start creating separate Git repos, using the above process, for all the
modules that were added by your customer. Integrate them with the build and run
the tests until you have the same number of errors as before. Then start
extending the build to build against your MySQL and PHP versions.

... I could go on forever .. but basically this will get you up and running.
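
To make the "scary list" from step 7 something you can actually show the customer, it helps to boil the log down to counts. Here is a rough sketch in Python, not a finished tool. It assumes the log lines look like PHP's default error_log output (`[timestamp] PHP Notice:  message in /path/file.php on line N`); the function name `summarize_error_log` and the sample paths are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a PHP error_log line, e.g.
# [12-Jul-2012 10:00:00 UTC] PHP Notice:  Undefined variable: foo in /var/www/a.php on line 12
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] PHP (?P<level>[A-Za-z ]+?):\s+"
    r"(?P<msg>.*) in (?P<file>\S+) on line (?P<line>\d+)$"
)


def summarize_error_log(lines):
    """Count log entries by severity and by (file, line) location."""
    by_level = Counter()
    by_location = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip stack traces, blank lines and anything unrecognised
        by_level[match.group("level")] += 1
        by_location[(match.group("file"), int(match.group("line")))] += 1
    return by_level, by_location


if __name__ == "__main__":
    # Hypothetical sample lines standing in for a real error.log
    sample = [
        "[12-Jul-2012 10:00:00 UTC] PHP Notice:  Undefined variable: foo in /var/www/index.php on line 12",
        "[12-Jul-2012 10:00:02 UTC] PHP Warning:  Division by zero in /var/www/report.php on line 40",
    ]
    levels, locations = summarize_error_log(sample)
    for location, count in locations.most_common(10):
        print(count, location)
```

Run after every smoke-test pass, the per-level counts give you a number you can track over time, and the top locations tell you where to look first.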

------
jsmartonly
* A situation like this is no longer only a technical issue. Your solution needs to reflect that.

* This is not the best situation to be in, but if you learn to deal with it and emerge from it, the experience will make you so much stronger. So be ready to quit, but do not quit too early.

Good luck!

(I replied earlier, but the above two points are so important that they are
worth a separate post.)

------
robomartin
Take a few steps back and relax. If you were looking at this from the Space
Station, who would you say has a problem with the code base? That's right,
your client. Not you. Your client.

If he/she has 150 installs and 150 angry clients he/she knows that this thing
is rotten somewhere. Your client may or may not have some technical
understanding but rest assured that they understand business.

Life often boils down to binary decisions. You have two choices. Gracefully
exit and move on or try to help your client.

If you choose option B you've also made another choice: Your first job is NOT
to be a programmer. No, you are going to have to be a teacher.

You have to do your best to explain to your client why he might be sitting on
a ticking time bomb (or whatever you might want to call it). It is imperative
that your client understand that he has handed you an ugly, stinking, putrid
and smelly mess. Without client buy-in I would walk away.

Now, here's the challenge: You have to find a way to communicate the problem
that is not menacingly full of CS jargon and acronyms that mean exactly zero
to your client.

I've had to deal with these kinds of problems before. On one or two occasions
I made the mistake of not securing an understanding with my client and
suffered the consequences. These were miserable walking-through-feces-
infested-mud experiences. Never again. Once I learned that lesson things
changed. My most memorable experience was when I got client buy-in from a
major international corporation and, once they realized that they had a huge
problem, they put me up at the Waldorf Astoria in Manhattan for a full month
(these guys are so big that they have rooms pre-paid for "emergencies").
Imagine a guy in a t-shirt, jeans and sandals showing up at the Astoria. I've
never been looked at like that before. Once they realized who my employer was
things changed. Fuck, the room had marble and gold-plated crap everywhere.

But I digress, the point of that last example is that once a client understands
the degree of the problem on their hands things change. If having a solution
to this problem is important enough there is no end to what they will spend to
fix it. Is it a business-killing problem? Even better.

Judging from your description my proposal to your client --after they really,
really get it-- is to re-write their entire app from scratch.

I would further propose that you are going to need to hire a few more people
(two to five?) in order to get this done as quickly as possible. And, yes,
this will be expensive.

You can use many analogies to explain the problem. I'll leave that up to you.
I've used ideas like that of constructing a building on a foundation of sand
rather than concrete while using substandard supplies rather than industry-
accepted good quality building components. Whatever analogy you use, it has to
convey the severity of the problem without resorting to CS. If your client has
some technical chops you can get into it a little AFTER you are done with your
analogy.

Finally, the most important part: You have to be willing to walk away from it.
You state the problem and explain that it will be expensive. You also state
that you are not interested in anything other than a full re-write of the app
because you are not in the business of doing further damage to your clients.
Respectfully suggest that without full buy-in you'll need to move on and he
will need to find another developer who might be willing to patch this thing
up.

In many ways, it's that simple. Two choices.

~~~
shimsham
Excellent, thoughtful and presumably experience-based reply. This gets my
vote.

------
smoyer
This is a case where having a defined development process and good sharp tools
can be very helpful. Here are the steps I'd use to tackle this problem code
(though it's based on what you wrote above and might need to be adapted as you
learn more).

1) Study how the software is actually used and design the "ideal architecture"
(this may be a moving target).

2) Get the software into a version control system.

3) When a section of the code needs work, first write tests that pass for the
current functionality of the module but fail for the behavior you're trying to
fix.

4) As you repair code in step 3, also migrate the code "towards" your
preferred architecture ... this is going to be a very gradual process so don't
try to complete it in one step and use your tests to verify you haven't broken
the system. This is also a good time to start inserting patterns like MVC/MVP
as it will help. - <http://c2.com/cgi/wiki?TestEveryRefactoring>

5) When you've found "reams of copy+paste code", refactor that code into
utility classes (files, whatever). - <http://martinfowler.com/refactoring/>

6) Establish processes for migrating the database both forwards and backwards
between versions (you'll need a rollback someday).

7) Treat the database schema as source code and refactor it as you work. It
sounds like you're a long way from being able to use an ORM, but have a plan
for migrating the database towards the day you can. -
<http://martinfowler.com/articles/evodb.html>

8) Get the PHP code out of the database ... that's going to be painful but
worthwhile.

9) Get some help! I've used the Sonar source code quality analysis tools on
Java projects for years. There's a PHP plugin for it here
(<http://docs.codehaus.org/display/SONAR/PHP+Plugin>) and it will help you
determine what areas might be worth targeting. It also helps by establishing
style and practice rules that will help get a team coordinated.
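
To illustrate step 6, here is a minimal sketch of versioned migrations that can run both forwards and backwards. It is written in Python against an in-memory SQLite database purely so it is self-contained; the `schema_version` table, the `contact` table and the `migrate` function are invented for the example and are not part of vtiger. One caveat if you port the idea: MySQL commits implicitly on DDL statements, so a failed migration there can leave the schema half-changed and each step should be kept small.

```python
import sqlite3

# Each migration is (version, up_sql, down_sql). Names are hypothetical.
MIGRATIONS = [
    (1, "CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE contact"),
    (2, "CREATE INDEX idx_contact_name ON contact (name)",
        "DROP INDEX idx_contact_name"),
]


def current_version(conn):
    """Return the highest applied migration version, or 0 if none."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, target):
    """Apply 'up' or 'down' steps until the schema is at version `target`."""
    version = current_version(conn)
    with conn:  # commit when the block finishes without an exception
        if target > version:
            for v, up_sql, _ in MIGRATIONS:
                if version < v <= target:
                    conn.execute(up_sql)
                    conn.execute(
                        "INSERT INTO schema_version (version) VALUES (?)", (v,)
                    )
        else:
            # Undo newest first so dependent objects are dropped in order.
            for v, _, down_sql in reversed(MIGRATIONS):
                if target < v <= version:
                    conn.execute(down_sql)
                    conn.execute(
                        "DELETE FROM schema_version WHERE version = ?", (v,)
                    )
    return current_version(conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(migrate(conn, 2))  # roll forward to the latest version
    print(migrate(conn, 1))  # roll back one step
```

The point of writing the "down" half at the same time as the "up" half is that the rollback exists and has been exercised before the day you need it.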

One of the hallmarks of a project like this is that coding styles changed
dramatically during the project's existence - Establishing a style guide
(including patterns and forbidding anti-patterns) can be very helpful.

So in short ... don't "run screaming" but rather sit and think when you feel
overwhelmed. If you can solve a complex problem when writing source code, you
can also solve systemic problems.

Good luck!

------
army
I hope you're getting paid well for this.

One of the most important things is just to manage expectations - it sounds
like you've got a huge task ahead of you and people will underestimate how
long it will take you to fix stuff.

It might also pay to just focus on getting the software into a maintainable
but ugly state.

------
languagehacker
This sounds like a decent candidate for being put into maintenance-only mode
while you gameplan a new product. It's pretty impressive the kind of distance
you can get with a modern framework these days. You've got the other
application there to refer to, so it shouldn't be too hard to port over the
more core logic into a service layer that you can actually test.

What I like about the "start from scratch" approach, though most people argue
against it, is that it gives you an opportunity to shape the entire
development process and architectural philosophy of the product. Sometimes the
tree of good software must be refreshed with the blood of bad projects.

------
joell
Run.

------
d0m
But what do you have to do with that? Is it maintenance? Do you need to fix
bugs?

If it's maintenance/bug fix, I'd suggest starting by writing tests and fixing
things on a day to day basis.

So, for all small tasks that you'll have to do with the code base, just
analyse the safest way to tweak it. More often than not, you'll see that it's
just changing a couple lines. If you need to add new features, just code them
correctly in another part of the program.

And before you know it, you'll understand the code-base. But, _tests_ are really
the most important thing here. Don't try to refactor if you can't make sure
you're not breaking everything.
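
When you don't know what the code is supposed to do, the tests can simply pin down what it does today. Below is a small sketch of that "characterization test" idea in Python. Everything in it is hypothetical: `legacy_format_price` stands in for whatever untested routine you are about to touch, and `record_golden` / `check_against_golden` are names made up for the example.

```python
import json


def legacy_format_price(amount, currency):
    """Stand-in for an untested legacy routine (hypothetical)."""
    if currency == "USD":
        return "$" + str(round(amount, 2))
    return str(round(amount, 2)) + " " + currency


def record_golden(func, cases):
    """Capture current behaviour, right or wrong, as JSON-friendly data."""
    return [{"args": list(args), "result": repr(func(*args))} for args in cases]


def check_against_golden(func, golden):
    """Return the cases whose output no longer matches the recording."""
    mismatches = []
    for entry in golden:
        actual = repr(func(*entry["args"]))
        if actual != entry["result"]:
            mismatches.append((entry["args"], entry["result"], actual))
    return mismatches


if __name__ == "__main__":
    cases = [(10, "USD"), (3.14159, "EUR")]
    # Record once, before touching the code; in practice this goes to a file
    # that lives in version control next to the code.
    golden_text = json.dumps(record_golden(legacy_format_price, cases))
    # After every change, re-run and make sure nothing moved.
    print(check_against_golden(legacy_format_price, json.loads(golden_text)))
```

The recording makes no claim that the old output is correct. It only tells you when a change altered behaviour you did not mean to alter.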

------
cranklin
I have a similar story... only I inherited terribly written JSP. It was just
as bad if not worse than you describe. I ended up re-writing everything from
scratch; now, I am so glad I did.

------
debacle
I remember a client asking for vtiger once. I downloaded the source, and
didn't even get to installation before I fired the client.

I really, really feel for you on this one.

The only thing I can say is: Make a beachhead of clean, working code. Slowly
work your way out. Make it very clear to this client how much of a favor
you're doing them, and give them meaningful status reports (even if all you
did is rewrite the glue between two pieces of code).

------
pyalot2
Chances are infinitesimally small you can fix it. If you start to maintain it,
you'll be the one who gets the blame if it doesn't work. In an absolute best-
case scenario you get to rewrite it whole in your spare time, while trying to
keep the legacy code together with duct tape and chicken wire on your
employer's time.

Get out, get out now. They don't need a maintenance programmer, they need a
ninja programmer, the liquidator kind.

------
RobAley
OK, so everyone else has pretty much covered the arguments for "don't do it,
run" and for "re-write it". But assuming that you either

a) have to maintain it anyway (can't afford to lose job etc.)

or

b) are going to re-write it but don't have a definite spec to know what it has
to do

then you are going to need to try and understand the code base. Here are some
PHP specific tools to help you.

\- Use XHGUI[1] (which is a fork of Facebook's XHProf) to profile the code as
it runs. It can draw call-graphs for you (if you have Graphviz installed)
which will help you to visualise the code flow.

\- Use PHPdoc[2] to generate API docs. This will help you get a simplified
overview of the code to use as a reference.

\- Use Xdebug[3] as you make changes and execute code to get more insight into
how it is running and to trace variables etc. through the execution. You can
use KCacheGrind [4] to visualise the output of Xdebug.

\- Use a staging/development environment for everything you do with this code,
and don't push any changes into production until you really, really have
to. When you do, use version control (e.g. Git, SVN etc.) and use an automated
build system (Phing[5] is a great PHP specific one) to try and keep everything
consistent.

Good luck! Quick plug : I'm currently writing a book [6] about PHP development
(called PHP Everywhere : Programming beyond the web with PHP) which covers the
tools above (albeit not for the kind of job you are taking on!). The one small
mercy you may have when tackling a project like this is that it is written in
PHP. PHP is usually quite a verbose language, which while it doesn't always
produce sexy code, does mean that it's straightforward to read and understand
(at the local level!). An extra space here and there doesn't usually alter the
meaning of the code as it does in some languages!

[1] <https://github.com/preinheimer/xhprof> [2] <http://www.phpdoc.org/> [3]
<http://www.xdebug.org/> [4] <http://kcachegrind.sourceforge.net> [5]
<http://www.phing.info/> [6] <http://leanpub.com/php>

~~~
ohmygord
Hey RobAley, thanks a lot for the tool recommendations... also big thanks for
the person who suggested SONAR+PHP, will definitely look at that too.

I was planning on using phpdoc and xdebug, but haven't ever looked at XHGUI.
Is it significantly different from xdebug? At first glance there seems to be a
fair bit of overlap in terms of functionality.

~~~
RobAley
There is a fair amount of overlap in what they do, the main variation is in
the interfaces and how the information is presented. I tend to use one or the
other depending on the task at hand. They're both of good "pedigree", xdebug
has been around now for about 10 years I think and so has a good amount of
history behind it, and XHProf which XHGui is based on was developed by
Facebook and used against their code base which is probably somewhat larger
than yours (though hopefully better written!). At the end of the day they're
both pretty easy to get up and running (and of course they're free), so I
would suggest giving them both a test run and see which you prefer the feel of
and which better suits your needs in terms of the information it gives you for
your task. Given that you look like you will need all the help you can get,
you might even end up using multiple tools like this to get as much insight
into the code as you can. If you do, be aware that they can often interfere
with each other (or so I've read, I've never tried those two on the same code
base at the same time) so you might need to deploy them on separate
virtualised but identical environments with the same code.

Edit: Just to say, I usually use XHGui for profiling existing code and code in
production, and xdebug for profiling _changes_ to code and code under
development. But that's just because that's how the tools "feel" right to me,
and there's no reason why you can't do both with both.

------
nader
If you haven't done anything like this in the past I would say that as much
as it is a pain in the ass you probably can also learn a lot from it and you
will get out of the job with a lot of experience in refactoring, testing,
bugfixing and deploying. You could also see it as a chance to establish a long
lasting relationship and a boost in confidence and salary if you do it right.

------
cdavid
What does inherit mean here? You were most likely not hired to refactor
700 KLOC, because they would not have been in that position if they had a
decent engineering process in the first place. Obviously, do not rewrite from
scratch: it is a 700 KLOC piece of code, so even at a completely unrealistic
rate of 500 LOC/day it would take you 5-10 man-years to do it, and I
doubt the system is well specified.
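
For reference, the back-of-the-envelope arithmetic behind that estimate, as a tiny Python sketch. The 700 KLOC and 500 LOC/day figures are the ones quoted above; the 250 working days per year is an assumption added here, and the function name is made up.

```python
def rewrite_effort_years(total_loc, loc_per_day, working_days_per_year=250):
    """Rough developer-years needed to retype a code base at a fixed daily rate."""
    developer_days = total_loc / loc_per_day  # 700,000 / 500 = 1,400 days
    return developer_days / working_days_per_year


if __name__ == "__main__":
    # Assumes 250 working days in a year.
    print(rewrite_effort_years(700_000, 500))
```

That works out to about 5.6 years at the optimistic rate, which is the low end of the range; any slower and the number climbs quickly.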

First, I would focus on doing something visible for the client: show that you
can deliver, and do it as quickly as possible. This means: do not try to
understand everything, do not try to get a mental model of the whole thing.
Once you get some buy-in from your customer and people within your client, you
will have more flexibility to negotiate things, and be able to use most of the
technical advice you were given.

If the customer is not willing to enter this kind of discussion _after_ you
showed you could deliver, I would just walk away if you can.

------
zalew
you are doomed. flee.

------
mahasvin
Is there an option to migrate db to another CRM with similar functionality? If
memory serves, vTiger is a sugar clone.

------
naww
<http://programmers.stackexchange.com/questions/155488/ive-inherited-200k-lines-of-spaghetti-code-what-now>

------
TomGullen
Even though the DB is a monstrosity, I'd start by familiarising yourself with
it intimately. Once you know what data is meant to go where, it should be a
good starting point for fixing things.

------
korona
Explain the situation as calmly and thoroughly as possible to the client, and
suggest a complete rewrite, if they ever want future development to be
possible.

------
ecaron
Are you working on this by yourself? How long until your employer is expecting
bugs fixed and features added?

~~~
ohmygord
Yes, by myself. The client so far has been understanding... they've already
burnt through several contracting companies, and I think they're starting to
understand what a mess it is. But still, they want to see serious progress
within a month (eg. large number of bug fixes).

~~~
neuroscr
You're not going to have serious progress for a year. The DB is borked, so you
have no foundation at all.

Software Engineering is serious business, there's bugs, new features,
maintenance, testing, etc. They failed to manage their code. You need to be
realistic that with a team of 2-5 people it could take years to fix.

It might be best to put it out of its misery if they can't hold off their
clients' demands and buy you the time needed to rebuild it.

------
smiler
I love this kind of stuff - if you fancy an extra pair of hands then contact
details are in my profile

------
krapp
Charge them a dollar per line?

