
All Software is Legacy - leejo
https://leejo.github.io/2016/02/22/all_software_is_legacy/
======
doxcf434
I recently had to maintain some new perl code. I didn't think it would be a
big deal, but found a number of things I take for granted today that perl
hasn't kept up with:

1) The perl cpan module doesn't resolve dependencies

2) The cpan module has parsing errors when passing in a list of CPAN packages

3) You have to manually grep your perl code to see what modules it depends on

4) Module installs take a long time since they can compile and unit test the
code; unit tests can even make connections to the internet or try to access
databases and fail, so you just have to force them to install

5) Non-interactive installs of CPAN modules require digging into the docs and
learning you need to set an env var to enable them

6) CPAN modules aren't used that heavily and can have bugs that would be
caught in more widely used modules. (e.g. the AWS EC2::* modules don't page
results from AWS, so result sets can be incomplete, whereas the more widely
used boto lib works correctly and is better maintained.)

7) Perl devs don't think twice about shelling out to an external binary (that
may or may not be installed)

8) Even if regexes are not needed, inevitably the perl dev will use them,
since that's the perl hammer, and it's hard to know what the intention is
with a regex or what the source data even looks like

9) You have to manually include the Data::Dumper module to debug data structs

10) You have to manually enable warnings and strict checks; they're not on by
default.

Anyhow, I think we've made a lot of progress since the 1990s. :)

~~~
zzzcpan
> Perl devs don't think twice about shelling out to an external binary

No, most of them do. The Perl ecosystem has a killer feature called
cpantesters, which lets everyone see which modules work on which systems out
of the box. You should always check the cpantesters matrix before choosing a
particular dependency.

> Even if regexs are not needed

They got overly complicated over the years, but they are needed. They are a
DSL that makes working with strings easier: instead of writing 20 lines of
hard-to-grasp code with bytes.Index(), bytes.HasSuffix(), bytes.TrimRight(),
etc., like people do in Go, you write a single nice regexp and therefore
reduce your chances of making a mistake in that code.
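As a rough illustration of the trade-off being debated here, a minimal Go sketch of the same job done both ways (the header-line format, the `parseHeader*` names, and the pattern are invented for this example):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Regexp version: one pattern describes the whole shape of the line,
// including key charset and whitespace trimming.
var headerRe = regexp.MustCompile(`^([\w-]+):\s*(.*?)\s*$`)

func parseHeaderRegexp(line string) (string, string, bool) {
	m := headerRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

// Manual version: the same intent spelled out with strings.Index and
// strings.TrimSpace; note it silently accepts keys the regexp rejects.
func parseHeaderManual(line string) (string, string, bool) {
	i := strings.Index(line, ":")
	if i < 0 {
		return "", "", false
	}
	return line[:i], strings.TrimSpace(line[i+1:]), true
}

func main() {
	k, v, ok := parseHeaderRegexp("Content-Type:  text/html  ")
	fmt.Println(k, v, ok)
	k, v, ok = parseHeaderManual("Content-Type:  text/html  ")
	fmt.Println(k, v, ok)
}
```

The manual version is short here, but the validation the regexp gets for free (which characters a key may contain) is exactly the kind of detail that gets dropped in hand-rolled parsing.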

~~~
giovannibajo1
> so you wouldn't have to write 20 lines of hard to grasp code with
> bytes.Index(), bytes.HasSuffix(), bytes.TrimRight(), etc., like people do in
> Go

Go has regexps, and a very good implementation at that.

Depending on what you do and on the specific code path, compiling and/or
executing a regexp might be slower than manually parsing the string. The Go
standard library is pretty concerned with performance (much more than
Python's or Ruby's, for instance), so it tends to avoid regexps.

~~~
zzzcpan
It shouldn't be like that; that's the problem. Regular expressions should be
compiled to native code and be even faster than a bunch of hand-written
bytes.HasSuffix() combinations.

~~~
giovannibajo1
Your previous post said that they are a very useful DSL for Perl so that
"people don't have to do like they do in Go".

Both Perl and Go implement regexps, and neither of them compiles them to
native code. So I don't get your previous comment at all.

The main difference is that, in Perl, if you ever had to write manual string
parsing, it would be much much slower than using regexps as Perl is an
interpreted language. So regexps are needed to perform fast string parsing. In
Go, you have regexps if you want, or you can go even faster if you feel it's
required.

~~~
zzzcpan
> Both Perl and Go implement regexps, and neither of them compiles them to
> native code. So I don't get your previous comment at all.

Ok, I'll try to explain.

People feel discouraged from using regexps in Go because they are very slow
for many typical parsing and validating cases and require an extra
compilation step, with all of the additional code complexity associated with
that. So people do the parsing manually instead, with all of its problems.
It's not that they need that performance (almost no one does), but the whole
idea behind regular expressions is not working: parsing code is still bad
most of the time.
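For what it's worth, idiomatic Go code usually amortizes that extra compilation step by compiling the pattern once at package level. A small sketch (the pattern below is a deliberately rough plausibility check, not a real email validator):

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once at program startup. MustCompile panics on a bad
// pattern, so callers never see a compile step or a compile error.
var emailRe = regexp.MustCompile(`^[^@\s]+@[^@\s]+\.[^@\s]+$`)

func looksLikeEmail(s string) bool {
	return emailRe.MatchString(s)
}

func main() {
	fmt.Println(looksLikeEmail("user@example.com"))
	fmt.Println(looksLikeEmail("not-an-email"))
}
```

This removes the per-call complexity, though the match itself still goes through Go's RE2 engine rather than native code.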

------
egraether
In my opinion the biggest problem with legacy code is understanding its
implementation as someone who hasn't worked on it before. In a lot of cases
it's not documented well and the original authors have already left, so
there's no one to ask. You are left with reading code written by someone else,
which takes a lot of time.

This is not Perl related, but I'm currently working on a developer tool that
makes this part of the job easier. It's a source explorer for C/C++ named
Coati that simplifies navigation within source code and thereby makes
understanding the implementation faster and easier.
[https://www.coati.io/](https://www.coati.io/)

~~~
akkartik
Your first paragraph _totally_ resonated. I've been thinking about this
problem for several years. However, my approach is diametrically opposed to
yours. I think our problems in software all stem from focussing on the code as
the tangible artifact to maintain control over. We should instead be focusing
on the _space of possible inputs_ that the code is intended to work for. This
is something you can't deduce from the code (and as for getting the computer
to deduce it automatically, forget about it); it requires cooperation with the
original author to present things in a way that makes the state space more
explicit. This is why I love projects with lots of tests. I can't be bothered
to analyze static code structure, either manually (what people call
'reading'[1]) or automatically. Just show me how the program is supposed to
run in all the different situations that you've considered. Let me change it
and rerun the tests to find out if I broke something.

Modern programming practice emphasizes tests, which is great. However, not
all kinds of tests can be written yet, so we end up doing manual
certification work every time we release or publish a new version of
software, for performance, fault tolerance, etc. I want to make it all
automatic. Some links
about my project in case you'd like to learn more:
[http://akkartik.name/about](http://akkartik.name/about);
[http://github.com/akkartik/mu#readme](http://github.com/akkartik/mu#readme).
I'd love to hear your thoughts, either here or over email (address in
profile).

[1] [http://akkartik.name/post/readable-bad](http://akkartik.name/post/readable-bad)

~~~
reledi
Tests significantly help with understanding a codebase and safely making
modifications.

Unit tests describe the behavior and usage of the individual systems, while
integration tests describe business use cases in core workflows (usually just
the happy paths). Integration tests should ideally be written in a gherkin
style with feature description and acceptance criteria clearly outlined.

Tests should be optimized for readability foremost. For example, most people
try and make their tests DRY (which is an abuse of DRY because it should only
apply to concepts, not code, but I digress) while they should instead be
making them DAMP. Each test should be able to tell a story without jumping up
down around the file and outside the file. It's a lot more dangerous to
misunderstand a test than it is to have a little bit of code duplication in
your tests.
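As a toy illustration of the DAMP style (the `Cart` type and the tests are invented for this sketch): each test repeats its own small setup inline, so it reads top to bottom as a complete story.

```go
package main

import "fmt"

// Cart is a made-up type; only how the tests use it matters here.
type Cart struct{ prices []int }

func (c *Cart) Add(p int) { c.prices = append(c.prices, p) }

func (c *Cart) Total() int {
	t := 0
	for _, p := range c.prices {
		t += p
	}
	return t
}

// DAMP: setup is duplicated in every test instead of being hidden in
// shared fixtures, so each test stands on its own.
func testEmptyCartTotalsZero() bool {
	cart := &Cart{}
	return cart.Total() == 0
}

func testTwoItemsAreSummed() bool {
	cart := &Cart{}
	cart.Add(3)
	cart.Add(4)
	return cart.Total() == 7
}

func main() {
	fmt.Println(testEmptyCartTotalsZero(), testTwoItemsAreSummed())
}
```

The two lines of repeated setup are the "little bit of code duplication" the parent comment argues is cheaper than a misunderstood test.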

When I come across a project that has no documentation, I look at the tests.
This isn't an excuse not to write proper documentation of course.

~~~
crdoconnor
>Tests should be optimized for readability foremost. For example, most people
try and make their tests DRY (which is an abuse of DRY because it should only
apply to concepts, not code, but I digress)

That's wrong on so many levels. DRY applies to code.

>while they should instead be making them DAMP. Each test should be able to tell
a story without jumping up and down around the file and outside the file. It's
a lot more dangerous to misunderstand a test than it is to have a little bit
of code duplication in your tests.

This is a problem with Gherkin. Gherkin is not particularly well suited to
making tests that are both DRY and readable due to its syntax.

~~~
akkartik
> which is an abuse of DRY because it should only apply to concepts, not code

As a 'copyista', I would change this to "DRY should only apply to production
code, not tests." It wasn't clear from your reply if you agree or disagree.

A couple of great recent links about this IMO under-discussed topic:

[http://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction](http://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction)

[http://programmingisterrible.com/post/139222674273/write-code-that-is-easy-to-delete-not-easy-to](http://programmingisterrible.com/post/139222674273/write-code-that-is-easy-to-delete-not-easy-to)

[http://bravenewgeek.com/abstraction-considered-harmful](http://bravenewgeek.com/abstraction-considered-harmful)

------
t3hprogrammer
I share a similar sentiment with a different phrase: "code is a living history
of past ideas, good and bad."

~~~
pluma
Code is a liability. The best code is code that doesn't need to be written.

~~~
ktRolster
If it doesn't need to be rewritten, if it is that good, then it is an _asset_
because you can depend on it.

~~~
ciroduran
Problem is... that code is an asset, until it's not. As with most things, it
depends on the project, but changes in requirements have insidious effects on
code, which might turn your asset into a liability before you realise it.

------
milesf
A softer phrase that might work better is "All Software is Experimental". It
has fewer pejorative connotations, and could be a shibboleth among seasoned
developers.

~~~
akkartik
Yeah, exactly. But what does it mean if even seasoned developers rely on
interfaces in all this experimental software staying fixed for all time? The
article falls for this disconnect as well, with its great enumeration of the
problems contrasting with the repeated references to interfaces as a solution.
Interfaces are part of the _problem_. (Or more precisely, our propensity to
freeze interfaces, and our reluctance to rethink interfaces at will.)

My favorite bit of the article is something I've been thinking about for
years:

 _..paradoxically the cleaner and saner your interface the more likely it is
to succeed and thus more likely to become constrained by its users, to
solidify._

I've been idly dreaming of a future where we decouple modules not by their
interfaces but by their behavior. The major failure mode of our age is
premature freezing of interfaces, and it happens because it's hard to judge
from the outside how done an interface is. A detailed look at the
implementation of an interface that looks really clean can expose a corner
case that hasn't been addressed yet. I'm trying instead to think about
the input space of a function. I'd like in future to be able to make
guarantees like this:

> The behavior of this function is fixed when argument _foo_ is less than 5.
> However, for larger values we're still nailing down some details.

The argument _foo_ might shift from being the first to the third, or its type
may change in some subtle way (that still preserves previous guarantees). So
upgrading will often cause some pain. However, the upside is that the pain is
bounded. You never end up in some 1% situation where upgrading your OS borks
up your Rails app and takes days to fix, or breaks things in subtle ways that
you don't notice for months.

You know that the upgrade effort will be bounded because you know that as long
as you pass the right value into the right slot, and as long as it's less
than 5, behavior won't change and all your tests will pass. That seems like it
might be superior to an interface, even if it adds a little extra work up-
front.

More details, in case you have any thoughts:
[http://akkartik.name/about](http://akkartik.name/about)
[http://akkartik.name/post/libraries2](http://akkartik.name/post/libraries2)

~~~
akkartik
I've been chatting with the author over email about my no-freezing-interfaces
idea, and the discussion helped uncover two situations I haven't considered
sufficiently:

a) Security issues. We need some way to allow people to quickly apply certain
tiny changes without having to worry about the possibility of an interface
change, while ignoring all the others.

b) Clients sometimes end up in situations where they can't control when
upgrades happen. That would break things even worse with my approach than it
already does. My proposal really relies on people being able to control when
they upgrade.

Back to the drawing board..

------
tariqali34
Legacy software is successful software. What's the point of building a program
that will be discarded after a few months because its users fled to the next
big thing? That just means more lines of code being written overall,
needlessly.

~~~
bottled_poe
Legacy software is software that served a purpose but no longer meets the
business objectives. Some common factors that influence the decision to
replace a legacy system with a new one include maintenance cost, scalability,
and flexibility. The usual approach in startups is to build something at minimal
cost, then replace it later if the business requires it. Everyone has a
different opinion regarding the importance of these factors at various stages
of software life.

------
vinceguidry
I used to fantasize about having a code base I'd maintain for the next X0
years. Something like Dwarf Fortress, or hell, even TempleOS. Then I slap
myself and come to my senses. Who knows how different coding will be 10 years
from now? Do I really want to get married to something that's bound to be
obsolete sooner rather than later?

~~~
my5thaccount
If you aim to create great software that is flexible and powerful from the
beginning, you'll be fine. Many businesses are still running very well on my
10-year-old code, and will be 10 or even 20 years from now, I'm sure.

Let this inspire you:
[http://www.bricklin.com/200yearsoftware.htm](http://www.bricklin.com/200yearsoftware.htm)

~~~
mkramlich
if this is who I think it is, Dan Bricklin, I want to chime in to say thank
you for VisiCalc, sir! I was a user and big fan of your work.

for the kids/newbs: effectively the (co)inventor of The Spreadsheet. and
consider that spreadsheets were easily _the_ killer app for the first PC's,
thus giving them much more business value, in turn causing more money to flow
in, helping to create many more paying jobs, etc, in a happy snowball effect
which leads to Linux, Google, AWS...

here's where you might say, "sorry, different Bricklin." oops

------
gaius
Legacy means "in production, making us money".

------
mooreds
We should all be so lucky as to write legacy code because the alternative is
worse: code that is never deployed, or deployed only for a short time. A
project and website I worked on for about 10 years was recently shuttered, and
I am extremely proud of what that code did for the business which paid me.

