

Ask HN: A better metric than lines of code - blintson

People often compare programming languages by lines of code. The problem is that a more verbose language usually requires less thought per line than a terser language.
Maybe a better metric of a programming language's complexity would be the size of a compressed source-code file rather than lines of code.<p>Anybody disagree and think the opposite is true:
that a simpler syntax usually means more thought?<p>Anybody here ever heard of somebody doing something like that?
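A minimal sketch of the compressed-size metric in Python (gzip is an arbitrary choice of compressor here, and the two snippets are hypothetical examples, not benchmark data):

```python
import gzip

def compressed_size(source: str) -> int:
    """Bytes of gzip-compressed source: a rough proxy for information content."""
    return len(gzip.compress(source.encode("utf-8")))

# Hypothetical snippets: boilerplate compresses away, so the gap between
# these two is smaller than a raw character count would suggest.
verbose = "public static void main(String[] args) { System.out.println(1); }"
terse = "print(1)"
print(compressed_size(verbose), compressed_size(terse))
```

Repetitive boilerplate compresses well, which is exactly why this metric penalizes verbosity less than raw character or line counts do.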
======
mrshoe
I'm going to slightly disagree with other comments here and say that LOC is a
meaningful metric. It might not be the best, as OP suggests, but it's worth
something.

Having written a lot of Python it is now frustrating to write C++, mostly
because of the verbosity. Instead of thinking about the problem I'm trying to
solve, I'm thinking about how to deal with the language or necessary data
structures. Python just gets out of your way and lets you solve the problem at
hand quickly and concisely.

I don't think more thought goes into a line of Python code, and if it does,
it's productive thought. It's not the "what's the 6th argument to this
function?" or "is the type of the 3rd argument to this function really just a
typedef for a float?" type of my-language-requires-me-to-jump-through-hoops
thought.

To me, all that verbosity means more time thinking about the code and less
time thinking about the problem.

~~~
alanthonyc
I thought this was what the OP was trying to address.

In your example, the C++ program would appear to have been a more productive
use of time, but in reality, the Python program was, even though it was much
smaller.

------
gdp
I believe compressed source code size is one of the metrics available on the
"Computer Language Benchmarks Game":

<http://shootout.alioth.debian.org/>

I feel relatively strongly that LoC is almost entirely worthless as a metric.
Even compressed source code size makes me uneasy. Shameless self-promotion: I
wrote a blog post, which appeared on HN just the other day, about better ways
of comparing programming languages.

<http://news.ycombinator.com/item?id=766841>

In short, I disagree that there is almost any relationship between mental
effort and verbosity.

~~~
axod
I think it'd be nice if we all just stopped comparing programming languages
and realized that it doesn't actually matter much. Use whichever you love.
(Failing that, use the one you hate the least.)

~~~
scott_s
Some of us are interested in comparing programming languages because we find
the expression of computation inherently interesting.

~~~
axod
There are also linguists who love comparing spoken languages. But for most
people, most spoken languages are good enough to communicate, just like most
programming languages are good enough to express what you want.

~~~
jacquesm
By comparing things we learn. By comparing programming languages we learn
things about them, such as how easy it is to express yourself.

By doing that over time, programming languages evolve. If we didn't compare
programming languages you'd be programming in ALGOL right now, and the whole
web thing (much less the internet) would never have happened.

------
rntz
According to Paul Graham (<http://www.paulgraham.com/arcchallenge.html>):

    
    
      The most meaningful test of the length of a program is not lines or
      characters but the size of the codetree-- the tree you'd need to represent
      the source. The Arc example has a codetree of 23 nodes: 15 leaves/tokens
      + 8 interior nodes. How long is it in your favorite language?
    

So instead of measuring lines of code, you measure the size of the AST. Works
well for Lisp, at least; not sure how well it generalizes to other languages.
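For Python at least, the codetree measure can be approximated with the standard `ast` module (a sketch: `ast.walk` also counts bookkeeping nodes like `Load`/`Store` contexts, so the numbers run higher than Arc-style token counts):

```python
import ast

def codetree_size(source: str) -> int:
    """Count every node in the abstract syntax tree of the given source."""
    return sum(1 for _ in ast.walk(ast.parse(source)))

print(codetree_size("x = a + b"))
```

Comparing two snippets by this measure rewards terser syntax without rewarding tricks like cramming several statements onto one line.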

~~~
gdp
That's probably a _better_ metric, but I still think it's making a statement
about some causal relationship between size and quality, which I think is
complete bunk.

I can solve a problem in 1 line of buggy code. You have to write 1000 lines,
but it is mathematically certain that yours works. Whose is the better
implementation? Mine might be O(n^n) in the worst case, while yours is O(1).

See, I'm talking about specific _programs_, but these sorts of comparisons
extend fairly naturally to programming languages too.

~~~
scott_s
Clearly, the 1000 lines of mathematically proven correct code that is also
faster is better. But that's a contrived example. In general, fewer lines mean
fewer bugs.

That you can contrive counter-examples does not invalidate the metric for use
in the normal cases.

------
ken
Both metrics seem to assume that source code is a fixed thing. My bottleneck
when working with source code is not writing or reading, but modifying.
Therefore, the most important metric to me is whether an O(1) conceptual change
requires only an O(1) change to the source code.

A small line count / compressed size / AST node count is probably a good
indicator of this, but I'm not convinced it's exactly the same.

------
tetha
I think anything directly related to the syntax of a language is a very, very
dull metric. I say 'dull' because LOC certainly can tell you something about
the size of a project (if 3 guys crafted 3 million lines of code for a project,
and 3 other guys did the whole thing in 12k lines of code in the same language,
something is eerie), but it is very hard to make fine-grained statements with
it.

I would much rather see metrics that actually involve the complexity of the
code, that is, a measure of 'what the user has to keep in his head'. Cyclomatic
complexity (roughly, the number of paths through a method) is a good example of
a step in the right direction, in my opinion. I just don't know how one would
formalize the complexity of a framework, for example, where one has to remember
'ok, I need to implement this, this and that method in my class and these
contracts have to hold' and such :)
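As a rough illustration, cyclomatic complexity can be approximated for Python by counting branch points in the AST (a simplified sketch; a real tool such as radon handles many more constructs and reports per function):

```python
import ast

# Node types treated as adding an independent path (a simplified selection).
BRANCHES = (ast.If, ast.For, ast.While, ast.Try, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(source: str) -> int:
    """McCabe-style estimate: 1 + the number of branch points in the source."""
    return 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(ast.parse(source)))

code = """
def f(x):
    if x > 0:
        for i in range(x):
            print(i)
"""
print(cyclomatic_complexity(code))  # straight path + if + for = 3
```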

------
huherto
For business applications take a look at "function points". It is not perfect
but it may help you.

Some people use LoC without comments or blank lines.

In general these measures can be useful indicators but have to be taken with a
grain of salt.

Also take a look at this Martin Fowler's article
<http://martinfowler.com/bliki/CannotMeasureProductivity.html>

------
scott_s
A common claim I hear from software engineering researchers is that the number
of errors in _X_ lines of code from a person is constant. "Lines of code" is
supposedly independent of programming language.

If this is true, then the implication is that when you can implement the same
functionality in less code, it will also probably have fewer bugs. So, if we
look at opportunities for bugs, I think lines of code _is_ relevant.

~~~
gdp
I don't think that's been true for a long time. It might be true _of a given
language_, but I would almost guarantee that the language being used makes a
much larger difference to the number of bugs than something as trivial as code
length.

For example, let's say I'm writing in language X, where the average error rate
is 1 error per 10 lines. I write 100 lines of code (because I'm very thrifty
with my code!), and so my program has 10 bugs.

Now, I write in language Y where the average error rate is 1 error per 20
lines. I write my program in 150 lines, because Y is a bit more verbose.

I'm sure you can do the math. Would you seriously try to tell me that using
lines of code is a relevant metric for making comparisons _between programming
languages_?

I just made up my estimates arbitrarily. I think the difference in error rates
between a language like PHP versus a language like ML would be even more
dramatic.
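Running the arithmetic on the made-up rates above (both rates are hypothetical, as the comment says; only the ratio matters):

```python
bugs_x = 100 / 10  # language X: 100 lines at 1 error per 10 lines
bugs_y = 150 / 20  # language Y: 150 lines at 1 error per 20 lines
print(bugs_x, bugs_y)  # the more verbose program still ends up with fewer bugs
```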

~~~
scott_s
Your example, though, outright assumes the claim is false because you have
different error rates for different languages.

I agree the metric is coarse, but I think it's a decent stand-in for things
that matter. More lines of code tend to mean more concepts unrelated to the
problem you're solving, more assumptions and a greater cognitive load. All of
these introduce more opportunities for error. Consider the problem of
concatenating a sequence of strings, putting commas between each distinct
entry. In Python:

    
    
      concat = ','.join(texts)
    

In C++, this would be

    
    
      string concat;
      for (list<string>::const_iterator i = texts.begin(); i != texts.end(); ++i) {
        if (i != texts.begin())
          concat += ",";
        concat += *i;
      }
    

There are more opportunities for me to make mistakes in the C++ code. I've
introduced more concepts (using iterators to access a sequence, using
iterators to access a value, building a string) which increase my cognitive
load. Hence, I'm more likely to have more bugs in this code, despite solving
the same problem.

I don't even want to write the C code to do this right now, because I would
have to deal with the following concepts: allocating sufficient memory for the
C-style strings and ensuring I'm not accidentally overflowing that memory;
determining what kind of sequence texts is (native array or my own linked
list?); determining how to iterate over that sequence, and how to "add"
strings.

~~~
gdp
Right, so doesn't that example go back to my original assertion that error
rates differ between languages in reasonably intuitive ways?

~~~
scott_s
I think we're using different reference points.

I'm assuming that in a fixed amount of program text, the number of bugs across
all languages is relatively constant. More text means more concepts and more
opportunities to be wrong.

I think your point is that for a given problem, different programming
languages will yield solutions with a variable number of bugs. In a language
where the solution is simpler to express, it's more likely to be correct.

These two points are compatible. I'm fixing the amount of program text, you're
fixing the problem.

~~~
gdp
From Withrow (1990), on Ada:

    
    
      Module LoC  |  Error Rate per 1k LoC
      63          |  1.5
      100         |  1.4
      158         |  0.9
      251         |  0.5
      398         |  1.1
      630         |  1.9
      1000        |  1.3
      >1000       |  1.4
    

This appears to show no constant relationship between error rate and LoC
within a single language, even for a fixed amount of program text.

This report appears to suggest an observed average error rate of 18 defects
per 1000 lines of code:

[http://www.lanl.gov/projects/CartaBlanca/webdocs/PhippsPaper...](http://www.lanl.gov/projects/CartaBlanca/webdocs/PhippsPaperOnJavaEfficiency.pdf)

and the same programmer generated an average of 6 defects per 1000 lines of
code in Java.

This is just a random sampling of error rates I could find quickly with a
Google search, plus a paper I already had on my desk. However, the fact that
three samples from three projects (two of which were from the same programmer)
differ so much suggests that error rates vary both with the (fixed) amount of
program text and with the problem being solved, across languages.

------
peterbraden
I actually like LOC as a metric.

Based on the truism that every line of code is a potential bug, even if the
line is cruft, the length of the codebase is proportional to the number of
potential bugs.

It's a simplification, obviously, but LOC is useful in this respect.

The main problem with LOC, IMHO, is that people try to use it for the wrong
purpose.

------
Derrek
The LoC metric drives me insane. To entertain myself, I mess with the LoC
"bean counters" by writing overly verbose, first-pass code and then optimizing
it down once I'm finished. It's my own little form of rebellion.

"Measuring programming progress by lines of code is like measuring aircraft
building progress by weight." - Bill Gates

~~~
scott_s
That's a different application of the same metric. Here, we're using LoC to
measure the relative power of different languages. In your example, managers
are using LoC to measure a developer's productivity.

~~~
Derrek
Oops, I guess I'll have to pay closer attention.

------
MaysonL
Probably the best way I've seen of comparing languages is by what proportion
of the program is problem solution versus how much is language/syntax cruft.

------
mpk
WTFs/minute during code review:

<http://www.osnews.com/story/19266/WTFs_m>

------
jganetsk
Nodes in the parse tree

------
known
How about number of users and applications per line of code?

------
jaspervdj
The only good metric in my opinion is time spent.

------
gnosis
The metric I use is lines of comments.

