
>>industry average bugs per 1000 lines of code at 15-50 and Microsoft released code at 0.5 per 1000, and 0(!) defects in 500,000 lines of code for NASA

Does anybody know how useful this metric is?

I have read that NASA uses C, C++, Java and Ada, of which the last three have a lot of boilerplate. Heavily commented C can be verbose too. I realize there is probably a lot of review, commenting and redundancy built in, and that adds to the overall LOC too.

An MS bug might mean Excel crashes, or a bad business decision is made, unless somebody is using MS software for more critical end points. With NASA, it can be guidance systems, and other low-level routines running in radiation-hardened electronics.

The Rosetta lander used Forth onboard, so there were probably a lot fewer LOC to make mistakes in. That is just one way to approach bug-free programming vs. Ada's or Java's verbosity and checks.

I write J code, so one error usually amounts to 1 in 5 or 10 LOC ;) But then again, I can see all my code in one glance, and I program iteratively in the REPL.




I had the same difficulty looking at the code. Do we count import lines and variable defs? Seems sort of lame.

J of course has a different problem; only a handful of people can even parse it, much less opine on correctness.


A handful is a bit extreme ;) But seriously, it is easy to troubleshoot due to the interactive nature of development in the REPL and the similarity of the code to mathematical formulas and their layout. A PhD student wrote his thesis in 2008 about parallelization, FPGAs, ASICs and arrays, and fully intended to write it in J, but his advisor suggested something better known, so he wrote it in Haskell. I'll put in the reference when I find it.

There is a table in the paper that shows the math formula, the Haskell and then J. Pretty interesting comparison.

To me, if you are not using a prover, then it mainly comes down to going over 10K LOC 1 to 3 times vs. 100 LOC 10 times: up to 30K LOC reviewed vs. 1K LOC reviewed for errors and correctness.
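
To spell out that arithmetic, here is a minimal Python sketch. The function name `reviewed_loc` is made up for illustration, and it assumes every review pass reads the whole codebase once, which is a simplification.

```python
def reviewed_loc(codebase_loc: int, passes: int) -> int:
    """Total lines read when the whole codebase is reviewed `passes` times."""
    return codebase_loc * passes


# Figures taken from the comparison above: a verbose 10K LOC codebase
# reviewed 3 times vs. a terse 100 LOC codebase reviewed 10 times.
verbose_total = reviewed_loc(10_000, 3)
terse_total = reviewed_loc(100, 10)

print(verbose_total)                 # 30000
print(terse_total)                   # 1000
print(verbose_total // terse_total)  # 30
```

So even with more than three times as many passes, the terse version needs about 1/30 of the reading.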


"it is easy to troubleshoot due to the interactive nature" isn't going to cut it. You cannot change the code of an Ethereum contract after the fact. Once it's used, there is no second chance.


Please don't create many obscure throwaway accounts on HN.

This forum is a community. Anonymity is fine here, but users should have some kind of consistent identity that other users can relate to. Otherwise we might as well have no usernames and no community at all, and that would be an entirely different forum.


Thanks for bringing it back to the OP's topic, but I was specifically addressing the quote the OP made about the errors-per-LOC tally. I was pointing out how it could be a weak metric, since the languages used in the examples might contain many LOC of template boilerplate that would make for a low bug:LOC ratio off the bat.

An Ada hello world is 5 LOC vs. 1 for a lot of other languages.

Java is not too different.

J and Python are 1 LOC, and typically not template text, but originally coded.
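
A rough Python sketch of how boilerplate can dilute the metric. All the numbers here are hypothetical, and it assumes (for illustration only) that every defect sits in hand-written code and none in generated or template lines.

```python
def defects_per_kloc(defects: int, loc: int) -> float:
    """Defect density: defects per 1000 lines of code."""
    return defects * 1000 / loc


# Hypothetical project, not real data.
defects = 10
handwritten_loc = 2_000
boilerplate_loc = 8_000  # template/generated lines, assumed defect-free

raw = defects_per_kloc(defects, handwritten_loc + boilerplate_loc)
adjusted = defects_per_kloc(defects, handwritten_loc)

print(raw)       # 1.0 defects/KLOC when boilerplate is counted
print(adjusted)  # 5.0 defects/KLOC over hand-written code only
```

Same code, same bugs, but the headline number changes 5x depending on what you count as a line.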


I'm ok with boilerplate when it buys you something useful, like readability, explicitness, and security, as in Ada.



