
Reading the NSA’s codebase: LemonGraph review - ohjeez
https://ayende.com/blog/184066-C/reading-the-nsas-codebase-lemongraph-review-part-i-storing-nodes
======
grogenaut
Often you wrap a library api with your own to minimize impact of updating
library versions to the rest of the codebase or to provide cross version
support, not just to, say, swap it for Oracle. At a game studio I was at, we
shimmed all of the PS3 API. No system API was allowed to leak past one layer
in our codebase. It helped us move to PS4 quicker, or to Windows (with
everything nerfed) between the consoles.
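
A minimal sketch of that kind of shim, assuming a wrapper over the system allocator. The `plat_*` names are hypothetical and are not taken from any real console SDK; the point is only that code outside this layer never calls the platform API directly.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical engine-facing API: the rest of the codebase calls only these. */
void  *plat_alloc(size_t size);
void   plat_free(void *ptr);
size_t plat_live_allocs(void);

/* Backend for a generic hosted platform, built on the C standard library.
 * A port to another platform would replace only the code below this line. */
static size_t live_allocs = 0;

void *plat_alloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_allocs++;          /* count only allocations that succeeded */
    return ptr;
}

void plat_free(void *ptr)
{
    if (ptr == NULL)
        return;                 /* mirror free(): NULL is a no-op */
    free(ptr);
    live_allocs--;
}

size_t plat_live_allocs(void)
{
    return live_allocs;
}
```

Because everything goes through one layer, the layer can also carry extras such as the allocation counter here, which is easy to check in a unit test.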

~~~
maxxxxx
I like to do that too. Having a thin layer over the actual API makes
maintenance much easier and you can test your layer better. But it's hard to
implement this in a bigger team. A lot of people don't see the point.

~~~
laumars
It really depends on the API. For stuff that is likely to change, it makes
sense to abstract away the API calls, e.g. because the API provider likes
to break stuff or because you need your code to be agnostic towards that
particular library for portability reasons. But that isn't always a concern,
and if you're abstracting away an API just for the sake of it, then you have
to question whether you're adding unnecessary complexity and overhead just for
the sake of perceived modularity.

Like everything in software development, it's about evaluating the right
methodologies for the problems rather than hitting every screw with the same
hammer.

~~~
joebergeron
Hitting screws with hammers, eh? Ever drive a nail with a drill?

~~~
berti
It's a reasonable way to drill a few holes in wood if you're in a bind.

------
potiuper
"As an aside, I’m loving the fact that you can figure out the call graph for
methods based on the number of underscores prefixed to the name. No
underscore, public method. Single underscore, private method used internally.
Two underscores, hidden method, used sparingly. At three underscores, I could
tell you what it means, but given that this is NSA’s codebase, I’ll have to
kill you afterward."

------
isaachier
To be fair, the way malloc is used is actually the preferred way. If you
change the type of the pointer variable, the malloc is still valid.
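
A minimal sketch of what this looks like, assuming the idiom in question is `malloc(sizeof *ptr)` (the thread does not quote the line itself). The `struct point` type and `point_new` function are hypothetical and are not LemonGraph code.

```c
#include <stdlib.h>

/* Hypothetical struct, for illustration only. */
struct point {
    double x;
    double y;
};

struct point *point_new(double x, double y)
{
    /* sizeof *p follows the declared type of p, so changing that type
     * later cannot leave a stale size in the malloc call. */
    struct point *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->x = x;
    p->y = y;
    return p;
}
```

The alternative, `malloc(sizeof(struct point))`, names the type a second time, and that second mention has to be kept in sync by hand.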

~~~
Chabs
I'm mildly surprised to see the NSA using malloc instead of calloc, especially
for allocating what seems to be a partially initialized struct.
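
For comparison, a sketch of the calloc version for a partially initialized struct. The `struct record` type is hypothetical, not the struct from the article.

```c
#include <stdlib.h>

/* Hypothetical struct, for illustration only. */
struct record {
    int id;
    int flags;
    struct record *next;
};

struct record *record_new(int id)
{
    /* calloc zero-fills the allocation, so flags is 0 without an explicit
     * store. An all-zero-bits pointer is a null pointer on mainstream
     * platforms, although the C standard does not guarantee it. */
    struct record *r = calloc(1, sizeof *r);
    if (r == NULL)
        return NULL;
    r->id = id;     /* the only field set explicitly */
    return r;
}
```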

~~~
foobiekr
Really depends on whether they know that all fields will be written before
return or not.

The way they are calling malloc() is very common in professional code bases
and is a safety measure; I’m surprised the author was surprised.

~~~
mediocrejoker
The way I read it, the author is surprised that struct node_t is typedef'd to
node_t* and not simply node_t.

I am not used to that style so it seems odd to me, but I suppose if it was
common in a codebase one would get used to it.
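
A minimal sketch of that style, assuming the declaration is along the lines of `typedef struct node_t *node_t;`. The `id` field and `node_create` are hypothetical, for illustration only.

```c
#include <stdlib.h>

/* Pointer typedef: the name node_t is itself a pointer type. In C, struct
 * tags and typedef names live in separate namespaces, so reusing the name
 * is legal. */
typedef struct node_t *node_t;

struct node_t {
    int id;     /* hypothetical field */
};

node_t node_create(int id)
{
    /* sizeof *n is the size of struct node_t. Writing sizeof(node_t)
     * here would only allocate room for a pointer. */
    node_t n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->id = id;
    return n;
}
```

With a pointer typedef like this, `sizeof *n` is the easier form to get right, since the obvious-looking `sizeof(node_t)` gives the wrong size.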

~~~
convivialdingo
Most compilers in the ’80s and ’90s had pretty restricted grammars based on K&R
before C89 and newer standards fixed these sorts of things. I’ve seen a bit of
code like this from then... Off the top of my head, I’m thinking of the Amiga
toolkit and X Windows. Old embedded code may still have some of this going on
as well.

~~~
foobiekr
Sometimes limits are good. They lead to people playing it safe.

I've spent most of my life dealing with C code dating from 1986 and earlier to
present day. Once you get used to the stylistic conventions one of the things
that falls out is the simplicity even for huge, old legacy code bases. Often
little more than grep or etags is sufficient to competently navigate and
follow the flow of this kind of code. Simple things are simple: what
callstacks can end up here? etc. The code often has the property that if you
printed it you'd still be able to navigate it without issue. Given a random
page and a line, getting to somewhere would be possible.

In comparison so many modern code bases are just completely incomprehensible
without a tool in the form of an IDE (and often still quite challenging with
them because even the IDE isn't sure, and so you must resort to runtime
analysis). So many "tricks" are used which require tools and require the tools
to be bug free and robust.

The closest I have come to "pick it up and you can read and understand it
without assistive technologies" is Go.

------
superbatfish
What's with the name??

There is already a well-known and popular library for graph analysis (in C++)
named "Lemon Graph Library". It is 10 years old. As far as I can tell, this
new library is not related in any way. (Am I mistaken?)

Did they not simply google for the words "lemon graph" before they published
this code base? Or am I missing something?

~~~
xtrapolate
Do you seriously think they care about re-using a name for a throw-away
non-classified project? People have better things to do with their time.

~~~
ralston
The idea of the NSA googling `lemon graph` then saying "Welp, looks like
someone else already has it. Back to the drawing board", is actually pretty
hilarious.

------
schaefer
On the topic of code reading, can anyone endorse any tools specific to reading
a code base, rather than development?

I've been using vim and cscope forever, but I'm wondering if there's anything
new and interesting out there.

...or even screencast recommendations of folks that are particularly
proficient at this.

