
LemonGraph – A log-based transactional graph - adulau
https://github.com/NationalSecurityAgency/lemongraph
======
dnomad
At this point I really have to wonder how many people have written graph
engines on top of lmdb! In the past month I've seen two from the same bank
built on top of lmdbjava. Instead of reinventing this same wheel over and over
it'd probably make sense for somebody to sit down with lmdb and tinkerpop [1]
and bang out one decent implementation.

...actually this has been done [2], but the project looks abandoned. So, NSA
guys, you should get right on this.

[1]
[http://tinkerpop.apache.org/docs/current/reference/](http://tinkerpop.apache.org/docs/current/reference/)

[2]
[https://github.com/pietermartin/thundergraph](https://github.com/pietermartin/thundergraph)

~~~
adulau
At least, this one is in Python. My dream would be to have a transactional
graph database written in Python which is used as back-end for NetworkX
([https://networkx.github.io/](https://networkx.github.io/)).

------
funfunfunction
This looks interesting. I have to say I find it amusing that the NSA has a
GitHub.

~~~
roryisok
And that they use Adventure Time puns to name things. I guess the NSA techs
are just a bunch of nerds with government contracts

~~~
cbcoutinho
A bunch of nerds with government contracts _that smoke weed on the way to
their interview_

[https://motherboard.vice.com/en_us/article/d737mx/the-fbi-ca...](https://motherboard.vice.com/en_us/article/d737mx/the-fbi-cant-find-hackers-that-dont-smoke-pot)

EDIT: oops that was 2014, apparently the FBI isn't having that issue anymore

[https://motherboard.vice.com/en_us/article/aepj4p/fbi-mariju...](https://motherboard.vice.com/en_us/article/aepj4p/fbi-marijuana-weed-hackers-hiring-weedweek2017)

------
codebeaker
What's a use case for this? Is it like Neo4j, or is there some other niche use
case (e.g. a mass surveillance graph, given the source)?

~~~
antonvs
Yes, it's like Neo4j but based on the in-memory database LMDB, presumably to
provide high performance running on a single machine.

The link mentions that its use case is streaming seed set expansion, which
allows you to identify communities based on a set of seeds. I wrote more about
that in this comment:
[https://news.ycombinator.com/item?id=17335873](https://news.ycombinator.com/item?id=17335873)
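
To make "seed set expansion" a bit more concrete, here is a minimal sketch in Python. It assumes the simplest possible reading of the term: start from a few seed nodes and repeatedly pull in their neighbours, up to a hop limit. The function name `expand_seeds` and the sample graph are hypothetical, and this is not LemonGraph's API or its actual algorithm.

```python
from collections import deque


def expand_seeds(adjacency, seeds, max_hops):
    """Return {node: hops} for every node within max_hops of any seed.

    Hypothetical helper: a plain breadth-first expansion over an
    adjacency dict, not LemonGraph's implementation.
    """
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


# Made-up sample graph: a simple chain alice - bob - carol - dave.
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}
community = expand_seeds(graph, ["alice"], max_hops=2)
```

With a hop limit of 2, the expansion from `alice` reaches `bob` and `carol` but stops before `dave`. In a streaming setting the same expansion would presumably be re-run or extended as new edges arrive, which is where a log-based store could help.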

~~~
hyc_symas
Quibble: LMDB is a memory-mapped file database. It is not an in-memory
database, although it generally outperforms all other in-memory databases.
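
The distinction can be sketched with Python's standard `mmap` module. This shows memory mapping in general, not LMDB's API or internals, and the file name `demo.db` is made up for the example: the program reads and writes the mapping like ordinary memory, but the bytes live in a file and survive after the mapping is closed.

```python
import mmap
import os
import tempfile

# Hypothetical file name, created in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Create a fixed-size file on disk to back the mapping.
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# Map the file and write through the mapping as if it were a bytearray.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as view:
        view[0:5] = b"hello"
        view.flush()  # ask the OS to write dirty pages back to the file

# The data outlives the mapping because it is stored in the file.
with open(path, "rb") as f:
    persisted = f.read(5)
```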

~~~
antonvs
Thanks, this is the first I had heard of LMDB.

I actually love the idea of a memory-mapped database, I've often thought
memory mapping isn't taken advantage of enough.

------
amelius
Why is adding properties to a node significantly faster (153k/s) than adding
edges to a node (25k/s)?

~~~
jpalomaki
Indexes? It seems there are indexes on fromNode, toNode, and the edges themselves.
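
A toy sketch of that guess, in Python. The index layout here is an assumption made for illustration, not LemonGraph's actual storage format, and the class `ToyGraph` is hypothetical. The point is only that one logical edge insert can touch several structures while a property insert touches one.

```python
class ToyGraph:
    """Toy store; the index layout is an assumption, not LemonGraph's."""

    def __init__(self):
        self.props = {}    # (node, key) -> value
        self.edges = {}    # edge_id -> (src, dst)
        self.by_src = {}   # src -> [edge_id, ...]
        self.by_dst = {}   # dst -> [edge_id, ...]
        self.writes = 0    # number of structures touched so far

    def set_property(self, node, key, value):
        self.props[(node, key)] = value
        self.writes += 1   # one table updated

    def add_edge(self, src, dst):
        edge_id = len(self.edges)
        self.edges[edge_id] = (src, dst)
        self.by_src.setdefault(src, []).append(edge_id)
        self.by_dst.setdefault(dst, []).append(edge_id)
        self.writes += 3   # edge table plus two indexes updated
        return edge_id


g = ToyGraph()
g.set_property("a", "colour", "red")
property_writes = g.writes
g.add_edge("a", "b")
edge_writes = g.writes - property_writes
```

In this toy model an edge costs three writes to a property's one. Whether that accounts for the measured gap in the real engine is not something this sketch can show.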

------
wiradikusuma
for laymen like me: what is this, and what are the perfect use-cases for this?

"..log-based transactional graph (nodes/edges/properties).. ..primary use case
is to support streaming seed set expansion." -- I'm totally lost.

I know this kind of software is targeted at developers, but it wouldn't hurt
to give an analogy like "Uber for XXX", as in startup pitches, e.g. "It's like
<put popular product name here, e.g. MySQL> but <differentiating factors>".

~~~
anigbrowl
Graph databases are really neat: they free you from the need to figure out
your database schema at the outset, and they allow much faster searching than
traditional table-based queries across huge datasets. They're ideal for sparse
data or for collections of data whose structure/relationships you're not sure
about. The searches are fast because the number of steps between different
nodes typically grows more slowly than the number of records between different
table entries.

There are a bunch of them on the market; Neo4j is probably the most popular
(and has lots of good-quality introductory text on its website and on
YouTube). Graph databases are key to many major internet services: Google,
Twitter, and Facebook are all just really big graph databases.

This particular graph database stores all its data in a single file, trading
speed and simplicity off against flexibility. I'm not an expert, but it seems
like it would work very well for search queries but poorly for tasks
involving a lot of contributors, like a chat server.

------
znpy
The NSA released a graph database. I wonder what they use it for. /s

------
kapustinsky
Unix commands: awk - AWKWARD. When you write in awk you become awkward. sed -
When you write in sed you become extremely sad.

~~~
floatboth
Wrong thread?

