
Why Nasa Converted Its Lessons-Learned Database into a Knowledge Graph - zzaner
https://blog.nuclino.com/why-nasa-converted-its-lessons-learned-database-into-a-knowledge-graph
======
mentatseb
The tech story by NASA's chief knowledge architect is covered in more detail at
[https://linkurio.us/blog/how-nasa-experiments-with-knowledge...](https://linkurio.us/blog/how-nasa-experiments-with-knowledge-discovery/)
and
[https://neo4j.com/blog/nasa-critical-data-knowledge-graph/](https://neo4j.com/blog/nasa-critical-data-knowledge-graph/),
with a presentation video at
[https://www.youtube.com/watch?v=vwJyU9vsfmU](https://www.youtube.com/watch?v=vwJyU9vsfmU)

Disclaimer: Linkurious CEO here. Linkurious is the tool used to explore the
Neo4j graph database used at NASA.

~~~
petra
Since Linkurious is for the enterprise, what would be your recommendations
for a personal knowledge database for individual users?

~~~
antimatter15
I've been working on building a personal knowledge database tool recently,
feel free to shoot me an email at antimatter15@gmail.com if you'd like to be
one of the first to try it out.

~~~
FuckOffNeemo
Due to the number of crawlers on this site, I recommend (if it's not too late)
you edit your post to use the format

address at domain dot com :)

PS: Sent you an email. :)

------
rambojazz
I was hyped to read a cool article about NASA and tech, but this just reads
like an advertisement for software that I've never heard of.

~~~
ChrisLok1
Was about to post the same thing, multiple submissions by the same person for
the same product over the last 2 months.

~~~
Tomte
First, there would be nothing wrong with it.

Second, I have no idea what you're talking about. The submitter seems to have
exactly six submissions so far, and not a single other one about this. [Edit:
Obviously mistaken on that point]

~~~
glenneroo
I don't know whose submission history you're looking at, but zzaner's history
shows 4 of 6 links pointing to nuclino.com and a 5th is a link to Nuclino for
iOS. Only the 6th seems to have nothing to do with Nuclino.

------
Timothycquinn
I spent the bulk of my programming career modelling business processes in a
graph database with strong schema, lifecycle control (state machines) and
formal change control (revisioning).

I was always blown away with how easy it was to turn around a very stable and
useful system where the customers could actually understand the data model and
refactoring was easy to reason through.

Graph databases FTW.

~~~
riku_iki
What tools did you use for this?

~~~
Jeff_Brown
+1

~~~
flyingsilverfin
Grakn (Grakn.ai) offers a strongly typed graph database that is open source :)
(disclaimer: work there)

~~~
Timothycquinn
Took a look. Interesting project. The abstraction model available for declaring
relationships/vertices was hard to grasp at first glance, but as I thought
about it further, I could see some interesting benefits for the pattern-matching
grammar. eMatrix did not have abstraction for relationships/vertices types but
did for object/edge types.

------
aplc0r
I didn't know the LLIS was available online.

The first one I managed to click on was related to a fire in an employee's
car: [https://llis.nasa.gov/lesson/943](https://llis.nasa.gov/lesson/943)

~~~
amylowe
"Employee Falls Down Steep Ramp":
[https://llis.nasa.gov/lesson/21803](https://llis.nasa.gov/lesson/21803)

~~~
amylowe
That's as thorough as documentation gets
[https://llis.nasa.gov/lesson/589](https://llis.nasa.gov/lesson/589)

------
tiuPapa
So one thing I still don't understand is whether Neo4j, a pure graph database,
is better than something like AgensGraph[0] or Cayley[1], which use a
pre-existing DB engine for their persistence layer. If yes, what are the
advantages? Does it depend entirely on the use case? If it does,
what criteria should be used to make a decision?

[0]:[https://github.com/bitnine-oss/agensgraph](https://github.com/bitnine-oss/agensgraph)
[1]:[https://github.com/cayleygraph/cayley](https://github.com/cayleygraph/cayley)

~~~
the-alchemist
There are pros and cons to going "graph native" versus building on an existing DB.

PROS

You can optimize for exactly the types of queries that you want graph
databases to answer: shortest path, path finding, etc. Relational databases /
document databases are (generally) very poor at those types of queries because
those are not the types of queries people want to run on those databases. In a
"graph native" database, everything down to the storage on disk can be
optimized to perform graph algorithms.

CONS

There are years, sometimes decades, of engineering that go into databases (I'm
thinking of PostgreSQL and Cassandra, both of which have graph "layers"
available). A lot of the engineering work is non-graph specific: ACID, how to
handle transactions, distributed computing, WAL, replication.

Why re-engineer all of those just to perform graph operations more quickly?

Also, I can send you a good paper by the founder of DGraph Labs if you're
really curious.
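
To make the "shortest path" point concrete, here is a minimal Python sketch, not taken from any of the databases mentioned. It assumes a hypothetical edge table (`EDGE_ROWS`) and shows the kind of adjacency index a graph-oriented store keeps so that each hop is a direct lookup rather than another join over the edge table. All names in it are made up for illustration.

```python
from collections import defaultdict, deque

# Hypothetical edge table, as rows of (source, target), the way a relational
# store might hold it. The node names are placeholders.
EDGE_ROWS = [
    ("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d"),
]


def build_adjacency(rows):
    """Index the rows so each node's neighbours are one lookup away."""
    adjacency = defaultdict(set)
    for source, target in rows:
        adjacency[source].add(target)
        adjacency[target].add(source)  # treat edges as undirected
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # sorted() only makes the traversal order deterministic
        for neighbour in sorted(adjacency[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


ADJACENCY = build_adjacency(EDGE_ROWS)
PATH = shortest_path(ADJACENCY, "a", "d")
print(PATH)
```

Without the index, every hop would mean scanning or joining the edge rows again, which is roughly where general-purpose stores tend to struggle on path queries.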

~~~
tiuPapa
I would love to read the DGraph paper.

------
kendallgclark
The real NASA knowledge graph, with actual technical detail...
[https://www.stardog.com/categories/nasa/](https://www.stardog.com/categories/nasa/)

~~~
jshen
Yes, and thanks. This bit is really important, I see too many people who don’t
understand the difference between a graph database and a knowledge graph.

“So how did we build this thing with the smart folks at NASA as partners and
customers? The key takeaway here is that a Knowledge Graph platform is a
Knowledge Toolkit plus a Graph Database, and all of those components are
critical at NASA.

Doing this with a plain graph database isn’t going to work unless you want to
do all the heavy lifting of AI, knowledge representation, machine learning,
and automated reasoning yourself, from scratch. I’ll wait while you
decide…didn’t think so.”

------
brad0
I wonder if there is an enterprise app that does this for you?

I can think of plenty of examples at my work where spidering a website and
displaying it in a graph would be really cool.

Our wiki would be one for sure.

~~~
ice-berg
True, I've been using mind mapping tools but it's not the same.

Nuclino ([https://www.nuclino.com/](https://www.nuclino.com/)) looks
promising, trying it out now.

~~~
jcahill
New account + gratuitous namedrop of the product whose corporate blog is
linked in the submission.

------
dreamcompiler
This seems to be an advertisement albeit a strange one. They make it clear
that NASA used Neo4J rather than Nuclino. Neo4J is a true graph database, but
I didn't find anything on the Nuclino website that suggests what Nuclino
really is or what technology it uses.

~~~
nift
Nuclino is a tool to write documentation and the only thing "graph" about it,
from my understanding as a user, is that you can link to different documents
within Nuclino, which then generates a graph. Nuclino visualises this graph so
the user can explore the documentation.

In my experience this exploring thing kinda only makes sense when you want to
document doing/trying the same thing again (which NASA probably is). If you
are just documenting how to connect to a database, set something up or similar,
it, to me, falls pretty flat. Maybe I'm using it wrong...

No idea what they use under the hood.

Source: Use it where I work

~~~
dreamcompiler
Wikipedia says it's written in Javascript. If true, that's kind of horrifying.
[https://en.wikipedia.org/wiki/Nuclino](https://en.wikipedia.org/wiki/Nuclino)

~~~
tiuPapa
Why so? From their site, it looks like they are not really selling a graph DB
but a webapp. Lots of applications are written in Node.js.

------
weitzj
What I am looking for is a nice way (graph) where I can connect all kinds of
events/people/commits/bugs/tickets and jump between them.

Currently I am putting links on GitHub PR descriptions so I know in my
deployment GitHub repo, Who releases What, When and in Which cluster (where)

The PRs contain links to Jira tickets.

So all in all, if you “sprinkle” enough links across GitHub and Jira, I can
essentially click through them and answer the questions: how did that end up
here? What changed? Where is the bug?

But I feel like this set of links referencing GitHub, Jira, PRs, commits and
error reports would fit really well in some kind of graph.
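
As a rough sketch of what that could look like, assuming the links are exported as pairs of made-up `kind:id` strings (none of these identifiers come from a real GitHub or Jira project, and no GitHub or Jira API is used here), a plain traversal already answers "what is connected to this error report?":

```python
from collections import defaultdict, deque

# Hypothetical artifact identifiers of the form "kind:id".
LINKS = [
    ("error:5001", "deploy:prod-eu-17"),
    ("deploy:prod-eu-17", "pr:214"),
    ("pr:214", "commit:9f3c2ab"),
    ("pr:214", "jira:OPS-88"),
    ("pr:215", "jira:OPS-90"),  # unrelated change, should not show up
]


def build_graph(links):
    """Store every link in both directions so it can be followed either way."""
    graph = defaultdict(set)
    for left, right in links:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def trace(graph, start):
    """Return every artifact reachable from start, grouped by kind."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    grouped = defaultdict(list)
    for artifact in sorted(seen - {start}):
        kind, _, ident = artifact.partition(":")
        grouped[kind].append(ident)
    return dict(grouped)


RELATED = trace(build_graph(LINKS), "error:5001")
print(RELATED)
```

The design choice here is to keep the graph as a thin index of references, with the actual content staying in GitHub and Jira.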

------
mike555
This kind of reminds me of the FMEA and its web structure, which is very
useful.

It does share the big weakness of all other such databases, though: it is very
hard to convince people to use it, especially to add and maintain content.

------
fxfan
Does anybody here have a 'canonical' application or example in mind that shows
me what neo4j can do that matches my intuitive understanding better than the
'regular' RDBMS?

~~~
lmeyerov
That can be non-obvious, so fair. We (graphistry) get pulled into a lot of
investigative scenarios -- account takeover (web logs), malware/phishing
analysis (host/network logs & feeds), AML, claims fraud, etc. I found the
problems being solved to be some combination of: awkward to express with SQL,
too slow to run in an RDBMS, or hard to visually explore
relationships/correlations.

Examples:

=== Shortest Paths

1a. Referral: "Who on our team is connected to which leadership at Apple?"

(target:Company[name="Apple"])<-[_:Leadership]-(champion)-[_]->(us:Company[name="myCompany"])

1b. Supply Chain, AML, entanglements...: "How are these companies related,
even if 5 companies away, and across all sorts of relationship types?"

(a:Company[name="a16z"])-[r:1..3]-(b:Company[name="juicero"])

=== Neighborhood (incl. multi-hop):

2a: 360 context on a security/fraud/ops incident:

(hacked:Computer[ip="10.10.0.0"])-[e:Alert]-(metadata:_)

+ (hacked:Computer[ip="10.10.0.0"])-[Login]->(u:User)-[e:Alert]-(metadata:_)

2b: fraud rings:

(fraudster:clientIP)-[login:http]-(b:Fingerprint)

+ (fraudster:clientIP)-[x:http[method="POST"]]-(p:Page)

2c: Journeys (customer, patient, ...)

(a:Patient[id=123])-[e:1..2]->(b:_)

=== Whole system optimization / compute:

Personalized pagerank, supplychain optimization, business process mining, ...

===

The above can be extended, such as by adding in compute (correlation,
influence scores, ...). That feeds into viz / recommendations / decision
making.

Or: not all uses of graph are end-to-end. We often get used with a graph DB
to improve understanding of it (our viz scales 100-1000X over the tools here via
GPUs)... but folks may instead plug their graph DB into a tabular frontend. Or
use us with a tabular system like Splunk/Spark/Elastic. So the above can be
hard to write in Splunk/SQL, or slow to run, or hard to visually understand.
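
For readers who have not used a graph query language, here is a minimal Python sketch of what a bounded multi-hop pattern like the one in 2c computes. It is not Graphistry or Neo4j code; the edge list, node names and the `neighbourhood` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical directed, typed edges: (source, edge_type, target).
EDGES = [
    ("patient:123", "Visit", "clinic:7"),
    ("clinic:7", "Referral", "lab:2"),
    ("lab:2", "Result", "report:55"),
    ("patient:456", "Visit", "clinic:7"),
]


def neighbourhood(edges, start, max_hops):
    """Map each node reachable in 1..max_hops directed hops to its hop count."""
    outgoing = defaultdict(list)
    for source, _edge_type, target in edges:
        outgoing[source].append(target)
    distance = {}
    frontier = [start]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for target in outgoing.get(node, []):
                # keep the first (shortest) hop count seen for each node
                if target != start and target not in distance:
                    distance[target] = hop
                    next_frontier.append(target)
        frontier = next_frontier
    return distance


JOURNEY = neighbourhood(EDGES, "patient:123", 2)
print(JOURNEY)
```

In SQL the same question needs one self-join on the edge table per hop (or a recursive query), which is part of why these patterns get awkward as the hop range grows.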

------
wespiser_2018
One problem I've seen in start ups as they scale isn't the lack of good
documentation but the lack of information organization and hierarchy. The cost
you pay is repeated experiments/trials and generally slower development. The
best way I've found to overcome this is to just talk to people and construct
an information map/hierarchy as a mental model. Obviously, this process can't
scale with the business. I wonder if this tool would be useful for
software/product dev in start up environments?

------
maxxxxx
Has anybody ever seen a knowledge database for a large organization that
actually works? I always see these efforts but usually they turn out pretty
useless because nobody keeps them up to date.

~~~
notyourwork
Does Wikipedia count?

~~~
maxxxxx
Wikipedia is pretty phenomenal but I was thinking more about a company. I have
never seen a company that has the culture of continuously contributing to
these efforts. They all fizzle out quickly.

~~~
lucb1e
In the two companies I've worked for, I was/am one of the main people that try
to update documentation. I should ask if my previous company still uses the
wiki that I kind of revived (it was full of useful info, but most people
preferred to bother others, and if they searched the wiki before, they would
not update the wiki with the newly learned info). By making it a useful
resource, I felt like more people would start to use it, but my six months of
employment were probably too short to really make that a reality.

I'm working in an even smaller company now - previously it was about 40
employees, now it's a handful. There are docs for the important things so
there is no single point of failure, but very few day to day things are
written down (like whom to report to that you're ill). As we grow, it's slowly
becoming worth it to document that (single source of info instead of either
having to bother the big boss or having different sources) and I'm looking at
options to organise it. Organising it topic-based (graph(-like)) is an
interesting alternative to the standard info dump with a search feature
(wiki).

Trying out Nuclino just now and putting some items into it, I additionally
noticed that having a separate system from your actual knowledge database can
also be useful: info pages are on the wiki, custom tools are in different git
repositories, project info might be in some task manager... If you have a
separate system (such as a graph) that just points you to the right URL
(wiki/task manager) or folder within a git repository, the system can outlast
any of the individual products being used. Then again, having a layer of
indirection makes it more time-consuming to use when you know that your info
page is going to be in the wiki. I guess it will have to be very quick to call
up and integrated nicely to make it worth it for others to use.

~~~
maxxxxx
The problem with separate systems is that when you want to look up something
you don’t know where to look unless it fits exactly into certain categories.
Right now we use OneNote within the team for everything. It’s not perfect but
at least I know where to look.

~~~
lucb1e
Having a system that organises the knowledge/info available should of course
be comprehensive. If not everything is in there, there is no point having it.

But it's a good point you make. Now that I write it like this, it makes it
more clear for myself how the system would work. It wouldn't just be another
separate system, it would be the index for all the systems and whenever
someone writes a new page wherever, they should be required to link it into
this graph (or whatever form it takes).

~~~
maxxxxx
Also make sure that it’s really easy to figure out where to put new
information and then it should be easy to put the information there. From my
experience even the slightest friction or confusion will make people stop
using the system.

------
jordache
Not very convincing. If the differentiator was correlating the data in a
more meaningful way, it doesn't matter whether you display the correlated data
in a list or a graph...

------
blackbrokkoli
Is there any way to view the knowledge graph in the new design? Lots of other
people linking the database itself, but I can't actually find a link with the
new design...

------
shifto
This reads like SCP but with NASA and sometimes more scary.

------
larkinrichards
Never heard of this in my four years there. Hrm.

------
tabtab
It's not like space-travel knowledge-bases are rocket science ... oh, wait.

------
Zecar
Really rather not read a bunch of content marketing on this site. Could we
stick to news?

