

Data-Driven, Recursive Interfaces for Graph Data - rjurney
http://datasyndrome.com/post/1370125938/data-driven-recursive-interfaces-for-graph-data

======
bsaunder
Interesting article, thanks for posting. To mesh with the recent "Rands in
Repose" post, this gave me some good relevance points to ponder for a little
while.

I've developed a similar fascination with graph data over the past couple of
years. My particular affliction has focused on treating program code as data
(nodes) stitched together with edges.

For visualization I was considering generating partial 3D models using
something like StructureSynth and putting them in a 3D world like opencobalt.
Also stumbled across Orange (<http://www.ailab.si/orange/>) which looks useful
too.

~~~
rjurney
Check out WireIt; there is a lot of good stuff there for visual programming in
a graph interface: <http://neyric.github.com/wireit/>

I used it here to make a web version of PigPen:
<http://github.com/rjurney/Cloud-Stenography> <http://vimeo.com/6032078>
<http://wiki.apache.org/pig/PigPen>
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.134.9888&rep=rep1&type=pdf>

------
jerf
FWIW, what you say is basically true, and in fact if you dig into _actual_
relational theory (as opposed to the bastardized subset that SQL gives you)
you'll find something that itself looks an awful lot like a graph.

But the problem that you will find eventually, and you should always be
mindful of, is that nobody has solved the problem of representing arbitrary
graphs _and_ getting the sort of good performance you expect from a web
service. This has been one of the big stoppers for "RDF", for instance, which,
by the way, is still worth checking out if you don't already know what RDF
is.

Intuitively, while I will admit I'm not an expert on the topic (just a
dilettante that has gone down some similar thought paths), the problem is that
a full graph has no structure to get a hold of and take advantage of in your
query. A traditional SQL table has a regular, recurring structure and obvious
indexes to use to optimize the performance (and in fact this is the source of
most if not all of the deviations from relational theory, IMHO). A NoSQL
database strictly limits itself to what is usually the equivalent of an SQL
record with one key (more or less) and a blob in it. (And some of them do
various moderately fancy things with that blob, but even so, a blob.) A graph
can just do anything it damn well pleases, and graphs do, not only in theory
but in practice, which becomes difficult to deal with even when the theory is
beautiful.
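
As a rough illustration of that intuition (a sketch only; the triples, the
`INDEX` layout, and the `reachable` helper are all hypothetical and not taken
from any particular RDF store), compare a single indexed lookup with a
multi-hop graph query, where the number of lookups depends on the shape of the
data rather than on a fixed schema:

```python
from collections import defaultdict

# Hypothetical RDF-style triples: (subject, predicate, object).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("bob", "wrote", "post1"),
]

# Index by (subject, predicate). One hop is a single lookup, much like
# fetching a row by key in a table or a key/value store.
INDEX = defaultdict(list)
for s, p, o in TRIPLES:
    INDEX[(s, p)].append(o)


def reachable(start, predicate, max_hops):
    """Follow `predicate` edges from `start` for up to `max_hops` hops.

    Returns the set of nodes found and the number of index lookups used.
    """
    seen, frontier, lookups = {start}, [start], 0
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            lookups += 1  # one index probe per node on the frontier
            for neighbor in INDEX.get((node, predicate), []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen - {start}, lookups
```

With one hop this costs one lookup; with more hops the cost grows with the
frontier, which nothing in the data model bounds in advance.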

~~~
rjurney
The key point is that you've done all your processing in batch, and you're
only displaying the most prominent or interesting properties and links for
each record. Hadoop/Pig/Python together are much more powerful than SQL if you
don't have a real-time requirement, and by the time we get to the key/value
store all the data processing is done. If you think you have a real-time
analytic requirement... well you may, but quite possibly you really don't.

Getting to that point through batch processing can be hard, but the
infrastructure is ideally suited to it. NoSQL doesn't impede you whatsoever.
It enables you to think correctly about packaging your data for recursive
consumption in trivial interfaces.
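
A minimal sketch of that packaging idea, assuming a toy weighted edge list and
a plain dict standing in for the key/value store (the data, `build_records`,
and `render` are illustrative names, not anything from the article): the batch
step keeps only the most prominent links per record, and the interface does
one key lookup per page, following links to other keys.

```python
from collections import defaultdict

# Hypothetical edge list: (source, target, weight). In practice this would be
# the output of a batch job; the values here are made up for illustration.
EDGES = [
    ("alice", "bob", 5),
    ("alice", "carol", 2),
    ("alice", "dave", 1),
    ("bob", "carol", 4),
    ("carol", "alice", 3),
]


def build_records(edges, top_n=2):
    """Batch step: one record per node, keeping only its top_n heaviest links."""
    outgoing = defaultdict(list)
    nodes = set()
    for src, dst, weight in edges:
        outgoing[src].append((weight, dst))
        nodes.update((src, dst))
    store = {}
    for node in nodes:
        ranked = sorted(outgoing[node], reverse=True)[:top_n]
        store[node] = {"id": node, "links": [dst for _, dst in ranked]}
    return store


def render(store, key):
    """Serving step: a single key lookup, with no joins or graph traversal."""
    record = store[key]
    return f"{record['id']} -> {', '.join(record['links']) or '(no links)'}"


STORE = build_records(EDGES)  # a dict stands in for the key/value store
```

Each link in a rendered record is itself a key, so the same `render` call
serves every page the user clicks through to.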

Real-time, large-scale graph processing isn't possible, or is very hard, but...
it's not really needed to do amazing things.

------
besquared
I have no idea what this person is trying to say. Can anyone elaborate?

~~~
rjurney
I can help. Which bit wasn't clear?

I was trying to explain the human interaction consequences of batch processing
and NoSQL in presenting mined data in web applications.

If you're not into any of those things, I could explain but... it's a niche.
There's probably not too much point.

