
PyGraphistry – A library to extract, transform, and visually explore big graphs - sndean
https://github.com/graphistry/pygraphistry
======
lmeyerov
Fun to see a link here!

Always happy to describe what's happening underneath w/ connecting GPUs in the
browser to GPUs in the datacenter. Likewise, the connection between event data
& graph analytics is powerful as data scales, so happy to dig into that too.

Not shown there, we're piloting a 'visual playbook' investigation layer to
help teams who investigate through a lot of event data. This has been
especially relevant for security (SOC/IR/hunt) and anti-fraud as a team grows
and needs to cover more ground. Playbooks let you finally record common multi-
step multi-datasource workflows and get real visibility out of them. When your
alerting flags something, running the playbook will gather & correlate that
data for you, and unique to Graphistry, present it in a full visual (graph)
analytics session. Think visually automating multi-step queries across Splunk
+ Spark + various APIs. And of course, for the advanced analysts, giant GPU-
accelerated visualizations. We're actively piloting with interesting teams, so
please ping info@graph....com if it may be your team's kind of thing.

~~~
ThePhysicist
Is it possible to visualize graphs entirely on the client side (without
sending any data to your backend)? We have some very large graphs that we'd
like to explore, but unfortunately it's not possible to send the data to the
cloud, hence a local solution would be great. I have investigated Gephi but
unfortunately the performance is quite disappointing for very large graphs.

~~~
IanCal
Not sure how large "very large" is or what performance you need, but you might
want to check out the fairly recently added graph support in datashader
(examples:
[https://anaconda.org/jbednar/edge_bundling/notebook](https://anaconda.org/jbednar/edge_bundling/notebook))
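
For intuition, here's a toy sketch of the bundling idea in plain Python. This
is not datashader's actual hammer_bundle implementation (which weights
attraction by edge compatibility and local density); it just shows the core
move: subdivide each edge into control points, then repeatedly pull the
interior points toward the mean of the corresponding points on other edges.

```python
def subdivide(p0, p1, n):
    """Return n+1 points evenly spaced from p0 to p1."""
    return [(p0[0] + (p1[0] - p0[0]) * t / n,
             p0[1] + (p1[1] - p0[1]) * t / n) for t in range(n + 1)]

def bundle(edges, segments=8, iterations=30, strength=0.1):
    """Toy edge bundling. edges: list of ((x0, y0), (x1, y1)) pairs.
    Returns one polyline (list of points) per edge, with interior
    control points attracted toward each other; endpoints stay fixed."""
    paths = [subdivide(a, b, segments) for a, b in edges]
    for _ in range(iterations):
        for i in range(1, segments):  # skip the two fixed endpoints
            # mean position of the i-th control point across all edges;
            # real bundlers restrict this to compatible/nearby edges
            mx = sum(p[i][0] for p in paths) / len(paths)
            my = sum(p[i][1] for p in paths) / len(paths)
            for p in paths:
                x, y = p[i]
                p[i] = (x + (mx - x) * strength, y + (my - y) * strength)
    return paths
```

Two parallel edges come out with their middles drawn together, which is the
visual effect that turns a hairball into legible ribbons.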

Here's an example of the output which I then put into tiles & rendered out:
[https://proseandcode.co.uk/beta_gtr_viz/](https://proseandcode.co.uk/beta_gtr_viz/)

I think there's 600k edges in this, all rendered.

I did some of the work on the edge bundling, the split out code is here:
[https://gitlab.com/ianjcalvert/edgehammer](https://gitlab.com/ianjcalvert/edgehammer)

All of this is more work than just loading it into an existing program, but it
might be a useful component of what you need, hopefully a large one.

Get in touch if I can help out.

~~~
lmeyerov
I'm a fan of the Continuum team - we have been working together on the GPU
GoAI project. We'll load 2M+ edges, and edge bundling helps make that visually
useful.

Graphistry is a bit different in that the result is a full interactive visual
analytics session, not a zoomable png. So you get visual filtering,
histogramming, search, etc. Our goal is to get from a lot of data to the
answer, including whatever data pivots/cleaning/etc., as fast as possible, and
that includes helping analysts skip a lot of the visualization/data coding and
instead do direct visual interactions.

Worth stating: Edge bundling is beautiful! It was an early algorithm we
implemented, but unlike the typical static version, we did it interactively,
so you could do things like adjust sliders in real time to get the right
physics settings.
However, we found you'd want to hover over individual nodes/edges to see
what's in them, and "zoomable image" style makes that hard. I've been wanting
to bring it back now that we're getting close to supporting dynamic grouping
interactions, so cool to see you call it out.

------
technologia
Personally, from using it, it goes well beyond just visualization of large
graphs. It's not readily useful to see so many nodes at once, but what they've
done is make it easier for people to parse through large graphs without a lot
of hand holding. I would point out the work of others in this field, like Marc
Khoury (whom @MurrayHill1980 has already kindly mentioned), but this points to
making large graphs usable for a variety of folks, in my opinion.

------
stared
Is it possible to run it locally? (i.e. without an API key)

~~~
lmeyerov
Yep! Feel free to give the cloud version a go, and if it looks useful, we can
help get you going on-premise. That would include our investigation platform
as well.

------
lmeyerov
I realized this is a great time to say -- we're hiring!

If you're into data visualization / UI engineering, fullstack node for
data/security, or enterprise security sales, would love to chat. Our team is
mostly in the Bay Area. If you've worked remotely before, that works great
too.

We're especially growing in the security market around incident response +
hunt. Our engineering work is around establishing more scalable best practices
for investigation teams and building out our fullstack app, and we're in the
middle of our next GPU visual analytics initiatives (accelerating interactive
visual analytics another 100X!). So a lot of good stuff is happening.

------
m3nu
Another plot.ly pushing their freemium product via GitHub? No pricing on the
homepage yet. That will come later, after enough people have integrated it
into their projects.

~~~
bearjaws
Anyone looking for an open source version of plot.ly, take a look at
[http://www.metabase.com/](http://www.metabase.com/)

~~~
lmeyerov
We're more complementary to Plotly & D3 than competitive. As an analogy for
our developer API layers, we're closer to Google Maps than say D3 geo. Data
tool teams shouldn't have to sink so much time into building out all the
little interactions, just do some configuring & skinning, and then get back to
data
engineering & analytics. The result is we're getting embedded side-by-side.
E.g., drop us into your Splunk/ELK dashboard that already does the basic
charts. Or pilot our visual investigator, and get a direct visual exploration
interface through a variety of connectors.

In terms of paid vs. unpaid, we're a pretty transparent team. We're iterating
on a sustainable pricing model for enterprise investigation teams, advanced
analysts, and developers building internal investigation tools. I don't
believe in charging open source developers, research scientists, non-profits,
etc.,
and we'll be making our ongoing outreach work with those kinds of folks into a
more formal program.

We have been and continue to be serious about community-minded open source
contributions. Our team has contributed research that helped shape the modern
web and leads open source projects that power a lot of tools used by the HN
community today. Just this week, we released some work as part of the GoAI /
Apache Arrow project, and that is part of our ongoing efforts to bring real
GPU compute to the web world.

Maybe a good time to write -- we're hiring!

------
bravura
Does anyone know a good tool to visually explore large ontology-type graphs?

~~~
jvilledieu
You may want to look into Linkurious:
[https://linkurio.us/](https://linkurio.us/) We provide a graph visualization
interface that can connect to Neo4j, DataStax (DSE Graph), Titan or
Allegrograph.

Disclaimer: I'm a co-founder of the company

~~~
btbuildem
[https://linkurio.us/try/](https://linkurio.us/try/) should go straight to a
demo page, not some signup popup

~~~
rspeer
It goes to a signup popup.

I'm interested to see if anyone's offering a large-scale graph visualization
that's not the "pulsating rainbow hairball".

~~~
mentatseb
Hi, I'm a Gephi + Linkurious co-founder. I've found visualizing large graphs
pretty useless beyond the "I see meatballs!" effect, and my opinion, after a
decade in the field, is that it's the wrong problem for data analytics.

Much more interesting information is discovered during the process of
dynamically building a visualization that is focused on user questions. I see
with Linkurious that investigators usually need to visualize less than 1000
edges of a 1M+ edges graph to get answers.
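
That carving-out step can be sketched in a few lines. Below is a minimal,
hypothetical version (not Linkurious's or anyone's actual code), assuming the
graph is just a list of (source, target) pairs: expand breadth-first from a
node of interest and stop at a budget of edges, yielding the small,
question-focused subgraph an investigator actually looks at.

```python
from collections import defaultdict, deque

def neighborhood(edges, seed, max_edges=1000):
    """Return at most max_edges edges reachable from `seed`,
    discovered breadth-first: a small focused subgraph carved
    out of a much larger edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append((u, v))
        adj[v].append((u, v))
    seen_nodes = {seed}
    taken, result = set(), []
    queue = deque([seed])
    while queue and len(result) < max_edges:
        node = queue.popleft()
        for e in adj[node]:
            if e in taken:
                continue
            taken.add(e)
            result.append(e)
            for n in e:  # enqueue newly discovered endpoints
                if n not in seen_nodes:
                    seen_nodes.add(n)
                    queue.append(n)
            if len(result) >= max_edges:
                break
    return result
```

On a 1M+ edge graph this returns in roughly the time it takes to touch the
seed's neighborhood, independent of total graph size once the adjacency index
is built.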

~~~
lmeyerov
The ultimate answer is generally a small graph: Graphistry is a tool that
helps you get there. The hard part is that most Splunk, Spark, etc. queries
will return a bunch of events, and each event has a bunch of metadata. A tool
should help, not fall over.

I think you're referring to scenarios closer to why we created the visual
playbook concept and our embedding APIs. Small visualizations are often a good
starting point in investigative scenarios. Even better: no visualization,
just full automation. We find this thinking comes up when the investigative
flow is more established and curated. With visual playbooks, teams can record
& automate multistep flows, run them whenever an incident happens, take
action, and share & document the results. If part of the incident involves a
bunch of events, or the analyst wants to dig in, our stack won't fall over.
Instead, it provides a full visual analytics session with multiple cross-
linked data views.
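
The events-to-graph step can be illustrated with a toy sketch. This is a
simplified take on the idea, not Graphistry's implementation: each event row
becomes a node linked to a node for each of its metadata values, so shared
values (an IP, a user, a hash) connect otherwise unrelated events.

```python
def events_to_graph(events):
    """events: list of dicts, e.g. one per alert/log row.
    Returns (nodes, edges) where edges link an event's node to a
    'field:value' entity node for each piece of its metadata."""
    nodes, edges = set(), []
    for i, event in enumerate(events):
        event_node = f"event:{i}"
        nodes.add(event_node)
        for field, value in event.items():
            entity = f"{field}:{value}"  # shared values dedupe here
            nodes.add(entity)
            edges.append((event_node, entity))
    return nodes, edges
```

Two alerts that mention the same source IP end up two hops apart through the
shared IP node, which is exactly the kind of correlation an analyst wants to
see at a glance.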

And we're fans of Gephi. We GPU-accelerated the core algorithm -- we may just
be coming from a different perspective and user base.

~~~
mentatseb
Yup, it's important that people understand the role of visualization in the
complete data chain.

~~~
rotten
I'm not sure I understand. Is there a resource that explains the role of
visualizations in data flows in the context explained here?

