Grapl looks quite interesting. I'm looking for something similar for public cloud (e.g. CloudTrail + Config + ?? for building graph + events). Is there a general pattern you employ for creating the temporal relationship between events? For example, Word executing a subprocess and then making a connection to some external service. Do you just timestamp them, or is there something else?

I think what you're getting at is Grapl's identification process. Yes, at the moment it's primarily timestamp-based.

A bit of the algorithm is described here: https://insanitybit.github.io/2019/03/09/grapl

More specifically, Grapl defines a type of identity called a Session - this is an ID that is only valid for a period of time, such as a PID on every major OS.

Sessions are tracked or otherwise guessed based on logs, such as process creation or termination logs. Because Grapl assumes that logs will be dropped, arrive out of order, or be extremely delayed, it makes an effort to "guess" at identities. It's been quite accurate in my experience, but the algorithm has many areas for improvement - it's a bit naive right now.
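
To make the Session idea concrete, here's a rough Python sketch of timestamp-based identification. This is not Grapl's actual code; `Session`, `SessionIndex`, and all of the method names are hypothetical, and timestamps are assumed to be plain integers (e.g. epoch milliseconds). It maps a reused ID like (host, PID) plus an event time to a stable node ID, "guesses" a session when no creation log has been seen, and lets a late-arriving creation log adopt that guess.

```python
import bisect
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(order=True)
class Session:
    # Only `start` participates in ordering, so sessions sort by start time.
    start: int
    end: Optional[int] = field(default=None, compare=False)
    guessed: bool = field(default=False, compare=False)
    node_id: str = field(default_factory=lambda: str(uuid.uuid4()), compare=False)


class SessionIndex:
    """Hypothetical index: (host, pid) -> sessions sorted by start time."""

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[str, int], List[Session]] = {}

    def _insert(self, key: Tuple[str, int], session: Session) -> Session:
        bisect.insort(self._sessions.setdefault(key, []), session)
        return session

    def _find(self, key: Tuple[str, int], ts: int) -> Optional[Session]:
        sessions = self._sessions.get(key, [])
        # Latest session that started at or before ts.
        i = bisect.bisect_right([s.start for s in sessions], ts) - 1
        if i < 0:
            return None
        s = sessions[i]
        # A session that already ended before ts cannot own this event.
        if s.end is not None and ts > s.end:
            return None
        return s

    def on_create(self, host: str, pid: int, ts: int) -> str:
        key = (host, pid)
        sessions = self._sessions.get(key, [])
        i = bisect.bisect_right([s.start for s in sessions], ts)
        # A delayed creation log adopts the next guessed session, if any.
        if i < len(sessions) and sessions[i].guessed:
            sessions[i].start = ts
            sessions[i].guessed = False
            return sessions[i].node_id
        return self._insert(key, Session(start=ts)).node_id

    def resolve(self, host: str, pid: int, ts: int) -> str:
        """Return the node ID for an event; guess a session if none is known."""
        key = (host, pid)
        session = self._find(key, ts)
        if session is None:
            session = self._insert(key, Session(start=ts, guessed=True))
        return session.node_id

    def on_terminate(self, host: str, pid: int, ts: int) -> str:
        key = (host, pid)
        session = self._find(key, ts)
        if session is None:
            session = self._insert(key, Session(start=ts, guessed=True))
        session.end = ts
        return session.node_id
```

With something like this, two processes that reuse the same PID at different times resolve to different node IDs, and events for a PID whose creation log was dropped still land on one consistent (guessed) node. A real implementation would need to handle much more, such as clock skew between hosts and merging overlapping guesses.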

Happy to answer more questions about it though.

Based on what you seem to be interested in, I'd recommend CloudMapper by Scott Piper.


The blog post is super helpful! I think the session concept is the thing I needed. Thank you!

I tried running CloudMapper, but I think I would need to replace the backend with a graph database and scrap the UI parts. We've got hundreds of AWS accounts and I'm having trouble just getting it to process all the resources in one of them.

FWIW, Scott Piper, who builds CloudMapper, also consults.

Glad I could help.

