If - in the event of a cyber attack - you are concerned with the run-time of your visual display tool for tracing back the source of the attack, then this article is for you.
I’ve noticed that the majority of people working in “cyber security” are management material, even those on the technical side of things. I’ve seen them spend all day on Twitter; you have to wonder if they ever find time to actually work on their stuff. Kernel devs are the opposite, but they’re more likely than anyone else to be assholes, so I’m not a fan of them either.
As someone who has spent a career in infosec, I've seen the opposite. Infosec has far more counter culture remaining than software dev (I have worked both roles - currently I'm employed as a Software Engineer).
What infosec has much more of is sales people. There's just a huge market trying to solve a tough problem. So from an outside perspective maybe you see a lot of bullshit but it's a highly technical field. Really, I only learned to be a software dev to up my infosec game, I find infosec to be a far more challenging and interesting field.
I can agree with this, though I’m presumably (based on the language) earlier in my career. The deeply technical parts & the counter culture found in reversing forums & places like DEFCON are still thriving & always awesome.
However, it’s also full of bullshit & marketing that want to convince you that every little technical detail is a vector for a potential cyberattack! You don’t want to be the next business suffering from ransomware gangs, do you? Better purchase our advanced endpoint detection powered by AI/“Neural Networks”, and don’t forget your automated security scanning designed with ChatGPT and quantum encrypted database backups to stop the Chinese/Russian state actors/hackers in their tracks!
I only mean to say that the counter culture is there relative to software devs. It's depressing how far infosec has drifted from its roots. BlackHat is a great example - it's a fed conference.
In terms of the explosion in products, sure. But that doesn't apply to the technical people. There are lots of problems because there's a large market solving a complex problem. If you actually talk to the technical people, they can be of a pretty high caliber.
This could be because product development is much more quantifiable, while infosec is not. This means the grift can continue for a long time before an actual ransomware incident happens and the team gets fired. After all, many of the recently hacked companies had contracts with cyber defense firms, but those firms were just too busy posting memes on social media.
Side note: you should read my other comment on this thread.
I know a few full-time infosec professors who spend a ton of time on Twitter. You would expect these people to lead path-breaking research in their fields, but all they do is share infosec memes and, of course, publish a few papers a year written mainly by their grad students.
But maybe they too are the salesman type, marketing their services to midwit management people on Twitter. I include politicians in this too. Hurr durr, China is going to defeat us in "cyber space" opens up a lot of business opportunities for government contractors. No one cares about compilers or operating systems the same way.
IDK, maybe my bar is just so low across the board that I don't notice the difference anymore. Devs seem just as bad to me - they don't know shit about compilers or OSes either lol
As for Twitter, that has to do with infosec having trouble communicating and a bunch of other stuff. I started to explain and it's not really worth getting into. Twitter was really important to infosec communication.
I am not aware of any incident response methodology or technique that would involve anything more than tabular data. Even things like Maltego - very cool, but I haven't seen it used in IR, just by intel people to organize stuff for themselves.
So are pivot tables in Excel. If you show me a tool that can handle CSVs (Excel is crap at huge CSVs; Timeline Explorer is great) with a nice graph UI, I am all for it. The number one priority is not missing stuff, so there is no way around collecting all the right data, eliminating known noise, and sifting through the data to piece together a timeline.
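As a rough sketch of that workflow (source names, column names, and the noise pattern are all invented for illustration): merge events from every CSV source, drop known noise, and sort into one chronological timeline.

```python
import csv
import io

# Hypothetical noise pattern - in practice this would be a curated list.
KNOWN_NOISE = {"svchost.exe heartbeat"}

def build_timeline(csv_texts):
    """Merge several CSV event sources into one sorted timeline."""
    events = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            if row["event"] in KNOWN_NOISE:
                continue  # eliminate known noise up front
            events.append(row)
    # ISO-8601 timestamps sort correctly as strings
    return sorted(events, key=lambda r: r["timestamp"])

# Invented sample data from two sources
firewall = ("timestamp,source,event\n"
            "2023-01-02T10:00:00,fw,blocked outbound 445\n")
edr = ("timestamp,source,event\n"
       "2023-01-02T09:59:50,edr,svchost.exe heartbeat\n"
       "2023-01-02T09:58:00,edr,suspicious powershell launch\n")

timeline = build_timeline([firewall, edr])
for e in timeline:
    print(e["timestamp"], e["source"], e["event"])
```

The point is only that the hard part is the collection and noise elimination, not the tooling; the sort at the end is trivial once the data is right.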
I'm a noob here but what is stopping you from displaying or exploring your relational DB in a graph fashion? Why does the underlying database structure matter at this level?
I think this is how Sqrrl works or how they explained it when I was busy bombing interviews.
Sqrrl hasn't existed for a long time, but I can tell you that they used a graph database (I don't know if it's public which one they used so I won't go further).
As someone who actually built a graph based SIEM (Grapl) I have opinions here.
Relational databases are fine for graph work. So are non-relational databases. What matters more is how you can model your data, how easy it will be to work with in a "graphy" way, and how your system scales by default.
What a graph DB is going to do is get a lot of that "right" out of the box. You'll have a query language / schema language that makes it easier to express graphs (annoying in SQL imo), that optimizes your edge indexing and join behavior, and that provides graph algorithms and optimizes for them (find a path, etc.).
Typical relational databases are not going to be optimized for this work. You're going to need to be very careful about your indexes and queries to ensure that a join doesn't blow things up. You'll just have more work out of the box, and then it's just like... why didn't you use something that exists to solve this?
SIEM scale is much larger than people may expect. You have massive retention (years), billions of events a day, and lots of queries. This makes a lot of problems way harder - like intelligently indexing your edge data.
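To make the join concern concrete, here's a minimal sketch (not Grapl's actual schema - table, column, and process names are invented) of a graph stored relationally: each row is one edge, and every extra hop in a question becomes another self-join to index and tune.

```python
import sqlite3

# A process-lineage "graph" stored as an edge table (names invented).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (parent TEXT, child TEXT)")
con.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("winword.exe", "cmd.exe"),
     ("cmd.exe", "powershell.exe"),
     ("powershell.exe", "nc.exe")],
)

# "What did the children of winword.exe spawn?" is already a self-join;
# each additional hop means another join against the same table.
rows = con.execute("""
    SELECT e2.child
    FROM edges e1
    JOIN edges e2 ON e2.parent = e1.child
    WHERE e1.parent = 'winword.exe'
""").fetchall()
print(rows)  # → [('powershell.exe',)]
```

At small scale this is fine; at billions of events a day with multi-year retention, keeping every such join fast is exactly the work a graph database tries to do for you.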
It is not about the underlying database structure; it is about complex, "branchy" relations and querying them, including the properties of the relations themselves. Relational databases are not good at recursive join queries against the same table. Reusing a recent example:
MATCH (a:Account)-[t:Transfer*]->(b:Account)
WHERE a.country = 'Canada' AND b.name = 'John Doe II'
  AND all(r IN t WHERE r.transfer_type = 'international')
RETURN a.ID
Which is: "show all accounts in Canada that have made international money transfers (directly or through intermediaries) into John Doe II's account." Source and target "table" are the same in this example.
It is much simpler and more efficient to use a graph database, especially for a complex query where a leaf node circles back either to a "source" or to any intermediate node via the same or a new relationship. The SQL will look nightmarish and will likely require manual hints to the query planner / optimiser.
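For comparison, a rough SQL counterpart of that Cypher query using a recursive CTE (shown via sqlite3; the table and column names are invented to mirror the example). It works, but the traversal intent is buried compared to the one-line MATCH pattern:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE transfers (src INTEGER, dst INTEGER, transfer_type TEXT);
INSERT INTO accounts VALUES
  (1, 'Alice',       'Canada'),
  (2, 'Mule Corp',   'Panama'),
  (3, 'John Doe II', 'US');
INSERT INTO transfers VALUES
  (1, 2, 'international'),
  (2, 3, 'international');
""")

# The Cypher variable-length pattern [:Transfer*] becomes an explicit
# recursive walk along 'international' transfer edges.
rows = con.execute("""
WITH RECURSIVE reach(src, dst) AS (
    SELECT src, dst FROM transfers WHERE transfer_type = 'international'
  UNION
    SELECT r.src, t.dst
    FROM reach r JOIN transfers t ON t.src = r.dst
    WHERE t.transfer_type = 'international'
)
SELECT DISTINCT a.id
FROM reach
JOIN accounts a ON a.id = reach.src
JOIN accounts b ON b.id = reach.dst
WHERE a.country = 'Canada' AND b.name = 'John Doe II'
""").fetchall()
print(rows)  # → [(1,)]
```

Note the UNION (rather than UNION ALL) is what keeps the recursion from looping forever on cyclic transfer graphs; getting details like that right is exactly the kind of care the comment above is describing.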
Graph databases also let you add (as well as delete) relationships in an incremental and non-invasive fashion as they are discovered - a useful feature for the analysis of a large or unknown dataset.
Graph databases require valid use cases, though, and they are not a generic substitute for a RDBMS (or for a random NoSQL database).
Except a lot of relational databases have something like CONNECT BY that does the job pretty well... and of course, if you design your graph wrong, a graph database is going to perform poorly as well.
Graph DB == connect the dots. Relational DB == join the dots. :-)
Steampipe [1] is an open source project [2] to live query all your cloud resources with SQL (via Postgres FDWs). It includes a dashboards as code layer written in HCL + SQL. We recently added support for relationship graphs & visualizations with hundreds of open source dashboards focused on cybersecurity [3].
In our experience, the hard part of these graphs was understanding which relationships / vectors are important to highlight and making them simple enough to browse in that way.
Nothing is stopping you from displaying it like that. However, with a huge dataset you might hit performance so slow that the analytics job becomes infeasible. With a small dataset, you can use whatever you want.
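For a small dataset, "whatever you want" can literally be a few lines: project the tabular rows into an in-memory adjacency list and traverse it (the hosts and connections below are invented for illustration).

```python
from collections import defaultdict, deque

# Invented tabular rows, e.g. observed lateral connections (src, dst).
rows = [
    ("host-a", "host-b"),
    ("host-b", "host-c"),
    ("host-a", "host-d"),
]

# Project the table into a graph view: adjacency lists keyed by source.
adj = defaultdict(list)
for src, dst in rows:
    adj[src].append(dst)

def reachable(start):
    """Breadth-first walk over the projected graph."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable("host-a")))
# → ['host-a', 'host-b', 'host-c', 'host-d']
```

The database structure only starts to matter when the edge set no longer fits comfortably in memory or the traversals get deep.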
I can't comment on this specific use case since it isn't useful to me during incident response, and I haven't seen anyone else even talk about mapping relationships this way. I personally prefer the simplicity and power of Excel and tabular data.
That said, Neo4j + BloodHound is something I am heavily invested in - and not just me: most serious threat actors use BloodHound + Neo4j to choose lateral movement attack paths. Graph databases are how hospitals or government departments get fully owned, except when they are a 90s-era flat IP network with everyone an admin over every server lol.
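BloodHound itself stores AD relationships in Neo4j and asks Cypher for shortest attack paths, but the core idea is just shortest-path search. A toy illustration (the hosts and "who can move where" edges are invented; this is not BloodHound's data model):

```python
from collections import deque

# Invented lateral-movement edges: node -> nodes reachable from it.
edges = {
    "workstation": ["helpdesk-srv"],
    "helpdesk-srv": ["file-srv", "print-srv"],
    "file-srv": ["domain-controller"],
    "print-srv": [],
}

def attack_path(start, target):
    """BFS with parent tracking; returns a shortest path or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parents back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in edges.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # target unreachable

print(attack_path("workstation", "domain-controller"))
# → ['workstation', 'helpdesk-srv', 'file-srv', 'domain-controller']
```

Defenders run the same query to find and cut the shortest edges before an attacker walks them.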
Sorry Aaronstotle if it feels that way - I'm actually the writer of this article. I wanted to give an overview to people who can connect dots in their minds using "graphs", but don't think of them when it comes to developing solutions and reasoning about how the parts of an attack are connected.
A lot of people are taking jabs at this, but in principle it's a good idea. At least two problems come to mind: 1) having clean data to work on in the first place, and 2) having enough skill to write the right queries. Outside of Fortune 100 security teams, both are a complete roll of the dice.
Yes, thank you for the reference. We much appreciate comments on the benchmarks, both positive and negative, and are focused on adding even more workloads and recognized datasets in the future.
As for the title itself, I would not comment much on that.
I guess yes, but the question is how you model it and with what tools - and what questions you want answered with Markov chains. Not to mention that they can likely explode with the number of data points.
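For what it's worth, the simplest modeling choice would be a first-order Markov chain over event types - count transitions, then normalize into probabilities (the event names below are invented). The transition table grows with the square of the number of distinct states, which is the explosion concern above.

```python
from collections import Counter, defaultdict

# Invented sequence of event types from a single incident.
sequence = ["login", "recon", "recon", "lateral", "exfil"]

# Count observed transitions between consecutive events.
counts = defaultdict(Counter)
for cur, nxt in zip(sequence, sequence[1:]):
    counts[cur][nxt] += 1

# Normalize counts into transition probabilities per state.
probs = {
    state: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
    for state, nxts in counts.items()
}

print(probs["recon"])  # → {'recon': 0.5, 'lateral': 0.5}
```

Whether those probabilities answer any question an analyst actually has is, as the comment says, the real issue.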
I also have no idea what I am talking about.
The tl;dr was that analysts are terrible; powerful rule engines for the win. Also, accurately capturing and aggregating all the data in large organizations is very hard to do.