
Stroom – a scalable data storage, processing and analysis platform - adulau
https://github.com/gchq/stroom
======
solidasparagus
I would suggest trying to get this associated with a community-driven open
source foundation like Apache. I think you will struggle to convince
developers or enterprises to use a data storage + analytics platform developed
and maintained by GCHQ.

------
tdons
"Stroom" is a Dutch word meaning either (electrical) "power" or, more likely
in this case, "flow".

~~~
rollulus
Stroom doesn’t mean electrical power, that’s “vermogen”. It means _current_.

------
billfruit
Does it meaningfully deal with binary data? That is, can it extract, encode
and decode it, handle parts with corrupted data, do analysis on it, etc.? What
about images and other large 2D data?

~~~
gchq-7703
It supports arbitrary data formats, including binary data. It sends corrupted
data or data that fails to parse to an 'error' stream where you can complete
further processing on it. I don't think Stroom is right for you if you're
processing data like images and other large 2d data sets. It is primarily made
for data that can be transformed to XML.

~~~
616c
Are you actually affiliated or is this just a cute coincidence? Hard to tell
from your message history, but I assume even if you are you would not pick a
username like that, lol.

------
Jordanpomeroy
Looks like GCHQ didn’t want to pay for Splunk licenses

~~~
hestefisk
Is this really equivalent to Splunk? Seems more like a mix of Apache Nifi
(developed by NSA) and Spark.

~~~
gchq-7703
I feel maybe the best comparison might be to Elasticsearch? It takes in mostly
log data, parses it (à la Ingest Nodes) and then makes it searchable and
viewable in dashboards.

------
Dowwie
The marketing isn't very clear. Does this compete with Prometheus? Nagios?

------
amelius
What ecosystem does this work with best?

------
unixhero
Well this looks promising.

------
zomglings

        bash <(curl -s https://gchq.github.io/stroom-resources/get_stroom.sh)
    

_GCHQ (straight-faced)_ : Just download and run this shell script we wrote. No
funny business, we promise.

 _NSA_ : _snickers_

I think there's space for a person, a company or a tool to certify all the
scripts we run by passing the results of a curl directly into a shell. Wonder
if there's any money in it.

~~~
chrisseaton
What's significant about the shell script and curl? The entire repository is
software that they wrote that you can choose to run or not. Seems pretty
straight up and clear to me. Running it through curl and bash or downloading
it doesn't make any material difference.

Not sure what the need for the snide comment is.

~~~
zomglings
Was not being snide. Significance is that these installation scripts tend to
be managed separately from the application code and that there are more
avenues for attack via these scripts -- it is not usually apparent where the
scripts are coming from.

Running them directly post-curl without even verifying a sum of some sort
leaves me uncomfortable.
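
For illustration, here is a minimal Python sketch of the "verify a sum first" idea. It assumes the publisher lists a SHA-256 digest for the script somewhere you trust; nothing in this thread says Stroom does, so the expected digest and the helper names (`sha256_hex`, `verify_script`, `run_if_verified`) are hypothetical.

```python
import hashlib
import hmac
import subprocess
import tempfile


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_script(data: bytes, expected_sha256: str) -> bool:
    """Check the script bytes against a digest obtained out of band.

    expected_sha256 is an assumption: it must come from a channel other
    than the one that served the script, or the check proves nothing.
    """
    expected = expected_sha256.strip().lower()
    return hmac.compare_digest(sha256_hex(data), expected)


def run_if_verified(data: bytes, expected_sha256: str) -> int:
    """Run the script with bash only if its digest matches."""
    if not verify_script(data, expected_sha256):
        raise ValueError("checksum mismatch; refusing to run script")
    with tempfile.NamedTemporaryFile(suffix=".sh") as f:
        f.write(data)
        f.flush()
        # Execute exactly the bytes that were hashed, not a second download.
        return subprocess.run(["bash", f.name], check=False).returncode
```

The point of the design is that the bytes you hash are the bytes you run. Piping curl straight into bash gives you no chance to make that comparison.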

Besides, it's a fun exercise to consider how you'd solve the problem of
securing installation scripts, which are much more homogeneous in behavior
than generic applications.

~~~
chrisseaton
> it is not usually apparent where the scripts are coming from

GitHub. The same place as the software. If you don't trust github.com's
servers then you don't trust either the software or the installation script.

~~~
zomglings
Where? I couldn't find the file quickly (and from my phone) on either the
stroom or the stroom-resources repo.

And when you find it, you still have to perform independent verification that
the file on GitHub is the same one you are downloading through curl.

You are treating their installation instructions as equivalent to "clone this
repo and run this script inside the repo" when they actually are not.

~~~
rzzzt
<org-name>.github.io/<project-name> content usually goes to a separate branch
in the repository, named "gh-pages" (although this is configurable):
[https://github.com/gchq/stroom-resources/blob/gh-pages/get_s...](https://github.com/gchq/stroom-resources/blob/gh-pages/get_stroom.sh)

~~~
zomglings
Ah nice, didn't realize it was on a separate branch. Thanks.

