
Concord – High Performance Stream Processing with C++ and Mesos - karamazov
http://concord.io/why
======
cstivers1978
I really like the idea behind Concord. I no longer have to stress about a
Hadoop/YARN platform, and I can use the language of my choice (I haven't used
a JVM-based stack in ages).

Is there documentation on adding more input/output sources?

~~~
followtherhythm
If by 'input/output sources' you mean computations that pull from or push to
an external system such as Kafka or Cassandra, there isn't any documentation
currently. However, we have written connectors for Kafka and Kinesis. You can
check out the Scala Kafka Source here [1]. Internally we are working on a high
performance Kafka Source in C++ (based on librdkafka). At the moment this
source can push records downstream at a rate of > 350K QPS.

[1]: [https://github.com/concord/concord-jvm/tree/master/concord_k...](https://github.com/concord/concord-jvm/tree/master/concord_kafka_consumer)

~~~
cstivers1978
I meant connectors. Thanks for the pointer.

------
cstivers1978
Is it possible to abstract the metadata store, which is currently Zookeeper,
so that it could be something else, like etcd or Consul?

~~~
followtherhythm
Since Mesos uses Zookeeper and we are tightly bound to Mesos, at the moment
this is not possible.

~~~
agallego
To expand on this: we use Mesos, and Mesos is mostly deployed with Zookeeper.
'Technically' we could support another store by swapping the library, but it
is unlikely.

~~~
cstivers1978
Thanks. Wasn't sure you were using the Zookeeper instance used by Mesos. I had
assumed it would be a separate instance.

------
amn0408
We use this at my company. Fast implementation even for those new to stream
processing. Up and running in hours.

~~~
erichocean
DC/OS[0] is a fast way to set up Mesos (via Terraform[1][2]) if you just want
to kick the tires without having to learn a bunch of stuff.

Note: No relation to the DC/OS people, just a happy user.

[0] [https://dcos.io/](https://dcos.io/)

[1] [https://www.terraform.io/](https://www.terraform.io/)

[2] [https://dcos.io/docs/1.7/administration/installing/cloud/pac...](https://dcos.io/docs/1.7/administration/installing/cloud/packet/)

------
followtherhythm
Hello, I am a software engineer at Concord. Happy to answer anyone's questions.

~~~
sunnyshahmca
Is this open source? Can you please point me to the scheduler code?

------
LilBibby2342
Is Storm still iterated and improved upon regularly? Curious if this is
simpler than Storm because it's the single focus of the founders. Could be a
big advantage of Concord, right?

~~~
followtherhythm
Yes, Storm is still under active development. Yahoo, Baidu and other big
players continue to use Storm.

To be honest, Storm's API is richer. Our approach, however, was to make stream
processing available to all developers. It doesn't get simpler than four
callbacks plus a metadata() accessor:

    void init(CtxPtr context);
    void destroy();
    void processRecord(CtxPtr context, FrameworkRecord &&r);
    void processTimer(CtxPtr context, const string &key, int time);
    Metadata metadata();
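
To make the shape of those callbacks concrete, here is a minimal word-count sketch written against them. Everything other than the five signatures is an assumption made for illustration: `Context`, its `setTimer` and `produceRecord` methods, the fields of `FrameworkRecord` and `Metadata`, and the stream names are hypothetical stand-ins, not the real Concord client library.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the framework types; not Concord's definitions.
struct FrameworkRecord {
  std::string key;
  std::string value;
};

struct Metadata {
  std::string name;
  std::vector<std::string> inputStreams;
  std::vector<std::string> outputStreams;
};

// A fake context that only records what the computation asks it to do.
struct Context {
  std::vector<std::pair<std::string, int>> timers;  // (timer key, fire time)
  std::vector<FrameworkRecord> emitted;             // records sent downstream
  void setTimer(const std::string &key, int time) { timers.emplace_back(key, time); }
  void produceRecord(FrameworkRecord r) { emitted.push_back(std::move(r)); }
};
using CtxPtr = std::shared_ptr<Context>;

class WordCounter {
public:
  // Called once at startup: schedule the first flush (interval is arbitrary).
  void init(CtxPtr context) {
    assert(context);
    context->setTimer("flush", 1000);
  }

  // Called at shutdown: drop any state that was not flushed.
  void destroy() { counts_.clear(); }

  // Called per incoming record: split the value on whitespace and count words.
  void processRecord(CtxPtr context, FrameworkRecord &&r) {
    (void)context;  // nothing is emitted until the timer fires
    std::istringstream words(r.value);
    std::string word;
    while (words >> word) ++counts_[word];
  }

  // Called when a timer fires: emit the counts, reset, and re-arm the timer.
  void processTimer(CtxPtr context, const std::string &key, int time) {
    for (const auto &kv : counts_) {
      context->produceRecord({kv.first, std::to_string(kv.second)});
    }
    counts_.clear();
    context->setTimer(key, time + 1000);
  }

  // Describes the computation: its name and the streams it reads and writes.
  Metadata metadata() { return {"word-counter", {"sentences"}, {"word-counts"}}; }

private:
  std::map<std::string, int> counts_;  // ordered, so output is deterministic
};
```

Feeding it one record with the value "to be or not to be" and then firing the timer would emit four records, one per distinct word, in alphabetical order of the word.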

------
dangoldin
We've been messing around with it and it looks very promising. It's much
faster than Storm, so at some point we'd love to move to it. Does anyone know
how it compares against Flink?

------
adev3000
Awesome project and team. Highly recommended for predictable high-speed
throughput. Faster than any other OS distributed processing engine.

------
gratner
How does the performance compare to the traditional stream processors?

~~~
bimil
Here are some benchmarks:

Word count of 1.13B messages:

- Storm: ~16K QPS/node, 100ms per event (P999)
- Spark Streaming: 100K QPS/node, 1s batch window
- Concord: 500K QPS/node, 10ms per event (P999)

Server log processing (29G server log, ~260M msgs), 7 different computations
including deduplication, counting, pattern matching, windowing... on 4 nodes
with 8 vCPU and 32GB RAM each:

- Concord: 1M – 1.8M QPS / cluster
- Spark Streaming: 72K – 2M QPS / cluster

Concord generally performed in a consistent range of 1-1.8M QPS, whereas
Spark's throughput varied widely depending on window sliding and the amount of
internal shared state.

------
tengudoll
10X faster than Storm. How? What are the tradeoffs?

~~~
followtherhythm
There are a lot of reasons why, probably enough to write a blog post about. To
summarize, I would say that this is mainly a byproduct of being close to the
metal and of our use of lock-free code on the fast path. Our system is also
relatively simple compared to Storm, which I believe means less work per
record and better performance.
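
For readers unfamiliar with the term, here is a generic illustration of what "lock-free code on the fast path" can look like: a bounded single-producer/single-consumer queue that hands records between two threads using only atomic loads and stores. This is a textbook technique, not Concord's actual implementation; the class name `SpscQueue` and its methods are made up for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Bounded lock-free queue for exactly one producer thread and one consumer
// thread. One slot is always left empty so "full" and "empty" are distinct.
template <typename T>
class SpscQueue {
public:
  explicit SpscQueue(std::size_t capacity) : slots_(capacity + 1) {
    assert(capacity > 0);
  }

  // Producer side. Returns false instead of blocking when the queue is full.
  bool tryPush(T value) {
    const std::size_t tail = tail_.load(std::memory_order_relaxed);
    const std::size_t next = (tail + 1) % slots_.size();
    if (next == head_.load(std::memory_order_acquire)) return false;  // full
    slots_[tail] = std::move(value);
    // Release: the slot write above becomes visible before the new tail does.
    tail_.store(next, std::memory_order_release);
    return true;
  }

  // Consumer side. Returns false instead of blocking when the queue is empty.
  bool tryPop(T &out) {
    const std::size_t head = head_.load(std::memory_order_relaxed);
    if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
    out = std::move(slots_[head]);
    // Release: the slot read above completes before the slot is handed back.
    head_.store((head + 1) % slots_.size(), std::memory_order_release);
    return true;
  }

private:
  std::vector<T> slots_;
  std::atomic<std::size_t> head_{0};  // next slot to read, written by consumer
  std::atomic<std::size_t> tail_{0};  // next slot to write, written by producer
};
```

Neither side ever takes a mutex or waits on the other, so a slow consumer cannot stall the producer beyond a failed `tryPush`. The tradeoff is that the caller must decide what to do when the queue is full or empty.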

------
florianleibert
Super impressive performance!

------
sonalmane
Great project and team!

------
pastaking
Cool project!

