
Seastar: C++ framework for high-performance servers - panic
http://www.seastar-project.org
======
jpgvm
Yup, this is what you can do when you code directly against kernel bypass APIs
like DPDK.

Important to note they also effectively use a shared-nothing architecture
between cores. This makes it very NUMA friendly and further favours network
performance because you are able to lock processes/threads to the cores that
are directly attached to said network adapter.

The architecture is pretty well known but it's nice to see someone build a
framework out of it.
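
For readers who have not seen the pattern, here is a minimal Python sketch of the shared-nothing idea described above: each core owns a private shard, and a given key is always routed to the same shard, so cores never need locks between them. The names `Shard`, `shard_for`, `route_all`, and `pin_to_core` are hypothetical and are not Seastar's API. Pinning uses `os.sched_setaffinity`, which is Linux-only, and is treated as best-effort. The routing runs in a single process for clarity; in a real deployment each shard would presumably live in its own pinned thread or process.

```python
import os
import zlib


class Shard:
    """State owned by exactly one core; never touched by another core."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.store = {}

    def put(self, key, value):
        self.store[key] = value


def shard_for(key, num_shards):
    """Route a key to a fixed shard using a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def pin_to_core(core_id):
    """Best-effort: restrict the calling process to one core (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core_id not in os.sched_getaffinity(0):
        return False
    try:
        os.sched_setaffinity(0, {core_id})
    except OSError:
        return False
    return True


def route_all(keys, num_shards):
    """Deliver every key to the single shard that owns it."""
    shards = [Shard(i) for i in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].put(key, len(key))
    return shards
```

Because `shard_for` is deterministic, a request for a given key always lands on the same shard, which is what lets each shard skip locking entirely.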

~~~
tobz
This. There are plenty of opportunities to try out userland/"kernel bypass"
network stacks, but I've always wanted to see a framework that integrates it
for me. Even if you still need to get elbow deep to do anything advanced, it's
great to see a template, a skeleton, on how to put everything together.

------
rbf_
They mention that this uses a polling architecture, so each core runs at
100% CPU even when idle. I've worked on high-performance systems architected
around polling before, but assumed that would be prohibitively costly these
days now that CPUs have really advanced power management capabilities. Is a
polling architecture still viable for large-scale cloud computing given the
cost of energy consumption? Years ago CPUs used a flat amount of power
regardless of utilization, but now those extra cycles cost watts and, of
course, money.

~~~
seastarer
We plan to reduce power usage on low load eventually.

~~~
noir_lord
Sounds awesome but I wouldn't look at something like this (which is awesome
btw) unless I was expecting it to be under massive sustained load.

For everything else I'd just use Python or PHP as they are mostly fast enough.

I am planning to write a framework in Rust when I have the time, though, as I
find it an interesting language and, as a web dev, it's my area. I'll definitely
be looking at how you approached stuff ;).

------
virtuallynathan
Hmmm... wonder if this could go even faster using Cluster-on-Die mode in the
new E5-2600v3 CPUs (It splits each CPU into 2 NUMA nodes).

It's also worth noting that the fastest memory configuration is 4x16GB
DDR4-2133 DIMMs. Gets you the lowest latency and highest bandwidth according
to Intel.

Another note, I see the CPU in the writeup reports as "cpu MHz: 1199.953" -
have you turned off C-States?
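
A quick way to check for this on a Linux box is to read the `cpu MHz` lines from `/proc/cpuinfo` and compare them with the part's nominal clock. The sketch below is only an illustration: `parse_cpu_mhz` and `looks_clocked_down` are hypothetical helper names, and the nominal clock has to be supplied by the caller since the writeup's CPU model is not given here. A low reading only says the core was clocked down at the moment it was sampled.

```python
import re


def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value reported for each logical CPU."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]


def looks_clocked_down(cpuinfo_text, nominal_mhz, tolerance=0.9):
    """True if any core reports a clock below tolerance * nominal_mhz."""
    return any(mhz < tolerance * nominal_mhz for mhz in parse_cpu_mhz(cpuinfo_text))
```

On Linux, the input would typically come from `open("/proc/cpuinfo").read()`.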

~~~
seastarer
Do you have a link to this cluster-on-die mode?

~~~
virtuallynathan
This document from Intel covers CoD mode (Pg. 28) and optimal DIMM sizes (Pg.
37), etc.
[https://drive.google.com/file/d/0B21tKtZ3UOQNdWtJZW9XekFCZ3M...](https://drive.google.com/file/d/0B21tKtZ3UOQNdWtJZW9XekFCZ3M/view?usp=sharing)

You should also disable C-States (e.g. lock to C0) -
[https://rhsummit.files.wordpress.com/2013/06/shak-jeder-summ...](https://rhsummit.files.wordpress.com/2013/06/shak-jeder-summit-perf-analysis-and-tuning-part-2-2013.pdf)
covers the performance impact of CPU states.

~~~
seastarer
Thanks. We "disable" C-states by polling -- the cpu never goes idle. This is
partly because dpdk doesn't support interrupts, and partly because using
interrupts is a major performance hit. We have plans to improve this, but
nothing committed yet.
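
To make the trade-off concrete, here is a small Python sketch of a poll loop that spins while work is available and backs off with a short sleep after a run of empty polls. This is one generic way to trade wake-up latency for lower idle power; it is not Seastar's implementation or its planned approach. The function name `poll_loop` and all parameters are hypothetical, and `max_sleeps` exists only so the sketch terminates.

```python
import time
from collections import deque


def poll_loop(queue, handle, max_idle_spins=1000, sleep_s=0.001, max_sleeps=3):
    """Busy-poll `queue`; after `max_idle_spins` empty polls, sleep briefly.

    Stops after `max_sleeps` consecutive sleeps so the sketch terminates.
    Returns (items_handled, sleeps_taken).
    """
    handled = 0
    idle_spins = 0
    sleeps = 0
    consecutive_sleeps = 0
    while consecutive_sleeps < max_sleeps:
        if queue:
            handle(queue.popleft())
            handled += 1
            idle_spins = 0
            consecutive_sleeps = 0
        else:
            idle_spins += 1
            if idle_spins >= max_idle_spins:
                time.sleep(sleep_s)  # yield the core instead of spinning
                sleeps += 1
                consecutive_sleeps += 1
                idle_spins = 0
    return handled, sleeps
```

With `max_idle_spins` set very high the loop behaves like pure polling and the core never idles; lowering it saves power at the cost of a wake-up delay on the first request after a quiet period.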

------
ckluis
Seems like a good one to add to techempower’s benchmarks.

------
kul_
Seems like the HTTP problem needs to be reframed as C1B(illion) now!

------
vinceyuan
"Applications using Seastar can run on Linux or OSv." Will Seastar support OS
X? If not, developers using a Mac have to install a Linux VM for development.

~~~
theonewolf
A lot of their speed comes from special tricks like bypassing the kernel and
going direct to hardware (DPDK).

Intel's DPDK doesn't support Mac OS X.

You should probably develop in a VM anyways, or at least with a platform +
hardware that supports these features so you can better match what production
deployments will be like.

I always cringe a little, for backend systems software, when people develop in
an environment completely divorced from their production environment. It just
creates headaches.

------
forgottenacc56
Errr..... 6.5 million requests per second?

~~~
th0br0
More like 7 million at 26-ish cores, so ~270,000 req/s per core, which is still
a lot.

EDIT: here's the writeup
[https://github.com/cloudius-systems/seastar/wiki/HTTPD-bench...](https://github.com/cloudius-systems/seastar/wiki/HTTPD-benchmark)
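
The per-core figure is just the total divided by the core count. A tiny Python check, using the rounded numbers quoted in this thread (7 million req/s, 26 cores) rather than exact benchmark data; `per_core_rate` is a hypothetical helper name:

```python
def per_core_rate(total_req_per_s, cores):
    """Average requests per second handled by each core."""
    return total_req_per_s / cores


print(round(per_core_rate(7_000_000, 26)))  # 269231
```

So "~270,000 req/s per core" is the 7 million figure rounded; the 6.5 million figure over the same 26 cores would give 250,000.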

------
avdicius
Oops, I wanted to do something similar but in C:
[https://github.com/ademakov/MainMemory](https://github.com/ademakov/MainMemory)

------
ComputerGuru
DPDK supports FreeBSD, I wonder if Seastar runs on that?

~~~
seastarer
It ought to. There may be a minor porting effort involved. Note that you need a
very recent GCC.

------
framp
I'm curious to test it against an equivalent Warp application. Good job!

