
Kore: a fast web server for writing web apps in C - api
https://kore.io
======
yuvipanda
Awesome! I've used
[https://www.gnu.org/software/libmicrohttpd/](https://www.gnu.org/software/libmicrohttpd/)
in the past (for fun stuff I was building to learn C), will try this out
instead. But..

> Only HTTPS connections allowed

Seems a bit painful. HTTPS on the webserver itself is a fair bit more painful
to set up and administer than HTTPS in your reverse proxy / loadbalancer (I
prefer nginx). Web servers should support plain HTTP.

~~~
jvink
You can turn off TLS on Kore.

$ make BENCHMARK=1

It is not a run time option by design, but it is there.

I want Kore to have sane defaults for getting up and running. That means TLS
(1.2 only by default), no RSA based key exchanges, AEAD ciphers preferred and
the like.

edit: spelling

~~~
crest
RSA with DHE or ECDHE is a sane handshake. I would avoid DSA and ECDSA based
key exchanges because they fail catastrophically with bad random number
generators. For most APIs session caching is more important than a faster
initial handshake.

The HTTPS only choice would annoy me a lot because I run most HTTPS services
behind a reverse proxy in a FreeBSD jail on the same host. HAProxy and
nginx are still superior to most applications in regard to reliable TLS
termination.

Using HTTPS by default is the right choice for a new project but offering no
HTTP support (outside of a benchmark) patronizes the user.

All in all this looks like a nice way to export C APIs through HTTPS.

~~~
jvink
Thanks.

I agree the BENCHMARK build option is a bit confusing. I might end up renaming
it altogether.

~~~
jvink
For sanity's sake, this build option is now NOTLS.

------
randomfool
I'm curious: why C? Strings, scoped objects and C++11 move operators seem
much safer and clearer from an API perspective.

The complaints about C++ seem to mostly be around the ability to abuse the
language, not specific issues that C solves. Something like
[https://github.com/facebook/proxygen](https://github.com/facebook/proxygen)
seems like a better API.

And I don't quite buy portability- if it's not a modern compiler with decent
security checks then I'm not sure it should be building web-facing code.

~~~
perdunov
It seems that people are so intimidated by the infamous complexity of C++ that
they don't even want to bother getting more familiar with it.

So, although technically the existence of C doesn't make sense, as it is
superseded by C++ (except for a couple of things), C is winning in the branding
department.

~~~
pron
I don't know if this is the reason why people choose C over C++ for some
projects, but if language complexity is the reason, it isn't that people are
just "so intimidated by the infamous complexity of C++ that they don't even
want to bother getting more familiar with it".

In the 90s, C++ was much more popular than it is now. It was used as the go-to
general purpose language for all kinds of "serious" software (not addressed by
VB or Delphi). Early on, almost everyone was very impressed with the power C++
brought, but after a few years, as codebases aged, it became immediately clear
that maintaining C++ codebases is a nightmare. The software industry lost a
lot of money, developers cried for a simpler, less clever language, and C++
(at least on the server side) was abandoned en-masse -- almost overnight -- in
favor of Java, and on the desktop when MS started pushing .NET and C#. So
while today C++ is servicing its smallish niche quite well, as a general-
purpose programming language for the masses it actually proved to be an
unmitigated disaster. It is most certainly not the case that C++'s infamous
complexity is "intimidating"; C++'s complexity is infamous because of the
heavy toll it took on the software industry. Which is why, to this day, a
whole generation of developers tries to avoid another "C++ disaster", and you
see debates on whether or not complex, clever languages like Scala are "the
new C++" (meant as a pejorative) or not.

~~~
pikachu_is_cool
I also feel like a lot of people are biased because of Stallman's and Torvalds'
stances on C++. I kind of have an irrational distaste towards the language,
mostly due to their influence.

~~~
nothrabannosir
And Yosef Kreinin. Not arguing for or against the FQA, but its mere existence
catalyzed (not fed) my journey of critical thinking towards C++.

------
unixprogrammer
It speaks poorly of this community that you get second-guessed more for
building an application in C than for building one in JavaScript + Node.js.

~~~
j-pb
If you want to use C today you need a really good excuse. The tradeoff between
security and performance is simply not there anymore.

Unless you write on an embedded system, a game, or a high performance number
crunching application, C is premature optimisation.

And even in the above we see drastic changes today, embedded systems have
become so powerful that they can run scripting languages
([http://www.eluaproject.net](http://www.eluaproject.net)), game engines are
written in C and scripted with other things
([http://docs.unity3d.com/ScriptReference/](http://docs.unity3d.com/ScriptReference/)),
and in-memory big data systems like Spark offer significant advantages over
classical HPC frameworks like MPI
([http://www.dursi.ca/hpc-is-dying-and-mpi-is-killing-it/](http://www.dursi.ca/hpc-is-dying-and-mpi-is-killing-it/)).

While JS is horrid, it at least doesn't have manual memory management.

~~~
fla
C is a lot about portability, not necessarily performance.

~~~
j-pb
If you only care for portability you can always use one of the billion safe
languages whose interpreters are written in C. Lua for example.

~~~
vidarh
You may not _only_ care about portability. And "safety" sometimes also means
being able to meet realtime requirements, and also often means having the
ability to carefully account for resource use in ways that many interpreters
do poorly (and no, kernel enforced system limits are not always an option).

Many interpreters also make assumptions (e.g. expecting a POSIX'y system)
about the host system that many embedded platforms don't necessarily meet.

I've more than once looked at interpreters for embedding and found most of the
alternatives sorely lacking. Very few interpreters are well suited for
embedding on constrained platforms at all (Lua, admittedly is probably one of
the more solid exceptions). And once you start having to write lots of support
code in C to port or sandbox your interpreter of choice, the reason for
considering an interpreter quickly becomes less compelling.

------
slowmovintarget
Writing a web application in C sounds like a good trigger for an utterance
from the Jargon File: "You could do that, but that'd be like kicking dead
whales down the beach."

We've advanced the state of the art quite a bit with dramatically more
expressive languages than C that are sufficiently efficient in terms of memory
and CPU. This is especially true when communications are occurring over HTTP
and not direct socket-to-socket comms.

Why use C instead of D, Rust, Go, C#, Java, Perl, Python, Ruby, Scala,
Clojure, Erlang, Elixir, Haskell, Swift, OCaml, Objective-C...?

I didn't miss C++, it just seems a worse alternative than C.

~~~
vidarh
> Why use C instead of D, Rust, Go, C#, Java, Perl, Python, Ruby, Scala,
> Clojure, Erlang, Elixir, Haskell, Swift, OCaml, Objective-C...?

Because C runs pretty much _anywhere_? There are plenty of platforms where C
is available where I doubt you'd find any of the others above (e.g. C64; yes
there are C compilers for them; yes, I'm mentioning it tongue in cheek)

Because you can generate small, compact static executables? E.g. I used to
write network monitoring software and an accompanying SNMP server for a system
with 4MB RAM and 4MB flash, the latter of which had to include the Linux
kernel and a shell on top of the application in question. The system was so
limited we did not run a normal init, and couldn't fit bash - instead we ended
up running ash as the init...

There are plenty of use-cases where "web application" == "user interface for a
tiny embedded platform".

~~~
snarfy
> Because C runs pretty much anywhere?

I always hear this argument, but as time has progressed, for better or worse
'anywhere' has become a much smaller target. If your language runs on Intel
and ARM then it's good enough. There are a lot of reasons I might choose C for
a project, but 'run anywhere' is not one of them.

~~~
aninteger
I'd say Intel, ARM, Power, and Sparc is good enough.

------
ejcx
This looks pretty neat actually. A ton of effort clearly went into it and it
looks like the code is really well written with pretty well thought out
interfaces

"Its main goals are security.."

Is it actually?

I also don't really see an advantage of using something like this over
something like the Go net/http package.

Web-type API stuff is usually high enough level that something like C doesn't
make sense. Go has nice enough standard packages for system things that even
if I was doing a lot of system-y stuff I would be alright. I don't really see
the type of work I would be doing where I want to use this.

~~~
mambodog
It does seem like if your goal was security then perhaps Rust would be a
better choice.

~~~
ZoFreX
If your goal is security literally any high level language is a better choice.

~~~
homarp
High level language like PHP? Security and secure code is a mindset. If you
code without thinking that bad things will happen, you will get bitten.

Sure, a high level language can help with memory management... but plenty of
CVEs are because of sloppy coding, not because of a low level language.

~~~
girvo
While you are correct in that picking a higher level language doesn't shield
you from writing insecure code, insecure C/C++ failure modes are usually quite
a bit worse than other languages used for that purpose. I don't trust myself
to write code that handles memory 100% correctly and is also network-facing.
If much better developers than I manage to screw that up, what chance do I
have?

~~~
CHY872
Mmm, it can be difficult to take advantage of C exploits (in the context of
a webserver). On a well written C system, you might expect most bugs to lead
to crashes.

PHP bugs tend to be more exploitable, because you're doing something supported
by the language.

~~~
girvo
But what's stopping someone from writing an app-level vulnerability in C vs
any other language? Most of them are because of horrible handling of strings,
which is something C is also not that great at. I'm not seeing the security
benefit here.

~~~
CHY872
My gut feeling is that if you really want security in C, you have fewer
constructs to misuse and so you get less unexpected behaviour. At the same
time, you get more protection mechanisms; guard pages etc. I.e. harder to be
secure, but if you really want to harden, you can get harder than in higher
level languages.

I'm not putting any kind of weight behind that, though; I just feel that it's
a bit odd for people (not specifically meaning you, just the whole thread) to
criticise this purely on language choice and not put any substance behind
their criticisms that actually relate to the software in question.

------
rdtsc
The site and documentation look well done, great job!

Architecture looks pretty interesting too. I wonder why there was a need for
an accept lock? An ordinary accept() socket call already allows multiple
threads/processes to wait simultaneously on a single socket.

~~~
luikore
I think the authors want to avoid thundering herd. You can find this basic
pattern in the book _UNIX Network Programming_.

~~~
jvink
Correct.

The accepting socket is shared between multiple workers, each of which has its
own fd for epoll or kqueue. Because of this a form of serialising the accepts
between said workers is needed to avoid unnecessary wakeups.
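
As a rough illustration of serialising accepts between workers (this is not Kore's actual code), the sketch below keeps a small lock in shared memory that is mapped before fork(), so every worker sees the same flag. The names `accept_lock`, `accept_lock_create`, `accept_lock_try` and `accept_lock_release` are made up for this example, and it assumes an `atomic_int` is lock-free on the target platform.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on glibc */

#include <sys/mman.h>

#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock shared by all workers; create it before fork(). */
struct accept_lock {
	atomic_int	held;	/* 0 = free, 1 = held by some worker */
};

/* Map the lock into memory that forked children will share. */
struct accept_lock *
accept_lock_create(void)
{
	struct accept_lock	*lock;

	lock = mmap(NULL, sizeof(*lock), PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (lock == MAP_FAILED)
		return (NULL);

	atomic_init(&lock->held, 0);
	return (lock);
}

/* Returns 1 if the caller now owns the lock, 0 if another worker does. */
int
accept_lock_try(struct accept_lock *lock)
{
	int	expected = 0;

	return (atomic_compare_exchange_strong(&lock->held, &expected, 1));
}

/* Give the lock back so another worker can take over accepting. */
void
accept_lock_release(struct accept_lock *lock)
{
	assert(atomic_load(&lock->held) == 1);
	atomic_store(&lock->held, 0);
}
```

In such a scheme a worker would only watch the listening socket in its own epoll or kqueue set while `accept_lock_try()` has returned 1 for it, and would release the lock after accepting, so a new connection wakes one worker instead of all of them.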

~~~
rdtsc
Actually that is being changed:

[http://lwn.net/Articles/633422/](http://lwn.net/Articles/633422/)

See part about EPOLLEXCLUSIVE

~~~
jvink
That is great, thanks for sharing.

~~~
rdtsc
If you are the author, thanks for sharing the project. You did a great job and
made the right choice of having per CPU worker processes each with their own
epoll loop.

------
unwind
Great! This looks really nice, and interesting.

Some fantastically quick points from a very cursory glance at the code. Feel
free to ignore this.

\- The code uses the convention to put the argument of return inside
parentheses, making it look like a function call. This is very strange, to me.

\- It treats sizeof as a function too (i.e. always parenthesizes the argument).

\- It is not C99, which always seems so fantastically defensive these days.

\- It's not (in my opinion) sufficiently const-happy.

\- I saw at least one instance (in cli.c) of a long string not being written
as an auto-concatenated literal but instead leading to multiple fprintf()
calls. Very obviously _not_ in a performance-critical place, so perhaps it's
not indicative of anything. It just made me take notice.
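
For readers unfamiliar with the last point: adjacent string literals are joined by the compiler, so a long help text can be one object written with one call. A minimal sketch follows; the text, `usage_text` and `print_usage` are invented for illustration and are not taken from Kore's cli.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Adjacent string literals are concatenated at compile time, so this
 * is a single string even though it is laid out over several lines.
 */
const char usage_text[] =
	"Usage: tool [command]\n"
	"\n"
	"Available commands:\n"
	"\thelp\tshow this text\n"
	"\trun\tstart the server\n";

/* Write the whole usage text with one call; returns 0 on success. */
int
print_usage(FILE *out)
{
	assert(out != NULL);

	return (fputs(usage_text, out) == EOF ? -1 : 0);
}
```

Whether this reads better than several fprintf() calls is a matter of taste, as the thread itself concludes.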

~~~
jvink
Author here.

I see you picked out the few things that I consistently hear about the coding
style I adopted, which is based on my time hacking on OpenBSD. I have no real
points to argue against those as it is based on preference in my opinion.

I am curious how you arrived at it not being sufficiently constified however.
I'll gladly make sensible changes.

As for the multiple fprintf() calls ... to me it just reads better and the
place it occurs in is, as you stated, pretty obviously not performance critical.

~~~
unwind
Right. I could have guessed these were based on some coding style guide from
somewhere.

I still don't see the point, or why any sane guide would prefer to treat
return as a function. It just never seems helpful to me, and always
wasteful/more complicated. I realize it's just two tokens, so it's probably
not "important" in any real sense of the word, but it irks me. I like to point
it out since it can help others cargo-culting this.

It's not sufficiently const if there are places where a variable could be
const but still isn't. :) To be super-specific, the variable 'r' here:
[https://github.com/jorisvink/kore/blob/master/src/cli.c#L542](https://github.com/jorisvink/kore/blob/master/src/cli.c#L542)
is one such case. It should be declared inside the loop, i.e. as "const
ssize_t r = write(...);" since once assigned the return value from write(),
it's read-only.

Of course, many ancient-smelling style guides seem to outlaw declaring
variables as close to their point of usage, too. Note that declaring variables
inside scopes other than the "root" one in a function isn't even C99, but many
people seem to think you can't do that.
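
A minimal sketch of the suggestion, assuming a hypothetical helper `write_all` (this is not the code at the linked line): the result of write() is declared at the top of the loop body, which is legal in C89 as well, and marked const because it is never modified after the call.

```c
#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd, or return -1. */
int
write_all(int fd, const char *buf, size_t len)
{
	size_t	off = 0;

	assert(buf != NULL);

	while (off < len) {
		/* Assigned once per iteration, read-only afterwards. */
		const ssize_t r = write(fd, buf + off, len - off);

		if (r == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}

		off += (size_t)r;
	}

	return (0);
}
```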

~~~
jvink
That's fair. Parenthesising return is a matter of readability and flavour to
me. It tickles my spidey sense if it is missing.

I strongly dislike declaring variables anywhere else but the function root,
but I agree with you on the example you provided that those kinds of variables
could be constified to be sane.

------
dang
Previously:
[https://news.ycombinator.com/item?id=5995298](https://news.ycombinator.com/item?id=5995298).

------
windlep
Was quite excited to try out a little websocket server with Kore till I saw it
forks per connection. I don't really want 20k processes for handling 20k
connections, I was really hoping for an event loop.

~~~
altano
Evented io is great for extremely high concurrency, but that isn't always the
right thing to optimize for. A forking web server might be faster for users
depending on the application.

Lastly, you can't just have an event loop without also creating an entirely
async platform. For an event loop to work well, all operations from file
reading to network requests need to be completely async.

~~~
bazookajoes
Out of curiosity, in what scenarios do you see a forking web server being
faster than an evented server that balances requests across cores and can
direct a request to the core with the best cache for the request?

I completely agree with need to async. The hard part is that many operations
are async without an async interface. For example memory allocation, or even
memory usage if the memory was not truly allocated by malloc.

~~~
nomel
Any non asynchronous application.

~~~
dhaivatpandya
Why would you have a web server that doesn't process requests async?

~~~
otterley
There are a few reasons I can think of.

1) Client libraries you might need to use in your web service might not be
available in asynchronous versions.

2) Blocking code is much easier to write than asynchronous code.

3) Your server code is CPU bound, so there's no benefit to an asynchronous
model.

4) If your web app runs in an asynchronous server and your app crashes, it'll
crash the whole server. On the other hand, in a forking model, only the client
that the child is serving will be impacted; the other workers will be
unaffected.

5) Memory leaks are easier to contain in a forking model, assuming the child
can exit or be killed after N requests.
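
Point 4 can be sketched in a few lines, under the assumption of one forked child per unit of work. `run_isolated`, `handler_ok` and `handler_crash` are hypothetical names; the point is only that a crash in the child is reported to the parent as a wait status instead of taking the parent down.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Run handler in a forked child. Returns the child's exit status, or
 * -1 if fork failed or the child was killed by a signal (a "crash").
 */
int
run_isolated(int (*handler)(void))
{
	pid_t	pid;
	int	status;

	assert(handler != NULL);

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(handler());

	if (waitpid(pid, &status, 0) == -1)
		return (-1);

	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}

/* Two illustrative handlers: one succeeds, one crashes. */
int
handler_ok(void)
{
	return (0);
}

int
handler_crash(void)
{
	abort();
}
```

After `run_isolated(handler_crash)` returns -1 the parent is still running and can keep serving other work, which is the isolation property described in point 4.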

------
rezacks
I've used
[https://github.com/cesanta/mongoose](https://github.com/cesanta/mongoose) in
the past; maybe I could try this one.

------
faragon
Writing high level C applications can be easy, if you use a library that frees
you from using dynamic memory on typical data structures (e.g. strings,
vectors, sorted binary trees, maps). I'm developing a C library for high-level
C code, with hard real-time in mind; it is already functional for static linking:
[https://github.com/faragon/libsrt](https://github.com/faragon/libsrt)

------
vbernat
When an HTTP API is just an additional feature of a larger project, it may
make sense to keep using C: a toolchain available everywhere and well known
(including cross-compilation and full bootstrap), a small memory footprint,
easy use of any library needed for the project.

I am doing a lot of that and will keep an eye on Kore. Unfortunately, HTTPS
only and a non-evented core is a no-go for me.

I am currently relying on the web server embedded in libevent, as well as
wslay for websockets and some additional code for SSE. To easily start a
project, I am using a cookiecutter template:
[https://github.com/vincentbernat/bootstrap.c-web](https://github.com/vincentbernat/bootstrap.c-web)

~~~
jvink
Perhaps you can check out the websocket and SSE examples.

[https://github.com/jorisvink/kore/tree/master/examples/sse](https://github.com/jorisvink/kore/tree/master/examples/sse)
[https://github.com/jorisvink/kore/tree/master/examples/webso...](https://github.com/jorisvink/kore/tree/master/examples/websocket)

------
bubbba
It would probably be better to also have a lib or DLL, not just a program that
renders C/C++ servlets. It seems that everything Kore executes from the code it
is fed runs in servlet threads, or is at least started from a servlet
thread. Or it would be cool to add some "application" framework, not only a
"C-servlet" framework.

------
danieltillett
Has anyone done any performance testing on kore? Yes I know I can do this
myself, but why invent the wheel.

~~~
delinka
You'd invent the wheel to get you places. You wouldn't _reinvent_ the wheel
when someone's already done a perfectly good job.

~~~
rch
as in... [http://okws.org](http://okws.org)

~~~
danieltillett
Interesting. Any performance data on this beyond the vague it is faster?

~~~
rch
Anything I've seen is 10-ish years old. It was relatively fast, but also
complex and covered by the GPL (for better or worse).

~~~
danieltillett
I found this which seems interesting

[https://github.com/koanlogic/klone](https://github.com/koanlogic/klone)

~~~
rch
The portability aspect is interesting.

I use uwsgi extensively (not just with python), and I think it sets the bar
these days.

------
maciej
If you think Kore is interesting, then also check out Tntnet
([http://www.tntnet.org](http://www.tntnet.org)). I checked it out a few
years ago and it felt good - stable, complete, easy to use etc.

------
zippie
Process per connection does not scale no matter how light weight, even with
COW. kore's connection handling model is the reason apache2 mpm_prefork fell
out of favor many iterations ago.

The only valid argument to avoid a single event based I/O is some sort of hard
blocking I/O such as disk or non-queuing chardev.

However I'm still not biting, this is solved...and as usual the answer is
somewhere in between. For example, in RIBS2 there are two models for
connection handling, event loops for connections and "ribbons" for the non-
queuing bits [1]. RIBS2 is also written in C for C.

[1]
[https://github.com/Adaptv/ribs2/blob/master/README](https://github.com/Adaptv/ribs2/blob/master/README)

Edit - mention RIBS2 is also written in C

~~~
jvink
Kore does not fork per connection.

It uses per cpu worker processes which multiplex I/O over either epoll or
kqueue.

~~~
zippie
A clone is a clone. The main implementation difference between a thread and
process in Linux is COW, for all intents and purposes a worker process and a
fork are the same in this use case. Neither have led to scalable web servers.

~~~
jvink
Except you are basing that on the assumption that it creates a single worker
process per connection. It does not.

Workers are spawned when the server is started. Each of them deals with tens
of thousands of connections on its own via the listening socket they share.

This is a common technique and scales incredibly well.
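
A simplified sketch of the pre-fork pattern described here, not Kore's actual code: a fixed number of workers is forked once at startup and the parent later reaps them. `spawn_workers`, `reap_workers` and `example_worker` are hypothetical names; in a real server each worker would run its own epoll or kqueue loop over the shared listening socket instead of returning immediately.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <unistd.h>

/*
 * Fork `count` workers up front. Each child runs worker_main(id) and
 * exits with its return value. Returns the number of workers started.
 */
int
spawn_workers(int count, int (*worker_main)(int))
{
	int	id;
	pid_t	pid;

	assert(worker_main != NULL);

	for (id = 0; id < count; id++) {
		pid = fork();
		if (pid == -1)
			break;
		if (pid == 0)
			_exit(worker_main(id));
	}

	return (id);
}

/* Wait for `count` workers; returns how many exited with status 0. */
int
reap_workers(int count)
{
	int	i, status, clean = 0;

	for (i = 0; i < count; i++) {
		if (wait(&status) == -1)
			break;
		if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
			clean++;
	}

	return (clean);
}

/* Placeholder: a real worker would run its event loop here. */
int
example_worker(int id)
{
	(void)id;
	return (0);
}
```

The number of processes depends on the worker count chosen at startup (for example one per CPU), not on the number of connections.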

------
bitL
Good job & good luck! ;-)

------
z3t4
Kore mapped to JavaScript (NodeJS):

    
    
      var kore = require("kore");
    
      kore.on("request", http_request);
    
      function http_request(req, resp) {
        var statusCode = 200;
        resp.write("Hello world", statusCode);
      }

------
talnet
Anyway I love this.

------
poindontcare
hmm maybe put it in the kernel and use it for debugging? can't use userspace
threading libraries though.

------
sownkunz
this is AWESOME!

------
stefantalpalaru
You can also do that with uWSGI:
[http://uwsgi.readthedocs.org/en/latest/Symcall.html](http://uwsgi.readthedocs.org/en/latest/Symcall.html)

------
golergka
So, on one hand, it's one process per connection and ease of development of C,
but on the other hand, it has execution speed of C.

Can anyone who works in web (I don't, but I'm curious) explain what kinds of
services this is good and bad for?

~~~
tomjen3
Control interfaces to embedded devices?

~~~
golergka
Thanks, that actually seems to be the perfect use-case (from my almost non-
existent experience with web development).

