
LibHTTP: Open Source HTTP Library in C - mabynogy
https://www.libhttp.org/
======
ekidd
Whenever I see a C networking library, my first questions are:

1\. How many security-critical CVEs has it had in the past?

2\. How extensively has it been fuzzed?

3\. Is this one of the rare C code bases that actually has pretty solid
security (like Dovecot
[https://www.helpnetsecurity.com/2017/01/17/dovecot-security-...](https://www.helpnetsecurity.com/2017/01/17/dovecot-security-audit/)
or various OpenBSD tools)?

I ask because once upon a time I just about broke my heart trying to write
secure networking code in C. I inspected every line. I chose my dependencies
carefully. I wrote extensive, malicious tests with tons of malformed data. I
added recursion limits to prevent stack overflow. I ran everything through
Electric Fence and used the best open source validation tools available in
2001. And despite months of effort for a very simple protocol, I wound up
being affected by at least 6 CVEs in 15 years
([http://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc-...](http://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc-c.html)),
many of them in the third-party XML parser I
used. But if somebody directly fuzzed my code, I bet they'd find _at least_
one more issue somewhere.
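
To make the "recursion limits" point concrete, here is a minimal sketch of a depth-limited recursive parser. It is not code from xmlrpc-c or LibHTTP: the toy grammar (nested `[` `]` lists), the names `parse_list` and `list_is_valid`, and the `MAX_DEPTH` value are all assumptions chosen for illustration.

```c
#include <stddef.h>

#define MAX_DEPTH 64  /* hypothetical limit; pick one that fits your stack budget */

/* Parses one nested list such as "[[][]]" from buf[0..len).
 * Returns the number of bytes consumed, or -1 on malformed input
 * or when nesting goes deeper than MAX_DEPTH. */
static long parse_list(const char *buf, size_t len, int depth)
{
    if (depth > MAX_DEPTH)
        return -1;                      /* refuse before recursing any deeper */
    if (len == 0 || buf[0] != '[')
        return -1;

    size_t pos = 1;
    while (pos < len && buf[pos] == '[') {
        long used = parse_list(buf + pos, len - pos, depth + 1);
        if (used < 0)
            return -1;                  /* propagate the failure upward */
        pos += (size_t)used;
    }
    if (pos >= len || buf[pos] != ']')
        return -1;                      /* missing closing bracket */
    return (long)(pos + 1);
}

/* Returns 1 if the whole input is exactly one well-formed list
 * within the depth limit, 0 otherwise. */
int list_is_valid(const char *buf, size_t len)
{
    long used = parse_list(buf, len, 1);
    return used >= 0 && (size_t)used == len;
}
```

With this sketch, `list_is_valid("[[][]]", 6)` accepts the input, while 65 levels of nesting are rejected instead of letting attacker-controlled input decide how deep the C stack grows.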

There are a few people I trust to write mostly secure networking software in
C. But the more time I spend fuzzing protocol parsers, the more I realize
that—even though I like to think I'm unusually careful and paranoid—I'll never
be one of those people.

So how does libhttp look from a security perspective? If it's truly paranoid
and thoroughly fuzzed, it might be very useful for embedded work.

~~~
duneroadrunner
> ([http://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc-...](http://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc-...))

It looks like some of those CVEs are dated not that long ago. If code safety
is still a concern with this project, you/someone might consider conversion to
SaferCPlusPlus (essentially a memory-safe subset of C++). There is an "auto-
conversion helper tool"[1] still in development, but already functional.

[1] shameless plug:
[https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...](https://github.com/duneroadrunner/SaferCPlusPlus-AutoTranslation)
(Feel free to post any questions to the github "issues" section.)

~~~
ekidd
> _If code safety is still a concern with this project, you/someone might
> consider conversion to SaferCPlusPlus (essentially a memory-safe subset of
> C++)._

Thank you for the pointer! I haven't been a maintainer of xmlrpc-c in over a
decade now, and I'm not even sure who's maintaining it or using it. The
sourceforge mailing list archives seem to be down, so I have no way to contact
the current maintainers.

The packages in Ubuntu which use xmlrpc-c are freeipa-client, rtorrent,
opennebula, certmonger, flowgrind and cobbler-enlist. I also remember 2 or 3
commercial users from 15 years ago. If any of these people are interested, I'd
consider writing a drop-in replacement in Rust that preserves the same C ABI,
and spend at least a week of CPU time fuzzing it.

------
jws
Architecturally this uses threads, a master listener and some number of worker
threads. You will add a couple calls to your program to initialize the web
server. The workers use synchronous I/O. You can easily add server side
support for Lua or JavaScript from which you can expose and manipulate your
program’s state, to the extent you can use threads in C without tearing your
leg off.

It looks much nicer than anything I’ve hand rolled in the past to add web
management to various servers and daemons.

------
vortico
It's nice to see a good C library, since it's very easy to bind to it from
virtually any programming environment, which can't be said about other
languages. And the build system can't be simpler: A makefile with a list of
source files and compile flags. All you need to do HTTP!

------
halayli
It's tempting to consider using an http server in your program. But also
consider wrapping your C/C++ programs in a FastCGI loop and having nginx run
behind it.

It removes a lot of complexity from your code.

~~~
jasode
_> nginx run behind it. It removes a lot of complexity from your code._

There are good intentions with that advice but I think it's misleading. It
doesn't take into account how http libraries like this one (and similar ones
for C++ such as Proxygen[1] and Silicon[2]) are _supposed_ to be used.

The intended use case is to add an embedded webserver to your executable
that's communicating with _friendly_ and _internal_ systems. That would be
things like microservices and dashboards.

You don't use those libraries to create _external_ public-facing websites that
would be under attack from hostile agents. You're correct: Do not re-invent
NGINX by using the C http library.

As an example proper use case, let's say you write an internal C program to
process terabytes of image files. You think it might be nice to have a visual
status of its progress/errors/throughput/etc. Instead of adding a GTK GUI to
the code, you use the http library to expose a "web dashboard". You can then
point an employee browser on the internal network at it to see its progress.
For this use case, avoiding the http library and adding NGINX into the stack
makes it _more complicated_.

Another use case without HTML gui is exposing http endpoints for
microservices. Again, it's _internal_ communication between friendly agents.

tldr: http libraries that are compiled into executables vs NGINX are for
different use cases.

[1]
[https://github.com/facebook/proxygen](https://github.com/facebook/proxygen)

[2] [http://siliconframework.org/](http://siliconframework.org/)

~~~
gtcode
Your described use case for a "friendly / internal" system has a limited
definition in any non-trivial sized organization. It would only apply when all
users of the embedded webserver already have root access to that server. This
might make sense, for instance, in a small development team with a flat
security hierarchy, but would be a red flag, security wise, otherwise.

~~~
jasode
_> Your described use case for a "friendly / internal" system has a limited
definition in any non-trivial sized organization._

Facebook is a non-trivial size company that has lots of internal private-
facing programs with embedded http connectivity. From that experience, they
open sourced the Proxygen http library, which is one of the links I mentioned in
the previous comment.

Also, see the comment from VikingCoder, which in turn links to a reddit thread
mentioning another big company like Google with similar use cases for http
libraries:
[https://news.ycombinator.com/item?id=15671936](https://news.ycombinator.com/item?id=15671936)

~~~
gtcode
Facebook is a mixed bag as far as being a model of good behavior on many
things. The company is founded on irresponsible social principles. For how
long did general internal employees have access to users' private data? This
is no longer the case, though, correct? Sorry to get OT.

Maybe these libs are well-vetted, after all. My mistake. Thanks for the info.

------
bootcat
Another library worth mentioning is
[https://github.com/h2o/h2o](https://github.com/h2o/h2o)

~~~
aninteger
h2o is nice, but I believe it requires libuv. libuv is a massive stack with
LOTS of abstractions on top of abstractions that are sorta atypical of C
libraries. If your code crashes because some assert in libuv fired, good luck
navigating through the "typedef city" that libuv is. I wrote a little server
built on top of libuv but I could never really get it working right on Windows
(which is a platform that libuv supports fairly well I think). In the end I
ended up abandoning libuv, and all that abstraction, and just requiring
Linux or BSD by targeting epoll/kqueue directly.

~~~
lxtx
libh2o does not require libuv, it has its own event loop. libuv is optional.

~~~
aninteger
Then their FAQ needs to be updated:

To build H2O as a library you will need to install the following dependencies:

libuv version 1.0 or above

OpenSSL version 1.0.2 or above

As I understand it, regular h2o doesn't require libuv but libh2o does.

~~~
deweerdt
The H2O build produces libh2o and libh2o-evloop. The latter doesn't depend on
libuv.

------
akandiah
It's good to see more libraries filling this space. One of the foremost
contenders is Libmicrohttpd
([https://www.gnu.org/software/libmicrohttpd/](https://www.gnu.org/software/libmicrohttpd/)).
The only problem with that library is the LGPL license.

~~~
crdoconnor
Why is LGPL a problem?

~~~
beefhash
Embedded platforms come to mind.

------
advisedwang
See
[https://curl.haxx.se/libcurl/competitors.html](https://curl.haxx.se/libcurl/competitors.html)
for a comparison of various HTTP libs (obviously through the lens of cURL)

~~~
0x0
While those are all http client libraries, it would appear this project is an
http server library.

~~~
otterley
I'd recommend a clearer title, then. HTTP is a protocol, so making the client
vs. server distinction explicit would be useful to the reader.

------
matt_wulfeck
I wish the "about" page for projects like these had more information. For
example, _why_ was this library created and by whom? What does it offer that
others don't? Etc. I want to learn more about the problem it's solving even
though I have no interest in the code, and I'm sure I'm not alone.

~~~
j_s
It is sad how open source is all about sharing/learning until it comes time to
link to competing/alternative projects.

------
ausjke
There are quite a few alternatives. libmicrohttpd is a very popular one and is
written in C. mongoose and civetweb (its fork) are both good alternatives that
are tested in the field and also written in C. You also have duda and libonion,
which serve the same purpose as HTTP libraries but with more features (RESTful,
WebSocket, etc.).

------
nurettin
I've used mongoose in the past to embed http servers in C++ programs. Would be
interesting to see how this performs in comparison.

[https://github.com/cesanta/mongoose](https://github.com/cesanta/mongoose)

Here's me attempting to optimize REST routing by using Google's re2 regex
library to reduce the list of possible regexes that might match a given URL:

[https://github.com/nurettin/pwned/blob/master/server/server....](https://github.com/nurettin/pwned/blob/master/server/server.hpp#L121)

FilteredRE2

[https://github.com/google/re2/blob/master/re2/filtered_re2.h](https://github.com/google/re2/blob/master/re2/filtered_re2.h)
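
For readers who have not seen the prefiltering idea before, here is a rough sketch of it in plain C. It is a simplification, not how FilteredRE2 or the linked server.hpp actually work: it uses POSIX `regcomp`/`regexec` instead of RE2, the required literal for each route is written by hand, and the route table and the name `match_route` are made up for the example. A real router would also compile each regex once at startup rather than per request.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

struct route {
    const char *literal;  /* substring every matching URL must contain */
    const char *pattern;  /* POSIX extended regex for the full match */
};

/* Hypothetical route table, for illustration only. */
static const struct route routes[] = {
    { "/users/",   "^/users/[0-9]+$" },
    { "/comments", "^/posts/[0-9]+/comments$" },
    { "/posts/",   "^/posts/[0-9]+$" },
};

/* Returns the index of the first matching route, or -1 if none match.
 * *regex_runs reports how many regexes were actually executed. */
int match_route(const char *url, int *regex_runs)
{
    size_t n = sizeof routes / sizeof routes[0];
    *regex_runs = 0;

    for (size_t i = 0; i < n; i++) {
        /* Cheap prefilter: if the literal is absent, the regex cannot match. */
        if (strstr(url, routes[i].literal) == NULL)
            continue;

        regex_t re;
        if (regcomp(&re, routes[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
            return -1;                  /* treat a bad pattern as "no route" */
        int rc = regexec(&re, url, 0, NULL, 0);
        regfree(&re);
        (*regex_runs)++;

        if (rc == 0)
            return (int)i;
    }
    return -1;
}
```

In this sketch a request for `/posts/7/comments` skips the first route without running its regex, and a request for `/about` runs no regex at all.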

~~~
ChickeNES
Given this is MIT licensed instead of GPL like mongoose, it's already leaps
and bounds more useful for embedding.

~~~
nurettin
Ok, gpl is a terrible calamity and I'm sorry I ever opened notepad.

------
0x0
What are the advantages over Civetweb, from which this project was forked?

(Civetweb itself was forked from Mongoose when Mongoose changed its license to
commercial+GPL)

------
amelius
HTTP/2 support?

~~~
jgowdy
h2o has it

------
VikingCoder
Re-posting from a reddit discussion [1] about another HTTP server written in
C:

I worked at a company once that had a really decent HTTP server library...
That they put in _every_ program.

You'd launch an app, and to debug it, you'd access
[http://localhost:9001](http://localhost:9001). From there, you could go to
URLs for different _libraries_ in the app. Like, if you had a compression
library, you could go
[http://localhost:9001/compression](http://localhost:9001/compression). It
would show stats about the recent work it had done, how long it took, how much
CPU, RAM, disk it used. The state of variables now, etc. You could click a
button to get it to dump its cache, etc.

If you were running a service on a remote machine, accessing it over HTTP to
control it was just awesome.
[http://r2d2:9001/restart](http://r2d2:9001/restart).
[http://r2d2:9001/quit](http://r2d2:9001/quit).
[http://r2d2:9001/logfile](http://r2d2:9001/logfile).

Oh, and the services running on that remote machine would register with a
system-level monitor. So, if you went to
[http://r2d2/services](http://r2d2/services), you could see a list of links to
connect to all of the running services.

...and every service registered with a global monitor for that service. So, if
you knew a Potato process was running somewhere, but you weren't sure which
machine it was on, you could find it by going to
[http://globalmonitor/Potato](http://globalmonitor/Potato), and you'd see a
list of machines it was running on.

Just all kinds of awesomeness were possible. Can not recommend enough.

And, I mean like, programs with a GUI. Like, picture a game. Except on my
second monitor, I had Chrome open, talking to the game's engine. I could use
things like WebSockets to stream data to the browser. Like, every time the
game engine rendered a shot, I could update it (VNC-style) in the browser
window. Except annotated with stats, etc. It was just the most useful way to
organize different debug information.

And what was great was that writing a library, and wanting to _output_
information, you wouldn't write it to std out... You'd make HTML content, and
write to it. Want to update it? Clear the buffer and write to it again. As a
user, if you ever want to read the buffer, you just browse it. Want to update
it? Refresh the window. Or better yet, stream it over a websocket. Like Std
Out on __steroids.__ If you need to combine the output from a few libraries in
a new window, you just write a bit more HTML in your code, and you're doin'
it.
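
A minimal sketch of what such a "clear the buffer and write to it again" status page could look like in C. This is not the library described in the comment: `struct status_page`, the function names, and the fixed `STATUS_CAP` size are assumptions made for illustration, and serving the buffer over HTTP is left to whatever server library you embed.

```c
#include <stddef.h>
#include <string.h>

#define STATUS_CAP 1024  /* hypothetical fixed capacity for one page */

struct status_page {
    char   html[STATUS_CAP];  /* always NUL-terminated */
    size_t len;               /* bytes used, excluding the NUL */
};

/* Start over: the next writes replace the old page content. */
void status_clear(struct status_page *p)
{
    p->len = 0;
    p->html[0] = '\0';
}

/* Appends trusted markup as-is. Returns 0, or -1 if it would not fit. */
int status_append_html(struct status_page *p, const char *markup)
{
    size_t n = strlen(markup);
    if (n >= STATUS_CAP - p->len)
        return -1;
    memcpy(p->html + p->len, markup, n + 1);  /* copies the NUL too */
    p->len += n;
    return 0;
}

static const char *escape_char(char c)
{
    switch (c) {
    case '&': return "&amp;";
    case '<': return "&lt;";
    case '>': return "&gt;";
    case '"': return "&quot;";
    default:  return NULL;
    }
}

/* Appends untrusted text, escaping HTML metacharacters.
 * Returns 0, or -1 (page unchanged) if the escaped text would not fit. */
int status_append_text(struct status_page *p, const char *text)
{
    size_t need = 0;
    for (const char *s = text; *s; s++) {
        const char *e = escape_char(*s);
        need += e ? strlen(e) : 1;
    }
    if (need >= STATUS_CAP - p->len)
        return -1;

    for (const char *s = text; *s; s++) {
        const char *e = escape_char(*s);
        if (e) {
            size_t n = strlen(e);
            memcpy(p->html + p->len, e, n);
            p->len += n;
        } else {
            p->html[p->len++] = *s;
        }
    }
    p->html[p->len] = '\0';
    return 0;
}
```

A library would call `status_clear` and then rebuild its page whenever its state changes; the HTTP handler only has to send `html` and `len`. Escaping values on the way in matters even for an internal dashboard, since logged strings often contain data that came from outside.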

It's just another example, in my mind, of the power of _libraries._ We all get
used to thinking of frameworks (IIS, Apache) as the only way to solve a
problem, that we forget to even _think_ about putting things together in new
and unexpected ways. __HTTP as a library - HELL YES.__

Using HTML to debug programs, _live_, is highly under-utilized.

[1] :
[https://www.reddit.com/r/programming/comments/36d190/h2o_is_...](https://www.reddit.com/r/programming/comments/36d190/h2o_is_a_very_fast_http_server_written_in_c_it/crcylze/)

------
__michaelg
Networking library in C? Nice. Who is afl'ing it first?
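
For anyone who wants to try, a fuzz harness can be very small. The sketch below uses the `LLVMFuzzerTestOneInput` entry point that libFuzzer defines (AFL++ can also drive harnesses of this shape). The function `toy_parse_method` is a made-up stand-in for the code under test, not a LibHTTP function; you would replace it with a call into the real request parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parser under test: copies the METHOD token
 * of "METHOD SP TARGET SP VERSION" into out (capacity cap).
 * Returns 0 on success, -1 if the input is rejected. */
int toy_parse_method(const uint8_t *data, size_t size, char *out, size_t cap)
{
    if (size == 0)
        return -1;
    const uint8_t *sp = memchr(data, ' ', size);
    if (sp == NULL)
        return -1;                       /* no space: not a request line */
    size_t n = (size_t)(sp - data);
    if (n == 0 || n >= cap)
        return -1;                       /* empty or oversized method */
    memcpy(out, data, n);
    out[n] = '\0';
    return 0;
}

/* Fuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char method[16];
    if (toy_parse_method(data, size, method, sizeof method) == 0) {
        /* Invariant the fuzzer can break if the parser is wrong. */
        assert(method[0] != '\0');
    }
    return 0;
}
```

Assuming a clang toolchain, something like `clang -g -fsanitize=fuzzer,address harness.c` builds it with libFuzzer and AddressSanitizer, so out-of-bounds reads and writes in the parser show up as crashes rather than silent corruption.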

------
jjawssd
During the past twelve months, this project has had only one active
contributor.

~~~
abiox
how should i interpret this information?

~~~
aquadrop
That this project has bus factor = 1. High risk.

~~~
jjnoakes
The risk isn't a function of developer count alone; for example, a simple and
polished feature-complete library with one developer is lower risk than a beta
version of a huge complex distributed system with 100 developers.

------
arca_vorago
"LibHTTP is licensed under the MIT license. Therefore the library can be used
in both open and closed source projects."

I really don't want to be overly negative, and I applaud people for putting in
work like this... but this is a perfect example of what's wrong with BSDesque
licenses right here.

More people should at least consider gplv3'ing their stuff. So many of the bad
stories we hear are side-effects from software that doesn't respect the user
in the first place, with licenses that don't respect the user.

RMS was and is right.

If I may offer an alternative that respects the user, I have had great
experiences with Hiawatha, which is my current standard webserver these days,
even over nginx or apache. It has the added bonus of being programmed with
security in mind from the start, which is something we always hear people talk
about wanting but how often do we actually see it?

~~~
foodstances
I don't understand this logic.

You get free software with few restrictions on what you're allowed to do with
it. In what way is that "respecting" you _less_ than similar software with
_more_ restrictions on what you can do with it? How does someone forcing more
control over your actions in any way "respect" you more?

~~~
arca_vorago
"the library can be used in both open and closed source projects."

When that source gets put in a product that is closed and then that product
makes it to the user. How do people not understand this difference by now?
Tivoization. It violates the four freedoms principle. That's how.

To quote zAy0LfpBZLC8mAC:

"A law that allows murder is more permissive and obviously does not increase
freedom. It's simply a fallacy to think that not putting any limits on what
any individual can do results in maximal freedom for society at large."

~~~
foodstances
This argument for the GPL only wins in a fantasy land where all software has
to be released under GPL or not at all. In reality, companies reject the GPL
and just write their own code or adopt and contribute to permissively licensed
projects, and that is what ends up "making it to the user".

You're telling authors to take away freedoms of their own users now, to prevent
a potential bogeyman from taking away freedoms of other users in the future.

You can use permissively licensed code now and forever, because that can't be
taken away once released. You're refusing to use it now on the off chance that
in the future, someone modifies that code to do something else and give it to
other users without the code. Your initial use still hasn't changed, and your
right to use that initial code hasn't changed. You are not a user of the new,
modified code, so your rights have not been impacted whatsoever.

~~~
icebraining
_In reality, companies reject the GPL and just write their own code_

Right, so we either end up with more Free Software, or with more employment
for us programmers. It's a clear win-win!

------
teknico
I'll just leave this here. Downvotes start in 3.. 2... 1...

The long goodbye to C -
[http://esr.ibiblio.org/?p=7711](http://esr.ibiblio.org/?p=7711)

~~~
sctb
Could you please stop starting language flamewars? Surely you could come up
with something substantive to say.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

