1. How many security-critical CVEs has it had in the past?
2. How extensively has it been fuzzed?
3. Is this one of the rare C code bases that actually has pretty solid security (like Dovecot https://www.helpnetsecurity.com/2017/01/17/dovecot-security-... or various OpenBSD tools)?
I ask because once upon a time I just about broke my heart trying to write secure networking code in C. I inspected every line. I chose my dependencies carefully. I wrote extensive, malicious tests with tons of malformed data. I added recursion limits to prevent stack overflow. I ran everything through Electric Fence and used the best open source validation tools available in 2001. And despite months of effort for a very simple protocol, I wound up being affected by at least 6 CVEs in 15 years (http://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc-...), many of them in the third-party XML parser I used. But if somebody directly fuzzed my code, I bet they'd find at least one more issue somewhere.
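For readers unfamiliar with the recursion-limit trick mentioned above, here is a minimal sketch in C. It is not code from xmlrpc-c; the toy bracket grammar and the names `MAX_DEPTH`, `parse_group` and `parse_document` are all hypothetical, chosen only to show the shape of the check.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64  /* hypothetical limit; tune for the target's stack size */

/* Parses balanced, possibly nested "[...]" groups from buf[*pos..len).
 * Returns 0 on success, -1 on malformed input or nesting deeper than MAX_DEPTH. */
static int parse_group(const char *buf, size_t len, size_t *pos, int depth)
{
    assert(buf != NULL && pos != NULL);
    if (depth > MAX_DEPTH)
        return -1;                      /* refuse before recursing any further */
    while (*pos < len) {
        char c = buf[(*pos)++];
        if (c == '[') {
            if (parse_group(buf, len, pos, depth + 1) != 0)
                return -1;
        } else if (c == ']') {
            return depth > 0 ? 0 : -1;  /* a stray ']' at top level is an error */
        }
    }
    return depth == 0 ? 0 : -1;         /* input ended inside an open group */
}

/* Entry point: validates a whole buffer of untrusted input. */
int parse_document(const char *buf, size_t len)
{
    size_t pos = 0;
    return parse_group(buf, len, &pos, 0);
}
```

The point is that attacker-controlled nesting depth becomes an ordinary parse error instead of unbounded stack growth. A check like this only covers one class of bug, which is why fuzzing still matters.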
There are a few people I trust to write mostly secure networking software in C. But the more time I spend fuzzing protocol parsers, the more I realize that, even though I like to think I'm unusually careful and paranoid, I'll never be one of those people.
So how does libhttp look from a security perspective? If it's truly paranoid and thoroughly fuzzed, it might be very useful for embedded work.
It looks like some of those CVEs are dated not that long ago. If code safety is still a concern with this project, you/someone might consider conversion to SaferCPlusPlus (essentially a memory-safe subset of C++). There is an "auto-conversion helper tool" still in development, but already functional.
Shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla... (Feel free to post any questions to the GitHub "issues" section.)
Thank you for the pointer! I haven't been a maintainer of xmlrpc-c in over a decade now, and I'm not even sure who's maintaining it or using it. The sourceforge mailing list archives seem to be down, so I have no way to contact the current maintainers.
The packages in Ubuntu which use xmlrpc-c are freeipa-client, rtorrent, opennebula, certmonger, flowgrind and cobbler-enlist. I also remember 2 or 3 commercial users from 15 years ago. If any of these people are interested, I'd consider writing a drop-in replacement in Rust that preserves the same C ABI, and spend at least a week of CPU time fuzzing it.
It looks much nicer than anything I’ve hand rolled in the past to add web management to various servers and daemons.
It removes a lot of complexity from your code.
There are good intentions with that advice but I think it's misleading. It doesn't take into account how http libraries like this one (and similar ones for C++ such as Proxygen and Silicon) are supposed to be used.
The intended use case is to add an embedded webserver to your executable that's communicating with friendly and internal systems. That would be things like microservices and dashboards.
You don't use those libraries to create external public-facing websites that would be under attack from hostile agents. You're correct: Do not re-invent NGINX by using the C http library.
As an example proper use case, let's say you write an internal C program to process terabytes of image files. You think it might be nice to have a visual status of its progress/errors/throughput/etc. Instead of adding a GTK GUI to the code, you use the http library to expose a "web dashboard". You can then point an employee browser on the internal network at it to see its progress. For this use case, avoiding the http library and adding NGINX into the stack makes it more complicated.
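A rough sketch of what the dashboard side of that might look like in C. This deliberately does not use the libhttp API (I'm not assuming anything about its functions); it only shows the part that turns progress counters into an HTTP response. The `job_status` struct, its fields, and `format_status_response` are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical progress counters kept by the image-processing program. */
struct job_status {
    unsigned long files_done;
    unsigned long files_total;
    unsigned long errors;
};

/* Writes a complete HTTP/1.1 response with a small HTML status page into out.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the page does not fit into the buffers. */
int format_status_response(char *out, size_t cap, const struct job_status *s)
{
    char body[256];
    int body_len, n;

    assert(out != NULL && s != NULL);

    /* Build the body first so Content-Length can be filled in. */
    body_len = snprintf(body, sizeof body,
                        "<html><body><h1>Image job</h1>"
                        "<p>%lu of %lu files processed, %lu errors</p>"
                        "</body></html>",
                        s->files_done, s->files_total, s->errors);
    if (body_len < 0 || (size_t)body_len >= sizeof body)
        return -1;

    n = snprintf(out, cap,
                 "HTTP/1.1 200 OK\r\n"
                 "Content-Type: text/html; charset=utf-8\r\n"
                 "Content-Length: %d\r\n"
                 "Connection: close\r\n"
                 "\r\n",
                 body_len);
    if (n < 0 || (size_t)n + (size_t)body_len >= cap)
        return -1;

    memcpy(out + n, body, (size_t)body_len + 1);  /* copy body plus NUL */
    return n + body_len;
}
```

Whatever embedded HTTP library you pick would call something like this from its request handler and send the bytes back. The program stays a single executable with no extra server in the stack.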
Another use case, without an HTML GUI, is exposing HTTP endpoints for microservices. Again, it's internal communication between friendly agents.
tldr: http libraries that are compiled into executables vs NGINX are for different use cases.
Facebook is a company of non-trivial size that has lots of internal, private-facing programs with embedded HTTP connectivity. From that experience, they open sourced the Proxygen HTTP library, which is one of the links I mentioned in the previous comment.
Also, see the comment from VikingCoder, which in turn links to a reddit thread mentioning another big company, like Google, with similar use cases for HTTP libraries:
Maybe these libs are well-vetted, after all. My mistake. Thanks for the info.
You could use a very stripped-down and thus potentially very efficient HTTP parser given that most of the work will be offloaded to nginx.
Also, last time I checked, nginx didn't support request multiplexing/pipelining over FastCGI (as it does over HTTP). Instead, each HTTP request received by nginx would result in a new (FastCGI) connection to your application which is obviously suboptimal.
I have these same feelings every time I see uWSGI being used.
Not to mention, Nginx implements quite a lot of features that are probably missing in LibHTTP, or implements them in a more performant way.
Having it inbuilt is useful for testing.
Also not sure whether doing a fastcgi loop and requiring another http server to frontend it really is simpler!
In fact in some cases it may only accept one client at a time.
1: This of course presupposes you know, or are willing to learn, Rust. That said, given what you've explained about libuv, the total time to grok libuv's code base and the time to learn Rust may be similar, but learning Rust may pay more dividends?
To build H2O as a library you will need to install the following dependencies:
libuv version 1.0 or above
OpenSSL version 1.0.2 or above
As I understand it, regular h2o doesn't require libuv but libh2o does.
It's the Flask/Sinatra of the C world, though you do need to provide your own router if you want anything a little bit more complex.
Here's me attempting to optimize REST routing by using Google's re2 regex library to reduce the list of possible regexes that might match a given URL:
(Civetweb itself was forked from Mongoose when Mongoose changed its license to commercial+GPL)
I worked at a company once that had a really decent HTTP server library... That they put in every program.
You'd launch an app, and to debug it, you'd access http://localhost:9001. From there, you could go to URLs for different libraries in the app. Like, if you had a compression library, you could go http://localhost:9001/compression. It would show stats about the recent work it had done, how long it took, how much CPU, RAM, disk it used. The state of variables now, etc. You could click a button to get it to dump its cache, etc.
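To make the "each library gets its own URL" idea concrete, here is a minimal sketch in C of the registry part. This is not the code from that company; `debug_register`, `debug_dispatch`, `MAX_PAGES` and the sample `compression_page` are hypothetical names, and the HTTP server that would call the dispatcher is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAGES 32  /* hypothetical cap on registered debug pages */

/* A page handler returns the HTML to show for its URL path. */
typedef const char *(*page_fn)(void);

struct debug_page {
    const char *path;
    page_fn render;
};

static struct debug_page pages[MAX_PAGES];
static size_t page_count;

/* Called by each library at startup to claim a path such as "/compression".
 * Returns 0 on success, -1 if the table is full. */
int debug_register(const char *path, page_fn render)
{
    assert(path != NULL && render != NULL);
    if (page_count == MAX_PAGES)
        return -1;
    pages[page_count].path = path;
    pages[page_count].render = render;
    page_count++;
    return 0;
}

/* Called by the embedded HTTP server with the request path.
 * Returns the page HTML, or NULL if nothing is registered (answer 404). */
const char *debug_dispatch(const char *path)
{
    size_t i;
    for (i = 0; i < page_count; i++)
        if (strcmp(pages[i].path, path) == 0)
            return pages[i].render();
    return NULL;
}

/* Example handler a compression library might register. */
static const char *compression_page(void)
{
    return "<h1>compression</h1><p>stats would go here</p>";
}
```

A library would call `debug_register("/compression", compression_page)` once during initialization, and the server would pass every incoming request path to `debug_dispatch`. Exact string matching keeps the sketch short; a real version would probably want prefix matching and some locking.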
If you were running a service on a remote machine, accessing it over HTTP to control it was just awesome. http://r2d2:9001/restart. http://r2d2:9001/quit. http://r2d2:9001/logfile.
Oh, and the services running on that remote machine would register with a system-level monitor. So, if you went to http://r2d2/services, you could see a list of links to connect to all of the running services.
...and every service registered with a global monitor for that service. So, if you knew a Potato process was running somewhere, but you weren't sure which machine it was on, you could find it by going to http://globalmonitor/Potato, and you'd see a list of machines it was running on.
Just all kinds of awesomeness were possible. Cannot recommend it enough.
And, I mean like, programs with a GUI. Like, picture a game. Except on my second monitor, I had Chrome open, talking to the game's engine. I could use things like WebSockets to stream data to the browser. Like, every time the game engine rendered a shot, I could update it (VNC-style) in the browser window. Except annotated with stats, etc. It was just the most useful way to organize different debug information.
And what was great was that writing a library, and wanting to output information, you wouldn't write it to std out... You'd make HTML content, and write to it. Want to update it? Clear the buffer and write to it again. As a user, if you ever want to read the buffer, you just browse it. Want to update it? Refresh the window. Or better yet, stream it over a websocket. Like Std Out on steroids. If you need to combine the output from a few libraries in a new window, you just write a bit more HTML in your code, and you're doin' it.
It's just another example, in my mind, of the power of libraries. We all get so used to thinking of frameworks (IIS, Apache) as the only way to solve a problem that we forget to even think about putting things together in new and unexpected ways. HTTP as a library - HELL YES.
Using HTML to debug programs, live, is highly under-utilized.
 : https://www.reddit.com/r/programming/comments/36d190/h2o_is_...
I really don't want to be overly negative, and I applaud people for putting in work like this... but this is a perfect example of what's wrong with BSD-esque licenses right here.
More people should at least consider GPLv3'ing their stuff. So many of the bad stories we hear are side-effects from software that doesn't respect the user in the first place, with licenses that don't respect the user.
RMS was and is right.
If I may offer an alternative that respects the user, I have had great experiences with Hiawatha, and it is my standard web server these days, even over nginx or Apache. It has the added bonus of being programmed with security in mind from the start, which is something we always hear people talk about wanting, but how often do we actually see it?
You get free software with few restrictions on what you're allowed to do with it. In what way is that "respecting" you less than similar software with more restrictions on what you can do with it? How does someone forcing more control over your actions in any way "respect" you more?
When that source gets put in a product that is closed and then that product makes it to the user. How do people not understand this difference by now? Tivoization. It violates the four freedoms principle. That's how.
To quote zAy0LfpBZLC8mAC:
"A law that allows murder is more permissive and obviously does not increase freedom. It's simply a fallacy to think that not putting any limits on what any individual can do results in maximal freedom for society at large."
You're telling authors to take away freedoms from their own users now, to prevent a potential bogeyman from taking away freedoms of other users in the future.
You can use permissively licensed code now and forever, because that can't be taken away once released. You're refusing to use it now on the off chance that in the future, someone modifies that code to do something else and give it to other users without the code. Your initial use still hasn't changed, and your right to use that initial code hasn't changed. You are not a user of the new, modified code, so your rights have not been impacted whatsoever.
Right, so we either end up with more Free Software, or with more employment for us programmers. It's a clear win-win!
For most users the choice is not between a 50% proprietary product that uses an MIT library and a FOSS product that is 0% proprietary. The choice is between a 50% proprietary product and a 100% proprietary product, and you are trying to take away their choice of the 50% proprietary product.
Complete eradication of proprietary software is not the expected outcome of MIT licensed libraries. The goal is reduction.
The long goodbye to C - http://esr.ibiblio.org/?p=7711