Hacker News
Balde: a microframework to develop web applications in C (rgm.io)
215 points by ashitlerferad on May 22, 2016 | 80 comments



Hi guys, I'm the author of this framework, and it seems that someone shared it here because I started this blog post series this weekend: https://rgm.io/post/balde-internals-part1-foundations/

This is why most of the documentation is marked as TODO, but the API docs are reasonably up-to-date.

If someone is interested in it or is willing to help, please let me know :)

EDIT:

simple url shortener example, using redis: https://gist.github.com/rafaelmartins/9f8392a8909e62820ae0

"complete" app, with templates and stuff: https://github.com/rafaelmartins/bluster


Unless you're going to reference your blog posts in the documentation and you're willing to keep such documentation updated, I'd strongly recommend that you focus on improving the documentation instead of writing long walls of text on the internals.

If I imagine myself using your framework, well, I would be very angry at you if I had to stop writing code, leave my editor and go read your shilly-shally.

I would focus on:

1) Topics like "Application structure", "Application Deployment", "Application command-line interface" (which at the moment of writing are all just a "TODO").

2) Sample code. A 30-line example is worth a thousand-word blog post.

3) More sample code. Ideally, one for every crucial part (cookies? session? middlewares?)

By the way: I see there are some examples, great! But add some more comments to your examples.

4) Blog post on internals.

I'd delay the blog post on internals because I guess that, since you'll be receiving a lot of feedback, you might want to change things in the internals, thus "invalidating" your blog posts.

P.s: congratulations on using doxygen so well! I wish more projects had such well-done documentation!

p.p.s: I hope you won't find my comment rude or anything, I actually like this project.


hey, thanks for the comment, it is far away from being rude :)

blog posts were never the focus, they are mostly for myself, to remember how things work, as I haven't worked on the framework for some time, and need to have it fresh in mind to write proper docs. it may also help other people willing to help writing docs. also, whoever published the balde website here on HN found it due to these blog posts, so they were worth it already. :) and to be honest, I wouldn't have published the framework here on hacker news right now, I'd have waited for the documentation to be ready, but someone else couldn't wait, so...

but you're right. the focus now is writing documentation.

thanks


Great job! But Balde first appeared on HN about 2 years ago, https://news.ycombinator.com/item?id=7765301.


There is something fantastic about the world of open source: you can enhance the bad parts of a project with your own skills to make it awesome.

Basically, if you like the framework and there is no doc, take the pain on yourself and contribute.

p.p.s: I hope you won't find my comment rude or anything


Using SCGI as the sole backend protocol is a bad idea. I have implemented a C++ SCGI stack myself. The problem with SCGI that I found (of course very late in the game) is that it requires a TCP connection on localhost for every request on the HTTP frontend. This will leave you with a TCP socket in TIME_WAIT state for every logical request. While it is very simple to implement, it's missing pipelining and keep-alive. I think it's a better idea to implement a very minimalistic HTTP backend that can do pipelining and keep-alive with the upstream HTTP server. You don't need to support the full HTTP protocol, just enough to function with nginx/lighttpd/Apache.
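To illustrate the "very simple to implement" part: an SCGI request is a netstring ("<length>:<headers>,") whose headers are NUL-terminated name/value pairs, followed by the body. Below is a minimal sketch of parsing that framing in plain C. The function names are hypothetical; this is not code from balde or from the C++ stack mentioned above, and it does nothing about the one-connection-per-request limitation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parse the "<len>:" netstring prefix of an SCGI request.
 * On success returns 0, sets *hdr_off to the offset of the first header
 * byte and *hdr_len to the number of header bytes.
 * Returns -1 if the framing is malformed or the buffer is too short. */
int scgi_parse_prefix(const char *buf, size_t buf_len,
                      size_t *hdr_off, size_t *hdr_len)
{
    size_t i = 0, n = 0;

    if (buf_len == 0 || buf[0] < '0' || buf[0] > '9')
        return -1;
    while (i < buf_len && buf[i] >= '0' && buf[i] <= '9') {
        if (n > (SIZE_MAX - 9) / 10)
            return -1;                 /* length would overflow size_t */
        n = n * 10 + (size_t)(buf[i] - '0');
        i++;
    }
    if (i == buf_len || buf[i] != ':')
        return -1;
    i++;                               /* skip ':' */
    if (n >= buf_len - i)              /* need n header bytes plus ',' */
        return -1;
    if (buf[i + n] != ',')
        return -1;
    if (n > 0 && buf[i + n - 1] != '\0')
        return -1;                     /* last value must be NUL-terminated */
    *hdr_off = i;
    *hdr_len = n;
    return 0;
}

/* Look up a header value by name inside the header block.
 * Returns a pointer to the NUL-terminated value, or NULL if absent. */
const char *scgi_get_header(const char *hdrs, size_t hdr_len,
                            const char *name)
{
    size_t pos = 0;

    while (pos < hdr_len) {
        const char *key = hdrs + pos;
        const char *kend = memchr(key, '\0', hdr_len - pos);
        if (kend == NULL)
            return NULL;
        pos = (size_t)(kend - hdrs) + 1;
        if (pos >= hdr_len)
            return NULL;               /* name without a value */
        const char *val = hdrs + pos;
        const char *vend = memchr(val, '\0', hdr_len - pos);
        if (vend == NULL)
            return NULL;
        pos = (size_t)(vend - hdrs) + 1;
        if (strcmp(key, name) == 0)
            return val;
    }
    return NULL;
}
```

A caller would read the request from the socket, call scgi_parse_prefix once, then use scgi_get_header for names such as CONTENT_LENGTH before reading the body that follows the comma.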


Totally agree with that. I implemented FastCGI and SCGI in Swift as a learning project a few months ago, and SCGI seems really weak for a real-world project right now, even a simple one. I would also give an extra +1 to the recommendation to consider interfacing with nginx/lighttpd/Apache.


thanks for sharing your experience. I appreciate it


Is this meant to be pronounced "baldy"? Are you bald?

Excellent name, especially if one or more of the above questions are true!


haha, no, it is a joke with "bucket" in Brazilian Portuguese, but it's not the first time I see someone relating it with bald people ;)


Is it supposed to do more than hundreds of requests per second??


of course. I'll just get some benchmark results published before making such a claim on the website. :)


Could you please compare it with kore [https://github.com/jorisvink/kore], which is another C web server that seems quite efficient when tested with the benchmarking tool wrk [https://github.com/wg/wrk]? Results for kore are bad with the benchmark tool Siege because of its particular request pattern.

You might find this set of benchmarks inspirational [https://github.com/nanoant/WebFrameworkBenchmark]. The Java web server seems faster because it returns less data. kore is faster in fact.

Note: vibe.d has improved its performance with more recent versions. The author of the benchmarks ignores requests to update the benchmarks.


I would guess most people use C web frameworks for embeddability, not performance. The use case is to add a web-based UI to some C/C++ application or device, think configuration UI of your router or something of this sort.

My go-to solution for this is libwebsockets. Balde seems to have some nice features, but the SCGI + external webserver requirement makes it difficult to embed. I'd also question whether GLib is a reasonable dependency for an HTTP microframework.


My thoughts exactly! Apart from embedded systems (or using it as embedded in an application, e.g. to provide an application's help system via an internal web browser) I'd find it weird to build websites in C. I'm looking for something along these lines for embedded systems, but GLib & external webserver disqualify balde. Still looks like a nice, clean project, though I was disappointed to find out I can't try it out :)


C may have some shortcomings, particularly policing memory violations, but can be very lightweight and very fast. Remember, Linux and Apache/NGINX are in C. Implementing the site specifics in C isn't that much of a stretch. The extra speed could even make your site that much more usable.

The canonical genome browser is written in C: http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19 ... for pure speed.


"With balde you can serve hundreds of requests per second" is it supposed to be "hundreds of thousands"?


They're just playing the expectations game.


Unlike, say, Oracle :-)


It's based on SCGI, which is kinda slow these days.


So why bother with a low level language like C? With just plain CPython, Flask and a file based approach (including polling the modified dates and thus get live reload) I can get up to ~2500 req/s on localhost with a good CPU, i.e. network is going to be the bottleneck until a fairly high level of performance (at which it becomes a no brainer to scale out).

https://github.com/muellermichel/guetzli


I'm just avoiding claims without evidence... unlike the guy who says that SCGI is slow ;)


It would still be interesting to know what you think you can achieve with this approach, i.e. in a direct comparison with Flask (because as you wrote it's quite similar in the featureset) what do you think is the advantage of your approach? Performance is the only justification you give on the page, so I'd be curious about this.

Other than that I guess it's useful if you want to display data from a C / C++ program without going through any bindings. When it comes to that, maybe it would be great to get some automatic HTML display helpers for data structures common in C, like Arrays of numerical values or pointers to sequences of structs. For example in Python it's really fun to just use something like [1] together with Flask and get interfaces to your data without doing any template programming.

[1] https://github.com/softvar/json2html


On a decent box Go/GIN will do about 180K, Elixir/Phoenix almost 200K, Node/Express about 90K. People are not trying to be mean, but in everyone's mind the major reason to use a C framework would be it offering a significant speed advantage over easier-to-use alternatives.


One question I always had in mind about such high performance web servers: What kind of networking stack do you use to actually make use of it? This is probably a limitation of our VPS host, but on a given box it usually maxes out at around 1000 req/s while the webserver could do more. Can you get AWS to 200K? Azure? Google App Engine? Or am I just doing something wrong?


Actual box: 24 cores, 128 GB RAM, 2x10GbE, SSDs in RAID 10. Not really a big fan of "clouds" for high-load projects; Rackspace OnMetal looks OK for some things, though.


That's what I expected. Yes, when you actually need on the order of 100k req/s, doing this over 100 vps or even 10 dedicated doesn't look that good over just 2-3 boxes in your own datacenter. That is, if you already have one. On the other hand even Netflix has moved on to Amazon, so there must be something to it.


Netflix gets a 70% discount off list price, and they also consume over 30% of AWS resources. They are a pretty good match for AWS as they have an extremely spiky workload.


I used my own C code generator for CGIs for years, along with mongoose for other tasks. Nowadays I'm using https://kore.io and I'm VERY happy with it: great developer experience, well-written and understandable code, minimal dependencies and no bugs so far. I'm using it quite heavily in a project, and it was easy to add templating and other amenities to it. Highly recommended.


It has potential, but the documentation is full of TODO, so it's hard to say much about it. For example, I was trying to figure out how it handles unicode, and also how it handles memory management.

For comparison, ribs2 ( https://github.com/Adaptv/ribs2 ) is a framework in C that handles garbage collection for you. But it's not really a comparison because Balde doesn't have the documentation, unfortunately.


yes, it lacks documentation, I started this blog post series to try to address this: https://rgm.io/post/balde-internals-part1-foundations/

if you have some specific question, feel free to ask


So, how does it handle memory allocation? Are there memory pools?


as it is a microframework, it does not care about such details: you can handle memory allocation by hand, or use memory pools provided by glib, create gobjects and use refcount, etc. It is up to you :)

Internally used memory should be freed manually with the balde_app_free function, which encapsulates most of it for you.
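As an illustration of the "by hand" option, here is a minimal sketch of a per-request bump arena in plain C. It is an assumption-laden example, not part of balde or GLib: the type and function names (arena_t, arena_alloc, and so on) are hypothetical, and the 16-byte alignment is a simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A fixed-size bump arena: allocations are carved out of one block and
 * released all at once, e.g. at the end of a request. */
typedef struct {
    unsigned char *base;
    size_t cap;
    size_t used;
} arena_t;

/* Returns 0 on success, -1 if the backing block cannot be allocated. */
int arena_init(arena_t *a, size_t cap)
{
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Returns n bytes rounded up to a multiple of 16, or NULL if the arena
 * does not have enough room left. */
void *arena_alloc(arena_t *a, size_t n)
{
    if (n > SIZE_MAX - 15)
        return NULL;                        /* rounding would overflow */
    size_t need = (n + 15) & ~(size_t)15;
    if (need > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += need;
    return p;
}

/* Forget every allocation but keep the block for the next request. */
void arena_reset(arena_t *a)
{
    a->used = 0;
}

/* Release the backing block. */
void arena_free(arena_t *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

A view function could take everything it needs from the arena and the caller would call arena_reset once the response has been sent, so individual frees are never needed.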


That actually helps me have a better understanding of how the framework is structured, thanks.


This looks very interesting. This being C, I assume the performance is really good (although yes I know it's not always the case).

However, C is one of those "double-edged sword" kinds of languages. What kind of trade offs between performance and "safety" would one be making here? Are they worth it?


> This being C, I assume the performance is really good

FWIW, everyone I know who moved web-serving things to C did so primarily because of latency, and then secondarily to avoid GC time delays. Higher transactions per server was a bonus, but no one did it for that.

Also, if you have to hit a database, then any efficiency you get from switching to C will probably be swallowed up by that.


> Also, if you have to hit a database...

I can only think of one case where that is pretty much always true - CRUD with a network connected db backend, several cases where it is debatable, and a bunch of cases where that is wrong. See sqlite3, situations where you can use unix sockets, dynamically loaded modules, etc.

I'm finishing up a project that basically builds state models from log events, automatically, and robustly enough that hopefully the end users wouldn't notice if I were run down by a bus. With the massive amount of data and high dimensionality of the problem, I have a hard time seeing how to pull it off but with C and statically compiled sqlite... or a big bag of money to throw at IBM.


> builds state models from log events, automatically

That sounds really cool, but I'm not quite sure what it means.


Take business process modeling [0], and add a dash of directed acyclic graph [1]. You need both fast access to a massive amount of historical data, and the ability to finely control what gets stored where in memory - so not something you can really do in an SQL transaction.

[0] https://en.wikipedia.org/wiki/Business_process_modeling

[1] https://en.wikipedia.org/wiki/Directed_acyclic_graph


> This being C, I assume the performance is really good

The choice of programming language affects performance much less than the quality of the architecture, algorithms, data structures and just plain coding skills.

> "With balde you can serve hundreds of requests per second"

A simple Netty-based Java web server can easily do tens of thousands of requests per second on a laptop (I'm taking these numbers from our current project, and we haven't spent any time on real optimizations for speed yet).


Balde means bucket in Portuguese. I thought that was an accident, but looking at the logo on the website I guess it's intentional.


I've been looking into web frameworks in C recently. The offerings aren't awesome. It's hard to find a good BSD-like-licensed library.

Frontrunners included Kore and Crow (another microframework). I went with Kore for a while but it has a pretty terrible API for actually writing web applications. I couldn't take Crow seriously. I ended up going with fastcgi because it was the simplest to wrap using the Scheme FFI.

Others included Lwan (GPL) and Mongoose (GPL).


https://libwebsockets.org - LGPL though. There would be a bunch of dead code if you had no intention of using WS, but the commitment to being valgrind-clean is pretty nice, multiple event library options, multiple ssl backend options... lots of flexibility with what I find to be a sane api. Oh and this might matter to some more than others, but the developer is very active - I've had pull requests accepted within minutes for the obviously correct patches.


building libwebsockets on iOS can be quite a chore for many, however, and the authors of the library are not shy to express their distaste for Apple products, which I find inappropriate.


I can't speak to the appropriateness of publicly expressing distaste, but I'd expect building any nontrivial C library for iOS to be a chore - because Objective-C isn't C.


Objective-C has nothing to do with it, actually (Objective-C is technically a strict superset of C, so all valid C is valid Objective-C).

The issue is the build/compilation process for the different platform architecture via cross-compilation. Tends to be rather tricky, regardless of the actual language you choose to write your app with (this build process is entirely separate from any code you write yourself in any supported iOS language).


I'll have to defer to your expertise in Objective-C and iOS, because it was my understanding that C was not a first class citizen in the iOS world. Like with Windows, Microsoft has treated C like a redheaded stepchild. Yes C++ is a superset of C, but it is still a huge pain to get working in Windows.

Do you have a favorite platform independent C project that handles iOS deployment well that you can direct me to?



I hope you didn't put much effort into tracking that down for me, because that isn't what I would call a "favorite platform independent C project". I guess it could be someone's favorite something, but I think it would make most programmers sad. Thanks for trying though.


I didn't. This (or something very similar) was on HN a few weeks ago so I remembered it.


Hey, I'm on a phone without my last pass. I've spent a fair amount of time trying to find a happy web c++ framework.

Implementing fastcgi seemed too daunting to me... hello world was crushingly painful to set up (using Apache)

lwan and mongoose are GPL.

I feared Kore's API as well, but I never got around to actually using it, so I'll just use your mention of it as confirmation of my fears.

I'd looked at crow, and it didn't look so bad, but now I'm not so sure. What did you like about it? What didn't you like about it? What made fcgi less painful / more fun?

Basically, I'd love any notes on your attempt / experiences / etc. !


Fastcgi was incredibly simple to get hello world running - both in C and through the Chicken Scheme FFI. However, "all" it provides is the backend for a web server so you'd still need to handle things like routes and templates and responses yourself. Still, I don't find those to be the challenging part.
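As a rough idea of how small the routing part can be, here is a minimal sketch of an exact-match route table in plain C. It does not use the FastCGI library, and every name in it (route_t, route_lookup, the handlers) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* A handler returns the response body as a static string. */
typedef const char *(*handler_fn)(void);

typedef struct {
    const char *method;
    const char *path;
    handler_fn handler;
} route_t;

const char *handle_home(void)  { return "Hello, world!"; }
const char *handle_about(void) { return "About page"; }

/* The route table: one entry per (method, path) pair. */
static const route_t routes[] = {
    { "GET", "/",      handle_home  },
    { "GET", "/about", handle_about },
};

/* Return the handler for an exact method and path match, or NULL,
 * which the caller would turn into a 404 response. */
handler_fn route_lookup(const char *method, const char *path)
{
    for (size_t i = 0; i < sizeof routes / sizeof routes[0]; i++) {
        if (strcmp(routes[i].method, method) == 0 &&
            strcmp(routes[i].path, path) == 0)
            return routes[i].handler;
    }
    return NULL;
}
```

Inside a request loop, the method and path would come from the CGI environment (for example REQUEST_METHOD and PATH_INFO), and the returned handler's output would be written after the response headers.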

The issue that I'm having with fastcgi is the random scarcity of API documentation. I can't believe it's so hard to find - googling for "fcgx accept_r api documentation" just yields nothing.

Anyway, it was very cool to get it running in Scheme so I'll definitely write a post on that soon. I'll post it on this thread when I do.


I'd read a (hopefully scheme agnostic, or at least, scheme-decoupled) tutorial on how you got fastcgi up and running painlessly. I really struggled with the lack of documentation as well.


Author of Kore here.

What fears? Care to elaborate? I'd love to hear!


For me, too much was tied into using the Kore CLI to build and manage the actual application. Plus, configs for routing are the only (apparent) option.

I think Kore would be an awesome framework if it weren't built with the assumption that you want to use its configs and its CLI. And although these could be useful too, I don't think it's too much to ask that the actual functionality there be exposed nicely and well-documented.

In particular, I'd just like to be able to write my own main and start Kore from there with high-level calls (including building routes).


I understand.

You're not forced to use the CLI create/build/run commands for anything. They just make it easier, but you are in no way tied to this.

For example, you can build the module itself on your own: as it is just a normal dynamic library, you can use whatever build system you want.

I've considered time and time again turning kore into a "library" that you can link against and include in your own applications, but every time I decided against it as it didn't give me any real benefits. It would make certain things considerably harder: who takes care of the worker processes? Who takes care of the logging and the internal message relaying? Having this abstracted away in a library is probably possible but adds tons of expectations on your own application.

Having Kore as the platform your code runs under makes this easier.

Thanks for explaining however, very insightful!


Well there is civetweb (BSD) which was forked from an earlier BSD release of mongoose but has now diverged a little on its own.


I'd love to make balde BSD-licensed ( see https://github.com/balde/balde/wiki/Converting-from-LGPL-to-... ) but unfortunately I have to deal with GLib for now... :/


Looks neat, although I think I'd find myself mistyping "blade" a lot.


"Why would anyone want this?" -- There are people who know C--and its associated library ecosystem--really well and are very comfortable with it. They may have projects where it would not be a productive use of their time to have to learn a new language and a new set of libraries just to get a little code online.

I know, shock and horror, wailing, gnashing of teeth. /s

Is it just me, or has HN gotten really bitchy towards people's projects lately (and by "lately", I mean "the last year")? There are real people behind these projects. They did a thing and it might not be your cup of tea or anything in your experience bubble, but that doesn't make it stupid or misguided.


It's not a new phenomenon, Show HN has had such questions for years: https://news.ycombinator.com/item?id=5873619

:)


I really wouldn't want to expose GLib to this level but to each their own. kcgi is an interesting C API with security taken seriously: http://kristaps.bsd.lv/kcgi/


I love kcgi, but we have a different focus. I may use it as the FCGI layer for balde, though :)


Looks like it's using NULL-terminated strings. Is the framework secure? Is it compatible with (all kinds of) Unicode and data with NULL in it?


Why would you need to be compatible with "all kinds of Unicode", by which I assume you mean different encodings?

You just need to be fully compatible with one kind of Unicode (UTF-8) and use it consistently. And null-terminated UTF-8 is no more or less safe than null-terminated ASCII, because the only code point whose UTF-8 encoding contains a 0x00 byte is U+0000, the same one as in ASCII.


That's true provided you validate all UTF-8 input. As you probably know, UTF-8 has a lead byte that indicates how many of the following bytes are part of the same code point. If your code blindly believes what that first byte claims, then you could easily get tricked into reading beyond '\0'.
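A minimal sketch of such a validation pass in plain C follows. The function name utf8_is_valid is hypothetical and this is not code from balde or GLib. It takes an explicit length, checks that the lead byte never claims more bytes than remain, and also rejects overlong forms, surrogates and values above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if s[0..len) is well-formed UTF-8. */
bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes the lead byte claims */
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest value allowed for this length */

        if (c < 0x80) {     /* single-byte (ASCII) */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* lead byte claims bytes past the end */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte (covers 0x00) */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }

        if (cp < min)
            return false;   /* overlong encoding */
        if (cp > 0x10FFFF)
            return false;   /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;   /* UTF-16 surrogate */

        i += need + 1;
    }
    return true;
}
```

For a NUL-terminated string, passing strlen(s) as the length means a truncated sequence at the end is rejected instead of being read past the terminator.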


Yes, you should validate the UTF-8 at some point before you actually manipulate the contents of the string. Before that, you can just treat it as a pile of bytes.

You don't need to go down different paths of how you read those bytes based on the codepoint lengths until you're actually doing something with codepoints.


So this has an embedded HTTP server then? Looks cool.


No, I don't think so, it says it requires an SCGI interface to the web server. Apparently SCGI is like an upgraded CGI.


Looking through the source (app.c) it can do SCGI, CGI, and also has its own httpd server.


You are supposed to use SCGI in production. The embedded HTTPD is provided to help during development, so you don't need to set up SCGI on your local machine. I never did any serious testing on the HTTPD, and it was not developed with security and performance in mind.


Rafael, congratulations on this project, I love seeing Brazilian projects here.


A comparison in terms of speed between balde and cppcms would be nice.


Seems more promising than g-wan


Why not vala (or genie)?


anyone know a good micro web framework written in LLVM IR?


[flagged]


Why not? If we only developed for what we already do, then we'd still be doing the same stuff we did years ago.

Besides, web apps in C aren't completely unheard of.


I remember that one big lesson learned after the Heartbleed bug was that we should not develop in C anymore. To me, this probably has many exceptions, but it should still be especially true for web dev.


If that was your takeaway from Heartbleed, then you weren't paying attention.


whoosh!



