Libwebsockets a powerful and lightweight pure C library (libwebsockets.org)
257 points by khoobid_shoma 3 months ago | 64 comments

I have used this library before. It's a powerful library indeed, but the documentation is a bit lacking. There isn't a straightforward guide for the most common operations, even the simplest ones like connecting a client to a server, and you're left studying the various examples provided, each of which does things a little differently. There are also a lot of undocumented functions/constants/structs.


I'd prefer this well written and working minimal example (one of many) to API documentation 99 times out of 100.

I've also used it on occasion and would agree the documentation is patchy in some areas, though getting a running example is fairly trivial.

e.g. when a TLS cert is renewed/replaced, how to deal with that in an already running websocket.

Check out the "Secure Streams" of libwebsockets. It hides protocol-specific details (in the JSON policy) and makes network applications easier to program (you just deal with the abstracted states and payload in the callbacks).



We use libwebsockets in Ardour (a cross-platform digital audio workstation) to provide the ability to create control surfaces (GUIs) within the browser. We mostly treat it as a transport layer for OSC messages, which could otherwise be transferred via UDP (if the endpoint wasn't a browser).

Something close to UDP APIs for browsers (QuicTransport + datagram extension) was in development for a while but the proposal ended up getting rejected/withdrawn.

How does this fare compared to uWebSockets? https://github.com/uNetworking/uWebSockets

Like most websocket libraries, uWebSockets works by calling some sort of send/write function. You give it the data you want to send, which is a sequence (a string or a vector).

The underlying library will take care of buffering that, and the size of that buffer is unbounded. In most of them there is no way to know how much is buffered.

But libwebsockets calls a callback written by you every time it is ready to send. You then have to call a write function that does not guarantee it will write the whole buffer you give it.

This can be useful for web based games: you keep the last version of the position of each object, and send them one by one when the client is ready to receive. The memory usage will then be bounded to: size of message * (number of objects + number of players). Each player can have a message that is currently kept in memory until it is entirely sent, since multiple calls to the callback may be required to send a whole message.
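A minimal sketch of that pattern, in Python rather than against the real libwebsockets API. The names `LatestStateSender`, `update`, `on_writable` and `fake_write` are made up for illustration, and `fake_write` stands in for a non-blocking write that may accept only part of the buffer. It shows how keeping only the newest state per object, plus one in-flight message, keeps memory bounded.

```python
from collections import OrderedDict


class LatestStateSender:
    """Keeps only the newest message per object and sends when told it may write."""

    def __init__(self, write):
        # write(data) -> number of bytes accepted (hypothetical partial-write API)
        self._write = write
        self._pending = OrderedDict()  # object id -> latest encoded state
        self._in_flight = b""          # unsent remainder of the current message

    def update(self, obj_id, payload):
        # Overwrites any older unsent state for this object, so at most one
        # message per object is ever queued.
        self._pending[obj_id] = payload

    def on_writable(self):
        """Call whenever the transport is ready; returns True if more data remains."""
        if not self._in_flight:
            if not self._pending:
                return False
            _, self._in_flight = self._pending.popitem(last=False)
        sent = self._write(self._in_flight)
        self._in_flight = self._in_flight[sent:]
        return bool(self._in_flight or self._pending)


# Usage: a fake transport that accepts at most 4 bytes per call.
out = bytearray()


def fake_write(data):
    chunk = data[:4]
    out.extend(chunk)
    return len(chunk)


sender = LatestStateSender(fake_write)
sender.update("a", b"pos=1,1")
sender.update("a", b"pos=2,2")  # replaces the stale position of "a"
sender.update("b", b"pos=9,9")
while sender.on_writable():
    pass
```

The stale `pos=1,1` is never sent, and each message takes two writable callbacks to go out because the fake transport only accepts four bytes at a time.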

The libwebsockets site claims that it is the receiving side that decides when the sender may send. I don't know exactly how it is determined but I think the sender uses TCP ACKs, the window size...

uWebSockets appears to be significantly faster, but this one is second: https://github.com/uNetworking/uWebSockets/blob/master/misc/... (linked from uWebSockets main github)

If the maintainer is sane, then that would be one advantage over uwebsockets.

uWebSockets does not have a websocket client interface.

I wrote something similar, except instead of providing a library (which Libwebsockets already does a fine job of), I created a server/framework accepting shared objects as backend plugins running as dedicated threads interacting with spsc lockless ringbuffers. In other words, more or less the inverse of a library: https://github.com/wbudd/ringsocket

I haven't been putting much time into it anymore lately, but I intend to create a bunch of language bindings for it soon so you can write plugins in other languages too such as Python, C++, Rust, etc. Should be interesting.

I personally prefer QtWebSockets. I've used libwebsockets before, but I found the API error prone and crusty, and ended up eventually converting it to QWebSockets. That turned out to be a good move.

Not a library [1], but trivial to integrate in your architecture. Just redirect stdout to gwsocket

[1] https://gwsocket.io/

I remember hearing about it many years ago, and am surprised to find it alive and well now.

I always felt that the web people were never about performance, since there were many "pure C" webdev attempts before without much success.

Nginx can very realistically handle 1-2M requests per minute on commodity hardware, and no customisation, and that's from the disk.

Were somebody really serious about web performance, I think going from millions of requests per minute, to millions of requests per second is 100% possible.

I worked on this problem around 6 years ago, when I had a task of squeezing HTTP, and network perf on some API servers close to hardware limits. The task was mostly about gluing DPDK to popular software: nginx, memcached, postgres.

I am very enthusiastic to see Libwebsockets getting glib support. Glib is a one-of-a-kind piece of software in the C ecosystem: with it you can adopt modern programming methods and, in general, approach C as you would a big "platform"-like environment such as Node.js. Glib is really undeservedly neglected and overlooked.

On a 5 year old Intel Xeon, nginx was able to serve a 1 kilobyte static file over TLS at 1.2M requests per second. It's definitely possible already.

Coding big web backends in C is extremely expensive. I'm not sure it's worth the tradeoff for most use cases

I am not talking about whole backends, but task specific individual APIs, especially for performance sensitive tasks.

You can get within 20% of the performance for 1/10th the cost by using a modern fast language (rust, d, nim, go), and within 50% for even cheaper by using c# or java.

I really doubt it.

From my short encounter with "modern" webdev, I found that running out of RAM was far faster than running out of CPU, even with every trick possible thrown to increase GC aggressiveness.

RAM is by far more discriminately priced on all these new "cloud" hostings, and has the most unpredictable performance change with size. Even on real hardware, going for high RAM servers is quite expensive.

While CPU, or I/O saturation naturally throttles itself, RAM exhaustion is rarely pretty, and hard to proof your software against. Most disconcerting about this is that your RUST, GO, or the TRUE ENTERPRISE JAVA®, don't really use that RAM at all. Most of "modern languages" RAM content is just zeroes, and empty buffers.

How does Rust cost more memory than C again?

Also, as shown by the TechEmpower benchmarks, in the "webdev" field C/C++ don't have a proper/official Postgres driver with pipelining support, and thus even lose to Java in the fortunes benchmark.

Nowadays, they are using a fork of libpq with a batch API that has not been merged for 6 years in order to compete. So the lack of good library support in the "webdev" field puts C/C++ at an extreme disadvantage compared to other "webdev"-friendly languages.

You can make efficient systems with Java too.

Quote from 10 years ago:

> The system [LMAX] is built on the JVM platform and centers on a Business Logic Processor that can handle 6 million orders per second on a single thread. The Business Logic Processor runs entirely in-memory using event sourcing.

https://martinfowler.com/articles/lmax.html (2011)

You'd be hard pressed to run into memory issues with nim or go, to list the easier languages to write.

You don't know what you are talking about. Rust doesn't have garbage collection and has a memory footprint very close to C.

Your GP didn't just mention Rust, they also mentioned C# and Java, so when your parent refers to GC the more charitable interpretation is that they were using C# or Java and responding to that part of the message.

Nim's ARC is one of the most lightweight memory managers.

Is there an example of a task like that? Specifically, ones that don’t talk to a database/files or another service, in which case the talker’s performance becomes irrelevant, unless it is really crawling. Most code “we” write is scheduling queries, glueing datasets together and jsoning the results into a socket through some stream library. IO takes 98% anyway, 2% rerouting and checks. Personally I’m fluent in a range of languages, but wouldn’t ever think of writing networking in C or a similar low-level environment. A mountain of work and skill for something expressible in just a few lines of python/perl/js/ts/lua/sql, zero economy. (Okay, maybe an nginx plugin in a critical case when multiplying instance costs doesn’t help.)

That's the crux of the issue. Most of webdev is just managing a huge amount of very simple pieces of code, where I/O from somewhere, to somewhere dominates.

One particular issue I remember from talking with Alibaba engineers, when I worked for a subcontractor on a custom DC project, was the "1 second kill".

That's the phenomenon where some super good deal is posted on the front page of Taobao, and they get squished when people from all over China, smashing F5, click on the deal.

Purchasing looks like a lock-and-write task from the DB side, and in 2016 the whole of Taobao.com was tied to a single-point-of-failure MySQL cluster that was abused to the maximum.

They went to Computer Science people, who only said that there is no way around locking and a single DB write origin.

External contractors wrote a super-duper performant "database gateway" which organised and queued purchase reservations to the stock database at around 50 Hz.

People confuse performance (CPU) and throughput (IO). The reason CPU usage is undervalued is that it's simply not the bottleneck typically. Very few people actually have the challenge of having millions of requests per minute. The people that do, can scale horizontally by just throwing load balancers and cheap vms at the problem. That gets you both bandwidth, memory, and CPU. Scaling horizontally is very competitive with whatever cleverness engineers can provide to reduce the need for that. Engineers cost a lot more to scale than VMs.

We all love to optimize but there is a point of diminishing returns. Scaling a cheap vm with CPU credits (i.e. it's not even supposed to have double digit CPU usage other than for short bursts) costs nothing compared to the salary involved with e.g. making something like that 2x or 3x faster. And the yields are terrible too, even if you succeed.

Say you are running 12 vms costing about 20/month each. That should get you some CPU and a modest amount of memory, and would be something appropriate for a web server. Twelve of these means you have fail-over, multiple availability zones, and possibly even regions. Going from 12 to 6 would be a nice cost saving of about 120/month. Except now you have less bandwidth to go around and maybe a bit less resilience. That 120 is also a decent but not amazing freelance rate per hour. If you spend a week on implementing this, the return on investment (i.e. your time) would be 40 months (assuming an unreasonable 40 hour work week), or about 3 years and a bit before you earn back the expense of your time. Now say that instead of spending that time, you simply scale to 24 servers: i.e. you double your cost from 240 to 480 per month. That is about four hours of your hypothetical hourly rate, or about half a day. So a week of your time still adds up to nearly a year of simply running at 2x the capacity.
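The arithmetic above can be checked with a few lines of Python. This is only a restatement of the comment's own numbers: the variable names are made up, the currency is unspecified as in the comment, and the hourly rate of 120 is the comment's hypothetical figure.

```python
vm_cost_per_month = 20  # per VM, as in the comment
hourly_rate = 120       # the comment's hypothetical freelance rate
hours_per_week = 40

week_of_work = hourly_rate * hours_per_week     # cost of one week of engineering
monthly_saving = (12 - 6) * vm_cost_per_month   # halving the fleet from 12 to 6
payback_months = week_of_work / monthly_saving  # months until the week pays off

doubled_bill = 24 * vm_cost_per_month             # monthly bill after scaling to 24 VMs
hours_equivalent = doubled_bill / hourly_rate     # that bill expressed in billable hours
months_at_double = week_of_work / doubled_bill    # months of the 2x fleet one week buys

print(week_of_work, monthly_saving, payback_months)
print(doubled_bill, hours_equivalent, months_at_double)
```

Note that the comment compares the week of work against the whole 480 bill. Measured against only the extra 240 per month, the same week would cover 20 months of the added capacity.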

If you are any good, your rate might be higher and the tradeoff is even worse. Not even worth having meetings about. Using C for this stuff means hiring more expensive C developers and making a bad deal even worse. The smart way to get performance is to have those C developers work on the OSS infrastructure we all love to use. It will trickle down and get us decent performance elsewhere. Nodejs is actually built on a lot of C/C++ libraries and benefits from a lot of cumulative optimizations that have gone into these libraries. That's why it is so competitive in this space. There are a gazillion other languages to consider with similarly good enough performance and throughput. C is a last resort when performance wins over security and stability concerns. Sometimes it does, but mostly it doesn't make sense from a cost point of view.

According to TechEmpower benchmark, C# and Java perform better in some tasks than C/C++ [1]

Even in categories where C/C++ are more performant, other languages are not that far behind.

If at all possible, we should not use memory unsafe languages for anything, especially something exposed as a server. No matter how careful you are, and with all the tooling available, the majority of exploits in popular software happen due to memory unsafe languages.

[1] https://www.techempower.com/benchmarks/#section=data-r20&hw=...

Benchmarks are useless if you don't understand what it is you're measuring. A contrived "Hello world" benchmark tells you nothing because your bottleneck is in the system calls, which has nothing to do with what framework you use. If you run a web service that requires a compute or memory intensive task, like a game server physics engine, or multimedia streaming, or anything that requires large-scale postprocessing, you're going to heavily rely on C or C++ based libraries to do the heavy lifting.

People like you pointed to benchmarks in the past and said the exact opposite. It is sad that any kind of reasoning stops at some benchmarks.

Memory safety issues are exaggerated. It's definitely harder to keep big, complex software memory sanitised, but it's not something completely insurmountable.

You use C judiciously on the most performance demanding tasks, while trying to bring the overall task itself closer to some simple algorithm, on which you can later throw heavy verification, like formal verification, valgrind it to death, fuzzing etc.

The current wave of "new age" computer languages like Kotlin, Go, Rust have a very noisy activist userbase which tends to extol some very simple, obvious things as ultimate virtues.

Memory safety issues are bugs. Do you know any programmer that does not occasionally create bugs? Don't forget tight schedules, low budgets, ...

Also, Rust is just what you propose: a programming language with heavy safety verification built in. Because occasionally someone writes C code without using all the available tools to verify it, it is better to have that built in.

Memory safety is not an issue if you actually learn to take advantage of the C toolchain. I've caught memory leaks and buffer overflows to great effect just by using Valgrind and ASAN. And for most applications, you can limit the attack surface by only writing C for the performance-sensitive areas and using FFI to call into those routines. As a bonus, it becomes much easier to unit test for logical corner cases.

This just isn’t true in practice. Can you point to a popular c project that’s accomplished this? I bet there are a few tiny ones that make such claims but haven’t received scrutiny.

IIUC, it needs extensive code coverage, and that's even more difficult for a library (the lws case).

> Do you know any programmer that does not occasionally create bugs? Don't forget tight schedules, low budgets, ...

Do not program in C on tight schedules and low budgets. It needs tact and understanding.

“Have you simply considered not having bugs” isn’t a useful strategy though.

It helps to have 40 years of wasted advocacy about how good programmers never make errors in C.

I can't wait for governments to make vendors liable for security exploits; then let's see how much software will still be written in C, or derived languages.

I wish there was something like this for WebRTC data channels.

There is this[0]. It’s C++ but the API is simple and it works well, and you don’t need to set up any extra ICE/TURN/STUN stuff either.

[0] https://github.com/seemk/WebUDP

Has anyone used this in a multithreaded implementation, and how is/was the experience?

I used to roll my own code for websockets since the protocol is so simple. Switched to libwebsockets when I had to support TLS since I did not want to fiddle with openssl directly.

Overall the experience was good. It's simple, to the point. Configuration of the library is a bit like dark magic though, not much documentation.

> Has anyone used this in a multithreaded implementation

It's an event loop based library, so the point is more to have a single asynchronous main thread that fetches the messages, which you can dispatch to threads for processing if you wish.

Have you tried libkj? It’s the c++ runtime that underpins Cap’n’proto’s RPC layer as well as Cloudflare Workers. Has many of the same features - curious how you find the documentation.

Disclaimer: I work at Cloudflare.

I would love to, but I stopped bothering with C++ a long time ago, and just use C nowadays :-)

>I did not want to fiddle with openssl directly

It's actually fairly simple to use openssl directly if you have a good handle on sockets in general. It's basically just a case of doing some initialisation and using SSL_write/SSL_read in place of whatever write/read functions you were using before to write to the socket.

You can just fork multiple threads that are each their own server, and then do communication over thread safe queues to workers in other threads.

Yes, you run it on its own thread and output messages to a threadsafe queue for other threads to consume as is the usual practice. Or did you ask for something else?
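A rough sketch of that arrangement, in Python for brevity rather than C. Here `network_loop` is a hypothetical stand-in for the single thread that would run the library's event loop, and the "processing" (upper-casing a string) is a placeholder. The only real machinery is the standard `queue.Queue`, which is thread-safe.

```python
import queue
import threading


def network_loop(incoming, work_q, n_workers):
    # Stand-in for the single event-loop thread: it only receives messages
    # and hands them to the workers through the thread-safe queue.
    for msg in incoming:
        work_q.put(msg)
    for _ in range(n_workers):
        work_q.put(None)  # one shutdown sentinel per worker


def worker(work_q, results, lock):
    while True:
        msg = work_q.get()
        if msg is None:
            break
        processed = msg.upper()  # placeholder for the real, slow processing
        with lock:
            results.append(processed)


def run(messages, n_workers=2):
    work_q = queue.Queue()
    results, lock = [], threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(work_q, results, lock))
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()
    net = threading.Thread(target=network_loop, args=(messages, work_q, n_workers))
    net.start()
    net.join()
    for t in workers:
        t.join()
    return results


processed = run(["ping", "pong", "hello"])
```

The order of `processed` depends on thread scheduling, so callers that care about ordering would need to tag messages or use a single worker.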

How does performance compare to a C/C++ UDP implementation?

This really needs to be a system library; I'm rooting for it.

When will people stop writing C? The memory safety issues make go, rust etc. much better solutions.

When the entire OS is written entirely in Rust or Go from the kernel on up, and all applications are also written in the same language.

Oh, and the silicon itself becomes adapted to the paradigms presented by those programming languages, since C was designed to work on the existing silicon. Forcing entirely new hardware designs to meet an evolving and always-changing software paradigm is an expensive proposition in a commodity market and it will take either a lot of central control and will power or a lot of time.

Up to 1972, the computing world managed without C, and even afterwards plenty of systems kept doing quite well until the early 1990s without any trace of C code.

C was created in response to the then-dominant practice of using assembler for all kinds of coding tasks (not just “system programming”). I wouldn’t characterize the situation as “doing quite well.” On the other hand, C didn’t take off on IBM System 370 until much later, due to the availability of PL/I.

Jovial, ESPOL/NEWP, PL/I, PL/S, BLISS and a couple of others did exist and were in active use outside Bell Labs.

Even Multics' actual history was only a failure from Bell Labs' perspective, as it went on and was even considered more secure in a DoD security assessment.

Even IBM did all their RISC research in PL.8, before deciding to create what would be AIX, as by then it was all about UNIX workstation market.

Had AT&T been allowed to sell UNIX from day one at the same price as competing OSes, I bet C wouldn't be around.

When companies start to be made liable for security exploits and it starts really hitting how much money they get at the end of each quarter.

You mean the language where they use unsafe to throw away all the memory safety?

Unsafe does not do that; mostly unsafe allows dereferencing raw pointers, but it does not disable the type system.

Borrowing rules are still enforced (e.g. you cannot write with a read-only pointer).

C is effectively a _predictable_ and portable assembler. You can't do that with Rust because everything is catastrophically moved, which makes it harder for humans to predict emitted code, and Go afaik has a runtime.

Predictable when code is either compiled with -O0 or the target CPU is a PDP-11 clone.

Linus seems to disagree with you.

Albeit old http://www.youtube.com/watch?v=MShbP3OpASA&t=20m45s

It remains true.

Rust uses references all over the place and reuses the same memory addresses within stack that once belonged to another object because the compiler can guarantee it, but that makes it much less reasonable to write.

If you can reason with -O0 you can take it that your code will remain correct in later levels.

I don't care what Linus thinks, I also happen to disagree with him in regards to C++ and NVidia.

> If you can reason with -O0 you can take it that your code will remain correct in later levels.

Unless you happen to be a Jedi master in UB performance optimizations, auto-vectorization, OMP, out of line code execution, I very much doubt that.
