
Show HN: Moustique – C++14 coroutine-based non-blocking IO on Linux - matt42
https://github.com/matt-42/moustique
======
Const-me
I haven’t developed much for Linux, but I think it’s a good idea to replace
your std::vector<ctx::continuation> fibers with std::unordered_map<int,
ctx::continuation> fibers.

Here’s why:
[https://stackoverflow.com/a/9376493/126995](https://stackoverflow.com/a/9376493/126995)
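
To make the trade-off concrete, here is a minimal sketch. `Fiber` is a hypothetical stand-in for ctx::continuation, and both helper functions are invented for illustration; none of this is taken from the moustique sources. A vector indexed by fd grows with the largest descriptor value seen, while a map grows with the number of live connections.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ctx::continuation; only the storage layout matters here.
struct Fiber { bool alive = false; };

// Vector indexed by fd: it must be resized to cover the largest fd seen.
std::size_t slots_with_vector(const std::vector<int>& fds) {
  std::vector<Fiber> fibers;
  for (int fd : fds) {
    assert(fd >= 0);
    if (static_cast<std::size_t>(fd) >= fibers.size()) fibers.resize(fd + 1);
    fibers[fd].alive = true;
  }
  return fibers.size();
}

// Map keyed by fd: one entry per distinct fd, whatever the fd values are.
std::size_t slots_with_map(const std::vector<int>& fds) {
  std::unordered_map<int, Fiber> fibers;
  for (int fd : fds) {
    assert(fd >= 0);
    fibers[fd].alive = true;
  }
  return fibers.size();
}
```

If descriptors really are always the lowest available number, the vector stays dense and the difference mostly disappears.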

~~~
drfuchs
That StackOverflow comment is wrong. Linux guarantees that for open(2) /
creat(2) / socket(2) / etc. "[t]he file descriptor returned by a successful
call will be the lowest-numbered file descriptor not currently open for the
process." This has been the case for the entire 40+ year history of Unix, and
was certainly relied upon in the days when select(2) could only handle file
descriptor values that were less than 32.

See the man page at, e.g.,
[http://man7.org/linux/man-pages/man2/socket.2.html](http://man7.org/linux/man-pages/man2/socket.2.html)
from which the quote above is taken. Note that I'm not expressing any opinion
on whether to use a vector or an unordered_map in this particular application.

~~~
Const-me
> Linux guarantees that for open(2) / creat(2) / socket(2) / etc.

Not “etc.”, accept(2) that creates the sockets in that code doesn’t guarantee
that.

See the man page at, e.g.,
[http://man7.org/linux/man-pages/man2/accept.2.html](http://man7.org/linux/man-pages/man2/accept.2.html)

~~~
drfuchs
Hmmm. Looking further, the dup(2) man page also mentions "lowest," but pipe(2)
doesn't. A documentation oversight, I claim; after all, what would be the
point of the kernel not using a single function for "get me an available fd"
everywhere that it needs one? Not as reassuring as finding it in the specs,
though; here's what
[http://pubs.opengroup.org/onlinepubs/9699919799/](http://pubs.opengroup.org/onlinepubs/9699919799/)
has to say:

"2.14. File Descriptor Allocation

All functions that open one or more file descriptors shall, unless specified
otherwise, atomically allocate the lowest numbered available (that is, not
already open in the calling process) file descriptor at the time of each
allocation. Where a single function allocates two file descriptors (for
example, pipe() or socketpair()), the allocations may be independent and
therefore applications should not expect them to have adjacent values or
depend on which has the higher value."
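
The rule quoted above can be observed with a small sketch. The function name is made up for illustration, and the check assumes that no other thread opens or closes descriptors while it runs.

```cpp
#include <cassert>
#include <unistd.h>

// Returns true if a descriptor freed by close() is handed out again by the
// next dup(), as the "lowest numbered available" rule implies.
bool lowest_fd_is_reused() {
  int p[2];
  if (pipe(p) != 0) return false;  // just a source of valid descriptors
  int first = dup(p[0]);           // takes the lowest free descriptor
  close(first);                    // frees that number again
  int second = dup(p[0]);          // should receive the same number
  bool same = first >= 0 && first == second;
  if (second >= 0) close(second);
  close(p[0]);
  close(p[1]);
  return same;
}
```

This only shows the behaviour for dup(2) in one process; it says nothing about what accept(2) does under concurrent load.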

~~~
Const-me
> what would be the point of the kernel not using a single function for "get
> me an available fd" everywhere that it needs one?

Multithreading.

To return lowest-numbered file descriptor, calls to that single function need
to be serialized. On multi-core, and especially on NUMA, I can see how that
might hit the performance.

Apps don’t normally create thousands of files, or pipes, or client sockets per
second. However, for some server software, accepting thousands of sockets per
second is normal.

Update, see also multi-queue NICs:
[https://www.kernel.org/doc/Documentation/networking/scaling....](https://www.kernel.org/doc/Documentation/networking/scaling.txt)

------
Matthias247
Please be fair in the readme: a comparison to ASIO is a bit apples to
oranges. This one seems to be mostly a learning project, and is very much a
work in progress.

boost asio already works with boost coroutines in a reasonable fashion. And
this would certainly be a good way if someone wants to build some production
project on top of it.

Regarding the code and example itself:

Having a separate callback for connection close seems to be a bit backward,
since it again means the connection state is spread out over multiple
callbacks. The purpose of a coroutine is to have everything in scope of that
routine. In that case the connection close should be signaled by reading 0
from the socket, or the read returning an error.
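
As a rough sketch of that idea, with plain blocking calls instead of moustique's coroutines and with invented function names: the whole connection lifetime, including the close, sits in one scope, and end-of-connection is just read(2) returning 0 or an error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Reads until the peer closes the connection and returns everything received.
// read() returning 0 means an orderly close; a negative value means an error.
// (A production loop would also retry on EINTR; omitted here for brevity.)
std::string read_until_closed(int fd) {
  std::string received;
  char buf[256];
  for (;;) {
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0) break;  // either way, the connection is finished
    received.append(buf, static_cast<std::size_t>(n));
  }
  close(fd);  // cleanup happens in the same scope as the reads
  return received;
}

// Demo on a local socket pair: send "ping", close the sender, drain the receiver.
std::string demo_read_until_closed() {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return "";
  ssize_t written = write(sv[0], "ping", 4);
  close(sv[0]);  // the reader will see 0 once the data is drained
  std::string got = read_until_closed(sv[1]);
  return written == 4 ? got : "";
}
```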

~~~
matt42
Removed the comparison with ASIO. But note that this is not "heavily in
progress" but ready for production too.

I took your remark about the closing callback into account. read now returns
0 and write returns false when the connection is closed or lost.

------
ac_slater
Not super fair to say _linux sockets_ (in the README) and _I/O_ in the title
when it's really just a TCP client - you force SOCK_STREAM and bind(2).

~~~
matt42
This is the default behaviour, but you can supply your own socket file
descriptor if you want, using the second overload of moustique_listen:

template <typename G, typename H>
int moustique_listen(int listen_fd, G closed_connection_handler, H data_handler);

~~~
matt42
But you're right, maybe we want two helpers, moustique_listen_tcp and
moustique_listen_udp. I've put this on my todo list.

~~~
primitur
While you're at it please add a namespace .. it'd make it a lot easier to
administer in a project.

Great project by the way - I've put it on my workbench-TODO for a little
hacking some time in the next few days. I've got a standard 'command
processing server' project that I've been hacking on over the years, I might
try to hack up a moustique_ integration, just for grins.

------
matt42
Note: I wrote this yesterday after I wrapped my head around coroutines. This
is a work in progress (even if it's already fully working), and I'll take any
remarks here.

~~~
notfed
Since you asked, I suggest running your readme through a spell checker. I
didn't bother looking further than that.

~~~
bjoli
Not everyone has English as a first or even second language. Not bothering
because of spelling seems like a good way to filter out some cool pieces of
code.

~~~
krotton
With some potentially unreadable documentation. Spoken language is not less
important than programming languages (and interestingly, from my experience,
there seems to be at least some correlation between caring about either).

~~~
matt42
I did not spend too much time writing the documentation because of the
simplicity of the library. I thought that the tcp echo example would be
enough. But I'll spend more time in the Readme, at least fixing the language
issues :)

------
SamReidHughes
You've got memory leaks and resource leaks there.

~~~
matt42
Got it. I'll fix them ASAP, thanks.

------
petters
This is the API:

int moustique_listen(const char* port, G closed_connection_handler, H data_handler);

int moustique_listen(int listen_fd, G closed_connection_handler, H data_handler);

This could be improved, because the port argument could easily be confused
with the file descriptor argument.

I.e.: moustique_listen(80, ...)
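
One possible way to remove the ambiguity, sketched with made-up names (`Port`, `ListenFd` and `describe_listen` are hypothetical and not part of moustique): wrap each integer in its own tiny type, so a bare `80` no longer selects an overload silently.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper types; the caller must say which kind of number it passes.
struct Port { int value; };
struct ListenFd { int value; };

// Stand-ins for the two overloads; they only report which one was chosen.
std::string describe_listen(Port p) {
  return "port " + std::to_string(p.value);
}

std::string describe_listen(ListenFd f) {
  return "fd " + std::to_string(f.value);
}
```

With this shape, `describe_listen(Port{80})` and `describe_listen(ListenFd{3})` both compile, while `describe_listen(80)` is rejected because an int does not convert implicitly to either struct.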

~~~
Koshkin
> moustique_listen

What, no love for namespaces?

~~~
matt42
I did it this way because this library contains just one function. Would you
still use namespaces in this case?

~~~
w8rbt
It's just more C++ ish to write something::listen rather than
something_listen. Namespaces are easy too. Why not?

~~~
matt42
My fear with this approach is that it allows people to write "using namespace
moustique". This would cause:
- name conflicts on the listen function;
- code that is harder to read, since it is no longer obvious that listen
comes from the moustique library.

I often use namespaces, but I do think that they can have a bad impact on the
usability of the API, especially when you have too many deeply nested
namespaces and have to use plenty of using namespace ... to fix this.
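
For what it's worth, a middle ground might look like the sketch below. The namespace `moustique_sketch`, its `listen` function and the alias `mq` are all invented for illustration and are not the library's API. A short namespace alias keeps call sites compact while the origin of `listen` stays visible and the socket API's own listen(2) is left alone.

```cpp
#include <cassert>
#include <sys/socket.h>  // brings the POSIX ::listen(int, int) into scope

namespace moustique_sketch {
// Hypothetical placeholder: it just returns the descriptor it was given.
inline int listen(int listen_fd) { return listen_fd; }
}  // namespace moustique_sketch

// A short alias instead of a using-directive: calls stay qualified.
namespace mq = moustique_sketch;

int qualified_call(int fd) {
  return mq::listen(fd);  // unambiguous, and clearly not ::listen
}
```

This does not stop a user from writing a using-directive anyway; it only shows that the qualified form need not be verbose.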

