
Rethinking the D-Bus Message Bus - kragniz
https://dvdhrm.github.io/rethinking-the-dbus-message-bus/
======
Animats
_We rather consider a bus a set of distinct peers with no global state._

If they've gone that far, they may as well implement QNX messaging, which is
known to work well. QNX has an entire POSIX implementation based on QNX's
messaging system, so it's known to work. Plus it does hard real time.

The basic primitives work like a subroutine call. There's MsgSend (send and
wait for reply), MsgReceive (wait for a request), and MsgReply (reply to a
request). There's also MsgSendPulse (send a message, no reply, no wait) but
it's seldom used. Messages are just arrays of bytes; the messaging system has
no interest in content. Receivers can tell the process ID of the sender, so
they can do security checks. All I/O is done through this mechanism; when you
call "write()", the library does a MsgSend.
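
As a rough illustration of that call/reply shape, here is a toy in-process model in Python. It is not the QNX API: the `Channel` class, the method names `msg_send`/`msg_receive`/`msg_reply`, and the `"trusted-client"` identity are all made up for this sketch, and threads stand in for processes.

```python
import queue
import threading


class Channel:
    """Toy model of a synchronous request/reply channel (not the QNX API)."""

    def __init__(self):
        self._requests = queue.Queue()

    def msg_send(self, sender_id, payload: bytes) -> bytes:
        """Send a request and block until the receiver replies."""
        reply_box = queue.Queue(maxsize=1)
        self._requests.put((sender_id, payload, reply_box))
        return reply_box.get()  # the sender stays blocked here until msg_reply

    def msg_receive(self):
        """Block until a request arrives; return (reply handle, sender, payload)."""
        sender_id, payload, reply_box = self._requests.get()
        return reply_box, sender_id, payload

    @staticmethod
    def msg_reply(reply_box, payload: bytes) -> None:
        """Unblock the sender that is waiting on reply_box."""
        reply_box.put(payload)


def serve_one(channel):
    """Receiver: handle one request, checking who sent it first."""
    handle, sender_id, data = channel.msg_receive()
    if sender_id != "trusted-client":  # stand-in for a sender process-ID check
        Channel.msg_reply(handle, b"EPERM")
    else:
        Channel.msg_reply(handle, data.upper())  # payload is just bytes


def demo():
    channel = Channel()
    server = threading.Thread(target=serve_one, args=(channel,))
    server.start()
    reply = channel.msg_send("trusted-client", b"hello")  # reads like a call
    server.join()
    return reply
```

From the sender's side, `msg_send` reads like an ordinary function call, which is the point of the design: the reply is the return value.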

Services can give their endpoint a pathname, so callers can find them.

The call/reply approach makes the hard cases work right. If the receiver isn't
there or has exited, the sender gets an error return. There's a timeout
mechanism for sending; in QNX, anything that blocks can have a timeout. If a
sender exits while waiting for a reply, that doesn't hurt the receiver. So the
"cancellation" problem is solved. If you want to do something else in a process
while waiting for a reply, you can use more threads in the sender. On the
receive side, you can have multiple threads taking requests via MsgReceive,
handling the requests, and replying via MsgReply, so the system scales.
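
A minimal sketch of those two points (a timeout on the blocking send, and several receiver threads sharing one channel), again as a toy Python model rather than anything QNX-specific. The names `Channel`, `SendTimeout`, `worker`, and the `None` shutdown sentinel are assumptions of this sketch. Note that here a missing receiver only shows up as a timeout, whereas the text above says QNX returns an error.

```python
import queue
import threading


class SendTimeout(Exception):
    """Raised when no reply arrives before the deadline."""


class Channel:
    """Toy synchronous channel with a timeout on the blocking send."""

    def __init__(self):
        self._requests = queue.Queue()

    def send(self, payload, timeout):
        """Send a request and wait at most `timeout` seconds for the reply."""
        reply_box = queue.Queue(maxsize=1)
        self._requests.put((payload, reply_box))
        try:
            return reply_box.get(timeout=timeout)
        except queue.Empty:
            # The sender gives up; a late reply would land in an abandoned
            # box and would not disturb the receiver.
            raise SendTimeout("no reply within %.2fs" % timeout) from None

    def receive(self):
        """Block until a request arrives; return (payload, reply box)."""
        return self._requests.get()

    def shutdown(self, worker_count):
        """Ask each worker to exit (the sentinel is an assumption of this sketch)."""
        for _ in range(worker_count):
            self._requests.put((None, None))


def worker(channel):
    """Receiver loop: take a request, handle it, reply, repeat."""
    while True:
        payload, reply_box = channel.receive()
        if payload is None:  # shutdown sentinel
            break
        reply_box.put(payload * 2)


def demo():
    channel = Channel()
    workers = [threading.Thread(target=worker, args=(channel,)) for _ in range(3)]
    for thread in workers:
        thread.start()
    replies = [channel.send(n, timeout=1.0) for n in range(5)]
    channel.shutdown(len(workers))
    for thread in workers:
        thread.join()
    # With no receiver left, the sender gets an error instead of hanging.
    try:
        channel.send(99, timeout=0.05)
        timed_out = False
    except SendTimeout:
        timed_out = True
    return replies, timed_out
```

Any of the three workers may pick up a given request; the sender cannot tell which one did and does not need to.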

CPU scheduling is integrated with messaging. On a MsgSend, CPU control is
usually transferred from sender to receiver immediately, without a pass
through the scheduler. The sending thread blocks and the receiving thread
unblocks.

With unidirectional messaging (Mach, etc.) and async systems, it's usually
necessary to build some protocol on top of messaging to handle errors. It's
easy to get stall situations. ("He didn't call back! He said he'd call back!
He promised he'd call back!") There's also a scheduling problem - A sends to B
but doesn't block, B unblocks, A waits on a pipe/queue for B and blocks, B
sends to A and doesn't block, A unblocks. This usually results in several
trips through the scheduler and bad scheduling behavior when there's heavy
traffic.

There's years (decades, even) of success behind QNX messaging, yet people keep
re-inventing the wheel and coming up with inferior designs.

~~~
AceJohnny2
So, SIMPL?

 _Synchronous Interprocess Messaging Project for LINUX (SIMPL) is a free and
open-source project that allows QNX-style synchronous message passing by
adding a Linux library using user space techniques like shared memory and Unix
pipes[3] to implement SendMssg /ReceiveMssg/ReplyMssg inter-process messaging
mechanisms._

[https://en.wikipedia.org/wiki/SIMPL](https://en.wikipedia.org/wiki/SIMPL)

[http://icanprogram.com/simpl/](http://icanprogram.com/simpl/)

~~~
Animats
If you do it via pipes, the performance will be terrible. If you do it via
shared memory, there's a good chance that one side crashing will take down the
other side.

QNX itself implements pipes via messaging.

~~~
transpute
Since QNX is proprietary, QNX messaging is likely patent encumbered.

What's a good open alternative, or starting point for new innovation, given
current investments in microservice architecture?

    
    
      - ZeroMQ (and nanomsg)
      - gRPC (Google)
      - Apache Thrift (Facebook)
      - Finagle (Twitter)
      - L4/seL4 IPC

~~~
sametmax
The wamp protocol is interesting imo. It's independent of serialisation and
transport despite the fact it uses websocket + json by default. It does routed
rpc and pub/sub out of the box. But no low level router yet.

~~~
marqis
Check out crossbar.io, it's a broker from the authors of wamp.

------
onli
That sounds reasonable. I'm very surprised. Disabling remote targets, ignoring
SELinux, focusing on reliability.

DBus is the one part of the modern linux desktop I would like to/have to
install to get the applications I want running, even though I dislike it a lot
(pulseaudio and systemd one can just not install). One example is the password
remember function of steam. Having a more reasonable implementation could help
with this a lot.

~~~
tomegun
For the record: dbus-broker has full SELinux support.

~~~
onli
Well, I don't mind either way.
[https://github.com/bus1/dbus-broker/wiki#using-dbus-broker](https://github.com/bus1/dbus-broker/wiki#using-dbus-broker)
says it has to be disabled.

~~~
tomegun
Indeed, thanks for the pointer! Removed that now (it was left-over from before
we got SELinux support).

------
atemerev
Dbus is bloated hell. Whoever came up with the idea "let's cram all
communications from all sources into the single unified data stream, and let
the clients fish what they need out of it" had the strange mapping of mental
processes, to say the least. Most other forms of IPC are better (more
scalable, more elegant, more comprehensible) — "everything is a file" is
better, actor model is better, and I nearly think that even plain shared
memory is better than a common bus.

There is a reason there is no "shared bus" in Internet communications.

~~~
pjc50
Well, there's multicast, but nobody uses that. Some things use link-local
broadcast.

Personally I think the Windows messaging system would actually be a pretty
good model to follow, especially if you could give it an actual payload and
not just two words. It would certainly solve the actual problems DBus was
built to address - media change notifications and things like that.

~~~
fusiongyro
We actually rely on multicast here at the NRAO for monitor and control of the
VLA. I admit it's the only place I've heard of it being used.

ZeroMQ is getting used more for those kinds of purposes; the Green Bank
Telescope uses it for one of their instrument backends and we are now using it
for VLITE and REALfast. The new archive system I'm helping build uses AMQP.

~~~
jcurbo
This is fascinating (as an amateur astronomer), have you written anything else
on the systems design of these types of instruments, or have any good
pointers? I imagine for large instruments there is a massive amount of data
involved and lots of stuff to control. I know it's bad enough with just my one
telescope setup :)

~~~
fusiongyro
There is quite a bit of technical documentation about our systems, and the
code is supposed to be open-source (though a lot of it isn't publicly
accessible as it should be). But I haven't written anything meant to be a
high-level overview of how it works. I think I will try and do that in a few
days and send you an email.

I have been joking about how we should build a "total sh*t array" out of old
Dish/DirecTV antennas, so that we could explore the systems design without
worrying too much about whether anything could be done with the data
collected. This hasn't interested my coworkers that much :) There is an
amateur radio astronomy society, and there are plans for how to build various
levels of radio telescope, starting from ~$50 and an old Dish receiver and
going up. And our open-skies policy means that you do not have to be a
professional astronomer to use our instruments, although I only know of one or
two amateurs that have proposed for time. (They did get it, though).

As a rough back-of-the-hand deal, we allow anybody who gets time to access
about 25 MB/s of data. The correlator we have currently (WIDAR) can certainly
output much more than that, up to gigabytes per second, but the rest of the
infrastructure certainly can't keep up at that rate sustained. It's not
unusual for an observation to top a few TB in size. ALMA data files are
probably even larger on average.

We are already in the early design stage for a next-generation VLA, which will
increase the number of antennas to about 300. At that point, we probably won't
be able to keep correlated but unprocessed raw data, just because of the sheer
size of it.

~~~
jcurbo
Thanks, this is interesting! I work at JHU/APL where there are people that do
similar stuff, although I don't personally. The APL Astronomy Club is actually
in the midst of trying to get some of our unused scopes & equipment in better
shape and possibly put to use so I'm thinking about ways to automate things.
Plus I'm not happy with the state of data processing for amateurs, it seems
like everything is bespoke and old freeware or expensive software packages.
That is what got me thinking about how the pros do things. Looking forward to
your email!

------
arca_vorago
"Linux Only"

I like this approach more and more these days. For example, I run
murmur(mumble) servers sometimes, and they deprecated d-bus support for ZeroC
ICE (gplv2 or proprietary), but it seems almost as bloated if not more so. The
reasoning was mostly around the portability bindings...

Recently though, I have been refusing to support Windows and OSX as a conscious
decision. One thing I've found is that the constant want/need to target every
platform adds an ever-increasing amount of complexity, which really seems to
go against the unix philosophy. So I applaud others willing to buck the trend
and narrow scope down.

In the end, I think the main problem with the many eyes theory is that code
has gotten so complex that there simply aren't enough eyes, and therefore I
think the future of software is going to be in reduction of complexity. For
example, loc isn't the best measure, but the Minix 3 kernel is at ~20kloc,
while the Linux kernel is now at, what ~11mloc!? Not even redhat can audit
that shit properly. (another reason we need a Hurd microkernel, but I digress)

~~~
rleigh
Ice is doing a lot more stuff than dbus, it's mainly features rather than
bloat.

~~~
arca_vorago
I def see a lot more features that actually work in ice, that's for sure.

------
zokier
Just noticed that this lives under bus1 github organization; does that imply
that eventually it will be using bus1?

Btw, whats happening at bus1, haven't heard about it lately?

~~~
tomegun
> Just noticed that this lives under bus1 github organization; does that imply
> that eventually it will be using bus1?

That is something we intend to explore. The idea would be to let bus1 be used
under the hood by dbus libraries to do peer-to-peer communication where
possible (circumventing the broker) but still stay compatible to the D-Bus
semantics.

> Btw, whats happening at bus1, haven't heard about it lately?

We spent half a year working on dbus-broker ;)

------
baybal2
As I remember from more than a decade ago, the selling point of DBUS was that
they were not trying to design a high performance message bus with
sophisticated work mechanisms in spirit of Corba and Bonobo, but a small,
flexible, and utilitarian one.

Things like implicit message buffering were deliberate design decisions.

~~~
ajross
IMHO the problem with D-Bus was that it was _never_ small and utilitarian.
They decided (correctly) to ignore all the engineering effort involved in
performance and scalability, and put all that overengineering into the API
instead.

D-Bus code is basically unreadable, as not only are the bus names heavily
scoped (java-style) to avoid collisions, but also the interface and method
names. A tiny python (or whatever) script to invoke a single method on a well-
known object _should_ be a one-liner but in practice lives over 6-7 lines just
due to verbosity.
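
For a sense of that verbosity, here is a sketch using the dbus-python bindings (assumed to be installed, with a session bus running) that calls one method, `ListNames`, on the bus driver itself. The function name `list_bus_names` is made up for this example.

```python
# The same reverse-DNS style string shows up as bus name, object path and
# interface before the method name is ever mentioned.
BUS_NAME = "org.freedesktop.DBus"
OBJECT_PATH = "/org/freedesktop/DBus"
INTERFACE = "org.freedesktop.DBus"


def list_bus_names():
    """Return the names currently registered on the session bus."""
    import dbus  # dbus-python; imported lazily so this file loads without it

    bus = dbus.SessionBus()
    proxy = bus.get_object(BUS_NAME, OBJECT_PATH)
    iface = dbus.Interface(proxy, dbus_interface=INTERFACE)
    return [str(name) for name in iface.ListNames()]
```

Calling `list_bus_names()` on a desktop session should return the connection names known to the bus; the logical content is a single method call, and the rest is addressing.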

D-Bus types are inexplicably low-level for a "utilitarian" IPC mechanism,
leading to a bunch of type conversion to do simple things, and a _ton_ of
marshalling code in the core. Javascript has shown us how far you can get with
just IEEE doubles and UTF-8 strings, yet D-Bus suffers with a type model that
looks more like C.

~~~
fiddlerwoaroof
Yeah, I used to have a whole bunch of shell scripts to automate KDE 3 apps via
dcop and then when dcop was dropped in favor of dbus, the complexity of the
latter system discouraged me from porting the scripts.

Whatever technical limitations dcop may have had, its command line was
amazing: space separated words and an emphasis on discoverability made it a joy
to use.

~~~
simcop2387
qdbus is an almost adequate replacement for that. It's still more verbose and
a bit more difficult to pass some arguments (this is all from memory) than
dcop was, but it's serviceable.

------
revelation
Ahh yes, we know this all too well, the Linux desktop trap:

iterative work is lame, the old solution is so bad it's not even wrong, here
is my idea for a rewrite, look it's even still compatible (for another few
minutes).

~~~
asveikau
I disagree with this characterization, because they explicitly state that
compatibility with the original is a goal. When people have the more careless
"re-write and throw-away" attitude, they often abandon the old API too.

So this reads more like a strategic re-write, or a re-do of the implementation
while keeping the API, which I think is often a smart way to do it.

------
chme
So is dbus-broker the latest project from the kdbus/bus1 guys?

Since from the text dbus-broker does not use the bus1 kernel module, does that
mean the bus1 project is dead?

~~~
tomegun
bus1 is very much not dead. We intend to work on the next RFC soon.

~~~
oconnor663
If bus1 magically landed tomorrow, what would that mean for dbus-broker? Are
the projects related at all, or mostly doing different things?

~~~
tomegun
At the moment dbus-broker does not have code to take advantage of bus1, but we
intend to explore adding bus1 support to dbus-broker, so that peers (if their
libraries support it), would seamlessly communicate peer-to-peer
(circumventing the broker) when possible.

------
throw7
are you still required to reboot the system if you upgrade "dbus-broker"?

~~~
tomegun
Yes, for the time being we do not support reexecution.

------
JdeBP
One thing that I wonder about this is how it deals with the D-Bus Death
Rattle.

* [https://jdebp.eu/FGA/dbus-death-rattle.html](https://jdebp.eu/FGA/dbus-death-rattle.html)

~~~
tomegun
This is an application issue, not a broker/daemon issue. The broker will (as
dbus-daemon(1) does) deliver all signals that clients subscribe to. If they
subscribe to things they don't care about, that is something that should be
fixed in the clients. During the kdbus times, quite some time was spent on
fixing clients to avoid too broad subscriptions exactly to fix the issue
described in that blog post.
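
To make "too broad" concrete, here is a small sketch that builds D-Bus match rule strings, the filters a client hands to the bus when subscribing to signals. The helper `match_rule` and the service name `com.example.Service` are hypothetical; the keys (`type`, `sender`, `interface`, `member`, `arg0`) come from the D-Bus specification's match rule syntax.

```python
def match_rule(**keys):
    """Build a D-Bus match rule string from key/value pairs."""
    parts = []
    for key, value in keys.items():
        if "'" in value:
            # The spec defines an escape for apostrophes; this sketch skips it.
            raise ValueError("apostrophes are not handled in this sketch")
        parts.append("%s='%s'" % (key, value))
    return ",".join(parts)


# Broad: wakes the client whenever ANY name on the bus changes owner.
BROAD = match_rule(
    type="signal",
    interface="org.freedesktop.DBus",
    member="NameOwnerChanged",
)

# Narrow: only ownership changes of the one name the client cares about.
NARROW = match_rule(
    type="signal",
    sender="org.freedesktop.DBus",
    interface="org.freedesktop.DBus",
    member="NameOwnerChanged",
    arg0="com.example.Service",  # hypothetical service name
)
```

With the broad rule every client is woken each time any process joins or leaves the bus; with the narrow one the broker can drop the signal before it reaches clients that would ignore it anyway.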

~~~
JdeBP
Not every page on the WWW is a blog post, nor even every WWW site a web log to
have blog posts on.

~~~
tomegun
My apologies, it read like a blog. Seeing the parent page I see that it is
not. Either way, my comment stands.

------
j_s
I'm sure systemd would be happy to take over responsibility for this
functionality. (Sorry, couldn't resist!)

~~~
Rjevski
If systemd does it right (just like it did with service management)
then I see no issues with this.

~~~
work_account
The problem with systemd usurping basic Linux functionality is that it makes
it really difficult for non-systemd distros to keep up.

It's not like you can run 'systemd-udevd' standalone, for example. Instead
there are massive "porting" efforts like eudev and elogind, just to extract
the functionality BACK from systemd. And then you have obsolete-but-necessary
components such as ConsoleKit and PolicyKit that are stuck on ancient pre-
systemd versions with no current replacement.

I started using systemd back before they even took over "udev". Back then
systemd was a breath of fresh air. Now I'm using a different service manager
and observing systemd gobbling up various critical parts of the Linux desktop
like some damn Katamari is like watching a train accident in slow motion.

~~~
kuschku
Have you ever considered why it is so much effort to extract this
functionality?

systemd can iterate quicker, and work faster and better, because they can
share more code between projects.

Code that should have been in the stdlib, provided by the distro, but which no
one does. So it ends up in systemd.

You see the issue even in GNU yes, which implements its own version of a
buffered output, or in cat, which does the same, but slightly different.

All these things should be in the stdlib, and because they’re not, those
projects that can use premade solutions iterate a lot quicker, and can get
better, faster.

~~~
CyberDildonics
If what they are doing is so fundamental, why isn't it more modular?

------
digi_owl
And i see the PR team is already out in force to sell this and defend what has
already been sold.

We should really just move to BSD already and let them sink this ship.

