
DBus, FreeDesktop, and lots of madness - billiob
http://gentooexperimental.org/~patrick/weblog/archives/2014-11.html#e2014-11-23T09_26_01.txt
======
captainmuon
I really liked the "predecessor" DCOP better. Applications publish objects,
others can call methods (one-shot) or functions (wait for response).

From DBUS' own (old) FAQ:

> D-Bus is a bit more complex than DCOP, though the Qt binding for D-Bus
> should not be more complex for programmers. The additional complexity of
> D-Bus arises from its separation of object references vs. bus names vs.
> interfaces as distinct concepts, and its support for one-to-one connections
> in addition to connections over the bus. The libdbus reference
> implementation has a lot of API to support multiple bindings and main loops,
> and performs data validation and out-of-memory handling in order to support
> secure applications such as the systemwide bus.

> D-Bus is probably somewhat slower than DCOP due to data validation and more
> "layers" in the reference implementation. A comparison hasn't been posted to
> the list though.

IMHO DBus suffers from the kind of overengineering that's been endemic in
Desktop Linux in the last few years: DConf/GSettings (as a less flexible, IMHO
unnecessary replacement for GConf), PulseAudio and NetworkManager (initially
really bad, now work nicely, as long as you don't have to debug problems...),
all the *Kit stuff, systemd, and so on. For me, libraries like this don't
really solve problems on average, but cause regressions.

~~~
hp
I think people who haven't hacked on making nice UIs for system features don't
understand what problems these things solve. But people who have do, and
that's why they code them.

Is there a simpler, less-engineered way? Probably, in the truism sense that
all software sucks. But then, anyone could have coded this better way and made
it work, and they didn't. So the current work has the advantage that somebody
did it and it exists. I'll take that.

Knowing the problems solved here I actually think the current stuff is pretty
good. Not flawless - it's software - but good. It does pretty much work. Go use
15 year old Linux if you want to replace nostalgic memories with a good dose
of how much it sucked :-p

~~~
djcb
Agreed -- I remember some of the alternative ways to solve the same problems,
such as CORBA and various custom protocols, but they were painful to use...

Now, early DBus was a bit lacking in that respect as well, but with GDBus
(GLib's DBus implementation) it's become as straightforward as IPC can be,
even in plain C; I've seen quite a few people new to the technology that were
able to get up to speed with it quickly.

It'll be interesting to see how people are taking DBus and pushing the
boundaries of what it can be used for, such as kdbus.

Thank you hp!

~~~
emcrazyone
Interesting perspective. I thought I read somewhere that ZeroMQ was
championing becoming a part of the kernel. I work in the embedded space and, not
knowing much about DBUS other than it's some fancy IPC and ZeroMQ, we went
with ZeroMQ just because we saw a clear path for what we needed to get done.
Granted, all our IPC communication was between our own apps, but after reading
your post I keep wondering what REAL problems does DBUS solve besides having a
standard messaging format? Could someone explain why a simple messaging format
couldn't be deployed on top of something like ZeroMQ?

~~~
hp
This probably deserves a long blog post or something (maybe I already wrote it
somewhere) but here's a teaser.

dbus is not mostly about IPC.

Linux desktops, including gnome, KDE, and those before them and alternatives
to them now, use a "swarm of processes" architecture. This is as opposed to an
alternative like smalltalk, Eclipse, Firefox, or Emacs where lots of plugins
are loaded into one huge process.

Problems common in server side IPC which aren't as big an issue here:
scalability; network partitioning; protocol interoperability.

Problems which are more of an issue: service discovery (can't just use DNS);
tracking lifecycle of other processes; inherent singleton, stateful nature of
hardware, the kernel, and user interfaces.

The main way dbus helps with this is the star topology with a daemon that can
start on demand and track all the processes. IPC is then coordinated with this
in such a way that race conditions can be avoided, for example you can start a
service and send it a command without a race that your command arrives too
soon.
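
A toy sketch of the activation point above, assuming a single in-process "daemon" that owns the name table. This is not the D-Bus API; `ToyBus`, `register_activatable`, `send`, and the `org.example.Echo` name are all made up for illustration. Because the daemon starts the service itself, it knows the handler exists before it delivers anything, so the sender cannot be too early.

```python
# Toy model of a star-topology bus with on-demand activation.
# Everything here is hypothetical illustration, not the real D-Bus API.

class ToyBus:
    def __init__(self):
        self._starters = {}   # bus name -> function that starts the service
        self._running = {}    # bus name -> message handler of a live service
        self.log = []         # order of events, for inspection

    def register_activatable(self, name, starter):
        """Declare that `name` can be started on demand via `starter`."""
        self._starters[name] = starter

    def send(self, name, message):
        """Deliver `message` to `name`, starting the service first if needed."""
        if name not in self._running:
            if name not in self._starters:
                raise LookupError(f"no such service: {name}")
            self.log.append(f"start {name}")
            # The daemon starts the service itself, so it knows when the
            # handler exists; the sender never has to guess or retry.
            self._running[name] = self._starters[name]()
        self.log.append(f"deliver {message!r} to {name}")
        return self._running[name](message)


def start_echo_service():
    # Returns the handler the bus will route messages to.
    return lambda message: f"echo: {message}"


bus = ToyBus()
bus.register_activatable("org.example.Echo", start_echo_service)
reply = bus.send("org.example.Echo", "hello")
```

After the call, `bus.log` records the start before the delivery, which is the ordering guarantee the comment describes.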

Anyhow this is just enough to get an interested person tracking down the
details, I'm not spelling it out obviously.

------
newuser88273
X11 was a great success because it was a protocol that you could write X
servers and X clients and X window managers against.

Nobody wants to do that anymore. There are two major forces to blame.

First, open source. A well designed protocol is much more work, and you can
avoid it by just pointing to the open source'd implementation, as this article
(hilariously) shows for the case of DBUS.

The second force is the adware/spyware model of web and app monetization. You
don't want people to use their own clients against a protocol (email, usenet,
web 1.0) because you can't serve ads as effectively and you can't run
analytics on their every mouseclick, touch gesture and keypress.

The whole systemd debacle would be much defused if systemd had been a couple
of well thought-out and stable protocols, much like X11, instead of a big
source-blob of underspecified and ever-shifting implementation.

~~~
nly
It's interesting that you mention X11. The Wayland developers recognised that
X11 had not only become fat, which isn't inherently problematic, but that
almost all of the old fat had become dead weight. X11 essentially became a
horrific over-engineered, vaguely graphics related, IPC mechanism... just like
DBus. Wayland _is_ an RPC protocol.

I'm kind of hoping IPC mechanisms similar to those used by Wayland, which fell
out of all the work on XCB (a clean X11 equivalent binary protocol for Xorg
that, iirc libX11 is now built on top of), will ultimately be adopted by other
projects. Interestingly they implemented RPC dispatch using libffi, which is
pretty elegant.

The Wayland FAQ, in fact, has a rationale for avoiding DBus[0]. The core
Wayland framework certainly hasn't suffered in terms of bloat or complexity by
avoiding it. Go look at the code[1], and compare it to DBus[2]. Admittedly the
topologies are different, but as far as I'm aware nothing prevents Wayland
clients from establishing their own P2P communications.

[0]
[http://wayland.freedesktop.org/faq.html#heading_toc_j_10](http://wayland.freedesktop.org/faq.html#heading_toc_j_10)

[1]
[http://cgit.freedesktop.org/wayland/wayland/tree/src](http://cgit.freedesktop.org/wayland/wayland/tree/src)

[2]
[http://cgit.freedesktop.org/dbus/dbus/tree/dbus](http://cgit.freedesktop.org/dbus/dbus/tree/dbus)

~~~
panzi
> XCB (a clean X11 equivalent binary protocol for Xorg that, iirc libX11 is
> now built on top of)

I thought XCB implements the same protocol but provides another (more modern)
API. Am I mistaken? I never wrote code using libx11 or libxcb. Well, not 100%
true, I forked and improved a tiny project that touches X11 at two small
points. Doesn't really count. I don't get any understanding of X11/XCB from
that:

[https://github.com/panzi/qjoypad/blob/88ee6c1ed82999febc64b9...](https://github.com/panzi/qjoypad/blob/88ee6c1ed82999febc64b965d4bcf27432f221e0/src/event.cpp#L5)
[https://github.com/panzi/qjoypad/blob/88ee6c1ed82999febc64b9...](https://github.com/panzi/qjoypad/blob/88ee6c1ed82999febc64b965d4bcf27432f221e0/src/getkey.cpp#L47)

~~~
dllthomas
This is basically true. It's the same packets on the wire - the biggest
difference is that for most packets libx11 waits for a response while xcb
returns a token representing a promise.
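
A toy model of that difference, assuming a fake server with a fixed round-trip delay. Nothing below is real Xlib or XCB; `fake_server_roundtrip`, `blocking_style`, and `cookie_style` are hypothetical names, and the request strings are only labels.

```python
# Toy comparison of blocking requests versus cookie-style requests.
# `fake_server_roundtrip` stands in for an X server; this is not Xlib/XCB.
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # pretend round-trip time in seconds


def fake_server_roundtrip(request):
    time.sleep(LATENCY)
    return f"reply to {request}"


def blocking_style(requests):
    # Each call waits for its reply before the next request is sent.
    return [fake_server_roundtrip(r) for r in requests]


def cookie_style(requests):
    # All requests are sent first; each submit returns a future (the "cookie").
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        cookies = [pool.submit(fake_server_roundtrip, r) for r in requests]
        # Replies are collected only when they are actually needed.
        return [c.result() for c in cookies]


requests = ["GetGeometry", "QueryTree", "GetProperty"]
blocking_replies = blocking_style(requests)
cookie_replies = cookie_style(requests)
```

Both styles produce the same replies; the cookie style simply overlaps the waiting instead of paying one round trip per request.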

------
Animats
Somehow, attempts to bolt message passing onto Linux never seem to be very
good. Probably because the primitives underneath are a poor match.

QNX, which is a real-time microkernel, got message passing more or less right.
You connect to a port of another process. Then you send with MsgSend, which
sends a message of N bytes, and waits for a reply. So it's like a procedure
call.

The receiving end (considered the server) does MsgReceive, which blocks
waiting for work, gets the bytes, and returns a reply with MsgReply. That's
QNX messaging.
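
A rough sketch of the send/receive/reply shape described above, using Python threads and queues. This is not the QNX API: `Channel`, `msg_send`, `msg_receive`, and `msg_reply` are hypothetical stand-ins, and the sketch models none of the scheduler integration or priority handling.

```python
# Toy send/receive/reply rendezvous. Names are hypothetical; the real
# QNX calls and their scheduling behaviour are not modelled here.
import queue
import threading


class Channel:
    def __init__(self):
        self._inbox = queue.Queue()

    def msg_send(self, data):
        # Send a message and block until the server replies,
        # so the caller sees something like a procedure call.
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put((data, reply_box))
        return reply_box.get()

    def msg_receive(self):
        # Block waiting for work; returns the bytes and a reply handle.
        return self._inbox.get()

    @staticmethod
    def msg_reply(reply_box, data):
        # Unblock the sender with the reply.
        reply_box.put(data)


def server(channel, count):
    for _ in range(count):
        data, reply_box = channel.msg_receive()
        Channel.msg_reply(reply_box, data.upper())


channel = Channel()
worker = threading.Thread(target=server, args=(channel, 2))
worker.start()
replies = [channel.msg_send(b"ping"), channel.msg_send(b"status")]
worker.join()
```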

Everything goes through this, including all I/O. It's very fast, and
integrated with the CPU scheduler, so most message passes just transfer
control to the other end without scheduling delay. This allows rapid tossing
of control back and forth between processes without going to the end of the
line for CPU time.

Because QNX is a real-time OS, there are some additional features. Messages
are ordered by thread priority, so real-time requests are serviced ahead of
non-real time. (This works well enough that running a compile or a browser
doesn't impact hard deadline real-time work.) Any request can have a timeout,
in case the other end has a problem. Finally, when a MsgSend from a high
priority process goes to a lower-priority process, the receiving process gets
the higher priority until the MsgReply, to avoid priority inversion.

Linux messaging almost always goes through unidirectional byte pipes of some
sort. So you need a protocol just to figure out where the message boundaries
are. D-Bus seems to be struggling with that. Building a call-like mechanism on
top of unidirectional pipes means that callers do a write followed by a read.
For a moment, between the write and the read, both sender and receiver are
ready to run. This means a trip through the scheduler, or worse, starting the
receiving process on a different CPU and suffering cache misses.

It's one of those things where the wrong primitives at the bottom cascade into
layers of complexity above.
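
To make the message-boundary point concrete, here is a minimal sketch of length-prefixed framing over a stream socket pair. It assumes a 4-byte big-endian length prefix, which is a common convention and not D-Bus's actual framing; `send_msg`, `recv_exact`, and `recv_msg` are illustrative names.

```python
# Minimal length-prefixed framing over a connected pair of stream sockets.
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    # 4-byte big-endian length, then the payload itself.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    # A stream may hand back fewer bytes than asked for, so loop.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


a, b = socket.socketpair()  # connected stream sockets
send_msg(a, b"first")
send_msg(a, b"second message")
received = [recv_msg(b), recv_msg(b)]
a.close()
b.close()
```

Without the prefix, the two writes could arrive as one undifferentiated run of bytes; the receiver needs some rule like this to split them again.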

~~~
cbsmith
Why, oh why, don't people use datagram sockets for this stuff!?!

~~~
Animats
That's still a one-way communications scheme, it has a length limit, and you
can be spoofed on message source.

~~~
cbsmith
No, it's not a one-way communications scheme. It's a connectionless
communications scheme, and no more vulnerable to spoofing than the connection
oriented protocols.
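
A small sketch of the datagram behaviour under discussion, assuming a Unix-like system where `AF_UNIX` datagram socket pairs are available. It shows only that message boundaries are preserved and that the pair works in both directions; it says nothing about the length-limit or spoofing points.

```python
# Datagram sockets keep message boundaries, and a socket pair is two-way.
# Assumes a Unix-like system with AF_UNIX support.
import socket

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

a.send(b"first")
a.send(b"second message")

# Each recv returns exactly one datagram, never a merged byte run.
first = b.recv(4096)
second = b.recv(4096)

# The same pair carries a message back the other way.
b.send(b"reply")
reply = a.recv(4096)

a.close()
b.close()
```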

------
drdaeman
The protocol is ASCII-only, but strings are UTF-8, and then they talk about
endianness?

That's not a modern message bus, that's a post-modern one.

~~~
hp
It isn't ASCII only, and I don't even see where the author of this article got
that idea. Read the spec instead of the article and you'll learn more:
[http://dbus.freedesktop.org/doc/dbus-specification.html](http://dbus.freedesktop.org/doc/dbus-specification.html)

~~~
krig
Okay. From the spec:

"D-Bus is low-overhead because it uses a binary protocol" -

Alright, got it. It's binary.

"The protocol is a line-based protocol, where each line ends with \r\n. Each
line begins with an all-caps ASCII command name containing only the character
range [A-Z_], a space, then any arguments for the command, then the \r\n
ending the line. The protocol is case-sensitive. All bytes must be in the
ASCII character set."

Wait a second. This section describes a line-based ASCII protocol. Is this
some other protocol?

"A nul byte in any context other than the initial byte is an error; the
protocol is ASCII-only."

Ookay. So... it's ASCII only, except for the first byte?

"Returns auditing data used by Solaris ADT, in an unspecified binary format.
If you know what this means, please contribute documentation via the D-Bus bug
tracking system."

Oh, no, okay... unspecified binary format. THIS IS ACTUALLY IN THE "SPEC".

Sorry for yelling. I lost it a bit there. Not quite as much as the authors of
the dbus "specification", though.

~~~
hp
Yes, there's an auth protocol before the actual dbus protocol - there are two
protocols in the spec. I think that's what you are missing.

Perhaps it's confusing but slow down and understand the tech before
criticizing. It is not in fact an ASCII-only binary protocol. Other engineers
do sometimes know what they are doing.
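
A minimal sketch of the "two protocols on one socket" idea, looking only at the bytes a client sends: an initial nul byte, ASCII lines ending in `\r\n`, and then, after a `BEGIN` line, the binary message stream. The real exchange is interleaved with server replies, which this ignores; `split_auth_and_messages` and the sample bytes are illustrative.

```python
# Split a client-side byte stream into the ASCII auth phase and the
# binary phase that follows BEGIN. Server replies are not modelled.

def split_auth_and_messages(stream: bytes):
    """Return (auth_lines, binary_rest) for a client byte stream."""
    if not stream.startswith(b"\0"):
        raise ValueError("expected initial nul byte")
    rest = stream[1:]
    lines = []
    while True:
        line, sep, rest = rest.partition(b"\r\n")
        if not sep:
            raise ValueError("auth phase ended without BEGIN")
        # Raises UnicodeDecodeError if a non-ASCII byte appears here.
        lines.append(line.decode("ascii"))
        if line == b"BEGIN":
            # Everything after this point belongs to the binary protocol.
            return lines, rest


# "31303030" is the hex encoding of the ASCII string "1000" (a sample uid).
sample = b"\0AUTH EXTERNAL 31303030\r\nBEGIN\r\n" + b"l\x01\x00\x01"
auth_lines, binary_rest = split_auth_and_messages(sample)
```

The switch-over rule is just "after `BEGIN`, stop parsing lines", which is why the ASCII-only statements in the spec apply to the first phase and not to the messages.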

~~~
krig
Let me get this straight. To you, there is nothing wrong with embedding a
separate ASCII protocol for authentication inside a binary protocol? Of course
I realize that the quoted text about ascii-only describes a subset of the full
protocol. To do this is ridiculous and bad engineering.

I also note that you completely ignored "If you know what this means, please
contribute documentation via the D-Bus bug tracking system." appearing in the
specification. Seriously. This is not a specification. This is a poorly
written description of an existing mess of a wire protocol.

Other engineers clearly do _not_ know what they are doing.

~~~
hp
I don't see the problem with having two protocols on one socket, as long as
it's defined how it works (how to switch over). HTTP for example supports
switching to websocket.

The spec has always said it was informal and needed more work, for at least a
decade now. Many people have implemented dbus and rewriting the spec hasn't
been enough of a priority for any of them to do it. To me that says that while
many are willing to say "it should be better" (including me) none of them are
willing to say "and it's important enough to spend my next few months on"
(including me). But the beauty is that at any time someone is free to change
that.

I think it's wrong to say something is must-have when it obviously by
existence proof has not been must-have. And in fact a lot of tech that had the
must-have failed. Say CORBA, which certainly had specs. They were just specs
that specified the wrong thing. I'd rather have the (approximate, good enough)
right thing with an informal but good enough spec (to a motivated reader
giving it a chance), than pay a bunch of committee people to write down a
design that failed to solve the requirements.

~~~
krig
Alright. I didn't realise that I was talking directly to one of the designers
of dbus, which makes some of the things I said unnecessarily harsh and perhaps
even personal. Apologies for that.

I have to ask, though - why engage in a thread like this? Clearly, dbus is
successful, it is being integrated into the kernel, it is used all over Linux
user space by now and whatever flaws it has are clearly not impeding its use.
So why bother arguing about the spec on HN?

I could sit down and discuss what an improved protocol might look like, but I
don't even know if I agree that /any/ protocol that does what dbus does is the
right approach, and either way, this is not the place to do so.

~~~
hp
Why engage is a good question :-) I had the misfortune to see a link on
Twitter and discover people were wrong on the Internet.

I do think there's useful stuff to learn and discuss here about software
development and dbus itself if people dig in and understand it. Perhaps some
bystanders will learn something.

I welcome improving and even disrupting and replacing dbus but I don't think
the kind of criticism found in this article will lead to that.

~~~
Demiurge
I did learn a lot, thanks for commenting. It also seems like you agree with
the premise of the article that there is confusing stuff in what you called
the 'informal' spec, but you explained why. So, thanks again.

------
quotemstr
Under Windows, COM and RPC are a dream compared to this mess. Interfaces are
clearly-defined, versioned entities that support transmitting arbitrarily
complex values (even ones containing circular references!) over a variety of
protocols. It's also very fast: the "ncalrpc" transport uses ALPC, which is
the best IPC system I've seen on any platform. [1]

Using dbus, I feel like I have to type the same goddamn namespace name three
or four times and that I never really feel like I've gotten it right.

Also, FOSS developers are in general completely oblivious to interprocess race
conditions. When you refer to something by an "id" and that "id" can be reused
and recycled between subsequent calls, you've gotten it fundamentally wrong. I
see developers make this mistake over and over in POSIXland.

[1] No, ALPC isn't RPC-specific like kdbus: it's just a very, very good
implementation of message passing over socket-like kernel handles.
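
One common way to avoid the id-reuse race mentioned above is to pair each slot index with a generation counter, so a stale reference is detected instead of silently pointing at a new object. The sketch below is a toy; `HandleTable`, `StaleHandle`, and the method names are hypothetical and not taken from any of the systems discussed.

```python
# Toy handle table: a slot index alone can be reused, so each handle also
# carries a generation number that changes whenever the slot is recycled.

class StaleHandle(Exception):
    pass


class HandleTable:
    def __init__(self):
        self._slots = []   # slot -> (generation, object or None)
        self._free = []    # recycled slot indices

    def allocate(self, obj):
        if self._free:
            slot = self._free.pop()
            generation = self._slots[slot][0] + 1
            self._slots[slot] = (generation, obj)
        else:
            slot = len(self._slots)
            generation = 0
            self._slots.append((generation, obj))
        return (slot, generation)

    def lookup(self, handle):
        slot, generation = handle
        current_generation, obj = self._slots[slot]
        if current_generation != generation or obj is None:
            raise StaleHandle(handle)
        return obj

    def release(self, handle):
        self.lookup(handle)  # raises if the handle is already stale
        slot, generation = handle
        self._slots[slot] = (generation, None)
        self._free.append(slot)


table = HandleTable()
old = table.allocate("process A")
table.release(old)
new = table.allocate("process B")  # reuses the slot with a new generation
```

Here `old` and `new` share a slot index, but looking up `old` raises `StaleHandle` rather than returning "process B".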

~~~
pjc50
Interesting, because one of the things that attracts me to Linux over Windows
is the use of well-defined interoperable wire protocols rather than RPC
systems. This makes it much easier for the different ends of the communication
to be written by unrelated teams without quite such coordination costs.
Meanwhile RPC-style is much easier _if the same team is writing both ends_.

This is why there are a zillion IMAP clients and several servers but only one
reliable Exchange client and server, the Microsoft one. The wire protocol is
basically MAPI over DCOM, and this requires a server architecture identical to
the MS one (and at much greater risk of copyright/patent problems).

GNOME have been able to produce such a bad wire protocol for dbus because
they're writing both ends and happy to break compatibility regularly. It
doesn't have to be this way and is a function of the project management style.

(There are some Linux clients for Exchange but no server, which is the wrong
way round as it's much easier to get people to switch on the server rather
than the desktop)

~~~
quotemstr
> because one of the things that attracts me to Linux over Windows is the use
> of well-defined interoperable wire protocols rather than RPC systems.

NFS is an RPC system. Both protocol-based and RPC-based systems can be well-
documented and well-specified. Both kinds of system can be arbitrary
nightmares. I don't think you're drawing a useful distinction. Both ecosystems
have some services provided over RPC and some services provided over some kind
of other message format.

> [RPC] requires a server architecture identical to the MS one

No it doesn't. It "requires" Exchange-style architecture in the same way that
IMAP "requires" maildir storage. In the end, you have a message-passing
protocol, and you can choose to respond to messages any way you'd like.

> GNOME have been able to produce such a bad wire protocol for dbus because
> they're writing both ends and happy to break compatibility regularly

They claim that the dbus wire protocol is stable and that individual dbus
services will present broadly compatible interfaces. Dbus is for everyone, not
just internal communication inside GNOME. That's what makes dbus being awful
especially annoying.

~~~
pjc50
It's a stylistic opinion rather than a hard distinction. Yes, in both cases
you're exchanging messages and responses between turing-equivalent systems.
However, the systems you end up with are different depending on which pieces
are designed first and with higher priority.

------
bitwize
Careful! If you accidentally discover what dbus is for and why it is here,
Lennart will immediately deprecate it and replace it with something even more
bizarre and inexplicable! And we'll never interoperate and the year of the
Linux desktop gets pushed back another decade or two...

~~~
panzi
"Some people say this has already happened."

------
digi_owl
Find myself reminded of Joel going over the Excel file format specs after MS
released them, and found off hand references like "See Lotus 1-2-3 spec" or
something of that nature.

------
tux3
That is actually really scary.

I tried reading about DBUS a couple of times before, but gave up before
getting more than a high level overview.

------
teddyh
So the documentation apparently sucks. OK, this is a bad sign, but can be
fixed. Also, apparently the raw on-the-wire protocol is less than optimal.
This is harder to change, but the implementation of kdbus has an excellent
opportunity to fix this for itself and its associated client library.

~~~
felixgallo
If it's not clear from this that kdbus should be rejected and everyone
associated with it removed from any kernel related work immediately...

------
voltagex_
And this shit is being bolted on to the kernel?! At least the documentation
might improve a bit...

~~~
Spidler
Actually, the stuff in the kernel explicitly doesn't care about content of
messages.

Also, systemd-dbus has already done away with the XML config files. They are
parsed only as legacy/compat mode.

~~~
deno
As someone not very familiar with D-Bus I’d like to ask what’s exactly wrong
with the XML configuration files? It just seems like something everyone takes
for granted. From the examples linked in this thread[1] they seem pretty clear
and readable to me.

[1]
[https://news.ycombinator.com/item?id=8649477](https://news.ycombinator.com/item?id=8649477)

------
guipsp
>So there seems to be some confusion what things like "binary" mean

I'm pretty sure binary, in this context, just means you can't open a payload in
notepad.

~~~
hp
No. It means everything is on-wire essentially the same as it would be in
memory. Read the spec:
[http://dbus.freedesktop.org/doc/dbus-specification.html](http://dbus.freedesktop.org/doc/dbus-specification.html)
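
As a sketch of what "the same as in memory" plus an endianness marker looks like, the snippet below packs and unpacks the fixed 12-byte start of a message header as the linked spec lays it out: a byte-order flag (`l` or `B`), message type, flags, protocol version, then two 32-bit integers for body length and serial. The variable-length header fields that follow are omitted, and the helper names are illustrative; check the spec before relying on the layout.

```python
# Pack/unpack the fixed 12-byte start of a message header.
# Layout follows the linked spec as understood here; helpers are illustrative.
import struct

PROTOCOL_VERSION = 1


def pack_fixed_header(byte_order, msg_type, flags, body_len, serial):
    # The first byte tells the reader how to decode the integers that follow.
    prefix = "<" if byte_order == b"l" else ">"
    return byte_order + struct.pack(
        prefix + "BBBII", msg_type, flags, PROTOCOL_VERSION, body_len, serial
    )


def unpack_fixed_header(data):
    byte_order = data[0:1]
    if byte_order not in (b"l", b"B"):
        raise ValueError("unknown byte-order flag")
    prefix = "<" if byte_order == b"l" else ">"
    msg_type, flags, version, body_len, serial = struct.unpack(
        prefix + "BBBII", data[1:12]
    )
    return {"type": msg_type, "flags": flags, "version": version,
            "body_len": body_len, "serial": serial}


little = pack_fixed_header(b"l", 1, 0, 5, 42)
big = pack_fixed_header(b"B", 1, 0, 5, 42)
decoded_little = unpack_fixed_header(little)
decoded_big = unpack_fixed_header(big)
```

The two byte strings differ on the wire but decode to the same values, which is why endianness comes up in the spec at all.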

This article is essentially "I didn't understand something after spending 15
minutes (or so) on it, and here are my criticisms of how I speculate this
might work."

You know, fair enough. But if you as the reader want actual knowledge you can
read the docs and code yourself and spend more than 15 minutes (or however
long it was, but not long enough to have accurate info for sure).

There are hundreds of people and packages using dbus after many similar
technologies were tried and didn't catch on. A curious person might ask why.

~~~
guipsp
I think DBus is a great technology, and that this article is terrible. What I
was saying is that "binary format" really doesn't mean anything.

------
general_failure
In this day, I would just use http and websockets instead of dbus even for
local communication. It just makes sense

~~~
hp
Switching dbus to websocket/http instead of its own outer layer equivalent
would get you about 1% done implementing dbus. It doesn't address or answer
99% of why dbus exists or what it does.

So it's a fine thing to consider (since websocket exists now) but I'd question
the word "just" here.

It's like saying the way you'd implement an Amazon web service would be to
"just use http." OK. Now what is the service? ;-) I hope that makes sense.

It's a mistake to view the problem solved here as "sending messages." The
problem is all about the semantics of sending them and the lifecycle services
provided by the central daemon.

~~~
general_failure
Can you tell me something which dbus solves but something over http cannot? By
this I mean in practice and not what it can possibly do. I have used it in KDE
and network manager and the like. They are all better off with rest style APIs
actually.

~~~
hp
That question is like asking "can you tell me something the Twitter API solves
that http does not?" - it doesn't make any sense. They are distinct layers.
One builds on the other.

http+websocket is a way to set up a full-duplex stream of messages where
messages are anything you like.

dbus does have that part, but then it defines additionally what the messages
actually look like in enough detail to bind them to method calls; it defines
semantics such as guaranteed ordering and errors; it defines a central bus
daemon; it adds broadcast messages over the daemon; it adds security features
to allow mixing user and system domains; it adds a way to locate the bus
daemon; it adds a way to launch and track the lifecycle of named processes;
etc, a number of other APIs. It is not just a socket.

Could you implement a dbus-equivalent using http? Sure. But http does not
include a "free" implementation of dbus, any more than it includes an
implementation of Twitter.

dbus-on-http would have to define how method signatures and types map into
http, and then it would still have to actually implement the daemon with its
features and semantics.

http wouldn't make any material difference here; it would have some bikeshed-
level pros and cons, but not change the system design in a material way.

~~~
general_failure
I get all that. What I was trying to say is that services that provide REST
APIs are better and easier to use than those that provide DBus APIs.

~~~
dragonwriter
How are REST and DBUS exclusive alternatives? DBUS is a protocol, REST is a
protocol-agnostic architectural style. Why not use the REST architectural
style when defining DBUS APIs?

Or is "REST" being used to mean "HTTP" here?

