
MQTT: A Conceptual Deep-Dive - rchaudhary
https://www.ably.io/concepts/mqtt
======
bdamm
One of the best features of MQTT that is often overlooked is that the topics
are created on the fly and also destroyed on the fly. The protocol does not
require the server to maintain topics lists (assuming no retained messages).
The server therefore scales based on the throughput and the number of clients
(and their subscriptions) but not the size of the topic tree. The topic tree
designer is therefore free to use enormous topic spaces with billions of
topics if they so desire.
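
A minimal sketch of why that holds, not taken from any real broker: the only state kept is a subscription table, and every published topic is matched against it on the fly. The names `topic_matches` and `TinyBroker` are made up for illustration, and retained messages and `$`-prefixed topics are left out.

```python
from collections import defaultdict


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Match a topic name against a filter using MQTT '+' and '#' wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    def __init__(self):
        # State grows with subscriptions, not with the number of topics seen.
        self.subscriptions = defaultdict(set)  # topic filter -> client ids

    def subscribe(self, client_id, topic_filter):
        self.subscriptions[topic_filter].add(client_id)

    def publish(self, topic, payload):
        """Return the sorted client ids that should receive this message."""
        receivers = set()
        for topic_filter, clients in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                receivers |= clients
        return sorted(receivers)


broker = TinyBroker()
broker.subscribe("dashboard", "site/+/temperature")
broker.subscribe("logger", "site/#")
for sensor in range(1000):
    # 1000 distinct topics pass through without being stored anywhere.
    broker.publish(f"site/{sensor}/temperature", b"21.5")
print(len(broker.subscriptions))
```

The linear scan over filters is only for brevity; a real broker would more likely use a trie keyed on topic levels.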

~~~
Arnavion
It's also a bad feature for bandwidth-constrained or low-power devices since
it requires sending the whole topic string with every message. That's why
MQTT-SN, and now MQTT 5, allow registering a topic string -> smaller ID
mapping and then using the ID for messages at the expense of persistent state
on both ends.
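
To put rough numbers on that overhead, here is a sketch that builds a QoS 0 PUBLISH packet by hand, once in the MQTT 3.1.1 layout with the full topic string and once in the MQTT 5 layout with an already-registered Topic Alias (property identifier 0x23). The function names, topic, and payload are invented for the example, and MQTT-SN is not covered.

```python
def encode_varint(n: int) -> bytes:
    """MQTT variable-length integer: 7 bits per byte, high bit means 'more'."""
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80
        out.append(digit)
        if n == 0:
            return bytes(out)


def publish_v311(topic: str, payload: bytes) -> bytes:
    """QoS 0 PUBLISH, MQTT 3.1.1 layout: the full topic travels every time."""
    t = topic.encode("utf-8")
    body = len(t).to_bytes(2, "big") + t + payload
    return bytes([0x30]) + encode_varint(len(body)) + body


def publish_v5_alias(alias: int, payload: bytes) -> bytes:
    """QoS 0 PUBLISH, MQTT 5 layout: empty topic name plus a Topic Alias property."""
    props = bytes([0x23]) + alias.to_bytes(2, "big")
    body = b"\x00\x00" + encode_varint(len(props)) + props + payload
    return bytes([0x30]) + encode_varint(len(body)) + body


topic = "building/7/floor/3/room/12/temperature"  # hypothetical topic
payload = b"21.5"
print(len(publish_v311(topic, payload)), "bytes with the full topic")
print(len(publish_v5_alias(1, payload)), "bytes with a topic alias")
```

The alias version stays the same size however long the topic is, which is the trade being described: fewer bytes per message in exchange for both ends remembering the mapping.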

~~~
bdamm
I've avoided MQTT-SN (and consequently have limited MQTT to high power
segments of our stack, yes a phone is high power) due to lack of robust
implementations. If that changes then I'd reconsider.

~~~
rndmio
If you follow the minutes of the MQTT TC at OASIS, after the release of MQTT
v5 there is now talk of formalising the MQTT-SN spec in a similar fashion,
that should lead (in time) to more robust implementations.

------
baruchthescribe
I once had a client where the only port available for me to use on their
firewalls was for MQTT (1883) because that's how we were getting sensor data
from them. They would not open anything else for us no matter how we implored
them so I wrote a live TCP wrapper over MQTT to get around it. It was a local
multithreaded TCP daemon that listened for outbound requests on a certain
port, wrapped them in MQTT and then published them using a unique topic. The
server daemon would detect these topics and unwrap them before forwarding to
our server processes. So the client machine thought it was making a live TCP
connection to our server but in the middle was a funky invisible MQTT wrapper.
It was really elegant once it worked but my goodness was it a pain to debug -
a couple of months before I got the whole thing right because of all the blind
alleys I went down.

~~~
jsilence
Open sourced it?

~~~
baruchthescribe
Sadly not - the code belongs to the company I worked for. I should cleanroom
it from scratch and open source it.

------
shaunpersad
I personally have found the following blog posts extremely helpful in learning
about MQTT:
[https://www.hivemq.com/tags/mqtt-essentials/](https://www.hivemq.com/tags/mqtt-essentials/)

Also, the spec is actually pretty readable:
[http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-o...](http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html)

Note: I'm not affiliated with HiveMQ.

~~~
thunderbong
That's a really good series of blog posts. Thanks.

------
Matthias247
Having recently tried to implement a good MQTT library for embedded devices
from scratch (surprisingly, there is none that can actually do async message
delivery), I found the protocol to have a surprising number of shortcomings.
The biggest one concerns the actual reliability of QoS1/2 transfers, due to
how sessions and message IDs work:

Each message in MQTT carries a 16-bit message ID. The protocol allows retries
of message delivery for QoS1/2, but the revised specifications allow retries
only if the session got disconnected.

Now this raises the question of what a client should actually do when it does
not get an ACK within a certain timeframe. Since retries are not allowed, an
alternative is to close the connection. That allows for resending the
message, but opens up other problems. First of all, reconnecting when only a
single message timed out introduces a huge overhead, which is not what we want
for small connected devices.

In addition to that there is an issue with the persistent session feature,
which requires the client and server to track all message IDs across
connection attempts. This implies that if a client does not get an ACK for a
certain message ID, it can never reuse the ID, not even across reconnects
(assuming clean_session=false).

The tiny 16-bit message ID space also requires clients to remember sent
messages and prevents them from easily dropping/cancelling pending
transmissions if they are outdated or superseded. The server might still
respond to those IDs, and if we send a new message with the same ID there is
ambiguity. So technically the client would need to track all used message IDs
until they are ACKed. That might be never, which is not reasonable for a small
IoT device. Common clients just ignore the problem and reuse message IDs
whenever convenient, which means they are actually not reliable. A bigger ID
space would have made the problem go away by allowing unique IDs. But with the
current spec it seems hard to really provide reliability and non-ambiguity
using the MQTT QoS guarantees.
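
A sketch of the bookkeeping this implies, assuming a hypothetical `PacketIdAllocator` class that is not part of any MQTT library: an ID is reserved when a message is sent and only an ACK releases it, so a message that never gets its ACK pins its ID indefinitely.

```python
class PacketIdAllocator:
    """Hands out MQTT packet IDs (1..65535) and never reuses an unacknowledged one."""

    MAX_ID = 0xFFFF

    def __init__(self):
        self.in_flight = set()
        self._next = 1  # 0 is not a valid packet ID

    def allocate(self) -> int:
        if len(self.in_flight) >= self.MAX_ID:
            raise RuntimeError("all 65535 packet IDs are awaiting an ACK")
        # Skip over IDs that are still waiting for their ACK.
        while self._next in self.in_flight:
            self._next = self._next % self.MAX_ID + 1
        packet_id = self._next
        self.in_flight.add(packet_id)
        self._next = self._next % self.MAX_ID + 1
        return packet_id

    def acknowledge(self, packet_id: int) -> None:
        # Only an ACK frees the ID; a local timeout or cancel does not.
        self.in_flight.discard(packet_id)


ids = PacketIdAllocator()
first = ids.allocate()
second = ids.allocate()
ids.acknowledge(second)
# 'first' never got its ACK, so it stays reserved, possibly forever.
print(sorted(ids.in_flight))
```

On a small device the `in_flight` set is exactly the state that is hard to afford, which is why many clients skip it and accept the ambiguity.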

So the conclusion was that there is no way to provide real reliability with
the defined QoS classes on MQTT, since real reliability leaves no room for
ambiguity.

From my point of view the best way to add reliability on top of MQTT is to add
a custom reliability layer on the application layer and just use QoS0
transmissions. Those work somewhat better.

Depending on the application a different protocol (fully custom, HTTP long
polling with a persistent connection, Thrift, grpc, etc) might also be a
reasonable choice.

~~~
geokon
I'm pretty ignorant on this topic, but shouldn't retrying and missing messages
be handled by the TCP layer? You don't want multiple network layers to be
doing the same work, after all.

~~~
Matthias247
Yes, TCP already guarantees reliable byte streaming. However, messages can get
lost at higher levels: some MQTT libraries or brokers will drop messages if
they are out of memory or some internal queue is full. Or the application
level software does the same, or forgets to explicitly ACK a message (some
MQTT libraries delegate all ACK sending to the user and don't handle it in the
library). In those cases the remote peer would need to handle the missing ACK
in a reasonable fashion.

~~~
vinay_ys
We can learn from the mechanisms applied by the TCP layer for reliable end to
end packet transmission and adapt those mechanisms at the application layer
for reliable message delivery. For example, any pair of applications that need
to send/receive messages can efficiently keep track of which sequential
message IDs have been transmitted, acknowledged, or are yet to be
acknowledged, via a windowing mechanism. They then stop transmitting and wait
for acks when the unacked message window is full, put a timeout on these
waits, and reset the window to recover and retransmit. We can have performance
statistics that provide visibility without much fuss.
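
A minimal sketch of such a window on the sending side, under a few assumptions not in the comment above: acks are cumulative, the clock is passed in as `now` so the logic is easy to test, and `WindowedSender` is a made-up name rather than an existing library class.

```python
from collections import OrderedDict


class WindowedSender:
    def __init__(self, window_size: int, timeout: float):
        self.window_size = window_size
        self.timeout = timeout
        self.next_seq = 0
        self.unacked = OrderedDict()  # seq -> (payload, time last sent)

    def can_send(self) -> bool:
        return len(self.unacked) < self.window_size

    def send(self, payload: bytes, now: float) -> int:
        if not self.can_send():
            raise RuntimeError("window full, wait for acks")
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, now)
        return seq

    def ack(self, seq: int) -> None:
        """Cumulative ack: everything up to and including seq is delivered."""
        for s in [s for s in self.unacked if s <= seq]:
            del self.unacked[s]

    def due_for_retransmit(self, now: float) -> list:
        """Sequence numbers whose ack is overdue; their timers are restarted."""
        due = [s for s, (_, sent) in self.unacked.items()
               if now - sent >= self.timeout]
        for s in due:
            self.unacked[s] = (self.unacked[s][0], now)
        return due


sender = WindowedSender(window_size=2, timeout=5.0)
sender.send(b"a", now=0.0)
sender.send(b"b", now=1.0)
print(sender.can_send())                    # window is full at this point
print(sender.due_for_retransmit(now=5.5))   # only the first message is overdue
sender.ack(1)
print(sender.can_send())
```

Because the sequence numbers here are unbounded integers, the ID-reuse ambiguity of a 16-bit space does not arise; a wire format would still need to pick a width.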

~~~
jacques_chester
You might be interested in RSocket: [http://rsocket.io/](http://rsocket.io/)

------
noobiemcfoob
Can someone justify MQTT over HTTP and WebSockets?

Before you jump down my throat, I've used all three protocols to a fair
degree. MQTT was the most painful, and without strong justification, what's
the point? I always read a bandwidth usage justification, but if that's the
case, someone should be able to tell me the number of overhead bytes saved
using MQTT over WebSockets.

~~~
dwild
MQTT is interesting for IoT devices. These devices are low power, low memory,
low speed, and may have intermittent connections.

The alternative to MQTT wouldn't be HTTP or WebSockets, but actual sockets.

MQTT is simple to implement and doesn't require much performance. Many of its
implementations don't support its QoS feature that retries because the
embedded processor doesn't have enough memory.

~~~
noobiemcfoob
Both MQTT and HTTP are built on top of TCP sockets.

A HTTP client can be implemented by hand with just a few examples in a few
hours. I was unable to do the same with MQTT.

~~~
dwild
Sure an HTTP client can be done, but you won't get the functionalities of MQTT
and you'll have to support it yourself.

MQTT is always on and it's bidirectional. The client can both listen to the
server and talk to it. You sure can do that using some kind of push/pull over
HTTP, but that's going to be costly. You can sure do HTTP long polling, but
that is most probably a code smell.

MQTT can be used simply to broadcast something, which is quite useful in the
IoT world. You've got some data that has changed, and another client can
easily get it and do something with it (either log it, show it, or change its
behaviour based on it). Again, something that you'll have to implement over
HTTP.

> A HTTP client can be implemented by hand with just a few examples in a few
> hours.

If an implementation for it already exists (and more likely than not it does)
you are better off using it than rewriting it.

Sure you can't do it in hours, but MQTT is simple enough that there's already
an implementation for you somewhere. Though seeing this source code, I'm
pretty sure it can be pulled off in hours still [1].

[1]
[https://github.com/knolleary/pubsubclient/blob/master/src/Pu...](https://github.com/knolleary/pubsubclient/blob/master/src/PubSubClient.cpp)

~~~
noobiemcfoob
This is a much better argument than your first comment.

For a persistent connection, you might want to move from HTTP to WebSockets,
which also brings the bidirectional quality and some more of the pub/sub
qualities. Though WebSockets' complexity compared to just HTTP is similar to
MQTT.

Being able to implement the protocol yourself is less something I would do in
production and more a comment on how easy a protocol might be to grok.

I wouldn't expect to implement my own WebSockets client. So this and the
relative overlap of features puts MQTT and WebSockets in the same bucket for
me (far more so than MQTT and a raw socket).

Summarizing my views: MQTT seems as opaque as WebSockets without the benefits
of being built on a very common protocol (HTTP) and being used in industries
beyond just IoT. The main benefits proponents of MQTT argue for (low
bandwidth, small libraries) don't seem particularly true in comparison to HTTP
and WebSockets.

~~~
dwild
> built on a very common protocol (HTTP) and being used in industries beyond
> just IoT.

For use cases beyond IoT, sure, use everything else, but MQTT is great
specifically for IoT. It's like saying x86 is better than the other 32-bit
architectures used by microcontrollers because it's used in industries beyond
embedded. What is used in embedded makes sense for embedded only. If your
needs are beyond that, go beyond.

> I wouldn't expect to implement my own WebSockets client. So this and the
> relative overlap of features puts MQTT and WebSockets in the same bucket for
> me (far more so than MQTT and a raw socket).

I wouldn't even consider implementing a WebSockets client. I would consider
porting the 600 lines of that library I linked though. Would you consider
porting this library instead [1]?

> don't seem particularly true in comparison to HTTP and WebSockets.

I use MQTT on my ESP8266 devices. That thing has 96 KB of RAM and 4 MB of
flash memory. Sure, WebSocket can work on it, and sure, you can write an HTTP
server on it, but in comparison, the library I linked is 600 lines of code, is
simple to use, and does everything that is needed for IoT really simply.

Even if you were to try to replicate that in WebSocket, you would still have
to add a way to broadcast that information easily, instead of simply having
another client that subscribes on the same server and listens to the topic you
publish to.

[1]
[https://github.com/Links2004/arduinoWebSockets/tree/master/s...](https://github.com/Links2004/arduinoWebSockets/tree/master/src)

~~~
noobiemcfoob
> What is used in embedded makes sense for embedded only

Certainly, with that attitude. Personally, I fight that mentality when I can.
The reason to lean on technology that is used in multiple disciplines is to
increase the pool of viable programmers. If it's the wrong tool, then it's the
wrong tool. But otherwise use the tool that the most can use.

On porting libraries...well, the WebSockets client I play with is only 60
lines. So, I don't know, you tell me:
[https://github.com/danni/uwebsockets/blob/esp8266/uwebsocket...](https://github.com/danni/uwebsockets/blob/esp8266/uwebsockets/client.py)

And yeah, that's micropython (because I love inviting the wrath of embedded
developers). But that will run on your ESP8266 just fine. I use the D1 mini :)

You're definitely right on having to implement more of the pub/sub on your
WebSockets server. I misspoke in saying WebSockets gives you those qualities.

~~~
dwild
> Certainly, with that attitude. Personally, I fight that mentality when I
> can. The reason to lean on technology that is used in multiple disciplines
> is to increase the pool of viable programmers. If it's the wrong tool, then
> it's the wrong tool. But otherwise use the tool that the most can use.

Which is my point: the right tool in that case is one that fits in the memory
footprint of an embedded microcontroller.

> On porting libraries...well, the WebSockets client I play with is only 60
> lines.

These 60 lines do almost nothing. You ignore the 240 lines inside protocol.py
right beside it. You also ignore the 600 KB binary required for MicroPython.
Already that simply CAN'T run on an ESP8266 that only comes with 512 KB (so it
can't run on any of my ESP-01 modules).

That does make me think, though, that the WebSocket protocol is quite a bit
simpler than I thought, but if I were to use it, I would plainly do a socket
connection instead, which would be even simpler.

> I use the D1 mini :)

Sure with one big enough, but we are talking about embedded, where resources
are scarce. Running Micropython is a luxury for many embedded developers.

~~~
noobiemcfoob
Yeah, I did miss the import from protocol.py. I knew it couldn't be just 60
lines but got excited and hit enter. Regardless, to your earlier point, I
don't dig into the protocol's details often and just rely on the library.

Micropython and this WebSocket client _DO_ run on the ESP8266. As noted
before, I'm using these libraries on a D1 mini. I don't know the specifics of
the memory footprint as I stay away from applications where I'm up against the
wall of my device's capabilities. But the point stands that there's plenty
enough space on the ESP8266 for the glorious luxury of micropython.

------
ultrafez
I have to question the overall quality of the article when it repeatedly uses
the term "channel" to refer to the well-established concept of a "topic" in
MQTT.

~~~
adrianmonk
I'm not very knowledgeable about MQTT specifically, but it seems like fair
game considering the protocol spec
([https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html))
uses the term channel:

> _3.3.2.1 Topic Name_

> _The Topic Name identifies the information channel to which Payload data is
> published._

------
mrighele
> 0 = at most once = server fires and forgets — messages may be lost or
> duplicated

Isn't "at most once" a misnomer? By definition, if it is at most once it is
not duplicated.

~~~
Arnavion
Yes, the article is wrong. A QoS 0 message is never resent by a well-behaved
sender, so it could not be duplicated. It might not even leave the sender's
network stack if the connection was broken before the send attempt.

------
hangonhn
"Designed for at most once, at least once and exactly once message delivery"

Exactly once message delivery cannot be guaranteed from a theoretical point of
view (if it does exist then the Two Generals Problem can be solved, which has
been proven to not have a solution). I don't know how MQTT can get around
that, especially in an environment where transient network issues are
expected.

~~~
bdamm
Well, there's a confirmation sequence that's basically a three-phase commit.

~~~
jonquark
A QoS 2 flow is 2-phase (Publish and PubRec being the first phase and PubRel
and PubComp being the second).
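
A sketch of the receiving side of those two phases, assuming the variant where the message is handed to the application as soon as the PUBLISH arrives. `Qos2Receiver` is a made-up name and the return values are just labels for the packet that would be sent back.

```python
class Qos2Receiver:
    def __init__(self):
        self.pending = set()   # packet IDs seen in PUBLISH, not yet released
        self.delivered = []    # payloads handed to the application

    def on_publish(self, packet_id: int, payload: bytes) -> str:
        # Phase 1: deliver once, remember the ID, answer PUBREC.
        # A repeated PUBLISH with a pending ID is acknowledged but not redelivered.
        if packet_id not in self.pending:
            self.pending.add(packet_id)
            self.delivered.append(payload)
        return "PUBREC"

    def on_pubrel(self, packet_id: int) -> str:
        # Phase 2: the sender has seen PUBREC, so the ID can be forgotten.
        self.pending.discard(packet_id)
        return "PUBCOMP"


rx = Qos2Receiver()
rx.on_publish(10, b"open valve")
rx.on_publish(10, b"open valve")   # retransmission, e.g. after a reconnect
rx.on_pubrel(10)
print(rx.delivered)
```

The de-duplication only works while both sides agree on which packet IDs are in use, which ties back to the ID-tracking concerns raised earlier in the thread.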

------
cbluth
Hi, interesting article.

Anyone care to expound on this part:

> and without any way to distinguish a deliberate disconnection from a
> transient network issue

~~~
blackflame7000
Because MQTT is optimized for low-bandwidth environments, it only has a
limited number of error responses. Therefore, when a connection has been
established, there is no way to distinguish a network issue from a forced
disconnect.

~~~
flipper65
That is not strictly true for the statement in the article. Deliberate
disconnects can be identified by implementing tombstoning, in which a
connected node sends an epitaph message to the channel prior to disconnecting.

~~~
rad_gruchalski
Yes, but that is not defined by the protocol. MQTT does not have such a
mechanism. That’s a custom solution.

~~~
rad_gruchalski
To clarify on the last will, since there was a comment here that has since
been deleted (and a downvote to my parent comment, thanks for that, by the
way...).

Last will is given to the broker at connect time. It doesn’t tell the broker
that a disconnect is / isn’t deliberate. Last will has nothing to do with the
reasoning for disconnect. It tells the broker what to do in case of a
disconnect.
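
A rough model of that broker-side behaviour, going by the MQTT 3.1.1 rule that a stored will is dropped when a DISCONNECT packet arrives and published when the connection ends without one. `WillBroker` and its method names are invented for the sketch.

```python
class WillBroker:
    def __init__(self):
        self.wills = {}       # client id -> (topic, payload) supplied in CONNECT
        self.published = []   # (topic, payload) the broker sent to subscribers

    def on_connect(self, client_id, will_topic=None, will_payload=None):
        # The will is registered up front, long before any disconnect happens.
        if will_topic is not None:
            self.wills[client_id] = (will_topic, will_payload)

    def on_disconnect_packet(self, client_id):
        # Client sent DISCONNECT: the stored will is discarded unpublished.
        self.wills.pop(client_id, None)

    def on_connection_lost(self, client_id):
        # Socket closed or keep-alive expired without DISCONNECT: publish the will.
        will = self.wills.pop(client_id, None)
        if will is not None:
            self.published.append(will)


broker = WillBroker()
broker.on_connect("sensor-1", "status/sensor-1", b"offline")
broker.on_connect("sensor-2", "status/sensor-2", b"offline")
broker.on_disconnect_packet("sensor-1")   # clean shutdown
broker.on_connection_lost("sensor-2")     # network dropped
print(broker.published)
```

Subscribers therefore only ever see the will for the second case; a clean shutdown produces no message at all unless the client publishes one itself.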

~~~
blackflame7000
It's highly annoying when you get downvoted so close to 500 where you earn the
coveted ability. Furthermore, you are correct so enjoy some karma :)

------
will0
Oh hey, I work with Andy Stanford-Clark

------
terminalhealth
What's the relation of this to Bluetooth Low Energy (BLE)?

~~~
terminalhealth
To answer my own question, it looks like MQTT has a slightly higher power
usage due to the TCP/IP-based connection (though still a much more compact
header than HTTP). Both offer notification mechanisms, but BLE can only do
"one master (e.g. iPhone) to many slaves (e.g. some IoT devices)" connections.
MQTT can be done via Wi-Fi, Ethernet, serial etc., but even over BLE.

[https://dzone.com/articles/protocols-and-standards-behind-io...](https://dzone.com/articles/protocols-and-standards-behind-iot-coap-ble-mqtt-d)

