
Real-time notifications from systemd to Slack - mirceasoaica
https://www.scaledrone.com/blog/posts/real-time-notifications-from-systemd-to-slack
======
lima
Why is everyone okay with relying on Slack for sensitive communication or even
business critical stuff like monitoring?

It lacks everything a good alerting system has (acknowledgements, fine-grained
notifications...).

Even those addresses in the stacktraces could be sensitive in other situations
since they contain ASLR offsets.

~~~
notheguyouthink
What would you suggest? We're in need of an alerting platform and making use
of Slack, not because it's flawless or anything, but because we're already
paying for it and it's convenient for the mass junk. Ie, not strictly mission
critical alerts.

What medium might be better than Slack?

 _(to be clear, I'm asking, not saying Slack is good for this)_

~~~
amyjess
Zabbix. It's open-source and pretty flexible. I maintain my company's Zabbix
installation, and I've come to appreciate it (and 3.x finally has a non-ugly
UI!).

~~~
lima
We use Nagios + CheckMK, but I believe neither that nor Zabbix solves the
alerting part.

Our solution is a custom notification broker that decides whom to alert and
then waits for an acknowledgement. It uses different backends including our
company chat.

Not complicated at all, just 100 lines of Python code that contain the
business logic.

Anything that relies on a single medium is unsuitable for anything but
unimportant alerts. What if Slack goes down for 2 hours? Unlikely, but
definitely possible.

This ensures that every alert is explicitly acknowledged by someone, and that
unimportant alerts are quickly forgotten without wondering whether someone
handled them or not.

We have different applications sending alerts, not just Nagios (because Nagios
sucks at processing events as opposed to states), and it would quickly become
unmanageable without some sort of middleware.
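
For illustration, a minimal sketch of the kind of broker described above: try several backends per recipient and escalate until someone acknowledges. This is not lima's actual code. Every name here (`Broker`, `Alert`, the backend callables, the `acked` callback) is hypothetical, and "waiting for an acknowledgement" is reduced to a synchronous callback.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A backend is any callable that tries to deliver a message to a recipient
# and returns True on success (chat, SMS, e-mail, ...). Hypothetical type.
Backend = Callable[[str, str], bool]


@dataclass
class Alert:
    source: str            # e.g. "nagios" or an application name
    message: str
    important: bool = True


@dataclass
class Broker:
    backends: List[Backend]
    on_call: List[str]                      # escalation order
    acked: Callable[[Alert, str], bool]     # did this recipient acknowledge?
    log: List[str] = field(default_factory=list)

    def notify(self, alert: Alert, recipient: str) -> bool:
        """Try every backend in turn; stop at the first one that delivers."""
        for backend in self.backends:
            try:
                if backend(recipient, f"[{alert.source}] {alert.message}"):
                    return True
            except Exception as exc:  # a dead backend must not stop the alert
                self.log.append(f"backend error: {exc}")
        return False

    def dispatch(self, alert: Alert) -> Optional[str]:
        """Return who acknowledged the alert, or None if nobody did."""
        if not alert.important:
            # Unimportant alerts are delivered once and then forgotten.
            if self.on_call:
                self.notify(alert, self.on_call[0])
            return None
        for recipient in self.on_call:
            if self.notify(alert, recipient) and self.acked(alert, recipient):
                return recipient
            self.log.append(f"escalating past {recipient}")
        return None
```

Because backends are tried in order, a chat outage just means the next medium gets used, which is the point about not relying on a single medium.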

~~~
daveguy
Is it something where the business logic part could be easily abstracted /
separated? It sounds like an interesting and useful yet simple tool. The open
source community can always use more of those.

Edit: or maybe something like a blog post to describe the structural details.

~~~
lima
Would need some clean-up but sure, why not. Ask again in a month or so :)

------
Xylakant
For a previous setup we built a small script that attaches to dbus and just
listens to the notifications and pushes interesting events to our logging
service. Once you understand how dbus works, that's actually pretty neat and
catches crashes as well as intentional status changes.
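
The script itself isn't shown here, so this is only a sketch of the filtering step. It assumes payloads shaped like the standard D-Bus `PropertiesChanged` signal that systemd emits for its `org.freedesktop.systemd1.Unit` interface. The function name, the choice of "interesting" states, and the log format are made up for illustration. Connecting to the bus and shipping the line to a logging service are left out.

```python
from typing import Dict, Optional

UNIT_IFACE = "org.freedesktop.systemd1.Unit"
# ActiveState values treated as worth logging in this sketch (an assumption).
INTERESTING_STATES = {"failed", "active", "inactive"}


def interesting_event(unit: str, interface: str,
                      changed: Dict[str, object]) -> Optional[str]:
    """Turn one PropertiesChanged payload into a log line, or None to drop it."""
    if interface != UNIT_IFACE:
        return None
    state = changed.get("ActiveState")
    if state not in INTERESTING_STATES:
        # Transitional states such as "activating" are skipped.
        return None
    level = "ERROR" if state == "failed" else "INFO"
    return f"{level} {unit} is now {state}"
```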

------
arjie
Can anyone who's used systemd share what it's like to have it for process
management? We use monit, and it's all right, but I occasionally wonder.

Does anyone run it also as a separate user to manage certain applications? So
that you could have certain people log on and operate some things and others
operate other things?

~~~
sandGorgon
systemd is far superior to anything else out there. It matured at roughly the
same time as docker.

However it doesn't support being the CMD in a Dockerfile. Which is why it's
not very common in software deployment scenarios in the post-container world.

For older deployments, it may not be worth switching to systemd because the
base OS may not be compatible.

So it's kind of a catch-22.

If you are on bare metal, systemd is much preferable to runit/supervisord.

~~~
chimeracoder
> However it doesn't support being the CMD in a Dockerfile. Which is why it's
> not very common in software deployment scenarios in the post-container
> world.

Er, that's only half-true. systemd isn't great for running as PID 1 inside a
Dockerfile, but that's because Docker already monitors PID 1[0], and systemd
can be used to monitor your container itself.

In other words, think of containers as individual applications that you want
to monitor, and systemd can be used either to monitor them or even to run the
containers directly. (Yes, systemd can even run Docker containers directly,
without Docker![1])

[0] you _are_ using exec mode, right?

[1] [https://chimeracoder.github.io/docker-without-docker/#1](https://chimeracoder.github.io/docker-without-docker/#1)

~~~
sandGorgon
Actually I'm not sure we are referring to the same thing here.

I'm referring to pid1 inside the docker container. systemd does not run inside
the container as pid1 very easily.

Take a look at this -
[https://github.com/docker/docker/pull/13525](https://github.com/docker/docker/pull/13525)

I think your presentation was about replicating docker functionality using
systemd-nspawn, which is pretty cool. But it's not the same as what I'm
talking about.

I'm referring more generally to production decisions with docker. Also read
this:
[https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zomb...](https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/)

~~~
chimeracoder
> Actually I'm not sure we are referring to the same thing here. I'm referring
> to pid1 inside the docker container. systemd does not run inside the
> container as pid1 very easily.

We are - I'm saying that you don't actually want to run systemd as PID 1
inside a Docker container; the Docker model is built around the container
being an application unit, not a system unit.

But if you want to have isolated system(d) units, you can use systemd to get
that behavior inside containers. In that case, you'll want to use systemd to
run your containers instead of Docker, because systemd's tooling is container-
aware (ie, you can have integration between units that run on your host and
units that run inside a machine - 'machine' being the systemd term for
'container', in this case).

~~~
sandGorgon
That is a simplification - even Facebook runs sshd inside its containers.

I know what you are saying - that an atomic unit of work is the program
itself. But we run stuff under supervisord even if it is a single program. It
helps us make quick debugging changes to scripts, etc. and "restart" them
without restarting the container.

In theory it seems the same - in practice it is not. This is the reason for
the existence of tons of different init tools for docker.

BTW, I had trouble understanding what you meant because you are constantly
moving from docker-as-an-application-unit concept (which is reasonably true)
to systemd-nspawn-is-better-than-docker (which is something I am not generally
opinionated about).

------
skarap
What if slack is down/unreachable? Won't it keep the service from starting?

~~~
lima
If it makes slacktee fail with an error code, yes, it would prevent startup.
The fix is a - sign:

    ExecStartPre=-/usr/bin/foo

The failed command would still be recorded, but systemd treats it as a
success, so it wouldn't prevent the service from starting.

------
orf
What about using sentry and getting better stack traces with local variables
as well as slack notifications?

Stack traces are fine and all, but without locals it's often hard to track
down the issue.

------
hendzen
This will also notify if you intentionally shut down the service.

If you want to only see failures, you can use an OnFailure directive.

------
qznc
I'm currently building something similar for matrix instead of slack. My idea
was to clone the mail(1) interface, but tee(1) is also a nice idea!

In the process, I also discovered sendxmpp, which provides mail(1) for XMPP.
It does not support encryption and is written in Perl, so I am building my
own.

------
praveshjain
I don't understand. How is it better than using HTTP POST to post to the slack
channel yourself?

~~~
jon-wood
It doesn't require changing the underlying systems that are being managed,
giving it a lot more flexibility. So long as the service can log to STDOUT it
can be integrated with Slack.
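
A rough sketch of that tee-style approach, for illustration only. It is not how slacktee is implemented. It assumes a Slack incoming webhook, which accepts a JSON body of the form `{"text": ...}`. The function names and the `webhook_url` parameter are placeholders.

```python
import json
import urllib.request
from typing import Callable, Iterable, TextIO


def slack_payload(text: str) -> bytes:
    """Encode text as the JSON body a Slack incoming webhook expects."""
    return json.dumps({"text": text}).encode("utf-8")


def post_to_slack(webhook_url: str, text: str) -> None:
    """POST one message to the webhook (webhook_url is a placeholder)."""
    req = urllib.request.Request(
        webhook_url,
        data=slack_payload(text),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10).close()


def tee(lines: Iterable[str], out: TextIO, send: Callable[[str], None]) -> None:
    """Copy every line to `out` unchanged and also hand it to `send`."""
    for line in lines:
        out.write(line)
        out.flush()
        text = line.rstrip("\n")
        if not text:
            continue  # skip blank lines instead of posting empty messages
        try:
            send(text)
        except OSError:
            pass  # Slack being unreachable must not break the pipeline
```

Wired up as `tee(sys.stdin, sys.stdout, lambda t: post_to_slack(url, t))`, the service's output passes through untouched while a copy goes to the channel, so the service itself needs no changes.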

~~~
praveshjain
Oh I get it. Thanks.

------
Curious42
Is Systemd a nohup equivalent? If it is, then what's the benefit of using one
over the other?

~~~
fnord123
nohup just runs a process that will ignore SIGHUP.

systemd can also run processes that ignore SIGHUP. But systemd does a lot of
things that nohup doesn't do. Please don't attempt to use nohup as a daemon
management system for anything but the noddiest of tasks.
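
To make the comparison concrete, here is roughly the SIGHUP part of what nohup does, sketched in Python: set SIGHUP to "ignore" in the child before the command is exec'd. The helper names are made up for this example. There is no restart, logging, or dependency handling here, which is the gap a service manager fills.

```python
import signal
import subprocess
from typing import List


def _ignore_sighup() -> None:
    # Runs in the child between fork and exec; an ignored signal stays
    # ignored across exec, so the command starts with SIGHUP ignored.
    signal.signal(signal.SIGHUP, signal.SIG_IGN)


def run_ignoring_hangup(argv: List[str]) -> subprocess.CompletedProcess:
    """Run argv with SIGHUP ignored, capturing its output as text."""
    return subprocess.run(argv, preexec_fn=_ignore_sighup,
                          capture_output=True, text=True)
```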

If you're going to use anything, you're probably best choosing from this list:

[https://en.wikipedia.org/wiki/Operating_system_service_manag...](https://en.wikipedia.org/wiki/Operating_system_service_management)

~~~
JdeBP
... which misses out quite a lot, from s6 through perp to initng.

------
nodesocket
Interesting service scaledrone. How are you different than pusher, pubnub, or
hydna.com?

~~~
raresp
At first glance I can say it's cheaper, and that is important for
startups and small/medium businesses.

They also provide examples for a lot of languages. I guess I'm going to try
their service.

Good job, Scaledrone!

~~~
SEMW
Shameless plug: if you want something almost as cheap but without having to
drop the ability to subscribe from all client libs (not just js), queryable
history, presence, connection state recovery, webhooks, stats, firehose to
queues, etc, then have a look at [https://ably.io](https://ably.io).
(disclaimer: I work there)

------
lorenzhs
Not to be overly nitpicky here but it's spelled "systemd", not "SystemD". From
its website: _Yes, it is written systemd, not system D or System D, or even
SystemD. And it isn't system d either. Why? Because it's a system daemon, and
under Unix/Linux those are in lower case, and get suffixed with a lower case
d. And since systemd manages the system, it's called systemd. It's that
simple._
[https://www.freedesktop.org/wiki/Software/systemd/](https://www.freedesktop.org/wiki/Software/systemd/)

~~~
maccard
If you're not being overly nitpicky, what are you being? If you'd clicked on
the article you would have seen that the mistake was only in the HN submission
title, and not in the article's own title.

~~~
Raidok
Both of them are fixed now.

