
Some thoughts on security after ten years of Qmail 1.0 - bello
https://blog.acolyer.org/2018/01/17/some-thoughts-on-security-after-ten-years-of-qmail-1-0/
======
pilif
Something to keep in mind with regards to qmail is that it's extremely
feature-poor and it never got features beyond its initial design goal.

This makes it much easier to keep the bugs out, to the point that making
software under such constraints is much more similar to traditional
construction projects.

I mean: Nobody ever tells you after you have built a bridge that they are now
going to upgrade gravity to gravity 2.0 with 100% more pull. And nobody will
ever tell you that your bridge will now get a shopping mall in the middle of
it where people can purchase products of their favorite brands.

Software starts to break down when it has to be pushed beyond its initial
design constraints and there is not enough time to rewrite subsystems (or all
of it), so instead you have to make the abstractions leaky and compromise.

But back to qmail:

qmail itself is so feature-poor that traditionally, nobody was or is actually
running plain qmail. Instead everybody runs "qmail", which is qmail plus some
patches -- sometimes home-grown, sometimes taken from third parties.

But more often than not they are unmaintained and very far removed from the
high quality standards of the underlying software.

This is the downside. Yes, you have a bug-free core that totally meets its
designer's (limited) use case, but in reality nobody is actually running that.

~~~
cesarb
And you need these patches to fix what is IMO qmail's worst problem:
backscatter. As far as I remember, when qmail received an email with a forged
return address destined for a non-existent mailbox, it first accepted the
email and then sent a bounce message to the forged return address. Other MTAs
(and patched qmail) reject the email directly in the SMTP session, preventing
the issue.

I personally consider this backscatter issue a design bug in qmail.

~~~
NoGravitas
I worked at a web hosting company 1999-2005 that used qmail, and while there
were _many_ things wrong with qmail, due to it not being designed for the
realities of email circa 2001, backscatter was by far the worst. We were
processing significantly more backscatter than valid email, and to the best of
my knowledge, the patches to address it didn't yet exist.

We certainly should have switched mail servers, but qmail was deeply ingrained
in our home-grown hosting automation system, and it would have been a big deal
to change.

------
diafygi
As we cast about trying to figure out ways to make software more secure or
reliable, please remember that in other engineering fields (civil, chemical,
mechanical, etc.) prioritizing safety and reliability is a _solved problem_.

(1996) [https://www.fastcompany.com/28121/they-write-right-
stuff](https://www.fastcompany.com/28121/they-write-right-stuff)

> It is perfect, as perfect as human beings have achieved. Consider these
> stats: the last three versions of the program, each 420,000 lines long, had
> just one error each. The last 11 versions of this software had a total of 17
> errors. Commercial programs of equivalent complexity would have 5,000
> errors.

> The process isn’t even rocket science. It's standard practice in almost every
> engineering discipline except software engineering.

The problem is consequences. We had centuries of people dying in bridge
collapses before we got our shit together and started prioritizing safety in
civil engineering (i.e. engineers and managers going to prison if they don't).

The same will be true for software. As more people get harmed by thrown
together software (e.g. mass panic in Hawaii, state psychological exploitation
on social media), we'll start regulating it like other engineering fields.

As a former chemical engineer, I welcome this transition, but I realize it
will likely also take centuries of hard lessons.

~~~
Joeri
The difference is that while you can’t make a bridge in your bedroom, you can
make an app.

Should we forbid people from writing code without the proper certification?
Should we close down the open internet and replace it with a regulated zone
where only compliant software can be run?

I agree that we need a higher standard of engineering in software, but I’m not
clear on how to achieve it without draconian measures.

~~~
mulmen
I made a lot of bridges and cranes in my bedroom. They all failed
spectacularly but GI Joe didn't seem to mind that the shear strength of a Lego
pin was inadequate to support his tank.

The difference is that I don't then promise a municipality that I can build
them a bridge that can take their citizens across a river, even in case of a
100 year event. Or an early warning system that can save their citizens from
nuclear holocaust. This is the difference in maturity between software
"engineering" and real engineering disciplines.

You're free to experiment with your own resources but as soon as you make a
promise to the public or your customers you should be required to meet your
promises in _all_ circumstances.

~~~
BurritoAlPastor
This is a fine sentiment, but it's hardly a bright-line rule. We don't
criticize Twitter's handling of abuse or fascists because they promised us a
platform free of abusive fascists; Twitter didn't promise us shit except 140
character messages. All the problems were emergent.

So many of the great (financial) success stories of our sector are about
startups stumbling into an untapped demand, and then running with it for as
long as the money lasts. Nobody sets out to build a bridge – they build a 2x4
plank, and then realize that rather a lot of people want to walk on it.

------
jlgaddis
Damn, it's been nearly 20 years since qmail 1.03 was released (June 1998)? It
sure doesn't seem like that long!

I recall setting up qmail "toasters" on FreeBSD to do virtual hosting. Maybe I
was just too much of a "n00b" but I remember it being a big PITA to get all
the services to play well together. There was this hip new outfit named Yahoo!
that was using it for their new webmail service, though -- as opposed to
sendmail, which pretty much every MTA on the Internet used at the time (and I
was proficient enough with sendmail that I would edit my sendmail.cf by hand;
pffft, who needs m4!?) -- so I assumed it was certainly capable of handling
_my_ volume of mail. (I wasn't running authoritative DNS servers at the time
or I probably would've used djbdns over BIND as well.)

qmail, unfortunately, never did become _too_ popular (relatively speaking, of
course) and that's really a shame, because, as the quote in the article says:

> "We _need_ invulnerable software systems, and we need them today, ..."

While that was certainly true _then_ , it's even more true now.

On a side note, I'm surprised that the "qmail security guarantee" [0,1] wasn't
mentioned in the article:

> _"In March 1997, I took the unusual step of publicly offering $500 to the
> first person to publish a verifiable security hole in the latest version of
> qmail: for example, a way for a user to exploit qmail to take over another
> account. My offer still stands. Nobody has found any security holes in
> qmail. I hereby increase the offer to $1000."_

[0]:
[https://cr.yp.to/qmail/guarantee.html](https://cr.yp.to/qmail/guarantee.html)

[1]:
[https://cr.yp.to/qmail/qmailsec-20071101.pdf](https://cr.yp.to/qmail/qmailsec-20071101.pdf)
(PDF)

~~~
SwellJoe
While qmail has faded in popularity as it has been sporadically maintained by
a random bunch of folks over the years, there has been at least one other MTA
written by someone with excellent security cred, and that has been continually
maintained and has an excellent security record. We don't really need to mourn
what could have been with qmail; we have Postfix, and it's really very good.

~~~
JdeBP
Making a world-readable, world-searchable, and world-writable drop directory
-- because of a decision to have no set-UID and set-GID executables in
Postfix, even _appropriate_ ones -- and thereby failing to learn the even
then well-known lessons of the batch job (at), UUCP, and printing (lpr)
subsystems when it comes to world-accessible input directories, was a fairly
large blot.

* [https://cr.yp.to/maildisasters/postfix.19981221](https://cr.yp.to/maildisasters/postfix.19981221)

* [https://cr.yp.to/maildisasters/postfix.html](https://cr.yp.to/maildisasters/postfix.html)

* [https://groups.google.com/forum/#!msg/mailing.postfix.users/...](https://groups.google.com/forum/#!msg/mailing.postfix.users/Bif9N7Nx8gM/BoJH5ibspvwJ)

------
pmoriarty
My biggest takeaway from qmail has nothing to do with security, but rather
that excessively restrictive licensing, a highly opinionated/unusual setup,
and unwillingness to collaborate on its development squandered its potential.

If it wasn't for all that, we might well all be using qmail-based mail servers
today, as qmail was really ahead of its time in so many ways.

It was kind of like the Amiga of mail servers, back in the day. It could have
easily dominated the market, but it wound up a mere historical curiosity.

~~~
geocar
> …highly opinionated/unusual setup…

That setup contributed to security.

It also made qmail very easy to extend: I operated a medium-sized mail service
a while ago and qmail's pluggability meant I could add features that didn't
exist in other mail servers (like postfix or exim) without forking the entire
project.

* I had SMTP AUTH when the only other mail server to support it was Netscape; before RFC2554 was written.

* My qmail-popup proxied to another machine, so it didn't need root (making me immune to the Guninski vulnerability) and my users only needed to know about a single hostname regardless of where their mail was stored, without needing something icky like NFS.

* I had a web interface with auto-enrolled client certificates for authentication and confidentiality for SMTP, POP, and IMAP.

* My qmail-remote recognised certain responses suggesting our IP was being blacklisted and would retry immediately from a new IP.

* My qmail-remote recognised certain greylisting error messages and rescheduled retry for that time.

* I had multiple mail queues based on the number of retries the message had seen.

And so on. I didn't start out to make those features, they just grew over a
decade or so organically. At no point would I have forked postfix or exim to
add any of those features because once you fork it you _own it_ unless you can
get your changes upstream. I had shit to do, so the real alternative was
simply to buy more servers and/or pay for commercial software.

I wish the model had caught on, because it's a superior way to develop
software. I didn't understand why though until fifteen years later...

> …and unwillingness to share and collaborate on its development…

Dan absolutely collaborated, and I certainly was using betas back in 1996.

If my memory/anecdote isn't enough: There are a number of explicit points of
evidence in the changelog distributed with qmail.

What he doesn't do is let people save face when they say something incredibly
stupid and then try to backpedal when it's obvious how wrong they are. This
bruised more than a few egos, and contributed to a campaign to actively smear
his name and discredit his software.

> It could have easily dominated the market…

I think if Dan had let the peanut gallery have their way, we probably wouldn't
have gotten postfix, but then qmail wouldn't have been qmail except in name.
What's the value in the qmail brand if it isn't secure anymore?

~~~
oblio
> I wish the model had caught on, because it's a superior way to develop
> software. I didn't understand why though until fifteen years later...

Can you elaborate?

~~~
geocar
Which part? The model? The article explained it well enough. I think of it as
two main parts: (1) make fewer bugs, and (2) show your users the correct way
to do things (which is the inverse of getting the correct requirements).

Or are you asking what took me fifteen years to understand?

Dan basically planned the whole thing. When the whole thing was too big, he
would break off a piece that wasn't and develop it as a wholly separate thing.

I understood this strategy worked, but I didn't fully understand why this
strategy was successful until I realised source code matters more than I
thought[1]: If your program is short enough, there won't be any bugs in it.
This is why learning how to read and write dense code[2] can make your
programs better -- and that's what I learned how to do[3].

[1]:
[http://www.jsoftware.com/papers/tot.htm](http://www.jsoftware.com/papers/tot.htm)

[2]:
[http://www.nsl.com/papers/origins.htm](http://www.nsl.com/papers/origins.htm)

[3]:
[https://github.com/geocar/dash/blob/master/d.c#L62](https://github.com/geocar/dash/blob/master/d.c#L62)

~~~
hursortue
> learning how to read and write dense code

> and that's what I learned how to do

This is good code?

    
    
      for(i=b=0;i<r;++i){switch(p[i]){
      case'\n':if(!e)e=i;if(!q){q=kpn(p+w,g-w);if('\r'==(cf=p[i-1]))cf=p[i-2];}else if(!c){w=!!strchr("kK1",cf);if(!o)sc(d,0,sd);else if(w){if('g'==o)cwrite(f,OK("200","keep-alive")BLANK);else cwrite(f,OK("204","keep-alive")END204);}else{if('g'==o)cwrite(f,OK("200","close")BLANK);else cwrite(f,OK("204","close")END204);close(f);}a=k(o?-d:d,"dash",knk(2,q,xD(a,v),0),0,0);b=i+1;if(!o){if(!a){poop(f);R 0;}if(10!=a->t){r0(a);poop(f);R 0;}writer(f,kC(a),a->n);r0(a);if(!w)close(f);}if(b==r)R 1;q=0;o=0;a=ktn(11,0),v=ktn(0,0);}else{if((c-m)==10&& !strncasecmp(p+m,"connection",c-m))cf=p[s];js(&a,sn(p+m,c-m));jk(&v,kpn(p+s,e-s));}w=e=g=s=c=0;m=i+1;break;
      case' ':case'\t':case'\r':if(w&&!g)g=i;if(s==(e=i))++s;break;
      case':':if(!c)s=1+(c=i);break;
      case '/':if(!w)w=i+1;case '?':case '&':if(r>=(i+4)&&p[i+1]=='f'&&p[i+2]=='=')o=p[i+3];default: e=0;break;}}if(a)r0(a),r0(v);if(q)r0(q);R b;}

~~~
geocar
It could be shorter if I spent some more time on it.

~~~
irundebian
LMFAO

------
viraptor
I still don't get djb's distinction between untrusted code and
minimal-privilege code. What he calls "not violating security requirements" is
effectively a successful least-privilege approach. Very few components can be
compromised without breaking security requirements. If you can't gain anything
from hacking a piece of software, then why is it even executed? It obviously
didn't deal with anything the user wants.

In his example, yes, you could change the DNS responses, but you still could
not escalate to a higher level where you can potentially modify stored user
data. That is a success in practice.

~~~
geocar
Something that can respond to a DNS request can put whatever it wants in the
response. If there's a bug in that program, then whoever controls that bug can
put whatever they want in the response.

The only protection from this is to make the code that does this as small as
possible, so that we human beings can convince ourselves that it is correct
and that the risk of a bug that someone can control is zero (or as close to
zero as makes no odds).

When Jim Reid wants to pat himself on the back because "at least they didn't
get root on my nameserver box", he misses the point: gethostbyname()'s spec
doesn't say "it may or may not return. if it returns it could return anything.
don't trust it, don't even use it!" It says gethostbyname() returns a
structure describing the address of the named Internet host, so people expect
that and depend on it. Something that "suddenly" violates that gets in the
news[1]. Fortunately, nobody remembers what Jim said, so the BBC doesn't ask
him for a comment.

Anyway.

"Minimizing privilege" doesn't solve that problem because the DNS server
_needs_ the privilege to respond to DNS requests.

It might be easier to think about a better example. Let's talk about zlib.

A program that needs to decompress some text is not concerned with the
contents of the compressed text, only the uncompressed text. Resource limits
on our program exist to keep some things from getting out of control[2], but
what about _bugs_?

If we could run zlib's decompress() with the permission _only_ to decompress
text, then the worst-case impact of a bug would be either spinning the CPU or
something equivalent to "getting out of control". What do we need to do that?

• Preventing the creation of new file descriptors can be done with
setrlimit(), except that the dynamic linker is going to open a shittonne of
files first. We would need to know the minimum number of files required, and
decompress() can't ever change that without changing our program anyway.

• Blocking access to files and the network could be done with a setuid
wrapper and iptables, at least on Linux. But most programmers don't do this,
and most sysadmins only do what they're told, so in practice it doesn't
happen.

• Sandboxing! Google published some clever user-level sandboxing for Linux
that whitelists each syscall. This "verifier" could do the job -- as long as
it's smaller than decompress()!

That sandboxing one is tricky: A tiny inflate routine takes around 500 lines
of C done the normal way, but how big is our sandbox? Probably a lot bigger.

• Ask the operating system for help! This is what DJB suggests: ask for a
disablefiles and a disablenetwork system call. OpenBSD is implementing this
with their pledge[3] system call.

There's not a portable and satisfying solution here yet, but you can see they
all cluster around reducing the privileges of the untrusted program.

Now, what's to prevent decompress _from lying_? What if someone can produce a
content stream that causes a future decompress run to produce invalid
results? Maybe something really sneaky[4]. What possible protection could we
have?

As you can see, in this case so long as decompress is supposed to produce
"text", there's nothing we can do to make sure it produces the "correct text".

That's why DJB doesn't want to focus on the "untrusted" aspect, and instead on
trying to solve the problem _that we have to solve anyway_ : How do we write
software that is correct?

[1]:
[http://news.bbc.co.uk/2/hi/technology/7496735.stm](http://news.bbc.co.uk/2/hi/technology/7496735.stm)

[2]: [https://swtch.com/r.gz](https://swtch.com/r.gz)

[3]: [https://man.openbsd.org/pledge.2](https://man.openbsd.org/pledge.2)

[4]:
[https://cmaurice.fr/pdf/ndss17_maurice.pdf](https://cmaurice.fr/pdf/ndss17_maurice.pdf)

~~~
irundebian
That's a great explanation, but I still don't understand why he says that the
principle of least privilege is _fundamentally_ wrong. I fully agree that POLP
can create an illusion of security and doesn't by itself ensure the user's
security requirements are met, but that doesn't make it fundamentally wrong.
The correct point is that you shouldn't prioritize POLP over code correctness.
If he is just arguing against very strict implementations of POLP, I could
also agree, but in general I would argue that POLP is fundamentally sound and
necessary; that doesn't mean you should implement a complex fine-grained
solution with a lot of administrative overhead.

As soon as you build non-trivial systems, you have to contain error
propagation with POLP, even while you are striving to build simple and secure
systems.

~~~
geocar
DJB is drawing a distinction between two designs in his paper.

1. Netscape had a "dns helper" -- which ostensibly could only do DNS lookups
-- designed on the principle of least privilege.

2. Ariel Berkman's xloadimage implementation -- which runs every image loader
as a separate filter in a separate process that can do nothing but read image
data and write image data (in the "common" format) -- designed around
eliminating trusted code.

The former could (and did) suffer a bug that affected DNS lookups, and could
be convinced to perform all sorts of network traffic, since it by definition
needed network access to do its job; it could also read files like
resolv.conf because, again, it needed to do that to do its job. That it
couldn't be exploited to "yield root" wasn't really relevant, since most
people didn't run Netscape as root. It could read user files and ship them
over the Internet, which is frankly bad enough.

The latter is what DJB is recommending.

~~~
irundebian
I would argue that both are designed following the principle of least
privilege. Netscape just wasn't lucky enough to have correct code. So what
would have helped in Netscape's case? How would eliminating trusted code work
here? Netscape has to do DNS lookups. I'm not sure there was much left to do
other than write secure, correct code. And of course you should prioritize
writing secure, correct code over implementing least privilege. That doesn't
make the principle of least privilege fundamentally wrong.

My opinion is that if you design your software securely, threat modeling
should drive the decision of whether implementing the least-privilege
principle makes sense (complexity vs. benefit). Of course, you'd better
eliminate trusted code, so that there are fewer cases where you have to make
that decision. I assume that sooner or later there are situations where you
can't eliminate trusted code and it makes sense to implement least privilege.

~~~
geocar
> I would argue that both are designed following the principle of least
> privilege.

Okay, but that's not what DJB means, and attempting to read his words with the
definitions in your head, instead of the definitions in _his_ head won't help
you understand him.

I'm not going to humour an argument about mere semantics: For the purposes of
this discussion they are not both the "principle of least privilege".

> So what would have helped in Netscape's case?

Writing the DNS client correctly.

DJB's point is that absolutely nothing else would help: You can't
realistically put a box around buggy code as long as the code needs
privileges.

And all that effort in writing that sandbox? A waste of time; _fundamentally_
the wrong thing to focus on. Writing a DNS client is far less work.

> I assume that sooner or later there are situations where you can't eliminate
> trusted code and it makes sense to implement least privilege.

That was what DJB assumed when he wrote Qmail, however he is now convinced
that was wrong. His paper gives some explanation why.

If you can't eliminate trusted code, and it's still big enough that you think
there might be bugs hiding inside, you should rethink your design.

------
joveian
My favorite quote from that paper is "I have discovered that there are two
types of command interfaces in the world of computing: good interfaces and
user interfaces."

As others have pointed out, one thing the paper leaves out is what happens
when the software stops being updated: qmail doesn't support SPF or other
security extensions, which makes it useless these days without patches.

------
1110001110
Interesting article; the only thing I fail to see is how this is related to
Meltdown and Spectre. Those are not simple 'bugs': they are multiple good
features of modern processors that combine to yield an attack vector. My
opinion is that with any level of process, problems like these will arise
sooner or later just because the complexity is so high.

~~~
dchest
They are caused by performance optimizations made with disregard for security
(likely not intentional, but the result of not considering the security
aspects of those optimizations carefully).

------
farnsworthy
Nice summarizing article, with some programming concepts -- explicit data
flow, for example -- that are even more generally applicable (though the
topics of security and code volume/quality are linked).

