
The MIME guys: How two Internet gurus changed e-mail (2011) - breck
https://www.networkworld.com/article/2199390/uc-voip/the-mime-guys--how-two-internet-gurus-changed-e-mail-forever.html
======
niftich
MIME was hugely influential. Besides the bulk of all email being in MIME
format, it also popularized:

* MIME types, now called media types; a compound identifier for formats of data interchange. Media types are officially maintained by the IANA Media Type Registry [1].

* Base64. MIME lifted the encoding from the PEM RFC [2][3] but MIME brought it to prominence.

* Content-Disposition: a header to specify intended presentation semantics. Later lifted into HTTP.

* Other "Content-" headers, like Content-Location. These were also adopted by HTTP.

Furthermore, the various multipart media types imbued adjacent parts with
additional semantics, like "multipart/mixed" to signal that the child nodes
are independent, but ordered, and "multipart/alternative" to signal that the
child nodes are to be treated as alternate versions of the same content, in
order of increasing richness.

The thorough design around multiparts meant that a MIME document is a tree,
where every node is a subdocument with headers, preamble, body, and
epilogue. It was suitable for use as a nesting data structure, although most
applications opted for binary formats instead. Later, when text-based formats
became popular, designers often invented XML-based formats to capture a
similar tree.
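
To make the tree shape concrete, here is a minimal sketch using Python's standard `email` package. The addresses, subject, and attachment bytes are made up for illustration, and `build_message` and `outline` are hypothetical helper names. It builds a "multipart/mixed" message whose first child is a "multipart/alternative" pair, then walks the tree.

```python
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Build a small MIME tree; all header values are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = "Quarterly report"
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    # First alternative: plain text (the least rich version).
    msg.set_content("Plain-text version of the report.")
    # Second alternative: HTML (richer). This turns the message into
    # multipart/alternative with two children.
    msg.add_alternative("<p>HTML version of the report.</p>", subtype="html")
    # Adding an attachment wraps everything in multipart/mixed.
    msg.add_attachment(
        b"%PDF-1.4 placeholder bytes",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg


def outline(part, depth=0):
    """Return one indented line per node of the MIME tree."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.iter_parts():
            lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(build_message())))
```

Run as a script, this should print "multipart/mixed" at the root, with "multipart/alternative" (holding "text/plain" and "text/html") and "application/pdf" as its children.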

[1] [https://www.iana.org/assignments/media-types/media-types.xht...](https://www.iana.org/assignments/media-types/media-types.xhtml)
[2] [https://tools.ietf.org/html/rfc989#section-4.3](https://tools.ietf.org/html/rfc989#section-4.3)
[3] [https://tools.ietf.org/html/rfc1421#section-4.3.2.4](https://tools.ietf.org/html/rfc1421#section-4.3.2.4)

------
noncoml
The whole of email is based on an archaic, complicated system, and it is long
overdue for disruption.

The RFCs are too many, too long, and too complicated to get right on the
first try.

The SMTP protocol for example is based on ABNFs, which are not very machine
friendly, and many implementations end up using regular expressions.

Then there is MIME and the quest for the main content of the message; there is
no precise way to tell which of the parts of the MIME message are the main
contents to be displayed to the user. Some clients use "multipart/alternative"
and under that they have a "text/plain" and a "text/html" part. Others use
"multipart/mixed" and add them there. Yet some others use "multipart/related".
Then you may have “multipart/alternative” under “multipart/mixed” and so on.
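
As one illustration of how a client copes with this, Python's standard `email` package exposes `get_body()`, which searches the tree for the best candidate according to a preference list. This is a heuristic, not an authoritative answer, which is rather the point. The sample message and the helper name `main_body_type` below are made up for the sketch.

```python
from email import message_from_string, policy

# A hand-written sample: multipart/alternative nested under multipart/mixed.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain; charset="utf-8"

Hello in plain text.
--inner
Content-Type: text/html; charset="utf-8"

<p>Hello in HTML.</p>
--inner--
--outer
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="data.bin"
Content-Transfer-Encoding: base64

AAEC
--outer--
"""


def main_body_type(raw, prefer=("html", "plain")):
    """Return the content type of the part the library picks as the body."""
    msg = message_from_string(raw, policy=policy.default)
    # get_body() skips attachments and returns the highest-priority match,
    # or None when no candidate exists.
    body = msg.get_body(preferencelist=prefer)
    return None if body is None else body.get_content_type()


if __name__ == "__main__":
    print(main_body_type(RAW))
    print(main_body_type(RAW, prefer=("plain",)))
```

Changing the preference list changes the answer, so two clients can legitimately display different "main content" for the same message.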

The way to achieve high availability is by using multiple MX records in the
DNS for the mail exchangers. Imagine if today you had to get a list of hosts
to try running your REST APIs against.
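
For context, the sender-side logic looks roughly like the sketch below: sort the MX records by preference (lowest first) and shuffle hosts that share a preference to spread load. The hostnames are placeholders, `delivery_order` is a hypothetical name, and the DNS lookup itself is left out; the records are passed in as `(preference, hostname)` pairs.

```python
import random
from itertools import groupby


def delivery_order(mx_records, rng=random):
    """Return hostnames in the order a sender would try them.

    mx_records: iterable of (preference, hostname) pairs, as already
    fetched from DNS (the lookup is outside the scope of this sketch).
    """
    ordered = []
    by_preference = sorted(mx_records, key=lambda record: record[0])
    for _, group in groupby(by_preference, key=lambda record: record[0]):
        hosts = [host for _, host in group]
        # Equal-preference hosts are tried in random order.
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


if __name__ == "__main__":
    records = [
        (20, "backup.example.com"),
        (10, "mx1.example.com"),
        (10, "mx2.example.com"),
    ]
    print(delivery_order(records))
```

The sender then walks this list until one host accepts the connection, which is the "list of hosts to try" being complained about.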

And of course achieving end-to-end encryption is not straightforward for the
users.

IMHO it is time to come up with a new REST-based API for email and short
message exchange.

~~~
saurik
How is an ABNF grammar specification somehow not "machine friendly"?!

~~~
noncoml
Why do you think ABNFs are machine friendly? Is it so easy to parse ABNFs
that we are flooded with libraries?

Just because it is a formal system doesn't mean it's machine friendly.

~~~
saurik
How to work with BNF grammars is seriously second year computer science... if
you wanted to parse that precise format it should probably take something like
twenty minutes using an existing parsing framework, and then you can generate
anything you want.

I could see making "parse SMTP" a homework assignment for a compilers class...
if only it were challenging ;P.

FWIW, when I implemented the SMTP/IMAP specs, I spent a few hours first
writing my own parser combinator framework (to be 100% clear: from scratch,
with no reference) and then made short work of getting these specific grammars
fully implemented.
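
For readers unfamiliar with the approach, here is a minimal parser-combinator sketch. It is not the code described above, and the grammar is a heavily simplified stand-in for the real SMTP ABNF: it accepts only `"MAIL FROM:" "<" [mailbox] ">" CRLF`, with a deliberately loose idea of what a mailbox is. All function names are hypothetical.

```python
def lit(text):
    """Match a literal, case-insensitively like an ABNF quoted string."""
    def parse(s, i):
        end = i + len(text)
        if s[i:end].lower() == text.lower():
            return s[i:end], end
        return None
    return parse


def take_while1(pred):
    """Match one or more characters satisfying pred."""
    def parse(s, i):
        j = i
        while j < len(s) and pred(s[j]):
            j += 1
        return (s[i:j], j) if j > i else None
    return parse


def seq(*parsers):
    """Match each parser in turn (ABNF concatenation)."""
    def parse(s, i):
        values = []
        for parser in parsers:
            result = parser(s, i)
            if result is None:
                return None
            value, i = result
            values.append(value)
        return values, i
    return parse


def opt(parser):
    """Match zero or one occurrence (ABNF square brackets)."""
    def parse(s, i):
        result = parser(s, i)
        return result if result is not None else (None, i)
    return parse


# Simplified grammar; real local parts and domains are stricter.
atom = take_while1(lambda c: c.isalnum() or c in "._-+")
mailbox = seq(atom, lit("@"), atom)
reverse_path = seq(lit("<"), opt(mailbox), lit(">"))
mail_cmd = seq(lit("MAIL FROM:"), reverse_path, lit("\r\n"))


def parse_mail(line):
    """Return the sender address, "" for the null path, or None on error."""
    result = mail_cmd(line, 0)
    if result is None or result[1] != len(line):
        return None
    values, _ = result
    _, path, _ = values
    _, mailbox_value, _ = path
    return "" if mailbox_value is None else "".join(mailbox_value)
```

Each combinator mirrors one ABNF construct, so the grammar lines read almost like the specification they are transcribed from.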

~~~
noncoml
> when I implemented the SMTP/IMAP specs

Link to the code?

> I spent a few hours first writing my own parser combinator framework (to be
> 100% clear: from scratch, with no reference)

I am sure you did. Like when you rolled out your own crypto in 5 minutes. Or
the time you rewrote FB in a weekend.

------
emmelaich
You can read more history from NSB's mime page on his home domain:

[http://www.guppylake.com/nsb/mime.html](http://www.guppylake.com/nsb/mime.html)

Including a video of the " ...Telephone Chords, the world's premier (=only)
all-Bellcore barbershop quartet, singing about MIME."

(note: auto-plays)

------
dvt
Awesome history lesson, I had no idea how MIME originated. A few years ago
(gosh I guess it's been like 4 now), I wrote a CORS-enabled MIME-type
checker[1][2].

The main issue with MIME (which the article barely touches on, unfortunately)
is that the type can be spoofed. It can be a dangerous attack vector and trust
should never be given to external systems that claim x.jpg or y.mov is
_actually_ an "image/jpeg" or "video/quicktime."

[1] [http://lecoq.herokuapp.com/](http://lecoq.herokuapp.com/)

[2] [https://github.com/dvx/lecoq](https://github.com/dvx/lecoq)

~~~
zAy0LfpBZLC8mAC
You have it all backwards. A MIME type cannot be "spoofed" because a MIME type
is a processing instruction, not a certification. If you receive an entity
that is labeled as "image/jpeg", that means that you are instructed to treat
it as a JPEG image. That is to say, you should only hand it to a JPEG decoder.
As with any processing of untrusted input, that JPEG decoder should obviously
not have any vulnerabilities, and it should reject any syntactically invalid
input. If someone takes a video file and sends it to you labeled as
"image/jpeg", there is no spoofing going on, you simply received a
syntactically invalid JPEG image which you consequently should reject.

Unfortunately, browsers, in particular IE, had this habit of ignoring the
relevant standards and doing what has become known as "content sniffing": they
ignore the declared MIME type and instead try to guess the correct decoder
based on the content. That was (well, still is, for backwards compatibility
reasons) a huge vulnerability in browsers. If browsers simply followed the
relevant standards, there would be absolutely no problem with serving an
uploaded "image/jpeg" entity without any verification, as no browser should do
anything with it other than notice that it's an invalid JPEG image. That you
have to care about this at all is because you have to work around
vulnerabilities in browsers, not because anything is being "spoofed".
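
The "processing instruction" model can be sketched as follows: the declared type alone selects the handler, and the handler either accepts the bytes or rejects them. The content is never used to pick a different handler. The names `VALIDATORS`, `looks_like_jpeg`, and `accept` are hypothetical, and the start-of-image check is only a stand-in for a real, strict JPEG decoder.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Cheap stand-in for a strict decoder: check the JPEG SOI marker."""
    return data.startswith(b"\xff\xd8")


# Declared type -> validator. Unknown types have no handler at all.
VALIDATORS = {
    "image/jpeg": looks_like_jpeg,
}


def accept(declared_type: str, data: bytes) -> bool:
    """Process data strictly as the declared type, or reject it.

    The bytes are never inspected to choose a different handler,
    which is what content sniffing would do.
    """
    validator = VALIDATORS.get(declared_type.strip().lower())
    if validator is None:
        return False
    return validator(data)


if __name__ == "__main__":
    jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 16
    script_bytes = b"<script>alert(1)</script>"
    print(accept("image/jpeg", jpeg_bytes))    # valid for its label
    print(accept("image/jpeg", script_bytes))  # invalid JPEG: rejected
```

Under this model, a script labeled "image/jpeg" is just a broken image; it never reaches anything that would execute it.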

~~~
dvt
A MIME type _can_ be spoofed because spoofing doesn't only apply to
certifications (e.g. IP spoofing, caller ID spoofing[1], etc.). Spoofing
merely means "tricking" or "lying" -- and this can introduce all kinds of
complications. There are literally dozens of bugs (in all browsers, not just
IE) that exist due to the fact that MIME types can often be
misleading[2][3][4].

[1] [https://en.wikipedia.org/wiki/Caller_ID_spoofing](https://en.wikipedia.org/wiki/Caller_ID_spoofing)

[2] [https://www.mozilla.org/en-US/security/advisories/mfsa2005-1...](https://www.mozilla.org/en-US/security/advisories/mfsa2005-16/)

[3] [https://blog.mozilla.org/security/2016/08/26/mitigating-mime...](https://blog.mozilla.org/security/2016/08/26/mitigating-mime-confusion-attacks-in-firefox/)

[4] [https://bugzilla.mozilla.org/show_bug.cgi?id=1295945](https://bugzilla.mozilla.org/show_bug.cgi?id=1295945)

~~~
zAy0LfpBZLC8mAC
But the thing is that there is no tricking or lying. There is simply a
syntactically invalid entity. And idiotic software that does completely
irresponsible things when confronted with such syntactically invalid entities,
such as feeding syntactically invalid JPEG images to the javascript
interpreter.

MIME types also cannot be misleading. MIME types are the authoritative
declaration of what something is. If it's not that, then there is nothing
misleading, it's simply invalid.

Framing this as "MIME spoofing" is about as sensible as calling a buffer
overflow in some font renderer "machine code spoofing". If your font renderer
under some circumstances takes pieces of the font description it is
interpreting and feeds them to the CPU for execution, that is not "machine
code spoofing", it's simply a buffer overflow vulnerability in your font
renderer. And just as a font renderer shouldn't feed pieces of the font to the
CPU for execution, a JPEG parser shouldn't feed pieces of the image to a
javascript interpreter for execution.

------
interfixus
2011

~~~
interfixus
Care to explain the downvote? The article _is_ from 2011, dammit.

~~~
unwind
You might have worded it a bit too tersely. I'd go for something like:

Mods: please add [2011] to the title, since the anniversary was in 2011 (MIME
was released in 1991).

That makes it clear what should be adjusted, while sounding a bit more polite.
That's my guess about the downvote, anyway.

~~~
interfixus
Probably. I keep forgetting there are ordinary social niceties in play on HN
:)

~~~
dang
Your comment was just fine because minimalism is a value here too. But of
course many readers don't know that. Anyhow, we added 2011 above. Thanks!

