
Thanks!

It's true that filenames with whitespace or newlines are bad for interoperability ("make" is another example). There are three simple options: escaping filenames, making filenames NUL-terminated, or declaring such filenames invalid. The last option seems to have won for practical reasons, and it's a pity that "safe filenames" were never standardized (but C-identifier plus extension should be safe everywhere).
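To make two of those options concrete, here is a minimal sketch. The "safe" rule (a C identifier, optionally followed by a dot and an alphanumeric extension) is my own reading of "C-identifier plus extension", not a standard, and the function names are made up for illustration.

```python
import re

# Assumption: "safe" means a C identifier, optionally followed by one dot
# and an alphanumeric extension. This is an illustrative rule, not a standard.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z0-9]+)?")


def is_safe_filename(name: str) -> bool:
    """Return True if the whole name matches the illustrative safe pattern."""
    # fullmatch avoids the trap where "$" would accept a trailing newline.
    return SAFE_NAME.fullmatch(name) is not None


def split_nul_terminated(data: bytes) -> list[bytes]:
    """Split a NUL-terminated name list (the kind `find -print0` emits)."""
    parts = data.split(b"\0")
    # Every name ends with NUL, so the final piece is an empty leftover.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts
```

With NUL termination, names like "a b" or even names containing newlines come through intact; with the "declare invalid" approach they are simply rejected up front.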

Mbox is definitely broken (for example, body lines that start with "From " are changed to ">From "). I don't think it is ambiguous today (all software I know interprets "From " at the beginning of a line as the start of a new mail), but it clearly was not designed with much care. It still has some valuable properties, which is why it's still in use today. For example, appending a new email (as a mail server does) is very fast. Crude interactive text search also works very well in practice, although automation can't really be done without a library.
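A rough sketch of the quoting and splitting rules described above, assuming the common convention where only lines starting with "From " are prefixed with ">". The function names are hypothetical, and real code should probably go through a library instead (Python ships a `mailbox` module for this).

```python
def quote_body(body: str) -> str:
    """Prefix body lines that start with "From " so they can't look like a separator."""
    lines = body.split("\n")
    return "\n".join(">" + line if line.startswith("From ") else line for line in lines)


def split_mbox(text: str) -> list[str]:
    """Naively split mbox text: every line starting with "From " begins a new mail."""
    messages = []
    current = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and current:
            # A separator line closes the mail collected so far.
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages
```

Note that the quoting is lossy: after a round trip you can't tell whether a body line originally read ">From " or "From ", which is part of what makes the format broken.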

Email is complex data (not line- or record-oriented), so various storage formats achieving various tradeoffs are absolutely justified.

> Binary formats present their own set of issues, but "accidentally unparseable" is more common in text-based formats.

It's true, especially with formats from the 70s, where the maxim was "be liberal in what you accept" and where some file formats weren't really designed at all.

On the other hand, "accidentally unextendable" (for example, fixed-width integers) and "accidental data loss" are much more common in binary formats.
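A small illustration of the fixed-width point, assuming a hypothetical format with a 16-bit little-endian length field (the field and the function names are invented for the example):

```python
import struct


def encode_length_u16(n: int) -> bytes:
    """Strict encoder: raises struct.error if n does not fit in 16 bits."""
    return struct.pack("<H", n)


def encode_length_u16_truncating(n: int) -> bytes:
    """Careless encoder: masks to 16 bits and silently drops the high bits."""
    return struct.pack("<H", n & 0xFFFF)


def encode_length_text(n: int) -> bytes:
    """Text encoder: decimal digits plus newline, no built-in upper limit."""
    return str(n).encode("ascii") + b"\n"
```

A length of 70000 can't be stored in the strict version at all (unextendable), comes back as 4464 from the truncating version (data loss), and is just "70000" in the text version.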



