
I'm aware that the following is typical HN middlebrow matter, but I'm asking anyway. Does anyone know why RFCs are still formatted as if they were written on a typewriter in the seventies? I mean, here's a sentence quoted verbatim from this document:

    Sunset header fields will be served as soon as the sunset date is

    Wilde                           Informational                     [Page 9]
    --------------------------------------------------------------------------
    RFC 8594                        Sunset  Header                    May 2019

    less than some given period of time.
What problem does this solve? How is this more useful than it is cumbersome? Also, how do people write this? Do they manually space space space space align the footer and then copy it every 30 or so lines?


Don't worry, there's an RFC for that!

RFC 7990 -- RFC Format Framework: https://tools.ietf.org/html/rfc7990

There are various tools to automatically format things as necessary, just like any other kind of text wrapping.
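To make the "tools handle it" point concrete, here's a toy sketch (not any actual IETF tool) of the pagination step those formatters perform: fill fixed-size pages, stamp a running footer, and separate pages with an ASCII form feed. The page size and footer/header text below are illustrative, not the exact rfc-editor values.

```python
FORM_FEED = "\f"

def paginate(lines, footer, header, body_lines=48):
    """Split pre-wrapped text lines into fixed-size pages, put a
    numbered footer at the bottom of each page, and join pages with
    a form feed followed by the running header (as RFC text files do)."""
    pages = []
    for i in range(0, len(lines), body_lines):
        chunk = lines[i:i + body_lines]
        # pad short pages so the footer always lands on the same line
        chunk = chunk + [""] * (body_lines - len(chunk))
        chunk.append(footer.replace("[Page]", f"[Page {len(pages) + 1}]"))
        pages.append("\n".join(chunk))
    return (FORM_FEED + "\n" + header + "\n\n").join(pages)

body = ["Sunset header fields will be served as soon as..."] * 100
text = paginate(body,
                footer="Wilde                Informational                [Page]",
                header="RFC 8594             Sunset Header               May 2019")
print(text.count(FORM_FEED) + 1)  # prints 3: one hundred body lines -> three pages
```

Real tools like xml2rfc generate this layout (along with HTML and PDF renderings) from a single XML source, which is why nobody is aligning footers by hand.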

As far as the overall "philosophy" behind keeping it this way, the honest answer is that the IETF is just a particularly unlikely group to change things without a clear need, and there are likely all sorts of tools small and large that expect RFCs to follow these conventions at this point.


As an example of this (though for a non-RFC document):

Here's the "source" XML that is authored: https://openid.net/specs/openid-connect-core-1_0.xml

That can be compiled into this HTML: https://openid.net/specs/openid-connect-core-1_0.html

Or to this RFC-like plaintext: view-source:https://openid.net/specs/openid-connect-core-1_0.txt

Most new RFCs are authored this way.


First line in the Abstract for that RFC:

> In order to improve the readability of RFCs while supporting their archivability, the canonical format of the RFC Series will be transitioning from plain-text ASCII to XML using the xml2rfc version 3 vocabulary;

Is it readable? Yeah

Is it archivable? Yeah, XML is (AFAIK) one of the most closely followed standards I can think of.


The legal world works the same way too. They're amazingly low tech. A lawyer in my family mentioned that it's basically a combination of:

- everyone can do it with any software (or even a typewriter)

- consistency with legacy documents. The format doesn't just change on you as you're reading through legal history

- it works fine, why change it?

I'd also add a guess:

- There's no room for implementation detail to affect formatting. Last thing you want is a whole bunch of formats that are similar but not identical, just because someone's software is a bit different.

- could you imagine trying to get everyone to change? We should be so lucky that everyone's already this consistent


> could you imagine trying to get everyone to change?

Yes, this is exactly what I do for a living, consolidating policy and legal documents and their related business workflows into modern applications. My requirements for how the text editors work are far more meticulous than your average app exactly for the reasons you stated. Concerns with formatting that most products would blow off as trivial are deal-breakers in this industry.


Is there a reason to not store the document in an abstract format that is more easily handled by systems useful for legal analysts (e.g. giving you the ability to diff text), and just “renders” to the accepted format? (I’m picturing storing the docs as LaTeX, but anything like that would work. Maybe there could be a legal “theme” for a markdown processor, for example.)

Because, in such cases, it wouldn’t really matter if the editor renders the source to text incorrectly, as long as the proofer renders it correctly. Just like with WYSIWYG desktop-publishing software.


Storage formats aren't the issue. We diff and merge documents just fine, and do render them in different formats in some use cases for specific audiences. Nor is it about a final rendered document. It is the details of workflows and collaborations that happen before a document is ever finalized where the editing and reading experiences must match.

Very likely: because they are managed by software that has existed for decades, from a time when expectations and needs were very different than today's, but which works well and correctly. Why invest time and effort into changing a system that is working well enough? Whether you need to view it on a screen or print it out, this format works.

Also, I guarantee there are any number of downstream consumers of RFCs which take this sort of format as a given, and which will break on even a minor change. And why break those downstream systems if you don't have to?

Basically, any changes will break something. So the benefits of the changes need to be bigger than the costs of the changes. Not to mention the cost in wasted time of all the humans bikeshedding how to change it to make it "better".

Dealing with the ongoing cost of humans having to read across artificial page breaks is a pretty minor concern compared to the costs of all that.


You can read it with any software you like, now and in 30 years when Microsoft Word is a quaint relic in a museum.

How does that not hold for a plain .txt document without page-marker-ascii-art?

I presume there exists a documented way to print them so that everything lines up properly.

The files contain ASCII form-feeds between pages. If you send that directly to a printer, it will cause it to start a new page.
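Which also means round-tripping is easy: the page breaks are literal 0x0C bytes, so splitting an RFC text file back into its pages is one line. A minimal sketch:

```python
FORM_FEED = "\x0c"  # the ASCII form-feed byte between RFC pages

def pages(rfc_text):
    """Split RFC-style plaintext into the pages a printer would emit."""
    return rfc_text.split(FORM_FEED)

sample = "page one\n" + FORM_FEED + "page two\n" + FORM_FEED + "page three\n"
print(len(pages(sample)))  # prints 3
```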

Since I first encountered MS Word running on a Xenix system 31 years ago, I suspect it will be alive and well in 30 years' time!

To be fair, I recall having conversations in the 1990s about how MS Word was such an evil proprietary format and that one day we wouldn’t be able to read it. And here we are nearly 30 years later and Word docs (in a much evolved file format) are still here and still widely supported and easy to read using many tools, including open source ones.

Not saying that proprietary formats aren’t still a bad idea for other reasons, but predictions of unreadability don’t seem to have panned out for any common file formats.


How do you open word documents from the 90’s?!? Do you have a Windows 95 VM or something?

Even with modern Microsoft Word, the formatting of old documents is often mangled.

To this day, up-to-date PowerPoint can’t reliably display presentations made with up-to-date PowerPoint on a different machine, let alone OS!


> How do you open word documents from the 90’s?

LibreOffice


I open them in Word 2016.

> still widely supported and easy to read using many tools, including open source ones

Only someone who has not tried could possibly say that.

The numerous doc file formats are a constant headache for anyone doing document processing. Not even Word itself can read its own older formats reliably. Sometimes you have better luck with LibreOffice, sometimes not.

And that's the most widely used document file format. Anything else from the same era is completely dead in the water. Manually viewing them can be done in emulators with a bit of work, but any automatic processing is a huge undertaking.


Good luck editing these WordPerfect and CorelDraw files!

I put my vaccination history into a ClarisWorks document. At least, I assume I did from the file name…

Could be worse. My dad used a video tape format even more obscure than Betamax.


Yikes. At least laserdiscs weren't homemade so I could replace what little I had.

I'm not sure about Word docs, but FWIW 90s era Excel files have become progressively harder to open.

Unlike Word I actually spent a few years of my life working on this.

At the surface layer this era of Excel ("BIFF" documents) isn't too bad: getting, say, a table of small integers representing people's annual salaries out of an XLS file is very do-able, and many programs today will get that right.

As you start to dig down it gets nastier pretty quickly. Formulae require implementations that match not just what Microsoft's published documents (I have loads of these on a shelf I rarely look at now) say, but what Excel actually did, bug for bug, back in the 1990s. Maybe the document says this implements a US Federal tax rule, but alas Excel got the year 1988 wrong, so actually it's "US Federal tax rule except in 1988".

You also run into show stoppers that prevent the oft-imagined "Just transform it to some neutral format" because Excel isn't a typed system. What is 4? Did you think it's the number 4? Because the sheet you're trying to parse assumes it's actually the fourth day of the Apple Macintosh epoch in one place, but in another place uses it to index into an array. Smile!
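A sketch of that ambiguity: under Excel's 1904 date system (the old Macintosh default), serial 0 is 1 January 1904, so a bare 4 can denote a date or a plain integer depending entirely on cell formatting the consumer may never see. The arithmetic below is my own illustration, not code from any actual converter:

```python
from datetime import date, timedelta

MAC_EPOCH = date(1904, 1, 1)  # day zero of Excel's "1904 date system"

def as_1904_date(serial):
    """Interpret an untyped Excel number as a 1904-system date serial."""
    return MAC_EPOCH + timedelta(days=serial)

value = 4
print(as_1904_date(value))          # 1904-01-05, if it was a date...
print([10, 20, 30, 40][value - 1])  # ...or 40, if it was a 1-based index
```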

Finally in complicated sheets (often "business critical") there's a full-blown Turing complete programming language, complete with machine layer access to the OS. Good luck "translating" that into anything except an apologetic error message.


> Good luck "translating" that into anything except an apologetic error message.

I'm going to have to steal that line. :)


> Does anyone know why RFCs are still formatted as if they were written on a typewriter in the seventies?

They are formatted in plain text with fixed page sizes because that's what they've always done, it works fine, and there's no compelling reason to change.

> also, how do people write this?

The thing about keeping the same format for a few decades rather than changing it with each shift in popular fashion is that there is plenty of supporting tooling.

https://www.rfc-editor.org/pubprocess/tools/

https://tools.ietf.org/


It's super readable with any plain text editor or browser. And it's really nice to read something not filled with images, crazy fonts, colours and other junk. It's just pleasant.

It sucks on e-readers. It would be much nicer if there were no headers and footers, and no newlines except between paragraphs...


It could help applications using capability URLs, like password-reset links. Maybe the exact date isn't important, but the problem with capability URLs is that they contain a (temporary) secret, and browsers and servers happily log all requests, even if that part of the URL is protected through TLS (edit: to outsiders).

Maybe having such a field will help treat those URLs differently than "normal" ones, so that the secret is better protected.
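For reference, the Sunset field RFC 8594 defines is just a single HTTP-date, so a client that wants to special-case soon-to-expire URLs can parse it with stock library code. A minimal sketch (the header value is made up):

```python
from email.utils import parsedate_to_datetime

# An RFC 8594 Sunset response header: its value is one HTTP-date.
headers = {"Sunset": "Sat, 31 Dec 2022 23:59:59 GMT"}

sunset = parsedate_to_datetime(headers["Sunset"])
print(sunset.isoformat())  # prints 2022-12-31T23:59:59+00:00
```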

edit: I failed to read the question correctly.



Here's the sunset rfc draft converted from XML to JSON.

https://www.dropbox.com/sh/duhmxzaehy0dwuc/AADyKPN5UVU1HKT9M...

Looking at the JSON, the structure is pretty basic. You could see it rendering in any format/style pretty easily.
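As a toy version of that point, here's a hypothetical JSON document tree (the field names are mine, not the actual xml2rfc schema) and a few lines that render it to plain text; swapping the renderer would give you HTML, Markdown, or anything else:

```python
doc = {
    "title": "The Sunset HTTP Header Field",
    "sections": [
        {"name": "Introduction", "text": "..."},
        {"name": "The Sunset Header Field", "text": "..."},
    ],
}

def render_text(d):
    """Render a (hypothetical) JSON document tree as plain text."""
    out = [d["title"], ""]
    for i, section in enumerate(d["sections"], 1):
        out.append(f"{i}. {section['name']}")
        out.append(section["text"])
        out.append("")
    return "\n".join(out)

print(render_text(doc))
```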



Yes please? What tool is this? Are you suggesting that ASCII art is required for paging text?

Lynx[0], a text-based browser now primarily used for bash scripting.

[0] https://en.wikipedia.org/wiki/Lynx_(web_browser)


I can use Lynx for bash scripting?

It can be used in non-interactive mode from scripts, e.g.

  lynx -dump "$URL" > "$FILE"
This is a simple way to extract the content of a web page as plain text.

Interesting. Even as an experienced web scraper, it would have never occurred to me to use Lynx.

Wow, it sure looks pretty in lynx.

Evidently. Why does the text exist twice? Once in blue and green, and once in a smaller dark shadow behind it?

I believe it's a CLI tool to display RFCs, on a translucent terminal, with a browser window behind it. That still leaves other questions...


