Why I love text files (matthias-endler.de)
70 points by telemachos on Oct 17, 2010 | 43 comments

I'm sorry, I know one shouldn't criticise if it's not constructive, but this must be the piece with the largest text/content ratio I've ever read, and I've read a Stephen King novel.

Maybe the author just loves text.

He uses a text file for his TODO list so that mankind will be able to read it in 1000 years! I wish my work were that important.

When I was younger and first using a computer, I would "Open with Notepad" all sorts of files to see if anything about the structure was readable. That's why I liked text files then; these days I like them because they often indicate a mindset of simplicity.

Actually he uses it, I imagine, so that he will be able to read it next year.

I haven't even tried any todo apps (the text file has always served me well), but my experience with proprietary formats is that they're a poor choice for storing data: either they're poorly designed, or the designer is constantly redesigning them and invalidating your data.

That's why the statement was doubly odd, because todos are pretty transient.

I like Things for Mac myself because it automates repeating and scheduled todo items, and has a distinction between completable projects and areas of upkeep. A programmer can leave any time they want by processing its XML file.

Assuming they still use ASCII (or an ASCII extended character set) in those days.

As long as people still use English (or letter-based languages), ASCII would be pretty straightforward to decipher — especially compared to a straight binary dump.

There is a great chapter in The Pragmatic Programmer on why you should stick with text files for everything possible: Chapter 3 / Section 14, "The Power of Plain Text". It argues that as programmers, our base material as craftspeople isn't wood or metal, but knowledge, and with plain text we have the ability to manipulate knowledge with virtually every tool at our disposal.


This blog post doesn't express things nearly so well as the book, but the argument has validity.

Interesting higher level questions arise from this -

Is it because we've built so many tools around text that text seems so powerful, or is it because text is powerful that we've built all these tools?

Knowledge is of the mind+brain+body+environs. Text is just data, just as machine code is "just data" if you don't have a processor that "understands" it. Remember, it took about 20 years to decipher the hieroglyphics on the Rosetta Stone. Also, good luck with storing images in plain text.

Moral of the story - don't get too attached to anything :)

If you can't and are paranoid about losing your digital stuff, maybe DSpace will help - http://www.dspace.org/

The blog post and his comment have nothing to do with backups. They point out the flexibility and human readability of text files, as they are used in numerous different formats (xml, yaml, html, ics, csv, sql, etc etc etc) which can be easily parsed.
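To make "easily parsed" concrete, here is a minimal sketch using Python's standard csv module. The todo data and its column names (task, done) are made up for illustration; nothing here comes from the blog post.

```python
import csv
import io

# Hypothetical todo list kept as CSV text; the columns are an assumption.
TODO_CSV = """task,done
write blog post,no
back up laptop,yes
"""

def open_tasks(text):
    """Return the tasks whose 'done' column is not 'yes'."""
    reader = csv.DictReader(io.StringIO(text))
    return [row["task"] for row in reader if row["done"] != "yes"]

print(open_tasks(TODO_CSV))
```

The same file stays readable in any editor, which is the flexibility being described.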

Binary data isn't as easily human readable or parsable. Images can be stored in text when base64 encoded, or when a text-based image format is used, such as XPM (old hackers may remember using vi as their X11 image editor of choice, back in the day) and SVG.
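A rough sketch of the base64 route in Python, for anyone who hasn't tried it. The bytes below are a stand-in for a real image (the PNG signature followed by arbitrary filler), not an actual picture.

```python
import base64

# Stand-in for image bytes: the 8-byte PNG signature plus 16 filler bytes.
raw = b"\x89PNG\r\n\x1a\n" + bytes(range(16))

# Encode the binary data as plain ASCII text, then decode it back.
encoded = base64.b64encode(raw).decode("ascii")
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw)
```

The text form can be pasted into any text file, at the cost of four characters for every three bytes.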

Indeed, DSpace is not about backups. It's about problems like this one:

An acquaintance of mine has a true blue 5.25 inch "floppy" disk with text data from his research that he can't recover, because the disk's format is inaccessible, the text is not in ASCII (probably EBCDIC), drives that read it aren't available now, the OS that knows the file system (CP/M, I think) isn't available, and the CPU that the OS ran on isn't available now.

I'll grant that you can recover it, but only with great difficulty - sort of a Rosetta problem.
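For what it's worth, the encoding layer is the easy part once the raw bytes are off the disk. A minimal sketch in Python, assuming the data really is EBCDIC and specifically code page 037 (one common variant; the actual disk may use another). The sample text and the recover helper are hypothetical.

```python
def recover(data, codec="cp037"):
    """Decode bytes using an EBCDIC code page (cp037 is an assumption)."""
    return data.decode(codec)

# Hypothetical sample: produce EBCDIC bytes so there is something to decode.
ebcdic_bytes = "RESEARCH NOTES".encode("cp037")

print(ebcdic_bytes)            # unreadable when viewed as ASCII
print(recover(ebcdic_bytes))   # readable again with the right code page
```

This does nothing for the drive, file system, or OS problems, which are the hard part.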

It's an interesting question, but I think it can be answered by pointing out just how much effort we've put into trying to build tools around other forms of information. Audio processing and image processing have come a long way, but they're nowhere near as powerful as sed/grep/awk/etc.

I agree with you though, tools like netcat, ssh, and dd are super powerful without implying any kind of formatting. Text works with these by virtue of being representable as a stream of bytes.

The Art of Unix Programming also has a section on why text trumps binary formats almost every time.

There's also a good chapter on self-documenting data formats in Jon Bentley's _More Programming Pearls_, and don't forget http://c2.com/cgi/wiki?PowerOfPlainText .

I look forward to his next post, "Why I love bits."

I think bytes are next. Then bits. Then the glory of square waves.

Seriously, I can understand the fascination with something so fundamental to programming. I'm just surprised the favorite editor wasn't mentioned.

Have I told you guys how much I love electrons lately?

Square waves and electrons are seriously some of the most beautiful, incredible things.

But for my money, how about logic? Why the hell do logical operators make sense, how have ideas with absolutely no physical basis taken a life of their own and woven all this magic around us?

You ought to read Stephenson's Anathem.

Blame Aristotle.

I usually think of logic as rigorously defined language, rather than built-up math. (Math being based in the physical world, and language being purely ephemeral)

From that point of view, 'or' 'and' 'not' and 'xor' make perfect sense.

> I'm just surprised the favorite editor wasn't mentioned.

It was, at least inferentially: Emacs.

> I think bytes are next. Then bits. Then the glory of square waves.

It's hard to really quite grasp just how crazy computers are without mucking about in some square waves for a few hours :)

Text became what it is to me today after my first revelatory experiences with Linux/bash--something like being six years old and realizing the alphabet is all I ever need to describe anything and everything.

Indescribable feeling.

Go ahead and use text files, but make sure you specify exactly what encoding is used. Saying it's "just plain text" is not enough. UTF-8 is fine and convenient, just make sure your code actually encodes/decodes it when it should.
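A small Python sketch of what goes wrong when the encoding is left implicit. The file name and sample string are made up; the point is only that the same bytes read back differently depending on the encoding the reader assumes.

```python
import os
import tempfile

text = "naïve café ✓"

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Write with an explicit encoding instead of relying on the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with the same encoding round-trips the text.
with open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

# Reading the same bytes as Latin-1 silently produces mojibake.
with open(path, "r", encoding="latin-1") as f:
    mojibake = f.read()

print(round_trip == text)
print(mojibake == text)
```

Note that the wrong decode raises no error here, which is why it tends to go unnoticed.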

From the quote at the end of the article: "The continued dominance of the command line among experts [...]" -- is that even true anymore? I certainly live in the shell, but looking at my younger co-workers who are in their 20s, they hardly ever use the command line.

The IDE is the new command line, it seems.

As a young 20-something (20, actually, to be exact), I'd say that the best programmers I know still use the command line to a significant degree. Though we are small in number compared to the number of people who live in some IDE.

I was under the impression that living in the IDE was only popular for certain languages (Java, C#, MSVC++, ...)

Not necessarily a bad thing - I like using Visual Studio when doing C#.

Virtually everyone at the startup I work for uses vim in a terminal.

I don't, because while I know vim verbiage, I've never found it more productive than having a big fookin' screen with TextMate and some terminals floating around.

I'd probably feel differently if I went back to working on Linux.

The concept seems obvious, but it hasn't stopped a slow sapping of functionality and permutability of data in the face of 'easy to use' solely GUI based tools. To say nothing of the rather difficult task of automating software without a CLI component...

I run my life off my todo.txt file as well. I also recently discovered the PlainText iPhone app that syncs this to my Dropbox. Highly recommend it if you run your life out of txt docs too.

Well, you couldn't fit all of human knowledge in a text file like he claims, not to mention how hard it'd be to give it a proper filename.

You could create a text file called alpha2omega.txt, which contains the line:

Look at the binary representation of pi; you'll find the sum of all human knowledge encoded there in ASCII, EBCDIC or any other code ever invented. You just need Google to index it all for you so it's useful.

I don't know if I believe it's possible to represent all of everything in text, downvote or no. Besides, it'd fork at least once before it got finished.

Space Ruler Xzorganax, I present to you: misc.txt

and if you run out of space in misc.txt, there's always stuff.txt too.

You just named two of the files I currently have on my desktop.

I think we have the same desktop.



Nah, you could. ASCII art covers most of your non-text bases.

And, if you're stuck for 3D video or textures or audio, you can cheat a little and define a 'container' and then use that 'container' within the text file. For example, imagine an XML file containing the rules on how to parse it; that is just another text file. You could do the same with any form of data. Binary data can be represented in ASCII too: written out as '0' and '1' characters, it would just be 8x as large as the binary file.
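How much larger binary data gets as ASCII depends on the encoding you pick. A quick Python sketch comparing three options on arbitrary sample bytes (the data is made up; only the ratios matter):

```python
import base64
import binascii

data = bytes(range(256)) * 3  # 768 bytes of arbitrary binary data

as_bits = "".join(f"{byte:08b}" for byte in data)  # one character per bit
as_hex = binascii.hexlify(data).decode("ascii")    # two characters per byte
as_b64 = base64.b64encode(data).decode("ascii")    # four characters per three bytes

for name, text in [("bits", as_bits), ("hex", as_hex), ("base64", as_b64)]:
    print(f"{name}: {len(text) / len(data):.2f}x")
```

So the worst case is one character per bit, and base64 keeps the overhead to about a third.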

OK, good point, but even ASCII can't represent meditative, religious, or otherwise universal truths, not to mention that finishing the text file (or even getting close) would have such a cultural impact that otherwise normal history recording would be shifted.

Love the design of that website.
