
It's about the failure modes you anticipate. If a text file becomes corrupt, a human can probably still make some sense of it. If a binary file does, you may well be completely out of luck. Try it for yourself: truncate a 4 GB core dump by a couple of bytes and watch GDB try to open it. Now imagine that all your system and application logs are vulnerable to this.

In a world where things are "fixed" by just blowing away a VM and spinning up a new one, you might not care, but in that case why bother to log anything at all?




> "truncate a 4G core dump by a couple of bytes and watch GDB try to open it"

This comes down to a failure of the binary format's design. It's possible to design a binary format that is uniform enough to survive truncation of a few bytes. One example: give the file a header with pointers to the start and end of each data block, and keep multiple copies of that header. Another approach is to have each block record where its own data ends, so that even with partial corruption you can still read the uncorrupted blocks.
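
A minimal sketch of that second approach (Python, purely illustrative, not any real container format): each block carries its own length, so a reader keeps every block that is fully present and only loses the truncated tail.

  import struct

  def write_blocks(path, payloads):
      # Hypothetical writer: a 4-byte little-endian length header before each block.
      with open(path, "wb") as f:
          for payload in payloads:
              f.write(struct.pack("<I", len(payload)))
              f.write(payload)

  def read_blocks(path):
      # Yield every block that is fully present; stop cleanly at a truncated tail.
      with open(path, "rb") as f:
          while True:
              header = f.read(4)
              if len(header) < 4:
                  return  # clean EOF or truncated header
              (length,) = struct.unpack("<I", header)
              payload = f.read(length)
              if len(payload) < length:
                  return  # truncated final block: drop it, keep everything before it
              yield payload

  write_blocks("demo.bin", [b"record one", b"record two", b"record three"])
  with open("demo.bin", "r+b") as f:
      size = f.seek(0, 2)    # seek to end returns the file size
      f.truncate(size - 2)   # chop off a couple of bytes, as with the core dump
  print([b.decode() for b in read_blocks("demo.bin")])  # -> ['record one', 'record two']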


It was already invented: ASCII.


No it wasn't. ASCII doesn't give you enough metadata to permit efficient and reliable parsing of data; you can do a lot better without ASCII.


ASCII solved the "few bits/bytes lost" problem, didn't it? The binary stream is divided into bytes, and the bytes are divided into control characters and text characters. The control characters include separators for columns (the tab character) and records (the newline character). Moreover, all of that was standardized across platforms, which was a huge improvement. It's why UNIX sticks to ASCII: to be portable.
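
To make that concrete, a small illustrative snippet (the log lines are made up): because newlines and tabs are in-band delimiters, each record parses independently, so corruption only costs you the records it actually touches.

  # A tab-separated ASCII log with a couple of corrupted bytes in one record.
  raw = (b"2024-01-01\tINFO\tstarted\n"
         b"2024-01-01\tWARN\tdisk \xff\xfe low\n"   # the damaged record
         b"2024-01-01\tINFO\tstopped\n")
  for line in raw.split(b"\n"):
      if not line:
          continue
      fields = line.decode("ascii", errors="replace").split("\t")
      print(fields)  # corruption only affects the record it landed in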



