
Segfaulting atop and another trip down the rabbit hole - luu
http://rachelbythebay.com/w/2014/03/02/sync/
======
chubot
I think netstrings provide a good and dirt simple format for robust logging.
Each record is just the payload behind an ASCII decimal length prefix (so the
prefix is itself variable length), with a colon after the length and a comma
after the payload.

[http://cr.yp.to/proto/netstrings.txt](http://cr.yp.to/proto/netstrings.txt)

e.g. 3:foo,

So if one record gets borked, you can search for the next few length prefixes
that "line up".

This doesn't require any transaction library or anything. It's just a little
bit of redundancy in the data.
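
A rough sketch of that resync idea in Python, in case it helps. The names (`encode`, `parse_at`, `lines_up`, `recover`), the `need=2` threshold and the 9-digit cap are my own assumptions, not anything from the netstrings spec or the article:

```python
MAX_DIGITS = 9  # assumed cap on the length prefix, to bound the search


def encode(payload: bytes) -> bytes:
    """Wrap payload as a netstring: b'3:foo,' for b'foo'."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def parse_at(buf: bytes, i: int):
    """Return (payload, next_offset) if a well-formed netstring starts at i, else None."""
    colon = buf.find(b":", i, i + MAX_DIGITS + 1)
    if colon == -1:
        return None
    digits = buf[i:colon]
    if not digits.isdigit():  # also rejects an empty prefix
        return None
    if len(digits) > 1 and digits[:1] == b"0":  # the spec forbids leading zeros
        return None
    end = colon + 1 + int(digits)
    if buf[end:end + 1] != b",":  # an out-of-range slice is empty, so this is safe
        return None
    return buf[colon + 1:end], end + 1


def lines_up(buf: bytes, i: int, need: int) -> bool:
    """True if `need` consecutive records parse from i (or we hit the end cleanly)."""
    for _ in range(need):
        if i == len(buf):
            return True
        rec = parse_at(buf, i)
        if rec is None:
            return False
        i = rec[1]
    return True


def recover(buf: bytes, need: int = 2):
    """Yield payloads, scanning byte by byte past damaged regions."""
    i, synced = 0, True
    while i < len(buf):
        rec = parse_at(buf, i)
        # After losing sync, only trust a candidate if the records after it line up too.
        if rec is not None and (synced or lines_up(buf, i, need)):
            yield rec[0]
            i, synced = rec[1], True
        else:
            i, synced = i + 1, False
```

A payload that happens to contain something shaped like a netstring can still fool this; raising `need` makes that less likely but never impossible.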

It's a good format for delimiting JSON too. I was surprised to find recently
that JSON-RPC makes you parse all the JSON to find the closing }.
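
Something like this is what I mean by delimiting JSON: the reader knows where each message ends before it touches a JSON parser. `write_msg` and `read_msg` are made-up names for illustration, and the 9-digit limit is an arbitrary assumption:

```python
import io
import json


def write_msg(stream, obj) -> None:
    """Serialize obj as JSON and frame it as a netstring."""
    body = json.dumps(obj).encode("utf-8")
    stream.write(b"%d:%b," % (len(body), body))


def read_msg(stream):
    """Read one framed message; return None on a clean end of stream."""
    digits = b""
    while True:
        c = stream.read(1)
        if c == b"":
            if digits:
                raise ValueError("stream ended inside a length prefix")
            return None
        if c == b":":
            break
        if not c.isdigit() or len(digits) >= 9:
            raise ValueError("bad length prefix")
        digits += c
    if not digits:
        raise ValueError("empty length prefix")
    n = int(digits)
    body = stream.read(n)
    # The length tells us where the message ends; no brace counting needed.
    if len(body) != n or stream.read(1) != b",":
        raise ValueError("truncated or misframed message")
    return json.loads(body.decode("utf-8"))
```

Note that a `}` inside a string value is no problem here, since the framing never looks inside the payload.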

~~~
signa11
> I think netstrings provide a good and dirt simple format

So netstrings are basically TLV-encoded strings? Since they always encode
strings, the 'T' part can be omitted, leaving just 'LV'?

~~~
chubot
Right. Zed Shaw actually did a typed version:
[http://tnetstrings.org/](http://tnetstrings.org/)

I played around with this. I'm on the fence on whether it has benefits over
say plain netstrings + JSON. It's less human readable, and sometimes it's
bigger than JSON. The encoder and decoder are a lot simpler, but that hardly
matters because JSON is everywhere.

JSON by itself is a little lacking because you can't encode arbitrary binary
values. But if you just add plain netstrings, you have the ability to encode
anything -- data in efficient formats, and metadata with types in human
readable JSON.
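
For example, a record could be two netstrings back to back: a JSON header, then the raw bytes. This layout and the names `pack_record` / `unpack_record` are just an assumption of mine for illustration, and the decoder is deliberately lenient:

```python
import json


def encode(payload: bytes) -> bytes:
    """Wrap payload as a netstring."""
    return b"%d:%b," % (len(payload), payload)


def decode_one(buf: bytes, pos: int = 0):
    """Decode the netstring starting at pos; return (payload, next_pos)."""
    colon = buf.index(b":", pos)
    length = int(buf[pos:colon])
    end = colon + 1 + length
    if buf[end:end + 1] != b",":
        raise ValueError("missing trailing comma")
    return buf[colon + 1:end], end + 1


def pack_record(meta: dict, blob: bytes) -> bytes:
    """Human-readable typed metadata as JSON, then the binary value untouched."""
    header = json.dumps(meta, sort_keys=True).encode("utf-8")
    return encode(header) + encode(blob)


def unpack_record(buf: bytes, pos: int = 0):
    """Return (meta, blob, next_pos) for the record starting at pos."""
    header, pos = decode_one(buf, pos)
    blob, pos = decode_one(buf, pos)
    return json.loads(header.decode("utf-8")), blob, pos
```

The blob never goes through JSON, so there is no base64 or escaping overhead, and the header stays greppable.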

------
bsder
So, the solution to not using transaction/copy-on-write semantics in your
application or filesystem is to crud up your format with a whole bunch of
magic numbers, markers, and checksums to make your file have
transaction/copy-on-write semantics.

Um ...

------
quanticle
Write-ahead-logging has existed for how many years now? It disappoints me (but
doesn't surprise me, unfortunately) that unexpected record truncation is still
an issue that we have to deal with.

~~~
Confusion
How does write-ahead-logging deal with unexpected record truncation _in the
log_?

~~~
twic
If an entry in the log is corrupt, it doesn't get replayed. The filesystem is
left in a consistent state, just without that data.

~~~
Confusion

> If an entry in the log is corrupt, it doesn't get replayed.

Yes, and that is exactly the solution the article calls for: you have to
design your log format so you can determine a log entry is corrupt, so you
don't replay it, while still replaying the remaining entries. You can't solve
that using write-ahead-logging, or it'd be turtles all the way down.
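
To make that concrete, here is a minimal sketch of a self-checking entry format. The layout (2-byte magic, 4-byte length, CRC-32, payload) and the names `pack_entry` / `replay` are hypothetical choices of mine, not the format atop or any particular filesystem journal uses:

```python
import struct
import zlib

MAGIC = b"\xfe\xed"               # assumed marker, chosen arbitrarily
HEADER = struct.Struct("<2sII")   # magic, payload length, CRC-32 of the payload


def pack_entry(payload: bytes) -> bytes:
    """Prefix payload with a header that lets a reader verify it later."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes):
    """Yield every payload that verifies; skip anything corrupt or truncated."""
    i = 0
    while i + HEADER.size <= len(log):
        magic, length, crc = HEADER.unpack_from(log, i)
        start = i + HEADER.size
        body = log[start:start + length]
        if magic == MAGIC and len(body) == length and zlib.crc32(body) == crc:
            yield body
            i = start + length
        else:
            # Entry is damaged: slide forward one byte and look for the next valid one.
            i += 1
```

A flipped byte or a torn write at the tail just makes that one entry fail its check; the entries around it still replay.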

