

Hadoop at Twitter  - ypavan
http://www.cloudera.com/blog/2009/11/17/hadoop-at-twitter-part-1-splittable-lzo-compression/

======
joubert
Has anybody here used Avro?

~~~
bravura
Do tell, I hadn't heard about it until now.

<http://hadoop.apache.org/avro/docs/current/> "Avro is a data serialization
system."

~~~
joubert
I have a project, <http://elev.at> - it's a web API to convert data from HTML
tables, Excel spreadsheets, CSV files, etc. into XML. The premise is to turn
information that is published for human eyes into computable form so that
other apps can consume the data.

So far I'm only supporting XML as the output, but am wondering about other
outputs. One obvious format would be JSON, but I'm also wondering about
others, such as Avro. It is similar in purpose to Thrift or Protocol Buffers,
but unlike those two, you don't need to pre-generate code files to process the
resulting data sets, i.e. the generation and consumption of Avro data sets can
be more dynamic than with Thrift or Protocol Buffers.
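
To make the "more dynamic" point concrete: Avro schemas are themselves written in JSON, so a converter could in principle build one at runtime from whatever columns it finds, with no code-generation step. The following is only a rough sketch under that assumption. It uses the Python standard library rather than an Avro library, so it produces the schema and the records but does not write Avro's binary encoding. The helper names `avro_name` and `schema_from_csv`, the record name `Row`, and the sample data are all made up for illustration, and every column is typed as a string.

```python
import csv
import io
import json
import re


def avro_name(header: str) -> str:
    """Turn a column header into a valid Avro name ([A-Za-z_][A-Za-z0-9_]*)."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", header.strip())
    if not name or name[0].isdigit():
        name = "_" + name
    return name


def schema_from_csv(text: str, record_name: str = "Row"):
    """Build an Avro-style record schema and matching records from CSV text.

    Every field is typed as a string; type inference and duplicate
    column names are not handled in this sketch.
    """
    reader = csv.reader(io.StringIO(text))
    headers = [avro_name(h) for h in next(reader)]
    schema = {
        "type": "record",
        "name": record_name,
        "fields": [{"name": h, "type": "string"} for h in headers],
    }
    records = [dict(zip(headers, row)) for row in reader]
    return schema, records


# Made-up sample input, standing in for a table scraped from a page.
sample = "name,units sold\nwidget,3\ngadget,5\n"
schema, records = schema_from_csv(sample)
print(json.dumps(schema, indent=2))
print(records)
```

A real converter would hand the schema and records to an Avro library to serialize them; the point here is just that the schema is ordinary data created on the fly.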

I was wondering whether anybody here has experience with Avro.

