Hacker News
JSON Compression by Rotating Data 90° (malctheoracle.com)
4 points by sfeather on June 3, 2016 | 12 comments



For those who assumed the compression was all about the pipe, here is the follow-up.

http://malctheoracle.com/post/json-compression-part-deux


So it's basically column-oriented JSON? But "90° JSON" sounds so much more hipster...
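
For anyone unfamiliar with the term, here is a minimal sketch of what "column-oriented" means in this context. It assumes the input is a list of flat objects that all share the same keys; the names `rotate` and `unrotate` are illustrative and not taken from the article.

```python
import json


def rotate(rows):
    """Turn a list of same-keyed dicts (rows) into a dict of lists (columns)."""
    if not rows:
        return {}
    keys = list(rows[0])  # assumption: every row has exactly these keys
    return {key: [row[key] for row in rows] for key in keys}


def unrotate(columns):
    """Inverse of rotate: rebuild the list of row dicts from the columns."""
    keys = list(columns)
    if not keys:
        return []
    length = len(columns[keys[0]])
    return [{key: columns[key][i] for key in keys} for i in range(length)]


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
]
columns = rotate(rows)
print(json.dumps(columns))  # {"id": [1, 2], "name": ["a", "b"]}
```

Each key is written once instead of once per row, which is where the saving comes from. Rows with missing or extra keys would need extra handling that this sketch leaves out.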


This looks a lot like the first compression step in JSON HPack: https://github.com/WebReflection/json.hpack/wiki


If you care about size, then the first step should be choosing the right protocol for the task, e.g. Protobufs.

Changing your data structure for the sake of compression harms the main advantage of JSON: convenience.


If you're taking this approach, why not just use text/csv as your data format? If you already know the field order/names, you can even omit the header row completely.
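
A rough sketch of that idea using Python's standard `csv` module. The helper names `to_csv` and `from_csv` and the sample fields are made up for illustration, and both sides are assumed to agree on the field order in advance.

```python
import csv
import io


def to_csv(rows, fieldnames, include_header=True):
    """Serialize a list of dicts as CSV text, optionally without a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    if include_header:
        writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text, fieldnames):
    """Parse header-less CSV text using field names known ahead of time."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames)
    return [dict(row) for row in reader]


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
payload = to_csv(rows, ["id", "name"], include_header=False)
print(payload)
print(from_csv(payload, ["id", "name"]))
```

One trade-off to keep in mind: CSV carries no type information, so every value comes back as a string and the receiver has to convert numbers and booleans itself.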


I would recommend gzip compression instead of this, which would compress the given data even more, without requiring any code changes.


While gzip compression is obviously better than this, which is more of an optimization, you can certainly pair this with gzip to get better results, albeit not by much:

Sample JSON Object, using this technique:

>>> Length (Original): 25565

>>> Length (Original, gzip): 3966

>>> Length (Using 90deg): 15626

>>> Length (using 90deg, gzip): 3633

So while the optimization gives you almost 10k bytes in size reduction before compression, the gzipped result is only about 330 bytes smaller in this example.
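
If you want to run the same four-way comparison on your own data, something along these lines should work. This is only a sketch: the `measure` helper and the synthetic `sample` records are hypothetical, not the object measured above, so the numbers it prints will differ.

```python
import gzip
import json


def measure(rows):
    """Return byte lengths of row- and column-oriented JSON, raw and gzipped."""
    keys = list(rows[0]) if rows else []  # assumption: all rows share these keys
    columns = {key: [row[key] for row in rows] for key in keys}
    row_json = json.dumps(rows, separators=(",", ":")).encode("utf-8")
    col_json = json.dumps(columns, separators=(",", ":")).encode("utf-8")
    return {
        "original": len(row_json),
        "original_gzip": len(gzip.compress(row_json)),
        "rotated": len(col_json),
        "rotated_gzip": len(gzip.compress(col_json)),
    }


# Synthetic records, used only to exercise the function.
sample = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(200)]
for label, size in measure(sample).items():
    print(f"{label}: {size}")
```

How much the rotated form helps after gzip depends heavily on the data, so it is worth measuring on a real payload before changing an API for it.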


Yeah, why go through all this?


Well, at the end of the article he shows this technique used along with gzip. He does not, however, show a comparison of the original vs. his 90deg compression with gzip, so the difference there may be negligible.


"After GZip the original file was squeezed down to just 132 bytes, however GZip took my technique down to just 99 bytes which saves a further 33 bytes."

When you want speed, every byte counts, right?


He does say that the original file gzipped is 132 bytes; it doesn't seem worth it to me.


You end up basically returning a dump of an SQL query.



