
Show HN: PHP Compressed Strings Library - orware
https://github.com/orware/compressed-string-demo
======
orware
Most users on HN might not be too interested in a library like this (and I
must admit the demo page I'm linking to above probably doesn't do the best job
of explaining the background or utility of such a package), but hopefully a
few of you will find it interesting.

As I mention on the page, I was drawn to the issue when I noticed how much
memory was being used when I pulled a large data set from one of our Oracle
databases here at work, and I thought to myself, "Self, are there any better
options?".

My main goal with the application I'm still working on is to create a
database proxy API (so I can submit a JSON request containing an actual SQL
query to this PHP application, which then prepares it, sends it to the Oracle
database, and returns the results to the user as JSON). I'm also working on
lots of security-related stuff to make that happen in a way that won't cause
any issues for the database, but this particular subproblem of memory usage
took up a few weeks of my time. I first experimented with building the JSON
strings directly, and then with GZIP strings (particularly because GZIP was
going to be the final output in most cases with Content Negotiation).
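
To make the idea a bit more concrete, here is a minimal sketch of building a
gzipped JSON array one row at a time, so the full uncompressed JSON string
never has to sit in memory. This is not the library's actual API: the names
gzipJsonRows() and sampleRows() are made up for illustration, and it assumes
PHP 7+ with the zlib extension (deflate_init() / deflate_add()).

```php
<?php
// Hypothetical helper (not part of the library): JSON-encode rows one at a
// time and feed each piece into an incremental gzip stream.
function gzipJsonRows(iterable $rows): string
{
    $ctx = deflate_init(ZLIB_ENCODING_GZIP, ['level' => 6]);
    $out = deflate_add($ctx, '[', ZLIB_NO_FLUSH);
    $first = true;

    foreach ($rows as $row) {
        // Only this one row exists as uncompressed JSON at any moment.
        $chunk = ($first ? '' : ',') . json_encode($row);
        $out .= deflate_add($ctx, $chunk, ZLIB_NO_FLUSH);
        $first = false;
    }

    // ZLIB_FINISH flushes the remaining data and writes the gzip trailer.
    $out .= deflate_add($ctx, ']', ZLIB_FINISH);

    return $out;
}

// Hypothetical stand-in for a database cursor: yields rows lazily.
function sampleRows(int $count): iterable
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'line' => 'To be, or not to be'];
    }
}

$gz = gzipJsonRows(sampleRows(1000));

// The compressed string can be sent as-is with "Content-Encoding: gzip",
// or expanded again with gzdecode() when the plain JSON is needed.
$json = gzdecode($gz);
```

For repetitive result sets the compressed string is usually far smaller than
the JSON it represents, which is where the memory savings come from; the
trade-off is the extra CPU time spent compressing.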

All in all, it was pretty fun to build the package and run various
performance tests, so if you have any questions, feel free to ask. I hope
someone might be able to use this in their own projects too :-).

Enjoy!

~~~
orware
One other thing to add: in creating the demo, I wanted to use the Shakespeare
JSON data set that someone had shared a link to on here at some point.

I had a bit of difficulty using it at first because it had some extra tab
characters throughout the file, and PHP's json_decode() function didn't tell
me exactly where the problem was.

Other tools didn't seem to be useful either (online validators typically only
handle small files and don't allow file uploads), and even Notepad++'s
JSON-related plugins didn't seem very helpful.

I finally ended up using the JSON Streaming package available for PHP which
allowed me to run through the file and identify some of the offending lines.

Then I noticed a pattern: the tab characters inside the line strings were not
escaped. I did a mass removal of the tab characters in the file, which then
allowed me to create the sqlite shakespeare.db file included in the
repository.
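
For anyone hitting the same thing, here is a rough sketch of the problem and
the fix using only built-in PHP functions. The sample string and the
linesWithRawTabs() helper are made up for illustration; this is not the JSON
Streaming package I mentioned, just a simple line scan.

```php
<?php
// Hypothetical sample: two JSON lines, the second with a raw (unescaped)
// tab character inside a string value.
$raw = "{\"speaker\": \"HAMLET\",\n \"line\": \"To be,\tor not to be\"}";

// json_decode() returns null here; the error message describes the kind of
// problem (a control character) but not where in the input it occurred.
$decoded = json_decode($raw, true);
$error   = json_last_error_msg();

// Hypothetical helper: report the 1-based line numbers containing a raw tab.
function linesWithRawTabs(string $text): array
{
    $hits = [];
    foreach (explode("\n", $text) as $i => $line) {
        if (strpos($line, "\t") !== false) {
            $hits[] = $i + 1;
        }
    }
    return $hits;
}

$badLines = linesWithRawTabs($raw);

// Removing every tab is safe for the JSON syntax itself: tabs outside of
// strings are only whitespace, and tabs inside strings were the problem.
$clean = str_replace("\t", '', $raw);
$fixed = json_decode($clean, true);
```

Note that the line scan also flags tabs used purely for indentation, so it
only narrows down where to look. If the tabs inside the strings matter, they
would need to be escaped as \t rather than removed, which takes more care than
a blanket replace.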

The cleaned-up database file by itself might be useful for others to use in
their own projects :-).

