There aren't any tests, either, so if you're using this and find a bug I guess you just have to hope that nothing breaks when you change a line like

    let X, B, O, m, i, c, e, N, M, o, t, j, x, R;
If you go to the live demo, it will crush/encodeURI and then uncrush/decodeURI whatever string you pass in, and verify that the round trip gives back the same string. Maybe I should write some automated tests.
EDIT: On second thought, you'd probably be better off using a completely different compression algorithm that doesn't sacrifice performance for golfability.
I mean, it's doing significantly better than LZ. Anyone want to provide a simple intuition for these algorithms and where they come from?
It can beat LZ's compression ratio for two reasons:
1. The input consists of only bytes that are valid in a URI, so it doesn't need an additional encoding step.
2. It searches for repeated substrings far more exhaustively, which is also why it gets very slow very quickly for larger inputs (see the sketch below).
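To give some intuition, here's a toy sketch I made up of the substring-dictionary idea (not JSONCrush's actual code; crushOnce/crush/uncrush and the cost model are inventions for illustration). It repeatedly finds the most profitable repeated substring, swaps every occurrence for a spare character that doesn't appear in the input, and records the pair. That exhaustive search is where both the extra ratio and the slowness come from:

    function crushOnce(s: string, spare: string): [string, string] | null {
      let best = "";
      let bestGain = 0;
      // Exhaustive search: every substring up to length 40 is a candidate,
      // and each candidate is counted against the whole input.
      for (let len = 2; len <= 40; len++) {
        for (let i = 0; i + len <= s.length; i++) {
          const sub = s.slice(i, i + len);
          const count = s.split(sub).length - 1;
          // Rough cost model: replacing `count` copies saves count*(len-1)
          // chars, while storing the dictionary pair costs about len+2.
          const gain = count * (len - 1) - (len + 2);
          if (count > 1 && gain > bestGain) {
            best = sub;
            bestGain = gain;
          }
        }
      }
      return bestGain > 0 ? [s.split(best).join(spare), best] : null;
    }

    // `spares` must be characters that never occur in the input (in the real
    // tool they also have to be URI-safe, which is part of the trick).
    function crush(s: string, spares: string[]): { out: string; dict: [string, string][] } {
      const dict: [string, string][] = [];
      let out = s;
      for (const spare of spares) {
        const step = crushOnce(out, spare);
        if (!step) break;
        [out] = step;
        dict.push([spare, step[1]]);
      }
      return { out, dict };
    }

    function uncrush(out: string, dict: [string, string][]): string {
      // Expand in reverse order, since later substitutions may contain
      // spare characters introduced by earlier ones.
      for (let i = dict.length - 1; i >= 0; i--) {
        out = out.split(dict[i][0]).join(dict[i][1]);
      }
      return out;
    }

The round trip uncrush(crush(s, spares)) recovers the input as long as the spare characters really don't occur in it.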
Example #1 (short string):
input: 103 bytes
gzip(input): 87 bytes
base64(gzip(input)): 117 bytes
Example #2 (longer string):
input: 3122 bytes
gzip(input): 840 bytes
base64(gzip(input)): 1121 bytes
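If anyone wants to reproduce numbers like these, here's a quick sketch using Node's built-in zlib (exact counts vary with the gzip level, and a shell base64 appends a trailing newline, which can shift a count by one byte):

    import { gzipSync } from "node:zlib";

    const input = "...";            // paste the sample string here
    const raw = Buffer.from(input, "utf8");
    const gz = gzipSync(raw);
    console.log("input:", raw.length, "bytes");
    console.log("gzip(input):", gz.length, "bytes");
    console.log("base64(gzip(input)):", gz.toString("base64").length, "bytes");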
(This is a terrible idea. Don’t do it. Just for fun)
Why is this a terrible idea?
I'm thinking I can WhatsApp friends a huge letter as an image and they could use my toy app to decode it ;-)
(Though it's mostly meant for making things smoother in statically-typed languages.)
But when you put JSON in the URL, you normally do want exactly that: making the URL shareable.
I do that because I wanted users to be able to save their state without burdening them with accounts or burdening myself with maintaining a DB for something so simple.
I also somewhat abuse the history API and use my "read the URL and load state" logic to implement undo/redo via navigating back and forward (sketched below), though that doesn't seem to work right now. I'm working on a refactor that uses redux for undo/redo and just replaces state, to keep the user's history clean.
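For the curious, a minimal sketch of the trick (AppState, render and readStateFromUrl are stand-ins I made up, not my real code):

    type AppState = { count: number };

    function render(state: AppState): void {
      document.title = `count: ${state.count}`; // stand-in for real rendering
    }

    function readStateFromUrl(): AppState | null {
      try {
        return JSON.parse(decodeURIComponent(location.hash.slice(1)));
      } catch {
        return null; // empty or malformed hash
      }
    }

    // Each change pushes a history entry, so Back/Forward act as undo/redo.
    function update(next: AppState): void {
      history.pushState(null, "", "#" + encodeURIComponent(JSON.stringify(next)));
      render(next);
    }

    // The same "read the URL and load state" logic handles navigation.
    window.addEventListener("popstate", () => {
      const state = readStateFromUrl();
      if (state) render(state);
    });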
Storing encoded JSON in the URL hash is a nifty hack in my opinion. Users can save state in a bookmark or share it with others easily, and it's clear to them that "where they are" in the URL bar maps to the current app state. Plus, most browsers sync bookmarks, which makes that state available on the user's other devices. For the site owner, it means not needing a DB to build an app with some kind of state persistence.
One risk: be sure the state you persist to the URL is in a schema you plan to keep compatibility with! Blind serialization and deserialization are a recipe for bugs and misery the next time you add a feature.
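One way to guard against that is an explicit version tag, so old bookmarks get migrated instead of breaking. A sketch (StateV1/StateV2 are invented shapes):

    type StateV1 = { version: 1; items: string[] };
    type StateV2 = { version: 2; items: { name: string; done: boolean }[] };

    function loadState(hash: string): StateV2 | null {
      try {
        const parsed = JSON.parse(decodeURIComponent(hash.slice(1)));
        if (parsed?.version === 2) return parsed as StateV2;
        if (parsed?.version === 1) {
          // Migrate old bookmarks forward instead of failing on them.
          const v1 = parsed as StateV1;
          return {
            version: 2,
            items: v1.items.map((name) => ({ name, done: false })),
          };
        }
        return null; // unknown or missing version: fall back to defaults
      } catch {
        return null; // malformed hash
      }
    }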
Storing confidential data in the hash assures my users that I (the developer) don’t have access to this data, since anything after the hash never gets sent to the server by the browser.
It wouldn't stop me from sending that information with a POST request afterwards, but the code is open source and doing so could be noticed in developer tools.
Say you type in a URL like https://example.com/webpage?q=54#conclusion. Then the web browser sends a message to the server that looks something like this:
    GET /webpage?q=54 HTTP/1.1
    Cookie: well maybe there's a cookie here
As you can see, the bit that comes after the hash isn't ever sent from the client to the server. It was originally meant for linking to a particular section of a longer web page, so it was never relevant to the server.
But what comes after the hash is never processed by any standards-compliant web server, nor transmitted by any standards-compliant web browser/client.
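You can see the split in the browser's own URL parser:

    const url = new URL("https://example.com/webpage?q=54#conclusion");
    console.log(url.pathname + url.search); // "/webpage?q=54" -> sent to the server
    console.log(url.hash);                  // "#conclusion"   -> stays in the client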
It is really unfortunate, because there are tons of use cases and zero reason to interfere with these requests.
The justification is also downright ridiculous. The argument is that "GET, DELETE, ... have no defined semantics for bodies". Meanwhile the 'defined semantics' for POST bodies is... whatever the application decides.
I'm slowly coming around to the idea that he might be right. The problem is the (semantic) question of what resource is being discussed. The semantics of GET, HEAD and OPTIONS are that (unless it gets modified) the same resource should always be answered the same way. If the resource we're asking about is identified by the URL + the body, then those requests (and DELETE) all need a body too. And then there's an open question for PUT and POST about what resource exactly is being modified by the request - although, as you say, the semantics are whatever, so that's less important.
I think HTTP has a problem in that it's hard to express a complex resource name. Like, "the records which match this specific elasticsearch query" is a hard thing to pack into a URL. If HTTP were different, I could imagine a second body-like part of each request called the "resource section", with its own Resource-Type: application/json header and so on. But instead we just have a URL, so we end up doing awful hacks like packing JSON into URLs, making them unreadable. The URL is long enough for this sort of thing - implementations are supposed to allow at least 8k of space for them, and can allow more. But long URLs are hard to read and display, and there's no standard way to pack JSON in there.
So I wonder if it'd be worth having a formal, consistent way to encode stuff like this in the URL. I'm spitballing, but maybe we need a standard way to encode JSON into the URL (base64 or whatever) defined in an RFC. Then, if you do that, you can add a "Query-Type: application/json" header that declares that the URL contains JSON encoded in the standard way. Then we could at least have consistency and nice tooling around this. So like, when you're making a request you could just pass JSON into your http library and it'd encode it into the URL automatically in the standard way. And when the URL is being displayed in the dev tools or whatever, it could write out the actual embedded JSON / GraphQL / whatever object that you've packed in there in an easy-to-read way.
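Sketching what that might look like (Query-Type is the made-up header from above, not a real one; base64url keeps the payload URL-safe; runs on Node 18+):

    import { Buffer } from "node:buffer";

    // Hypothetical convention: one reserved query parameter carrying
    // base64url-encoded JSON, flagged by the made-up Query-Type header.
    function requestWithJsonQuery(endpoint: string, query: unknown): Request {
      const encoded = Buffer.from(JSON.stringify(query)).toString("base64url");
      return new Request(`${endpoint}?q=${encoded}`, {
        headers: { "Query-Type": "application/json" }, // not a standard header
      });
    }

    // e.g. packing an elasticsearch-style query into a GET:
    const req = requestWithJsonQuery("https://example.com/records", {
      query: { match: { status: "open" } },
    });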
Using text/plain for this only exists as a backwards-compatibility thing. I wouldn't be surprised if it stops working at some point, since it breaks the security model.
Reduces share URLs by about 75% for these very repetitious JSON strings.