

URL Object Notation: A better JSON for URLs - vjeux
http://blog.vjeux.com/2011/javascript/urlon-url-object-notation.html

======
Robin_Message

      _foo_bar=Something&baz=Else
    

could parse as either of:

    
    
      {foo:{bar:"Something",baz:"Else"}}
      {foo:{bar:"Something"}, baz:"Else"}
    

which seems to ruin things. You need a character to end an object, maybe ; or
& would work.

Also, I don't like using : in query strings as a separator since that makes it
a bit ugly and not quite a query string. How about using = for both and using
say ! to identify non-strings, and @item@item@item for arrays?

So your example becomes:

    
    
      _user_name=Bob%20Smith&age=!47&sex=M&dob=5/12/1956&pastimes=@golf@opera@poker@rap&;children_bobby_age=!12&sex=M&;sally_age=!8&sex=F

~~~
vaporstun
> _foo_bar=Something&baz=Else

> could parse as either of:

> {foo:{bar:"Something",baz:"Else"}}

> {foo:{bar:"Something"}, baz:"Else"}

I don't think this is right, or at least, it shouldn't be right. I think
_foo_bar=Something&baz=Else should only parse as {foo:{bar:"Something"},
baz:"Else"}.

{foo:{bar:"Something",baz:"Else"}} should be _foo_bar=Something&_foo_baz=Else

It should be possible for the writer of the article to fix this without major
changes.

~~~
Robin_Message
> {foo:{bar:"Something",baz:"Else"}} should be
> _foo_bar=Something&_foo_baz=Else

This makes sense, but is not what I or the author was suggesting. Note that it
is somewhat repetitive (foo appears twice). In contrast, the new suggestion
simply has an optional ; to disambiguate these possibilities.

------
adeelk
To me, this:

    
    
        _table_achievement_column=instance&ascending:true
    

is less clear than:

    
    
        table[achievement][column]=instance&table[achievement][ascending]=true
    

which also has the benefit of being parsed properly by most web frameworks.
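
As a rough illustration of how the bracketed form maps onto nested objects, here is a minimal sketch. `parseBracketQuery` is a hypothetical helper written for this example, not the parser of any particular framework, and it leaves every value as a string:

```javascript
// Hypothetical helper: expands bracketed keys such as
// table[achievement][column] into nested plain objects.
function parseBracketQuery(query) {
  const result = {};
  for (const [key, value] of new URLSearchParams(query)) {
    // "table[achievement][column]" -> ["table", "achievement", "column"]
    const path = key.split(/[\[\]]+/).filter(Boolean);
    if (path.length === 0) continue; // skip empty keys
    let node = result;
    for (const part of path.slice(0, -1)) {
      // Create intermediate objects on demand.
      if (typeof node[part] !== 'object' || node[part] === null) {
        node[part] = {};
      }
      node = node[part];
    }
    node[path[path.length - 1]] = value;
  }
  return result;
}

const parsed = parseBracketQuery(
  'table[achievement][column]=instance&table[achievement][ascending]=true'
);
console.log(JSON.stringify(parsed));
// {"table":{"achievement":{"column":"instance","ascending":"true"}}}
```

Note that `ascending` comes back as the string "true"; this notation carries no type information on its own.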

~~~
vjeux
The URL strings are mostly read-only. Once generated, they are not going to be
edited by users anymore.

I find it better to have less readable output as long as there is a win in
terms of length, because no one likes big URLs.

Also, what is less visible in URLON is the object structure. But usually when
you want to edit something, you are only interested in changing the values,
not the structure itself.

~~~
adeelk
If you want to optimize length over readability, then how about:

    
    
        col=instance&asc=1
    

Or even better:

    
    
        c=instance&a=1
    

:)

~~~
vjeux
It is still possible to make such a small structure with URLON:

    
    
      URLON.stringify({c: 'instance', a: 1}) == '_c=instance&a:1'
    

We also benefit from the typing: 1 is a number and not a string :)

~~~
adeelk
But

    
    
        URLON.stringify({c: 'instance', a: '1'}) == '_c=instance&a:1'
    

as well, right?
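
Going by the convention visible in the examples in this thread (`name=Bob%20Smith` versus `age:47`), `=` appears to introduce a string and `:` any other value, so `'1'` and `1` would be expected to serialize differently. The toy one-level encoder below illustrates that idea. It is an assumption-laden sketch with hypothetical function names, not the URLON source:

```javascript
// Toy flat encoder modeled on the examples in this thread.
// NOT the real URLON implementation; it handles a single level of keys only.
function encodeFlat(obj) {
  const pairs = Object.entries(obj).map(([key, value]) => {
    const isString = typeof value === 'string';
    // '=' marks a string value, ':' marks any other JSON value.
    const sep = isString ? '=' : ':';
    const text = isString ? value : JSON.stringify(value);
    return encodeURIComponent(key) + sep + encodeURIComponent(text);
  });
  return '_' + pairs.join('&');
}

function decodeFlat(text) {
  const obj = {};
  for (const pair of text.slice(1).split('&')) {
    const match = /^([^=:]*)([=:])(.*)$/.exec(pair);
    if (!match) continue; // ignore empty or malformed pairs
    const [, key, sep, raw] = match;
    const value = decodeURIComponent(raw);
    // Strings are taken verbatim; everything else is parsed as JSON.
    obj[decodeURIComponent(key)] = sep === '=' ? value : JSON.parse(value);
  }
  return obj;
}

console.log(encodeFlat({ c: 'instance', a: 1 }));   // _c=instance&a:1
console.log(encodeFlat({ c: 'instance', a: '1' })); // _c=instance&a=1
```

Under this scheme the separator alone is enough to recover the type when decoding.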

------
drdaeman
Unreadable, ambiguous and just plain ugly.

First of all, in my understanding URLON is primarily made for URI fragments.
No browser would send URLON-encoded data, and if you're doing AJAX, then
URLs are barely a concern.

While the JSON is overly-verbose, the URLON is way too human-unfriendly:

    
    
        JSON: {"table":{"achievement":{"ascending":true,"column":"instance"}}}
        URLON: _table_achievement_ascending:true&column=instance
    

Humans suck at parsing grammars. We even suck with deeply nested parentheses
(that's why editors highlight matching parens), and counting underscores is
way more counter-intuitive.

Oh, and key names with underscores will look ugly.

If one wanted to be "URLish" and concise, one could write something like

    
    
        table[achievement[ascending=true&column=instance]]
    

While not perfect, at least, a human could understand what's going on from a
first glance.

------
sethg
The standard URL query-string syntax, supported by libraries in every
programming language worth using, uses & and = to construct
“key1=value1&key2=value2”-style parameters. URLON completely breaks that
syntax: if I see something like

    
    
      _table_achievement_column=instance&ascending:true
    

my immediate reaction is “so there are two parameters, one called
‘table_achievement_column’, and one called ‘ascending’... wait... why isn’t
there an equals sign between ‘ascending’ and ‘true’?” The syntax is just
similar enough to query-string syntax that it can mislead someone trying to
parse it by eye. And if you’re trying to pass complicated recursive structures
(the kind of structures that JSON was invented to describe) through the URL,
they’re not going to be parseable by eye in any format.

------
autarch
Right, cause no one ever uses an underscore in parameter identifiers.

------
stdbrouw
In cases where I really need to have data structures in URLs, I rather like
Rison: <http://mjtemplate.org/examples/rison.html> which has existed for quite
some time now and has parsers/generators in Python
(<https://github.com/stdbrouw/python-rison>), Ruby and JavaScript.

~~~
vjeux
Thanks! I didn't know about Rison, it is trying to solve the same issue :)

Edit: I've added Rison to the list of translations in the article. It is far
more readable than URLON, but I find that it doesn't feel like part of a URL.

~~~
stdbrouw
If you know what kind of structure each argument will have, the variants
O-Rison and A-Rison further cut down on the parentheses. For example:

    
    
        http://example.com/service?query=q:'*',start:10,count:10&pretty=false
    

Looks pretty natural to me.

------
bdfh42
I like the concept. However JSON works because it can be directly converted
into a JavaScript object and (without too much trouble) a JavaScript object
can be converted to JSON.

I thus think that to succeed this approach needs some support on the server
side to convert the notation into an object that can be interrogated from
code. OK, that would need to be different for each server runtime but things
like this tend to pick up support fairly quickly.

I use JSON to communicate with ASP.NET web services because the .NET runtime
provides a great de-serializer that can convert the JSON directly to one of my
server side classes and vice versa

~~~
vjeux
I'm not sure I properly understand your second paragraph. Here's an answer:
URLON supports both "stringify" and "parse" operations, so you can go both
ways.

You can even do fun things like URLON -> Javascript Object -> JSON if you want
to. The fact that it is 100% compatible with JSON is a great benefit.

As for my use, the URLON is in the hash part of the URL (after the #), so
everything is running in the client. This is useful for giving out URLs that
hold the current state of the page.

~~~
roberthahn
I thought the work you did was cool but honestly? Hashes are not the place to
store page state. See [http://www.webmonkey.com/2011/02/gawker-learns-the-
hard-way-...](http://www.webmonkey.com/2011/02/gawker-learns-the-hard-way-why-
hash-bang-urls-are-evil/) for an example.

If you need to store page state RESTfully, I would suggest you create a page
state resource and PUT your state there. If you need a page's state, ask for
it by name. This also has the benefit of keeping URLs shorter and cleaner.

~~~
vjeux
I'm not sure I understand what you mean. Could you tell me how I would
implement that for a concrete example:

[http://db.mmo-
champion.com/items/#table__search_results_item...](http://db.mmo-
champion.com/items/#table__search_results_item=2%3A-category)

I want to store: page number, sorted-column, reverse.

~~~
roberthahn
Create a new /items/pageState URI that accepts POST, PUT, and GET requests.

POST to /items/pageState to get a handle - this could be a randomly generated
small sequence of characters. A response will contain a Location: header with
a URL: /items/pageState/aA1 (the aA1 is just an example for this description -
each user would get an unused sequence of characters)

Anytime the user changes the state of the page, PUT the page number, sorted-
column, and reverse fields to /items/pageState/aA1.

Now, change the URL of the items page to
<http://db.mmo-champion.com/items/?pageState=aA1>. When that page loads, the
JS will make an Ajax GET request to /items/pageState/aA1 to fetch the state of
the page, and re-render it appropriately.

If you're concerned about speed, well, don't be. Add support for ETags on
/items/pageState, and while the state is unchanged, that data will be fetched
from browser cache instead of the network.
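
The POST/PUT/GET flow described above can be modeled without any network code. The sketch below is an in-memory stand-in only: the class and method names are hypothetical, and it uses a counter for handles where the description calls for a random unused sequence.

```javascript
// In-memory stand-in for the proposed /items/pageState resource.
// Names are hypothetical; a real version would sit behind HTTP handlers.
class PageStateStore {
  constructor() {
    this.states = new Map();
    this.nextId = 0;
  }

  // POST /items/pageState: allocate a handle and return its path,
  // which a server would send back in the Location header.
  create() {
    // A counter keeps the sketch deterministic; a random unused
    // sequence of characters would be used in practice.
    const handle = (this.nextId++).toString(36);
    this.states.set(handle, {});
    return '/items/pageState/' + handle;
  }

  // PUT /items/pageState/<handle>: overwrite the stored state.
  put(path, state) {
    this.states.set(this.handleOf(path), { ...state });
  }

  // GET /items/pageState/<handle>: return a copy of the stored state.
  get(path) {
    return { ...this.states.get(this.handleOf(path)) };
  }

  // Extract the handle from a path and check that it exists.
  handleOf(path) {
    const handle = path.split('/').pop();
    if (!this.states.has(handle)) {
      throw new Error('unknown handle: ' + handle);
    }
    return handle;
  }
}

const store = new PageStateStore();
const statePath = store.create();
store.put(statePath, { page: 2, sortedColumn: 'category', reverse: true });
console.log(statePath, store.get(statePath));
```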

~~~
vjeux
One state per user is not what I want. If the user sorts one way and shares a
link, then sorts another way and shares a link again, I want the two links to
be different.

Also, your solution adds a lot of extra server calls. The goal of client-side
sorting & pagination is to avoid those. I don't want to get them back just for
the sake of being RESTful.

~~~
roberthahn
Ah, new constraints on the problem! :) Love it!

My solution does not _have_ to add a lot of extra server calls - it all
depends on the caching strategy you choose. For example, if you use maxAge,
you would only be adding one extra request per _new_ state created, or per
_new_ fetch; all subsequent fetches for the same state would automatically
pull from the browser cache.

To satisfy the linking requirement, I would redesign my solution to just POST
the state to /items/pageState, and get back a URI that represents only that
state (which can be shared across all users). Combine this with the maxAge,
and yes, you would have to make more requests, but the extra overhead would
still be way less significant than adding a link to an image on every page
(especially in terms of bandwidth)

But, chacun à son goût! Everyone prefers their own tradeoffs :)

------
jrockway
So having a " in your URL is worse than having an &? Neither are representable
in HTML, so you're going to have to deal with escaping whether you want to or
not. And if you're already escaping & to &amp;, it's pretty easy to escape "
to &quot;. Which you should already be doing anyway. (Never put a literal
[&<>"'], in your (x)html document, please.)

------
jamesmoss
This smells bad. Passing in a load of JSON (or URLON) seems to go against
RESTful practices and really reduces the hackability of the URL.

~~~
arethuza
I thought that the URL structure is orthogonal to whether the service is
RESTful or not. I know that RESTful services tend to have nice URL formats
but, as far as I understand it, you really shouldn't be caring about URL
formats - you have a single entry point and everything is driven by
"hypertext" from there:

[http://roy.gbiv.com/untangled/2008/rest-apis-must-be-
hyperte...](http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-
driven)

------
josscrowcroft
One more point - `JSON.stringify` throws an error when attempting to
stringify a circular structure such as the `window` object...

I tried `URLON.stringify(window)` and nothing happened, no error or return
value (bad sign) so hit it a few more times, saw a couple stack overflow
errors, and the tab crashed.

If possible it should definitely throw a RangeError instead of attempting to
stringify a circular structure.

~~~
vjeux
You can use Douglas Crockford's cycle tools to deal with this.

[https://github.com/douglascrockford/JSON-
js/blob/master/cycl...](https://github.com/douglascrockford/JSON-
js/blob/master/cycle.js)

As the only way to know if an object has already been visited takes linear
time in JavaScript, cycle detection becomes an O(n^2) process. This is way too
costly to add by default, unfortunately.
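
For comparison, here is a minimal sketch of a cycle check that avoids scanning a list of visited objects. It is not the approach discussed above: it swaps in a `WeakSet`, which later versions of JavaScript provide, so each membership test is a hash lookup rather than a linear search. `hasCycle` is a hypothetical name.

```javascript
// Returns true if `value` contains a reference back to one of its own
// ancestors. Assumes an environment that provides WeakSet.
function hasCycle(value, ancestors = new WeakSet()) {
  if (typeof value !== 'object' || value === null) return false;
  if (ancestors.has(value)) return true; // already on the current path
  ancestors.add(value);
  for (const key of Object.keys(value)) {
    if (hasCycle(value[key], ancestors)) return true;
  }
  // Leaving this branch: shared, non-circular references stay legal.
  ancestors.delete(value);
  return false;
}

const circular = { name: 'root' };
circular.self = circular;
const shared = { x: 1 };

console.log(hasCycle(circular));                 // true
console.log(hasCycle({ a: shared, b: shared })); // false
```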

~~~
josscrowcroft
Thanks for the link, I see what you mean about costly.

Perhaps a way to hack-fix this is to set a parameter for maxRecursion, to
prevent the errors - seems like a pretty solid way of preventing endless loops
(RangeErrors) but is probably something that should be _off-by-default_ and
switched on globally via a setting or once via a parameter...

I can add it and submit a pull-request for your consideration if you like?
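
The depth cap suggested above could look roughly like the sketch below. `maxRecursion` is the name proposed in this comment, `checkDepth` is a hypothetical helper, and neither is an existing URLON option:

```javascript
// Walks a value and throws a RangeError once nesting reaches
// `maxRecursion` levels, which also stops circular structures.
function checkDepth(value, maxRecursion, depth = 0) {
  if (typeof value !== 'object' || value === null) return;
  if (depth >= maxRecursion) {
    throw new RangeError('maxRecursion of ' + maxRecursion + ' exceeded');
  }
  for (const key of Object.keys(value)) {
    checkDepth(value[key], maxRecursion, depth + 1);
  }
}

const loop = {};
loop.self = loop;

try {
  checkDepth(loop, 50);
} catch (err) {
  console.log(err instanceof RangeError); // true
}
```

A caller would run this check before stringifying and only when the option is switched on, so the default path pays nothing.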

~~~
vjeux
<http://jsperf.com/object-cycle-detection>

I've added a jsPerf to see how badly it performs. For small objects it's
reasonable, but if the object has 10 000 elements the JavaScript test will take
1 second, which is not acceptable.

I'd rather give people a cycle detection library if their input may
be circular.

------
Kilimanjaro
A slight mod can make it easier to read/parse. First the rules: everything is
a string unless specified like num# date% bool! list* and objects are
delimited by : and ;

Here is an example:

    
    
        user:name=Bob,age#=47,sex=M,dob%=5-12-1956;pastimes*:golf,opera,poker,rap;children:bobby:age#=12,sex=M;sally:age#=8,sex=F

~~~
underwater
'#' denotes a fragment in a URL. Browsers will only send 'user:name=Bob,age'
to the server.
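
This split can be checked with the standard `URL` class; the host name below is a placeholder chosen for the example:

```javascript
// Everything from the first '#' onward is the fragment, which the
// browser keeps to itself instead of sending it to the server.
const url = new URL('http://example.com/?user:name=Bob,age#=47,sex=M');

console.log(url.search); // "?user:name=Bob,age"
console.log(url.hash);   // "#=47,sex=M"
```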

~~~
Kilimanjaro
Then let's use $, we still have a few chars available

------
aubergene
So the main advantage over Rison is that the resulting URL should work
correctly in as many clients as possible? It would be good to start a test
matrix to verify this.

------
josscrowcroft
Are you planning to make a server-side implementation so that these URLs could
be parsed on the server?

Love the idea.

~~~
vjeux
If your server side is written in Javascript, you can use the implementation
here:

<https://github.com/vjeux/URLON/blob/master/urlon.js>

If not, someone needs to write a URLON library for your language.

------
kablamo
Why would you want JSON (or HTML) in your URLs? Example?

~~~
vjeux
I want to store the current state of the page. For example, if there's a table
in the page, I want to keep track of which page I'm on, which column I sorted
by ...

[http://db.mmo-
champion.com/items/2/#table__search_results_it...](http://db.mmo-
champion.com/items/2/#table__search_results_item=1%3A-category)

~~~
DougWebb
If that's all you're trying to do, then why not store the json using local
storage and a randomly generated key, and just put the key into the url's
fragment?

~~~
vjeux
The URL has to be shareable. If I give the URL to someone else with your
technique, he won't be able to see the same page.

~~~
DougWebb
Yeah, that's a valid point. Requirements matter :)

------
dramaticus3
Why wouldn't you just base64encode your JSON and include that in the URI and
then base64decode it when you want to de-JSON it?

~~~
vjeux
I've tried many things but I wanted it to be readable, small, and good
looking.

JSON:

    
    
        {"user":{"name":"Bob Smith","age":47,"sex":"M","dob":"5-12-1956"},"pastimes":["golf","opera","poker","rap"],"children":{"bobby":{"age":12,"sex":"M"},"sally":{"age":8,"sex":"F"}}}
    

Base64:

    
    
        eyJ1c2VyIjp7Im5hbWUiOiJCb2IgU21pdGgiLCJhZ2UiOjQ3LCJzZXgiOiJNIiwiZG9iIjoiNS0xMi0xOTU2In0sInBhc3RpbWVzIjoNClsiZ29sZiIsIm9wZXJhIiwicG9rZXIiLCJyYXAiXSwiY2hpbGRyZW4iOnsiYm9iYnkiOnsiYWdlIjoxMiwic2V4IjoiTSJ9LA0KInNhbGx5Ijp7ImFnZSI6OCwic2V4IjoiRiJ9fX0=
    

JSON + URIEncode

    
    
        %7B%22user%22:%7B%22name%22:%22Bob%20Smith%22,%22age%22:47,%22sex%22:%22M%22,%22dob%22:%225-12-1956%22%7D,%22pastimes%22:%5B%22golf%22,%22opera%22,%22poker%22,%22rap%22%5D,%22children%22:%7B%22bobby%22:%7B%22age%22:12,%22sex%22:%22M%22%7D,%22sally%22:%7B%22age%22:8,%22sex%22:%22F%22%7D%7D%7D
    

URLON

    
    
        _user_name=Bob%20Smith&age:47&sex=M&dob=5-12-1956;&pastimes@=golf@=opera@=poker@=rap;&children_bobby_age:12&sex=M;&sally_age:8&sex=F
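
The size of the first three encodings can be compared with a short script. This is a sketch that assumes an environment with a global `btoa` (browsers and recent Node.js); it uses `encodeURI`, which leaves `:` and `,` unescaped like the example above, and it does not cover URLON itself:

```javascript
const data = {
  user: { name: 'Bob Smith', age: 47, sex: 'M', dob: '5-12-1956' },
  pastimes: ['golf', 'opera', 'poker', 'rap'],
  children: {
    bobby: { age: 12, sex: 'M' },
    sally: { age: 8, sex: 'F' },
  },
};

const json = JSON.stringify(data);
const encodings = {
  json,
  base64: btoa(json),           // ASCII-only input, so btoa is safe here
  uriEncoded: encodeURI(json),  // escapes { } [ ] " and spaces
};

// Print the character count of each representation.
for (const [name, text] of Object.entries(encodings)) {
  console.log(name, text.length);
}
```

Base64 output is always about a third longer than its input, while the URI-encoded form grows with the number of quotes and brackets.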

~~~
sethg
You can translate the base64 version into a readable form from the terminal
using widely available libraries. For example, using a Python interpreter you
can do this:

    
    
      >>> from base64 import b64decode
      >>> from json import loads, dumps
      >>> print dumps(loads(b64decode('eyJ1c2VyIjp7Im5hbWUiOiJCb2IgU21pdGgiLCJhZ2UiOjQ3LCJzZXgiOiJNIiwiZG9iIjoiNS0xMi0xOTU2In0sInBhc3RpbWVzIjoNClsiZ29sZiIsIm9wZXJhIiwicG9rZXIiLCJyYXAiXSwiY2hpbGRyZW4iOnsiYm9iYnkiOnsiYWdlIjoxMiwic2V4IjoiTSJ9LA0KInNhbGx5Ijp7ImFnZSI6OCwic2V4IjoiRiJ9fX0=')), indent=2)
      {
        "pastimes": [
          "golf", 
          "opera", 
          "poker", 
          "rap"
        ], 
        ## [etc.]
      }

~~~
vjeux
The goal is to use it in a URL. I don't want my URLs to look like this:

    
    
      http://db.mmo-champion.com/items/#eyJ1c2VyIjp7Im5hbWUiO
      iJCb2IgU21pdGgiLCJhZ2UiOjQ3LCJzZXgiOiJNIiwiZG9iIjoiNS0x
      Mi0xOTU2In0sInBhc3RpbWVzIjoNClsiZ29sZiIsIm9wZXJhIiwicG9
      rZXIiLCJyYXAiXSwiY2hpbGRyZW4iOnsiYm9iYnkiOnsiYWdlIjoxMi
      wic2V4IjoiTSJ9LA0KInNhbGx5Ijp7ImFnZSI6OCwic2V4IjoiRiJ9f
      X0

~~~
sethg
If you want to pack that much information into a URL, it’s going to look ugly
no matter how it’s formatted.

(I assume you have some reason for not just passing a session key in the URL
and keeping all the relevant state on the server.)

