
W3C HTML JSON form submission - ozcanesen
http://www.w3.org/TR/html-json-forms/
======
rspeer
Let me say first of all that I'm glad they're working on standardizing this.
When making REST APIs, I find HTML form scaffolds incredibly useful, but it
means that you probably have to accept both JSON (because JSON is reasonable)
and occasional form-encoding (because forms), leading to subtle
incompatibilities. Or you have to disregard HTML and turn your forms into
JavaScript things that submit JSON. Either way, the current state is ugly.

Here's the part that I don't particularly like, speaking of subtle
incompatibilities:

    
    
        EXAMPLE 2: Multiple Values
        <form enctype='application/json'>
          <input type='number' name='bottle-on-wall' value='1'>
          <input type='number' name='bottle-on-wall' value='2'>
          <input type='number' name='bottle-on-wall' value='3'>
        </form>
    
        // produces
        {
          "bottle-on-wall":   [1, 2, 3]
        }
    

I've seen this ugly pattern before in things that map XML to JSON. Values
spontaneously convert to lists when you have more than one of them. Here come
some easily overlooked type errors.

I don't know of any common patterns for working with "a thing or a list of
things" in JSON; that kind of type mixing is the thing you hope to get away
from by defining a good API. But all code that handles HTML JSON is going to
have to deal with these maybe-list-maybe-not values, in a repetitive and
boilerplatey way.
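
For illustration, the boilerplate in question tends to look something like the sketch below. The helper name `as_list` is made up for this example, and treating a missing value as an empty list is an assumption, not something the draft specifies.

```python
from typing import Any, List


def as_list(value: Any) -> List[Any]:
    """Normalize a maybe-list-maybe-not value into a list (hypothetical helper)."""
    if value is None:
        # Assumption: a missing field is treated as "no values".
        return []
    if isinstance(value, list):
        return value
    return [value]


one_input = {"bottle-on-wall": 1}             # a single input with that name
three_inputs = {"bottle-on-wall": [1, 2, 3]}  # three inputs with that name

for payload in (one_input, three_inputs):
    for bottle in as_list(payload.get("bottle-on-wall")):
        print(bottle)
```

Every handler that reads such a field would need a call like this, which is the repetition being complained about.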

I hope that a standard such as this will eventually be adopted by real-life
frameworks such as Django REST Framework, but I also hope that they just
reject the possibility of multiple fields with the same name.

~~~
IgorPartola
PHP handles this by requiring you to use an input name like
"bottle-on-a-wall[]". The brackets indicate that it should be a list. I don't
hate this convention...

~~~
hayksaakian
the only problem with the bracket syntax is when it comes to nested things

    
    
            <form enctype='application/json'>
              <input name='wow[such][deep][3][much][power][!]' value='Amaze'>
            </form>
    
            // produces
            {
                "wow":  {
                    "such": {
                        "deep": [
                            null
                        ,   null
                        ,   null
                        ,   {
                                "much": {
                                    "power": {
                                        "!":  "Amaze"
                                    }
                                }
                            }
                        ]
                    }
                }
            }
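
A rough sketch of how a bracketed name like that could be expanded into the nested structure. This is a simplification and not the draft's actual algorithm: the names `parse_path` and `set_path` are made up, and the append syntax (`[]`), malformed names and type conflicts between fields are not handled.

```python
import re
from typing import Any, List, Union

Step = Union[str, int]


def parse_path(name: str) -> List[Step]:
    """Split 'a[b][3]' into ['a', 'b', 3]; all-digit steps become list indexes."""
    head = re.match(r"[^\[]+", name)
    if head is None:
        raise ValueError(f"no leading key in {name!r}")
    steps: List[Step] = [head.group(0)]
    for part in re.findall(r"\[([^\[\]]*)\]", name[head.end():]):
        steps.append(int(part) if re.fullmatch(r"[0-9]+", part) else part)
    return steps


def _pad(items: list, index: int) -> None:
    """Fill a list with None so that items[index] exists."""
    while len(items) <= index:
        items.append(None)


def set_path(root: dict, steps: List[Step], value: Any) -> None:
    """Walk the steps, creating dicts and lists as needed, then store the value."""
    node: Any = root
    for step, following in zip(steps, steps[1:]):
        # The kind of container to create depends on the *next* step.
        fresh: Any = [] if isinstance(following, int) else {}
        if isinstance(step, int):
            _pad(node, step)
            if node[step] is None:
                node[step] = fresh
            node = node[step]
        else:
            node = node.setdefault(step, fresh)
    last = steps[-1]
    if isinstance(last, int):
        _pad(node, last)
    node[last] = value


result: dict = {}
set_path(result, parse_path("wow[such][deep][3][much][power][!]"), "Amaze")
print(result)
```

With the input from the example, this should reproduce the structure quoted above, including the three `null` placeholders before index 3.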

~~~
joesb
What's the problem with that? If it's deeply nested, then it's deeply nested,
no matter what.

~~~
byuu
It's pedantic, but it's a tad clunky now: I'd use something like
"wow/such/deep[3]/much/power/!" to differentiate child nodes from array
indexes, and also strongly advise against a node named "!" on principle. Of
course then you'd say "but I want '/' in my node name!", and then you get to
bikeshed an escape sequence, but you'd need that anyway for a node name with
'[' or ']' in it.

However, what they've done is workable if all the browser vendors implement
it.

~~~
stephenr
square brackets are a reasonably common way to access data in a hash-like
object. both JavaScript (objects) and PHP (associative arrays) use/allow
square brackets for this.

------
bkardell
In fairness, you have to look at how standards get somewhere - this is an
editor's draft which is a starting point of an idea rather than a done deal.
Don't be surprised if the final product winds up being significantly different
than this - even better, get involved in the conversation to make it what we
need. That's not to pour cold water on it: It's good as it is, but there are
changes which potentially help explain the magic of participating in form
encoding and submission which may be better and allow more adaptation and
experimentation over time.

------
kijin
Why such an emphasis on "losing no information" when the form is obviously
malformed?

You need only to look at the crazy ways in which MySQL mangles data to realize
that silently "correcting" invalid input is not the way to go. The web has
suffered enough of that bullshit, we seriously don't need another. Example 7
(mixing scalar and array types) gives me shudders. Example 10 (mismatched
braces) seems to have a reasonable fallback behavior, though I'd prefer
dropping the malformed field altogether.

If the form is obviously malformed, transmission _should_ fail, and it should
fail as loudly and catastrophically as possible, so that the developer is
forced to correct the mistake before the code in question ever leaves the dev
box.

Preferably, the form shouldn't even work if any part of it is malformed. If
we're too timid to do that, at least we should leave out malformed fields
instead of silently shuffling them around. Otherwise we'll end up with
frameworks that check three different places and return the closest match,
leaving the developer blissfully ignorant of his error.

While we're at it, we also need strict limits on valid paths (e.g. no
mismatched braces, no braces inside braces) and nesting depth (most frameworks
already enforce some sort of limit there), and what to do when such limits are
violated. Again, the default should be a loud warning and obvious failure, not
silent mangling to make the data fit.

This is supposed to be a new standard, there's no backward-compatibility
baggage to carry. So let's make this as clean and unambiguous as possible!

~~~
JetSpiegel
That's not the Web way. The Web is the lowest common denominator, all that
talk of "correctness" goes over the head of most Web Developers.

------
luikore
I don't agree with Example 9, we should use data uri scheme for file content

    
    
        "files": [{
          "name": "dahut.txt",
          "src": "data:text/plain;base64,REFBQUFBQUFIVVVVVVVVVVVVVCEhIQo="
        }]
    

[http://en.wikipedia.org/wiki/Data_URI_scheme](http://en.wikipedia.org/wiki/Data_URI_scheme)

~~~
theandrewbailey
1\. Other form encoding types have discrete MIME type fields.

2\. While data URIs are awesome, it forces you to process that URI in some way
(parse, regex, whatever) just to get the MIME type. Also, this forces 13 bytes
of redundancy every time.
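
To make point 2 concrete, here is a minimal sketch of the processing a server would need just to recover the MIME type from the `src` value in the parent comment. The function name `split_data_uri` is made up, and only the `;base64` form is handled.

```python
import base64
from typing import Tuple


def split_data_uri(uri: str) -> Tuple[str, bytes]:
    """Return (mime_type, payload) for a base64 data URI (hypothetical helper)."""
    prefix = "data:"
    if not uri.startswith(prefix):
        raise ValueError("not a data URI")
    header, comma, body = uri[len(prefix):].partition(",")
    if not comma:
        raise ValueError("data URI has no comma separator")
    params = header.split(";")
    if params[-1] != "base64":
        raise ValueError("this sketch only handles base64 data URIs")
    # An omitted media type defaults to text/plain in the data URI scheme.
    mime_type = params[0] or "text/plain"
    return mime_type, base64.b64decode(body)


mime, payload = split_data_uri(
    "data:text/plain;base64,REFBQUFBQUFIVVVVVVVVVVVVVCEhIQo="
)
print(mime, len(payload))
```

With a discrete MIME type field, the first half of that function would not be needed.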

------
hughes

        {
          "name":   "Bender"
        , "hind":   "Bitable"
        , "shiny":  true
        }
    

Who puts commas at the _start_ of a continuing line? What good could that
possibly do?

~~~
pielud
It lets you comment out a line without having to remove the trailing comma
from the previous line. That'd be useful if JSON had comments.

I've seen people do this in the SELECT portion of SQL queries too.

Personally, I hate this.

~~~
nkuttler
> It lets you comment out a line without having to remove the trailing comma
> from the previous line

That is only an advantage over trailing commas when you comment out the last
line. It really only moves the problem from the last line to the first one.
Now you can't comment out the first line without removing a comma...

~~~
diroussel
In the comma-first method, adding a new element at the end produces a one-line
diff. But with the comma-last method, adding a new element to the list gives a
two-line diff.

It can make resolving merges just a little bit easier.
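
A small sketch of that difference, counting how many lines a line-based diff removes and adds when one element is appended in each style. The helper name `diff_size` is made up, and the sample objects are only loosely based on the example quoted above.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def diff_size(before: List[str], after: List[str]) -> Tuple[int, int]:
    """Return (lines removed, lines added) between two versions of a file."""
    removed = added = 0
    matcher = SequenceMatcher(None, before, after)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


comma_first_before = ['{', '  "name": "Bender"', ', "shiny": true', '}']
comma_first_after = ['{', '  "name": "Bender"', ', "shiny": true',
                     ', "hind": "Bitable"', '}']

comma_last_before = ['{', '  "name": "Bender",', '  "shiny": true', '}']
comma_last_after = ['{', '  "name": "Bender",', '  "shiny": true,',
                    '  "hind": "Bitable"', '}']

print("comma-first:", diff_size(comma_first_before, comma_first_after))
print("comma-last: ", diff_size(comma_last_before, comma_last_after))
```

In the comma-last case the previously final line is rewritten to gain its comma, so the new version contains two changed lines instead of one.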

------
jimmcslim
I'm not sure whether to be heartened or concerned that the W3C is referencing
the doge meme in its specifications... see Example 6.

~~~
jl6
Concerned. Memes are in-group signalling one notch above the crudest kind such
as football chants, a few notches below the more sophisticated kind like
quoting Shakespeare, but all ultimately with the potential to exclude and
confuse - which is definitely the opposite of what a technical spec should be
trying to do.

~~~
icebraining
I don't see why this would exclude or confuse. It's essentially an in-joke,
and shouldn't affect anyone reading the spec who is unaware of the meme.

I'm not a fan, but I don't think we should make it out to be more than it
really is.

------
lorddoig
It amazes me that we're now at the point of standardizing sticking array
references inside strings and yet we're still not having a serious discussion
about what comes after HTML.

~~~
chc
It amazes me that you think we could have a serious discussion about what
comes after HTML when essentially nobody is seriously considering replacing
HTML. HTML is what it is, nothing else is what HTML is, and HTML is going to
be around a good long while.

~~~
Silhouette
_It amazes me that you think we could have a serious discussion about what
comes after HTML when essentially nobody is seriously considering replacing
HTML._

Of course we are. There are numerous new technologies competing to replace the
role we have shoe-horned HTML into playing -- look at all the different
templating technologies now in development, for example, and things like Web
Components. There are also numerous technologies for marking up semantic
content and/or styling such data for presentation.

The only thing that isn't changing right now is that we're still stuck with
eventually reducing these alternatives to plain HTML for display in browsers,
which is about as good an idea as insisting we reduce all styling to CSS and
all programming to JavaScript. It's a historical accident, it's resulted in
widespread dependence on tools that are nowhere near fit for the purposes they
are now asked to serve, but there is so much momentum in the industry that
building tools to accommodate the weaknesses as well as possible is the
preferred strategy over completely starting over. See also: Almost everything
about programming ever.

~~~
lorddoig
Fully agreed. A low(ish)-level, non-JS API to the sea of DOM "primitives"
inside blink/gecko/webkit/etc. suitable for targeting in an FFI-like manner by
any language is the standard I'd like to see. We don't need 3 languages: it's
all there in C++ classes and C structs.

I've been writing a lot of ClojureScript/React lately and with that I only
really touch HTML to include CSS and scripts. It's pretty glorious, highly
productive, and makes a very strong case for opening up lower levels to allow
new paradigms to evolve - with this set-up, HTML and JS only get in the way.

The browser is now (almost) an OS in a VM and, imho, the sooner we start
treating it like that the better.

------
tomchristie
Seems pretty decent. Also neat that the nesting style could be repurposed to
support nested structures in regular form-encoded HTML forms.

Main limitation on _actually_ being able to use this is that `GET` and `POST`
continue to be the only supported methods in browser form submissions right
now, so eg. you wouldn't be able to make JSON `PUT` requests with this style
anytime soon.

Might be that adoption of this would swing the consensus on supporting other
HTTP methods in HTML forms.

------
tootie
They're still working on XForms after 10 years
[http://www.w3.org/MarkUp/Forms/](http://www.w3.org/MarkUp/Forms/)

~~~
matthewmacleod
Isn't that kind of like saying "They're still working on HTML after 23 years"?
Technically you're right, but version 1.1 of XForms was completed and
published over 5 years ago.

That said, XForms is dead AFAIK, and that's not a bad thing.

~~~
tootie
Yeah, my point was more along the lines of they've been working for 10 years
and gotten near zero adoption. Not saying it will fail again for sure, but if
this had value, XForms would have found it. URL-encoded POST data works just
fine.

~~~
jkrems
XForms has a bigger scope and is connected to XHTML. Two good reasons why this
might succeed where XForms failed.

------
jdp
The latest release of my jarg[0] utility supports the HTML JSON form syntax.
Writing out JSON at the command line is tedious, this makes it a little nicer.
The examples from the draft are compatible with jarg:

    
    
        $ jarg wow[such][deep][3][much][power][!]=Amaze
        {"wow": {"such": {"deep": [null, null, null, {"much": {"power": {"!": "Amaze"}}}]}}}
    

[0]: [http://jdp.github.io/jarg/](http://jdp.github.io/jarg/)

------
techtalsky
Kind of nice, basically turns form submission into a bare-bones API call.

~~~
tootie
Which they pretty much already were. The only value I can imagine this adding
is a way to encode forms with nested structure.

~~~
hayksaakian
The changes they make solve the problem of corner cases submitting weird input
to an API that expects only JSON

In 2014, I think this is a good idea.

It also solves the issues with ambiguous syntax surrounding arrays of values.

------
chronial
Am I the only one who is worried about the fact that this is exponential in
size?

    
    
      <input name="field[1000000]">
    

Will generate a request that is ~5MB.
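
A back-of-the-envelope check of that figure, as a sketch. It assumes the field's value is an empty string and that the encoder emits compact JSON with no whitespace. The function name `sparse_payload_size` is made up.

```python
import json


def sparse_payload_size(index: int) -> int:
    """Size in bytes of {"field":[null,...,""]} with the value at `index`."""
    # Every index below the one named in the form is padded with null.
    padded = [None] * index + [""]
    body = json.dumps({"field": padded}, separators=(",", ":"))
    return len(body.encode("utf-8"))


print(sparse_payload_size(1_000_000))
```

Under those assumptions each skipped index costs five bytes (`null` plus a comma), which is where the roughly 5MB comes from.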

~~~
icebraining
I don't see why that should be worrying. What's the scenario you're
foreseeing?

------
mnarayan01
The JSON-based file upload would be nice (AFAIK there's no great way to do
this ATM, but I haven't looked in over a year). The rest seems pretty weak-tea
though. I can see multiple issues with more defined type (e.g. numeric rather
than string values, null rather than blank string), but without dealing with
that stuff, this seems of extremely limited utility.

~~~
alex_duf
It's nice for small files but base64 really is inefficient. I think it is
still used in mails, but really, you should use it only when you control all
the usages that will be done with it.

------
stu_k
Submitting files with this form encoding is of course going to have the base64
overhead, but otherwise this looks great!

~~~
treve
Yea this makes me wish there was a better way to deal with this. JSON is
popular because it's simple, but as a result sucks for a bunch of use-cases.

~~~
tonyarkles
I'm mobile so don't have a good way to just test this myself... Any idea how
good/bad base64+gzip is (ie gzipping the json before submitting it)? If it's
within a few percent then this probably isn't a bad solution!
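
For anyone who wants to try it offline, a minimal sketch of the measurement. It uses seeded pseudo-random bytes as a stand-in for an already-compressed file (an assumption, and roughly the worst case), so real text files would behave differently. The function name `measure` is made up.

```python
import base64
import gzip
import random
from typing import Dict


def measure(n_bytes: int, seed: int = 0) -> Dict[str, int]:
    """Compare raw, base64 and gzipped-base64 sizes for pseudo-random data."""
    rng = random.Random(seed)
    raw = bytes(rng.getrandbits(8) for _ in range(n_bytes))
    encoded = base64.b64encode(raw)
    compressed = gzip.compress(encoded)
    return {
        "raw": len(raw),
        "base64": len(encoded),
        "base64+gzip": len(compressed),
    }


for label, size in measure(30_000).items():
    print(f"{label:12} {size}")
```

Base64 uses only 64 distinct symbols per byte, so gzip can win back much of the 4/3 expansion, but it cannot go below the size of the incompressible input.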

~~~
dnet
The browser doesn't gzip _requests_ by itself, only the server does so with
the _response_ if the user-agent (including browsers) states that it supports
such content encoding. Of course you can implement gzip in JavaScript, but if
you do that, you can already mangle the request and send the file to the
server without Base64 encoding.

------
pmontra
A discussion about the implementation of the spec in jquery. It started on
June 21

[https://github.com/macek/jquery-serialize-object/issues/24](https://github.com/macek/jquery-serialize-object/issues/24)

------
homakov
How does it solve the CSRF JSON problem?
[http://homakov.blogspot.com/2012/06/x-www-form-urlencoded-vs...](http://homakov.blogspot.com/2012/06/x-www-form-urlencoded-vs-json-pros-and.html)

------
billpg
A new standard for referencing a point in a JSON object? I wonder if they
considered RFC 6901 and rejected it.

I personally prefer this new square bracket notation, but being a standard
already gets more points.

~~~
tomchristie
JSON Pointer isn't quite the same thing.

~~~
billpg
I humbly disagree. JSON Pointer is a syntax for specifying a location in a
JSON object. Multiple form items with JSON pointer strings as names would map
to the equivalent of a number of add operations in a PATCH call that start
with an empty object.
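
A sketch of what that mapping could look like, with field names written as RFC 6901 pointers. The function names and sample fields are made up. Note one liberty taken here: missing intermediate objects are created on the fly, which a strict JSON Patch `add` would not do, and array indexes are treated as plain object keys.

```python
from typing import Any, Dict, List


def pointer_tokens(pointer: str) -> List[str]:
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if pointer == "":
        return []
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    # Order matters: '~1' is decoded before '~0', as RFC 6901 requires.
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer[1:].split("/")]


def build_from_fields(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Apply each (pointer, value) pair to a document that starts out empty."""
    doc: Dict[str, Any] = {}
    for pointer, value in fields.items():
        tokens = pointer_tokens(pointer)
        if not tokens:
            raise ValueError("this sketch cannot replace the whole document")
        node = doc
        for token in tokens[:-1]:
            # Assumption: create missing parents instead of failing.
            node = node.setdefault(token, {})
        node[tokens[-1]] = value
    return doc


print(build_from_fields({
    "/name": "Bender",
    "/pet/species": "Dahut",
    "/a~1b": 1,
}))
```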

------
skratlo
Wow, W3C at its best again. Non-modular, non-negotiable, JSON it is, take it
or leave it. Well fuck you W3C. Base64 encoded files? Seriously? What if my
app works better with msgpack encoded forms? Or with XML encoded? So you're
going to support one particular serialization format, quite a horrible one,
but that's subjective and that's the whole point. Every app has different
needs and you should spec. out a system that is modular and leaves the choice
to the user, even for the price of "complicating things".

~~~
tomchristie
How would you deal with rendering arbitrary form encodings in the browser? A
proposal adding support for form submission of _arbitrary_ encodings could be
valid, but it'd have to just be a single form input with the data included
verbatim.

This proposal allows regular HTML forms with multiple input elements, but
submitting over JSON. I can't see how you could define that for arbitrary
encodings without first defining how the form fields map to the encoded data
for all the encodings you'd want to support.

~~~
skratlo
Eg: By referencing an encoder function using the standard on* attribute. Could
be called onbeforesubmit=encodeMsgpack. This function would take a JS object,
generated according to the W3C's JSON form spec, and return a pair of [string,
arraybuffer]. String being the MIME content type, and arraybuffer containing
the request body.

~~~
icebraining
If you're going to run custom JS code, why not simply submit it through JS
HTTP requests? What have you gained by this new API?

------
edwinvdgraaf
Guessing that it's interesting when using a uniform endpoint for both forms
and js-driven requests.

------
Patrick_Devine
Can we just get rid of HTML and replace it with JSON while we're at it?

~~~
byuu
I don't know if there's a proper name for this ability, but neither JSON nor
YAML allow you to embed child tree nodes inside of node values. This ability
requires named end tags. Example:

    
    
        <p>This is <b>bold</b> text.</p>
        "p": "This is ... uh ... nevermind."
    

You could make a compelling argument that you shouldn't do this (separate
block level and inline elements into separate encodings), but remember that
even the relatively minor HTML->XHTML movement to put a bit of sanity into
single-use tags like <br> -> <br/> failed miserably.

~~~
Xophmeister
Nodes have multiple children. In this case, you can split the <p>'s children
into three: a "text node"; a <b> node; and another text node. You'd end up
with something like this:

    
    
        {
          p: [
            { text: 'This is' },
            { b: { text: 'bold' }},
            { text: 'text.'}
          ]
        }
    

Not that I don't agree with you -- my suggestion is pretty messy and doesn't
even go into how we'd deal with tag attributes -- but it's doable.

~~~
JetSpiegel
This has to be pre-processed somehow, if people don't like writing HTML by
hand, consider writing that.

~~~
byuu
Yeah, and that's not really equivalent HTML either. That's more like:

    
    
        <p>
          <text>This is</text>
          <b><text>bold</text></b>
          <text>text.</text>
        </p>
    

(which is a lot easier for a machine to parse, at least.)

JSON and YAML are great for what they do: data serialization, but they're just
not appropriate choices for text markup. You have to use the right tool for
the job.

There was a time when the industry wanted to cram 100% of everything into XML,
and that gave us XSLT, XAML, SOAP, etc. We don't want to go back to that,
either.

Personally, I'm most fond of an extended Markdown syntax to replace HTML. But
I'm not going to hold my breath waiting on web browser vendors to agree on
such a syntax, so instead I have my HTTP server do the conversion to HTML for
me.

But if you had to force it into JSON, then I would suggest using a different
markup for inline elements, eg:

    
    
        html
          body
            p: "This is [/italic/] text."
            p: "This is [[google.com => a hyperlink.]]"

------
woutervdb
> wow[such][deep][3][much][power][!]

And there goes my interest in this submission. Don't use overused memes in a
submission. Liking the idea though.

~~~
Donzo
I enjoyed this. I also liked the Bender reference.

It reminded me of that show that makes me laugh on Netflix.

It also illustrated the point that he was trying to make. I prefer warm
examples, rather than ones using book titles.

