
JSON Patch – a format for describing changes to a JSON document - kolev
http://jsonpatch.com/
======
azinman2
We (Empirical) implemented something similar a couple of years ago in our
JSON-backed model. I see two big problems with this standard:

1\. Why express paths as a string if you have the ability to encode it as an
array in json? Otherwise you run up against the escaping problem for which
they use ~0 and ~1 (of all things?!) which is only going to cause bugs.

2\. Indexing into arrays is a problem in distributed computing because things
can go out of sync super fast especially if these documents are concurrently
shared across devices/users/caches. We solved this by making arrays into sets,
whereby the sets are indexed by either a value (string, number) or, in the case
of containing more objects, having those objects specify a unique key (e.g.
["someArray", {"myUniqueKey": uniqueValue}, "propertyInsideTheObject"]). We
maintain an ontology to know which keys are unique, but because we encode it
in the patch itself the underlying platform doesn't need to know that
hierarchy... it can simply look at each document's "myUniqueKey" to perform
the update.

Thus our patches are always idempotent unlike the proposal.
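
To make the keyed-path idea concrete, here is a minimal JavaScript sketch. It is not Empirical's actual code: `resolvePath`, `setAtPath`, and the `id` key are hypothetical names chosen for illustration. A path step that is an object is treated as a selector that picks the array element with matching fields, so the patch does not depend on element positions.

```javascript
// Hypothetical sketch: path steps are either plain keys or selector objects
// such as { id: 42 } that pick an array element by a unique key.
function resolvePath(doc, path) {
  return path.reduce(function (node, step) {
    if (node === undefined || node === null) return undefined;
    if (typeof step === "object" && step !== null) {
      if (!Array.isArray(node)) return undefined;
      const wanted = Object.entries(step);
      // Selector step: find the element whose fields all match.
      return node.find(function (item) {
        return item !== null && typeof item === "object" &&
          wanted.every(function (pair) { return item[pair[0]] === pair[1]; });
      });
    }
    return node[step];
  }, doc);
}

// Set a property on whatever the leading part of the path selects.
function setAtPath(doc, path, value) {
  const parent = resolvePath(doc, path.slice(0, -1));
  if (parent === undefined || parent === null) throw new Error("path not found");
  parent[path[path.length - 1]] = value;
  return doc;
}

const doc = { someArray: [{ id: 7, name: "a" }, { id: 42, name: "b" }] };
const patchPath = ["someArray", { id: 42 }, "name"];
setAtPath(doc, patchPath, "renamed");
doc.someArray.reverse();              // another client reorders the array
setAtPath(doc, patchPath, "renamed"); // re-applying hits the same element
console.log(JSON.stringify(doc));
```

Because the element is found by key rather than by index, applying the same patch again after a reorder changes nothing, which is the idempotency property described above.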

If anyone is interested in any of this being open sourced (we have it
production-quality in Scala, Java (Android), Objective-C, and JavaScript) let me know
at aaron@empiric.al. It's a pretty powerful stack that we've considered open
sourcing before... a bit like having your own version of Parse/Firebase but
with a nicer/easier API.

~~~
jasonkostempski
Damn, thought I invented the array path thing :) I love how clear it is to
read but my favorite thing is how easy it is to apply it:

    
    
      var x = { a: 1, b: 'B', c: { x: "X", y: "Y", z: "Z" }, d: [{p:"P"},{q:"Q"},{r:"R"}] };
      // Walk the path one key at a time, starting from x.
      ["d", 1, "q"].reduce(function (prev, curr) { return prev[curr]; }, x);
      // → "Q"
    

I got stuck at indexing as well. I feel like there's a good solution involving
using an array element within the path array to denote a selector query of
sorts but I haven't hashed it out yet.

~~~
afandian
If you're interested, this is the general structure and access method of the
'trie' data structure.

[http://en.wikipedia.org/wiki/Trie](http://en.wikipedia.org/wiki/Trie)

Also, Clojure has some nice functions, get-in and friends:

    
    
        user=> (def x {:a 1 :b "B" :c { :x "X" :y "Y" :z "Z" }
                       :d [{:p "P"} {:q "Q"} {:r "R"}] })
        #'user/x
        user=> (get-in x [:d 1 :q])
        "Q"
        user=> (assoc-in x [:d 1 :q] "QQ")
        {:a 1, :c {:z "Z", :y "Y", :x "X"}, :b "B", :d [{:p "P"} {:q "QQ"} {:r "R"}]}
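
For comparison, a rough JavaScript counterpart could look like the sketch below. `getIn` and `assocIn` are hypothetical helper names borrowed from the Clojure functions, not a standard JavaScript API. `assocIn` copies only the containers along the path and leaves the original untouched; as a simplification, missing intermediate containers are always created as plain objects.

```javascript
// getIn: walk a path array, returning undefined if any step is missing.
function getIn(obj, path) {
  return path.reduce(function (node, key) {
    return node == null ? undefined : node[key];
  }, obj);
}

// assocIn: return a copy with the value at `path` replaced. Only the
// containers along the path are copied; everything else is shared.
function assocIn(obj, path, value) {
  if (path.length === 0) return value;
  const key = path[0];
  const copy = Array.isArray(obj) ? obj.slice() : Object.assign({}, obj);
  copy[key] = assocIn(obj == null ? undefined : obj[key], path.slice(1), value);
  return copy;
}

const x = { a: 1, b: "B", c: { x: "X", y: "Y", z: "Z" }, d: [{ p: "P" }, { q: "Q" }, { r: "R" }] };
const y = assocIn(x, ["d", 1, "q"], "QQ");
console.log(getIn(x, ["d", 1, "q"])); // "Q"  (original unchanged)
console.log(getIn(y, ["d", 1, "q"])); // "QQ"
```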

------
chacham15
I think that this idea is missing one key point of patching: the patch should
be smaller than the size of the new document, or at the least the sum of both
the new and old documents (in case you want the patch for record keeping).

In the example given the original text is 34 bytes long, the patch is 151
bytes, and the result is 42 bytes. Storing both the new text and the old text
is 76 bytes which is about half of the size of the patch. If this were a
corner case that rarely happened, then I wouldn't bring it up. However, it is
actually a common case (and is even given as the example!) and has such
terrible characteristics, there is no reason to use it.
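
The comparison is easy to reproduce. The sketch below assumes documents modelled on the jsonpatch.com example (the original and result objects are my assumption; only the patch appears in this thread) and measures compact encodings, so the byte counts will not match the whitespace-formatted figures quoted above. `jsonBytes` is a hypothetical helper.

```javascript
// Assumed documents, modelled on the jsonpatch.com example.
const original = { baz: "qux", foo: "bar" };
const patch = [
  { op: "replace", path: "/baz", value: "boo" },
  { op: "add", path: "/hello", value: ["world"] },
  { op: "remove", path: "/foo" }
];
const result = { baz: "boo", hello: ["world"] };

// Size in bytes of the compact UTF-8 JSON encoding of a value.
function jsonBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const sizes = {
  original: jsonBytes(original),
  patch: jsonBytes(patch),
  result: jsonBytes(result)
};
console.log(sizes);
console.log("patch larger than both documents combined:",
  sizes.patch > sizes.original + sizes.result);
```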

~~~
nacs
I use this patch system in production and I agree. For small JSON documents,
this system actually can result in larger patch files than the original
document.

Rather than using this everywhere we output JSON, we only use this
patch system on large documents. It cuts our average transfer size from ~60KB
(full JSON data) to ~2KB (patch data) per API request after the initial load
since only a handful of values change over the polling interval.

------
drderidder
This spec has come up a couple times; the first time I wondered whether a
simpler approach might be acceptable:

    
    
      * setting properties to be updated
      * omitting properties to be unchanged
      * explicitly setting 'undefined' properties to be deleted
    

I've used that approach and github seems to have done something similar. I
like the idea of a patch spec for JSON - it's required by a strict
implementation of REST with HTTP PATCH - but the proposals I've seen so far
seem just beyond the ken of the forehead-slapping simplicity that everyone
loves about the JSON format and the parse & stringify methods.

~~~
Dashron
This proposal is more in line with the system you describe (except null
instead of undefined).
[https://tools.ietf.org/html/rfc7386](https://tools.ietf.org/html/rfc7386)

I prefer it, and think it's much easier to understand.
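
For reference, the merge algorithm in that RFC is short enough to sketch in a few lines of JavaScript. This is an illustrative, non-mutating version following the RFC's pseudocode; `mergePatch` and `isPlainObject` are names I picked, not part of any library.

```javascript
function isPlainObject(v) {
  return v !== null && typeof v === "object" && !Array.isArray(v);
}

// Apply a merge patch: object members are merged recursively, a null
// member deletes the key, and anything else replaces the target wholesale.
function mergePatch(target, patch) {
  if (!isPlainObject(patch)) return patch;
  const result = isPlainObject(target) ? Object.assign({}, target) : {};
  for (const name of Object.keys(patch)) {
    if (patch[name] === null) {
      delete result[name];
    } else {
      result[name] = mergePatch(result[name], patch[name]);
    }
  }
  return result;
}

const before = { a: "b", c: { d: "e", f: "g" } };
const after = mergePatch(before, { a: "z", c: { f: null } });
console.log(JSON.stringify(after)); // {"a":"z","c":{"d":"e"}}
```

Note the trade-offs this implies: arrays can only be replaced as a whole, and a property cannot be set to null because null means "delete".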

~~~
drderidder
Oh, good - someone wrote an RFC for it - thanks for the link! It looks a lot
like a post on partial updates in REST [1] where I recommended using null in
the context of relational databases like mysql. Now I kinda feel undefined
might be a better way to explicitly delete properties, especially if using a
NoSQL database.

1.[http://51elliot.blogspot.ca/2014/05/rest-api-best-
practices-...](http://51elliot.blogspot.ca/2014/05/rest-api-best-
practices-3-partial.html)

~~~
logn
I think they used null for deletion because undefined isn't in the JSON spec.
It's more of a JavaScript concept.

~~~
drderidder
Thanks, you're right. "null" is what I do use - I forgot undefined won't work
with JSON.parse. It would be nice to have a way to explicitly define a
property as null, versus non-extant.
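
A quick JavaScript illustration of why undefined cannot carry the "delete" signal over the wire: it is dropped when serializing and rejected when parsing, while null survives the round trip.

```javascript
// undefined-valued properties are silently omitted by JSON.stringify.
const serialized = JSON.stringify({ kept: 1, cleared: null, dropped: undefined });
console.log(serialized); // {"kept":1,"cleared":null}

// undefined is not valid JSON, so parsing it fails.
let parseError = null;
try {
  JSON.parse('{"dropped": undefined}');
} catch (e) {
  parseError = e;
}
console.log(parseError instanceof SyntaxError); // true

// After a round trip, null is still present but the undefined key is gone.
const parsed = JSON.parse(serialized);
console.log("cleared" in parsed, "dropped" in parsed); // true false
```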

------
mikeknoop
The thing you're competing against in terms of performance and efficiency is:

1\. JSON.stringify()

2\. Diff-match-patch: [https://code.google.com/p/google-diff-match-
patch/](https://code.google.com/p/google-diff-match-patch/)

3\. JSON.parse()

There are several shortcomings to the above approach though, specifically,
you have to explicitly trade off performance vs. efficiency depending on the
JSON you're dealing with.

JSON Patch is going to give you a consistent performance vs. efficiency curve
which is desirable in a lot of circumstances.

~~~
nhaehnle
It is my understanding that the JSON encoding of objects is not specified
canonically because the order of keys can be arbitrary. That is,
JSON.stringify() is free to return different strings for the same JSON object.

Is my understanding correct? If it is, that would totally break your
suggestion.

~~~
drostie
You _are_ correct, but if you control the environment enough to be doing this
at all, then you control the environment enough to get the serialization
consistent: either by having ordered dictionaries in the language a la Python
and PHP, or implementing your own, or by sorting the keys lexicographically a
la bencode.

That is, `diff(sorted_json_stringify(json_parse(p)))` will be adequate if
`diff(p)` is not.
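
As a rough sketch of the key-sorting option in JavaScript: `sortedStringify` below is a hypothetical helper (not a built-in) that uses the `JSON.stringify` replacer to rebuild every object with its keys in sorted order. One caveat: JavaScript engines always enumerate integer-like keys first, so this is a sketch for ordinary string keys rather than a full canonicalization scheme.

```javascript
// Serialize with object keys sorted, so equal objects give equal strings
// regardless of the order in which their keys were inserted.
function sortedStringify(value) {
  return JSON.stringify(value, function (key, val) {
    if (val !== null && typeof val === "object" && !Array.isArray(val)) {
      const sorted = {};
      Object.keys(val).sort().forEach(function (k) {
        sorted[k] = val[k];
      });
      return sorted;
    }
    return val;
  });
}

const first = { b: 1, a: { d: 2, c: 3 } };
const second = { a: { c: 3, d: 2 }, b: 1 };
console.log(JSON.stringify(first) === JSON.stringify(second));   // false
console.log(sortedStringify(first) === sortedStringify(second)); // true
```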

------
chj
The missing op is string diff. If a very long string value has changed, no
matter how little the change is, you'll have to include the entire string in
the patch.

~~~
andrewstuart2
If you have strings that long, maybe it would be wise to break it up
semantically into some collection of resources (paragraphs, lines, etc.). It
could be reassembled as needed for display but then stored and updated as its
semantic parts.

~~~
DonHopkins
Taken to an extreme (and isn't that what programming is all about), that could
end up using a lot more memory and overhead, storing an array of many short
strings, instead of storing one big flat string. And it would also require
more work and pointless code complexity to deal with many short strings.

~~~
yxhuvud
As opposed to the pointless complexity that comes from implementing support
for string diffs?

Having implemented a JSON differ (which is similar, but represents paths
differently, because id elements everywhere that matters give uniform access to
nodes in the tree), I would consider string diffs overengineering.

------
lobster_johnson
Seems like a good starting point, but it's a bit too simplistic for my taste.

"Add" seems like it's not idempotent as the existing value seems to change
what it does.
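
A small JavaScript sketch of the difference. The `add` function here is a deliberately minimal stand-in that only handles one-level paths, not a complete JSON Patch implementation: on an object member it sets or overwrites, on an array index it inserts and shifts.

```javascript
// Minimal "add" for paths like "/name" or "/tags/0" (sketch only).
function add(doc, path, value) {
  const [key, index] = path.split("/").slice(1);
  const copy = JSON.parse(JSON.stringify(doc));
  if (index === undefined) {
    copy[key] = value;                          // object member: set or overwrite
  } else {
    copy[key].splice(Number(index), 0, value);  // array element: insert and shift
  }
  return copy;
}

// Object member: applying the same op twice gives the same document.
const memberOnce = add({}, "/name", "x");
const memberTwice = add(memberOnce, "/name", "x");
console.log(JSON.stringify(memberOnce), JSON.stringify(memberTwice));

// Array index: applying the same op twice inserts a second element.
const once = add({ tags: ["b"] }, "/tags/0", "a");
const twice = add(once, "/tags/0", "a");
console.log(JSON.stringify(once.tags), JSON.stringify(twice.tags));
```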

Seems like a mistake to bundle array operations into "add" and "remove".

Array indexing is going to be useless for most applications because it makes
assumptions about what's already there, and there is no way to specify fine-
grained preconditions (or relativity such as "insert before the element 'b'").

No support for "increment", "decrement" or similar operations you'd want to
make atomic.

No support for sets.

Unlike azinman2, I think paths as strings is fine. But why not support both?
Let a path be either a string or an array.
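
Supporting both is mostly a matter of normalizing at the edge. The sketch below uses hypothetical helper names and accepts either form, converting between them with the ~0 / ~1 escaping that string paths need. The order of the replacements matters: "~" is escaped before "/" on the way out, and "~1" is unescaped before "~0" on the way in.

```javascript
// Escape a single key for use inside a string path.
function escapeToken(key) {
  return key.replace(/~/g, "~0").replace(/\//g, "~1");
}

// Reverse of escapeToken; "~1" must be handled before "~0".
function unescapeToken(token) {
  return token.replace(/~1/g, "/").replace(/~0/g, "~");
}

// Array of keys -> string path.
function toPointer(keys) {
  return keys.map(function (k) { return "/" + escapeToken(String(k)); }).join("");
}

// Accept either form and always return an array of string keys.
function normalizePath(path) {
  if (Array.isArray(path)) return path.map(String);
  if (path === "") return [];
  return path.split("/").slice(1).map(unescapeToken);
}

console.log(toPointer(["a/b", "m~n", 0]));     // /a~1b/m~0n/0
console.log(normalizePath("/a~1b/m~0n/0"));    // [ 'a/b', 'm~n', '0' ]
console.log(normalizePath(["a/b", "m~n", 0])); // [ 'a/b', 'm~n', '0' ]
```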

~~~
oneeyedpigeon
Surely supporting "increment"/"decrement" would break idempotency?

~~~
lobster_johnson
Well, of course. Idempotency is something you express on the client. Sometimes
you want it, sometimes you don't.

------
disordinary
Having the paths in every operation seems needlessly verbose. If I had an
object

    
    
      {
            id,
            name,
            password,
            email,
            phone
      }
    

And I wanted to replace the email and phone items would I need to go:

    
    
        [
          {"op": "replace", "path": "/user/0/email", "value": "new@email.address"},
          {"op": "replace", "path": "/user/0/phone", "value": "+64 4 555 5555"}
        ]

If I have big objects then that is going to be a lot of information to
transport. I'd prefer something like:

    
    
        {"op": "rplc", "pth": "/user/0/", "ptc": 
                { "email": "new@email.address", "phone": "+64 4 555 5555"}
        }

~~~
skrebbel
Why? A compression algorithm will remove that verbosity better than a more
complex format will. Just use gzip on your HTTP and you're done.

~~~
disordinary
Even compressed, it's going to be bigger. The other thing is it's going to be
more efficient for the processor if we define where in the object we are going
to be working rather than navigating through the object for every single
change even if they are adjacent to each other.

------
jameshart
I can't help thinking that

    
    
       [
         { "op": "replace", "path": "/baz", "value": "boo" },
         { "op": "add", "path": "/hello", "value": ["world"] },
         { "op": "remove", "path": "/foo"}
       ]
    

would be better described as

    
    
         value["baz"] = "boo";
         value["hello"] = ["world"];
         delete value["foo"];
    

Why go to the trouble of defining your grammar as a strict subset of JSON
documents, which are themselves written using a strict subset of JavaScript
syntax, when you could just limit yourself instead to the strict subset of
JavaScript syntax that encapsulates the operations you actually want to
perform?

~~~
Morhaus
Because it's much easier to implement?

~~~
jameshart
To support this JSON patch format, you've got to implement a system which
validates that all the array entries are in a valid format, securely
interprets the 'path' attribute, and processes the rules according to a
specification which may or may not cover every edge case.

To support my approach, you need to validate that each line uses a strict
subset of JavaScript, then securely evaluate each one according to the well
defined rules of JavaScript.

Both have basically the same essential complexity - not sure I see why the
JSON format patch is inherently 'easier' to work with.

~~~
derefr
You can write a simple library to apply a valid JSON patch in any language.
Your method basically requires a JavaScript engine.
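
To give a sense of scale, here is a minimal JavaScript sketch of such a library covering only add, remove, and replace. `applyPatch` and `parsePointer` are hypothetical names, and the sketch skips move, copy, test, and whole-document paths, so it is an illustration rather than a conforming implementation.

```javascript
// Split a path such as "/a~1b/0" into unescaped keys.
function parsePointer(pointer) {
  if (pointer === "") return [];
  return pointer.split("/").slice(1).map(function (t) {
    return t.replace(/~1/g, "/").replace(/~0/g, "~");
  });
}

// Apply add / remove / replace operations to a copy of the document.
function applyPatch(doc, patch) {
  const result = JSON.parse(JSON.stringify(doc));
  for (const op of patch) {
    const tokens = parsePointer(op.path);
    if (tokens.length === 0) throw new Error("root path not supported in this sketch");
    const last = tokens.pop();
    const parent = tokens.reduce(function (node, t) { return node[t]; }, result);
    const isArray = Array.isArray(parent);
    const key = isArray ? (last === "-" ? parent.length : Number(last)) : last;
    const exists = isArray
      ? key < parent.length
      : Object.prototype.hasOwnProperty.call(parent, key);
    if (op.op === "add") {
      if (isArray) parent.splice(key, 0, op.value);
      else parent[key] = op.value;
    } else if (op.op === "remove" || op.op === "replace") {
      if (!exists) throw new Error("no value at " + op.path);
      if (op.op === "replace") parent[key] = op.value;
      else if (isArray) parent.splice(key, 1);
      else delete parent[key];
    } else {
      throw new Error("unsupported op: " + op.op);
    }
  }
  return result;
}

const patched = applyPatch({ baz: "qux", foo: "bar" }, [
  { op: "replace", path: "/baz", value: "boo" },
  { op: "add", path: "/hello", value: ["world"] },
  { op: "remove", path: "/foo" }
]);
console.log(JSON.stringify(patched)); // {"baz":"boo","hello":["world"]}
```

Nothing here evaluates code from the patch; the only inputs are operation names, paths, and JSON values, which is what makes it portable to languages without a JavaScript engine.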

------
chris-teague
I would love to know what the PostgreSQL developers' thoughts are on this and
whether this could be in the pipeline as an extension to their growing JSON
support.

~~~
rpedela
I proposed this feature a few months ago. I am not sure if anyone is working
on it though.

[http://www.postgresql.org/message-
id/CACu89FSYikxdUj+J01BoAv...](http://www.postgresql.org/message-
id/CACu89FSYikxdUj+J01BoAvoyieaGsXr1wSo1oQgGBMXsBNoMhg@mail.gmail.com)

------
MBlume
This is cool! My only substantive comment is criticism, but this is cool.

I think it's a problem that remove and replace don't indicate what they're
removing or replacing, the same way .patch files indicate what lines they're
removing. This makes patch files reversible, which JSON patches appear not to
be. It also provides some redundancy/error correction, which is handy.
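
One way to picture the missing piece: an inverse can only be built if the old value is captured before the operation runs. The JavaScript sketch below uses hypothetical names (`valueAt`, `invertOp`) and handles object members only; array inserts and escaped paths would need more care.

```javascript
// Look up the value a path refers to (no "~" escapes in this sketch).
function valueAt(doc, path) {
  return path.split("/").slice(1).reduce(function (node, key) {
    return node == null ? undefined : node[key];
  }, doc);
}

// Build the operation that undoes `op`, given the document as it was
// *before* `op` is applied. Object members only.
function invertOp(docBefore, op) {
  const old = valueAt(docBefore, op.path);
  if (op.op === "remove") return { op: "add", path: op.path, value: old };
  if (op.op === "replace") return { op: "replace", path: op.path, value: old };
  if (op.op === "add") {
    return old === undefined
      ? { op: "remove", path: op.path }
      : { op: "replace", path: op.path, value: old };
  }
  throw new Error("unsupported op: " + op.op);
}

const docBefore = { baz: "qux", foo: "bar" };
console.log(JSON.stringify(invertOp(docBefore, { op: "remove", path: "/foo" })));
// {"op":"add","path":"/foo","value":"bar"}
```

The inverse depends on the document, not just the patch, which is exactly the information a .patch file carries inline and a JSON Patch does not.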

~~~
tracker1
I think that patch updates could be closer to the mongo syntax myself... I
also don't like the "path" following slash notation over dot notation.

I'm also not entirely sure about adjustments to arrays with this syntax.

------
buro9
I've implemented an API against this and think it's pretty good.

[http://microcosm-cc.github.io/#conversations-single-patch](http://microcosm-
cc.github.io/#conversations-single-patch)

We didn't implement the entirety of the standard as our resources are fairly
small and simple and it would have been overkill.

The scenario we had is that we needed to take a set of changes to a resource,
but parts of the JSON document are permissioned differently. i.e. You might
have permission to change the document body, but not some audit trail meta
data returned as part of the resource.

Treating patches as a batch of operations to be performed on a document
(usually within one transaction) is a good way to do the processing for this
scenario.

I know that there are other approaches, such as calculating a diff and just
sending that... but describing operations does make the system less fragile
and does make it easier for scenarios like the one we had where you could not
assume that the entire resource is under a single permission structure.

------
kosinus
We built a similar JavaScript library to create and apply JSON patches:
[https://github.com/Two-Screen/symmetry](https://github.com/Two-
Screen/symmetry)

The format is not formally specified in any way, but it looks like the biggest
differences are that we 1) make a distinction between object and array
operations and 2) don't encode paths, but instead nest patches.

In practice, we found our JSON structures are never deep enough for nesting to
be a problem. To save space on the wire, our patch object has single character
keys, such as `s` for set, `r` for remove, etc.

It looks like these libraries don't actually create patches? Symmetry has
routines to diff between two objects, and implements Myers' algorithm for
array comparisons. (Already fast, but can be further optimized.)

On the other hand, Symmetry doesn't have copy, move or test operations. Those
are interesting.

------
bkeroack
This is a great complement to the PATCH HTTP verb for RESTful APIs. It's much
better to have a standard way to describe how a resource should be changed and
not some ad hoc, implementation-specific syntax. Coincidentally, I just
implemented support for this in one of my projects today
([https://pypi.python.org/pypi/elita](https://pypi.python.org/pypi/elita)).

Here's a link to the original RFC from 2013:
[https://tools.ietf.org/html/rfc6902](https://tools.ietf.org/html/rfc6902)

A cool thing is the "test" operation. JSON patches are atomic, so you can put
in something like:

    
    
      { "op": "test", "path": "/foo/bar", "value": 32 }
    

...and if the document is { foo: { bar: 75 } }, none of the patch will be applied.
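
A minimal JavaScript sketch of that all-or-nothing behaviour, supporting only test and replace. `applyAtomically` is a hypothetical name, and comparing values via `JSON.stringify` is a simplification that is sensitive to object key order. All changes go to a copy, so a failed test leaves the original document untouched.

```javascript
// Apply "test" and "replace" ops to a copy; throw on the first failed test
// so that no partial result is ever returned.
function applyAtomically(doc, patch) {
  const draft = JSON.parse(JSON.stringify(doc));
  for (const op of patch) {
    const keys = op.path.split("/").slice(1);
    const last = keys.pop();
    const parent = keys.reduce(function (node, k) { return node[k]; }, draft);
    if (op.op === "test") {
      if (JSON.stringify(parent[last]) !== JSON.stringify(op.value)) {
        throw new Error("test failed at " + op.path);
      }
    } else if (op.op === "replace") {
      parent[last] = op.value;
    } else {
      throw new Error("unsupported op: " + op.op);
    }
  }
  return draft;
}

const current = { foo: { bar: 75 }, status: "draft" };
const guarded = [
  { op: "replace", path: "/status", value: "published" },
  { op: "test", path: "/foo/bar", value: 32 }
];
let outcome;
try {
  outcome = applyAtomically(current, guarded);
} catch (e) {
  outcome = e.message;
}
console.log(outcome);        // test failed at /foo/bar
console.log(current.status); // draft (the earlier replace was not kept)
```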

------
andrewstuart2
As long as they don't try to call it RESTful I think it's a good idea.

On the other hand, if HTTP/2 can minimize the overhead of just making many
consecutive HTTP calls, that feels a lot better to me than using what is
essentially a constrained RPC interface.

~~~
drderidder
I might misunderstand your comment but FWIW, buried in RFC 5789 the
description of PATCH calls for a "description of changes" to be sent, so JSON
Patch is trying to define a format for the set of changes to meet the
requirement for using PATCH (and implementing REST) correctly. Still, JSON
Patch is just too clunky compared to the elegant simplicity of JSON itself,
IMHO.

~~~
andrewstuart2
I definitely agree that it's a bit too clunky.

I think it's arguable though that PATCH itself isn't quite RESTful as it
doesn't describe the state of a resource at the identifier, but instead some
subset of the state of that resource. Doing that kind of destroys the
semantics of the resource identifier.

You'd probably still find me arguing for PATCH as it's obviously preferable
not to resend an entire resource to reflect a small change in that resource.

Perhaps the more elegant solution is to forget PATCH and use more nested
resource identifier semantics (never thought I'd say that) so that you can
appropriately identify the entire subset of the resource being altered.

~~~
drderidder
Hmm, I got the impression that REST purists _insist_ on using PATCH. Rails 4
uses it for their REST implementation, for example. Also check out Will
Durand's post [1]. I mused about the relative merits of PATCH vs POST in a
blog post [2] - if you have any references you wouldn't mind leaving in the
comments, I'd love to check them out.

PS. nesting resources is a pretty interesting issue in itself; there seem to
be a few different schools of thought about that, ie. using the '/' in URLs
like the '.' scoping operator on objects, versus a flat scheme where all the
different collections live at the topmost level, versus a combination of both
or even using one as an alias for the other.

1\. [http://williamdurand.fr/2014/02/14/please-do-not-patch-
like-...](http://williamdurand.fr/2014/02/14/please-do-not-patch-like-an-
idiot/)

2\. [http://51elliot.blogspot.ca/2014/05/rest-api-best-
practices-...](http://51elliot.blogspot.ca/2014/05/rest-api-best-
practices-3-partial.html)

~~~
steveklabnik
It's more complex than that: [http://roy.gbiv.com/untangled/2009/it-is-okay-
to-use-post](http://roy.gbiv.com/untangled/2009/it-is-okay-to-use-post)

~~~
drderidder
Great link, thank you!

------
dheera
Nice idea. However, what if we based it on MongoDB's syntax, for
compatibility?

    
    
        [
          { "$set": { "baz": "boo" } },
          { "$unset": { "foo": 1 } }
        ]

------
mnot
FWIW - we're gathering issues / improvements at: [https://github.com/json-
patch/json-patch2/issues](https://github.com/json-patch/json-patch2/issues)

~~~
kolev
JSON Pointer is pretty nasty. How can a sane person decide to escape "~" and
"/" with "~0" and "~1"?!

------
outdooricon
Would anyone be willing to explain when this is used? I'm a little confused,
is this format for patch updates from the client to the server? If so, then I
assume it's for NoSQL db's that have deep documents, so that you know what
part of the document needs to be updated... do you currently have to send the
entire document in a patch? Are there client side model libs that will track
object changes and format output in this way for the patch request (the js
libs listed on the page look pretty low level)? Or am I misunderstanding this
entirely...

------
ecksor
Anyone interested in this (JSON patch, CRDTs, operational transforms etc),
should definitely take a look at ShareJS:
[https://github.com/share/ShareJS](https://github.com/share/ShareJS)

There's been some discussion recently on the wave-dev list: [http://mail-
archives.apache.org/mod_mbox/incubator-wave-dev/...](http://mail-
archives.apache.org/mod_mbox/incubator-wave-
dev/201410.mbox/%3CCADrYLAg1AnkQpHk8GdfeQxJzWC2Ca%3D_7O4Zt4wd08dfksweGbA%40mail.gmail.com%3E)

------
jpierre
Relatedly, Mattt Thompson of Gowalla/Heroku/Panic/AFNetworking fame proposed
Rocket, a technique that pairs JSON Patch and Server-Sent Events to aid in the
construction of realtime apps via REST services. The proposal has seemingly
been abandoned, which is unfortunate as the formalization of handling was
interesting, even if one implemented it on top of a different transport layer
like Web Sockets or a push notification service.

[http://rocket.github.io/](http://rocket.github.io/)

------
EGreg
[https://www.npmjs.org/package/json-diff-
patch](https://www.npmjs.org/package/json-diff-patch)

~~~
hmbg
I've just used this to get a solid undo/redo in our WebGL editor. It works
amazingly well, and is very performant. However, our backend is not in
JavaScript, so if I wanted to use it for data transport I'd have to write a
compatible Python implementation. So I get the point of a standard, so long as
it's up to par.

------
cobralibre
For anybody working in the identity space, the SCIM 2.0 API for partial
modifications is based on RFC 6902:

[https://tools.ietf.org/html/draft-ietf-scim-
api-12#section-3...](https://tools.ietf.org/html/draft-ietf-scim-
api-12#section-3.3.2)

(SCIM is a set of specs describing a schema and a REST API for data exchange
between identity providers, more or less.)

------
icholy
I was experimenting with the idea of adding 'slices'. They let you do map like
transformations on arrays.
[https://github.com/icholy/JsonPatch.js](https://github.com/icholy/JsonPatch.js)

    
    
        /foo/1:3/bar
    

translates to

    
    
        foo.slice(1, 3).map(function (obj) { return obj.bar; });

------
tolmasky
I highly recommend taking a look at React's immutability helper update[0] for
these kinds of operations. Keeps your data structure persistent and
intelligently re-uses internal objects when possible.

0\.
[http://facebook.github.io/react/docs/update.html](http://facebook.github.io/react/docs/update.html)

------
sly010
React's update() helper is very similar. Though it's less generic, it's also a
bit less verbose.
[http://facebook.github.io/react/docs/update.html](http://facebook.github.io/react/docs/update.html)

------
DonHopkins
Google Wave did something like this. What protocol did they use to update
JSON, and is it modular and can stand on its own, without all the other Wave
stuff? Does anyone know what its features and drawbacks were?

~~~
espadrine
I'll help you, but you should know that what they do is nothing like that.
They used Operational Transformation, which has convergence properties that
guarantee you won't ever have merge conflicts. It also has, depending on the
implementation, certain causality guarantees that ensure that what you edited
stays the way you intended.

The basic idea is that the operations you receive are used to rewrite your local
operations. The challenge is in writing the function that takes any two
operations and transforms them. Therefore, the protocol is of no practical
interest; it simply serves that function's purpose.
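
As a toy illustration of such a function (not Wave's actual algorithm), the JavaScript sketch below handles only concurrent string inserts. The `site` field is an assumed tie-breaker for inserts at the same position, and all names are hypothetical.

```javascript
// Apply an insert operation { pos, text, site } to a string.
function applyInsert(text, op) {
  return text.slice(0, op.pos) + op.text + text.slice(op.pos);
}

// Rewrite `op` so it can be applied after the concurrent insert `other`.
// If `other` lands earlier (or at the same spot with a lower site id),
// `op` has to shift right by the length of the inserted text.
function transformInsert(op, other) {
  const otherGoesFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherGoesFirst
    ? { pos: op.pos + other.text.length, text: op.text, site: op.site }
    : op;
}

const base = "abc";
const a = { pos: 1, text: "X", site: 1 }; // made on one client
const b = { pos: 2, text: "Y", site: 2 }; // made concurrently on another

// Each side applies its own op first, then the transformed remote op.
const viaA = applyInsert(applyInsert(base, a), transformInsert(b, a));
const viaB = applyInsert(applyInsert(base, b), transformInsert(a, b));
console.log(viaA, viaB); // both clients end up with the same text
```

Deletes, and the structured operations needed for JSON, add many more cases to the transform function, which is where most of the difficulty lies.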

Here's a set of libraries that implement this for strings:
[https://github.com/Operational-
Transformation](https://github.com/Operational-Transformation)

Implementing this for JSON is certainly possible, and has been done. See this
for instance:
[https://github.com/share/ShareJS](https://github.com/share/ShareJS)

------
madskristensen
Here's the JSON Schema draft v4 for JSONPatch documents
[http://json.schemastore.org/json-patch](http://json.schemastore.org/json-
patch)

------
alexchamberlain
If a data format does not easily represent what you are trying to convey, why
use it? ie should a patch be encoded in JSON at all?

I can understand not using a raw text patch, however.

------
hardwaresofton
Wow was actually working on this with a friend at a previous job, glad to see
someone has done it (and is trying to standardize).

------
EGreg
I once wrote JSON.diff and JSON.patch functions to do this thing.

Then I kind of stopped because I didn't see the point

------
pistle
it's like xml diffgrams for json?

that wouldn't foretell an optimistic future for this.

------
fiatjaf
Is Choco Leibniz similar to Digestive? Better?

