Select Transform: JSON template over JSON (selecttransform.github.io)
89 points by gliechtenstein on Oct 18, 2017 | 75 comments

Gotta love JSON. It's XML without the tools so you get to reinvent everything to pad your resume.

    XML -> JSON
    WSDL -> JSON schema
    XPath -> JSON path
    XSLT -> JSON template?
    SOAP -> Swagger and friends
One or two decades from now there will be too many tools and things to learn about JSON so another generation will reinvent a new "perfect format for everything". And a new cycle will have started.

I kinda get this criticism, but this is also a bit like lamenting that we write the same libraries for different programming languages. I mean... yeah. Some ideas are good across protocols. That's why they're copied.

On a merits-based thing, if you're just looking at XML, it has a lot of things going against it. If someone sends me something "in JSON", I can have a good guess what it'll look like. Not as much luck with XML. XML punishes you for using attributes (because you can't place composed types within them), but the awkward alternative is placing attributes in sub-elements.

I have a list of products with prices. I store the price as a quantity attribute on a <product>. I decide later on to store currencies for all my "money" objects. Do I now double the amount of attributes? It's all super awkward, and JSON's mental model is more straightforward, even if you end up with approximations.

Why is it super awkward and how does JSON solve this problem any better?

When you’re designing an XML schema, answering the question “should this be an attribute or a nested element” is awkward because there is rarely an obvious correct answer.

With JSON, the answer is always a name/value pair.
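To make the price example concrete, here is a minimal Python sketch. The product data and field names are made up for illustration. In JSON, adding a currency means swapping a scalar for a nested object under the same key; in XML the same change forces a choice between a second attribute and a child element.

```python
import json
import xml.etree.ElementTree as ET

# JSON: the price starts as a plain number...
product_v1 = {"name": "widget", "price": 9.99}
# ...and later becomes a nested object under the same key.
product_v2 = {"name": "widget", "price": {"amount": 9.99, "currency": "USD"}}

# XML option 1: keep attributes and add a second, parallel attribute.
as_attributes = ET.Element("product", name="widget", price="9.99", currency="USD")

# XML option 2: promote the price to a child element with its own attribute.
as_element = ET.Element("product", name="widget")
ET.SubElement(as_element, "price", currency="USD").text = "9.99"

print(json.dumps(product_v2))
print(ET.tostring(as_attributes, encoding="unicode"))
print(ET.tostring(as_element, encoding="unicode"))
```

Both XML variants are valid; the point is only that the author has to pick one, and consumers have to know which was picked.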

That's a problem with XML Schema, not XML.

A schema language in which subordinate values being presented as attributes vs. elements was a presentational option in the concrete XML representation rather than a dictate of the schema language would solve this simply. There are already multiple schema languages for XML; needing a better schema language doesn't begin to provide a reason for a different representation language, since schema and representation language are separate concerns.

JSON has data types built into it, XML needs its data type understanding defined by an external technology - often in the form of an irritating schema language that is hard for many people to understand, extraordinarily verbose, and which also has the downside of not being able to describe many of the constraints often encountered in actual XML formats.

JSON is a lot easier to read.

    {
      "employees": [
        { "name": "John Crichton",
          "gender": "male" },
        { "name": "Aeryn Sun",
          "gender": "female" }
      ]
    }

        <name>John Crichton</name>
        <name>Aeryn Sun</name>

It's a lot easier to read when a data format is written in JSON; when a document format is written in JSON, the reverse is generally true (minus incidents where the format is obnoxiously constructed so as to defy understanding by outside parties).

I hate those commas, forgetting them or having too many. So therefore I use YAML. Still way easier to read.

        - name: John Crichton
          gender: male
        - name: Aeryn Sun
          gender: female

I like YAML and do find it slightly easier to read than JSON. But it hasn't won the popularity contest (yet), for whatever reasons.


      <employee name="John Crichton" gender="male" />
      <employee name="Aeryn Sun" gender="female" />

I believe that was OP's point. Using the annotation you cannot add an attribute with subattributes. What if you wanted to add the employee's last six salaries? Suddenly you need to restructure your XML:

        <name>John Crichton</name>
        <name>Aeryn Sun</name>
Using JSON, you just do: "salaries": [10000, 10000, 10000]

JSON is not only less verbose, it is also more flexible and easier to read and understand. You don’t have to worry about whether to use tags or attributes because you might later need sub-tags, and that makes it far easier to use (and honestly to parse as well, because many XML documents I have gotten are very inconsistent).

Why do you need to restructure the xml? You can just keep the attributes. To add the salaries there are quite a few options, as XML is just as flexible as well.

Using a List (XSD List) type:

      <employee name="John Crichton" gender="male" salaries="10000 10000 1000" />
      <employee name="Aeryn Sun" gender="female" />
And using mixed content (complexType):

      <employee name="John Crichton" gender="male">
      <employee name="Aeryn Sun" gender="female" />
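For the list-type option, reading the values back is a one-liner in most languages. A rough Python sketch, assuming an `<employees>` wrapper element that is not in the comment above; note that `xml.etree.ElementTree` does not validate against an XSD, so the split is done by hand:

```python
import xml.etree.ElementTree as ET

# The <employees> wrapper is an assumption made so the snippet parses on its own.
doc = ET.fromstring(
    '<employees>'
    '<employee name="John Crichton" gender="male" salaries="10000 10000 10000" />'
    '<employee name="Aeryn Sun" gender="female" />'
    '</employees>'
)

def salaries_of(employee):
    # An XSD list is whitespace-separated, so str.split() is enough here.
    # A missing attribute yields an empty list.
    return [int(value) for value in employee.get("salaries", "").split()]

for employee in doc.findall("employee"):
    print(employee.get("name"), salaries_of(employee))
```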

Still more typing and less readable than the JSON. I happily used XML for years and have nothing against it. But JSON is almost always more readable. Big deal. Technology evolves. Trends that aren't an evolution usually fade. JSON does not seem to be one of those. It offers enough advantages over XML to have quickly replaced it in many cases.

Whether it is easier to read or not is entirely subjective. There is no quantifiable measure to state either way. Both have their use-case; one now more than the other.

Yeah, makes total sense to traffic in XML then convert back and forth to/from JSON for web layer. Gotta love XML.

I tweeted in Apr. 2016,

"You know how you feel when you have to parse XML from 2005? That’s how we’ll all feel about JSON in 2025. It’s not bad, just bloat-y."

XML, JSON, gRPC… These are all technologies to solve a problem. The trouble is people confuse technology with a business solution and they start applying it as though by "having a REST API" you've done something. You can't say you've accomplished anything until you actually demonstrably deliver value!

Seems like JSON should have already been overturned. It hasn't and I don't think it will. My personal belief was that XML was kicked out because it allowed this: <books:book> stuff here <xsd:schema .../> <xslt:transform .../> </books:book>

And so you couldn't just READ that, you had to have a parser that fully understood EVERYTHING going on there. JSON won't get namespaces, and therefore it will remain good. We will continue to replace what namespaces did for XML, but will keep that data in its own metadata document, outside the main document. I think that's why JSON won't be replaced as quickly as XML.

Also XML didn't kill XML, webservices (ws-* standards) killed XML.

Exactly my observation! JSON became famous as a revolt against the complex XML ecosystem, but it reinvented a square wheel. YAML is no different - its specification and feature set are quite overwhelming.

There's so much quality work done with XML that all this effort is a complete waste. From streaming parsers to the XQuery, and great projects such as VTD-XML [0], it's sad to see us spinning wheels. I see TOML as another wasteful attempt at something that's only slightly better but requiring a new toolchain and an ecosystem.

[0]: https://vtd-xml.sourceforge.io/

The fact that all these use-cases pop up again tells us that they're real. I was dubious the first-time-around =)

Well in green-field projects it's certainly advisable to minimize what is "done in"/"expressed in"/achieved-via the current-interchange-format de-jour --- but in messy real-world projects such rabbit holes can unavoidably open. And then you sooner or later end up thinking, as I did back then, "XML is kinda sucky but XPath and XSLT make it actually pretty nifty & fun to deal with and somewhat escape its own very suckiness".

> It's XML without the tools.

XML can be really powerful, but a lot of real users actually don't know how to use XML. Many just use it as an enterprise version of JSON in which case I am happy to just work with JSON, plain and simply.

Also JSON processing can be easily parallelized, because it is easy to express in a newline delimited format (http://ndjson.org/).
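A minimal sketch of why the newline-delimited form splits up so easily: every line is a complete JSON document, so lines can be handed to workers independently. The sample records are made up, and a thread pool is used only to keep the example short; for CPU-bound parsing in CPython a process pool would be the more likely choice.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Two records in newline-delimited JSON: one complete document per line.
ndjson_text = "\n".join([
    '{"name": "John Crichton", "gender": "male"}',
    '{"name": "Aeryn Sun", "gender": "female"}',
])

def parse_line(line):
    # No shared parser state is needed; each line stands on its own.
    return json.loads(line)

lines = [line for line in ndjson_text.splitlines() if line.strip()]

with ThreadPoolExecutor(max_workers=2) as pool:
    # pool.map returns results in input order.
    records = list(pool.map(parse_line, lines))

print(records)
```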

And it goes on...

    XML comments -> JSON: nah -> YAML
    XML CDATA (multiline raw strings) -> JSON: nope -> YAML

Yep. I agree. Although maybe XML schema (https://www.w3.org/XML/Schema) -> JSON Schema. And then there's also XQuery, DTDs, namespaces, entities,… which are also in need of reinventing I guess.

JMESPath [0], started by Amazon and used by Microsoft in Azure, is something that tries to accomplish some of the XQuery goals in a simpler way.

[0]: http://jmespath.org/

Oi... XML wasn't the first format to be used for data interchange and encounter these problems either. At least JSON is designed for data interchange in the first place...

Too many tools? Are you not selective about what tools you use anyways? The field is decades old. Do you program in assembly? Fortran?

I'm not sure why we didn't at least adopt JSON5 [0] or Hjson [1]!

[0]: http://json5.org/

[1]: http://hjson.org/

WSDL -> JSON schema is the wrong mapping, obviously should be XML Schema (WSDL is an awful web services things - so is there some combination of a router and JSON schema spec somewhere out there?)

> WSDL is an awful web services things

The thing is, when creating webservices you like creating tests. With WSDL you have tools which given a WSDL will generate a test-suite for you and some mocked services.

With JSON you'll have to hack-up something using Swagger.

I think there's a lot of useful unexplored territory in this sort of tool. So much work nowadays is happening in maps/dictionaries/what-have-you, with a lot of work being, essentially, data shape transformation.

Despite this, many languages aren't great at this! We have tools that work alright on first-order operations but fall apart on higher level things.

Even really simple things like "conditionally include a key" are rarely supported in a way that leads to concise code. How many times have we all written

  foo = {...}
  if bar:
     foo['a'] = x
instead of something like

   foo = { ... , 'a': x if bar }
and then having bugs because of branches? Tools that solve this sort of problem will be to modern programming what things like the Unix shell was to data mungers in the past.

Haskell lenses kinda goes into this stuff, though Haskell itself is a bit hard to use in a lightweight fashion. Clojure has Specter. But still looking for some more stuff to fill this hole, especially in the "transitionally typed" stuff like Typescript or Python + Mypy

IMHO any tool that converts JSON to JSON is pointing to the wrong direction.

JSON should strictly be used as a serialization format and everyone who touches it should immediately parse it into a typed data-structure that represents your domain better.

In fact you really should represent your data in your code using whatever native object hierarchy / algebraic data types / option types / etc. your language supports. (Algebraic datatypes are awesome at this sort of thing, it allows you to avoid the aforementioned conditionality)
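As a rough illustration of "parse it into a typed structure at the boundary", here is a Python sketch using dataclasses. The `Product` and `Money` types and the JSON shape are hypothetical; the idea is only that the rest of the program sees `Product`, never a raw dict.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Money:
    amount: float
    currency: str

@dataclass(frozen=True)
class Product:
    name: str
    price: Optional[Money]  # None models "no price" explicitly

def parse_product(raw: str) -> Product:
    # All knowledge of the wire format lives in this one function.
    data = json.loads(raw)
    price = data.get("price")
    return Product(
        name=str(data["name"]),
        price=None if price is None else Money(float(price["amount"]), str(price["currency"])),
    )

product = parse_product('{"name": "widget", "price": {"amount": 9.99, "currency": "USD"}}')
print(product)
```

A missing `name` raises a `KeyError` right at the boundary instead of surfacing later as a `None` somewhere deep in business logic.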

The problem is the right way to do this is also the hard way, because people hate writing "boilerplate" classes and serialization code and once you start thinking about making it "easy", you end up using json as the in-application data structure because "who needs types anyway".

You are very correct. Boilerplate (de)serialization code is 100x better to have to write than code with JSON-blobs pretending to be business objects.

Since you're using Python for your example, it's worth mentioning that 3.5 extended the unpacking syntax [1] to sort of do what you want.

    foo = {
        'a': 1,
        'b': 2,
        **({
            'c': 3,
            'd': 4,
        } if bar else {}),
    }
This adds `c: 3` and `d: 4` into `foo` conditional on `bar`.

[1] https://docs.python.org/3/whatsnew/3.5.html#pep-448-addition...

oh, this is most definitely a thing I want. Not sure if my coworkers would appreciate this however. Would still have liked a bit of a simpler syntax for inclusion but this could be worth the ugliness

Nice that this works on lists as well.
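For anyone who wants to try it, a small self-contained sketch of both forms (Python 3.5+). The `build` function and its values are just for illustration:

```python
def build(bar):
    # Dict: unpack an extra mapping only when bar is truthy.
    foo = {
        'a': 1,
        **({'c': 3} if bar else {}),
    }
    # List: the same trick with iterable unpacking.
    items = [
        1,
        2,
        *([3, 4] if bar else []),
    ]
    return foo, items

print(build(True))
print(build(False))
```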

Yeah, I personally find this sort of syntax convolution ugly and hard to read but I thought it relevant. Enough rope to hang yourself with and whatnot.

> Haskell lenses kinda goes into this stuff

They seem neat but I haven't come across a case where whipping up, in vanilla old-school Haskell, a custom set of small simple ADTs and the related handful of recursing walk-and-transform functions didn't do the job perfectly cleanly and comprehensibly.. this sort of stuff is already very compact to write and code-gen like TH seems almost overkill to me. Guess I'm itching to come upon a much more painful scenario to finally give them "lenses" a fair chance.

If you like writing

    let x = y { foo = foo y { bar = bar (foo y) + 1 } }
instead of

   let x = over (foo.bar) (+1) y
then more power to you!

A neat example indeed! Yeah that makes sense. Especially I guess in code-bases where such deeply-nested record updates abound everywhere repeatedly time-and-again --- have not run into such myself yet. Writing the above on-the-rare-occasion is hardly troublesome. But I get the idea there now. (Although in the above, half the verbosity comes from using records instead of ADTs' ctors and I believe aren't there common idiomatic "standard" GHC lang-extensions to trim record updates in a shorter fashion --- ah well, been a while, I'm dabbling more in PureScript these days which sports a terser record-updates notation anyway =)

It's even worse if you don't have records!

    let x = Y (Foo (old_bar + 1) quux) baz
        where Y (Foo old_bar quux) baz = y

>How many times have we all written

A decent amount, but I'd rather have all of that separate because it makes the code easier to read.

I'm not interested in how concise a block of JSON text can be with all of my transformations of conditionals and loops. That's going to be difficult to read without highlighting. I also can't put a breakpoint in the middle of some JSON text without the appropriate support from the language/runtime/IDE.

However, this syntax is much better than JSON Patch which I've considered using for document templates.

agree that readability is super important.

I think there's a bit of a conflict with regards to immutability. When I see a variable in the code, it's much easier to handle a single definition rather than a "half-definition" followed by a bunch of conditionals.

I've also seen some waaay too concise code that becomes super legible after just one extra line of code to split things up.

I've run into the opposite problem a lot though. Even if every operation is simple, the length of code makes the intent unclear, especially in cases involving aggregation. Death by a thousand cuts

   results = []
   for elt in some_other_array:
      if elt is None:
         results.append(0)
      else:
         results.append(f(elt))

   results = [ f(elt) if elt is not None else 0
               for elt in some_other_array ]
In one example we have to read several lines to figure out some basic stuff, but in the other I can tell even at a quick glance that we have a mapping from one container to another, an expectation they're the same size, etc.

Clarity is the most important, and after a couple conditionals it makes sense to be more explicit about control flow. But many common tasks are not _about_ control flow, even if C-isms make us write them as if they do. "if" statements generate so many bugs, eliminating them is great.

What's wrong with:

    return {
        labels: data.items.map(function(value){
            return {
                type: "label",
                value: value > 10 ? value : value * 100
            };
        })
    };
I hope your project does not get too popular.

Even cleaner with ES6.

return { labels: data.items.map(value => ({ type: "label", value: value > 10 ? value : value * 100 })) };

Looks cool, but I can't help but think that you're just reproducing what map() and filter() do, without any real benefits?
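For comparison, the same labels transform as a plain Python comprehension. The input shape `{"items": [...]}` is assumed from the JavaScript snippets above, and `to_labels` is a made-up name:

```python
def to_labels(data):
    # Map every item to a label object; values of 10 or less are scaled by 100.
    return {
        "labels": [
            {"type": "label", "value": value if value > 10 else value * 100}
            for value in data["items"]
        ]
    }

print(to_labels({"items": [1, 20]}))
```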

I perceive it being middleware as the benefit. For when your client isn't set up to manipulate the incoming data and you can't edit the backend resource that generates it. Or vice versa.

This seems pretty interesting, and I will be happy to play around with it.

I am a bit concerned that there is no discussion in the docs of the potential security risks of allowing direct native JS execution of arbitrary instructions passed in by an untrusted source.

I use a project called JSONLogic (jsonlogic.com) which bears some similarity to ST in terms of being able to select and transform values. The biggest advantage I see with it is, unless you explicitly plug in a rule that parses and executes user data, there is no way for the data to "escape the sandbox". This means you can safely build a query syntax on top of it where you can directly consume the arbitrarily complex query from an untrusted source and execute it in a secure manner.
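To show what "no way to escape the sandbox" means in practice, here is a tiny Python evaluator in the same spirit. It is loosely modelled on the `{"operator": [args]}` rule shape and is not JSONLogic's implementation; the operator table and function names are my own. The key property is that a rule can only name operations from a fixed whitelist, so nothing in the rule is ever executed as code.

```python
import operator

# Whitelist of operations; anything not listed here cannot be invoked.
OPERATORS = {
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "+": operator.add,
}

def lookup(data, path):
    # Resolve a dotted path such as "pie.filling" against nested dicts.
    for key in path.split("."):
        data = data[key]
    return data

def evaluate(rule, data):
    # Anything that is not a single-key dict is treated as a literal value.
    if not isinstance(rule, dict) or len(rule) != 1:
        return rule
    (op, args), = rule.items()
    if op == "var":
        return lookup(data, args)
    if op == "and":
        return all(evaluate(arg, data) for arg in args)
    if op not in OPERATORS:
        raise ValueError(f"operator not allowed: {op!r}")
    left, right = (evaluate(arg, data) for arg in args)
    return OPERATORS[op](left, right)

rule = {"and": [
    {"<": [{"var": "temp"}, 110]},
    {"==": [{"var": "pie.filling"}, "apple"]},
]}
print(evaluate(rule, {"temp": 100, "pie": {"filling": "apple"}}))
```

An untrusted rule such as `{"eval": [...]}` is rejected with a `ValueError` rather than run.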

Hi, my name is Ethan. I'm the creator.

I thought I would provide some context on why I wrote this library, and how I'm using this right now. So here it goes:

Other than ST.js, I also work on another open source project called Jasonette (https://www.jasonette.com), which lets you write an entire native iOS/Android app in nothing but JSON markup.

And when you can express the entire app logic--from model to view to controller--in JSON, you can split them up whichever way you want and load them from anywhere (from a file, from cache, from a remote server, or from local memory).

But the biggest benefit of all is: you can load an entire iOS/Android native app from the server in realtime, just like how web browsers load HTML/JS/CSS in realtime.

When working on Jasonette, implementing model and view was relatively simple. For model it's natural since JSON is all about describing data. For view I just needed to come up with a syntax to describe layouts and all the standard mobile UI components in JSON.

However the biggest challenge was how do I actually describe functions in JSON. Without a function, it's just a mockup and won't really do anything meaningful.

Which brings us to ST.js.

When you think about what a function is, it takes one value and turns it into another. So basically what I needed to do was build something that will take one JSON, and turn it into another JSON, but most importantly I would have to do it using JSON. Basically I needed to implement a finite state machine in purely JSON.

And this is what templates do. So I set out to build a JSON template engine that turns one JSON into another using a template which itself is written in JSON.
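To illustrate the general idea of "a JSON template that turns one JSON into another", here is a toy Python sketch. It is not ST.js and does not implement its template language; the `render` function and its simplified `{{path}}` handling are made up for this example.

```python
import re

# Matches a placeholder such as {{user.name}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def lookup(data, path):
    # Resolve a dotted path against nested dicts.
    for key in path.split("."):
        data = data[key]
    return data

def render(template, data):
    # Walk the template recursively; only strings can contain placeholders.
    if isinstance(template, dict):
        return {key: render(value, data) for key, value in template.items()}
    if isinstance(template, list):
        return [render(item, data) for item in template]
    if isinstance(template, str):
        whole = PLACEHOLDER.fullmatch(template)
        if whole:
            # A string that is only a placeholder keeps the looked-up value's type.
            return lookup(data, whole.group(1))
        return PLACEHOLDER.sub(lambda m: str(lookup(data, m.group(1))), template)
    return template

template = {"title": "Hello {{user.name}}", "count": "{{user.posts}}"}
data = {"user": {"name": "Ethan", "posts": 3}}
print(render(template, data))
```

Because the template is itself plain JSON-compatible data, it can be stored or sent over the wire like any other document, which is the property the comment is after.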

What's really cool about this is, since the template is JSON (As far as I know there doesn't exist any prior art that uses JSON as a template, otherwise I would have used it instead), it has all the benefits of the JSON format itself:

1. You can store it anywhere (Most DBMS nowadays support JSON natively)

2. You can send it over the Internet

3. You can compose them easily

4. You can validate them using JSON schema

5. etc.

To use a more specific example, I use ST.js in both Android and iOS versions of Jasonette as the template engine. And a JSON template is absolutely necessary in this case.

For example, if I want to take a piece of device-generated data and render it, I need to be able to somehow parse it client-side (can't send it back to server to re-generate a JSON) http://docs.jasonette.com/templates/#3-device-api-generated-...

This also applies to many cases where the client-side data contains privacy-sensitive data. The only way to dynamically parse something that happens on the client side is by writing the template itself in JSON and sending it over as data.

Anyway, I hope you guys take a look at the examples on the homepage to get a glimpse of what makes ST.js powerful. Each example is almost its own library, except that you don't need a separate library for those purposes since all you're dealing with is JSON.

The project is interesting... But I don't understand your point about JSON vs JS.

1. You can store JS anywhere (all DBMS support text).

2. You can send it over the Internet... Like everything else nowadays.

3. You can compose JS easily.

4. You can validate JS using linters.

5. etc.

Besides... Aren't you only disguising functions in strings? You can eval strings anyway.

This is cool, nice job. And thanks for open sourcing the Jasonette code.

I've done something similar to ST out of the same frustrations at work, but my use case was way more specific, and you took it way farther with better execution.

So Jasonette functions are just about transforming the model, all in a reactive way? Like view->events->transform model->view?

It can transform anything that's written in JSON, which means both model and view and anything else.

One of the use cases for ST.js in Jasonette is dynamic client-side rendering, where it takes dynamic data and renders it against the template, after which it becomes native components.

Another use case is implementing actual functional programming. This one is not as straight-forward since it involves maintaining another separate state machine for function call stack. But this is also achieved using ST templates. Here's a blog post that talks about it in detail http://blog.jasonette.com/2017/02/15/functional-programming-... but it's a bit involved.

Also Jasonette has something called mixins, which you probably can guess based on its name. It lets you mix in JSON objects into another, so that you can make the app modular. That's also achieved internally using ST.

Overall, I believe even I'm just scratching the surface of what this can do, which is why I decided to take some time to clean things up and prepare and open it up as its own repo, because I think there's tons of other things that can be done by taking advantage of the fact that all that's happening is:


Hope this makes sense!

> Another use case is implementing actual functional programming

I remember our chat 8 months earlier on Reddit about this: https://www.reddit.com/r/functionalprogramming/comments/5ufm...

Yeah. You're still aiming for "functional programs encoded as (essentially) s-expressions but written in JS-Object-Notation (JSON) instead of LISPy parens". And that's still fine if you see that as a major leap forward. Not sure about the audience, those who want to "functionally program a mobile quasi-native APP that can self-update from a server" and know what the "functionally program" part means, wouldn't they reach for JS via React Native? Same selling points implicit already. Is it for those who never programmed but are expected to declaratively express functional idioms in your JSON notation without needing to install all sorts of dev tools and SDKs? Not entirely implausible at all I guess. Curious to see who will end up as your target audience. =)


That's the first thing that came to my mind too.

XSLT is very powerful but also incredibly messy because it has to be expressed in XML.

Doing the same in JSON seems like it would result in the same problem.

> XSLT is very powerful but also incredibly messy because it has to be expressed in XML.

Being expressed in XML does not help, but the language is just bad, with implicit data flowing through and very odd constructs. There are XML template languages which I found much more readable than XSLT, e.g. I remember my experience with Genshi much more fondly than the years I spent with XSLT.

all code generators that are not code end up being like that. and just yesterday there was an article about that magazine reporting the end of programming because of some BASIC code generator. heh.

dbaseIIIplus was the one that actually almost managed this feat.

I could imagine use cases for this for example for REST apis that consume json and then have to transform it generically to some other json for db storage in a document database. The internal storage could change over time with the outside spec staying the same, and such transforms could be used to do the mapping per version.
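A rough sketch of that per-version mapping idea in Python, with the mappings kept as plain data so they could themselves be stored as JSON. The field names, version numbers, and the `to_storage` helper are all hypothetical:

```python
# One declarative mapping per storage version: target field -> source path.
MAPPINGS = {
    1: {"title": "name", "cost": "price.amount"},
    2: {"title": "name", "cost": "price.amount", "currency": "price.currency"},
}

def lookup(data, path):
    # Resolve a dotted path against nested dicts.
    for key in path.split("."):
        data = data[key]
    return data

def to_storage(payload, version):
    # The public payload shape stays fixed; only the mapping changes per version.
    mapping = MAPPINGS[version]
    return {target: lookup(payload, source) for target, source in mapping.items()}

payload = {"name": "widget", "price": {"amount": 9.99, "currency": "USD"}}
print(to_storage(payload, 1))
print(to_storage(payload, 2))
```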

This particular implementation looks a bit messy though - XSLT is valid XML and Xpath, while this solution uses templating syntax. That's more human readable, but also slightly inconsequent - you wouldn't be able to validate such a template with json schema for example.

> That's more human readable, but also slightly inconsequent - you wouldn't be able to validate such a template with json schema for example.

Could you elaborate?

Just to use the json-schema example - because the transform uses templating syntax inside strings, I wouldn't be able to write a json-schema to check if a given transform object is valid or not, which can quickly become an issue for growing sizes of transforms.

Oh man I have nightmares of XSLT from an internship I had a few years ago. Thankfully there's an XSLT guru on stack overflow that would answer every question I posted.

This is the guy: https://stackoverflow.com/users/317052/daniel-haley

Do people not remember how terrible XSLT was?

I mean, sure, it could do the job, but did anyone ever sit down at a file of XSLT and think "yes, this is the best way this information about transforming this XML could be represented"?

JSON is even less suitable for this. It's horrible to read this stuff, horrible to write.

XSLT was, and still is (imho) a great language - at least after 2.0.

There's a steep learning curve full of false summits, for sure, but once you grasp it, there is always an elegant way of solving a problem using it. Sometimes there's even a fast way!

Use of it in the wild, though, leaves a lot to be desired. The false summits lead people to write bad code, the bad code gives it a bad rep, the bad rep leads people to dislike using it, people's dislike leads to then writing bad code... and so on.

I just don't buy it - sure it can be used to do good things, but it's not about learning curves or "grasping it" - it's about "is this a good way for humans to write and read this transformation", and the answer is clearly no.

XML is not a good form for a transformation language. It's not the quality of the code written in it or anything like that.

Take XSLT and change the syntax, and sure, it might be great. You can't ignore the terrible, terrible choice of making the transformation itself XML. It's completely unreadable and horrific to write.

Conversely, I personally find it very readable, and enjoyable to write; don't forget, when people talk about using the right tool for a job, it's not all about the language and the task - the dynamic between the person and the language is just as important :-)

The vast majority of people find XSLT to be an unreadable mess - and for any non-trivial project that requires collaboration, just because it's possible for one person to like it doesn't make it a good choice.

If most people agree a representation is terrible, it's a bad choice for anything serious, because suddenly only a subset of your team or contributors can work on it effectively.

There is a reason XML isn't used as syntax for programming languages. Anyone claiming that an `if` construct, for example, is well represented by a mess of XML tags has a different brain to me and the vast majority of people I've met.

XSLT was meant to be a general-purpose tool. Having a syntax that the vast majority of people find incredibly hard to work with is clearly a huge problem with the spec.

Have you considered reaching out to DevOps folks to consider something like this for complex configurations?...There are related solutions such as jsonnet/ksonette but I think something like this might be preferable.

I work on a similar problem, though using a UI-based approach. I gave a demo to the Kubernetes SIG App group a while back, which can be found here: https://www.youtube.com/watch?v=alEzE8MSNaI&t=30m

Btw Ethan, maybe my email is landing in your Spam folder? I've tried reaching out :)

Thanks! I actually built this library for my own usage (Jasonette) and decided to open source it because I realized it can be powerful for all kinds of other purposes.

In essence ST.js is a finite state machine implemented purely in JSON. So I can imagine it being used for things I haven't thought of, which is why I open sourced it. Would be really cool.

Please feel free to reach out at ethan.gliechtenstein[at]gmail thanks!

Now that's a goddamn demo. No fussing about.

I can't wait to play around with this.

Currently I'm having issues trying to figure out a new stack where we can't figure out where to put the backend for front-end logic in a React app over the top of GraphQL. I've been thinking of implementing something like this, as the transforms or the formatters don't make a ton of sense to be on the client or the API or in GraphQL, and I'm loath to make another API call in the chain as it feels like there are enough of those already.

Would that be a good use case for this?

Creator here. Yup, in fact you can check out something like that here https://github.com/SelectTransform/JSONQL

Basically you can send the JSON template over JSON to the server, and let the server transform the raw data using the template before returning.

The most important benefit here is that, since everything can be expressed in JSON, it can be stored or transmitted over the Internet, which enables interesting cases like this.

I like the fact that this exists, but I prefer to use jq.

I also would like to know in which situation this is preferable to jq.

There's also http://jmespath.org -- the syntax for which is supported in AWS CLIs.

Not sure which one is better at this point. Using jq still often feels like trying to decode a code golf contest entry.

Really? I find jq syntax rather pleasant to read and write.

The jq tool is great. Start using it today as a JSON pretty printer and learn its more powerful features (selection, object creation) as you need them. The only thing I would like to add to jq would be the ability to use multiple cores.

Just saying learn ramda

This is game changing.
