> How often do you believe this really happens in practice?
Regularly, if you're refactoring code. Otherwise the code becomes unchangeable, because it's too big a burden to change by the time it's clear it needs to change.
> And does that truly outweigh the benefit of being able to define a precise contract on your APIs?
I would point you to the XML standards which allowed people to do exactly that, and instead JSON won.
Are we talking about a published library or your internal-only code? If the former, I sympathize with the argument that relaxing a requirement should not force consumers to change their code. If the latter, then I find it much harder to sympathize. You're already refactoring your code; what's a few more trivial syntactic changes? You could almost do it with `sed`.
> I would point you to the XML standards which allowed people to do exactly that, and instead JSON won.
You know, this is an interesting point. And I guess I'm consistent, because I absolutely hate JSON. I've only had to work with XML APIs a few times, but every time, it was perfectly fine! I could test my output against a DTD spec automatically and see if I did it right. It was great. JSON has JSON Schema, but I haven't bumped into it in the wild at all. So it seems like "we" have definitely chosen to reject precision for... readability, I guess?
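(For the record, the kind of check I mean -- just a sketch using lxml, with made-up file names:)

    from lxml import etree

    dtd = etree.DTD(open("spec.dtd"))       # the published contract
    doc = etree.parse("my_output.xml")      # what my code produced
    if not dtd.validate(doc):
        print(dtd.error_log.filter_from_errors())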
You might really enjoy going and reading about CORBA and SOAP -- two protocols that have tight contracts. I'm sure you can still find Java/JavaScript libs that support both. And if you really, really want, you can put them into production -- CORBA like it's 1999, while singing along to the Spice Girls.
And what you'll find is that the tighter the contract, the more miserable the change you have to make when it changes. It's one thing if it's in one code base, it's another if it affects 10,000 systems.
I'll admit that I've never deployed a service with 10,000+ clients.
And CORBA (after looking it up) seems to include behavior (or allow it, anyway) in the messages. That's about much more than having a precise/tight contract on what you're sending. It's much more burdensome to ask someone to implement so much logic in order to communicate with you. I'm fine with the contracts only being about data messages.
SOAP is closer to what I'm talking about. Or even just regular REST with XML instead of JSON.
I'm asking genuinely, how would life be worse between a REST + XML and a REST + JSON implementation of some service? In either case, tightening a contract will cause clients to have to firm up their requests. In either case, loosening requirements (making a field optional, for example) would not require changes in clients, AFAIK.
The only difference that I see is that one can write JSON by hand. And that's fine for exploring an API via REPL, but you surely don't craft a bunch of `"{ \"foo\": 3 }"` in your code. You use libraries for both.
It just seems insane that we don't have basic stuff in JSON like "array of things that are the same shape".
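(To be fair, JSON Schema can say exactly that -- a sketch using the third-party jsonschema package, with a made-up item shape -- it's just that hardly anyone actually ships a schema:)

    from jsonschema import validate

    # "array of things that are the same shape"
    schema = {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
            "required": ["id", "name"],
        },
    }
    validate([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], schema)  # passes
    validate([{"id": "oops"}], schema)  # raises ValidationError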
> And CORBA (after looking it up) seems to include behavior (or allow it, anyway) in the messages. That's about much more than having a precise/tight contract on what you're sending.
The IDL (interface description language) for CORBA is a contract. It defines exactly what can or can't be done. It's effectively a DTD for a remote procedure call, including input and output data. (Yes it can do more than that, but realistically nobody ever used those features)
A WSDL for SOAP is similar. CORBA is basically a compressed "proprietary" bitstream; SOAP is XML at its core, carried over HTTP calls.
> I'm asking genuinely, how would life be worse between a REST + XML and a REST + JSON implementation of some service?
So REST+XML vs. REST+JSON alone (no DTD/XSD/schema) would be very similar -- other than the typical XML vs. JSON issues. (XML has two ways to encapsulate data: as attributes on a tag, and as content between tags. Also, arrays in XML are just repeated tags; in JSON they are square brackets [].)
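Roughly, the same (made-up) record in both:

    <order id="1"><item>a</item><item>b</item></order>

    {"order": {"id": 1, "items": ["a", "b"]}}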
But let's say you need to change the terms of that contract (usually for a new feature) -- will code changes be required on client systems?
* If you used a code generator in CORBA with IDL the answer is yes, there will be code changes required.
* If you used a WSDL and added a new HTTP endpoint, the answer was no. If you added a new field to an existing endpoint, the answer was yes. (See [2])
* If you used a DTD/XSD, the answer is usually yes, since new fields will fail DTD validation using an old DTD -- that is if you validate all your data upon receipt before you process it.
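Concretely for that last case (made-up element names, child declarations omitted): a DTD content model is closed, so a message carrying a field the old DTD doesn't declare fails validation on the receiving side:

    <!-- old DTD on the receiver -->
    <!ELEMENT order (id, amount)>

    <!-- new message with an added field: fails against the old DTD -->
    <order><id>1</id><amount>10</amount><currency>EUR</currency></order>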
And this was fine for services that didn't change frequently or smallish deployments.
In large systems, schema version proliferation became a nightmare. Interop between systems became a never-ending pain of schema updates and deployments, hoping that you weren't going to break client systems. And orchestrating deployments across systems was painful: basically everything had to go down at once to update -- that's a problem for banks, say.
What's sad to me is that this was well known back in 1975 [1]. When SOAP was developed around 2000, it violated most aspects of that principle.
> but you surely don't craft a bunch of `"{ \"foo\": 3 }"` in your code. You use libraries for both.
In Python, JSON+REST is:

    resp = requests.post(url, json={"field": "value"})
What I find really appealing in REST+JSON is that validation just happens on the server side, and that's usually good enough. Sure there's swagger, but that's a doc to code against on the client side.
I don't feel that schemas and the need for tight contracts are all bad. I think if your data is very complex, a schema becomes more necessary than not -- when documents are bigger than 1 MB, say. I also think it's fine if your schema changes rarely. And yeah, if you need a schema for tight validation, JSON kinda sucks.
But that's the question: do you really need tight validation, and therefore coupling, or is server-side validation good enough? In most cases, people seem to agree that it is.
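(By server-side validation I just mean something like this -- a minimal sketch assuming Flask and a made-up endpoint:)

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/orders", methods=["POST"])
    def create_order():
        body = request.get_json(silent=True) or {}
        # the server is the one place that knows what "valid" means
        if not isinstance(body.get("field"), str):
            return jsonify(error="field must be a string"), 400
        return jsonify(ok=True), 201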
> If you used a DTD/XSD, the answer is usually yes, since new fields will fail DTD validation using an old DTD -- that is if you validate all your data upon receipt before you process it.
I'm not sure I follow. DTD, as far as I know, allows both optional elements and optional attributes. If you add a feature as optional elements, a client with the old version should continue to work correctly. If they are NOT optional, then the client will fail regardless of whether you did XML+DTD or JSON, because your API needs that data and it simply won't be there.
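Concretely (made-up element names), this is what I have in mind -- the new field is optional, so old documents still validate against the new DTD:

    <!ELEMENT order (id, amount, currency?)>

    <!-- both valid -->
    <order><id>1</id><amount>10</amount></order>
    <order><id>1</id><amount>10</amount><currency>EUR</currency></order>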
What am I misunderstanding?
> What I find really appealing in REST+JSON is that validation just happens on the server side, and that's usually good enough. Sure there's swagger, but that's a doc to code against on the client side.
As a client, you don't have to validate your request before you send it. But it's nice (and probably preferable) that you can.
requests is not built-in to Python, right? So you are still using a library to JSONify your data. If you were to use urllib, then you'd have to take extra steps to put JSON in the body: https://stackoverflow.com/questions/3290522/urllib2-and-json
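(The extra steps look roughly like this with just the standard library -- a sketch, URL made up:)

    import json
    import urllib.request

    payload = json.dumps({"field": "value"}).encode("utf-8")
    req = urllib.request.Request(
        "https://api.example.com/orders",   # hypothetical URL
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    resp = urllib.request.urlopen(req)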
What's more, you still are not crafting the JSON yourself if you call json.dumps on a dictionary.
But, yes, crafting a dictionary with no typing or anything is still many fewer keystrokes than crafting an XML doc would be, even with an ergonomic library. But again, how much of what you typed there shows up in your real code? That looks more like something I'd do at the REPL.
> If you add a feature as optional elements, a client with the old version should continue to work correctly. If they are NOT optional, then the client will fail regardless of whether you did XML+DTD or JSON, because your API needs that data and it simply won't be there.
Sure, but that begs the question: how is that better than JSON, exactly? Maybe strong typing? And why isn't just sending a 400 Bad Request enough if the server fails validation?
I mean, you could say, "I know the data is valid before I send it." But you still don't know if it works until you do some integration testing against the server -- something you'd have to do with JSON anyway. XML is only about syntax, not semantics.
From what I've seen, XSDs tend to promote the use of complex structures: nested, repeating, special attributes and elements. And if you give a dev a feature, s/he will use it. "Sure, boss, we can keep 10 versions of 10 different messages for our API in one XSD." But should you?
JSON seems to do the opposite: it forces people to think of data in smaller chunks. Yes, you can make large JSON APIs that hold tons of nested structures, but they get unwieldy quickly. And most devs would just break that up into different APIs, since it's easier to test a few smaller messages than one large message.
> As a client, you don't have to validate your request before you send it. But it's nice (and probably preferable) that you can.
If you unit test your code, good unit tests serve as validation -- something you should be doing anyway. If you fail validation on your send, you have a bug anyway -- it's just that you didn't get a 400 Bad Request message from the server. But to the user/dev, it's still a bug on the client side.
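(Something like this, with a hypothetical build_order_payload helper standing in for real client code:)

    # hypothetical client code under test
    def build_order_payload(order_id, amount):
        return {"id": order_id, "amount": amount}

    # the "contract" lives in the test rather than in a schema file
    def test_order_payload_shape():
        payload = build_order_payload(1, 10)
        assert set(payload) == {"id", "amount"}
        assert isinstance(payload["id"], int)
        assert isinstance(payload["amount"], int)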
> requests is not built-in to Python, right?
Yes. But there's a lot of stuff not in the standard library that should be. The point is that normal day-to-day code can be just a one-liner using native Python data types.
> What's more, you still are not crafting the JSON yourself if you call json.dumps on a dictionary.
Sure, maybe a technicality here. If I type this, is it Python or JSON?
{ "field": [ 1, 2, 3 ]}
Well, the answer is that both will parse it. json.dumps() just converts it to a string. No offense here, but I see it as a distinction without a difference.