HTTP/2 is an ugly mess of taking something simple and making it more complex for...

MichaelGG · on Feb 10, 2015

Please tell me where the simplicity in line folding and comments-in-header-values is. Or having special handling for some headers, but not others.

drawkbox · on Feb 10, 2015

You can still deconstruct the message and it is simple and easier to debug even if less exact.

In binary if there is one flaw the whole block is bunk, i.e. off by one, wrong offset, binary munging/encoding, other things. As an example if you have a game profile that is binary, it can be ruined by binary profiles and corruption on a bad save or transmission.

Binary is all or nothing, maybe that is what is needed for better standards but it is a major change.

What is easier, a json or yaml format of a file or a binary block you have to parse? What worked better HTML5 or XHTML (exactness over interoperability)

Granted most of us won't be writing HTTP/2 servers all day but it does expect more of the implementations to adhere to, for better or worse.

The classic rule in network interoperability is be conservative in what you send (exactness) and liberal in what you accept (expect others to not be as exact).

MichaelGG · on Feb 10, 2015

The "classic rule" aka Postel's Law, has proven to be disastrous. The idea of resuming a corrupted message is a totally flawed concept. At best, it introduces compatibility issues. This is essentially the history of HTML and browsers, each one needing to implement the same bugs as other popular versions.

SIP is another IETF gem, which takes its syntax from HTTP. And guess what? It's impossible to have unambiguous parsing in the wild! Why? The whole liberal in what you accept bad idea. So A interprets \n as a line ending, even though the spec says \r\n. B interprets it another liberal way, and assumes you didn't mean to transmit two newlines in a row, so it'll keep reading headers. End result: you can bypass policy restrictions by abusing this liberal-ness and get A to approve a message that B will interpret in another way. Yikes. And, since the software for both is so widely deployed, there is little hope of solving the problem. In fact, the IETF essentially requires you to implement AI as you're supposed to guess at the "intent" of a message.

So you're sorta proving my point, that people are thinking "oh it's just text" and then writing shitty, sloppy code, and they're giddy cause it sorta worked, even from a two line shell script. And then further generations have to deal with this mess, because these folks just can't bear to get line endings right or whatnot.

drawkbox · on Feb 10, 2015

Keep in mind you are still going to have lots of these same problems you mention inside the binary blocks and header blocks. Just the specific annoyances of HTTP 1.1 will be gone but new ones will appear.

Going binary does not make it suddenly easier, it just slices it up and adds a layer of obfuscation.

Easier to know what the hell is going on across a wire with current formats and debug them. Utopia interop does not exist so Postel's Law has gotten us this far. Being text no doubt makes it easier to debug and interoperate, otherwise we'd be sending binary blocks instead of json. Unless you control both endpoints, Postel's comes into play and simplicity wins.

We are moving in a new direction for better or worse and going live. I feel like it is slightly off the right path but sometimes you need to take a wrong step like SOAP did to get back to simple. We'll see how it goes.

agentS · on Feb 10, 2015

A binary protocol's parsing is usually something like read 2 bytes from the wire, decode N = uint16(b[0] << 8) | uint16(b[1]), then read N bytes from the wire. A text-based protocol's parsing almost always involves a streaming parser, which is tricky to get correct, and always more inefficient.

Besides, I think this is a moot point, because chances are that less than 100 people's HTTP2 implementations will serve 99.9999% of traffic. It's not like you or I spend much of our time deep in nginx's code debugging some HTTP parsing; I think its just as unlikely we'll be doing that for HTTP2 parsing.

Also, HTTP2 will always (pretty much) be wrapped in TLS. So its not like you're going to be looking at a plain-text dump of that. You'll be using a tool and that tool author will implement a way to convert the binary framing to human-readable text.

Another way to put it is that the vast majority of HTTP streams are not examined by humans and only examined by computers. Choosing a text-based protocol just seems a way to adversely impact the performance of every single user's web-browsing.

Another another way to put it is that there is a reason that Thrift, Protocol Buffers, and other common RPC mechanisms do not use a text-based protocol. Nor do IP, TCP, or UDP, for that matter. And there's a reason that Memcached was updated with a binary protocol even though it had a perfectly serviceable text-based protocol.

drawkbox · on Feb 10, 2015

Agreed on all points. Binary protocols are no doubt better, faster, more efficient and more precise. I use reliable UDP all the time in game server/clients. Multiplayer games have to be efficient, TCP is even too slow for real-time gaming.

Binary protocols work wonderfully... when you control both endpoints, the client and the server.

When you don't control both endpoints is where interoperability breaks down. Efficiency and exactness can be enemies of interoperability at times, we currently use very forgiving systems instead of throwing them out and assert crash dump upon communication error. Network data is a form of communication.

Maybe you are right, since it is binary, only a few hundred implementations might be made and those will be made by better engineers since it is more complex. Maybe HTTP is really a lower level protocol like TCP/UDP etc now. Maybe since Google controls Chrome and the browser lead and has enough engineers to ensure all leading implementations and OSs/server libraries/webservers are correct then it may work out.

As engineers we want things to be exact, but there are always game bugs not found in testing and hidden new problems that we aren't weighing against the known current ones. Getting a something new is nice because all the old problems are gone, but there will be new problems!

It will be an all new experiment we try going away from text/MIME based to something more lower level, complex and exact over simple and interoperability focused. Let's see if the customers find any bugs in the release.

MichaelGG · on Feb 10, 2015

>Binary protocols work wonderfully... when you control both endpoints, the client and the server.

IP is all binary and I don't think it's a case of one party controlling all endpoints.

MichaelGG · on Feb 10, 2015

Binary protocols are usually far easier to implement both sending and receiving. There is far less ambiguity.

In fact, the newline problem I mentioned? It was not easier to diagnose, and was only caught by using tools checking it as a binary structure.

Postel was just flat wrong, and history shows us this is so. JSON is popular because it was automatically available in JavaScript, and people dislike the bit if extra verbosity XML introduces. JSON is also a much tighter format than the text parsing the IETF usually implements.

Postel's law also goes against the idea of failing fast. Instead, you end up thinking you're compliant, because implementations just happen to interpret your mistake in the right way. Then one day something changes and bam, it all comes crashing down. Ask any experienced web developer the weird edge cases they have to deal with, again from Postel's law.

And anyways, you know what everyone uses when debugging the wire? WireShark or something similar. Problem solved. Same for things like JSON. Over the past months I've been dealing with that a lot. Every time I have an issue, I pull out a tool to figure it out.

Do you know the real reason for the text looseness? It's a holdover from the '60s. The Header: Value format was just a slight codification of what people were already doing. And why? Because they wanted a format that was primary written and readby humans, with a bit if structure thrown in. Loose syntax is great in such cases. Modern protocols are not written and rarely read by humans. So it's just a waste of electricity and developer time.

4ydx · on Feb 10, 2015

Yeah using binary protocols seems to be the new hotness. It all makes me feel so old with my preference for simple ascii text files.

kolev · on Feb 9, 2015

Actually, XHTML was a simplification.

drawkbox · on Feb 9, 2015

Did you close that tag in that textarea from some third party content? If not you're whole view is broken. It was a layer or hopeful standardization that was too hopeful and counted on implementers too much to be exact. It was a nice attempt but was quickly retracted to go to HTML5.

I guess the same thing applies to HTTP/2, sometimes you have to dumb/simplify it down a little, the smartest way that relies on implementers, might be the leap that is too hard to make. The best standards are the simple ones that cannot be messed up even by poor implementations. Maybe the standards for protocols developed in the past looked at adoption more as they had to convince everyone to use it, here if you force it you don't need to listen to everyone or simplify, which is a mistake.

While code and products should be exact, over the wire you need to be conservative in what you send and liberal in what you accept in standards and interoperability land.

In another area, there is a reason why things like REST win over things like SOAP, or JSON over XML, it comes down to interoperability and simplicity.

The more simple and interoperable standard will always win, and as standards creators, each iteration should be more simple. As engineers, we have accepted the role of taking complexity and making it simple for others, even other engineers or maybe some junior coder that doesn't understand the complexity. What protocol is the new coder or the junior coder going to gravitate to? The simple one.

totony · on Feb 10, 2015

XHTML was an overall improvement to HTML imo. If you're going to use XML, at least be consistent with this choice. HTML was not, XHTML is.

drawkbox · on Feb 10, 2015

It was better, from an engineering aspect it should have been better and the world would have more precision and validity/verification on content.

But from an interoperability aspect (relying on implementations) the market didn't think it was better or we'd be using it still. HTML5 won because it was simple and met many needs demanded by the market.

The simple standards that provide more benefits, but most importantly are highly focused on interoperability and simplicity, win, always, even if they seem subpar from an exactness standpoint.

At one point in time SOAP had the same religious hype surrounding it that HTTP/2 seems to have. But sometimes you have to take a step to realize you are slightly off path according to the market, not what you might want to design and what should win but what happens with interoperability in the market. HTTP/2 and XHTML type standards are steps, to something better but are too top down or ivory tower eventhough they have lots of awesome and needed features.

totony · on Feb 10, 2015

HTML5 is a mix of XHTML and HTML4, but with new features. Its syntax ressemble more XHTML than HTML and as thus it is more XML compliant, although you can ignore strict syntax [0]. HTML5 is both XHTML and HTML4 alike, so it is no surprise that is has taken over the market.

Note that I don't especially like HTTP/2 and believe that a hack like SPDY should not make it to a standard. More time and care should be made to make a central protocol like HTTP (central in that it is used alot).

[0] Source: http://www.techrepublic.com/blog/10-things/10-things-you-sho...

>Instead, the HTML5 spec is written so that you can write HTML5 with strict XML syntax and it will work

mnarayan01 · on Feb 10, 2015

Technically HTML was a derivative of SGML. That said, as someone who did a fair amount of parsing of old-timey HTML...it would have been really nice if it was XML.

gsnedders · on Feb 10, 2015

Well, by the time the first spec was written it was defined to be an SGML application; when TimBL first implemented it, it was "roughly based on SGML". As far as I'm aware, except for the W3C Validator, no other serious HTML implementation treated HTML as SGML.

And it couldn't have been XML, because HTML predates it.

smegel · on Feb 9, 2015

For those slamming HTTP/2.0, how do they rate SPDY?

drawkbox · on Feb 9, 2015

SPDY was great for Google and allowed them to change and take hold of HTTP/2.

It saved them lots of money I am sure in improved speed but at the trade-off of complexity and minimal adoption of the standard because it wasn't beneficial to everyone. HTTP/2 is a continuation of that effort by Google which I would do if I were them as well probably. But in the end both are not that big of improvements for what they take away.

Of course I use both but I don't think they will last very long until the next, it was too fast and there are large swaths of engineers that do not like being forced into something that has minimal benefits when it could have been a truly nice iteration.

HTTP/2 is really closer to SPDY and I wish they would have just kept it as SPDY for now. Let a little more time go by to see if that is truly useful enough to merge into HTTP/2. HTTP/2 is essentially SPDY from Google tweaked and injected into HTTP/2 which has huge benefits for Google, so I understand where the momentum is coming from.

Google also controls the browser so it is much easier for them to be the lead now on web standards changes. We will have to use it if we like it or not. I don't like the heavy hand that they are using with their browser share, just like Microsoft of older days (i.e plugins killed off, SPDY, HTTP/2, PPAPI, NaCL etc)

copsarebastards · on Feb 10, 2015

SPDY is a great prototype that exemplifies why you should write a prototype: to show the problems with your design. It's unfortunate that the HTTP/2.0 committee decided to ignore the flaws and go with the prototype design.