Hacker News
Varnish author criticizing HTTP/2.0 proposals (w3.org)
277 points by cgbystrom on July 17, 2012 | 136 comments



After the whole IPv6 story, I'm surprised the author ignores the political dimension of designing a new protocol. As Mitch Kapor said "Architecture is Politics". It's not just about what solution is best from a technological perspective, it's about what we want our future to look like.

The internet has become way more important than back when these protocols first became standard, and every time a protocol or standard is up for debate, political and commercial forces try to influence it in their favor. Some of the concepts they tried to shove into IPv6 were downright evil, and would have killed the internet as we know it. Personally, I'm relieved all that is left is a small, un-sexy improvement which, albeit slowly, will eventually spread and solve the only really critical problem we have with IPv4.

I really dread subjecting HTTP to that process. Although I fully agree with the author's critique of cookies for instance, the idea of replacing them with something "better" frankly scares the crap out of me. Especially when the word "identity" is being used. You just know what kind of suggestions some powerful parties will come up with if you open this up for debate, and fighting that will take up all of the energy that should be put towards improving what we already have.

As techies we should learn to accept design flaws and slow adoption and look at the bigger picture of the social and political impact of technology: HTTP may be flawed, but things could be way, way worse.


You don't go to the author of Varnish for political advice, you go to the author of Varnish for technical advice. I'd be disappointed if he took advantage of his good name to push political advice when he got his reputation for being good at technical matters.

After all, is any of his technical advice invalid due to political concerns that are not wild speculation on your part?

----

Not to say that politics doesn't enter into it, just that it should be brought to the table and discussed by other actors. And those actors should probably be all ears about the technical issues.


>Not to say that politics doesn't enter into it, just that it should be brought to the table and discussed by other actors.

How so? You seem to be distinguishing between political actors (politicians?) and technical actors.

In a democracy it is not just important but essential that ALL have their say on policy, not just "political experts".


Moreover, the author seems to completely miss the point that SPDY was designed to overcome wide-area networking performance issues with the way HTTP uses TCP, which it does quite well, yielding substantial latency improvements.

He throws out a lot of criticism about SPDY being haphazardly designed (with no explanation), then we find out that really he has an axe to grind over cookies and SSL.

I call bullshit on the whole post. I found nothing useful in it. I almost fell for the HTTP router bit, but again he offers no more than vague criticisms. If SPDY hasn't been a problem for load balancers at Google and Facebook, SPDY isn't badly designed for load-balancer implementation. It leads me to believe that his real issue is that Varnish must have been coded in such a way as to make it hard to support SPDY. Or perhaps the author's real beef with SPDY is that he didn't design it.


Any downvoters care to give a more specific response? The OP completely missed the actual purpose of SPDY/HTTP 2.0, without contributing constructive feedback. Facebook's comments on HTTP 2.0 proposals were much more useful:

http://lists.w3.org/Archives/Public/ietf-http-wg/2012JulSep/...


Some of the concepts they tried to shove into IPv6 were downright evil, and would have killed the internet as we know it

Oh? Got an example? I've never heard of this (but don't really follow IPv6 stuff).


The evil bit? I'm having a hard time coming up with something serious that would be enough to kill the internet as we know it.


I'm guessing the OP was talking about something that would make the internet less anonymous.


This is a well-thought-out, excellent comment that makes perfect sense. Thank you. When huge corporations track you the way they do today, most of the time they don't get to know who you really are (except Facebook, because they know 'you'). That is, your identity. That's the only level of control you have over your anonymity. And these guys are proposing a new protocol just to remove that. Ridiculous.


I suspect you are reading too much into the word "identity". It is just a matter of identifying the endpoint, so the HTTP router can delegate requests from one user to the same backend instance in any given session. This, in turn, will give the user a consistent image of the state at the server independently of cache propagation in the backend, for example. You don't need the user's real-life identity to do this, and you don't need the same identity across different sites or across different sessions.

This is currently sometimes done with cookies, which makes life difficult for HTTP routers. He is proposing a mechanism to keep the identifying part while getting rid of problems in the HTTP router layer. The way I read this, it seemed to do so without introducing additional privacy concerns, and in fact removing some. (Cookies can carry more than identity.)
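
A rough sketch of that kind of delegation (illustration only; the token name, backend list and hashing scheme are assumptions, not anything from the proposal):

    # Route every request carrying the same session token to the same backend,
    # so the user keeps seeing a consistent view of server-side state.
    import hashlib

    BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

    def pick_backend(session_token):
        # A stable hash gives a stable backend for the lifetime of the session.
        digest = hashlib.sha1(session_token.encode("utf-8")).digest()
        return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]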

There is a detailed technical discussion to be had about implementing all of this, and in this discussion any privacy concerns would become visible and open for discussion. But I think it is a leap to say that the comments in TFA would necessarily make for a world bereft of privacy ;)


Although I fully agree with the author's critique of cookies for instance, the idea of replacing them with something "better" frankly scares the crap out of me. Especially when the word "identity" is being used.

Ever heard of evercookie? Does that not scare you? Would creating a clean, well-understood solution that users can actually control not be better than what we have now?

There is just so much wrong with cookies, it's really surprising that no HTTP upgrades propose anything better. For one, cookies confuse session information and client-side storage, and thus work poorly in both roles.


An evercookie is actually pretty straightforward to remove, if you know what you're doing with firebug/firecookie. The only tricky thing it does is persist a cookie in the sessionStorage of your window object, which isn't cleared when you clear your browser cache.


> Although I fully agree with the author's critique of cookies for instance, the idea of replacing them with something "better" frankly scares the crap out of me.

> Especially when the word "identity" is being used. You just know what kind of suggestions some powerful parties will come up with if you open this up for debate, and fighting that will take up all of the energy that should be put towards improving what we already have.

Oh wow, I hadn't thought of that. Reading that critique I was just thinking "oooh doing away with cookies would be a great thing", slightly wondering what one could replace it with ... but you're right, they'd probably replace it with something extra plus plus scary.


There should be a way to "identify" yourself against sites you are visiting which does not leave tracks on your computer, so that when you visit another site with a Facebook/Google "goodie" they cannot identify you again.

The problem is that cookies live on your computer; they should be ephemeral (you can do that, but it is not standard).

But then, yes, Facebook and Google and even Governments will try to know everything about you.


Cookies with 3rd-party cookies blocked solve that, don't they?


His proposal is at http://phk.freebsd.dk/misc/draft-kamp-httpbis-http-20-archit...

It comes with the caveats that "Please disregard any strangeness in the boilerplate, I may not thrown all the right spells at xml2rfc, and also note that I have subsequently changed my mind on certain subjects, most notably Cookies which should simply be exterminated from HTTP/2.0, and replaced with a stable session/identity concept which does not make it possible or necessary for servers to store data on the clients."


If I'm interpreting that correctly, that could be a huge win for privacy.


How so? A unique identifier is a unique identifier.


If the browser created the unique identifier instead of the server it could create a new one as often as necessary.

I think that would actually work very nicely.


Browsers can do that now with cookies, by just clearing old cookies.


If you don't store a leakable cookie on the client and only use a connection ID (which is autoassigned by the server and cannot be easily forged) then a lot of cross-domain evil hackery goes away.


I don't know what you mean by 'leakable' exactly, but you can already mark cookies as HttpOnly (not accessible from Javascript) and Secure (only transmitted over HTTPS).


"Leakable" in the sense that you can hand it to somebody else and they can actually use it.

A cookie is leakable because the client chooses to send it, so copying it to somebody else is really bad. A server-assigned per-connection ID is not leakable unless you can spoof the IP address of the one you're sending as.


In principle you could already achieve that by binding a session (and corresponding cookie) to the client's IP address. No big deal.

Problem you may have is that some clients are behind proxy farms and can arrive with different source IP addresses within the scope of a single session.

If you do not bind the 'server-assigned-per-connection ID' to an IP address they become just as 'leakable' as a session cookie.
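
For illustration only, a minimal sketch of binding a session token to the client's IP with an HMAC (the secret, token layout and function names are all made up):

    import hashlib, hmac

    SECRET_KEY = b"server-side secret, never sent to clients"

    def issue_token(session_id, client_ip):
        mac = hmac.new(SECRET_KEY, f"{session_id}|{client_ip}".encode(),
                       hashlib.sha256).hexdigest()
        return session_id + "." + mac

    def check_token(token, client_ip):
        # A token replayed from a different source address fails verification.
        session_id, _, sent_mac = token.partition(".")
        expected = hmac.new(SECRET_KEY, f"{session_id}|{client_ip}".encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(sent_mac, expected)

As the surrounding comments point out, this falls apart for clients whose source address changes mid-session (proxy farms, mobile roaming).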


Doing server side IP checks is already easy, but in reality it can lead to massive amounts of user complaints when their sessions keep disappearing, because as it turns out, some user segments have a lot of users coming through proxies where each request is not guaranteed to come from the same IP.


Or their mobile device roams from WiFi to cell data.

I suspect anyone suggesting that an IP be part of the session security has never actually tried it on a large scale.


Sure. It'd be nice to have some kind of actual standard to at least make a good best guess here. Cookies are a really, really bad start.


Then a translation device at the carrier assigns the client a different address. I heard first-hand that this is very annoying.


Of course, but the question is what that unique identifier denotes.

An IP address tells people who you are, where you are, and what you do.

A unique identifier ('+18El1iZRFCIiqRpfw4dJR8mXjJn2UxPrjwoRNpjSWg=' for instance) tells what you do without the who and where [1].

It's even possible advertisers could still target users based upon these identifiers, just without the background knowledge that makes these sorts of things privacy issues.

[1] I'll admit, it could be argued that the what can determine the who and where


Maybe he's trying to bring light to Unhosted?


While in general I understand where he is coming from, I believe his main argument about adoption is flawed.

What do you think is more likely going to be adopted? A protocol that's not backwards compatible at all (heck, it even throws out cookies) or something that works over the existing protocol, negotiating extended support and then switching to that while continuing to work the exact same way for both old clients and the applications running behind it?

See SPDY which is a candidate for becoming HTTP 2.0. People are ALREADY running that or at least eager to try it. I don't think for a second that SPDY is having the adoption problems of ipv6, SNI issues aside.

Even if native sessions would be a cool feature, how many years do you believe it takes before something like that can be reliably used? We're still wary of supporting stuff that an 11-year-old browser didn't support.


I don't think for a second that SPDY is having the adoption problems of ipv6, SNI issues aside.

Google being able to modify both the client (Chrome) as well as few fairly significant server installations has kind of helped there a little bit...


The key difference with IPv6 is that SPDY is opaque to routers and only needs changes at the endpoints. IPv6 has the chicken-and-egg problem that it only becomes usable once everyone (hosters, ISPs, core networks, home networks, web frameworks, client devices) has deployed it, and most of those actors don't bother with things that bring no immediate benefit.


You do know that when IPv6 roll-out was initially contemplated, it would happen on IPv4-convertible addresses only, until everybody had converted. That would have allowed seamless IPv6/IPv4 interop and made migration a breeze. Lacking any "must have" features, nobody could see a reason to bother. Now we have to do the migration without the benefit of seamless conversion.


Luckily, HTTP only requires the browser (which tends to be frequently updated) and the server to be upgraded, not some random aging routers; there is little reason to fear adoption, there is minimal harm in keeping both protocols around for some time, and there is something much closer to a "must have" feature - speed.


A lot of hardware (proxies, DPI gear, hotel wifi hotspots, cellular carriers, etc) sniffs/mangles HTTP in transit and must be verified to be compatible with new HTTP features. HTTPS doesn't have this constraint.


Browsers are frequently updated only since Google moved the game on with Chrome auto-update.


I regret missing out on that. Using those addresses? https://tools.ietf.org/html/rfc5156#section-2.3

That doesn't make the chicken-and-egg problem disappear. The best I could imagine is that it would allow IPv6 to downgrade to IPv4 using some sort of addition to the routing table. There's still no incentive to upgrade.


> SNI issues aside

What SNI issues? In practice any client that supports SPDY is going to support SNI.


Yes. But SPDY requires SSL, so in order to get any advantages of SPDY at all, you have to serve your site over SSL which requires one IP address per site due to the lack of SNI support in non-SPDY browsers.

So you either only provide SSL+SPDY for browsers you know support SNI, or you don't provide either SSL or SPDY to all of the browsers.


How protocol adoption happens, in three steps:

1. Working code. 2. Publicity. 3. Ubiquity.

That's it. Kamp is making a lot of the right noises here, but he's already lost ground to SPDY just because they've shipped code. No amount of sitting round tables bashing out the finer details of a better spec will help as much as getting code written - even if it's just a placeholder for an extensible spec, as long as that placeholder does something useful.


> he's already lost ground to SPDY just because they've shipped code.

Is SPDY good enough now and fixable enough in the future to leverage this gained ground?

As more and more people deploy SPDY, they will understand its problems, but there will be almost no chance to change it, unlike what Google could do at the beginning (they have gone through one big change and many small changes to the protocol). When the authors of the main servers (Varnish, Apache, nginx) start feeling its limits, will they have to keep it around just for the sake of compatibility?

Please note that SPDY on the server is not a requirement, just an opportunity. Nothing will change for your users if you want to remove support for SPDY from your server after you have deployed it and used it for some time. There are no "http+spdy://example.org" URLs around, so supporting only HTTP will always be sufficient. Maybe not as performant as SPDY, but 100% supported.


(These are just my opinions as web developer. Please feel free to downvote if my expectations are wrong)

While I can completely agree with the technical merits of this proposal, there are some rather contradictory statements.

The author begins by pointing out the painfulness of the IPv4-to-IPv6 transition and says that the next HTTP upgrade should be humble, but then proceeds to kill cookies and fix all the architectural problems in HTTP. Isn't that the same thing IPv6 was? Wouldn't such an approach produce the same amount of pain for the implementors (that is, us web developers)?

Any upgrade will certainly have some backward-incompatible changes. But if it is totally backward incompatible, I don't understand why it still needs to be called HTTP. Couldn't we just call it SPDY v2 instead, or some other fancy name?

Cookies are a problem. But the safest way to solve that problem is in isolation. Try to come up with some separate protocol extension, see if it works out, throw it away if it doesn't. But why marry the entire future of HTTP with such a do-or-die change?

I blindly agree with the author that SPDY is architecturally flawed. But why is it being advocated in such big numbers? Even Facebook (deeply at war with Google) is embracing it. It's because SPDY doesn't break existing applications. Just install mod_spdy to get started. But removing cookies? What happens to the millions of web apps deployed today, which have $COOKIE and set_cookie statements everywhere in the code? How do I branch them out and serve separate versions of the same application, one for HTTP/1.1 and another for HTTP/2.0?

More doubts keep coming... Problem with SPDY compressing HTTP headers? Use SPDY only for communication over the internet. Within the server's data center, or within the client's organization - keep serving normal HTTP. There are no bandwidth problems there. Just make Varnish and the target server speak via SPDY; that is where the real gains are.

I could go on. I'm not trying to say that the author's suggestions are wrong. They are important and technically good. But the way they should be taken up and implemented, without pain to us developers, doesn't have to be HTTP/2.0. Good ideas don't need to be forced down others' throats.


No. IPv6 is IPv4 with bigger addresses; it didn't try to solve any of the other problems of IPv4 (or attempts to solve them were killed by ISPs).

One example was multihoming (having more than one ISP): several smart proposals were floated (anycast, nearcast, etc.) but they were killed by ISPs protecting a lucrative business.

If IPv6 had made multi-ISP multihoming possible without all the trouble of BGP, businesses would have killed to get it back in the late 1990s.

Cookies only disappear from the wire, they are trivial to simulate on your server (see my other reply here).


"Ipv6 is Ipv4 with bigger addresses,"

Yeah, I used to think that, then I participated in some IPv6 conversions and watched some others. I don't think that any more. IPv6 may not be the Glorious Solution to All Network Problems Ever, but it's not just the obvious incremental improvement on IPv4 either. It's a new protocol.

(I do sometimes wonder if an IPv4.1 that simply set a flag and used 8 bytes instead of 4 was proposed right now if it could still beat IPv6 out to the field even with IPv6's head start. Note, I'm not saying this would necessarily be a good idea, I just find myself wondering if IPv4.1 could still hypothetically beat IPv6 to deployment.)


What practical problems, aside from address range exhaustion, does it solve? I've read some technical articles about benefits of IPv6, but most of them keep returning to address size problem.


I think a big part of this is that address range exhaustion is the root cause of many other problems. For example, IPv6 effectively obsoletes NAT, which removes all kinds of complexity from many protocols (off the top of my head: IPSec, many P2P protocols).

IPv6 also brings saner (IMO) protocol headers, and introduces a variety of other incrementally improved protocols (see ICMPv6, DHCPv6) that have been tweaked with the benefit of years of deployment experience.


> More doubts keep coming... Problem with SPDY compressing HTTP headers? Use SPDY only for communication over the internet.

That's not good enough. The big problem is that you need the headers to properly route the request to the correct server. So for most operations, there will have to be one machine that is capable of reading all the headers of all the requests that arrive. gzipping the headers makes the job of this machine much, much harder.


I have a lot of respect for phkamp, varnish is an impressive piece of engineering.

I disagree with the stab he takes at cookie-sessions here, though. He seems to ignore that sessions are not only about identity but also about state.

Servers should be stateless, therefore client-sessions (encrypted and signed) are usually preferable to server-sessions.

Having a few more bytes of cookie-payload is normally an order of magnitude cheaper (in terms of latency) than performing the respective lookups server-side for every request. Very low bandwidth links might disagree, but that's a corner-case and with cookies we always have the choice.
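
A minimal sketch of that signed client-session pattern, for illustration (sign-only for brevity; a real implementation would normally also encrypt the payload and add an expiry, and the names here are invented):

    import base64, hashlib, hmac, json

    SECRET = b"kept on the server only"

    def dump_session(data):
        payload = base64.urlsafe_b64encode(json.dumps(data).encode()).decode()
        sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return payload + "." + sig              # value of the session cookie

    def load_session(cookie):
        payload, _, sig = cookie.rpartition(".")
        good = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, good):
            return None                         # forged or tampered-with cookie
        return json.loads(base64.urlsafe_b64decode(payload))

The server only verifies the signature; no per-request lookup in a session store is needed.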

Removing cookies in favor of a "client-id" would effectively remove the session-pattern that has proven optimal for the vast majority of websites.


No, servers should hold _all the state_ and clients none. If I add something to my shopping basket from my mobile phone, I want to be able to add more from my browser, rather than have two shopping baskets.

Servers storing stuff on the clients is just plain wrong, and it is wrong from every single angle you can view it: It's wrong from a privacy point of view, it's wrong from a cost-allocation point of view, it's wrong from an architecture point of view and it's wrong from a protocol point of view.

But it was a quick hack to add to HTTP in a hurry back in the dotcom days.

It must die now.


I too want to add the first item into the shopping basket from the laptop and the second from my tablet.

But this does not imply that the clients should store no state - it only implies that the state as perceived by the users needs to be the same, regardless of how it is implemented.

While we are at the topic of state, why do I have to subscribe to a service just to be able to add a bookmark on one device and use it on another?

I view the two problems as congruent (except that the bookmark state is global, thus there is no "server" to offload the state onto) - but at the same time this difference highlights the assumption that there is "The Server" for the web app. What if there weren't? Can we push the model a bit further and make it p2p? I am pretty sure that as homomorphic crypto advances, we will be able to do so even for untrusted peers. Then there's no "server" anymore to store the state in.

Then, you have the DoS bit. Absolutely correctly, the HTTP routers are the most loaded and hardest-to-scale element of the whole setup. If you offload the state onto the client, then you can "dumb down" the task of the non-initial content-switching decision, basing it on the trustable client state.

So, I think that distributing the state is a good idea. What is limiting is naively distributing the state - and this is where I agree with your assessment. And that's probably one of the things that would need to get fixed for something big enough to be called "2.0". (As a by-product, solving the above would also solve the endpoint identity/address-change survivability problem.)


No, servers should hold _all the state_ and clients none.

That's the opposite of the general consensus in the webdev-community.

Client-state is not only vastly more efficient in many cases but it also usually leads to cleaner designs and easier scaling.

Many of the modern desktop-like webapps would be outright infeasible without client-state. What's your response to that, should we just refrain from making such apps in the browser?

If I add something to my shopping basket from my mobile phone, I want to be able to add more from my browser

And at the same time you probably appreciate when on your slow mobile-link the "add-basket" operation happens asynchronously, yet doesn't get lost when you refresh the page at the wrong moment.

I'm a bit confused here. You know better than most how critical latency is to the user-experience. Saving on server-roundtrips or hiding them is a big deal.

Yet you promote this dogma without providing an alternative solution to this dilemma.


Cleaner for who? Easier for who?

That kind of cookie usage just makes it Somebody Else's Problem instead of your problem.


Cleaner for who? Easier for who?

For the webapp-developer, which results in a faster and cheaper experience for the user.

I'm still baffled at your persistence given you sit pretty much at the source. You have probably written VCLs for sticky sessions yourself and pondered the constraints wrt data-locality and single points of failure? Sticky sessions are just not a good design when the alternative is so readily available; it's the first time in a long time I hear anyone disagree with that.

That kind of cookie usage just makes it Somebody Else's Problem instead of your problem.

And who would that "somebody else" be?

Users certainly don't care about a few hundred extra-bytes that their browser sends with each request, especially since that trade-off usually makes their browsing faster than the alternative would be.

The privacy concern is valid but boils down to developers using cookies wrong (without encryption). If we were to remove all technologies that are used wrong by incompetent developers then the internet would be a pretty empty place.


Clients can't be completely stateless; at the very least they need to pass along a key to identify their server-side state. That's what cookies do now (among other things) and it sounded to me like that's what you were proposing for the session/identity facility. I agree with you on that point; a specific feature in the protocol would be better than the generic cookie feature, given the ways cookies have been abused.

What's your opinion on IndexedDB and other local storage mechanisms? I believe that single-page-apps are overused, but I do think that they have their niche and standards for storing data locally are valuable and necessary. In my own work I'd use that space as a cache rather than permanent storage, just like I'd use something like memcached on the server side to reduce database queries.


But what about preferences for anonymous users? Store that on the server side? Append them to the URL? Both options kinda suck.

Also, consider dabblet. The way it allows you to store your stuff using github is very smart IMHO.


Store it on the server: The user-agent gives you a session-id to use as key.

It may be that session-keys should tell if they are anonymous or if they represent (locally) authenticated users, but that's a very complex subject I won't claim to have a clear opinion of yet.


Store the settings of anyone who ever connected? For how long? Forever, just in case? Silly. And why do you even assume the server has to have a database? Why should it be required to have one; why should it have to store the stuff? What is your take on statelessness? You concentrate so much on the abuses of cookies and client-side storage/computation, but you're not addressing the advantages. I doubt you're aware of them, to be honest.


Uhm, isn't that how it works today? Do you care about how many metric shitloads of storage your cookies take up on clients' disks? Shouldn't you?

Putting the cost of storage where the decision to store is made is sound economic practice.


My Cookies directory is 11 MB. That's actually quite a lot, considering the length of the average cookie, but my disk is 256 GB and it's only gotten that big because I've been browsing for years literally without ever clearing my cookies and I can clear them at any time.

This is really a non-issue.


It is the user's session data. If it is stored on their end, they can choose how long they wish to store it for, and delete it any time they like.


User-agent and the other bits of stuff that are duly noted by http://panopticlick.eff.org/ are much, much worse than cookies. Cookies you can erase. User-agent and other "fingerprints" are with you forever. And they travel with you no matter where you are.

So, while you would dismiss the "privacy hazard" that the cookies are, you replace it with something much worse.


You can still have the cookie concept, and have the session id be a random number each time someone opens a tab to the site. The cookie can hold those preferences, and the session id can be used for session stuff. As a bonus, you can then load the cookie only on the first page load, and keep the values in a cache associated with the browser's random session number, saving on data transfer and losing nothing. And those that don't need cookies get a big win in terms of privacy.


OK, so if I grok the idea correctly, it is something like "send the cookie-like data from the client only on the first GET, if you are doing it over a single HTTP/1.1 TCP connection" - that makes sense (and could easily be made into an extension to HTTP/1.1 - though it creates a dependency between the different GET requests - just have the server send an "X-Dont-Send-Me-More-Cookies-in-this-TCP: yes!" header and make compliant clients react to it).

What I do not understand is where the win on the privacy front is here. You send random IDs - but the site owner will re-correlate these random IDs with your identity. So you would not win anything here - or what am I missing?

My take on the privacy:

There is no problem with someone collecting a bunch of info about me and using it to improve their services.

There is a little bit of a problem with someone collecting a bunch of info about me and another million people and keeping that in a big blob.

There is a big problem when that someone gets hacked and this bunch of info about another million people gets to the bad kids.

It's the centralization of a lot of data that is bad for the privacy.

Store the data locally on the clients and give it to the server only when it is contextually needed. E.g. my shipping address: I am happy for my browser to supply it to you from my local storage every time you want to ship me something. I am very happy if you do not store and sell this address to someone who will later send snail-mail spam to me. Or store it without the due diligence ('cos time to market and all that) and then get hacked, and then I find myself "having paid" for the helicopter spare parts.

Of course, this would hurt the nouveau business models that treat the users as a product. And will make the analytics harder - because one would not be able to just run a select... But to me it could be a useful tradeoff.

(above, I use the term "client" to refer to the collective set of the devices that are "mine". As I wrote in another reply, storing the state on client does not imply the difference in the user-seen behavior, so the shopping cart should survive).


Of course, keeping the data decentralized on your computer is super secure, this is why botnets logging users data never got beyond theory. It is also why phishing was a clever idea but never panned out, people only would send data to the right recipients. </snark>

Sure, centralized data sounds big and scary, because a single security incident loses a million people's data in one go, but how is it any different from a million security incidents via a virus, each losing "only" one person's data?

Similarly, I don't understand how it is remotely feasible to think that storing your shipping address on your computer vs on a site that is shipping you stuff changes things -- I mean, they still have to get your address to send you the stuff you ordered. It is a fundamental requirement of shipping. Address is not a private bit of info.

Fingerprinting will be around, so it is probable that there will still be tracking. Can't beat that right now, so let's not conflate that with other problems. Instead let's look at the problems that are solved: cookies store data to make it easy to not just correlate and be probably right about the user, but to be perfect. Further, they can be hijacked and otherwise stolen and used by malicious third parties, giving data beyond just the access patterns to the site in question. Session ids can be engineered not to have this inherent problem, cutting down information leakage. Further, I imagine plugins that will keep track of your worst data offenders, and force a new session id on every request from them, making data tracking and correlation even more difficult.

It isn't an all-or-nothing game: even if you only get rid of the low-hanging-fruit abuses, it is a win. Yes, new stuff will come along, but that doesn't mean we shouldn't try, particularly when the current scenario allows all the bad stuff you can think of, but easier.


Re. the snark: phishing: it is not the physical user that has to input the data - think of how you use a password manager. Botnets: yes, but since I keep my computing devices clean, I was never a victim of a botnet, while my account info was stolen from one of the online sites with zero influence on my part. See where the difference is?

The difference is that the decentralized approach would put more control in the hands of the user (so they either take care themselves or hire someone to take care for them). If they want to.

"Address is not a private bit of info" - it's person and context dependent. Some people consider their name a private bit of info in some contexts... And yes you have to send the shipping info to the remote party to ship you stuff. But they do not have to keep it neatly packed one select away.

I still have a difficulty understanding how the "random session-id" will solve the problem of privacy. All I can see happening is one more level of indirection, that will cause the creation of the frameworks to re-collate this back. Because this is a functionality that is needed by the developers. And once you have the commonly available code, you're back to previous stage - except with an additional pile of code to debug.

I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

edit: re. sending the data to the trusted server: sign with your client key a "request for data" together with the manifest of the addresses that the server can plausibly have. Then when the server needs the data it can present this request to your UA and get the data. Yes, the server can be hacked and this data can be siphoned off. But then the attackers get the [timespan of the breach] worth of user data, and not the entire DB.


Re. the snark: phishing: it is not the physical user that has to input the data - think of how you use a password manager. Botnets: yes, but since I keep my computing devices clean, I was never a victim of a botnet, while my account info was stolen from one of the online sites with zero influence on my part. See where the difference is?

No, I don't see the difference at all. So you got lucky, and didn't have your computer targeted early on by a 0-day virus. Congrats, I'm sure your luck will keep up forever.

I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

Now you are treating the security benefit as if it were the sole benefit of session ids. There are other benefits. Read the article: there are benefits to "http routers" that would come from it. Look at my comment history, I mention a couple (cache-locality benefits from routing, the ability to standardize login stuff and use HTTP auth reasonably again, without reinventing the wheel for every site/framework). Others have mentioned other benefits. The incremental security benefit is but one of these.

I'm not saying all of this because I think we should stop trying. It's just that I can't see how the cost of uplifting the entire internet infra (the code required for this functionality will surely be much more storage than the cookies over my lifetime) and the cost of having the programmers support both models for the good chunk of future (hello, IE6 users, I am looking at you! :-) justifies the incremental feeling of security that this gives.

This is a strawman: yes, there are still places on legacy systems, but more and more are adopting systems that allow standards-based approaches and faster upgrade cycles (a la adopting Chrome or Firefox), and there is no reason to doubt this trend will continue.

edit: re. sending the data to the trusted server: sign with your client key a "request for data" together with the manifest of the addresses that the server can plausibly have. Then when the server needs the data it can present this request to your UA and get the data. Yes, the server can be hacked and this data can be siphoned off. But then the attackers get the [timespan of the breach] worth of user data, and not the entire DB.

This looks to be a usability nightmare. Further, at best it is no better a solution than the one I presented - an incremental change that requires lots of code. As soon as this starts happening in a widespread way, the attack patterns will change from server hacking to browser hacking in a serious way. Or finding ways to hack the HTTP gateways where SSL is dropped, which are frequently appliances that are harder to monitor for security. Or there will be more phishing attacks using sophisticated key-stealing techniques to get real credentials. Or DNS attacks. Or, as plug devices get super cheap, piles of MITM attacks in places with wifi, or or or... security is always incremental.


>Servers storing stuff on the clients is just plain wrong, and it is wrong from every single angle you can view it

Not that it is surprising given the source, but this "my opinion is objectively correct" nonsense isn't constructive. Client-side sessions give you stateless servers, which allows real seamless fail-over. Having to run an HA session-storage service to get that is a big additional cost. "PHK said it is right" doesn't provide sufficient benefits to overcome that downside.


They are not wrong because I say so, they are wrong because they are wrong.

When it gets to the point where the EU regulates something, the way they did with cookies, it should be painfully obvious to even the most casual observer that there is something horribly wrong with it.

As for the cost of your HA session-storage? Cry me a river! You're the one making the money, you're the one who should carry the cost.


The EU did not start regulating cookies because your data could theoretically be leaked from your computer via cookies or something like that. They did it because cookies are used to track people, which is no different from a hypothetical session identifier, except that hypothetical browser controls could be added which are already fully possible with cookies.


"Cry me a river" is no more compelling than "I am right because I say so". I am making money? I didn't realize my free site that I pay hosting expenses for out of my pocket was making me money. When can I expect my check?

You haven't offered any reason why anyone would want to move from client-side sessions to server-side sessions. If you want to effect change, you need to provide a reason for change, not just condescending nonsense.


So how does having a protocol-standard state value, let's say a UUID, rather than a cookie with a bunch of keys, one of which is a UUID, change things? Well first, we can now disassociate valuable private data from a session identifier. Second, we can now have the HTTP routers point session ids to specific servers, allowing a level of "on box" caching per app server, allowing faster lookups without needing to hit memcache-type things - just poll the local state cache first.
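
A sketch of that on-box lookup (illustrative only; the shared store and its interface are assumptions):

    # Because the router keeps sending a given session id to the same app
    # server, that server can answer from a local in-process cache before
    # falling back to a shared store (memcached, a database, ...).
    local_cache = {}        # session id -> session state, on this box only

    def load_state(session_id, shared_store):
        state = local_cache.get(session_id)
        if state is None:
            state = shared_store.get(session_id)   # the slower network hop
            local_cache[session_id] = state
        return state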


> See for instance how SSH replaced TELNET, REXEC, RSH, SUPDUP

> Or I might add, how HTTP replaced GOPHER[3].

telnet and gopher were used by a few thousand servers only and were not consumer-facing technologies (for the most part); it doesn't make sense to compare that to IPv4 and HTTP, which are used by millions (billions?) of servers.


Sukuriant's point might be terse, but it's not invalid either.

In 20 years' time, I'm pretty sure that the next iteration of us will be saying something like "When they phased out HTTP, there were only a few billion servers..."

Better will always make its way in, even if there are entrenched systems running the 'old faithful' code already out there. IE6 is being phased out, yes, VERY slowly, but we're already way closer to getting it down to an irrelevant number than we would be if there weren't pushes. Will we ever get rid of it completely? Maybe not. I'm sure there's still a gopher server out there somewhere or another, and it's not that uncommon to get Telnet access to some commodity (crappy) web hosts, but SSH is pervasive and good, and we're all better off for it.


The factors that caused slow IP4 to IP6 transition are cited as examples by the author you are criticizing. It's very clear they are aware of the scale of http deployment as well as the challenges that brings. In fact, that key point motivates most of their objections to the current standards proposals.


>telnet and gopher were used by a few thousand servers only and were not consumer-facing technologies

The card catalogs at most university libraries and most libraries of any national or international importance were reachable by telnet in 1992. And I think card catalogs count as a "consumer-facing" service.

The vast majority of internet client software in 1992 was text only. The first exception to this that I am aware of is the WWW, which most internet users had not started to use by the end of 1992 (email, newsgroups and ftp being the most widely used services). The way most people connected to the internet or an intranet from home was by sending vt100 or a similar protocol over a dial-up link -- with a Unix shell account or VMS account at the other end of the link. Repeating myself for emphasis: in 1992 most people accessing the internet from home or from a small office used a modem, and IP packets did not pass over that modem link. The point of this long paragraph is that the vast majority of the machines on which these shell accounts ran were also reachable by telnet.

Finally, the telnet protocol in 1992 was a "general-purpose adapter" similar to how HTTP is one today. For example, the first web page I ever visited I visited through a telnet-HTTP gateway so that I could get a taste of the new WWW thing without having to install a WWW browser. Note that this telnet-HTTP gateway is another example of a "consumer-facing" telnet server.

In summary, there were probably more than a few thousand telnet servers in 1992 -- and many of them were "consumer-facing".

I am almost certain there were a few million users (certainly so if we include college students who used it for a semester or so, then stopped) of the internet in 1992, and most of those users used telnet.


IRC was pretty widely used back in the day too, and it was and still is text-only. There have been a number of non-text replacements for IRC (SecondLife being the most successful) but most died off. IRC's grandchildren, Twitter, SMS, all of the instant messaging and chat services, are all still text-only or close to it.


OK, but I feel the need to stress that when I wrote "text only", I was referring to the user interfaces.

(And the reason user interfaces are relevant here is that telnet was the main way to export user interfaces over the internet in 1992.)


> telnet and gopher were used by a few thousand servers only

I can't speak for Gopher, but I routinely see telnet and rsh all around the industry, where anyone with control of some Windows machine on the network can sniff critical PLC and server passwords. Even when SSH is available for the servers. It is hopefully a changing situation as servers get replaced/upgraded and SSH gets more and more pervasive.


Telnet ain't dead. It's used by hundreds of thousands, if not millions of networking devices the world over.


Why not?


Who won't support SPDY by the end of the year? The pace of protocol obsolescence is increasing. I can't find stats about how much HTTP traffic Facebook is responsible for, but I'd bet it's at least 1%.

Governments and home grown enterprise apps are my guess about who's late to the party.


El cheapo web hosting will be late to the party. TLS certificates will be a stumbling block.


Excellent judge of the protocol. It's interesting that no one is trying to solve the cookie bloat problem with http.


Removing cookies from a protocol which is otherwise fully compatible with HTTP/1, in the sense of being able to be interposed as a proxy or substituted in the web server without breaking apps, is a terrible idea.

> Cookies are, as the EU commission correctly noted, fundamentally flawed, because they store potentially sensitive information on whatever computer the user happens to use, and as a result of various abuses and incompetences, EU felt compelled to legislate a "notice and announce" policy for HTTP-cookies.

> But it doesn't stop there: The information stored in cookies has potentially very high value for the HTTP server, and because the server has no control over the integrity of the storage, we are now seeing cookies being crypto-signed, to prevent forgeries.

Anyone with a grain of skill is capable of using cookies as identifiers only; it's hard to see what cookies vs identifiers has to do with "notice and announce" or security. An explicit session mechanism could provide benefits over using cookies for the same purpose, but what exactly would removing cookies achieve other than breaking the world?


I agree that anybody with sufficient clue can and will use cookies as id only.

Unfortunately such people are evidently few and far between.

Banning cookies and having the client offer a session identifier instead solves many problems.

For starters, it stores the data where it belongs: On the server, putting the cost of storage and protection where it belongs too.

This is a win for privacy, as you will know if you have ever taken the time to actually examine the cookies on your own machine.

Second, it allows the client+user to decide if it will issue anonymous (ie: ever-changing) session identifiers, as a public PC in a library should do, or issue a stable user-specific session-id, to get the convenience of being recognized by the server without constant re-authorization.

Today users don't have that choice, since they have no realistic way of knowing which cookies belongs to a particular website due to 3rd-party cookies and image-domain splitting etc.

Network-wise, we eliminate a lot of bytes to send and receive.

One of the major improvements SPDY has shown is getting the entire request into one packet (by deflating all the headers).

But the only reason HTTP requests don't fit in a single packet to begin with is cookies, get rid of cookies, and almost all requests fit inside the first MTU.

Finally, eliminating cookies improves caching opportunities, which will help both the client and server side get a better web experience.

As for breaking the world: It won't happen.

It is trivial to write a module for apache which simulates cookies for old HTTP/1 web-apps: Simply store/look up the cookies in a local database table, indexed by the session-id the client provided.

I'm sure sysadmins will have concerns about the size of that table, but that is an improvement, today the cost is borne by the web-users.
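
Not an Apache module, but a rough WSGI-middleware sketch of that idea, assuming a hypothetical "Session-Id" request header carrying the client-offered session id; the legacy app keeps seeing ordinary Cookie/Set-Cookie headers:

    cookie_jar = {}   # session id -> stored Cookie header (a DB table in practice)

    class CookieSimulator:
        def __init__(self, legacy_app):
            self.legacy_app = legacy_app

        def __call__(self, environ, start_response):
            sid = environ.get("HTTP_SESSION_ID", "")
            if sid in cookie_jar:
                environ["HTTP_COOKIE"] = cookie_jar[sid]   # replay stored cookies

            def capture(status, headers, exc_info=None):
                kept = []
                for name, value in headers:
                    if sid and name.lower() == "set-cookie":
                        # keep it server-side instead of sending it downstream
                        # (a real jar would merge cookies, not overwrite)
                        cookie_jar[sid] = value
                        continue
                    kept.append((name, value))
                return start_response(status, kept, exc_info)

            return self.legacy_app(environ, capture)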


An identifier has privacy disadvantages over a cookie with the same duration. The least privacy you have is when the server has a unique identifier for you: then they can do whatever they want. With a cookie the site has an option to store only what they need, instead of something unique. For example if I'm running an a/b test I could do this with a cookie, setting it to "1" for half the users and "2" for the other half.

(I work on mod_pagespeed, and our experimental framework uses cookies this way: https://developers.google.com/speed/docs/mod_pagespeed/modul...)
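
A trivial sketch of that kind of non-identifying cookie (the values are just the "1"/"2" example above):

    import random

    def assign_experiment_arm(existing_value=None):
        # The cookie carries only the experiment arm, nothing unique to the user.
        if existing_value in ("1", "2"):
            return existing_value               # keep the user in their arm
        return random.choice(("1", "2"))        # new user: flip a coin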


> This is a win for privacy, as you will know if you have ever taken the time to actually examine the cookies on your own machine.

Most of the cookies I've seen are some kind of hash.

> Second, it allows the client+user to decide if it will issue anonymous (ie: ever-changing) session identifiers, as a public PC in a library should do, or issue a stable user-specific session-id, to get the convenience of being recognized by the server without constant re-authorization.

> Today users don't have that choice, since they have no realistic way of knowing which cookies belongs to a particular website due to 3rd-party cookies and image-domain splitting etc.

I don't see how this makes sense - what's the difference?

Assuming that the session identifier is different between sites (if it's not, then the user has no option to "remove cookies" for a single domain without deauthenticating everywhere, and it's harder to determine which sites are tracking you):

- There will still be third party domains involved, since advertisers will still want to correlate traffic between domains;

- Sending a new session identifier with every request won't be practical, because you won't be able to log in, but users will be able to set their browsers to send a new identifier when the window is closed or whatever... just as they could currently configure their browser to clear cookies at that time.

Also, anyone who wants to abuse cookies can just use localStorage.

> But the only reason HTTP requests don't fit in a single packet to begin with is cookies, get rid of cookies, and almost all requests fit inside the first MTU.

Surely it's still useful to deflate things (user-agent...), though, and then what does it matter?

> Finally, eliminating cookies improve caching opportunities, which will help both client and server side get a better web experience.

How so? The server is perfectly justified in sending different content based on the session identifier, so wouldn't a proxy have to assume it would?

But if you want to say the result doesn't depend on cookies, can't you just set a Vary header?

> It is trivial to write a module for apache which simulates cookies for old HTTP/1 web-apps: Simply store/look up the cookies in a local database table, indexed by the session-id the client provided.

Eh... okay. This still breaks anything that uses JavaScript to interact with the cookies.


The client/user-agent gets to control what session-id gets sent to which sites.

That makes it possible for a UI design where the user can press a button and say "don't surf this site anonymously" with the default being a new random session-id for all other sites.

That will make tracking and correlation of webusage much harder, which I really don't see a downside to.

Deflate is bad on its own, it is a DoS amplifier and it makes the job of load-balancers much more resource intensive, because they have to retain compression state for all connections and spend CPU and memory on the inflation.

The server is perfectly justified in customizing content, and we have a header for saying that is the case: Cache-Control.

The problem with cookies is that they disable caching of everything on the site, including favicon.ico and there is nothing the server can do about it, because the cookies are sent on all requests.

Javascript will also have access to the session-id.


> That makes it possible for a UI design where the user can press a button and say "don't surf this site anonymously" with the default being a new random session-id for all other sites.

This is already possible, just give tabs their own cookie context by default. (Browsers don't make this the default, but they all have some variant of "incognito mode" already...)

> The problem with cookies is that they disable caching of everything on the site, including favicon.ico and there is nothing the server can do about it, because the cookies are sent on all requests.

I admit that I don't know much about HTTP caching, but I don't see why the Cookie header would inhibit caching. (Edit: Isn't the purpose of the Vary header to specify which request headers affected the result, including Cookie?)


(probably pressed the wrong reply link here?)

Cookies are almost never mentioned in Vary: so all caches have to assume that the presence of cookies means non-cacheable.
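
Roughly the conservative rule many caches apply in practice, sketched out (illustrative, not any particular cache's exact logic):

    def may_cache(request_headers, response_headers):
        cc = response_headers.get("Cache-Control", "")
        if "no-store" in cc or "private" in cc:
            return False
        if "Cookie" in request_headers or "Set-Cookie" in response_headers:
            return False    # assume the response is personalized, to be safe
        return True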


Hmm. That would indeed be an advantage of a new mechanism.

What I meant about JavaScript is that a server side cookie->session bridge for legacy code would not work in general because corresponding client side code sometimes expects to be able to see the cookies in document.cookie.


Why must HTTP 2.x be backwards compatible with 1.x? Should SSH have not been created because you can't talk to it with a telnet client? If a new protocol offers sufficient benefits, it would be worth having to make minor changes to apps to support both.

Cookies suck, from a technical and regulatory-compliance standpoint. Plus, I'll finally stop having to clear my cookies every month or so just to log in to my PayPal and American Express accounts. Both sites keep creating unique cookies on every login until there are so many that they pass their own web servers' max header length limits.


HTTP/2.0 doesn't have to be backwards compatible at all. In fact, I see the future protocol switch being pretty simple. There will be new HTTP/2.0 servers and HTTP/1.1 legacy servers. The clients will speak either language, but 2.0 servers will be faster. Eventually clients will let the user say if they want to talk to 1.1 servers at all.

The initial line will remain the same, except for the version:

    GET /page HTTP/2.0
    *** extra 2.0 headers/request ***
If the server speaks 2.0, it will just carry on. If it doesn't, the server will return a 505 and the client will resubmit the request:

    GET /page HTTP/2.0
    505 HTTP Version Not Supported
    GET /page HTTP/1.1
    *** 1.1 headers / request ***
There is no reason the protocols must be backwards compatible past the first line. Hell, 2.0 could even be binary after that first line. So, while they don't have to be compatible, they can still coexist.
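
A rough sketch of that fallback from the client side, assuming a server that answers an unknown version with 505 (purely illustrative; no real HTTP/2.0 wire format is implied):

    import socket

    def fetch(host, path="/page", port=80):
        # Try HTTP/2.0 first; resubmit as HTTP/1.1 if the server answers 505.
        for version in ("HTTP/2.0", "HTTP/1.1"):
            req = f"GET {path} {version}\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            with socket.create_connection((host, port)) as sock:
                sock.sendall(req.encode("ascii"))
                reply = sock.makefile("rb").read()
            if b" 505 " not in reply.split(b"\r\n", 1)[0]:
                return reply                    # the server accepted this version
        return reply                            # both attempts were refused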


It's not about "server speaking HTTP/2.0". It's about "application speaking HTTP/2.0".

Every web app today is married to $COOKIE statements. The whole problem for me as a web developer is that existing applications will have to be rewritten just to run on top of a cookie-less HTTP/2.0 protocol.

SPDY, despite all its architectural problems, has taken off in a huge way (even Facebook is implementing it) just because applications don't have to be rewritten. You could just install mod_spdy in apache and be done with it. From the view of the web developer, these kind of breaking changes just makes life more painful.


Of course this is about the server speaking HTTP/2.0. We are talking about a potential client-server protocol upgrade. The fact that millions of applications are written to the 1.1 spec shouldn't be a factor in working on the 2.0 spec.

I believe that HTTP/2.0 will necessarily require applications to be rewritten. Or at least frameworks will need to be updated in order to take advantage of the new features. And you can expect some pain when other features are taken away. If there is a push to make cookies more 'optional', you can expect that clients will start to let users block cookies entirely. Would you rather your application kinda work for people but not those on 2.0 browsers? No, of course not, you'd rather it work for everyone. So why try to run a 1.1 webapp across the 2.0 protocol.

If you are hard-coding $COOKIE statements in your code, you aren't writing it to a sufficient abstraction to be able to survive a future major version jump. But there's nothing wrong with that. Major version jumps in a protocol are pretty rare, and your code will still work just fine as a 1.1 webapp.

If you're writing applications that expect to be dealing with HTTP requests, then of course you'll have to rewrite applications to run on a major version upgrade of a protocol. This is what will be expected. Major version updates shouldn't necessarily be backwards compatible, and that's the main argument of the post. If the update is marginal, there will be nothing to drive adoption of 2.0 over 1.1.

What I was trying to point out was that, (hypothetically) if HTTP/2.0 isn't backwards compatible, that doesn't mean that HTTP/1.1 and 2.0 applications couldn't co-exist on the same site (or even the same server).

Plus, let's remember, this is all hypothetical - we are still trying to figure out the goals of HTTP/2.0.


An important consideration is that forcing the 1.x-style first line on the HTTP/2.0 server can actually really hurt it. I mostly agree with Kamp's considerations on HTTP routers, and not having all the data necessary for routing at fixed offsets in the packet instantly makes their job harder.

At the very least, you want the server-provided identity header to come before all the variable-length fields, because in normal situations most high-throughput servers will be able to fully route their traffic on it alone.
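
To make the point concrete, here is a toy sketch of routing on such a fixed-position token (the offset, length and backend list are invented for illustration):

    import hashlib

    BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
    TOKEN_OFFSET, TOKEN_LENGTH = 16, 32  # invented fixed layout

    def route(raw_request: bytes) -> str:
        # With the token at a fixed offset, the router never has to parse
        # variable-length headers before picking a backend.
        token = raw_request[TOKEN_OFFSET:TOKEN_OFFSET + TOKEN_LENGTH]
        digest = hashlib.sha1(token).digest()
        return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]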


No, I agree that HTTP doesn't make things easier on HTTP routers. Variable-length headers that have to be parsed do nothing for speed.

However, this is the method that HTTP has defined for version upgrades, so if you want to muck around with the first line, you lose the ability to co-exist with HTTP/1.1 on the same port.

And I really doubt that they'll want to switch ports for HTTP/2.0.

One possible method would be forcing fixed-width padding of the request-path and Host headers. This would potentially make it possible to use fixed offsets. But this strikes me as inelegant.


> Why must HTTP 2.x be backwards compatible with 1.x?

Because the only benefit of removing cookies is a tiny bit of simplicity which could theoretically allow removing (a small amount of) code browsers will already have to keep around for probably at least a decade to support existing websites. If cookies are mostly unused by the time HTTP/3.x rolls around, we can talk...

> Cookies suck, from a technical

Agreed, but...

> and regulatory-compliance standpoint.

I don't understand this point. Surely the need for regulation of user tracking by websites doesn't depend on whether cookies or an equivalent mechanism is being used? If people start using Not Cookies(tm), they will be unregulated at first, but the law will be changed if the effect is the same.

Edit: Similarly, any protocol that gives a website a persistent identity token without its explicitly requesting one is a bad idea - cookies do provide a modicum of visibility to the user regarding who's tracking them. Not sure exactly what Kamp is proposing.

> Plus, I'll finally stop having to clear my cookies every month or so just to log in to my PayPal and American Express accounts. Both sites keep creating unique cookies on every login until there are so many that they pass their own web servers' max header length limits.

Hah, no you won't. I strongly suspect legacy codebases will remain on HTTP/1.1 approximately forever, at least if 2.0 is backwards incompatible.


If you would bother to read the full thread as well as PHK's position proposal, you'd see that removing cookies brings more than a tiny benefit. The overall flavor of his proposal is to reify a concept of session and thereby eliminate redundant communication. At the same time we get substantial wins in security.


You can add a concept of session without removing cookies, which are trivial to keep supporting since, as I mentioned, browsers will have to support cookies for legacy sites for a long time anyway.

I doubt the security argument amounts to much, considering that there are few sites with cookie-based vulnerabilities, that it's long been trivially easy ($_SESSION in PHP) for any site to use identifiers as cookies, and that many of the sites that are vulnerable are the kind of old-fashioned things that will never be upgraded anyway.


Once you have a concept of a session-unique nonce, cookies are needless. Browsers can implement a locally encrypted resumption store where the user-entered entropy never touches the network to resume a session on the same machine. Resuming a session on a new machine could use two-factor auth with fallback one-time capabilities. That's big, doubly so given the clear historical trends in users' ability to memorize entropy and the likelihood of hashed-password database disclosure.
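
A minimal sketch of the server side of such a scheme (the "Session" header name is invented; nothing like it is standardized):

    import secrets

    SESSIONS = {}  # nonce -> server-side session state

    def start_session(response_headers):
        # Hand the client a random, unguessable nonce; the browser keeps it in a
        # locally encrypted resumption store and presents it to resume later.
        nonce = secrets.token_urlsafe(32)
        SESSIONS[nonce] = {}
        response_headers["Session"] = nonce  # hypothetical header name
        return nonce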

What you need to understand is that we can get rid of cookies, live in a more secure world, and give up nothing. The only thing holding us back is unwillingness to understand the underlying issues and fear that we stand to lose something by advocating change.

On top of that, by standardizing on a nonce we avoid all cookie request overhead larger than the nonce, which is not trivial. Every mandatory request byte we save under the MTU is huge.


There is no difference between browsers implementing a locally encrypted resumption store for a nonce and for cookies (since again, most sites where security is important already use cookies purely as identifiers); nor does it affect whether sites will start requiring two factor for all logins. The two systems are equivalent except that one is simpler, but the (small amount of) complexity of cookies is not what's blocking these types of security measures.

I'm not saying additional login security isn't a good idea (although since local cookies tend to be compromised by malware running on the machine rather than offline attacks, it may not be that useful to encrypt them), and I'm not saying that avoiding cookies isn't a good idea, because cookies do tend to get bloated. But tying the two proposals together is unhelpful to both, because they're essentially orthogonal.

As for avoiding cookie request overhead, that is again something you can do by adding a standard nonce without actually removing the old mechanism; sites that want to be fast or have a standardized way to interact with HTTP routers, and most sites that use web frameworks, as the frameworks get updated, will use the new one. The only way removing cookies would help is if servers started translating cookies for legacy applications automatically, but I don't see that becoming prevalent because of document.cookie and related concerns.

edit: and again, breaking backwards compatibility is a great way to slow HTTP/2 adoption, not that it really matters unless it brings TLS to all sites along with it (but that's another story...)


They are different because a server-suggested resumable session nonce requires no form-based authentication. User-entered entropy never hits the wire. I do not know how to emphasize this point more strongly. Please read the research on trends in brute-force and timing attacks vs. human capacity for entropy memorization.

The evidence is clear: the majority of sites do not handle cookies securely. They do not handle user-submitted entropy securely. Virtually no one supports two-factor authentication (props to Google on this point).

The scheme I am suggesting and the status quo are not equivalent as you suggest. I do agree that they are orthogonal, except at the point where we decide upon a standard.

I take it as a given that the status quo is unacceptable. If you disagree there's not much for us to discuss.

What I'd like to see is HTTP/2 as a fresh design that is unwilling to sacrifice security. We can always assume HTTP/1.1 or fall back to it under negotiation. Because of that, there is no reason to burden a new standard with backwards compatibility. Among the SPDY community you see this same perspective, often suggested as a tick/tock strategy where, when version n+2 goes online, version n+1 becomes backport-only support and version n is abandoned.


A server-suggested resumable session nonce is just a cookie. If you want the user to be able to put in a password without sending it in plaintext to the server (i.e. make HTTP authentication actually work properly), that would be really, really great but, I think, also a different proposal.

Well... I don't think it's worth drawing a line in the sand here, because the speed and, should TLS-always-on make it in, security benefits of the existing protocol are significant enough that everyone should be able to use them without rewriting their authentication system. But I'd certainly be for a comprehensive proposal for a new authentication system; it would probably be significantly cleaner than BrowserID.


> and regulatory-compliance standpoint

One of the main reasons people can't just turn off cookies is that they are needed for session management. This makes them very difficult to just disable. If there were a dedicated session-management mechanism in HTTP/2.0, it would remove a lot of the need for cookies. Then they could be used for what they were intended: local persistent state. This would also give users better methods for managing them (or just disabling them).


Eh... maybe eventually, once nothing uses cookies anymore (including existing HTTP sites). But surely this can be solved today by having browsers force cookies to expire with the session?


And that is what phk is advocating as part of his proposal. Cookies are the wrong tool for session management.


You probably know this, but cookies are sent with every damn request the client makes. So it's a tax; it may be a small percentage of total traffic, but most of the time it's useless.


It's easy to work around this in a backwards-compatible way, though, and in a way that's applicable to more than just cookies: in requests after the first, only re-send headers that have changed.
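
Something along these lines (just a sketch, assuming both ends remember the headers from the previous request on the connection):

    def header_delta(previous, current):
        # Send only headers that are new or changed; None marks a deletion.
        delta = {k: v for k, v in current.items() if previous.get(k) != v}
        delta.update({k: None for k in previous if k not in current})
        return delta

    def apply_delta(previous, delta):
        # The receiver merges the delta into its copy of the previous headers.
        return {k: v for k, v in {**previous, **delta}.items() if v is not None}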

Or just live with the hit while sites migrate to the new mechanism; I'm fine with it being considered a legacy thing.


I dunno - obviously different projects diverge on required backwards compatibility (Python 2/3?), but SSH isn't called Telnet v2.

For something as fundamental as HTTP, the author argues changes need to be radical to drive adoption, but at the same time there's not necessarily widespread impetus to do so if the burden is too high. This is engineering on a 15-year time-scale, which I feel a little young (at 24) to fully comprehend!

It's not just apps in the web-app sense, but user agents (all the way down to embedded systems) that would need changing to take advantage of the 'sufficient benefits'. That's a pretty massive undertaking.


HTTP 1.x will not disappear just because 2.x is available. Millions of people still run wifi networks on 802.11b; all those routers keep on working even though 802.11a/g/n had 'sufficient benefits' that we built them into new hardware. There was no massive undertaking to replace all the routers in the world.

Nobody would have to change their old apps or hardware. Like SPDY, the availability of a new protocol supported by web browsers just means new stuff can optionally do things old stuff can't.


It doesn't seem like the proposals are "fully" compatible with HTTP. Some of them are entirely different encodings. And I doubt people are thinking of carrying over comments in headers and line folding...

What actually is the proposal to eliminate cookies? Just provide some fixed "identifier" type field?


> And I doubt people are thinking of carrying over comments in headers and line folding...

Heh, I don't think any reasonable web apps actually depend on the value of those :)

> What actually is the proposal to eliminate cookies? Just provide some fixed "identifier" type field?

Unfortunately, I don't think there is a concrete proposal to compare to, other than

    Given how almost universal the "session" concept on the Internet we
    should add it to the HTTP/2.0 standard, and make it available for
    HTTP routers to use as a "flow-label" for routing.


I'd say the cookies situation could be solved by one-page apps. Because the software persists across page views, it'll be able to maintain a session using some other mechanism (localStorage, the lifetime of the tab, whatever). The user could then disable cookies and not be worried about tracking.


so your "solution" for cookies, is to mandate that all web apps must use ajax, and pass session IDs as part of the URL for all requests?


Frankly, many problems addressed by these proposals are not things that should be solved in a protocol like HTTP. Do you want every HTTP request to be slowed down with a crypto token exchange and verification? Unlikely. In special cases you definitely want it, but all the time? Absolutely not.


It may be counter-intuitive, but with the right ciphers, the crypto costs of SSL and SPDY are negligible.


No, they are not. For one thing, you have to terminate all your SSL on your load balancer in order to distribute the traffic. That makes SPDY a no-go for web hotels/web hosting, where each customer has their own certificate.

Second, there are perfectly valid, legally mandated circumstances which forbid end-to-end privacy, from children in schools to inmates in jail and patients in psychiatric hospitals, not to mention corporate firewalls and the monster that looks out for classified docs not leaking out of the CIA.


> No, they are not. For one thing, you have to terminate all your SSL on your load balancer in order to distribute the traffic. That makes SPDY a no-go for web hotels/web hosting, where each customer has their own certificate.

That's what SNI is for.
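
For what it's worth, per-customer certificate selection via SNI is straightforward at the terminating proxy; a rough Python sketch (hostnames and certificate paths are invented placeholders):

    import ssl

    contexts = {}
    for name in ("customer-a.example", "customer-b.example"):
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(f"/etc/ssl/{name}.pem")  # invented path
        contexts[name] = ctx

    default_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

    def choose_context(ssl_socket, server_name, initial_context):
        # Called during the TLS handshake with the SNI hostname the client sent.
        if server_name in contexts:
            ssl_socket.context = contexts[server_name]

    default_ctx.sni_callback = choose_context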

> Second, there are perfectly valid legally mandated circumstances which forbid end-to-end privacy

Then install spyware on the user's computer, or add a trusted SSL key and mitm all the things.


I was talking about performance, but: SNI addresses your first point. For the second point, whoever needs to enforce snooping already has to deal with SSL.


What about embedded devices/the 'internet of things'? Should I have to add a (costly) crypto chip to my microcontroller just to have an HTTP/web interface? Or a large SSL stack?

(HTTP 1 would be sufficient for these use cases for a long time to come, but resource considerations should factor into these standards discussions)


I would keep HTTP/1.1, as you suggest. SSL servers on cheap (sub-Raspberry Pi) embedded devices are a no-go for another reason: it is hard to keep the certificate unique and private. Embedded devices with smart cards might be workable.


One huge benefit I think people are missing about the notion of a protocol-standard session mechanism is that we could use HTTP auth much more easily, and perhaps get away from this notion of every site having to redo the login process. Browsers can handle the "remember login" settings, and logging out is as simple as closing the tab. No "remember this computer" or "don't remember this computer" checkbox confusion. No random sites saying "remember me always" and requiring manual logout on a borrowed computer. It certainly helps with the password wallet concept too.

Sure, all that stuff has become semi-standard as it currently exists, but it is ugly and hacky; sometimes it doesn't work, and other times it opens doors for hilarious malfeasance.


"Overall, I find all three proposals are focused on solving yesteryears problems, rather than on creating a protocol that stands a chance to last us the next 20 years."


"We have also learned that a protocol which delivers the goods can replace all competition in virtually no time."

This is an argument for websockets eating protocol lunch.


In my opinion, it is always a good idea in these kinds of situations to set a goal to strive for.

I wonder what the ideal web protocol would look like if, for example, we didn't have the burden of billions of servers and the Internet's reliance on the HTTP/1.x protocol.

What would be the ideal solution to suit the emerging use cases for the Web? Are there any research papers on this topic?


I agree with this way of thinking. phkamp has come up with some innovative ideas, but if it doesn't come easily for the rest of us, there are lots of techniques to help, e.g. asking, "If I were Superman and could implement any solution, what would it be?" http://personalexcellence.co/blog/25-brainstorming-technique...


Load balancers aren't meant to just be "HTTP routers". They can definitely be used as such for smaller applications and do a good job at it, but a real load balancer needs to be quite complex, able to adapt to the underlying applications that make use of it.

If your goal is only to route HTTP requests, then you're only solving the first step of an increasingly complicated field of computer science (namely, web applications).

Cookies aren't going to go away. If you want to improve the protocol to deal with cookies better, that makes sense, but acting like they are some kind of evil on the internet that should be forgotten isn't going to work. It's a bit self-defeating to argue that some protocols failed because they didn't provide new benefits and then argue against cookies in HTTP!


Load balancers aren't meant to just be "HTTP routers". They can definitely be used as such for smaller applications and do a good job at it, but a real load balancer needs to be quite complex, able to adapt to the underlying applications that make use of it

I think Poul-Henning Kamp is fairly well qualified to discuss what load balancers do.

(And yes, load balancers are fundamentally HTTP routers. Yes, sometimes they do content manipulation, etc., but all those features are add-ons to the basic use case.)


Perhaps Kamp is qualified to discuss what some 'load-balancers' do...

The proxy developers have always had their doubts about SPDY (you can see them when @mnot first proposed it as a starting point)

Terminating SPDY at the HTTP router makes a lot of sense architecturally, but I know some orgs don't terminate SSL at the load balancer due to the licensing costs.

Ultimately we need load-balancing options, and someone to develop the open-source proxies (HAProxy, Varnish, etc.) into more sophisticated offerings.

Perhaps there'll be a SPDY module for Traffic Server.


I don't see the issue with doing away with cookies. Cookies suck for so many reasons. A natively handled session is just fine. Oh yes, it also means you can't be easily tracked, because that's the other purpose of cookies. Well, too bad. You can still track through other means.

If you wanna store stuff, there's HTML5. Cookies are really just for tracking and sessions.

As for the router, what he says makes complete sense, but if there is more to it, then what are you thinking about? Personally, I think the Host header is the most important thing to parse for, well, routing, termination, etc. I'm not certain what else is needed beyond that point.


Although I think he should further consider why many websites are using cookies at all, or even cryptographically secure cookies, I think he makes a good point about caching and cookies. Websites with large amounts of traffic often move their static content to separate domains to avoid clients sending Cookie headers on every request for static/forever-cacheable assets.


I can think of a few organisations that benefit financially, a great deal, from the fact that cookies allow them to track users. Ad companies like Google and Microsoft spring to mind. The same organisations building our major browsers... Conflict of interest? You'd better believe it.


Imagine if browsers reported a user-resettable UUID on each connection open. Advertisers would get the benefit of being able to track users, would do it more securely since no data is ever stored on the client, and would have to eat the cost of actually storing the key-value pairs, instead of passing it down to users and expending massive amounts of bandwidth bouncing the data back and forth.
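
As a hypothetical sketch of that trade-off, the server would keep the key-value pairs itself, keyed on the opaque identifier, instead of round-tripping them in cookies:

    STORE = {}  # identifier -> key/value pairs kept server-side

    def put(identifier, key, value):
        STORE.setdefault(identifier, {})[key] = value

    def get(identifier, key, default=None):
        # When the user resets their identifier, the old bucket is simply
        # orphaned; nothing identifying remains on the client.
        return STORE.get(identifier, {}).get(key, default)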

It's not about killing cookies; it's about finding a mechanism that solves the problem better.

Note: I work for a startup that "benefits financially" from tracking users—a feature without which we would not have a business.


This already exists in spirit, in the form of "Incognito" or "Private" browsing. This doesn't solve the protocol issues of cookie bandwidth and insecure storage, but it does provide a user-resettable store of cookies.



