

Scalable Web Architectures: Common Patterns and Approaches - wave
http://www.slideshare.net/iamcal/scalable-web-architectures-common-patterns-and-approaches-web-20-expo-nyc-presentation

======
antirez
The idea of storing sensitive stuff client side is indeed powerful and very
simple. The problem is that the client could change this state, and in most
applications this is not a cool idea, so it's just a matter of signing it.

Assuming we want to store "foobar" client side, and make sure the client can't
change this, all we need is to have a server-side-secret, that is, a random
string that can't be guessed from outside. Get it with this:

    
    
        dd if=/dev/urandom bs=1 count=512 | md5sum
    

Now we can just set the cookie as

    
    
        "foobar|"+sha1_hmac("foobar","ServerSideSecret")
    

The user cannot alter the cookie without invalidating the signature, since it
will no longer match. But there are often other kinds of attacks: for example,
a user can get the cookie from another user, and in the context of the wrong
user id the cookie content can be dangerous.

Another common condition is that this information should only be valid for a
given amount of time.

To fix both of these problems, just add the current unix time and the user id
to the string to sign, so the final cookie will be (assuming 100 is the user id):

    
    
        "foobar|1244498681|100|"+sha1_hmac("foobar|1244498681|100","ServerSideSecret")
    

When you check the cookie back, make sure to compare the server unix time with
the timestamp in the cookie. If it's older than N seconds just ignore it.
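
A minimal Python sketch of the sign-and-check steps above, assuming HMAC-SHA1 with a hex digest and the same `value|timestamp|user_id|signature` layout. The names `make_cookie`, `read_cookie`, and `MAX_AGE` are made up for illustration, and the secret is a placeholder:

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"ServerSideSecret"  # placeholder; in practice a long random string
MAX_AGE = 3600                # hypothetical "N seconds" validity window


def sign(payload: str) -> str:
    """Hex HMAC-SHA1 of the payload under the server-side secret."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha1).hexdigest()


def make_cookie(value: str, user_id: int, now: Optional[int] = None) -> str:
    """Build value|timestamp|user_id|signature."""
    ts = int(time.time()) if now is None else now
    payload = f"{value}|{ts}|{user_id}"
    return f"{payload}|{sign(payload)}"


def read_cookie(cookie: str, user_id: int, now: Optional[int] = None) -> Optional[str]:
    """Return the stored value, or None if the cookie should be ignored."""
    payload, sep, mac = cookie.rpartition("|")
    # Constant-time comparison of the received and recomputed signatures.
    if not sep or not hmac.compare_digest(mac.encode(), sign(payload).encode()):
        return None
    try:
        value, ts, uid = payload.rsplit("|", 2)
        age = (int(time.time()) if now is None else now) - int(ts)
    except ValueError:
        return None
    # Reject cookies issued to another user or older than MAX_AGE seconds.
    if uid != str(user_id) or age > MAX_AGE:
        return None
    return value
```

Calling `read_cookie(make_cookie("foobar", 100), 100)` gives back `"foobar"`, while a cookie presented for a different user id, an expired one, or one with an edited value comes back as `None`.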

Another common trick is to add the user's IP address to the signed string, so
that the cookie cannot be reused from a different IP.

This stuff is very cool for saving CPU time on your servers, but make sure to
think about all the kinds of attacks, like replay attacks or sending the
cookie in other contexts: users, IPs, times, applications (if you share
secrets), and so on.

~~~
jjs
I like to stick the hmac hash at the beginning of the cookie text, since it
has a known length. I don't care about tampering on the client-side (since you
could achieve the same effect with firebug), so js scripts can simply lop off
the first _n_ characters and use the rest of the cookie text as-is.
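
A rough sketch of that layout, written in Python rather than client-side JS and assuming a hex HMAC-SHA1, so _n_ is 40. The function names and the secret are placeholders, not anything from a real app:

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"ServerSideSecret"              # placeholder secret
MAC_LEN = hashlib.sha1().digest_size * 2  # 40 hex characters for SHA-1


def pack(text: str) -> str:
    """Prefix the cookie text with its fixed-length hex HMAC."""
    mac = hmac.new(SECRET, text.encode(), hashlib.sha1).hexdigest()
    return mac + text


def body(cookie: str) -> str:
    """What a client script would do: drop the first MAC_LEN characters."""
    return cookie[MAC_LEN:]


def verified_body(cookie: str) -> Optional[str]:
    """What the server would do: recompute the HMAC before trusting the text."""
    mac, text = cookie[:MAC_LEN], cookie[MAC_LEN:]
    expected = hmac.new(SECRET, text.encode(), hashlib.sha1).hexdigest()
    return text if hmac.compare_digest(mac.encode(), expected.encode()) else None
```

The client never needs the secret: it only slices. The server is the only side that checks the prefix.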

------
pierrefar
Awesome overview.

I've seen the "super-slim" sessions idea discussed elsewhere in relation to
Flickr, but couldn't find much about it on the web. This presentation finally
explains it properly and it's a wonderful thing.

Now... about integrating it into my web apps....

------
gcv
The idea of shoving everything into an encrypted cookie --- it seems like it
should work well, and I really like not storing session state server-side.
Wouldn't the cookie run afoul of some security problem or other, though? I'm
having a hard time thinking up scenarios, but I am not a security expert.

<http://news.ycombinator.com/item?id=639647>

[EDIT] Also interesting that the talk did not mention nginx as either a proxy
or a software load balancer.

~~~
zmimon
One thing you gotta remember is that that cookie is going to get shoved up to
your server with _every single request_ (images, css, ajax, everything). So if
you end up putting 2k of stuff in there then it can really add up (esp. given
that many clients have asymmetric connection speed). You don't want your tiny
ajax requests that load 50 bytes of data sending the equivalent of 4 - 8k
upstream because they are all sending a bunch of unnecessary stuff in your
cookie. So, yes, it's good - but the key to it is keeping the cookie "slim".

(NB: couldn't see the presentation since I don't have flash, if they covered
this, apologies).

~~~
lsb
Yes, this is why you load images and CSS from a CDN, and you can do
JSON+callbacks instead of Ajax onto a big app server somewhere. (There's a gem
for Rack, if you're on Ruby, to automatically handle the parenthesization.)

~~~
gcv
What do you mean by JSON+callbacks?

~~~
lsb
You promote data to code.

JSON is usually a data format, like {"result": 42}. If you wrap that in
callback({"result": 42}), that's executable javascript code.

The page includes script tags via JS to load that code. The Delicious API,
Flickr API, Google Translate and whatnot APIs all rely on this. Check out
<http://developer.yahoo.com/common/json.html#callbackparam>
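
On the server side the wrapping is roughly this. It's a Python sketch, not the Rack gem mentioned above; `jsonp` is a made-up helper, and the callback-name check is an added precaution, since the name usually comes straight from a query parameter:

```python
import json
import re

# Allow only dotted JavaScript identifiers as callback names (assumed policy).
_CALLBACK = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*", re.ASCII)


def jsonp(data, callback: str) -> str:
    """Wrap JSON-serializable data in a call to the named callback."""
    if not _CALLBACK.fullmatch(callback):
        raise ValueError("invalid callback name")
    return f"{callback}({json.dumps(data)});"
```

So `jsonp({"result": 42}, "callback")` produces `callback({"result": 42});`, which the browser runs when the script tag loads.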

------
datums
If you're on EC2, they now offer Elastic Load Balancing, which allows you to
move away from the round-robin DNS setup many are using. If you have your own
hardware and are looking for a Layer 4 load balancing solution, take a look at
<http://www.ultramonkey.org/>

