That doesn't take away from my appreciation of the pattern, though: I'm very much in favour of rolling releases forwards, rather than being limited to two colours.
Making this handoff automatic is definitely possible as well, though we do want people to reload occasionally to get new client-side code.
I haven't checked to see if autobahn has this functionality.
I haven't ever worked on chat services, so this may not be reasonable. Would it be possible to use some other termination endpoint that sits in front of the service, that allows you to maintain persistent connections to the clients, but make for more transparent swaps of backend services?
So, for example, could you leverage nginx or haproxy as the "termination" point for the chat connection, with those proxying back to the Kubernetes service endpoint, which then proxies back to the real backend service? Then, when you go to swap out the backend service, nginx / haproxy would start forwarding to the new service transparently while still maintaining the long-lived connection with the client.
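A minimal sketch of what that termination layer might look like, as a ConfigMap embedding an nginx config that terminates the websocket and proxies to a backend Service. The names (`chat-proxy-config`, `chat-backend`) and ports are assumptions for illustration, not anything from the article:

```yaml
# Hypothetical ConfigMap holding an nginx.conf for websocket termination.
apiVersion: v1
kind: ConfigMap
metadata:
  name: chat-proxy-config
data:
  nginx.conf: |
    events {}
    http {
      server {
        listen 8080;
        location /ws {
          # Proxy to an assumed Kubernetes Service named "chat-backend".
          proxy_pass http://chat-backend:9000;
          # Headers required for the websocket Upgrade handshake.
          proxy_http_version 1.1;
          proxy_set_header Upgrade $http_upgrade;
          proxy_set_header Connection "upgrade";
          # Keep long-lived connections from being reaped at the proxy.
          proxy_read_timeout 24h;
        }
      }
    }
```

One caveat with this design: the proxy-to-backend leg is still a TCP connection to a specific pod, so if that pod goes away, the websocket through it still breaks; the proxy layer only helps if the client-facing connection can be preserved or gracefully re-established across the swap.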
If this were doable, it would mean you'd only have to drain if you needed to swap out the proxy layer, which is likely a less frequent task, and thus allows you more agility with your backend services.
I have used it before. Super easy to set up, even with Kubernetes.
It also seems like they could have done simple blue/green and extended their websocket protocol and client to support a hand-off message ("hey, there's a new version of the proxy, reconnect in x seconds"), along with idempotent message delivery. That would give them a rather smooth schedule where everyone reconnects and then disconnects gradually over some period of minutes instead of hours, with neither sudden spikes nor any period of interruption for clients.
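A sketch of what such a hand-off control message could look like; the field names are invented for illustration and are not part of any real protocol mentioned here:

```yaml
# Hypothetical hand-off message pushed to connected clients.
type: reconnect_request
reason: proxy_upgrade
# Each client picks a random delay up to this bound, so reconnections
# spread out instead of arriving as one spike.
reconnect_within_seconds: 300
```

The jittered delay is what turns a thundering herd of reconnections into a gentle trickle over the window.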
Also, if the previous ReplicaSet that a Deployment is rolling away from has several pods, maybe only some of them need to stay alive (some may drain sooner than others).
Perhaps the whole endeavor should just be to make pod draining a bit more explicit than terminationGracePeriodSeconds alone: perhaps letting a pod signal with a positive confirmation that it's shutting down (letting connections drain), so the rest of the Kubernetes controllers just leave it alone until it terminates itself.
Although really, I think it would be enough to combine setting terminationGracePeriodSeconds to unlimited with a health check that ensures the pod doesn't get wedged and miss the termination signal (by checking that a pod status of "shutting down" corresponds to some property of the container, like a health endpoint reporting that shutdown is in progress), and then nothing else needs to be done. Basically, color me skeptical when they say:
"We used service-loadbalancer to stick sessions to backends and we turned up the terminationGracePeriodSeconds to several hours. This appeared to work at first, but it turned out that we lost a lot of connections before the client closed the connection. We decided that we were probably relying on behavior that wasn’t guaranteed anyways, so we scrapped this plan."
(This also depends on the container obeying the usual SIGTERM contract of draining existing connections while refusing new ones, which most web servers honor nowadays.)
So if you set a 5-hour grace period and a preStop hook that invokes a script that doesn't return until all connections are closed (but which tells the container process not to accept new ones), you can control the drain rate.
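A minimal sketch of that pattern in a pod spec. The container name, image, port, and the admin endpoint for refusing new connections are all assumptions for illustration:

```yaml
# Sketch: long grace period plus a preStop hook that blocks until drained.
spec:
  terminationGracePeriodSeconds: 18000   # 5 hours
  containers:
  - name: chat
    image: example/chat:latest           # placeholder image
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            # Hypothetical admin endpoint telling the app to stop
            # accepting new connections.
            wget -qO- http://localhost:9000/admin/stop-accepting || true
            # Block until established connections on the app port drain;
            # SIGTERM is only delivered after preStop returns.
            while [ "$(ss -Htn state established '( sport = :9000 )' | wc -l)" -gt 0 ]; do
              sleep 10
            done
```

Because the kubelet waits for the preStop hook (up to the grace period) before sending SIGTERM, the drain pace is governed by the hook's loop rather than by a fixed timeout alone.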
Some app-level smarts are required: new connections have to be rejected, and any proxies have to rebalance you. HAProxy does this in most cases, but the service proxy won't (in iptables mode).
If that's not the behavior you're seeing, please open a bug on Kube and assign me (this is something I maintain).
One extra thing I remember being problematic: when a pod was Terminating it would get removed from the Endpoints, so any tooling that used the API info to keep an eye on connections was basically unusable at that point.
Live demo: https://youtu.be/1kjgwXP_N7A?t=10m46s
The author points out that the issue with Blue/Green/AnyColor deploys is that they need 16 pods per color at all times (which in their case would end up being 128 pods) and 24-48 hours for each connection pool to drain.
But how is using a SHA instead of a COLOR any different? Unless I am missing something, if running 128 pods and 24-48 hours of draining is the issue, then using SHAs instead of colors doesn't solve either problem.
You'll still need 16 pods and 24-48 hours per SHA deploy, and you're actually making it worse by not using fixed colors, since you have far more SHAs at your disposal.
You could do the same with $Color; SHAs just seem clearer, since people often think of $Color as static deployment infrastructure, whereas they're used to SHAs pointing to branches that are naturally cleaned up.
I'm curious though: if the rollout was spread over a couple of hours, for example, why would reconnections be that big of a problem? We host 10,000+ websockets on a $20 VPS, and the Go server hosting them crashes from time to time. The resulting surge of 10,000 reconnections has never lasted more than a minute or so, so why is it so bad? Moments of peak load aren't that big of a deal, or are they?
Am I missing something, or wouldn't it be as "simple" as connecting to the running container, running netstat, and conditionally killing the pod based on the number of connections? I bet you thought of that, so I'm curious why it didn't work for you.
Could probably solve this with a readiness probe / health check of sorts that is smart enough to know what low usage means.
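One way "low usage" could be expressed as a probe; the port and the threshold of 10 connections are made-up values for illustration:

```yaml
# Hypothetical exec readiness probe approximating "low usage" as fewer
# than 10 established connections on an assumed app port 9000.
readinessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - test "$(ss -Htn state established '( sport = :9000 )' | wc -l)" -lt 10
  periodSeconds: 30
```

Note that a failing readiness probe only removes the pod from Service endpoints; actually killing the pod once it's drained would still need some external tooling that watches the same signal.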
You can drain stuff by changing a Service's selector while leaving the Deployment alone. Instead of changing a Deployment and doing a rolling update, create a new Deployment and repoint the Service. Existing connections will remain until you delete the underlying Deployment.
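A sketch of the repointing half of that approach; the Service name, port, and the `release` label distinguishing the two Deployments are illustrative assumptions:

```yaml
# Illustrative Service whose selector decides which Deployment's pods
# receive new connections.
apiVersion: v1
kind: Service
metadata:
  name: chat
spec:
  selector:
    app: chat
    release: v2    # repoint by changing this from "v1" to "v2"
  ports:
  - port: 9000
    targetPort: 9000
```

The old Deployment's pods keep running (and keep their established connections) because nothing in their spec changed; they simply stop receiving new traffic once the selector no longer matches them.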