
How We Release So Frequently - TomNomNom
http://engineering.skybettingandgaming.com/2016/02/02/how-we-release-so-frequently/
======
simonw
This is a good solid write-up of techniques that I think are emerging as best
practices generally for high scale, complex sites. The approach described
matches what we do at Eventbrite pretty closely, and I know companies like
Etsy and Slack use the same kind of process.

Feature flags in particular are a very powerful tool for managing feature
releases independent of the deploy cycle. Here's a good recent article that
dives into those in more detail:
[http://martinfowler.com/articles/feature-toggles.html](http://martinfowler.com/articles/feature-toggles.html)
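
For anyone who hasn't seen the pattern, here is a minimal sketch of the idea. It is not the article's or Eventbrite's implementation; the names `FLAGS`, `is_enabled` and `render_checkout` are made up for illustration. The point is that both code paths are deployed, and a flag looked up at request time decides which one runs.

```python
# Hypothetical in-memory flag store. Real systems usually read flags from a
# config service or database so they can be flipped without a deploy.
FLAGS = {"new_checkout": False}


def is_enabled(flag: str) -> bool:
    """Return True if the named flag is switched on; unknown flags are off."""
    return FLAGS.get(flag, False)


def render_checkout() -> str:
    # Both code paths ship in the same deploy; the flag picks one at runtime.
    if is_enabled("new_checkout"):
        return "new checkout"
    return "old checkout"
```

Flipping `FLAGS["new_checkout"]` to `True` releases the feature without touching the deployed code.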

~~~
kough
I just want to throw out that there's nothing about this that won't work for
!("high scale, complex") sites. I worked at a ~20 person startup and we
arrived at the same decision regarding database migrations and, you know what,
it wasn't that hard and it didn't take that much time and it made deploys a
lot smoother.

I see a lot of other people mentioning the annoyance factor. Like anything
else, you get used to it, and appreciate its advantages.

------
ktRolster
The biggest difficulty is managing database rollbacks. When you mess up your
database, rolling back can be tough.

These guys avoid the problem by never rolling back the database, and never
making changes that might require that.

~~~
axelfontaine
Flyway author here. Rollback is an illusion once your transaction has been
committed. At best you can attempt to issue a compensating transaction to undo
some of the effects. More details:
[https://flywaydb.org/documentation/faq.html#downgrade](https://flywaydb.org/documentation/faq.html#downgrade)
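
A small sketch of the point, using Python's built-in `sqlite3` rather than Flyway (the `accounts` table and the amounts are invented for the example): once the change is committed, `rollback()` has nothing left to undo, and the only option is a second, forward transaction that reverses the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

# A change that turns out to be a mistake, already committed.
conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
conn.commit()

# Rolling back now does nothing: there is no open transaction to undo.
conn.rollback()
after_rollback = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

# The compensating transaction: a new forward change that reverses the effect.
conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 1")
conn.commit()
after_compensation = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
```

Here the compensation is exact, but only because the original change was a simple arithmetic update; a dropped column or a lossy data rewrite has no such inverse.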

~~~
w4tson
Currently using Flyway on an enterprise Java project. Brilliant library.
Excellent documentation, developers sing its praises all day, every day.

------
rachelbythebay
Get off NFS while you still can. Thank me later.

~~~
ptrincr
I think NFS often gets an undue bad reputation. I work at a company which uses
NFS at scale and it does the job without too much trouble. NFS is used as the
storage for vmware datastores, xen primary storage and also for shared storage
mounted between servers.

For the latter case, the mounting of these partitions can be automated with
config management on the Linux servers. You have to be careful with UIDs and
GIDs, but config management helps with this.

The filers supplying the NFS storage can be exploited to provide replication
to other datacenters, snapshots, and also provide redundancy with multiple heads
serving the volumes.

In the past I've used Fibre Channel (found it overly complex) and iSCSI.
iSCSI was fairly straightforward to use, but I've never tried to automate it.
I guess there isn't a reason you couldn't, however. For complexity I guess it's
Fibre > iSCSI > NFS.

Performance-wise we don't have any issues with NFS itself; the bottleneck is
sometimes the filer trying to keep up :-)

Anyhow, in complex environments, sometimes it's good to keep things simple
where you can. NFS helps with that: it's stable, scalable, and the performance
is comparable to iSCSI.

Removing the need for shared storage on the OS where possible is the ultimate
aim though.

~~~
digi_owl
I wonder how much the experience differs based on the NFS version being
used.

------
rixed
In the old days of relational databases we had an abstraction layer between
actual storage and applications called a query language, decoupling them with
functions and views, which were helpful to change the schema independently of
the code...

~~~
QuercusMax
We don't use database views, but kinda-sorta mimic them by separating the
persistence layer from the API layer. It can be annoying to maintain (at the
big G there are lots of SWEs who joke that their job is writing code to copy
fields between protobufs), but the alternative is to couple your API directly
to what you're storing in the database. That is a road that leads to pain.

The benefit over using views is that all your code is written in the same
language, instead of having a whole bunch of logic running semi-hidden in the
database. If you have a bug in your view, you have to update your DB schema
(or at least roll out new PL/SQL DB code or whatever). And if you're working
with a planet-scale distributed app, it just plain won't work.
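
A rough sketch of what that separation looks like, in Python rather than protobufs, with hypothetical `UserRow` and `UserResponse` types standing in for the storage and API shapes:

```python
from dataclasses import dataclass


@dataclass
class UserRow:
    """Shape of the stored record (hypothetical)."""
    id: int
    full_name: str
    password_hash: str


@dataclass
class UserResponse:
    """Shape exposed by the API (hypothetical)."""
    id: int
    name: str


def to_response(row: UserRow) -> UserResponse:
    # The "copy fields between objects" code: tedious, but it is the one place
    # that has to change when the storage schema changes, and it keeps
    # storage-only fields such as password_hash out of the API.
    return UserResponse(id=row.id, name=row.full_name)
```

Renaming `full_name` in storage then means editing `to_response`, not every API client.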

------
fancy_pantser
I literally checked the date twice to make sure this wasn't 10+ years old.

~~~
vemv
I guess your point being that these aren't novel techniques?

The author sure didn't invent them, but they aren't widespread enough yet. In
fact e.g. Rails puts you in the opposite mindset (which is OK at early
stages).

------
girvo
Neat. Interestingly, for all its faults PHP in my experience has made this a
little easier to achieve than other languages we also use: shared-nothing and
no internal process state between requests makes cutting over a bit easier
than our equivalent node servers. Some great practical advice in this article.

~~~
qwer
That's a flaw in how you use node, not node itself. We run dozens of instances
of node without internal process state, the need for sticky sessions, etc.
You're losing a lot of load-balancing ability by not keeping this discipline
too.

------
Can_Not
I can see the infrastructural scenarios where this would be beneficial, but
maybe even then, I think this is a convoluted way to not use what I consider
critical workflow tools, specifically environmental files, config files, git
merge, git rebase, branches. I think you would be better off looking to see if
you can organize your files better, restrict merging major branches to
employees who are properly trained/competent in merging.

------
pbreit
Are frequent releases some sort of advantage?

~~~
FordPrefectAO
Whatever you are developing has no value until someone actually uses it.
Therefore the quicker you can confidently release something (feature, bugfix,
etc.), the more value you deliver.

This idea comes from the concept of inventory waste in lean manufacturing.

~~~
sdrothrock
In addition, faster releases can also reduce sunk costs. For example, if you
release new features in stages, you can correct course quickly from the
beginning based on actual user use/feedback instead of spending months
building up something that may not meet actual needs.

~~~
Can_Not
This is hugely important. My current employer wants our grand release to have
a powerful search feature. We have zero customers to use it and zero data to
search through.

------
tenken
What language, framework, tools do you use to manage these phased migrations?
It sounds like a lot of extra code to write.

Also 1 site update, or version number, could really be N releases until
fruition -- which don't sound like traditional releases to me.

~~~
d0m
>> It sounds like a lot of extra code to write.

What he's saying is more about convention than writing code. For instance,
instead of adding a column "abc" and doing:

foo.abc = 123;

they would do something like:

if (foo.abc) { foo.abc = 123; }

make sure all tests pass, and then migrate the db.

If you're asking about tools to migrate code, all popular languages have one
(e.g. Django comes with one that's already really good).
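
To make the convention concrete, here is a hedged sketch using Python's built-in `sqlite3`. The `foo` table and `abc` column come from the example above; the helpers `has_column` and `save_foo` are hypothetical. The code is deployed first and tolerates both the old and the new schema, and the migration runs afterwards.

```python
import sqlite3


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check whether a column exists, using SQLite's PRAGMA table_info."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def save_foo(conn: sqlite3.Connection, foo_id: int) -> None:
    """Write to abc only if the migration has already run."""
    if has_column(conn, "foo", "abc"):
        conn.execute("UPDATE foo SET abc = 123 WHERE id = ?", (foo_id,))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO foo (id) VALUES (1)")

save_foo(conn, 1)  # old schema: the new code is live and does nothing harmful

conn.execute("ALTER TABLE foo ADD COLUMN abc INTEGER")  # the migration
save_foo(conn, 1)  # new schema: the same code now writes the column
value = conn.execute("SELECT abc FROM foo WHERE id = 1").fetchone()[0]
```

Checking the schema on every write is only for illustration; a real application would more likely gate this on a flag or a cached schema version.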

------
DigitalJack
Seems like there is a race condition on the docroot switchover, but maybe with
their forward-only migrations it's a non-issue.
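
For context, a sketch of one common way to do such a switchover on POSIX systems, assuming the docroot is a symlink that gets repointed at a new release directory (that is an assumption here, and `switch_docroot` is a made-up helper): build the new link under a temporary name, then rename it over the old one.

```python
import os
import tempfile


def switch_docroot(link_path: str, new_target: str) -> None:
    """Repoint link_path at new_target with a single rename."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted switch
    os.symlink(new_target, tmp_link)
    # On POSIX, os.replace is rename(2), which swaps the link in one step,
    # so readers see either the old target or the new one, never a gap.
    os.replace(tmp_link, link_path)


# Usage example in a scratch directory.
base = tempfile.mkdtemp()
release_1 = os.path.join(base, "release-1")
release_2 = os.path.join(base, "release-2")
os.mkdir(release_1)
os.mkdir(release_2)

current = os.path.join(base, "current")
switch_docroot(current, release_1)
switch_docroot(current, release_2)
target = os.readlink(current)
```

The rename itself is atomic, but a request that started before the switch may load some files from the old release and some from the new one, which is presumably the kind of race in question.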

------
tbarbugli
Interesting 8/10 years ago!

------
stuaxo
Hm, is it any better working there than for the TV part?

~~~
unistdh
There are many tech teams at Sky. Each one is very different in terms of
culture, tech, product etc.

As I understand it, Sky Bet is quite separate from the rest of the company.
This is probably a good thing...

~~~
michae1m
Separate company these days.

