Achieving Zero Downtime Deployment
- by MattW
I am trying to achieve zero downtime deployments so I can deploy less during off hours and more during "slower" hours - or anytime, in theory.
My current setup, somewhat simplified:
Web Server A (.NET App)
Web Server B (.NET App)
Database Server (SQL Server)
My current deployment process:
"Stop" the sites on both Web Server A and B
Upgrade the database schema for the version of the app being deployed
Update Web Server A
Update Web Server B
Bring everything back online
Current Problem
This leads to a small amount of downtime each month - about 30 mins. I do this during off hours, so it isn't a huge problem - but it is something I'd like to get away from.
Also - there is no way to really go 'back'. I don't generally make rollback DB scripts - only upgrade scripts.
Leveraging The Load Balancer
I'd love to be able to upgrade one Web Server at a time. Take Web Server A out of the load balancer, upgrade it, put it back online, then repeat for Web Server B.
The problem is the database. Each version of my software will need to execute against a different version of the database - so I am sort of "stuck".
Possible Solution
A current solution I am considering is adopting the following rules:
Never delete a database table.
Never delete a database column.
Never rename a database column.
Never reorder a column.
Every stored procedure must be versioned.
Meaning - 'spFindAllThings' will become 'spFindAllThings_2' when it is edited.
Then it becomes 'spFindAllThings_3' when edited again.
Same rule applies to views.
While, this seems a bit extreme - I think it solves the problem. Each version of the application will be hitting the DB in a non breaking way. The code expects certain results from the views/stored procedures - and this keeps that 'contract' valid. The problem is - it just seeps sloppy. I know I can clean up old stored procedures after the app is deployed for awhile, but it just feels dirty. Also - it depends on all of the developers following these rule, which will mostly happen, but I imagine someone will make a mistake.
Finally - My Question
Is this sloppy or hacky?
Is anybody else doing it this way?
How are other people solving this problem?