High Availability
- by mattjgilbert
Udi Dahan presented at the UK Connected Systems User Group last night. He discussed High Availability and pointed out that people often think of it as purely an infrastructure challenge. However, the implications of system crashes, errors, and the resulting data loss also need to be considered and managed by software developers. In addition, a system should remain both highly reliable (backwardly compatible) and available during deployments and upgrades. The argument is that you cannot be considered highly available if your system goes down every time you upgrade.
For our recent BizTalk 2009 upgrade we made use of our Business Continuity servers (note the name, rather than calling them Disaster Recovery servers) to ensure our clients could continue to operate while we upgraded the Production BizTalk servers. Then we failed back to the newly built 2009 environment and rebuilt the BC servers. Of course, this left windows during which one set of servers or the other was not available to take over in the event of an actual disaster. However, our Staging machines were already primed to switch to production settings, having been used for testing the upgrade in the first place.
While not perfect (the failover between environments was not automatic, and not without some minimal outage), planning the upgrade in this way meant that BizTalk was online during the rebuild and upgrade project, that we didn’t have to rush things to get back online, and that we were ready to be as available as we could be in the event of an actual disaster.
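The sequence described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the tooling we used: the environment names, the "old" version label and the `snapshot` and `run_upgrade` helpers are all hypothetical. It records, at each stage, which environment is serving clients and which others could take over.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    name: str
    version: str
    online: bool = True


def snapshot(step, active, envs):
    """Record which environment serves clients and which others could take over."""
    standby = [e.name for e in envs if e.online and e is not active]
    return {
        "step": step,
        "active": active.name,
        "active_online": active.online,
        "standby": standby,
    }


def run_upgrade():
    # Hypothetical names and labels; "old" stands in for the pre-upgrade version.
    prod = Environment("production", "old")
    bc = Environment("business-continuity", "old")
    staging = Environment("staging", "2009")  # already used to test the upgrade
    envs = [prod, bc, staging]
    log = []

    # 1. Fail clients over to BC, then take production down for the rebuild.
    active = bc
    prod.online = False
    log.append(snapshot("production being rebuilt", active, envs))

    # 2. Production returns on the new version; fail back to it.
    prod.version, prod.online = "2009", True
    active = prod
    log.append(snapshot("failed back to production", active, envs))

    # 3. Rebuild BC; staging is the only fallback during this window.
    bc.online = False
    log.append(snapshot("BC being rebuilt", active, envs))
    bc.version, bc.online = "2009", True
    log.append(snapshot("upgrade complete", active, envs))
    return log


if __name__ == "__main__":
    for entry in run_upgrade():
        print(entry)
```

In this model the active environment is online at every recorded stage, and Staging is the only fallback during the two rebuild windows. The brief outage during the manual failover itself is not modelled.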