How far should we take the N+N redundancy craziness?
Posted by Brann on Server Fault
Published on 2009-08-14T07:11:27Z
Hardware | redundancy
The industry standard when it comes to redundancy is quite high, to say the least. To illustrate my point, here is my current setup (I'm running a financial service).
Each server has a RAID array in case something goes wrong with one hard drive
... and in case something goes wrong with the server, it's mirrored by another identical spare server
... and both servers cannot go down at the same time, because I've got redundant power, redundant network connectivity, etc.
... and my hosting center itself has dual electricity connections to two different energy providers, and redundant network connectivity, and redundant toilets in case the two security guards (sorry, four) need to use them at the same time
... and in case something goes wrong anyway (a nuclear strike? I can't think of anything else), I've got another identical hosting facility in another country with the exact same setup.
- Cost of reputational damage if down = very high
- Probability of a hardware failure with my setup: <<1%
- Probability of a hardware failure with a less paranoid setup: <<1% AS WELL
- Probability of a software failure in our application code: >>1% (if your software is never down because of bugs, then I suggest you double-check that your reporting/monitoring system is not down. Even SQL Server, which is arguably developed and tested by clever people with a strong methodology, is sometimes down)
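To put rough numbers on the comparison above, here is a minimal Python sketch. The probabilities are illustrative placeholders (the list only says "<<1%" and ">>1%"), and the sketch assumes that hardware and software failures are independent, which may not hold in practice.

```python
def p_any_failure(p_hardware: float, p_software: float) -> float:
    """Probability that at least one of two independent failure sources occurs."""
    return 1.0 - (1.0 - p_hardware) * (1.0 - p_software)


# Illustrative assumptions, not measurements.
P_SOFTWARE = 0.02        # stands in for ">>1%"
P_HW_PARANOID = 0.0001   # stands in for "<<1%" with full N+N redundancy
P_HW_MODEST = 0.001      # stands in for "<<1%" with a simpler setup

paranoid = p_any_failure(P_HW_PARANOID, P_SOFTWARE)
modest = p_any_failure(P_HW_MODEST, P_SOFTWARE)

print(f"paranoid setup: {paranoid:.4%}")
print(f"modest setup:   {modest:.4%}")
```

With these placeholder values, both totals stay close to the software figure alone, because the software term dominates either hardware term.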
In other words, I feel like I could host a cheap laptop in my mother's flat, and the human/software problems would still be my biggest risk.
Of course, there are other things to take into consideration, such as:
- scalability
- data security
- the clients' expectation that you meet the industry standard
But still, hosting two servers in two different data centers (with no extra spare servers and no doubled network equipment beyond what my hosting facility provides) would give me the scalability and the physical security I need.
I feel like we're reaching a point where redundancy is just a communication tool. Honestly, what's the difference between 99.999% uptime and 99.9999% uptime when you know you'll be down 1% of the time because of software bugs?
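The arithmetic behind that question can be sketched in a few lines of Python. This is a rough illustration, assuming the hardware and the software must both be up for the service to be up (so their availabilities multiply) and treating the 1% software figure as a flat average over the year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes(availability: float) -> float:
    """Expected minutes of downtime per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined(*availabilities: float) -> float:
    """Availability of components in series: all of them must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


SOFTWARE = 0.99  # "down 1% of the time because of software bugs"

for hardware in (0.99999, 0.999999):
    total = combined(hardware, SOFTWARE)
    print(
        f"hardware {hardware:.4%}: "
        f"{downtime_minutes(hardware):.2f} min/yr alone, "
        f"{downtime_minutes(total):.0f} min/yr with software"
    )
```

Under these assumptions, the sixth nine saves under five minutes a year, against roughly 5,260 minutes (about 3.65 days) of software-related downtime.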
How far do you push your redundancy craziness?
© Server Fault or respective owner