I am designing a 2-host Hyper-V R2 cluster with 6-10 guests stored on a SMB iSCSI SAN device (probably Promise VessRAID). I will be getting at least two of the SAN devices and
need to eliminate the storage a single point of failure. Ideally, that would involve real-time failover for the storage, like the Windows failover clustering does for the hosts.
This design will be used at around six of our sites, and I would like to allow for us to eventually setup a cluster at colocation site and replicate each site's VMs there for DR. (Ideally a live multi-site cluster, but a manual import of the VMs would be fine for this sort of DR.)
The tools that come with enterprise SANs, like EMC and NetApp, seem to be the most commonly used items for a Hyper-V cluster, but I can't afford their prices with my budget. Outside of them, the two tools that seem to be most common for Hyper-V storage replication are SteelEye (now SIOS) DataKeeper Cluster Edition and Double-Take Availability.
Originally, I was planning on using Clustered Shared Volume(s) (CSV), but it seems like replication support for these is either not available or brand new in both these products. It looks like CSVs are supported in Double-Take 5.22, see this discussion, but I don't think I want to run something that new in production.
Right now, it seems like the best option for me is not to implement CSVs, implement some sort of storage replication, and upgrade to CSVs at a later date once replicating them is more mature. I would love to have live migration, and CSVs are not required for live migration if you are using one LUN per VM, so I guess this is what I'll do.
I would prefer to stick to the using the Microsoft Windows Server and Hyper-V tools and features as much as possible. From that standpoint, SteelEye looks more appealing than Double-Take because they make the DataKeeper volume(s) available to the Failover Clustering Manager and then failover clustering is all configured and managed through the native Microsoft tools. Double-Take says that "clustered Hyper-V hosts are not supported," and Double-Take Availability itself seems to be what is used for the actual clustering and failover.
Does anyone know if any of these replication tools work with more than two hosts in the cluster? All the information I can find on the web only uses two hosts in their examples.
Are there any better tools than SteelEye and Double-Take for doing what I am trying to do, which is eliminate the storage as as single point of failure? Neverfail, AppAssure, and DataCore all seem to offer similar functionality, but they don't seems to be as popular as SteelEye and Double-Take.
I have seen a number of people suggest using Starwind iSCSI SAN software for the shared storage, which includes replication (and CSV replication at that). There are a couple of reasons I have not seriously considered this route:
1) The company I work for is exclusively a Dell shop and Dell does not have any servers with that I can pack with more than six 3.5" SATA drives.
2) In the future, it could be advantegous for us to not be locked into a particular brand or type of storage and third-party replication softwares all allow replication to heterogeneous storage devices.
I am pretty new to iSCSI and clustering, so please let me know if it looks like I am planning something that goes against best practices or overlooking/missing something.