Running a Mongo Replica Set on Azure VM Roles
- by Elton Stoneman
Originally posted on: http://geekswithblogs.net/EltonStoneman/archive/2013/10/15/running-a-mongo-replica-set-on-azure-vm-roles.aspxSetting up a MongoDB Replica Set with a bunch of Azure VMs is straightforward stuff. Here’s a step-by-step which gets you from 0 to fully-redundant 3-node document database in about 30 minutes (most of which will be spent waiting for VMs to fire up).
First, create yourself 3 VM roles, which is the minimum number of nodes you need for high availability. You can use any OS that Mongo supports. This guide uses Windows but the only difference will be the mechanism for starting the Mongo service when the VM starts (Windows Service, daemon etc.)
While the VMs are provisioning, download and install Mongo locally, so you can set up the replica set with the Mongo shell.
We’ll create our replica set from scratch, doing one machine at a time (if you have a single node you want to upgrade to a replica set, it’s the same from step 3 onwards):
1. Setup Mongo
Log into the first node, download mongo and unzip it to C:.
Rename the folder to remove the version – so you have c:\MongoDB\bin etc. – and create a new folder for the logs, c:\MongoDB\logs.
2. Setup your data disk
When you initialize a node in a replica set, Mongo pre-allocates a whole chunk of storage to use for data replication. It will use up to 5% of your data disk, so if you use a Windows VM image with a defsault 120Gb disk and host your data on C:, then Mongo will allocate 6Gb for replication. And that takes a while. Instead you can create yourself a new partition by shrinking down the C: drive in Computer Management, by say 10Gb, and then creating a new logical disk for your data from that spare 10Gb, which will be allocated as E:. Create a new folder, e:\data.
3. Start Mongo
When that’s done, start a command line, point to the mongo binaries folder, install Mongo as a Windows Service, running in replica set mode, and start the service: cd c:\mongodb\bin
mongod -logpath c:\mongodb\logs\mongod.log -dbpath e:\data -replSet TheReplicaSet –install
net start mongodb
4. Open the ports
Mongo uses port 27017 by default, so you need to allow access in the machine and in Azure. In the VM, open Windows Firewall and create a new inbound rule to allow access via port 27017. Then in the Azure Management Console for the VM role, under the Configure tab add a new rule, again to allow port 27017.
5. Initialise the replica set
Start up your local mongo shell, connecting to your Azure VM, and initiate the replica set: c:\mongodb\bin\mongo sc-xyz-db1.cloudapp.net
rs.initiate()
This is the bit where the new node (at this point the only node) allocates its replication files, so if your data disk is large, this can take a long time (if you’re using the default C: drive with 120Gb, it may take so long that rs.initiate() never responds. If you’re sat waiting more than 20 minutes, start another instance of the mongo shell pointing to the same machine to check on it).
Run rs.conf() and you should see one node configured.
6. Fix the host name for the primary – *don’t miss this one*
For the first node in the replica set, Mongo on Windows doesn’t populate the full machine name. Run rs.conf() and the name of the primary is sc-xyz-db1, which isn’t accessible to the outside world. The replica set configuration needs the full DNS name of every node, so you need to manually rename it in your shell, which you can do like this: cfg = rs.conf()
cfg.members[0].host = ‘sc-xyz-db1.cloudapp.net:27017’
rs.reconfig(cfg)
When that returns, rs.conf() will have your full DNS name for the primary, and the other nodes will be able to connect.
At this point you have a working database, so you can start adding documents, but there’s no replication yet.
7. Add more nodes
For the next two VMs, follow steps 1 through to 4, which will give you a working Mongo database on each node, which you can add to the replica set from the shell with rs.add(), using the full DNS name of the new node and the port you’re using: rs.add(‘sc-xyz-db2.cloudapp.net:27017’)
Run rs.status() and you’ll see your new node in STARTUP2 state, which means its initializing and replicating from the PRIMARY. Repeat for your third node: rs.add(‘sc-xyz-db3.cloudapp.net:27017’)
When all nodes are finished initializing, you will have a PRIMARY and two SECONDARY nodes showing in rs.status().
Now you have high availability, so you can happily stop db1, and one of the other nodes will become the PRIMARY with no loss of data or service.
Note – the process for AWS EC2 is exactly the same, but with one important difference. On the Azure Windows Server 2012 base image, the MongoDB release for 64-bit 2008R2+ works fine, but on the base 2012 AMI that release keeps failing with a UAC permission error. The standard 64-bit release is fine, but it lacks some optimizations that are in the 2008R2+ version.