glusterfs mounts get unmounted when 1 of the 2 bricks goes offline

Posted by Shiquemano on Server Fault, published 2013-10-31.

I have an odd case where one of the two replicated GlusterFS bricks will go offline and take all of the client mounts down with it. As I understand it, this should not happen: the clients should fail over to the brick that is still online, but that hasn't been the case. I suspect a configuration issue.
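
For reference, the peer and brick state can be checked from either gluster server with something like this (the volume is named datavol, as in the client volfile below):

gluster peer status
gluster volume status datavol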

Here is a description of the system:

  • 2 gluster servers on dedicated hardware (gfs0, gfs1)
  • 8 client servers on VMs (client1, client2, client3, ... , client8)

Half of the client servers are mounted with gfs0 as the primary and the other half are pointed at gfs1. Each client is mounted with the following entry in /etc/fstab:

/etc/glusterfs/datavol.vol /data glusterfs defaults 0 0
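
As far as I understand it, that fstab entry is equivalent to mounting by hand with something like:

mount -t glusterfs /etc/glusterfs/datavol.vol /data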

Here is the content of /etc/glusterfs/datavol.vol:

volume datavol-client-0
    type protocol/client
    option transport-type tcp
    option remote-subvolume /data/datavol
    option remote-host gfs0
end-volume

volume datavol-client-1
    type protocol/client
    option transport-type tcp
    option remote-subvolume /data/datavol
    option remote-host gfs1
end-volume

volume datavol-replicate-0
    type cluster/replicate
    subvolumes datavol-client-0 datavol-client-1
end-volume

volume datavol-dht
    type cluster/distribute
    subvolumes datavol-replicate-0
end-volume

volume datavol-write-behind
    type performance/write-behind
    subvolumes datavol-dht
end-volume

volume datavol-read-ahead
    type performance/read-ahead
    subvolumes datavol-write-behind
end-volume

volume datavol-io-cache
    type performance/io-cache
    subvolumes datavol-read-ahead
end-volume

volume datavol-quick-read
    type performance/quick-read
    subvolumes datavol-io-cache
end-volume

volume datavol-md-cache
    type performance/md-cache
    subvolumes datavol-quick-read
end-volume

volume datavol
    type debug/io-stats
    option count-fop-hits on
    option latency-measurement on
    subvolumes datavol-md-cache
end-volume
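
When a brick drops, I would expect the client log to show only one of the two replica subvolumes (datavol-client-0 or datavol-client-1) disconnecting. Assuming the log lands in the default per-mount location (/var/log/glusterfs/data.log for a /data mount), the connect/disconnect messages can be pulled out with something like:

grep 'datavol-client-' /var/log/glusterfs/data.log | grep -i connect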

The config above is the latest attempt at making this behave properly. I have also tried the following entry in /etc/fstab:

gfs0:/datavol /data glusterfs defaults,backupvolfile-server=gfs1 0 0

This was the entry for half of the clients, while the other half had:

gfs1:/datavol /data glusterfs defaults,backupvolfile-server=gfs0 0 0

The results were exactly the same as with the configuration above: both configs connect everything just fine, but neither fails over.
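
If it helps with diagnosis, the per-brick client connections can be listed from either server; as far as I know this shows whether each client holds an open connection to both bricks:

gluster volume status datavol clients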

Any help would be appreciated.

© Server Fault or respective owner

Related posts about replication

Related posts about mount

  • 12.10 update breaks NFS mount

    as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
    I've just upgraded to the latest 12.10 beta. Rebooted twice. The problem is with the NFS folders not mounting, here's a verbose log. # mount -v myserver:/nfs_shared/tools /tools/ mount: no type was given - I'll assume nfs because of the colon mount.nfs: timeout set for Mon Oct 1 11:42:28 2012 mount… >>> More

  • Mount SMB / AFP 13.10

    as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
    I cannot seem to get Ubuntu to mount a mac share via SMB or AFP. I've tried the following... AFP: apt-get install afpfs-ng-utils mount_afp afp://user:password@localip/share /mnt/share Error given: "Could not connect, never got a reponse to getstatus, Connection timed out". Which is odd as I can… >>> More

  • Mount Return Code for CIFS mount

    as seen on Server Fault - Search for 'Server Fault'
    When I run the following command (as root or via sudo) from a bash script I get an exit status (or return code in mount man page parlance) of 1: mount -v -t cifs //nasbox/volume /tmpdir/ --verbose -o credentials=/root/cifsid & /tmp/mylog It outputs the following into the myflog file: parsing… >>> More

  • Disable raid member check upon mount to mount damaged nvidia raid1 member

    as seen on Server Fault - Search for 'Server Fault'
    Hi, A friend of mine destroyed his Nvidia RAID1 array somehow and in trying to fix it, he ended up with a non-working array. Because of the RAID metadata, the actual disk data was stored at an offset from the beginning. I was able to identify this offset with dd and a hexeditor and then I used losetup… >>> More

  • Network shares do not mount.

    as seen on Super User - Search for 'Super User'
    My network shares were mounting fine yesterday.. suddenly they are not. They were mounting fine for the last two weeks or however long since I added them. When I run sudo mount -a I get the following error: topsy@monolyth:~$ sudo mount -a mount error(12): Cannot allocate memory Refer to the mount… >>> More