RHCS: GFS2 in A/A cluster with common storage. Configuring GFS with rgmanager

Posted by Pavel A on Server Fault
Published on 2012-12-11T15:10:04Z Indexed on 2012/12/15 17:05 UTC

I'm configuring a two-node A/A cluster with common storage attached via iSCSI, which uses GFS2 on top of clustered LVM. So far I have prepared a simple configuration, but I am not sure which is the right way to configure the GFS resource.

Here is the rm section of /etc/cluster/cluster.conf:

<rm>
    <failoverdomains>
        <failoverdomain name="node1" nofailback="0" ordered="0" restricted="1">
            <failoverdomainnode name="rhc-n1"/>
        </failoverdomain>
        <failoverdomain name="node2" nofailback="0" ordered="0" restricted="1">
            <failoverdomainnode name="rhc-n2"/>
        </failoverdomain>
    </failoverdomains>
    <resources>
        <script file="/etc/init.d/clvm" name="clvmd"/>
        <clusterfs name="gfs" fstype="gfs2" mountpoint="/mnt/gfs"  device="/dev/vg-cs/lv-gfs"/>
    </resources>
    <service name="shared-storage-inst1" autostart="0" domain="node1" exclusive="0" recovery="restart">
        <script ref="clvmd">
            <clusterfs ref="gfs"/>
        </script>
    </service>
    <service name="shared-storage-inst2" autostart="0" domain="node2" exclusive="0" recovery="restart">
        <script ref="clvmd">
            <clusterfs ref="gfs"/>
        </script>
    </service>
</rm>

This is what I mean: when the clusterfs resource agent is used to handle a GFS partition, the partition is not unmounted by default (unless the force_unmount option is given). So when I issue

clusvcadm -s shared-storage-inst1

clvm is stopped, but GFS is not unmounted, so the node can no longer alter the LVM structure on the shared storage, but it can still access the data. Even though the node can do this quite safely (dlm is still running), it seems rather inappropriate to me, since clustat reports that the service on that node is stopped. Moreover, if I later try to stop cman on that node, it will find a dlm lockspace still held by GFS and fail to stop.

I could simply add force_unmount="1", but I would like to know the reason behind the default behavior. Why is the file system not unmounted? Most of the examples out there silently use force_unmount="0", some don't, but none of them gives any clue as to how the decision was made.
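For reference, this is roughly what the resources section would look like with the option added. It is only a sketch based on my configuration above; the only change is the force_unmount attribute. As far as I understand (an assumption worth checking against the clusterfs agent metadata), the agent may also terminate processes that keep the mount point busy when this option is set.

```xml
<resources>
    <script file="/etc/init.d/clvm" name="clvmd"/>
    <!-- force_unmount="1" asks the clusterfs agent to unmount /mnt/gfs
         when the service is stopped, instead of leaving it mounted -->
    <clusterfs name="gfs" fstype="gfs2" mountpoint="/mnt/gfs"
               device="/dev/vg-cs/lv-gfs" force_unmount="1"/>
</resources>
```

With this in place, stopping the service with clusvcadm -s should leave neither clvmd running nor the GFS2 file system mounted on that node.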

Apart from that, I have found sample configurations where people manage GFS partitions with the gfs2 init script (https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Defining_The_Resources),

or even simply enable services such as clvm and gfs2 to start automatically at boot (http://pbraun.nethence.com/doc/filesystems/gfs2.html), like:

chkconfig gfs2 on

If I understand the last approach correctly, such a cluster only checks whether nodes are still alive and can fence errant ones, but it has no control over the status of its resources.

I have some experience with Pacemaker, where I'm used to all resources being controlled by the cluster, so that action can be taken not only when there are connectivity issues but also when any of the resources misbehaves.

So, which is the right way for me to go:

  1. leave GFS partition mounted (any reasons to do so?)
  2. set force_unmount="1". Won't this break anything? Why is this not the default?
  3. use script resource <script file="/etc/init.d/gfs2" name="gfs"/> to manage GFS partition.
  4. start it at boot and don't include in cluster.conf (any reasons to do so?)
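
To make option 3 concrete, here is a sketch of how I imagine it would look for the first node. It assumes the gfs2 init script mounts the GFS2 file systems listed in /etc/fstab (as it does on RHEL), so /mnt/gfs would need an fstab entry; I have not verified this on Ubuntu. The resource and service names are the ones from my configuration above.

```xml
<resources>
    <script file="/etc/init.d/clvm" name="clvmd"/>
    <!-- the init script replaces the clusterfs resource; it is assumed
         to mount and unmount the gfs2 entries found in /etc/fstab -->
    <script file="/etc/init.d/gfs2" name="gfs"/>
</resources>
<service name="shared-storage-inst1" autostart="0" domain="node1" exclusive="0" recovery="restart">
    <!-- nesting keeps the ordering: clvmd starts first and stops last -->
    <script ref="clvmd">
        <script ref="gfs"/>
    </script>
</service>
```

The second service (shared-storage-inst2 in the node2 domain) would mirror this one.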

This may be the sort of question that cannot be answered unambiguously, so it would also be of much value to me if you shared your experience or your thoughts on the issue. For example, what does /etc/cluster/cluster.conf look like when GFS is configured with Conga or ccs (they are not available to me, since for now I have to use Ubuntu for the cluster)?

Thank you very much!

© Server Fault or respective owner
