SCHA API for resource group failover / switchover history
- by krishna.k.murthy
The Oracle Solaris Cluster framework keeps an internal log of cluster events, including switchover and failover of resource groups. These logs can be useful to Oracle support engineers for diagnosing cluster behavior. However, till now, there was no external interface to access the event history. Oracle Solaris Cluster 4.2 provides a new API option for viewing the recent history of resource group switchovers in a program-parsable format.
Oracle Solaris Cluster 4.2 provides a new option tag argument RG_FAILOVER_LOG for the existing API command scha_cluster_get which can be used to list recent failover / switchover events for resource groups.
The command usage is as shown below:
# scha_cluster_get -O RG_FAILOVER_LOG number_of_days
number_of_days : the number of days to be considered for scanning the historical logs.
The command returns a list of events in the following format. Each field is separated by a semi-colon [;]:
resource_group_name;source_nodes;target_nodes;time_stamp
source_nodes: node_names from which resource group is failed over or was switched manually.
target_nodes: node_names to which the resource group failed over or was switched manually.
There is a corresponding enhancement in the C API function scha_cluster_get() which uses the SCHA_RG_FAILOVER_LOG query tag.
In the example below geo-infrastructure (failover resource group), geo-clusterstate (scalable resource group), oracle-rg (failover resource group), asm-dg-rg (scalable resource group) and asm-inst-rg (scalable resource group) are part of Geographic Edition setup.
# /usr/cluster/bin/scha_cluster_get -O RG_FAILOVER_LOG 3 geo-infrastructure;schost1c;;Mon Jul 21 15:51:51 2014 geo-clusterstate;schost2c,schost1c;schost2c;Mon Jul 21 15:52:26 2014 oracle-rg;schost1c;;Mon Jul 21 15:54:31 2014 asm-dg-rg;schost2c,schost1c;schost2c;Mon Jul 21 15:54:58 2014 asm-inst-rg;schost2c,schost1c;schost2c;Mon Jul 21 15:56:11 2014 oracle-rg;;schost2c;Mon Jul 21 15:58:51 2014 geo-infrastructure;;schost2c;Mon Jul 21 15:59:19 2014 geo-clusterstate;schost2c;schost2c,schost1c;Mon Jul 21 16:01:51 2014 asm-inst-rg;schost2c;schost2c,schost1c;Mon Jul 21 16:01:10 2014 asm-dg-rg;schost2c;schost2c,schost1c;Mon Jul 21 16:02:10 2014 oracle-rg;schost2c;;Tue Jul 22 16:58:02 2014 oracle-rg;;schost1c;Tue Jul 22 16:59:05 2014 oracle-rg;schost1c;schost1c;Tue Jul 22 17:05:33 2014
Note that in the output some of the entries might have an empty string in the source_nodes. Such entries correspond to events in which the resource group is switched online manually or during a cluster boot-up. Similarly, an empty destination_nodes list indicates an event in which the resource group went offline.
- Arpit Gupta, Harish Mallya