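This article introduces how to diagnose node reboot problems in an 11gR2 GI environment. As with 10g, we start with the GI processes that can cause a node to reboot.

1. Ocssd.bin: This process does essentially the same job as in 10g, mainly node monitoring (Node Monitoring) and group management (Group Management). For details, see the earlier article "How to Diagnose Node Reboot Problems".

2. Cssdagent.bin/Cssdmonitor.bin: These 2 processes are new in 11gR2. Their main job is to exchange a local heartbeat (Local HeartBeat) with ocssd.bin, by default once every 1 second. The local heartbeat is used to monitor whether the local ocssd.bin process is running normally; the two processes also monitor their own connection to ocssd.bin. We can therefore regard them as the replacement for the 10g processes oclsomon/oclsvmon (when third-party cluster management software is present) and oprocd.

Next, 11gR2 brings a new feature, rebootless restart, introduced in version 11.2.0.2. From this version on, when a node is evicted from the cluster (for example, after losing its network heartbeat) or the node's ocssd.bin runs into a problem, the cluster no longer reboots the node straight away. It first tries to resolve the problem by restarting the GI stack. Only if the GI stack cannot complete a graceful shutdown within the allowed time (short disk I/O timeout) is the node rebooted, after which the clusterware on that node is started again automatically.

The following information is usually collected when diagnosing an 11gR2 node reboot:

1. Ocssd.log
2. Cssdagent and cssdmonitor logs
   <GI_home>/log/<node_name>/agent/ohasd/oracssdagent_root/oracssdagent_root.log
   <GI_home>/log/<node_name>/agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
3. Cluster alert log
   <GI_home>/log/<node_name>/alert<node_name>.log
4. OS log
5. OSW or CHM reports

Finally, we go through the common causes of node reboots.

1. Node reboot caused by a lost network heartbeat
Diagnosis is basically the same as in 10g, but one common misunderstanding needs to be pointed out. The following GI alert log excerpt comes from node2.

2012-08-15 16:30:06.554 [cssd(11011) ]CRS-1612:Network communication with node node1 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.510 seconds
2012-08-15 16:30:13.586 [cssd(11011) ]CRS-1611:Network communication with node node1 (1) missing for 75% of timeout interval. Removal of this node from cluster in 7.470 seconds
2012-08-15 16:30:18.606 [cssd(11011) ]CRS-1610:Network communication with node node1 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.450 seconds
2012-08-15 16:30:21.073 [cssd(11011) ]CRS-1632:Node node1 is being removed from the cluster in cluster incarnation 236379832
2012-08-15 16:30:21.086 [cssd(11011) ]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .

These messages show clearly that node1 was removed from the cluster after node2 stopped receiving its network heartbeat. In many cases, however, node2 sees the network heartbeat from node1 go missing because node1 has already gone down, for example because of a problem at the operating system level on node1 (check the OS log or the OSW report to confirm). In that case node1 did not reboot because node2 found its network heartbeat missing. Rather, the problem on node1 caused the heartbeat to be lost, and the reconfiguration followed.

One more point is worth stressing. From 11.2.0.2 on, because of rebootless restart, a node eviction first leads to a restart of the GI stack rather than a direct node reboot. So if node2 really did evict node1 over a lost network heartbeat, the ocssd.bin on node1 should show it (this can be checked in ocssd.log). If node1 was rebooted directly instead, the reboot was most likely not caused by a GI node eviction.

2. Node reboot caused by a lost disk heartbeat
This case is basically the same as in 10g and is not covered again here.

3. Node reboot caused by ocssd.bin losing its local heartbeat with Cssdagent/Cssdmonitor.bin
In this case, messages like the following generally appear in oracssdagent_root.log or oracssdmonitor_root.log.

2012-07-23 14:09:58.506: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 75% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 21091 milliseconds ago based on invariant clock 269251595; now polling at 100 ms
...
2012-07-23 14:10:02.704: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 90% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 25291 milliseconds ago based on invariant clock 269251595; now polling at 100 ms
...

These messages show that the local heartbeat timeout is 28 seconds (misscount - reboot time), which matches the limit of 28030 ms reported in the log.

4. Node reboot caused by operating system resource problems
If operating system resources are exhausted or heavily contended, ocssd.bin may be unable to work normally, which in turn can lead to a node reboot. So when ocssd.bin appears to be the point of failure, also check the state of the OS at that time, for example through OSW reports. The OSW vmstat output below shows a node whose CPU was exhausted: the run queue (r) is far larger than the 8 CPUs and system CPU time (sy) is at or near 100%.

Linux OSWbb v5.0 node1
SNAP_INTERVAL 30
CPU_COUNT 8
OSWBB_ARCHIVE_DEST /osw/archive
procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
  r  b    swpd   free  buff    cache  si  so    bi    bo    in     cs us  sy id wa st
...
zzz ***Mon Aug 30 17:55:21 CST 2012
158  6 4200956 923940  7664 19088464   0   0  1296  3574 11153 231579  0 100  0  0  0
zzz ***Mon Aug 30 17:55:53 CST 2012
135  4 4200956 923760  7812 19089344   0   0     4    45   570  14563  0 100  0  0  0
zzz ***Mon Aug 30 17:56:53 CST 2012
126  2 4200956 923784  8396 19083620   0   0   196  1121   651  15941  2  98  0  0  0

This article, together with the earlier one on 10g, gives only a brief introduction to diagnosing node reboots in 11gR2. For more detail, see Note 1050693.1 : Troubleshooting 11.2 Clusterware Node Evictions (Reboots)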
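
The CRS-1612/1611/1610 warnings and the clsnproc_needreboot messages quoted above carry their timing information in a fixed shape, so a small script can pull the numbers out when an alert log or agent log is long. The following is a minimal Python sketch, not an Oracle tool: the function names are hypothetical, and the regular expressions are built only from the message lines quoted in this article, so other versions may need adjusted patterns.

```python
import re

# Assumption: patterns mirror only the log lines quoted in this article.
CSS_WARNING = re.compile(
    r"CRS-(?P<code>1612|1611|1610):Network communication with node "
    r"(?P<node>\S+) \(\d+\) missing for (?P<pct>\d+)% of timeout interval\.\s+"
    r"Removal of this node from cluster in (?P<left>[\d.]+) seconds"
)
NEED_REBOOT = re.compile(
    r"Impending reboot at (?P<pct>\d+)% of limit (?P<limit>\d+);.*?"
    r"(?P<ago>\d+) milliseconds ago"
)


def parse_css_warnings(text):
    """Return (node, percent_missed, seconds_left) per network heartbeat warning."""
    return [
        (m["node"], int(m["pct"]), float(m["left"]))
        for m in CSS_WARNING.finditer(text)
    ]


def parse_need_reboot(text):
    """Return (reported_percent, limit_ms, ms_since_heartbeat, ratio) per message."""
    results = []
    for m in NEED_REBOOT.finditer(text):
        limit_ms = int(m["limit"])
        ago_ms = int(m["ago"])
        # ratio is how far the missed local heartbeat is into the limit
        results.append((int(m["pct"]), limit_ms, ago_ms, ago_ms / limit_ms))
    return results
```

Applied to the two agent log lines quoted earlier, the ratios are 21091/28030 ≈ 0.752 and 25291/28030 ≈ 0.902, which agrees with the 75% and 90% thresholds printed in the messages.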