Search Results

Search found 1914 results on 77 pages for 'mongrel cluster'.

Page 27/77 | < Previous Page | 23 24 25 26 27 28 29 30 31 32 33 34  | Next Page >

  • My error with upgrading 4.0 to 4.2- What NOT to do...

    - by Steve Tunstall
    Last week, I was helping a client upgrade from the 2011.1.4.0 code to the newest 2011.1.4.2 code. We downloaded the 4.2 update from MOS, upload and unpacked it on both controllers, and upgraded one of the controllers in the cluster with no issues at all. As this was a brand-new system with no networking or pools made on it yet, there were not any resources to fail back and forth between the controllers. Each controller had it's own, private, management interface (igb0 and igb1) and that's it. So we took controller 1 as the passive controller and upgraded it first. The first controller came back up with no issues and was now on the 4.2 code. Great. We then did a takeover on controller 1, making it the active head (although there were no resources for it to take), and then proceeded to upgrade controller 2. Upon upgrading the second controller, we ran the health check with no issues. We then ran the update and it ran and rebooted normally. However, something strange then happened. It took longer than normal to come back up, and when it did, we got the "cluster controllers on different code" error message that one gets when the two controllers of a cluster are running different code. But we just upgraded the second controller to 4.2, so they should have been the same, right??? Going into the Maintenance-->System screen of controller 2, we saw something very strange. The "current version" was still on 4.0, and the 4.2 code was there but was in the "previous" state with the rollback icon, as if it was the OLDER code and not the newer code. I have never seen this happen before. I would have thought it was a bad 4.2 code file, but it worked just fine with controller 1, so I don't think that was it. Other than the fact the code did not update, there was nothing else going on with this system. It had no yellow lights, no errors in the Problems section, and no errors in any of the logs. It was just out of the box a few hours ago, and didn't even have a storage pool yet. So.... We deleted the 4.2 code, uploaded it from scratch, ran the health check, and ran the upgrade again. once again, it seemed to go great, rebooted, and came back up to the same issue, where it came to 4.0 instead of 4.2. See the picture below.... HERE IS WHERE I MADE A BIG MISTAKE.... I SHOULD have instantly called support and opened a Sev 2 ticket. They could have done a shared shell and gotten the correct Fishwork engineer to look at the files and the code and determine what file was messed up and fixed it. The system was up and working just fine, it was just on an older code version, not really a huge problem at all. Instead, I went ahead and clicked the "Rollback" icon, thinking that the system would rollback to the 4.2 code.   Ouch... What happened was that the system said, "Fine, I will delete the 4.0 code and boot to your 4.2 code"... Which was stupid on my part because something was wrong with the 4.2 code file here and the 4.0 was just fine.  So now the system could not boot at all, and the 4.0 code was completely missing from the system, and even a high-level Fishworks engineer could not help us. I had messed it up good. We could only get to the ILOM, and I had to re-image the system from scratch using a hard-to-get-and-use FishStick USB drive. These are tightly controlled and difficult to get, almost always handcuffed to an engineer who will drive out to re-image a system. This took another day of my client's time.  So.... If you see a "previous version" of your system code which is actually a version higher than the current version... DO NOT ROLL IT BACK.... It did not upgrade for a very good reason. In my case, after the system was re-imaged to a code level just 3 back, we once again tried the same 4.2 code update and it worked perfectly the first time and is now great and stable.  Lesson learned.  By the way, our buddy Ryan Matthews wanted to point out the best practice and supported way of performing an upgrade of an active/active ZFSSA, where both controllers are doing some of the work. These steps would not have helpped me for the above issue, but it's important to follow the correct proceedure when doing an upgrade. 1) Upload software to both controllers and wait for it to unpack 2) On controller "A" navigate to configuration/cluster and click "takeover" 3) Wait for controller "B" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 4) Wait for controller "B" to apply the update and finish rebooting 5) Login to controller "B", navigate to configuration/cluster and click "takeover" 6) Wait for controller "A" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 7) Wait for controller "A" to apply the update and finish rebooting 8) Login to controller "B", navigate to configuration/cluster and click "failback"

    Read the article

  • eSTEP TechCast - December 2012

    - by Cinzia Mascanzoni
    Join our next eSTEP TechCast will be on Thursday, 06. December 2012, 11:00 - 12:00 GMT (12:00 - 13:00 CET; 15:00 - 16:00 GST) Title: Innovations with Oracle Solaris Cluster 4 In this webcast we will focus at the integration of the cluster software with the IPS packaging system of Solaris 11, which makes installing and updating the software much easier and much more reliable, especially with virtualization technologies involved. Our webcast will also reflect new versions of Oracle Solaris Cluster if they will be announced in the meantime.Call Info: Call-in-toll-free number: 08006948154 (United Kingdom) Call-in-toll-free number: +44-2081181001 (United Kingdom) Show global numbers Conference Code: 803 594 3 Security Passcode: 9876 Webex Info (Oracle Web Conference) Meeting Number: 255 760 510 Meeting Password: tech2011 Playback / Recording / Archive: The webcasts will be recorded and will be available shortly after the event in the eSTEP portal under the Events tab, where you could find also material from already delivered eSTEP TechCasts. Use your email-adress and PIN: eSTEP_2011 to get access. 

    Read the article

  • 11gR2 ??????????

    - by Allen Gao
    ???????11gR2 GI????????????????????,?10g????,???????GI?????????????1.Ocssd.bin:????????10g??????????,???????(Node Monitoring)????(Group Management)?????????????“??????????”????????2.Cssdagent.bin/Cssdmonitor.bin:?2????11gR2??????????????ocssd.bin??????(Local HeartBeat),??????1??????????????????ocssd.bin???????????,????????ocssd.bin????????,??????,???????????10g??oclsomon/oclsvmon(?????????????)?oprocd????,????11gR2???????—rebootless restart,?????????11.2.0.2????????????,????????????(????????)??????ocssd.bin?????,??????????????,??????????GI stack?????,??GI stack??????????(short disk I/O timeout)??graceful shutdown,????????,??,????????????????????????11gR2 ??????????????1.Ocssd.log2.Cssdagent ? cssdmonitor logs<GI_home>/log/<node_name>/agent/ohasd/oracssdagent_root/oracssdagent_root.log<GI_home>/log/<node_name>/agent/ohasd/oracssdmonitor_root_root/oracssdmonitor_root.log3.Cluster alert log<GI_home>/log/<node_name>/alert<node_name>.log4.OS log5.OSW ?? CHM ????,??????????????????1.???????????????????????????????,??????10g???????????????????????????GI alert log ??,?????node2?2012-08-15 16:30:06.554 [cssd(11011) ]CRS-1612:Network communication with node node1 (1) missing for 50% of timeout interval.  Removal of this node from cluster in 14.510 seconds2012-08-15 16:30:13.586 [cssd(11011) ]CRS-1611:Network communication with node node1 (1) missing for 75% of timeout interval.  Removal of this node from cluster in 7.470 seconds2012-08-15 16:30:18.606 [cssd(11011) ]CRS-1610:Network communication with node node1 (1) missing for 90% of timeout interval.  Removal of this node from cluster in 2.450 seconds2012-08-15 16:30:21.073 [cssd(11011) ]CRS-1632:Node node1 is being removed from the cluster in cluster incarnation 2363798322012-08-15 16:30:21.086 [cssd(11011) ]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .?????????????node1?????????????????,???????, node2?? node1 ?????????node1 ???,???node1 ???????????????(????os log ??OSW ????),??node1 ???????node2??node1?????????,????node1??????????,???reconfiguration,????????????,????????????,?11.2.0.2??,??rebootless restart???,node eviction ????????GI stack??,????????????,???node2?node1?????????,node1?ocssd.bin??????(????ocssd.log??)??node1???????????????,??node1??????GI node eviction????2.???????????????,?????10g???????,???????????3.??ocssd.bin ????Cssdagent/Cssdmonitor.bin????????????,??????,????,????oracssdagent_root.log ?oracssdmonitor_root.log ????????2012-07-23 14:09:58.506: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 75% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 21091 milliseconds ago based on invariant clock 269251595; now polling at 100 ms……2012-07-23 14:10:02.704: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 90% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 25291 milliseconds ago based on invariant clock 269251595; now polling at 100 ms……???????????????timeout???28 ???(misscount – reboot time)?4.?????????????????? ??????????????????????,????ocssd.bin??????,?????????????,?????????????ocssd.bin??,????????os???????????OSW??,???? ??????? cpu ???Linux OSWbb v5.0 node1SNAP_INTERVAL 30CPU_COUNT 8OSWBB_ARCHIVE_DEST /osw/archiveprocs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa……zzz ***Mon Aug 30 17:55:21 CST 2012158  6 4200956 923940   7664 19088464    0    0  1296  3574 11153 231579  0 100  0  0  0zzz ***Mon Aug 30 17:55:53 CST 2012135  4 4200956 923760   7812 19089344    0    0     4    45  570 14563  0 100  0  0  0zzz ***Mon Aug 30 17:56:53 CST 2012126  2 4200956 923784   8396 19083620    0    0   196  1121  651 15941  2 98  0  0  0?????????????,????10g??????11gR2????????????????,??????,????????Note 1050693.1 : Troubleshooting 11.2 Clusterware Node Evictions (Reboots)

    Read the article

  • ?11gR2 RAC???ASM DISK Path????

    - by Liu Maclean(???)
    ????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are  allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs  CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH??????????

    Read the article

  • ?11gR2 RAC???ASM DISK Path????

    - by Liu Maclean(???)
    ????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are  allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs  CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH?????????, ???????????????,????!

    Read the article

  • Telling subversion client to ignore certificate errors

    - by Pekka
    I have set up a copy of Redmine through the Bitnami Redmine Stack and am having trouble accessing a remote SVN repository through https. The trouble seems to be related to the fact that I don't have a signed certificate, and the certificate provided doesn't match the host name (I am accessing the same server through a number of host names). I am new to Ruby, Mongrel, Rails and Redmine. Following the advice in this forum thread, I changed the path Redmine uses to invoke the svn client in \apps\redmine\lib\ redmine\scm\adapters\subversion_adapter.rb from SVN_BIN = "svn" to SVN_BIN = "svn --trust-server-cert --non-interactive --config-dir c:/user/temp" I was hoping that the --trust-server-cert option would fix the certificate problem. However, I am still getting the following error message in mongrel.log: svn: OPTIONS of 'https://server.xyz:8443/svn/reponame': Server certificate verification failed: certificate issued for a different hostname, issuer is not trusted (https://server.xyz:8443) Does anybody know what to do about this? Additional info: I re-started the mongrel service after each change I am sure the configuration change has taken effect because subversion has created a full configuration directory in c:\user\temp I can access the remote repository using command line svn no problem The remote repository runs on a Windows box with VisualSVN

    Read the article

  • Redmine subversion won't ignore certificate error even if told

    - by Pekka
    I have set up a copy of Redmine through the Bitnami Redmine Stack and am having trouble accessing a remote SVN repository through https. The trouble seems to be related to the fact that I don't have a signed certificate, and the certificate provided doesn't match the host name (I am accessing the same server through a number of host names). I am new to Ruby, Mongrel, Rails and Redmine. Following the advice in this forum thread, I changed the path Redmine uses to invoke the svn client in \apps\redmine\lib\ redmine\scm\adapters\subversion_adapter.rb from SVN_BIN = "svn" to SVN_BIN = "svn --trust-server-cert --non-interactive --config-dir c:/user/temp" I was hoping that the --trust-server-cert option would fix the certificate problem. However, I am still getting the following error message in mongrel.log: svn: OPTIONS of 'https://server.xyz:8443/svn/reponame': Server certificate verification failed: certificate issued for a different hostname, issuer is not trusted (https://server.xyz:8443) Does anybody know what to do about this? Additional info: I re-started the mongrel service after each change I am sure the configuration change has taken effect because subversion has created a full configuration directory in c:\user\temp I can access the remote repository using command line svn no problem The remote repository runs on a Windows box with VisualSVN

    Read the article

  • Essbase BSO Data Fragmentation

    - by Ann Donahue
    Essbase BSO Data Fragmentation Data fragmentation naturally occurs in Essbase Block Storage (BSO) databases where there are a lot of end user data updates, incremental data loads, many lock and send, and/or many calculations executed.  If an Essbase database starts to experience performance slow-downs, this is an indication that there may be too much fragmentation.  See Chapter 54 Improving Essbase Performance in the Essbase DBA Guide for more details on measuring and eliminating fragmentation: http://docs.oracle.com/cd/E17236_01/epm.1112/esb_dbag/daprcset.html Fragmentation is likely to occur in the following situations: Read/write databases that users are constantly updating data Databases that execute calculations around the clock Databases that frequently update and recalculate dense members Data loads that are poorly designed Databases that contain a significant number of Dynamic Calc and Store members Databases that use an isolation level of uncommitted access with commit block set to zero There are two types of data block fragmentation Free space tracking, which is measured using the Average Fragmentation Quotient statistic. Block order on disk, which is measured using the Average Cluster Ratio statistic. Average Fragmentation Quotient The Average Fragmentation Quotient ratio measures free space in a given database.  As you update and calculate data, empty spaces occur when a block can no longer fit in its original space and will either append at the end of the file or fit in another empty space that is large enough.  These empty spaces take up space in the .PAG files.  The higher the number the more empty spaces you have, therefore, the bigger the .PAG file and the longer it takes to traverse through the .PAG file to get to a particular record.  An Average Fragmentation Quotient value of 3.174765 means the database is 3% fragmented with free space. Average Cluster Ratio Average Cluster Ratio describes the order the blocks actually exist in the database. An Average Cluster Ratio number of 1 means all the blocks are ordered in the correct sequence in the order of the Outline.  As you load data and calculate data blocks, the sequence can start to be out of order.  This is because when you write to a block it may not be able to place back in the exact same spot in the database that it existed before.  The lower this number the more out of order it becomes and the more it affects performance.  An Average Cluster Ratio value of 1 means no fragmentation.  Any value lower than 1 i.e. 0.01032828 means the data blocks are getting further out of order from the outline order. Eliminating Data Block Fragmentation Both types of data block fragmentation can be removed by doing a dense restructure or export/clear/import of the data.  There are two types of dense restructure: 1. Implicit Restructures Implicit dense restructure happens when outline changes are done using EAS Outline Editor or Dimension Build. Essbase restructures create new .PAG files restructuring the data blocks in the .PAG files. When Essbase restructures the data blocks, it regenerates the index automatically so that index entries point to the new data blocks. Empty blocks are NOT removed with implicit restructures. 2. Explicit Restructures Explicit dense restructure happens when a manual initiation of the database restructure is executed. An explicit dense restructure is a full restructure which comprises of a dense restructure as outlined above plus the removal of empty blocks Empty Blocks vs. Fragmentation The existence of empty blocks is not considered fragmentation.  Empty blocks can be created through calc scripts or formulas.  An empty block will add to an existing database block count and will be included in the block counts of the database properties.  There are no statistics for empty blocks.  The only way to determine if empty blocks exist in an Essbase database is to record your current block count, export the entire database, clear the database then import the exported data.  If the block count decreased, the difference is the number of empty blocks that had existed in the database.

    Read the article

  • Redundant Interconnect with Highly Available IP (HAIP) ??

    - by JaneZhang(???)
      ?11.2.0.2??,Oracle ?????Grid Infrastructure(GI)????Redundant Interconnect with Highly Available IP(HAIP).  ?11.2.0.2??,???????????OS?????????,??HAIP??,?????????????????????  ???GI????,??????????????????,??:   ???,HAIP???????169.254.*.*,????????????HAIP ???1?,???4?(???????),???????????  ??:$ crsctl stat res -t -init NAME           TARGET  STATE        SERVER   STATE_DETAILS Cluster Resources--------------------------------------------------------------------------------ora.cluster_interconnect.haip       1        ONLINE  ONLINE       node2                       #ifconfig -aeth1      Link encap:Ethernet  HWaddr 00:14:22:BD:59:DE  <=====????          inet addr:192.168.10.2  Bcast:192.168.10.255  Mask:255.255.255.0          inet6 addr: fe80::214:22ff:febd:59de/64 Scope:Link          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1          RX packets:54297359 errors:0 dropped:0 overruns:0 frame:0          TX packets:58151488 errors:0 dropped:0 overruns:0 carrier:0          collisions:0 txqueuelen:1000           RX bytes:837602539 (798.8 MiB)  TX bytes:3809085161 (3.5 GiB)          Interrupt:169 eth1:1    Link encap:Ethernet  HWaddr 00:14:22:BD:59:DE  <=====????????          inet addr:169.254.185.195  Bcast:169.254.255.255  Mask:255.255.0.0          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1          Interrupt:169 ???????HAIP??cluster interconnect: Cluster communication is configured to use the following interface(s) for this instance  169.254.185.195cluster interconnect IPC version:Oracle UDP/IP (generic)IPC Vendor 1 proto 2ASM ?????? HAIP ??cluster interconnect: Cluster communication is configured to use the following interface(s) for this instance  169.254.185.195cluster interconnect IPC version:Oracle UDP/IP (generic)IPC Vendor 1 proto 2  Oracle????ASM??????HAIP??????????????????????,????????????????????,???????????,????HAIP????????????????,???????????  HAIP ????????????,?????????????????   ??HAIP?????,???My Oracle Support Note ??1210883.1.

    Read the article

  • Unexpected start of already-primary server processes when heartbeat on secondary is stopped.

    - by vorik
    Hi, I've got an active-passive Heartbeat cluster with Apache, MySQL, ActiveMQ and DRBD. Today, I wanted to perform hardware-maintenance on the secondary node (node04), so I stopped the heartbeat service before shutting it down. Then, the primary node (node03) received a shutdown notice from the secondary node (node04). This logging comes from the primary node: node03 heartbeat[4458]: 2010/03/08_08:52:56 info: Received shutdown notice from 'node04.companydomain.nl'. heartbeat[4458]: 2010/03/08_08:52:56 info: Resources being acquired from node04.companydomain.nl. harc[27522]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/status status heartbeat[27523]: 2010/03/08_08:52:56 info: Local Resource acquisition completed. mach_down[27567]: 2010/03/08_08:52:56 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[27567]: 2010/03/08_08:52:56 info: mach_down takeover complete for node node04.companydomain.nl. heartbeat[4458]: 2010/03/08_08:52:56 info: mach_down takeover complete. harc[27620]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp ip-request-resp[27620]: 2010/03/08_08:52:56 received ip-request-resp drbddisk OK yes ResourceManager[27645]: 2010/03/08_08:52:56 info: Acquiring resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:56 info: Running /etc/ha.d/resource.d/drbddisk start Filesystem[27700]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/mysql start mysql[27783]: 2010/03/08_08:52:57 Starting MySQL[ OK ] apache[27853]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/monitor start monitor[28160]: 2010/03/08_08:52:58 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq start activemq[28210]: 2010/03/08_08:52:58 Starting ActiveMQ Broker... ActiveMQ Broker is already running. ResourceManager[27645]: 2010/03/08_08:52:58 ERROR: Return code 1 from /etc/ha.d/resource.d/activemq ResourceManager[27645]: 2010/03/08_08:52:58 CRIT: Giving up resources due to failure of activemq ResourceManager[27645]: 2010/03/08_08:52:58 info: Releasing resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/IPaddr 1.2.3.212 stop IPaddr[28329]: 2010/03/08_08:52:58 INFO: ifconfig eth0:0 down IPaddr[28312]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28378]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28433]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/tivoli-cluster stop ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq stop activemq[28503]: 2010/03/08_08:53:01 Stopping ActiveMQ Broker... Stopped ActiveMQ Broker. ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/monitor stop monitor[28681]: 2010/03/08_08:53:01 ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master stop LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster down LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncbackup up LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster released ResourceManager[27645]: 2010/03/08_08:53:02 info: Running /etc/ha.d/resource.d/apache /etc/httpd/conf/httpd.conf stop apache[28782]: 2010/03/08_08:53:03 INFO: Killing apache PID 18390 apache[28782]: 2010/03/08_08:53:03 INFO: apache stopped. apache[28771]: 2010/03/08_08:53:03 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:03 info: Running /etc/ha.d/resource.d/mysql stop mysql[28851]: 2010/03/08_08:53:24 Shutting down MySQL.....................[ OK ] ResourceManager[27645]: 2010/03/08_08:53:24 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop Filesystem[29010]: 2010/03/08_08:53:25 INFO: Running stop for /dev/drbd0 on /data Filesystem[29010]: 2010/03/08_08:53:25 INFO: Trying to unmount /data Filesystem[29010]: 2010/03/08_08:53:25 ERROR: Couldn't unmount /data; trying cleanup with SIGTERM Filesystem[29010]: 2010/03/08_08:53:25 INFO: Some processes on /data were signalled Filesystem[29010]: 2010/03/08_08:53:27 INFO: unmounted /data successfully Filesystem[28999]: 2010/03/08_08:53:27 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:27 info: Running /etc/ha.d/resource.d/drbddisk stop heartbeat[4458]: 2010/03/08_08:53:29 WARN: node node04.companydomain.nl: is dead heartbeat[4458]: 2010/03/08_08:53:29 info: Dead node node04.companydomain.nl gave up resources. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth0 dead. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth1 dead. hb_standby[29193]: 2010/03/08_08:53:57 Going standby [foreign]. heartbeat[4458]: 2010/03/08_08:53:57 info: node03.companydomain.nl wants to go standby [foreign] Soo... What just happened here??? Heartbeat on node04 stopped and told node03, which was the active node at the time. Somehow, node03 decided to start the cluster processes that were already running. (For the processes that are not critical, I always return a 0 from the startupscript so it does not stops the entire cluster when a non-essential part fails.) When starting ActiveMQ, it returns status 1 because it is already running. This fails the node and shuts everything down. As heartbeat is not running on the secondary node, it cannot failover to there. When I tried to run ha_takeover to restart the resources, absolutely nothing happened. Only after I restarted heartbeat on the primary node the resources could be started (after a delay of 2 minutes). These are my questions: Why does heartbeat on the primary node try to start the cluster processes again? Why did ha_takeover not work? What can I do to prevent this from happening? Server configuration: DRBD: version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-20 09:14:48 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r---- ns:0 nr:6459432 dw:6459432 dr:0 al:0 bm:301 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 uname -a Linux node04 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux haresources node03.companydomain.nl \ drbddisk \ Filesystem::/dev/drbd0::/data::ext3 \ mysql \ apache::/etc/httpd/conf/httpd.conf \ LVSSyncDaemonSwap::master \ monitor \ activemq \ tivoli-cluster \ MailTo::[email protected]::DRBDFailureDrisAcc \ MailTo::[email protected]::DRBDFailureDrisAcc \ 1.2.3.212 ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log keepalive 500ms deadtime 30 warntime 10 initdead 120 udpport 694 mcast eth0 225.0.0.3 694 1 0 mcast eth1 225.0.0.4 694 1 0 auto_failback off node node03.companydomain.nl node node04.companydomain.nl respawn hacluster /usr/lib64/heartbeat/dopd apiauth dopd gid=haclient uid=hacluster Thank you very much in advance, Ger Apeldoorn

    Read the article

  • propertyregex removes return characters in multiline

    - by javydreamercsw
    I'm using ants propertyregex method to change a property and it works fine up to a point. I'm lossing return characters. Here's what I'm trying to change: cluster.path=\ ${nbplatform.active.dir}/harness:\ ${nbplatform.active.dir}/platform:\ ${nbplatform.active.dir}/nb This is in a .properties file. So I'm trying to change it like this: <propertyregex property="cluster.path" input="${cluster.path}" regexp="nbplatform.active.dir" replace="xplatform.base" global="true" override="true"/> The stuff is replaced but I get: cluster.path= ${xplatform.base}/harness\: ${xplatform.base}/platform\: ${xplatform.base}/nb This brakes logic down the line not controlled by me (Netbeans) that uses the ':' as delimiter. Any idea?

    Read the article

  • Tomcat session stickiness in session replication

    - by rabbit
    Hi, I am having a 2 instance load balanced and session replicated tomcat 6.0.20 cluster. Should sticky_session be set to true or false for in memory session replication. http://tomcat.apache.org/connectors-doc/reference/workers.html mentions : Set sticky_session to False when Tomcat is using a Session Manager which can persist session data across multiple instances of Tomcat. where as /tomcat-6.0-doc/cluster-howto.html (Cluster Basics) mentions : Make sure that your loadbalancer is configured for sticky session mode.

    Read the article

  • Problem drawing a polygon on data clusters in MATLAB

    - by Hossein
    Hi, I have some data points which I have devided into them into some clusters with some clustering algorithms as the picture below:(it might takes some time for the image to appear) Each color represents different cluster. I have to draw polygons around each cluster. I use convhull for this reason. But as you can see the polygon for the red cluster is very big and covers a lot of areas, which is not the one I am looking for. I need to draw lines(ploygons) exactly around my data sets. For example in the picture above I want a polygon that is drawn exactly the same(and around) as the red cluster with the 3 branches. In other words, in this case I need a polygon with 3 branches to cover my red clusters not that big polygon that covers the whole area. Can anyone help me with this? Please Note that the solution should be general, because the clusters will change in each run of the algorithm, so it needs to be in a way that is general.

    Read the article

  • matlab get color

    - by iteratorr
    I am using matlab for cluster visualization. I want to somehow get the color of my current cluster center fill in the plot and draw line of same color to cluster members. How can I get the color?

    Read the article

  • Can computer clusters be used for general everyday applications?

    - by Matt Pascoe
    Does anyone know how a computer cluster can be used for everyday applications, like for example video games? I would like to build a computer cluster that can run applications over the cluster that were not specifically designed for computer clusters and still see the performance increase. One use would be for video games, but I would also like to utilize the increased computing power for running a large network of virtualized machines.

    Read the article

  • What db fits me?

    - by afvasd
    Dear Everyone I am currently using mysql. I am finding that my schema is getting incredibly complicated. I seek to find a new db that will suit my needs: Let's assume I am building a news aggregrator (which collects news from multiple website). I then run algorithms to determine if two news from different sites are actually referring to the same topic. I run this algorithm to cluster news together. The relationship is depicted below: cluster \--news1 \--word1 \--word2 \--news2 \--word3 \--news3 \--word1 \--word3 And then I will apply some magic and determine the importance of each word. Summing all the importance of each word gives me the importance of a news article. Summing the importance of each news article gives me the importance of a cluster. Note that above cluster there are also subgroups( like split by region etc), and categories (like sports, etc) which I have to determine the importance of that in a particular day per se. I have used views in the past to do so, but I realized that views are very slow. So i will normally do an insert into an actual table and index them for better performance. As you can see this leads to multiple tables derived like (cluster, importance), (news, importance), (words, importance) etc which can get pretty messy. Also the "importance" metric will change. It has become increasingly difficult to alter tables, update data (which I am using TRUNCATE TABLE) and then inserting from null. I am currently looking into something schemaless like Mongodb. I do not need distributedness. I would very much want something that is reasonably fast (which can be indexed) and something that is a lot more flexible that traditional RDMBS. Also, I need something that has some kind of ORM because I personally like ORM a lot. I am currently using sqlalchemy Please help!

    Read the article

  • Looking for advice on Hyper-v storage replication

    - by Notre1
    I am designing a 2-host Hyper-V R2 cluster with 6-10 guests stored on a SMB iSCSI SAN device (probably Promise VessRAID). I will be getting at least two of the SAN devices and need to eliminate the storage a single point of failure. Ideally, that would involve real-time failover for the storage, like the Windows failover clustering does for the hosts. This design will be used at around six of our sites, and I would like to allow for us to eventually setup a cluster at colocation site and replicate each site's VMs there for DR. (Ideally a live multi-site cluster, but a manual import of the VMs would be fine for this sort of DR.) The tools that come with enterprise SANs, like EMC and NetApp, seem to be the most commonly used items for a Hyper-V cluster, but I can't afford their prices with my budget. Outside of them, the two tools that seem to be most common for Hyper-V storage replication are SteelEye (now SIOS) DataKeeper Cluster Edition and Double-Take Availability. Originally, I was planning on using Clustered Shared Volume(s) (CSV), but it seems like replication support for these is either not available or brand new in both these products. It looks like CSVs are supported in Double-Take 5.22, see this discussion, but I don't think I want to run something that new in production. Right now, it seems like the best option for me is not to implement CSVs, implement some sort of storage replication, and upgrade to CSVs at a later date once replicating them is more mature. I would love to have live migration, and CSVs are not required for live migration if you are using one LUN per VM, so I guess this is what I'll do. I would prefer to stick to the using the Microsoft Windows Server and Hyper-V tools and features as much as possible. From that standpoint, SteelEye looks more appealing than Double-Take because they make the DataKeeper volume(s) available to the Failover Clustering Manager and then failover clustering is all configured and managed through the native Microsoft tools. Double-Take says that "clustered Hyper-V hosts are not supported," and Double-Take Availability itself seems to be what is used for the actual clustering and failover. Does anyone know if any of these replication tools work with more than two hosts in the cluster? All the information I can find on the web only uses two hosts in their examples. Are there any better tools than SteelEye and Double-Take for doing what I am trying to do, which is eliminate the storage as as single point of failure? Neverfail, AppAssure, and DataCore all seem to offer similar functionality, but they don't seems to be as popular as SteelEye and Double-Take. I have seen a number of people suggest using Starwind iSCSI SAN software for the shared storage, which includes replication (and CSV replication at that). There are a couple of reasons I have not seriously considered this route: 1) The company I work for is exclusively a Dell shop and Dell does not have any servers with that I can pack with more than six 3.5" SATA drives. 2) In the future, it could be advantegous for us to not be locked into a particular brand or type of storage and third-party replication softwares all allow replication to heterogeneous storage devices. I am pretty new to iSCSI and clustering, so please let me know if it looks like I am planning something that goes against best practices or overlooking/missing something.

    Read the article

  • Looking for advice on Hyper-v storage replication

    - by Notre1
    I am designing a 2-host Hyper-V R2 cluster with 6-10 guests stored on a SMB iSCSI SAN device (probably Promise VessRAID). I will be getting at least two of the SAN devices and need to eliminate the storage a single point of failure. Ideally, that would involve real-time failover for the storage, like the Windows failover clustering does for the hosts. This design will be used at around six of our sites, and I would like to allow for us to eventually setup a cluster at colocation site and replicate each site's VMs there for DR. (Ideally a live multi-site cluster, but a manual import of the VMs would be fine for this sort of DR.) The tools that come with enterprise SANs, like EMC and NetApp, seem to be the most commonly used items for a Hyper-V cluster, but I can't afford their prices with my budget. Outside of them, the two tools that seem to be most common for Hyper-V storage replication are SteelEye (now SIOS) DataKeeper Cluster Edition and Double-Take Availability. Originally, I was planning on using Clustered Shared Volume(s) (CSV), but it seems like replication support for these is either not available or brand new in both these products. It looks like CSVs are supported in Double-Take 5.22, see this discussion, but I don't think I want to run something that new in production. Right now, it seems like the best option for me is not to implement CSVs, implement some sort of storage replication, and upgrade to CSVs at a later date once replicating them is more mature. I would love to have live migration, and CSVs are not required for live migration if you are using one LUN per VM, so I guess this is what I'll do. I would prefer to stick to the using the Microsoft Windows Server and Hyper-V tools and features as much as possible. From that standpoint, SteelEye looks more appealing than Double-Take because they make the DataKeeper volume(s) available to the Failover Clustering Manager and then failover clustering is all configured and managed through the native Microsoft tools. Double-Take says that "clustered Hyper-V hosts are not supported," and Double-Take Availability itself seems to be what is used for the actual clustering and failover. Does anyone know if any of these replication tools work with more than two hosts in the cluster? All the information I can find on the web only uses two hosts in their examples. Are there any better tools than SteelEye and Double-Take for doing what I am trying to do, which is eliminate the storage as as single point of failure? Neverfail, AppAssure, and DataCore all seem to offer similar functionality, but they don't seems to be as popular as SteelEye and Double-Take. I have seen a number of people suggest using Starwind iSCSI SAN software for the shared storage, which includes replication (and CSV replication at that). There are a couple of reasons I have not seriously considered this route: 1) The company I work for is exclusively a Dell shop and Dell does not have any servers with that I can pack with more than six 3.5" SATA drives. 2) In the future, it could be advantegous for us to not be locked into a particular brand or type of storage and third-party replication softwares all allow replication to heterogeneous storage devices. I am pretty new to iSCSI and clustering, so please let me know if it looks like I am planning something that goes against best practices or overlooking/missing something.

    Read the article

  • what is the difference between ltsp-server and ltsp-server-standalone packages?

    - by Dhani DDn
    what is the difference between ltsp-server and ltsp-server-standalone packages? and what packages i must use for setting up ltsp-cluster root server. according to this link https://help.ubuntu.com/community/UbuntuLTSP/LTSP-Cluster the root server use the ltsp-server and dhcp3-server packages .. but i think ltsp-server-standalone and isc-dhcp-server packages is the newer one .... is that okay if i use ltsp-server-standalone and isc-dhcp-server instead ? sorry newbie question

    Read the article

  • Clustering for Mere Mortals (Pt 3)

    - by Geoff N. Hiten
    The Controller Now we get to the meat of the matter.  You want a virtual cluster, the first thing you have to do is create your own portable domain.  Start with a plain vanilla install of Windows 2003 R2 Standard on a semi-default VM. (1 GB RAM, 2 cores, 2 NICs, 128GB dynamically expanding VHD file).  I chose this because it had the smallest disk and memory footprint of any current supported Microsoft Server product.  I created the VM with a single dynamically expanding VHD, one fixed 16 GB VHD, and two NICs.  One NIC is connected to the outside world and the other one is part of an internal-only network.  The first NIC is set up as a DHCP client.  We will get to the other one later. I actually tried this with Windows 2008 R2, but it failed miserably.  Not sure whether it was 2008 R2 or the fact I tried to use cloned VMs in the cluster.  Clustering is one place where NewSID would really come in handy.  Too bad Microsoft bought and buried it. Load and Patch the OS (hence the need for the outside connection).This is a good time to go get dinner.  Maybe a movie too.  There are close to a hundred patches that need to be downloaded and applied.  Avoiding that mess was why I put so much time into trying to get the 2008 R2 version working.  Maybe next time.  Don’t forget to add the extensions for VMLite (or whatever virtualization product you prefer). Set a fixed IP address on the internal-only NIC.  Do not give it a gateway.  Put the same IP address for the NIC and for the DNS Server.  This IP should be in a range that is never available on your public network.  You will need all the addresses in the range available.  See the previous post for the exact settings I used. I chose 10.97.230.1 as the server.  The rest of the 10.97.230 range is what I will use later.  For the curious, those numbers are based on elements of my home address.  Not truly random, but good enough for this project. Do not bridge the network connections.  I never allowed the cluster nodes direct access to any public network. Format the fixed VHD and leave it alone for now. Promote the VM to a Domain Controller.  If you have never done this, don’t worry.  The only meaningful decision is what to call the new domain.  I prefer a bogus name that does not correspond to a real Top-Level Domain (TLD).  .com, .biz., .net, .org  are all TLDs that we know and love.  I chose .test as the TLD since it is descriptive AND it does not exist in the real world.  The domain is called MicroAD.  This gives me MicroAD.Test as my domain. During the promotion process, you will be prompted to install DNS as part of the Domain creation process.  You want to accept this option.  The installer will automatically assign this DNS server as the authoritative owner of the MicroAD.test DNS domain (not to be confused with the MicroAD.test Active Directory domain.) For the rest of the DCPROMO process, just accept the defaults. Now let’s make our IP address management easy.  Add the DHCP Role to the server.  Add the server (10.97.230.1 in this case) as the default gateway to assign to DHCP clients.  Here is where you have to be VERY careful and bind it ONLY to the Internal NIC.  Trust me, your network admin will NOT like an extra DHCP server “helping” out on her network.  Go ahead and create a range of 10-20 IP Addresses in your scope.  You might find other uses for a pocket domain controller <cough> Mirroring </cough> than just for building a cluster.  And Clustering in SQL 2008 and Windows 2008 R2 fully supports DHCP addresses. Now we have three of the five key roles ready.  Two more to go. Next comes file sharing.  Since your cluster node VMs will not have access to any outside, you have to have some way to get files into these VMs.  I simply go to the root of C: and create a “Shared” folder.  I then share it out and grant full control to “Everyone” to both the share and to the underlying NTFS folder.   This will be immensely useful for Service Packs, demo databases, and any other software that isn’t packaged as an ISO that we can mount to the VM. Finally we need to create a block-level multi-connect storage device.  The kind folks at Starwinds Software (http://www.starwindsoftware.com/) graciously gave me a non-expiring demo license for expressly this purpose.  Their iSCSI SAN software lets you create an iSCSI target from nearly any storage medium.  Refreshingly, their product does exactly what they say it does.  Thanks. Remember that 16 GB VHD file?  That is where we are going to carve into our LUNs.  I created an iSCSI folder off the root, just so I can keep everything organized.  I then carved 5 ea. 2 GB iSCSI targets from that folder.  I chose a fixed VHD for performance.  I tried this earlier with a dynamically expanding VHD, but too many layers of abstraction and sparseness combined to make it unusable even for a demo.  Stick with a fixed VHD so there is a one-to-one mapping between abstract and physical storage.  If you read the previous post, you know what I named these iSCSI LUNs and why.  Yes, I do have some left over space.  Always leave yourself room for future growth or options. This gets us up to where we can actually build the nodes and install SQL.  As with most clusters, the real work happens long before the individual nodes get installed and configured.  At least it does if you want the cluster to be a true high-availability platform.

    Read the article

  • Clustering for Mere Mortals (Pt3)

    - by Geoff N. Hiten
    The Controller Now we get to the meat of the matter.  You want a virtual cluster, the first thing you have to do is create your own portable domain.  IStart with a plain vanilla install of Windows 2003 R2 Standard on a semi-default VM. (1 GB RAM, 2 cores, 2 NICs, 128GB dynamically expanding VHD file).  I chose this because it had the smallest disk and memory footprint of any current supported Microsoft Server product.  I created the VM with a single dynamically expanding VHD, one fixed 16 GB VHD, and two NICs.  One NIC is connected to the outside world and the other one is part of an internal-only network.  The first NIC is set up as a DHCP client.  We will get to the other one later. I actually tried this with Windows 2008 R2, but it failed miserably.  Not sure whether it was 2008 R2 or the fact I tried to use cloned VMs in the cluster.  Clustering is one place where NewSID would really come in handy.  Too bad Microsoft bought and buried it. Load and Patch the OS (hence the need for the outside connection).This is a good time to go get dinner.  Maybe a movie too.  There are close to a hundred patches that need to be downloaded and applied.  Avoiding that mess was why I put so much time into trying to get the 2008 R2 version working.  Maybe next time.  Don’t forget to add the extensions for VMLite (or whatever virtualization product you prefer). Set a fixed IP address on the internal-only NIC.  Do not give it a gateway.  Put the same IP address for the NIC and for the DNS Server.  This IP should be in a range that is never available on your public network.  You will need all the addresses in the range available.  See the previous post for the exact settings I used. I chose 10.97.230.1 as the server.  The rest of the 10.97.230 range is what I will use later.  For the curious, those numbers are based on elements of my home address.  Not truly random, but good enough for this project. Do not bridge the network connections.  I never allowed the cluster nodes direct access to any public network. Format the fixed VHD and leave it alone for now. Promote the VM to a Domain Controller.  If you have never done this, don’t worry.  The only meaningful decision is what to call the new domain.  I prefer a bogus name that does not correspond to a real Top-Level Domain (TLD).  .com, .biz., .net, .org  are all TLDs that we know and love.  I chose .test as the TLD since it is descriptive AND it does not exist in the real world.  The domain is called MicroAD.  This gives me MicroAD.Test as my domain. During the promotion process, you will be prompted to install DNS as part of the Domain creation process.  You want to accept this option.  The installer will automatically assign this DNS server as the authoritative owner of the MicroAD.test DNS domain (not to be confused with the MicroAD.test Active Directory domain.) For the rest of the DCPROMO process, just accept the defaults. Now let’s make our IP address management easy.  Add the DHCP Role to the server.  Add the server (10.97.230.1 in this case) as the default gateway to assign to DHCP clients.  Here is where you have to be VERY careful and bind it ONLY to the Internal NIC.  Trust me, your network admin will NOT like an extra DHCP server “helping” out on her network.  Go ahead and create a range of 10-20 IP Addresses in your scope.  You might find other uses for a pocket domain controller <cough> Mirroring </cough> than just for building a cluster.  And Clustering in SQL 2008 and Windows 2008 R2 fully supports DHCP addresses. Now we have three of the five key roles ready.  Two more to go. Next comes file sharing.  Since your cluster node VMs will not have access to any outside, you have to have some way to get files into these VMs.  I simply go to the root of C: and create a “Shared” folder.  I then share it out and grant full control to “Everyone” to both the share and to the underlying NTFS folder.   This will be immensely useful for Service Packs, demo databases, and any other software that isn’t packaged as an ISO that we can mount to the VM. Finally we need to create a block-level multi-connect storage device.  The kind folks at Starwinds Software (http://www.starwindsoftware.com/) graciously gave me a non-expiring demo license for expressly this purpose.  Their iSCSI SAN software lets you create an iSCSI target from nearly any storage medium.  Refreshingly, their product does exactly what they say it does.  Thanks. Remember that 16 GB VHD file?  That is where we are going to carve into our LUNs.  I created an iSCSI folder off the root, just so I can keep everything organized.  I then carved 5 ea. 2 GB iSCSI targets from that folder.  I chose a fixed VHD for performance.  I tried this earlier with a dynamically expanding VHD, but too many layers of abstraction and sparseness combined to make it unusable even for a demo.  Stick with a fixed VHD so there is a one-to-one mapping between abstract and physical storage.  If you read the previous post, you know what I named these iSCSI LUNs and why.  Yes, I do have some left over space.  Always leave yourself room for future growth or options. This gets us up to where we can actually build the nodes and install SQL.  As with most clusters, the real work happens long before the individual nodes get installed and configured.  At least it does if you want the cluster to be a true high-availability platform.

    Read the article

  • J2EE Applications, SPARC T4, Solaris Containers, and Resource Pools

    - by user12620111
    I've obtained a substantial performance improvement on a SPARC T4-2 Server running a J2EE Application Server Cluster by deploying the cluster members into Oracle Solaris Containers and binding those containers to cores of the SPARC T4 Processor. This is not a surprising result, in fact, it is consistent with other results that are available on the Internet. See the "references", below, for some examples. Nonetheless, here is a summary of my configuration and results. (1.0) Before deploying a J2EE Application Server Cluster into a virtualized environment, many decisions need to be made. I'm not claiming that all of the decisions that I have a made will work well for every environment. In fact, I'm not even claiming that all of the decisions are the best possible for my environment. I'm only claiming that of the small sample of configurations that I've tested, this is the one that is working best for me. Here are some of the decisions that needed to be made: (1.1) Which virtualization option? There are several virtualization options and isolation levels that are available. Options include: Hard partitions:  Dynamic Domains on Sun SPARC Enterprise M-Series Servers Hypervisor based virtualization such as Oracle VM Server for SPARC (LDOMs) on SPARC T-Series Servers OS Virtualization using Oracle Solaris Containers Resource management tools in the Oracle Solaris OS to control the amount of resources an application receives, such as CPU cycles, physical memory, and network bandwidth. Oracle Solaris Containers provide the right level of isolation and flexibility for my environment. To borrow some words from my friends in marketing, "The SPARC T4 processor leverages the unique, no-cost virtualization capabilities of Oracle Solaris Zones"  (1.2) How to associate Oracle Solaris Containers with resources? There are several options available to associate containers with resources, including (a) resource pool association (b) dedicated-cpu resources and (c) capped-cpu resources. I chose to create resource pools and associate them with the containers because I wanted explicit control over the cores and virtual processors.  (1.3) Cluster Topology? Is it best to deploy (a) multiple application servers on one node, (b) one application server on multiple nodes, or (c) multiple application servers on multiple nodes? After a few quick tests, it appears that one application server per Oracle Solaris Container is a good solution. (1.4) Number of cluster members to deploy? I chose to deploy four big 64-bit application servers. I would like go back a test many 32-bit application servers, but that is left for another day. (2.0) Configuration tested. (2.1) I was using a SPARC T4-2 Server which has 2 CPU and 128 virtual processors. To understand the physical layout of the hardware on Solaris 10, I used the OpenSolaris psrinfo perl script available at http://hub.opensolaris.org/bin/download/Community+Group+performance/files/psrinfo.pl: test# ./psrinfo.pl -pv The physical processor has 8 cores and 64 virtual processors (0-63) The core has 8 virtual processors (0-7)   The core has 8 virtual processors (8-15)   The core has 8 virtual processors (16-23)   The core has 8 virtual processors (24-31)   The core has 8 virtual processors (32-39)   The core has 8 virtual processors (40-47)   The core has 8 virtual processors (48-55)   The core has 8 virtual processors (56-63)     SPARC-T4 (chipid 0, clock 2848 MHz) The physical processor has 8 cores and 64 virtual processors (64-127)   The core has 8 virtual processors (64-71)   The core has 8 virtual processors (72-79)   The core has 8 virtual processors (80-87)   The core has 8 virtual processors (88-95)   The core has 8 virtual processors (96-103)   The core has 8 virtual processors (104-111)   The core has 8 virtual processors (112-119)   The core has 8 virtual processors (120-127)     SPARC-T4 (chipid 1, clock 2848 MHz) (2.2) The "before" test: without processor binding. I started with a 4-member cluster deployed into 4 Oracle Solaris Containers. Each container used a unique gigabit Ethernet port for HTTP traffic. The containers shared a 10 gigabit Ethernet port for JDBC traffic. (2.3) The "after" test: with processor binding. I ran one application server in the Global Zone and another application server in each of the three non-global zones (NGZ):  (3.0) Configuration steps. The following steps need to be repeated for all three Oracle Solaris Containers. (3.1) Stop AppServers from the BUI. (3.2) Stop the NGZ. test# ssh test-z2 init 5 (3.3) Enable resource pools: test# svcadm enable pools (3.4) Create the resource pool: test# poolcfg -dc 'create pool pool-test-z2' (3.5) Create the processor set: test# poolcfg -dc 'create pset pset-test-z2' (3.6) Specify the maximum number of CPU's that may be addd to the processor set: test# poolcfg -dc 'modify pset pset-test-z2 (uint pset.max=32)' (3.7) bash syntax to add Virtual CPUs to the processor set: test# (( i = 64 )); while (( i < 96 )); do poolcfg -dc "transfer to pset pset-test-z2 (cpu $i)"; (( i = i + 1 )) ; done (3.8) Associate the resource pool with the processor set: test# poolcfg -dc 'associate pool pool-test-z2 (pset pset-test-z2)' (3.9) Tell the zone to use the resource pool that has been created: test# zonecfg -z test-z1 set pool=pool-test-z2 (3.10) Boot the Oracle Solaris Container test# zoneadm -z test-z2 boot (3.11) Save the configuration to /etc/pooladm.conf test# pooladm -s (4.0) Results. Using the resource pools improves both throughput and response time: (5.0) References: System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones Capitalizing on large numbers of processors with WebSphere Portal on Solaris WebSphere Application Server and T5440 (Dileep Kumar's Weblog)  http://www.brendangregg.com/zones.html Reuters Market Data System, RMDS 6 Multiple Instances (Consolidated), Performance Test Results in Solaris, Containers/Zones Environment on Sun Blade X6270 by Amjad Khan, 2009.

    Read the article

  • New Whitepaper: Oracle WebLogic Clustering

    - by ACShorten
    A new whitepaper is available that outlines the concepts and steps on implementing web application server clustering using the Oracle Utilities Application Framework and Oracle WebLogic Server. The whitepaper include the following: A short discussion on the concepts of clustering How to setup a cluster using Oracle WebLogic's utilities How to configure the Oracle Utilities Application Framework to take advantage of clustering How to deploy the Oracle Utilities Application based products in a clustered environment Common cluster operations The whitepaper is available from My Oracle Support at Doc Id: 1334558.1.

    Read the article

< Previous Page | 23 24 25 26 27 28 29 30 31 32 33 34  | Next Page >