Search Results

Search found 1815 results on 73 pages for 'percona xtradb cluster'.

Page 25/73 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >

Amazon AWS EC2 + Puppet, get Puppet to know AWS instance tags

- by Piotr Jasiulewicz

I am having a problem with my AWS deployment, fairly new to AWS and Puppet. So coming to my question - can you distinguish puppet nodes with AWS machine tags or CNAME domains? A little background about the plan: have multiple clusters of machines, one php cluster, one legacy php cluster, one java cluster, one perl cluster control configuration with puppet - still pretty new to puppet but as a developer I like the idea of being able to version control configuration of servers have autoscaling enabled on those clusters - obviously the main benefit of the cloud that makes the much hight cost when it comes to any reasonable performance worth it (those amazon machines are slower than my phone...) deployment controlled by Capistrano, this makes things a lot easier So in AWS you get those super nasty public/private machine dns's... no way you can identify machines on those. In order to easer the problem, seams like AWS want's you to tag everything - so I did. Found a script that makes a CNAME record for each machine with the tag "ShortName" thanks to the Route53 API. Every machine has a ShortName tag that becomes its CNAME, unfortunately puppet still resolves the private dns name. I'd like to have node 'perl-cluster'{} in puppet, anyone any clue ho to achieve this? Thanks

Read the article
Reset Network Load Balancer Connection Pool

- by bill_the_loser

I am currently working on load testing a web application on a virtual machine cluster. I am looking for a way to flush the connection pool / NLB cache so that it is like each machine connecting to the NLB is connecting for the first time and doesn't get directed back to the node that it was on last time. This a Windows 2003 Server cluster, behind a Microsoft software based Network Load Balancer. Additional Information: To do the load testing I'm using virtual machines, one for each node on the cluster. Somehow I got two virtual machines connecting to the same node and I'm looking for an easier way to reset those connections without going in to the NLB Manager and stopping and starting each node on the NLB. Update: We went ahead and changed the affinity on all of the nodes of the cluster to none. Now it's a non-issue.

Read the article
OpenLDAP replication fails, "syncrepl_entry: rid=666 be_modify failed (20)"

- by Pavel

I've configured a second host to replicate the main LDAP server via syncrepl in the slapd.conf: syncrepl rid=666 provider=ldaps://my-main-server.com type=refreshAndPersist searchBase="dc=Staff,dc=my-main-server,dc=com" filter="(objectClass=*)" scope=sub schemachecking=off bindmethod=simple binddn="cn=repadmin,dc=my-main-server,dc=com" credentials=mypassword When I restart slapd, it writes to /var/log/debug Jun 11 15:48:33 cluster-mn-04 slapd[29441]: @(#) $OpenLDAP: slapd 2.4.9 (Mar 31 2009 07:18:37) $ ^Ibuildd@yellow:/build/buildd/openldap2.3-2.4.9/debian/build/servers/slapd Jun 11 15:48:34 cluster-mn-04 slapd[29442]: slapd starting Jun 11 15:48:34 cluster-mn-04 slapd[29442]: null_callback : error code 0x14 Jun 11 15:48:34 cluster-mn-04 slapd[29442]: syncrepl_entry: rid=666 be_modify failed (20) Jun 11 15:48:34 cluster-mn-04 slapd[29442]: do_syncrepl: rid=666 quitting I've looked into the sources for the return code and found only #define LDAP_TYPE_OR_VALUE_EXISTS 0x14 in include/ldap.h. Anyway, I don't quite get what the error message means. Can you help me debugging this problem and figure out why the LDAP replication doesn't work? I've managed to put a "manual" copy via slapcat and slapadd into the database, but I'd like to sync automatically. UPDATE: "Solved" by removing /var/lib/ldap/* and re-importing the database with slapadd.

Read the article
issue in installing postgresql 9.3.4 on Windows server 2003 x64

- by randydom

Hello i really did all what i know to install the PostgreSQL 9.3.4 on my windows 2003 server x64, but i'm always stopped with this error : please see the error : http://oi57.tinypic.com/s4tb8i.jpg I really don't know what to do , if i click OK then when i go to the windows services list i don't find the PostgreSQL service so i can't Start the service . can any one please help me to install it correctly . PS: i've followed all steps in the : wiki.postgresql.org/wiki/Troubleshooting_Installation many thanks . here's the installer log * where i get " Failed to initialise the database cluster with initdb " : Called IsVistaOrNewer()... 'winmgmts' object initialized... Version:5.2 MajorVersion:5 Ensuring we can write to the data directory (using cacls): Executing batch file 'rad22ADE.bat'... processed dir: C:\Program Files\PostgreSQL\9.2\data Executing batch file 'rad22ADE.bat'... The files belonging to this database system will be owned by user "Administrator". This user must also own the server process. The database cluster will be initialized with locale "English_United States.1252". The default text search configuration will be set to "english". fixing permissions on existing directory C:/Program Files/PostgreSQL/9.2/data ... initdb: could not change permissions of directory "C:/Program Files/PostgreSQL/9.2/data": Permission denied Called Die(Failed to initialise the database cluster with initdb)... Failed to initialise the database cluster with initdb Script stderr: Program ended with an error exit code Error running cscript //NoLogo "C:\Program Files\PostgreSQL\9.2/installer/server/initcluster.vbs" "NT AUTHORITY\NetworkService" "postgres" "****" "C:\Program Files\PostgreSQL\9.2" "C:\Program Files\PostgreSQL\9.2\data" 5432 "DEFAULT" 0 : Program ended with an error exit code Problem running post-install step. Installation may not complete correctly The database cluster initialisation failed. Creating Uninstaller Creating uninstaller 25% Creating uninstaller 50% Creating uninstaller 75% Creating uninstaller 100% Installation completed Log finished 05/02/2014 at 04:04:04

Read the article
What's a good(affordable) business router that can limit traffic to certain machines(ips/ports)?

- by Ryan Detzel

We are using the basic Verizon router but it sucks so we're looking for a new one that allows us to limit users and our hadoop cluster to certain limits. Our problem is one person can start downloading something and kill the network and every hour we download logs into our cluster but it floods the network unless we rate limit it. Ideally we want to be able to say: total: 35 mbps Hadoop Cluster (15 mbps) Phones (1 mbps) Office(25 people) (19mbps but no one machine can have more than 5mbps)

Read the article
Unable to mount cifs in redhat 6

- by user3734522

I am relatively new to Linux, and I am trying to mount a CIFS filesystem from an openfiler instance I have on my network in Red Hat. The openfiler instance is authenticating using AD. I am able to connect using samba: smbclient '\\10.25.214.26\cluster_storage.cluster.Cluster' -U [DOMAIN]+[USERNAME] Enter DOMAIN+USERNAME's password: Domain=[DOMAIN] OS=[Unix] Server=[Samba 3.5.6] smb: \> When I attempt to mount on boot via fstab, I am told that the line is bad during startup. mount -t cifs -o username=[DOMAIN]+[USERNAME], password=[my password], domain=[domain.edu] '\\10.25.214.26\cluster_storage.cluster.Cluster' /mnt/scratch Any help would be greatly appreciated.

Read the article
Squeezing hardware

- by [email protected]

It's very common that high availability means duplicate hardware so costs grows up.Nowadays, CIOs and DBAs has the main challenge of reduce the money spent increasing the performance and the availability. Since Grid Infrastructure 11gR2, there is a new feature that helps them to afford this challenge: Server PoolsNow, in Grid Infrastructure 11gR2, you can define server pools across the cluster setting up the minimum number of servers, the maximum and how important is the pool.For example:Consider that "Velasco, Boixeda & co" has 3 apps in a 6 servers cluster.First One is the main core business appSecond one is Mid RangeAnd third it's a database not very important.We Define the following resource requirements for expected workload:1- Main App 2 servers required2- Mid Range App requires 1 server3- Is not a required app in case of disasterThe we define 3 server pools across the cluster:1- Main pool min two servers, max three servers, importance four2- Mid pool, min one server max two servers, importance two3- test pool,min zero servers, max one server, importance oneSo the initial configuration is:-Main pool has three servers-Mid pool has two servers-Test pool has one serverLogically, we can see the cluster like this:If any server fails, the following algorithm will be applied:1.-The server pool of least importance2.-IF server pools are of the same importance, THEN then the Server Pool that has more than its defined minimum servers Is chosenHope it helps

Read the article
Clustering and custom applications

- by Ahmed ilyas

I was not entirely sure what tags to put but hope this is ok. This is just a general question in regards to clustering and applications: so lets say we have a clustered environment setup. We cluster SQL Server (I dont know exactly how its done but lets just say its been done for the sake of argument). Now if a website or application is trying to access that database for read/write (say an ASP.NET app or a C# Winforms app) and during that time SQL goes down - it takes a couple of minutes for the clustering failover to take affect to switch to another node. What happens during this time? I think it will time out/unable to connect. BUT is there a way for it to place the request in some pipeline so when the cluster node is back up/switched over it will continue as normal? as you can see, I know nothing much about clustering! what about your own custom .NET apps? Would there be a special way to develop them? I know that you can say create a simple Hello world app, and cluster that but they wouldnt be something you could see interms of the UI or anything, so they would effectively need to be developed as a Windows Service perhaps or even as a standard Console app which runs and not wait for user input but you wouldnt see any output from it (unless you redirect output to somewhere else) What im getting at here is... for those who have experience or developed a cluster application in .NET, how did you do it and what are the things to be aware of? For example we have the cloud service - fundamentally its built on clustering - if there is an outage, another node takes place and service is resumed as normal but we dont really see much of that downtime.

Read the article
?RAC????????????

- by Allen Gao

Normal 0 7.8 ? 0 2 false false false EN-US ZH-CN X-NONE DefSemiHidden="true" DefQFormat="false" DefPriority="99" LatentStyleCount="267" UnhideWhenUsed="false" QFormat="true" Name="Normal"/ UnhideWhenUsed="false" QFormat="true" Name="heading 1"/ UnhideWhenUsed="false" QFormat="true" Name="Title"/ UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/ UnhideWhenUsed="false" QFormat="true" Name="Strong"/ UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/ UnhideWhenUsed="false" Name="Table Grid"/ UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/ UnhideWhenUsed="false" Name="Light Shading"/ UnhideWhenUsed="false" Name="Light List"/ UnhideWhenUsed="false" Name="Light Grid"/ UnhideWhenUsed="false" Name="Medium Shading 1"/ UnhideWhenUsed="false" Name="Medium Shading 2"/ UnhideWhenUsed="false" Name="Medium List 1"/ UnhideWhenUsed="false" Name="Medium List 2"/ UnhideWhenUsed="false" Name="Medium Grid 1"/ UnhideWhenUsed="false" Name="Medium Grid 2"/ UnhideWhenUsed="false" Name="Medium Grid 3"/ UnhideWhenUsed="false" Name="Dark List"/ UnhideWhenUsed="false" Name="Colorful Shading"/ UnhideWhenUsed="false" Name="Colorful List"/ UnhideWhenUsed="false" Name="Colorful Grid"/ UnhideWhenUsed="false" Name="Light Shading Accent 1"/ UnhideWhenUsed="false" Name="Light List Accent 1"/ UnhideWhenUsed="false" Name="Light Grid Accent 1"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/ UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/ UnhideWhenUsed="false" QFormat="true" Name="Quote"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/ UnhideWhenUsed="false" Name="Dark List Accent 1"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/ UnhideWhenUsed="false" Name="Colorful List Accent 1"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/ UnhideWhenUsed="false" Name="Light Shading Accent 2"/ UnhideWhenUsed="false" Name="Light List Accent 2"/ UnhideWhenUsed="false" Name="Light Grid Accent 2"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/ UnhideWhenUsed="false" Name="Dark List Accent 2"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/ UnhideWhenUsed="false" Name="Colorful List Accent 2"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/ UnhideWhenUsed="false" Name="Light Shading Accent 3"/ UnhideWhenUsed="false" Name="Light List Accent 3"/ UnhideWhenUsed="false" Name="Light Grid Accent 3"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/ UnhideWhenUsed="false" Name="Dark List Accent 3"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/ UnhideWhenUsed="false" Name="Colorful List Accent 3"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/ UnhideWhenUsed="false" Name="Light Shading Accent 4"/ UnhideWhenUsed="false" Name="Light List Accent 4"/ UnhideWhenUsed="false" Name="Light Grid Accent 4"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/ UnhideWhenUsed="false" Name="Dark List Accent 4"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/ UnhideWhenUsed="false" Name="Colorful List Accent 4"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/ UnhideWhenUsed="false" Name="Light Shading Accent 5"/ UnhideWhenUsed="false" Name="Light List Accent 5"/ UnhideWhenUsed="false" Name="Light Grid Accent 5"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/ UnhideWhenUsed="false" Name="Dark List Accent 5"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/ UnhideWhenUsed="false" Name="Colorful List Accent 5"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/ UnhideWhenUsed="false" Name="Light Shading Accent 6"/ UnhideWhenUsed="false" Name="Light List Accent 6"/ UnhideWhenUsed="false" Name="Light Grid Accent 6"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/ UnhideWhenUsed="false" Name="Dark List Accent 6"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/ UnhideWhenUsed="false" Name="Colorful List Accent 6"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/ UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/ UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/ UnhideWhenUsed="false" QFormat="true" Name="Book Title"/ /* Style Definitions */ table.MsoNormalTable {mso-style-name:????; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.5pt; mso-bidi-font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi; mso-font-kerning:1.0pt;} ????????????????????????????????????????,??????????????Oracle RAC?????????????????????????????,???????????????????,??????RAC???????????,????????????????????????????????????,????3???RAC????????????? ????????MOS ??"Top 11 Things to do NOW to Stabilize your RAC Cluster Environment”(DOC ID 1344678.1)???,???,??????3???????????,?????????????????????????????,???,?????????????????,??,??????????????,??????????????????????,?????????????,??????????????????????,???????RAC DBA???? ??????? (PSU)??,?????????PSU? ???????????,???????Oracle???????????(PSU)???PSU?????????????????,??PSU???????????????????PSU????????,???????????????PSU,????????6????????????????????BUG????,??????????,?????????????????????,???9???,???RAC???(Cluster)BUG,??7%??BUG??????,??????????BUG??????????????????????PSU??????RAC???,PSU????????: PSU?????Grid Infrastructure(GI)home,???????????RDBMS home???????,??GI home????PSU,?????home?????,??????????GI????????????,??????,??RDBMS PSU,GI PSU??????????GI home??????PSU,???????RDBMS??PSU? RAC????PSU????rolling????? –?????????GI? RDBMS?????????????????,??PSU???????,???????????????? ???????????PSU,????????????,?????????PSU????,???RAC?????????????PSU???,???????????????????? ??PSU?????, ??????MOS??: NOTE 854428.1 Intro to Patch Set Updates (PSU) NOTE 1082394.1 11.2.0.X Grid Infrastructure PSU Known Issues NOTE 756671.1 Oracle Recommended Patches -- Oracle Database NOTE 161549.1 Oracle Database, Networking and Grid Agent Patches for Microsoft Platforms NOTE 810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit 11gR2???????,?Diagwait???13? ?2012?,??45%????????11gR2???????,????diagwait?13????RAC???????????,????diagwait??????????????,????????????????, diagwait??RAC?????????????: ?????,??????OPROCD?????1??0.5?????,????,??OPROCD??? 1.5????,?????????diagwait????13??OPROCD??????????10?( diagwait - CSS????[???3?]),????????OPROCD???????????????'?'?????????????,1.5??????????????????OPROCD?????????????11?(1?????+10????)? ?????/???????,??diagwait,??????????????????????,??,????????????? ?11g?2?(11.2.0.1?????)??,?????????????,???????,??????????????????,????????????????,?????????????????????diagwait????????,????????????????????,????????Oracle?????(OCR),?????????OCR???????????,?????????diagwai?????????????????: # $CLUSTERWARE_HOME\bin\crsctl get css diagwait ????DIAGWAIT???,??????MOS??: NOTE 567730.1 Changes in Oracle Clusterware on Linux with the 10.2.0.4 Patchset NOTE 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions NOTE 810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit ??OS Watcher Black Box(OSWbb) ? Cluster Health Monitor(CHM) ????????OS??????????????,??,??????OS Watcher Black Box(OSWbb)(??OS Watcher)?Cluster Health Monitor(CHM)????????OS???,??DBA????????????????????????????,?????????????,??????????,?????????????????????????OS????????,????????????,???????????????????? OSWbb?????????,??????,????OS??????????????,????OS??????OSWbb???????: ?????,??30??????????OS?????????????(??5??)????????????????????,?1???5????????????????????????30????????,Oracle???????????????OS?????????????,Oracle??????OSWbb?20???????? OSWbb?????????????????Oracle???????????????????OS????,??,?????????????????????????Oracle???????,?????????????,????????????????? ???11.2.0.3??,??????(HP-UX??)?,Oracle GI?????????,Cluster Health Monitor (CHM)?CHM??????,?????OSW????,??,???????OSW????,?????????? Oracle??????????????????OSWbb?/?CHM,?????????,????????????????????,??????????OSWbb,???????????RAC??,??????????????????(???NOTE 580513.1“How To Start OS Watcher Black Box Every System Boot”??????)? ??OSWbb?CHM?????, ??????MOS??: NOTE 301137.1 OS Watcher Black Box User Guide NOTE 1328466.1 Cluster Health Monitor (CHM) FAQ NOTE 810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit ?? ?????????RAC/ Oracle?????????????3???????????3?,?????RAC??????,?????????????????,?????MOS??: NOTE 1344678.1 Top 11 Things to do NOW to Stabilize your RAC Cluster Environment ????,???MOS-RAC/Scalability community??,?Oracle???????????,????RAC/ Oracle?????

Read the article
links for 2010-04-07

- by Bob Rhubart

James McGovern: Enterprise Architecture and Social CRM "With a few exceptions, the vast majority of enterprise architects I know spend an awful lot of time focused on internal issues whether it is rationalization, the cloud, storage governance, data center consolidation, creation of reference architectures, portfolio management and other considerations that aren’t even visible to customers. One should ask whether IT can be truly successful if we are busy listening to the business but otherwise are blissfully ignorant towards the customers they serve." -- James McGovern (tags: enterprisearchitecture crm socialcomputing) WRF Benchmark: X6275 Beats Power6 - BestPerf "Oracle's Sun Blade X6275 cluster is 28% faster than the IBM POWER6 cluster on Weather Research and Forecasting (WRF) continental United Status (CONUS) benchmark datasets. The Sun Blade X6275 cluster used a Quad Data Rate (QDR) InfiniBand connection along with Intel compilers and MPI." (tags: oracle sun x6275 benchmarks)

Read the article
My error with upgrading 4.0 to 4.2- What NOT to do...

- by Steve Tunstall

Last week, I was helping a client upgrade from the 2011.1.4.0 code to the newest 2011.1.4.2 code. We downloaded the 4.2 update from MOS, upload and unpacked it on both controllers, and upgraded one of the controllers in the cluster with no issues at all. As this was a brand-new system with no networking or pools made on it yet, there were not any resources to fail back and forth between the controllers. Each controller had it's own, private, management interface (igb0 and igb1) and that's it. So we took controller 1 as the passive controller and upgraded it first. The first controller came back up with no issues and was now on the 4.2 code. Great. We then did a takeover on controller 1, making it the active head (although there were no resources for it to take), and then proceeded to upgrade controller 2. Upon upgrading the second controller, we ran the health check with no issues. We then ran the update and it ran and rebooted normally. However, something strange then happened. It took longer than normal to come back up, and when it did, we got the "cluster controllers on different code" error message that one gets when the two controllers of a cluster are running different code. But we just upgraded the second controller to 4.2, so they should have been the same, right??? Going into the Maintenance-->System screen of controller 2, we saw something very strange. The "current version" was still on 4.0, and the 4.2 code was there but was in the "previous" state with the rollback icon, as if it was the OLDER code and not the newer code. I have never seen this happen before. I would have thought it was a bad 4.2 code file, but it worked just fine with controller 1, so I don't think that was it. Other than the fact the code did not update, there was nothing else going on with this system. It had no yellow lights, no errors in the Problems section, and no errors in any of the logs. It was just out of the box a few hours ago, and didn't even have a storage pool yet. So.... We deleted the 4.2 code, uploaded it from scratch, ran the health check, and ran the upgrade again. once again, it seemed to go great, rebooted, and came back up to the same issue, where it came to 4.0 instead of 4.2. See the picture below.... HERE IS WHERE I MADE A BIG MISTAKE.... I SHOULD have instantly called support and opened a Sev 2 ticket. They could have done a shared shell and gotten the correct Fishwork engineer to look at the files and the code and determine what file was messed up and fixed it. The system was up and working just fine, it was just on an older code version, not really a huge problem at all. Instead, I went ahead and clicked the "Rollback" icon, thinking that the system would rollback to the 4.2 code. Ouch... What happened was that the system said, "Fine, I will delete the 4.0 code and boot to your 4.2 code"... Which was stupid on my part because something was wrong with the 4.2 code file here and the 4.0 was just fine. So now the system could not boot at all, and the 4.0 code was completely missing from the system, and even a high-level Fishworks engineer could not help us. I had messed it up good. We could only get to the ILOM, and I had to re-image the system from scratch using a hard-to-get-and-use FishStick USB drive. These are tightly controlled and difficult to get, almost always handcuffed to an engineer who will drive out to re-image a system. This took another day of my client's time. So.... If you see a "previous version" of your system code which is actually a version higher than the current version... DO NOT ROLL IT BACK.... It did not upgrade for a very good reason. In my case, after the system was re-imaged to a code level just 3 back, we once again tried the same 4.2 code update and it worked perfectly the first time and is now great and stable. Lesson learned. By the way, our buddy Ryan Matthews wanted to point out the best practice and supported way of performing an upgrade of an active/active ZFSSA, where both controllers are doing some of the work. These steps would not have helpped me for the above issue, but it's important to follow the correct proceedure when doing an upgrade. 1) Upload software to both controllers and wait for it to unpack 2) On controller "A" navigate to configuration/cluster and click "takeover" 3) Wait for controller "B" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 4) Wait for controller "B" to apply the update and finish rebooting 5) Login to controller "B", navigate to configuration/cluster and click "takeover" 6) Wait for controller "A" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 7) Wait for controller "A" to apply the update and finish rebooting 8) Login to controller "B", navigate to configuration/cluster and click "failback"

Read the article
eSTEP TechCast - December 2012

- by Cinzia Mascanzoni

Join our next eSTEP TechCast will be on Thursday, 06. December 2012, 11:00 - 12:00 GMT (12:00 - 13:00 CET; 15:00 - 16:00 GST) Title: Innovations with Oracle Solaris Cluster 4 In this webcast we will focus at the integration of the cluster software with the IPS packaging system of Solaris 11, which makes installing and updating the software much easier and much more reliable, especially with virtualization technologies involved. Our webcast will also reflect new versions of Oracle Solaris Cluster if they will be announced in the meantime.Call Info: Call-in-toll-free number: 08006948154 (United Kingdom) Call-in-toll-free number: +44-2081181001 (United Kingdom) Show global numbers Conference Code: 803 594 3 Security Passcode: 9876 Webex Info (Oracle Web Conference) Meeting Number: 255 760 510 Meeting Password: tech2011 Playback / Recording / Archive: The webcasts will be recorded and will be available shortly after the event in the eSTEP portal under the Events tab, where you could find also material from already delivered eSTEP TechCasts. Use your email-adress and PIN: eSTEP_2011 to get access.

Read the article
11gR2 ??????????

- by Allen Gao

???????11gR2 GI????????????????????,?10g????,???????GI?????????????1.Ocssd.bin:????????10g??????????,???????(Node Monitoring)????(Group Management)?????????????“??????????”????????2.Cssdagent.bin/Cssdmonitor.bin:?2????11gR2??????????????ocssd.bin??????(Local HeartBeat),??????1??????????????????ocssd.bin???????????,????????ocssd.bin????????,??????,???????????10g??oclsomon/oclsvmon(?????????????)?oprocd????,????11gR2???????—rebootless restart,?????????11.2.0.2????????????,????????????(????????)??????ocssd.bin?????,??????????????,??????????GI stack?????,??GI stack??????????(short disk I/O timeout)??graceful shutdown,????????,??,????????????????????????11gR2 ??????????????1.Ocssd.log2.Cssdagent ? cssdmonitor logs<GI_home>/log/<node_name>/agent/ohasd/oracssdagent_root/oracssdagent_root.log<GI_home>/log/<node_name>/agent/ohasd/oracssdmonitor_root_root/oracssdmonitor_root.log3.Cluster alert log<GI_home>/log/<node_name>/alert<node_name>.log4.OS log5.OSW ?? CHM ????,??????????????????1.???????????????????????????????,??????10g???????????????????????????GI alert log ??,?????node2?2012-08-15 16:30:06.554 [cssd(11011) ]CRS-1612:Network communication with node node1 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.510 seconds2012-08-15 16:30:13.586 [cssd(11011) ]CRS-1611:Network communication with node node1 (1) missing for 75% of timeout interval. Removal of this node from cluster in 7.470 seconds2012-08-15 16:30:18.606 [cssd(11011) ]CRS-1610:Network communication with node node1 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.450 seconds2012-08-15 16:30:21.073 [cssd(11011) ]CRS-1632:Node node1 is being removed from the cluster in cluster incarnation 2363798322012-08-15 16:30:21.086 [cssd(11011) ]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .?????????????node1?????????????????,???????, node2?? node1 ?????????node1 ???,???node1 ???????????????(????os log ??OSW ????),??node1 ???????node2??node1?????????,????node1??????????,???reconfiguration,????????????,????????????,?11.2.0.2??,??rebootless restart???,node eviction ????????GI stack??,????????????,???node2?node1?????????,node1?ocssd.bin??????(????ocssd.log??)??node1???????????????,??node1??????GI node eviction????2.???????????????,?????10g???????,???????????3.??ocssd.bin ????Cssdagent/Cssdmonitor.bin????????????,??????,????,????oracssdagent_root.log ?oracssdmonitor_root.log ????????2012-07-23 14:09:58.506: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 75% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 21091 milliseconds ago based on invariant clock 269251595; now polling at 100 ms……2012-07-23 14:10:02.704: [ USRTHRD][1095805248] (:CLSN00111: )clsnproc_needreboot: Impending reboot at 90% of limit 28030; disk timeout 28030, network timeout 26380, last heartbeat from CSSD at epoch seconds 1343023777.410, 25291 milliseconds ago based on invariant clock 269251595; now polling at 100 ms……???????????????timeout???28 ???(misscount – reboot time)?4.?????????????????? ??????????????????????,????ocssd.bin??????,?????????????,?????????????ocssd.bin??,????????os???????????OSW??,???? ??????? cpu ???Linux OSWbb v5.0 node1SNAP_INTERVAL 30CPU_COUNT 8OSWBB_ARCHIVE_DEST /osw/archiveprocs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------r b swpd free buff cache si so bi bo in cs us sy id wa……zzz ***Mon Aug 30 17:55:21 CST 2012158 6 4200956 923940 7664 19088464 0 0 1296 3574 11153 231579 0 100 0 0 0zzz ***Mon Aug 30 17:55:53 CST 2012135 4 4200956 923760 7812 19089344 0 0 4 45 570 14563 0 100 0 0 0zzz ***Mon Aug 30 17:56:53 CST 2012126 2 4200956 923784 8396 19083620 0 0 196 1121 651 15941 2 98 0 0 0?????????????,????10g??????11gR2????????????????,??????,????????Note 1050693.1 : Troubleshooting 11.2 Clusterware Node Evictions (Reboots)

Read the article
Which MySQL Fork/Version to Pick??

- by Drew

As most of you know, Sun acquired MySQL (and later Oracle acquired Sun), and during these acquisitions, there were a lot of FUD in MySQL community which resulted in creation of various forks. Today we have MySQL from MySQL, Percona (XtraDB) MySQL, OurDelta MySQL, MariaDB, Drizzle to name a few. Which brings us to the source of the problem. We are in the process of upgrading our databases (hardware/software) and I would like to know which one of the forks should I go with. Each has their own set of pros/cons. We are currently using MySQL 5.0.x from MySQL/Linux on an 8-core machine. Our new hardware is a monster with 32 cores and 32GB of memory connecting to a fast NetApp Storage via FC. I would like to stick with MySQL from MySQL but I have heard horror stories on how badly MySQL 5.1 performs on many cores. I have also heard that MySQL 5.4 performs better on multi-core machines but that's still not production ready. In addition, I have also heard a lot of good things about Percona builds. This is what I know so far: MySQL 5.1 from MySQL: Reliable choice, but doesn't scale well on a big machine Percona: Scales well, good backing company. I don't have much experience with it MariaDB: Don't know much about it besides that it was founded by Original MySQL developers (including Monty) OurDelta: Don't know much Drizzle: Mostly optimized for cloud computing I would like to know what's the general notion about this problem. Which build/version should I go with? How are you guys picking your builds/versions? Thanks!

Read the article
?11gR2 RAC???ASM DISK Path????

- by Liu Maclean(???)

????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH??????????

Read the article
?11gR2 RAC???ASM DISK Path????

- by Liu Maclean(???)

????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH?????????, ???????????????,????!

Read the article
Essbase BSO Data Fragmentation

- by Ann Donahue

Essbase BSO Data Fragmentation Data fragmentation naturally occurs in Essbase Block Storage (BSO) databases where there are a lot of end user data updates, incremental data loads, many lock and send, and/or many calculations executed. If an Essbase database starts to experience performance slow-downs, this is an indication that there may be too much fragmentation. See Chapter 54 Improving Essbase Performance in the Essbase DBA Guide for more details on measuring and eliminating fragmentation: http://docs.oracle.com/cd/E17236_01/epm.1112/esb_dbag/daprcset.html Fragmentation is likely to occur in the following situations: Read/write databases that users are constantly updating data Databases that execute calculations around the clock Databases that frequently update and recalculate dense members Data loads that are poorly designed Databases that contain a significant number of Dynamic Calc and Store members Databases that use an isolation level of uncommitted access with commit block set to zero There are two types of data block fragmentation Free space tracking, which is measured using the Average Fragmentation Quotient statistic. Block order on disk, which is measured using the Average Cluster Ratio statistic. Average Fragmentation Quotient The Average Fragmentation Quotient ratio measures free space in a given database. As you update and calculate data, empty spaces occur when a block can no longer fit in its original space and will either append at the end of the file or fit in another empty space that is large enough. These empty spaces take up space in the .PAG files. The higher the number the more empty spaces you have, therefore, the bigger the .PAG file and the longer it takes to traverse through the .PAG file to get to a particular record. An Average Fragmentation Quotient value of 3.174765 means the database is 3% fragmented with free space. Average Cluster Ratio Average Cluster Ratio describes the order the blocks actually exist in the database. An Average Cluster Ratio number of 1 means all the blocks are ordered in the correct sequence in the order of the Outline. As you load data and calculate data blocks, the sequence can start to be out of order. This is because when you write to a block it may not be able to place back in the exact same spot in the database that it existed before. The lower this number the more out of order it becomes and the more it affects performance. An Average Cluster Ratio value of 1 means no fragmentation. Any value lower than 1 i.e. 0.01032828 means the data blocks are getting further out of order from the outline order. Eliminating Data Block Fragmentation Both types of data block fragmentation can be removed by doing a dense restructure or export/clear/import of the data. There are two types of dense restructure: 1. Implicit Restructures Implicit dense restructure happens when outline changes are done using EAS Outline Editor or Dimension Build. Essbase restructures create new .PAG files restructuring the data blocks in the .PAG files. When Essbase restructures the data blocks, it regenerates the index automatically so that index entries point to the new data blocks. Empty blocks are NOT removed with implicit restructures. 2. Explicit Restructures Explicit dense restructure happens when a manual initiation of the database restructure is executed. An explicit dense restructure is a full restructure which comprises of a dense restructure as outlined above plus the removal of empty blocks Empty Blocks vs. Fragmentation The existence of empty blocks is not considered fragmentation. Empty blocks can be created through calc scripts or formulas. An empty block will add to an existing database block count and will be included in the block counts of the database properties. There are no statistics for empty blocks. The only way to determine if empty blocks exist in an Essbase database is to record your current block count, export the entire database, clear the database then import the exported data. If the block count decreased, the difference is the number of empty blocks that had existed in the database.

Read the article
Redundant Interconnect with Highly Available IP (HAIP) ??

- by JaneZhang(???)

?11.2.0.2??,Oracle ?????Grid Infrastructure(GI)????Redundant Interconnect with Highly Available IP(HAIP). ?11.2.0.2??,???????????OS?????????,??HAIP??,????????????????????? ???GI????,??????????????????,??: ???,HAIP???????169.254.*.*,????????????HAIP ???1?,???4?(???????),??????????? ??:$ crsctl stat res -t -init NAME TARGET STATE SERVER STATE_DETAILS Cluster Resources--------------------------------------------------------------------------------ora.cluster_interconnect.haip 1 ONLINE ONLINE node2 #ifconfig -aeth1 Link encap:Ethernet HWaddr 00:14:22:BD:59:DE <=====???? inet addr:192.168.10.2 Bcast:192.168.10.255 Mask:255.255.255.0 inet6 addr: fe80::214:22ff:febd:59de/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:54297359 errors:0 dropped:0 overruns:0 frame:0 TX packets:58151488 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:837602539 (798.8 MiB) TX bytes:3809085161 (3.5 GiB) Interrupt:169 eth1:1 Link encap:Ethernet HWaddr 00:14:22:BD:59:DE <=====???????? inet addr:169.254.185.195 Bcast:169.254.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:169 ???????HAIP??cluster interconnect: Cluster communication is configured to use the following interface(s) for this instance 169.254.185.195cluster interconnect IPC version:Oracle UDP/IP (generic)IPC Vendor 1 proto 2ASM ?????? HAIP ??cluster interconnect: Cluster communication is configured to use the following interface(s) for this instance 169.254.185.195cluster interconnect IPC version:Oracle UDP/IP (generic)IPC Vendor 1 proto 2 Oracle????ASM??????HAIP??????????????????????,????????????????????,???????????,????HAIP????????????????,??????????? HAIP ????????????,????????????????? ??HAIP?????,???My Oracle Support Note ??1210883.1.

Read the article
Unexpected start of already-primary server processes when heartbeat on secondary is stopped.

- by vorik

Hi, I've got an active-passive Heartbeat cluster with Apache, MySQL, ActiveMQ and DRBD. Today, I wanted to perform hardware-maintenance on the secondary node (node04), so I stopped the heartbeat service before shutting it down. Then, the primary node (node03) received a shutdown notice from the secondary node (node04). This logging comes from the primary node: node03 heartbeat[4458]: 2010/03/08_08:52:56 info: Received shutdown notice from 'node04.companydomain.nl'. heartbeat[4458]: 2010/03/08_08:52:56 info: Resources being acquired from node04.companydomain.nl. harc[27522]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/status status heartbeat[27523]: 2010/03/08_08:52:56 info: Local Resource acquisition completed. mach_down[27567]: 2010/03/08_08:52:56 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[27567]: 2010/03/08_08:52:56 info: mach_down takeover complete for node node04.companydomain.nl. heartbeat[4458]: 2010/03/08_08:52:56 info: mach_down takeover complete. harc[27620]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp ip-request-resp[27620]: 2010/03/08_08:52:56 received ip-request-resp drbddisk OK yes ResourceManager[27645]: 2010/03/08_08:52:56 info: Acquiring resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:56 info: Running /etc/ha.d/resource.d/drbddisk start Filesystem[27700]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/mysql start mysql[27783]: 2010/03/08_08:52:57 Starting MySQL[ OK ] apache[27853]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/monitor start monitor[28160]: 2010/03/08_08:52:58 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq start activemq[28210]: 2010/03/08_08:52:58 Starting ActiveMQ Broker... ActiveMQ Broker is already running. ResourceManager[27645]: 2010/03/08_08:52:58 ERROR: Return code 1 from /etc/ha.d/resource.d/activemq ResourceManager[27645]: 2010/03/08_08:52:58 CRIT: Giving up resources due to failure of activemq ResourceManager[27645]: 2010/03/08_08:52:58 info: Releasing resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/IPaddr 1.2.3.212 stop IPaddr[28329]: 2010/03/08_08:52:58 INFO: ifconfig eth0:0 down IPaddr[28312]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28378]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28433]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/tivoli-cluster stop ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq stop activemq[28503]: 2010/03/08_08:53:01 Stopping ActiveMQ Broker... Stopped ActiveMQ Broker. ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/monitor stop monitor[28681]: 2010/03/08_08:53:01 ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master stop LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster down LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncbackup up LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster released ResourceManager[27645]: 2010/03/08_08:53:02 info: Running /etc/ha.d/resource.d/apache /etc/httpd/conf/httpd.conf stop apache[28782]: 2010/03/08_08:53:03 INFO: Killing apache PID 18390 apache[28782]: 2010/03/08_08:53:03 INFO: apache stopped. apache[28771]: 2010/03/08_08:53:03 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:03 info: Running /etc/ha.d/resource.d/mysql stop mysql[28851]: 2010/03/08_08:53:24 Shutting down MySQL.....................[ OK ] ResourceManager[27645]: 2010/03/08_08:53:24 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop Filesystem[29010]: 2010/03/08_08:53:25 INFO: Running stop for /dev/drbd0 on /data Filesystem[29010]: 2010/03/08_08:53:25 INFO: Trying to unmount /data Filesystem[29010]: 2010/03/08_08:53:25 ERROR: Couldn't unmount /data; trying cleanup with SIGTERM Filesystem[29010]: 2010/03/08_08:53:25 INFO: Some processes on /data were signalled Filesystem[29010]: 2010/03/08_08:53:27 INFO: unmounted /data successfully Filesystem[28999]: 2010/03/08_08:53:27 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:27 info: Running /etc/ha.d/resource.d/drbddisk stop heartbeat[4458]: 2010/03/08_08:53:29 WARN: node node04.companydomain.nl: is dead heartbeat[4458]: 2010/03/08_08:53:29 info: Dead node node04.companydomain.nl gave up resources. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth0 dead. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth1 dead. hb_standby[29193]: 2010/03/08_08:53:57 Going standby [foreign]. heartbeat[4458]: 2010/03/08_08:53:57 info: node03.companydomain.nl wants to go standby [foreign] Soo... What just happened here??? Heartbeat on node04 stopped and told node03, which was the active node at the time. Somehow, node03 decided to start the cluster processes that were already running. (For the processes that are not critical, I always return a 0 from the startupscript so it does not stops the entire cluster when a non-essential part fails.) When starting ActiveMQ, it returns status 1 because it is already running. This fails the node and shuts everything down. As heartbeat is not running on the secondary node, it cannot failover to there. When I tried to run ha_takeover to restart the resources, absolutely nothing happened. Only after I restarted heartbeat on the primary node the resources could be started (after a delay of 2 minutes). These are my questions: Why does heartbeat on the primary node try to start the cluster processes again? Why did ha_takeover not work? What can I do to prevent this from happening? Server configuration: DRBD: version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-20 09:14:48 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r---- ns:0 nr:6459432 dw:6459432 dr:0 al:0 bm:301 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 uname -a Linux node04 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux haresources node03.companydomain.nl \ drbddisk \ Filesystem::/dev/drbd0::/data::ext3 \ mysql \ apache::/etc/httpd/conf/httpd.conf \ LVSSyncDaemonSwap::master \ monitor \ activemq \ tivoli-cluster \ MailTo::[email protected]::DRBDFailureDrisAcc \ MailTo::[email protected]::DRBDFailureDrisAcc \ 1.2.3.212 ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log keepalive 500ms deadtime 30 warntime 10 initdead 120 udpport 694 mcast eth0 225.0.0.3 694 1 0 mcast eth1 225.0.0.4 694 1 0 auto_failback off node node03.companydomain.nl node node04.companydomain.nl respawn hacluster /usr/lib64/heartbeat/dopd apiauth dopd gid=haclient uid=hacluster Thank you very much in advance, Ger Apeldoorn

Read the article
MySQL: Pacemaker cannot start the failed master as a new slave?

- by quanta

I'm going to setup failover for MySQL replication (1 master and 1 slave) follow this guide: https://github.com/jayjanssen/Percona-Pacemaker-Resource-Agents/blob/master/doc/PRM-setup-guide.rst Here're the output of crm configure show: node serving-6192 \ attributes p_mysql_mysql_master_IP="192.168.6.192" node svr184R-638.localdomain \ attributes p_mysql_mysql_master_IP="192.168.6.38" primitive p_mysql ocf:percona:mysql \ params config="/etc/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/lib/mysql/mysql.sock" replication_user="repl" replication_passwd="x" test_user="test_user" test_passwd="x" \ op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \ op monitor interval="2s" role="Slave" timeout="30s" OCF_CHECK_LEVEL="1" \ op start interval="0" timeout="120s" \ op stop interval="0" timeout="120s" primitive writer_vip ocf:heartbeat:IPaddr2 \ params ip="192.168.6.8" cidr_netmask="32" \ op monitor interval="10s" \ meta is-managed="true" ms ms_MySQL p_mysql \ meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master" is-managed="true" colocation writer_vip_on_master inf: writer_vip ms_MySQL:Master order ms_MySQL_promote_before_vip inf: ms_MySQL:promote writer_vip:start property $id="cib-bootstrap-options" \ dc-version="1.0.12-unknown" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ no-quorum-policy="ignore" \ stonith-enabled="false" \ last-lrm-refresh="1341801689" property $id="mysql_replication" \ p_mysql_REPL_INFO="192.168.6.192|mysql-bin.000006|338" crm_mon: Last updated: Mon Jul 9 10:30:01 2012 Stack: openais Current DC: serving-6192 - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, 2 expected votes 2 Resources configured. ============ Online: [ serving-6192 svr184R-638.localdomain ] Master/Slave Set: ms_MySQL Masters: [ serving-6192 ] Slaves: [ svr184R-638.localdomain ] writer_vip (ocf::heartbeat:IPaddr2): Started serving-6192 Editing /etc/my.cnf on the serving-6192 of wrong syntax to test failover and it's working fine: svr184R-638.localdomain being promoted to become the master writer_vip switch to svr184R-638.localdomain Current state: Last updated: Mon Jul 9 10:35:57 2012 Stack: openais Current DC: serving-6192 - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, 2 expected votes 2 Resources configured. ============ Online: [ serving-6192 svr184R-638.localdomain ] Master/Slave Set: ms_MySQL Masters: [ svr184R-638.localdomain ] Stopped: [ p_mysql:0 ] writer_vip (ocf::heartbeat:IPaddr2): Started svr184R-638.localdomain Failed actions: p_mysql:0_monitor_5000 (node=serving-6192, call=15, rc=7, status=complete): not running p_mysql:0_demote_0 (node=serving-6192, call=22, rc=7, status=complete): not running p_mysql:0_start_0 (node=serving-6192, call=26, rc=-2, status=Timed Out): unknown exec error Remove the wrong syntax from /etc/my.cnf on serving-6192, and restart corosync, what I would like to see is serving-6192 was started as a new slave but it doesn't: Failed actions: p_mysql:0_start_0 (node=serving-6192, call=4, rc=1, status=complete): unknown error Here're snippet of the logs which I'm suspecting: Jul 09 10:46:32 serving-6192 lrmd: [7321]: info: rsc:p_mysql:0:4: start Jul 09 10:46:32 serving-6192 lrmd: [7321]: info: RA output: (p_mysql:0:start:stderr) Error performing operation: The object/attribute does not exist Jul 09 10:46:32 serving-6192 crm_attribute: [7420]: info: Invoked: /usr/sbin/crm_attribute -N serving-6192 -l reboot --name readable -v 0 The full logs: http://fpaste.org/AyOZ/ The strange thing is I can starting it manually: export OCF_ROOT=/usr/lib/ocf export OCF_RESKEY_config="/etc/my.cnf" export OCF_RESKEY_pid="/var/run/mysqld/mysqld.pid" export OCF_RESKEY_socket="/var/lib/mysql/mysql.sock" export OCF_RESKEY_replication_user="repl" export OCF_RESKEY_replication_passwd="x" export OCF_RESKEY_test_user="test_user" export OCF_RESKEY_test_passwd="x" sh -x /usr/lib/ocf/resource.d/percona/mysql start: http://fpaste.org/RVGh/ Did I make something wrong?

Read the article
propertyregex removes return characters in multiline

- by javydreamercsw

I'm using ants propertyregex method to change a property and it works fine up to a point. I'm lossing return characters. Here's what I'm trying to change: cluster.path=\ ${nbplatform.active.dir}/harness:\ ${nbplatform.active.dir}/platform:\ ${nbplatform.active.dir}/nb This is in a .properties file. So I'm trying to change it like this: <propertyregex property="cluster.path" input="${cluster.path}" regexp="nbplatform.active.dir" replace="xplatform.base" global="true" override="true"/> The stuff is replaced but I get: cluster.path= ${xplatform.base}/harness\: ${xplatform.base}/platform\: ${xplatform.base}/nb This brakes logic down the line not controlled by me (Netbeans) that uses the ':' as delimiter. Any idea?

Read the article
MySQL query problem on group by and max

- by 1s2a3n4j5e6e7v

My table structure is (id,cluster,qid,priority). I'm trying to figure out how I can display the maximum value of priority for each cluster. Say cluster 1 has priorities 100, 102, 105. I want to display the record containing 105. Please help.

Read the article
Tomcat session stickiness in session replication

- by rabbit

Hi, I am having a 2 instance load balanced and session replicated tomcat 6.0.20 cluster. Should sticky_session be set to true or false for in memory session replication. http://tomcat.apache.org/connectors-doc/reference/workers.html mentions : Set sticky_session to False when Tomcat is using a Session Manager which can persist session data across multiple instances of Tomcat. where as /tomcat-6.0-doc/cluster-howto.html (Cluster Basics) mentions : Make sure that your loadbalancer is configured for sticky session mode.

Read the article
Problem drawing a polygon on data clusters in MATLAB

- by Hossein

Hi, I have some data points which I have devided into them into some clusters with some clustering algorithms as the picture below:(it might takes some time for the image to appear) Each color represents different cluster. I have to draw polygons around each cluster. I use convhull for this reason. But as you can see the polygon for the red cluster is very big and covers a lot of areas, which is not the one I am looking for. I need to draw lines(ploygons) exactly around my data sets. For example in the picture above I want a polygon that is drawn exactly the same(and around) as the red cluster with the 3 branches. In other words, in this case I need a polygon with 3 branches to cover my red clusters not that big polygon that covers the whole area. Can anyone help me with this? Please Note that the solution should be general, because the clusters will change in each run of the algorithm, so it needs to be in a way that is general.

Read the article
Changing path to basedir of mysql

- by shantanuo

When-ever I need to start mysql from command line, I need to cd to the base directory and then use mysql command as shown below: # cd /home/ec2-user/percona-5.5.30-tokudb-7.0.1-fedora-x86_64/ # ./bin/mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 mysql> How do I start mysql simply by typing "mysql" at command prompt? I tried to export the path but it did not work. export path=$PATH:/home/ec2-user/percona-5.5.30-tokudb-7.0.1-fedora-x86_64/bin/

Read the article

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >