Search Results

Search found 527 results on 22 pages for 'a ha'.

Page 2/22 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

NFS v4, HA Migration, and stale handles on clients

- by Karl Katzke

I'm managing a server running NFS v4 with Pacemaker/OpenAIS. NFS is configured to use TCP. When I migrate the NFS server to another node in the Pacemaker cluster, even though the metadata is persisted, connections from the clients 'hang' and eventually time out after 90 seconds. After that 90 seconds, the old mountpoint becomes 'stale' and the mounted files can no longer be accessed. The 90 second grace period seems to be part of the server configuration and not the client configuration. I see this message on the server: kernel: NFSD: starting 90-second grace period If I restart the NFS client on the client nodes after I migrate (unmounting and then remounting the share), then I don't experience the problem, but connections and file transfers still interrupted. Three questions: What is the 90 second grace period? What's it there for? How can I keep the files from going stale on the clients without restarting them after I migrate the NFS server to another node? Is it actually possible to migrate the NFS server without having large file uploads drop?

Read the article
Best practice for Exchange 2010 HA topology considering 6 x Exchange licenses and TMG 2010

- by MadBoy

What would be best topology considering that: 6 x Exchange 2010 Standard Licenses 2 x Separate locations that are supposed to support redundancy in case of link problems 4 x Forefront TMG 2010 with Forefront Security and Forefront Protection/Security Multiple locations worldwide using those Exchange. Most locations will be connected with VPN Tunnel (the ones hosting Exchange for sure). I was thinking something like this: Location MAIN (about 70-100 people): 2x TMG 2010 in NLB 1x Exchange 2010 CAS/HUB Role 2x Exchange 2010 Mailbox Role (Active + Passive) Location SUPPORT (about 20 people): 2x TMG 2010 in NLB 1x Exchange 2010 CAS/HUB Role 2x Exchange 2010 Mailbox Role (Active + Passive) Management wants to make sure that in case of problems in main location (power failure, link loss etc) second location can support all traffic from around the world and vice-versa. We have 6-7 locations and more comming up (not big ones but like 10+ people per each location). I do know that CAS/HUB is single point of failure (and no NLB), but i simply lack more licenses to do some redundancy on that. What do you think about this approach? What would be better approach according to you?

Read the article
how to design pound -> varnish -> jboss for ha + loadbalancing

- by andreash

Hello, I'm planning a new infrastructure for our web application. We have two JBossAS5 servers, running in a cluster. Session state will be replicated via JBoss Cache. In front of that, there should be some cache, to speed up delivery of static elements. However, most of the traffic to our app will be via HTTPS. So far, I had been thinking of two Varnish caches in front of the JBossASs, each being configured for loadbalancing to the two JBossASs via round-robin. Since Varnish doesn't handle HTTPS, then there would need to be two pound proxies in front of the Varnishs, dealing with the HTTPS. The two pounds would be made high-available with Heartbeat/LinuxHA. The traffic to www.example.com would then be going through our firewall, from there to the virtual IP of the pounds, from there to the Varnishs, and from there to the JBossASs. Question 1: Does this make sense? Or is it overly complicated, and the same goal can be reached with simpler methods? Question 2: If my layout is fine, how do I configure the pound - Varnish step? Should I a) make the Varnish service high-available through Heartbeat/LinuxHA as well and direct traffic from pound to the virtual IP of the Varnishs, or should I rather b) Configure two independent Varnishs and use load-balancing in pound to address the different Varnishs? Thanks a lot for your insight! Andreas.

Read the article
(simple) linux HA with vmware vsphere?

- by derhelge

I hope my upcoming question is specific enough, and you are able and willing to support :-) We have several openSUSE VMs in an ESX-Cluster (three ESX-Servers) with an attached iSCSI-SAN. All of those Linux VMs are "single point of failure"-configured, which means in the case of a Web-Server: LAMP, storage, etc. everything on this machine. This was very simple and in case of a failure (in the last years: kernel panics or apache crashes) a simple reboot triggered by a script did it. But the problem is: How to upgrade/maintain the w(eb-)application or the underlying OS without downtime? This wasn't really managable and i did this in the early morning ;) How can i achieve a "simple" High-Availability Cluster now? I thought of: DRBD with heartbeat with 2 VMs. And for the storage a RDM (raw device mapped) LUN and change the read-write-permissions for both VMs. Is this a good idea? Anyone has a better solution?

Read the article
Linux HA - Best Heartbeat hardware solution

- by Martino Dino

Hi all I would ask anyone what is the best layer 2 medium for heartbeat in Linux and how it's best configured. More precisely I've been thinking about a dedicated NIC for that purpose but then i thought that if a switch breaks then i would loose the heartbeat connection for most of the cluster and STONITH 'BUM'!!! Will probably loose my job after :) Distributing the heartbeat onto the main NICs of every node trough a vif sounds reasonable but im not sure if this is the best option (at least the switches are redundant to some extent). Is it possible to use heartbeat over a bonded interface and that sounds reasonable? Do you have any other tip/solution for that issue?

Read the article
MySQL proxy HA with no need to reconnect after node failure

- by Matthias

I use MySQL with Galera wsrep to get synchronous replication, that part it's up and running I need to setup a kind of proxy to handle client connections. Since any node in cluster can fail, clients will not connect nodes directly, but only via proxy. Currently I use Galera Load Balancer which does it work, but with one exception: if one node fails, all clients connected via proxy to that node get connection error and need to reconnect. I have no control over server applications connected to proxy and some of them can't reconnect automatically and need manual restart. So the question is how to force proxy automatically redirect already connected applications to new data node, without need to reconnect?

Read the article
SonicWall HA "gotchas"?

- by Mark Henderson

We're looking to move away from PFSense and CARP to a pair of SonicWall NSA 24001 configured in Active/Passive for High Availability. I've never dealt with SonicWall before, so is there anything I should know that their sales guy won't tell me? I'm aware that they had an issue with a lot of their devices shutting down connectivity because of a licensing fault, and they have an overtly complex management GUI (on the older devices at least), but are there any other big "gotchas" that I need to be aware of before committing a not insubstantial amount of money towards these devices? 1If you're outside the US, the SonicWall global sites suck balls. Use the US site for all your product research, and then use your local site when you're after local information.

Read the article
PostgreSQL 9.0 HA load balancing between servers

- by Vijay Ramachandran

Hey folks, I'm bashing my head to configure load balancing stuff between two database servers. I have no clue whether, I can find any mechanism to implement this. I already tried to implement Heart beat clustering but it requires virtual Ip wherein I can't create virtual IP or assign my own IP address in amazon EC2. Is there a way to configure PostgreSQL database servers in similar to Amazon load balancing kind of thing ? If so, please suggest the solution. Thanks in advance.

Read the article
Simple Central Storage for HA mail server

- by jtnire

Hi Everyone, I will have 2 Postfix servers. One will be a backup of the other. What is the easiest method to provide central storage to both of these boxes? My infrastructure is very simple: Just a lot of Xen hosts, so there is no SAN or anything. Each Xen host does have RAID1 though. I don't mind mounting NFS shares on each of those mail servers, as long as the NFS server wasn't a single point of failure. Is there such a thing as redundant NFS? Any help would be appreciated Thanks

Read the article
Un-failing over a Cisco PIX 515e

- by ABrown

We had a power outage at our data center last week and when our dual PIX 515E running IOS 7.0(8) (configured with a failover cable) came back, they were in a failed over state where the Secondary unit is active and the Primary unit is standby I have tried 'failover reset', 'failover active', and 'failover reload-standby' as well as executing reloads on both units in a variety of orders, and they don't come back Primary/Active Secondary/Standby. The only thing in my arsenal that I haven't tried is driving to the data center and performing a hard reboot, which I hate to do. I have read How Failover Works on the Cisco Secure Firewall and it seems like this should be wicked straight forward. output of show failover on Primary: Failover On Cable status: Normal Failover unit Primary Failover LAN Interface: N/A - Serial-based failover enabled Unit Poll frequency 15 seconds, holdtime 45 seconds Interface Poll frequency 15 seconds Interface Policy 1 Monitored Interfaces 2 of 250 maximum Version: Ours 7.0(8), Mate 7.0(8) Last Failover at: 02:52:05 UTC Mar 10 2010 This host: Primary - Standby Ready Active time: 0 (sec) Interface outside (x.x.x.165): Normal Interface inside (y.y.y.3): Normal Other host: Secondary - Active Active time: 897045 (sec) Interface outside (x.x.x.164): Normal Interface inside (y.y.y.4): Normal Stateful Failover Logical Update Statistics Link : Unconfigured. output of show failover on Secondary: Failover On Cable status: Normal Failover unit Secondary Failover LAN Interface: N/A - Serial-based failover enabled Unit Poll frequency 15 seconds, holdtime 45 seconds Interface Poll frequency 15 seconds Interface Policy 1 Monitored Interfaces 2 of 250 maximum Version: Ours 7.0(8), Mate 7.0(8) Last Failover at: 02:03:04 UTC Feb 28 2010 This host: Secondary - Active Active time: 896925 (sec) Interface outside (x.x.x.164): Normal Interface inside (y.y.y.4): Normal Other host: Primary - Standby Ready Active time: 0 (sec) Interface outside (x.x.x.165): Normal Interface inside (y.y.y.3): Normal Stateful Failover Logical Update Statistics Link : Unconfigured. I'm seeing the following in my syslog: Mar 10 03:05:00 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. Mar 10 03:05:09 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reload-standby' command. Mar 10 03:05:12 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed. Mar 10 03:05:12 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed. Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed. Mar 10 03:06:09 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down. Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed. Mar 10 03:06:10 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up. Mar 10 03:06:10 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed. Mar 10 03:06:23 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready. Mar 10 03:06:23 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready. Mar 10 03:06:24 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready. Mar 10 03:07:05 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. Mar 10 03:07:31 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover active' command. Mar 10 03:08:04 fw1 %PIX-5-611103: User logged out: Uname: enable_1 Mar 10 03:08:04 fw1 %PIX-6-315011: SSH session from admin1_int on interface inside for user "pix" terminated normally Mar 10 03:08:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed. Mar 10 03:08:39 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed. Mar 10 03:09:10 fw1 %PIX-6-605005: Login permitted from admin1_int/36891 to inside:192.168.4.4/ssh for user "pix" Mar 10 03:09:23 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. Mar 10 03:09:38 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed. Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down. Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed. Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up. Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed. Mar 10 03:09:52 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready. Mar 10 03:09:52 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready. Mar 10 03:09:53 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready. I'm not exactly sure how to interpret that syslog data. Primary doesn't seem to even try to become Active. When I reload the individual units separately, my connections are retained, so it doesn't seem like I have a real hardware failure. Is there something I can query (IOS or SNMP) to check for hardware issues? Any thoughts? My IOS-fu is weak. Thanks for any help you might provide, Aaron

Read the article
High Availability for IaaS, PaaS and SaaS in the Cloud

- by BuckWoody

Outages, natural disasters and unforeseen events have proved that even in a distributed architecture, you need to plan for High Availability (HA). In this entry I'll explain a few considerations for HA within Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). In a separate post I'll talk more about Disaster Recovery (DR), since each paradigm has a different way to handle that. Planning for HA in IaaS IaaS involves Virtual Machines - so in effect, an HA strategy here takes on many of the same characteristics as it would on-premises. The primary difference is that the vendor controls the hardware, so you need to verify what they do for things like local redundancy and so on from the hardware perspective. As far as what you can control and plan for, the primary factors fall into three areas: multiple instances, geographical dispersion and task-switching. In almost every cloud vendor I've studied, to ensure your application will be protected by any level of HA, you need to have at least two of the Instances (VM's) running. This makes sense, but you might assume that the vendor just takes care of that for you - they don't. If a single VM goes down (for whatever reason) then the access to it is lost. Depending on multiple factors, you might be able to recover the data, but you should assume that you can't. You should keep a sync to another location (perhaps the vendor's storage system in another geographic datacenter or to a local location) to ensure you can continue to serve your clients. You'll also need to host the same VM's in another geographical location. Everything from a vendor outage to a network path problem could prevent your users from reaching the system, so you need to have multiple locations to handle this. This means that you'll have to figure out how to manage state between the geo's. If the system goes down in the middle of a transaction, you need to figure out what part of the process the system was in, and then re-create or transfer that state to the second set of systems. If you didn't write the software yourself, this is non-trivial. You'll also need a manual or automatic process to detect the failure and re-route the traffic to your secondary location. You could flip a DNS entry (if your application can tolerate that) or invoke another process to alias the first system to the second, such as load-balancing and so on. There are many options, but all of them involve coding the state into the application layer. If you've simply moved a state-ful application to VM's, you may not be able to easily implement an HA solution. Planning for HA in PaaS Implementing HA in PaaS is a bit simpler, since it's built on the concept of stateless applications deployment. Once again, you need at least two copies of each element in the solution (web roles, worker roles, etc.) to remain available in a single datacenter. Also, you need to deploy the application again in a separate geo, but the advantage here is that you could work out a "shared storage" model such that state is auto-balanced across the world. In fact, you don't have to maintain a "DR" site, the alternate location can be live and serving clients, and only take on extra load if the other site is not available. In Windows Azure, you can use the Traffic Manager service top route the requests as a type of auto balancer. Even with these benefits, I recommend a second backup of storage in another geographic location. Storage is inexpensive; and that second copy can be used for not only HA but DR. Planning for HA in SaaS In Software-as-a-Service (such as Office 365, or Hadoop in Windows Azure) You have far less control over the HA solution, although you still maintain the responsibility to ensure you have it. Since each SaaS is different, check with the vendor on the solution for HA - and make sure you understand what they do and what you are responsible for. They may have no HA for that solution, or pin it to a particular geo, or perhaps they have a massive HA built in with automatic load balancing (which is often the case). All of these options (with the exception of SaaS) involve higher costs for the design. Do not sacrifice reliability for cost - that will always cost you more in the end. Build in the redundancy and HA at the very outset of the project - if you try to tack it on later in the process the business will push back and potentially not implement HA. References: http://www.bing.com/search?q=windows+azure+High+Availability (each type of implementation is different, so I'm routing you to a search on the topic - look for the "Patterns and Practices" results for the area in Azure you're interested in)

Read the article
From a language design perspective, if Javascript objects are simply associative arrays, then why ha

- by Christopher Altman

I was reading about objects in O'Reilly Javascript Pocket Reference and the book made the following statement. An object is a compound data type that contains any number of properties. Javascript objects are associative arrays: they associate arbitrary data values with arbitrary names. From a language design perspective, if objects are simply associative arrays, then why have objects? I appreciate the convenience of having objects in the language, but if convenience is the main purpose for adding a data type, then how do you decide what to add and what to not add in a language? A language can quickly become bloated and less valuable if it is weighed down by several overlapping methods and data types (Is this a true statement or am I missing something).

Read the article
HA with nginx and cloud environment

- by gotts

I have a node in cloud environment which is used now as nginx and mongrels behind it. This is what nginx config looks like: upstream mongrel { server 127.0.0.1:8000; server 127.0.0.1:8001; server 127.0.0.1:8002; } I want to achieve the following: add another node nginx has to know about this new node automatically without stopping him, changing config(manually adding new node's mongrels) and starting it again. How can I make my load balancer(nginx) work in the way so it can be self-aware of nodes in cloud?

Read the article
Oracle Database 11g Underground Advice for Database Administrators, by April C. Sims

- by alejandro.vargas

Recently I received a request to review the book "Oracle Database 11g Underground Advice for Database Administrators" by April C. Sims I was happy to have the opportunity know some details about the author, she is an active contributor to the Oracle DBA community, through her blog "Oracle High Availability" . The book is a serious and interesting work, I think it provides a good study and reference guide for DBA's that want to understand and implement highly available environments. She starts walking over the more general aspects and skills required by a DBA and then goes on explaining the steps required to implement Data Guard, using RMAN, upgrading to 11g, etc.

Read the article
Exadata Parameter _AUTO_MANAGE_EXADATA_DISKS

- by AVargas

Normal 0 false false false EN-US X-NONE HE MicrosoftInternetExplorer4 DefSemiHidden="true" DefQFormat="false" DefPriority="99" LatentStyleCount="267" UnhideWhenUsed="false" QFormat="true" Name="Normal"/ UnhideWhenUsed="false" QFormat="true" Name="heading 1"/ UnhideWhenUsed="false" QFormat="true" Name="Title"/ UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/ UnhideWhenUsed="false" QFormat="true" Name="Strong"/ UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/ UnhideWhenUsed="false" Name="Table Grid"/ UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/ UnhideWhenUsed="false" Name="Light Shading"/ UnhideWhenUsed="false" Name="Light List"/ UnhideWhenUsed="false" Name="Light Grid"/ UnhideWhenUsed="false" Name="Medium Shading 1"/ UnhideWhenUsed="false" Name="Medium Shading 2"/ UnhideWhenUsed="false" Name="Medium List 1"/ UnhideWhenUsed="false" Name="Medium List 2"/ UnhideWhenUsed="false" Name="Medium Grid 1"/ UnhideWhenUsed="false" Name="Medium Grid 2"/ UnhideWhenUsed="false" Name="Medium Grid 3"/ UnhideWhenUsed="false" Name="Dark List"/ UnhideWhenUsed="false" Name="Colorful Shading"/ UnhideWhenUsed="false" Name="Colorful List"/ UnhideWhenUsed="false" Name="Colorful Grid"/ UnhideWhenUsed="false" Name="Light Shading Accent 1"/ UnhideWhenUsed="false" Name="Light List Accent 1"/ UnhideWhenUsed="false" Name="Light Grid Accent 1"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/ UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/ UnhideWhenUsed="false" QFormat="true" Name="Quote"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/ UnhideWhenUsed="false" Name="Dark List Accent 1"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/ UnhideWhenUsed="false" Name="Colorful List Accent 1"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/ UnhideWhenUsed="false" Name="Light Shading Accent 2"/ UnhideWhenUsed="false" Name="Light List Accent 2"/ UnhideWhenUsed="false" Name="Light Grid Accent 2"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/ UnhideWhenUsed="false" Name="Dark List Accent 2"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/ UnhideWhenUsed="false" Name="Colorful List Accent 2"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/ UnhideWhenUsed="false" Name="Light Shading Accent 3"/ UnhideWhenUsed="false" Name="Light List Accent 3"/ UnhideWhenUsed="false" Name="Light Grid Accent 3"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/ UnhideWhenUsed="false" Name="Dark List Accent 3"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/ UnhideWhenUsed="false" Name="Colorful List Accent 3"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/ UnhideWhenUsed="false" Name="Light Shading Accent 4"/ UnhideWhenUsed="false" Name="Light List Accent 4"/ UnhideWhenUsed="false" Name="Light Grid Accent 4"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/ UnhideWhenUsed="false" Name="Dark List Accent 4"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/ UnhideWhenUsed="false" Name="Colorful List Accent 4"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/ UnhideWhenUsed="false" Name="Light Shading Accent 5"/ UnhideWhenUsed="false" Name="Light List Accent 5"/ UnhideWhenUsed="false" Name="Light Grid Accent 5"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/ UnhideWhenUsed="false" Name="Dark List Accent 5"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/ UnhideWhenUsed="false" Name="Colorful List Accent 5"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/ UnhideWhenUsed="false" Name="Light Shading Accent 6"/ UnhideWhenUsed="false" Name="Light List Accent 6"/ UnhideWhenUsed="false" Name="Light Grid Accent 6"/ UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/ UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/ UnhideWhenUsed="false" Name="Dark List Accent 6"/ UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/ UnhideWhenUsed="false" Name="Colorful List Accent 6"/ UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/ UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/ UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/ UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/ UnhideWhenUsed="false" QFormat="true" Name="Book Title"/ /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:Arial; mso-bidi-theme-font:minor-bidi;} Exadata auto disk management is controlled by the parameter _AUTO_MANAGE_EXADATA_DISKS. The default value for this parameter is TRUE.When _AUTO_MANAGE_EXADATA_DISKS is enabled, Exadata automate the following disk operations:If a griddisk becomes unavailable/available, ASM will OFFLINE/ONLINE it.If a physicaldisk fails or its status change to predictive failure, for all griddisks built on it ASM will DROP FORCE the failed ones and DROP the ones with predictive failures.If a flashdisk performance degrades, if there are griddisks built on it, they will be DROPPED FORCE in ASM.If a physicaldisk is replaced, the celldisk and griddisks will be recreated and the griddisks will be automatically ADDED in ASM, if they were automatically dropped by ASM. If you manually drop the disks, that will not happen.If a NORMAL, ONLINE griddisk is manually dropped, FORCE option should not be used, otherwise the disk will be automatically added back in ASM. If a gridisk is inactivated, ASM will automatically OFFLINE it.If a gridisk is activated, ASM will automatically ONLINED it. There are some error conditions that may require to temporarily disable _AUTO_MANAGE_EXADATA_DISKS.Details on MOS 1408865.1 - Exadata Auto Disk Management Add disk failing and ASM Rebalance interrupted with error ORA-15074. Immediately after taking care of the problem _AUTO_MANAGE_EXADATA_DISKS should be set back to its default value of TRUE. Full details on Auto disk management feature in Exadata (Doc ID 1484274.1)

Read the article
OSB 10gR3 High Availability

- by [email protected]

OSB_HA_WP.pdf

Read the article
Debugging a classic ReaderWriterLock deadlock with SOSex.dll

- by Latest Microsoft Blogs

I was helping out on an issue the other day where the process would stall if they added enough users in their load tests. Btw, serious kudos to them for making load tests, so much nicer to work with a problem in test rather than when it is getting Read More......(read more)

Read the article
Architecture for highly available MySQL with automatic failover in physically diverse locations

- by Warner

I have been researching high availability (HA) solutions for MySQL between data centers. For servers located in the same physical environment, I have preferred dual master with heartbeat (floating VIP) using an active passive approach. The heartbeat is over both a serial connection as well as an ethernet connection. Ultimately, my goal is to maintain this same level of availability but between data centers. I want to dynamically failover between both data centers without manual intervention and still maintain data integrity. There would be BGP on top. Web clusters in both locations, which would have the potential to route to the databases between both sides. If the Internet connection went down on site 1, clients would route through site 2, to the Web cluster, and then to the database in site 1 if the link between both sites is still up. With this scenario, due to the lack of physical link (serial) there is a more likely chance of split brain. If the WAN went down between both sites, the VIP would end up on both sites, where a variety of unpleasant scenarios could introduce desync. Another potential issue I see is difficulty scaling this infrastructure to a third data center in the future. The network layer is not a focus. The architecture is flexible at this stage. Again, my focus is a solution for maintaining data integrity as well as automatic failover with the MySQL databases. I would likely design the rest around this. Can you recommend a proven solution for MySQL HA between two physically diverse sites? Thank you for taking the time to read this. I look forward to reading your recommendations.

Read the article
Heartbeat on SLES 11

- by Autobyte

I am trying to setup Linux-HA (Heartbeat) on a new SLES 11 Server anyone know of a good step-by-step for this the site's docs leave a little to be desired... Thanks!

Read the article
What does it mean when ARP shows <incomplete> on eth1

- by Geoff Dalgas

We have been using HAProxy along with heartbeat from the Linux-HA project. We are using two linux instances to provide a failover. Each server has with their own public IP and a single IP which is shared between the two using a virtual interface (eth1:1) at IP: 69.59.196.211 The virtual interface (eth1:1) IP 69.59.196.211 is configured as the gateway for the windows servers behind them and we use ip_forwarding to route traffic. We are experiencing an occasional network outage on one of our windows servers behind our linux gateways. HAProxy will detect the server is offline which we can verify by remoting to the failed server and attempting to ping the gateway: Pinging 69.59.196.211 with 32 bytes of data: Reply from 69.59.196.220: Destination host unreachable. Running arp -a on this failed server shows that there is no entry for the gateway address (69.59.196.211): Interface: 69.59.196.220 --- 0xa Internet Address Physical Address Type 69.59.196.161 00-26-88-63-c7-80 dynamic 69.59.196.210 00-15-5d-0a-3e-0e dynamic 69.59.196.212 00-21-5e-4d-45-c9 dynamic 69.59.196.213 00-15-5d-00-b2-0d dynamic 69.59.196.215 00-21-5e-4d-61-1a dynamic 69.59.196.217 00-21-5e-4d-2c-e8 dynamic 69.59.196.219 00-21-5e-4d-38-e5 dynamic 69.59.196.221 00-15-5d-00-b2-0d dynamic 69.59.196.222 00-15-5d-0a-3e-09 dynamic 69.59.196.223 ff-ff-ff-ff-ff-ff static 224.0.0.22 01-00-5e-00-00-16 static 224.0.0.252 01-00-5e-00-00-fc static 225.0.0.1 01-00-5e-00-00-01 static On our linux gateway instances arp -a shows: peak-colo-196-220.peak.org (69.59.196.220) at <incomplete> on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-222.peak.org (69.59.196.222) at 00:15:5d:0a:3e:09 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 Why would arp occasionally set the entry for this failed server as <incomplete>? Should we be defining our arp entries statically? I've always left arp alone since it works 99% of the time, but in this one instance it appears to be failing. Are there any additional troubleshooting steps we can take help resolve this issue? THINGS WE HAVE TRIED I added a static arp entry for testing on one of the linux gateways which still didn't help. root@haproxy2:~# arp -a peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-221.peak.org (69.59.196.221) at 00:15:5d:00:b2:0d [ether] on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 peak-colo-196-220.peak.org (69.59.196.220) at 00:21:5e:4d:30:8d [ether] PERM on eth1 root@haproxy2:~# arp -i eth1 -s 69.59.196.220 00:21:5e:4d:30:8d root@haproxy2:~# ping 69.59.196.220 PING 69.59.196.220 (69.59.196.220) 56(84) bytes of data. --- 69.59.196.220 ping statistics --- 7 packets transmitted, 0 received, 100% packet loss, time 6006ms Rebooting the windows web server solves this issue temporarily with no other changes to the network but our experience shows this issue will come back. Swapping network cards and switches I noticed the link light on the port of the switch for the failed windows server was running at 100Mb instead of 1Gb on the failed interface. I moved the cable to several other open ports and the link indicated 100Mb for each port that I tried. I also swapped the cable with the same result. I tried changing the properties of the network card in windows and the server locked up and required a hard reset after clicking apply. This windows server has two physical network interfaces so I have swapped the cables and network settings on the two interfaces to see if the problem follows the interface. If the public interface goes down again we will know that it is not an issue with the network card. (We also tried another switch we have on hand, no change) Changing network hardware driver versions We've had the same problem with the latest Broadcom driver, as well as the built-in driver that ships in Windows Server 2008 R2. Replacing network cables As a last ditch effort we remembered another change that occurred was the replacement of all of the patch cords between our servers / switch. We had purchased two sets, one green of lengths 1ft - 3ft for the private interfaces and another set of red cables for the public interfaces. We swapped out all of the public interface patch cables with a different brand and ran our servers without issue for a full week ... aaaaaand then the problem recurred. Disable checksum offload, remove TProxy We also tried disabling TCP/IP checksum offload in the driver, no change. We're now pulling out TProxy and moving to a more traditional x-forwarded-for network arrangement without any fancy IP address rewriting. We'll see if that helps.

Read the article
Dedicated Servers: Is one better then two for LAMP pseudo HA setup? [closed]

- by bikedorkseattle

Possible Duplicate: How to find web hosting that meets my requirements? I know there are zillions of commentary about hosting out there, but I haven't read much about this. Our current well known host is having too many problems, the hardware we are on it subpar, and I'm ready to leave. A day of downtime can cost as much as our monthly hosting bill. A month of bad performance is just killing us right now, user and google wise. I'm wondering about running two dedicated boxes for LAMP, one running as the primary Nginx/Apache (proxy pass), and the other as the MySQL box. Running a single box scares the bejesus out of me because who knows how long it will take anyone to fix a raid card or whatever. The idea is to set this up using some sort of failover system using pacemaker and heartbeat. If one server goes down the other can take over for the other running both web and db. There are some good articles over at Linode about this. I have a few DBs that are 1GB+ and would like to load them into memory. Because of this, I'm shying away from a Linode HA setup because for the price I could do it with two dedicated like I described. Am I mad or an idiot? What are people out there doing for pseodu high availability good performance setups under $400/month? I'm a webmaster; I do a lot of things none of it that well :)

Read the article
could corosync can support unicast heartbeat mode?

- by Emre He

could corosync can support unicast heartbeat mode? from another thread in serverfault, some guy raised below corosync conf: totem { version: 2 secauth: off interface { member { memberaddr: 10.xxx.xxx.xxx } member { memberaddr: 10.xxx.xxx.xxx } ringnumber: 0 bindnetaddr: 10.xxx.xxx.xxx mcastport: 694 } transport: udpu } is this conf type means unicast mode? thanks, Emre

Read the article
Active node stops resources when pasive node is shutdown

- by Wakaru44

2 nodes, active/pasive. 2 resources, a virtual ip, openLdap, and the nfs mount where openldap saves the data. When both nodes are up, things worked fine. You could move resources away and put the active in stanby. But when i rebooted the passive node, ( with the resources in the active node), and the passive node loses conectivity, all the resources in the active where stopped by pacemaker. I'm reading the documentation right now, but I just need a little quick tip to figure what could be hapenning here. Im using: corosync pacemaker RHEL 6

Read the article
Linux HA / cluster: what are the differences between Pacemaker, Heartbeat, Corosync, wackamole?

- by Continuation

Can you help me understand Linux HA? Pacemaker, Heartbeat, Corosync seem to be part of a whole HA stack, but how do they fit together? How does wackamole differ from Pacemaker/Heartbeat/Corosync? I've seen opinions that wackamole is better than Heartbeat because it's peer-based. Is that valid? The last release of wackamole was 2.5 years ago. Is it still being maintained or active? What would you recommend for HA setup for web/application/database servers?

Read the article
Heartbeat won't successfully start up resources from a cold boot when a failed node is present

- by Matthew

I currently have two ubuntu servers running Heartbeat and DRBD. The servers are directory connected with a 1000Mbps crossover cable on eth1 and have access to an IP camera LAN on eth0. Now, let's say that one node is down and the remaining functional node is booting after having been shut down. The node that is still functioning won't start up heartbeat and provide access to the drbd resource from a cold boot. I have to manually restart heartbeat by sudo service heartbeat restart to get everything up and running. How can I get it to start fine from a cold start, when only one server is present? Here is the ha.cf: debug /var/log/ha-debug logfile /var/log/ha-log logfacility none keepalive 2 deadtime 10 warntime 7 initdead 60 ucast eth1 192.168.2.2 ucast eth0 10.1.10.201 node EMserver1 node EMserver2 respawn hacluster /usr/lib/heartbeat/ipfail ping 10.1.10.22 10.1.10.21 10.1.10.11 auto_failback off Some material from the syslog: harc[4604]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/status status mach_down[4632]: 2012/11/27_13:54:49 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[4632]: 2012/11/27_13:54:49 info: mach_down takeover complete for node emserver2. Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: Initial resource acquisition complete (T_RESOURCES(us)) Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: mach_down takeover complete. IPaddr[4679]: 2012/11/27_13:54:49 INFO: Resource is stopped Nov 27 13:54:49 EMserver1 heartbeat: [4605]: info: Local Resource acquisition completed. harc[4713]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp ip-request-resp[4713]: 2012/11/27_13:54:49 received ip-request-resp IPaddr::10.1.10.254 OK yes ResourceManager[4732]: 2012/11/27_13:54:50 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[4759]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 start IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated nic for 10.1.10.254: eth0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated netmask for 10.1.10.254: 255.255.255.0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: eval ifconfig eth0:0 10.1.10.254 netmask 255.255.255.0 broadcast 10.1.10.255 IPaddr[4804]: 2012/11/27_13:54:50 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/drbddisk r0 start Filesystem[4965]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 start Filesystem[5039]: 2012/11/27_13:54:50 INFO: Running start for /dev/drbd1 on /shr Filesystem[5033]: 2012/11/27_13:54:51 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:51 info: Running /etc/init.d/nfs-kernel-server start Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: Local Resource acquisition completed. (none) Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: local resource transition completed. Nov 27 13:57:46 EMserver1 heartbeat: [4586]: info: Heartbeat shutdown in progress. (4586) Nov 27 13:57:46 EMserver1 heartbeat: [5286]: info: Giving up all HA resources. ResourceManager[5301]: 2012/11/27_13:57:46 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[5372]: 2012/11/27_13:57:46 INFO: Running stop for /dev/drbd1 on /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: Trying to unmount /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: unmounted /shr successfully Filesystem[5366]: 2012/11/27_13:57:47 INFO: Success ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[5509]: 2012/11/27_13:57:47 INFO: ifconfig eth0:0 down IPaddr[5497]: 2012/11/27_13:57:47 INFO: Success Nov 27 13:57:47 EMserver1 heartbeat: [5286]: info: All HA resources relinquished. Nov 27 13:57:48 EMserver1 heartbeat: [4586]: info: killing /usr/lib/heartbeat/ipfail process group 4603 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBFIFO process 4589 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4590 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4591 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4592 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4593 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4594 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4595 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4596 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4597 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4598 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4599 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4589 exited. 11 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4596 exited. 10 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4598 exited. 9 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4590 exited. 8 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4595 exited. 7 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4591 exited. 6 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4592 exited. 5 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4593 exited. 4 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4597 exited. 3 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4594 exited. 2 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4599 exited. 1 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: emserver1 Heartbeat shutdown complete. Here is some more from the log ResourceManager[2576]: 2012/11/28_16:32:42 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[2602]: 2012/11/28_16:32:42 INFO: Running OK Filesystem[2653]: 2012/11/28_16:32:43 INFO: Running OK Nov 28 16:32:52 EMserver1 heartbeat: [1695]: WARN: node emserver2: is dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Dead node emserver2 gave up resources. Nov 28 16:32:52 EMserver1 ipfail: [1807]: info: Status update: Node emserver2 now has status dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Link emserver2:eth1 dead. Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: NS: We are still alive! Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: Link Status update: Link emserver2/eth1 now has status dead Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Asking other side for ping node count. Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Checking remote count of ping nodes. Nov 28 16:32:57 EMserver1 heartbeat: [1695]: info: Heartbeat shutdown in progress. (1695) Nov 28 16:32:57 EMserver1 heartbeat: [2734]: info: Giving up all HA resources. ResourceManager[2751]: 2012/11/28_16:32:57 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[2829]: 2012/11/28_16:32:57 INFO: Running stop for /dev/drbd1 on /shr Filesystem[2829]: 2012/11/28_16:32:57 INFO: Trying to unmount /shr Filesystem[2829]: 2012/11/28_16:32:58 INFO: unmounted /shr successfully Filesystem[2823]: 2012/11/28_16:32:58 INFO: Success ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[2971]: 2012/11/28_16:32:58 INFO: ifconfig eth0:0 down IPaddr[2958]: 2012/11/28_16:32:58 INFO: Success Nov 28 16:32:58 EMserver1 heartbeat: [2734]: info: All HA resources relinquished. Nov 28 16:32:59 EMserver1 heartbeat: [1695]: info: killing /usr/lib/heartbeat/ipfail process group 1807 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBFIFO process 1777 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1778 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1779 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1780 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1781 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1782 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1783 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1784 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1785 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1786 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1787 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1778 exited. 11 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1779 exited. 10 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1780 exited. 9 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1781 exited. 8 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1782 exited. 7 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1783 exited. 6 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1784 exited. 5 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1785 exited. 4 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1786 exited. 3 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1787 exited. 2 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1777 exited. 1 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: emserver1 Heartbeat shutdown complete. If I restarted heartbeat at this point... the resources heartbeat controls would start up fine.... please help!

Read the article

Search Results

Search found 527 results on 22 pages for 'a ha'.

Page 2/22 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Karl Katzke

- by MadBoy

- by andreash

- by derhelge

- by Martino Dino

- by Matthias

- by Mark Henderson

- by Vijay Ramachandran

- by jtnire

- by ABrown

- by BuckWoody

- by Christopher Altman

- by gotts

- by alejandro.vargas

- by AVargas

- by [email protected]

- by Latest Microsoft Blogs

- by Warner

- by Autobyte

- by Geoff Dalgas

- by bikedorkseattle

- by Emre He

- by Wakaru44

- by Continuation

- by Matthew

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >