Search Results

Search found 504 results on 21 pages for 'failover'.

Page 2/21

  • Cluster failover and strange gratuitous arp behavior

    - by lazerpld
    I am experiencing a strange Windows 2008 R2 cluster issue that is bothering me. I feel that I have come close to identifying the issue, but I still don't fully understand what is happening. I have a two-node Exchange 2007 cluster running on two 2008 R2 servers. The Exchange cluster application works fine when running on the "primary" cluster node. The problem occurs when failing over the cluster resource to the secondary node.

    When failing over the cluster to the "secondary" node, which incidentally is on the same subnet as the "primary", the failover initially works OK and the cluster resource continues to work for a couple of minutes on the new node. This means that the receiving node does send out a gratuitous ARP reply packet that updates the ARP tables on the network. But after some amount of time (typically within 5 minutes) something updates the ARP tables again, because all of a sudden the cluster service does not answer pings.

    So basically I start a ping to the Exchange cluster address while it is running on the "primary" node. It works just great. I fail over the cluster resource group to the "secondary" node and I only lose one ping, which is acceptable. The cluster resource still answers for some time after being failed over, and then all of a sudden the ping starts timing out. This tells me that the ARP table is initially updated by the secondary node, but then something (which I haven't found yet) wrongly updates it again, probably with the primary node's MAC.

    Why does this happen? Has anyone experienced the same problem? The cluster is NOT running NLB, and the problem stops immediately after failing back to the primary node, where there are no problems. Each node is using NIC teaming (Intel) with ALB. Each node is on the same subnet and has its gateway and so on entered correctly as far as I can tell.

    Edit: I was wondering if it could be related to network binding order? I have noticed that the only difference I can see from node to node is in the local ARP table. On the "primary" node the ARP table is generated with the cluster address as the source, while on the "secondary" it is generated from the node's own network card. Any input on this?

    Edit: OK, here is the connection layout.

    Cluster address: A.B.6.208/25
    Exchange application address: A.B.6.212/25
    Node A: 3 physical NICs. Two are teamed using Intel's teaming with the address A.B.6.210/25, called "public". The last one is used for cluster traffic, called "private", with 10.0.0.138/24.
    Node B: 3 physical NICs. Two are teamed using Intel's teaming with the address A.B.6.211/25, called "public". The last one is used for cluster traffic, called "private", with 10.0.0.139/24.

    Each node sits in a separate datacenter, and the datacenters are connected together. The end switches are Cisco in DC1 and Nexus 5000/2000 in DC2.

    Edit: I have been testing a little more. I have now created an empty application on the same cluster and given it another IP address on the same subnet as the Exchange application. After failing this empty application over, I see the exact same problem occurring. After one or two minutes, clients on other subnets cannot ping the virtual IP of the application. But while clients on other subnets cannot, another server from another cluster on the same subnet has no trouble pinging. If I then fail over again to the original state, the situation is the opposite: now clients on the same subnet cannot ping, and clients on other subnets can. We have another cluster set up the same way and on the same subnet, with the same Intel network cards, the same drivers and the same teaming settings. There we are not seeing this, so it is somewhat confusing.

    Edit: OK, done some more research. I removed the NIC teaming on the secondary node, since it didn't work anyway. After some standard problems following that, I finally managed to get it up and running again with the old NIC teaming settings on one single physical network card. Now I am not able to reproduce the problem described above. So it is somehow related to the teaming. Maybe some kind of bug?

    Edit: Did some more failing over without being able to make it fail. So removing the NIC team looks like it was a workaround. I then tried to re-establish the Intel NIC teaming with ALB (as it was before) and I still cannot make it fail. This is annoying because now I cannot pinpoint the root of the problem. It just seems to be some kind of MS/Intel hiccup, which is hard to accept: what if the problem reoccurs in 14 days? One strange thing did happen, though. After recreating the NIC team I was not able to rename the team to "PUBLIC", which is what the old team was called. So something has not been cleaned up in Windows, although the server HAS been restarted!

    Edit: OK, after re-establishing the ALB teaming the error came back. So I am now going to do some thorough testing and I will get back with my observations. One thing is for sure: it is related to Intel 82575EB NICs, ALB and gratuitous ARP.
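
One way to narrow down which device re-announces the address is to log every change of the MAC address recorded for the cluster IP. The following is only a sketch, under the assumption that a Linux host with iproute2 is available on the same subnet as the cluster; the function names are made up for illustration, and the parser expects the usual `ip neigh show` line layout (`<ip> dev <iface> lladdr <mac> <state>`).

```shell
#!/bin/bash
# mac_for_ip IP: read `ip neigh show`-style lines on stdin and print the
# link-layer address recorded for IP (prints nothing if there is no entry).
mac_for_ip() {
    awk -v ip="$1" '$1 == ip {
        for (i = 2; i < NF; i++)
            if ($i == "lladdr") { print $(i + 1); exit }
    }'
}

# watch_mac IP [INTERVAL]: poll the local neighbour table and log every change
# of the MAC address behind IP. Runs until interrupted.
watch_mac() {
    local ip=$1 interval=${2:-1} last="" mac
    while true; do
        ping -c 1 -W 1 "$ip" >/dev/null 2>&1   # refresh the neighbour entry
        mac=$(ip neigh show | mac_for_ip "$ip")
        if [ -n "$mac" ] && [ "$mac" != "$last" ]; then
            echo "$(date '+%F %T') $ip is now at $mac (was: ${last:-unknown})"
            last=$mac
        fi
        sleep "$interval"
    done
}
```

Running `watch_mac <cluster address>` during a failover would show whether the entry flips back to a MAC belonging to the other node's team. Since clients on other subnets are affected as well, the router's ARP table would have to be checked separately on the router itself.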

    Read the article

  • Add second IP address to an existing SQL 2008 failover clustering

    - by Cédric Boivin
    Hello, I currently have a failover cluster on Windows Server 2008 with SQL Server 2008. Each server has two network cards on two different networks: one on 10.10.10.x and the other on 192.168.99.x. I want my SQL Server cluster to listen on both networks. Is this possible, and how do I add the new IP address? When I add a new IP address directly in the cluster and then telnet to port 1433 on the new cluster IP address, it does not work. Thanks

    Read the article

  • IIS7 failover cluster across datacenters

    - by Scott
    Hello, I have servers in two different datacenters, with each datacenter getting static IPs. What I would like to do is set up the servers as IIS7 servers and allow them to fail over from datacenter to datacenter with little (or preferably no) interruption. Servers on both sides are running Windows Server 2008 x64 with IIS7 (or 7.5 if needed). I am interested in how to point DNS traffic to the new datacenter without manual intervention. For example:

    Datacenter A: IP: 192.168.1.115, Servers: Server 2008 x64 w/ IIS 7
    Datacenter B: IP: 192.168.1.220, Servers: Server 2008 x64 w/ IIS 7
    Other information: Domain Name: Example.org, Domain DNS: 192.168.1.115

    If Datacenter A connectivity went down (broken service line, etc.), how does the traffic know to route to Datacenter B on 192.168.1.220? Thanks, Scott

    Read the article

  • NFS failover WITHOUT DRBD?

    - by user439407
    So I am trying to set up a redundant NFS share in a cloud environment (all links internal, half-gigabit links), and I am looking into using heartbeat for failover, but all the guides seem to be about combining DRBD and heartbeat to create a robust environment. If need be I can do that, but since my content is almost completely static, I would like to avoid the extra overhead and complexity of DRBD if possible, while still being able to fail over if one of the NFS servers fails. Is it possible to use heartbeat with NFS to achieve high availability without using DRBD to copy the blocks? I am not married to NFSv4, so if NFSv3 over UDP is necessary, that won't be a problem (only a very small number of clients will be connecting to the share). Any comments are appreciated.

    Read the article

  • Clusterware 11gR2 - Setting up an Active/Passive failover configuration

    - by Gilles Haro
    Oracle provides a wide range of solutions to ensure High Availability of the database. Data Guard, RAC, or even both together (as recommended by Oracle for a Maximum Availability Architecture - MAA) are the most frequently used. But when it comes to protecting a system with an Active/Passive architecture with failover capabilities, people often think of other, expensive third-party cluster systems.

    Oracle Clusterware technology, which comes at no extra cost with Oracle Database or Oracle Unbreakable Linux, is, in most people's minds, linked to Oracle RAC and is therefore seldom used to implement failover solutions. Oracle Clusterware 11gR2 (a part of Oracle 11gR2 Grid Infrastructure) provides a comprehensive framework to set up automatic failover configurations. It is actually possible to make "failover-able", and thus to protect, almost any kind of application (from the simple xclock to the most complex Application Server).

    Quoting Oracle: “Oracle Clusterware is a portable cluster software that allows clustering of single servers so that they cooperate as a single system. Oracle Clusterware also provides the required infrastructure for Oracle Real Application Clusters (RAC). In addition Oracle Clusterware enables the protection of any Oracle application or any other kind of application within a cluster.”

    In the next couple of lines, I will try to present the different steps to achieve this goal: have a fully operational 11gR2 database protected by automatic failover capabilities. I assume you are fluent in installing Oracle Database 11gR2 and Oracle Grid Infrastructure 11gR2 on a Linux system, and that ASM is not a problem for you (as I am using it as shared storage). If not, please have a look at the Oracle documentation.

    As often, I made my tests using an Oracle VirtualBox environment. The scripts are tested and functional on my system. Unfortunately, there can always be a typo or a mistake.
    This blog entry does not replace a course on the Clusterware Framework. I just hope it will let you see how powerful it is and that it will give you the wish to go further with it...

    Note: This entry has been revised (rev.2) following comments from Philip Newlan.

    Prerequisite

    - 2 Linux boxes (OELCluster01 and OELCluster02) at the same OS level. I used OEL 5 Update 5 with an Enterprise Kernel.
    - Shared Storage (SAN). On my VirtualBox system, I used Openfiler to simulate the SAN.
    - Oracle 11gR2 Database (11.2.0.1)
    - Oracle 11gR2 Grid Infrastructure (11.2.0.1)

    Step 1 - Install the software

    - Using asmlib, create 3 ASM disks (ASM_CRS, ASM_DTA and ASM_FRA).
    - Install Grid Infrastructure for a cluster (OELCluster01 and OELCluster02 are the 2 nodes of the cluster). Use ASM_CRS to store the Voting Disk and OCR. Use SCAN.
    - Install Oracle Database Standalone binaries on both nodes.
    - Use asmca to check/mount the disk groups on the 2 nodes.
    - Use dbca to create and configure a database on the primary node. Let's name it DB11G.
    - Copy the pfile and password file to the second node.
    - Create the adump directory on the second node.

    Step 2 - Setup the resource to be protected

    After its creation with dbca, the database is automatically protected by the Oracle Restart technology available with Grid Infrastructure. Consequently, it restarts automatically (if possible) after a crash (ex: kill -9 smon). A database resource has been created for that in the Cluster Registry. We can observe this with the command:

        crsctl status resource

    which shows an ora.db11g.db entry. Let's save the definition of this resource for future use:

        mkdir -p /crs/11.2.0/HA_scripts
        chown oracle:oinstall /crs/11.2.0/HA_scripts
        crsctl status resource ora.db11g.db -p > /crs/11.2.0/HA_scripts/myResource.txt

    Although very interesting, Oracle Restart is not cluster aware and cannot restart the database on any other node of the cluster. So, let's remove it from the OCR definitions; we don't need it!
        srvctl stop database -d DB11G
        srvctl remove database -d DB11G

    Instead, we need to create a new resource of a more general type: cluster_resource. Here are the steps to achieve this.

    Create an action script: /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh

        #!/bin/bash
        export ORACLE_HOME=/oracle/product/11.2.0/dbhome_1
        export ORACLE_SID=DB11G

        case $1 in
        'start')
          $ORACLE_HOME/bin/sqlplus /nolog <<EOF
          connect / as sysdba
          startup
        EOF
          RET=0
          ;;
        'stop')
          $ORACLE_HOME/bin/sqlplus /nolog <<EOF
          connect / as sysdba
          shutdown immediate
        EOF
          RET=0
          ;;
        'clean')
          $ORACLE_HOME/bin/sqlplus /nolog <<EOF
          connect / as sysdba
          shutdown abort
        EOF
          ##for i in `ps -ef | grep -i $ORACLE_SID | awk '{print $2}' ` ;do kill -9 $i; done
          RET=0
          ;;
        'check')
          # The instance is considered up if its SMON process exists.
          ok=$(ps -ef | grep smon | grep "$ORACLE_SID" | wc -l)
          if [ "$ok" -eq 0 ]; then
            RET=1
          else
            RET=0
          fi
          ;;
        *)
          # Any other action is treated as a no-op.
          RET=0
          ;;
        esac

        if [ $RET -eq 0 ]; then
          exit 0
        else
          exit 1
        fi

    This script must provide, at least, methods to start, stop, clean and check the database. It is self-explanatory and contains nothing special. Just be aware that it must be executable (+x), that it runs as the oracle user (because of the ACL property - see later) and that it needs to know about the environment. Also make sure it exists on every node of the cluster. Moreover, as of 11.2, the clean method is mandatory. It must provide the “last gasp clean up”, for example a shutdown abort or a kill -9 of all the remaining processes.

        chmod +x /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh
        scp /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh oracle@OELCluster02:/crs/11.2.0/HA_scripts

    Create a new resource file, based on the information we got from the previous myResource.txt. Name it myNewResource.txt. myResource.txt is shown below. As we can see, it defines an ora.database.type resource, named ora.db11g.db. A lot of properties are related to this type of resource and do not need to be used for a cluster_resource.
        NAME=ora.db11g.db
        TYPE=ora.database.type
        ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
        ACTION_FAILURE_TEMPLATE=
        ACTION_SCRIPT=
        ACTIVE_PLACEMENT=1
        AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
        AUTO_START=restore
        CARDINALITY=1
        CHECK_INTERVAL=1
        CHECK_TIMEOUT=600
        CLUSTER_DATABASE=false
        DB_UNIQUE_NAME=DB11G
        DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
        DEGREE=1
        DESCRIPTION=Oracle Database resource
        ENABLED=1
        FAILOVER_DELAY=0
        FAILURE_INTERVAL=60
        FAILURE_THRESHOLD=1
        GEN_AUDIT_FILE_DEST=/oracle/admin/DB11G/adump
        GEN_USR_ORA_INST_NAME=
        GEN_USR_ORA_INST_NAME@SERVERNAME(oelcluster01)=DB11G
        HOSTING_MEMBERS=
        INSTANCE_FAILOVER=0
        LOAD=1
        LOGGING_LEVEL=1
        MANAGEMENT_POLICY=AUTOMATIC
        NLS_LANG=
        NOT_RESTARTING_TEMPLATE=
        OFFLINE_CHECK_INTERVAL=0
        ORACLE_HOME=/oracle/product/11.2.0/dbhome_1
        PLACEMENT=restricted
        PROFILE_CHANGE_TEMPLATE=
        RESTART_ATTEMPTS=2
        ROLE=PRIMARY
        SCRIPT_TIMEOUT=60
        SERVER_POOLS=ora.DB11G
        SPFILE=+DTA/DB11G/spfileDB11G.ora
        START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg)
        START_TIMEOUT=600
        STATE_CHANGE_TEMPLATE=
        STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg)
        STOP_TIMEOUT=600
        UPTIME_THRESHOLD=1h
        USR_ORA_DB_NAME=DB11G
        USR_ORA_DOMAIN=haroland
        USR_ORA_ENV=
        USR_ORA_FLAGS=
        USR_ORA_INST_NAME=DB11G
        USR_ORA_OPEN_MODE=open
        USR_ORA_OPI=false
        USR_ORA_STOP_MODE=immediate
        VERSION=11.2.0.1.0

    I removed the database-type-related entries from myResource.txt and modified some others to produce the following myNewResource.txt.

    - Notice the NAME property, which should not have the ora. prefix.
    - Notice the TYPE property, which is not ora.database.type but cluster_resource.
    - Notice the definition of ACTION_SCRIPT.
    - Notice the HOSTING_MEMBERS property, which enumerates the members of the cluster (as returned by the olsnodes command).
        NAME=DB11G.db
        TYPE=cluster_resource
        DESCRIPTION=Oracle Database resource
        ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
        ACTION_SCRIPT=/crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh
        PLACEMENT=restricted
        ACTIVE_PLACEMENT=0
        AUTO_START=restore
        CARDINALITY=1
        CHECK_INTERVAL=10
        DEGREE=1
        ENABLED=1
        HOSTING_MEMBERS=oelcluster01 oelcluster02
        LOGGING_LEVEL=1
        RESTART_ATTEMPTS=1
        START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg)
        START_TIMEOUT=600
        STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg)
        STOP_TIMEOUT=600
        UPTIME_THRESHOLD=1h

    Register the resource. Take care with the resource type: it needs to be a cluster_resource and not an ora.database.type resource (Oracle recommendation).

        crsctl add resource DB11G.db -type cluster_resource -file /crs/11.2.0/HA_scripts/myNewResource.txt

    Step 3 - Start the resource

        crsctl start resource DB11G.db

    This command launches the ACTION_SCRIPT with a start and a check parameter on the primary node of the cluster.

    Step 4 - Test this

    We will test the setup using 2 methods.

        crsctl relocate resource DB11G.db

    This command calls the ACTION_SCRIPT (on the two nodes) to stop the database on the active node and start it on the other node.

    Once done, we can revert back to the original node, but this time we can use a more "MS$ like" method: turn off the server on which the database is running. After a short delay, you should observe that the database is relocated on node 1.

    Conclusion

    Once the software is installed and the standalone database created (which is a rather common and usual task), the steps to reach the objective are quite easy:

    - Create an executable action script on every node of the cluster.
    - Create a resource file.
    - Create/Register the resource with OCR using the resource file.
    - Start the resource.

    This solution is a very interesting alternative to licensable third-party solutions.
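
The check action in the script from Step 2 matches any process line that contains both "smon" and the SID, so it would also count an instance whose SID merely starts with the same characters (for example a hypothetical DB11GTEST). A slightly stricter variant, sketched here with made-up function names and assuming the usual `ora_smon_<SID>` naming of the SMON background process, matches the process name exactly:

```shell
#!/bin/bash
# smon_running SID: read one command line per input line on stdin and succeed
# only if a process named exactly ora_smon_<SID> is listed.
smon_running() {
    grep -qxF "ora_smon_$1"
}

# check_db SID: exit status 0 if the instance's SMON process exists, 1 otherwise.
# Intended as a drop-in body for the 'check' branch of an action script.
check_db() {
    ps -eo args | smon_running "$1"
}
```

In the action script, the 'check' branch could then reduce to calling `check_db "$ORACLE_SID"` and setting RET from its exit status.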
    References

    - Clusterware 11gR2 documentation
    - Oracle Clusterware Resource Reference
    - Clusterware for Unbreakable Linux
    - Using Oracle Clusterware to Protect A Single Instance Oracle Database 11gR1 (to have an idea of complexity)
    - Oracle Clusterware on OTN

    Gilles Haro
    Technical Expert - Core Technology, Oracle Consulting

    Read the article

  • IRC Services with failover support?

    - by insertjokehere
    I run a single-server (call it 'server A') IRC 'network', and thanks to the generosity of some friends, I have been given a second server ('server B') that I can run an IRCd on in order to provide redundancy in case server A crashes. This is fine; I can set up round-robin DNS with the servers linked. The problem I have is what to do about services. Does anyone know of a way to get the services to 'fail over' in case of a server failure? E.g., server A starts off running the services, but suddenly crashes. Server B detects this and starts its own copy of the services (ideally with the same configuration and data as the services on server A). One solution that comes to mind is to write a bot that each server runs, which sits in a channel periodically checking whether the bot from the other server is in the channel. If it is, then all is well. If not, then fail over. I would prefer not to have to code this myself, though. We are currently using Unreal IRCd and Anope services on Linux.
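
The "check the peer, fail over if it is gone" idea does not strictly need an IRC bot. A minimal sketch of the same logic as a shell watchdog follows; the function name, the threshold and the interval are assumptions made for illustration, not something any IRC package provides.

```shell
#!/bin/bash
# run_watchdog CHECK_CMD FAILOVER_CMD [THRESHOLD] [INTERVAL]
# Runs CHECK_CMD every INTERVAL seconds. After THRESHOLD consecutive failures
# it runs FAILOVER_CMD once and returns; any successful check resets the count.
run_watchdog() {
    local check=$1 failover=$2 threshold=${3:-3} interval=${4:-10} failures=0
    while true; do
        if $check; then
            failures=0
        else
            failures=$((failures + 1))
            if [ "$failures" -ge "$threshold" ]; then
                $failover
                return 0
            fi
        fi
        sleep "$interval"
    done
}
```

On server B this might be started as `run_watchdog "nc -z serverA.example.net 6667" "/path/to/start-services" 3 10`, where both the host name and the start command are placeholders. Requiring several consecutive failures avoids failing over on a single dropped check, but it does not protect against both servers running services at once if only the link between them goes down.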

    Read the article

  • Providing high availability and failover using MySQL on EC2

    - by crb
    I would like to have a highly-available MySQL system, with automatic failover, running on Amazon EC2 instances. The standard approach to solving this problem is Heartbeat + DRBD, but I've found a lot of posts suggesting DRBD doesn't work on EC2, though none saying exactly why. Obviously, a serial heartbeat or distinct network is out of the question in the virtualised environment. It would also be good to have the different servers be in different availability zones, but we're getting into a much harder problem there. What are people's opinions on having a high-uptime solution in "the cloud"? Note: This question was asked before RDS with multi-AZ was announced, which is the nice automatic answer for today's modern IT professional. :)

    Read the article

  • MySQL Cluster Failover doesn't work

    - by Lukasz
    I have two servers.

    First server, 10.100.15.150: 1. one mgm server 2. one ndbd 3. one mysql api
    Second server, 10.100.15.160: 1. one ndbd 2. one mysql api

    When I start all 'parts' of the cluster it looks like this:

        Cluster Configuration
        [ndbd(NDB)] 2 node(s)
        id=21 @10.100.15.150 (mysql-5.1.56 ndb-7.1.17, Nodegroup: 0)
        id=22 @10.100.15.160 (mysql-5.1.56 ndb-7.1.17, Nodegroup: 0, Master)
        [ndb_mgmd(MGM)] 1 node(s)
        id=3 @10.100.15.150 (mysql-5.1.56 ndb-7.1.17)
        [mysqld(API)] 2 node(s)
        id=11 @10.100.15.150 (mysql-5.1.56 ndb-7.1.17)
        id=12 @10.100.15.160 (mysql-5.1.56 ndb-7.1.17)

    When I shut down the first machine (10.100.15.150), the ndbd process on the second machine also shuts down, so I cannot use this data node and the cluster fails. How must I configure this cluster to get failover working? Thx

    Read the article

  • Oracle 10g Failover Database - How to fail back?

    - by rrkwells
    I want to know how the failover database concept works after recovery. We have defined our application to connect to a backup database in case the production database fails. If this happens, then all the transactions will be happening on that backup database. Once the production db server is running again, then how do we make sure the changes made in the backup database will be reflected on the production database? We want to make sure that any changes made while failed over are not lost. We are using Oracle 10g.

    Read the article

  • Intermittent unavailability of an instance in a failover cluster while a standby node is offline in

    - by Emil Fridriksson
    Hi everyone. I've got a small failover cluster that I run for the websites my company has. During a RAM upgrade of the standby server, our websites started to show errors about not being able to access the database server. I verified that the instance was indeed up and the server accessible via remote desktop. I also tried a SQL connection to it and it worked, but that might have been after it became available again. This happened on and off until we were able to roll back the hardware changes that were in progress on the standby server and bring it back up. There was nothing of interest in the SQL Server log, but there is a continuous log for the whole duration of the problem, so there was no restart of the SQL Server service. The event viewer is of more interest, since it shows events relating to the heartbeat network card, but I don't know how that would affect the availability of the server, since the standby node is offline. I'd appreciate any help you can provide; it's not very redundant if the setup depends on the standby server being up. :) Here are the event logs from the time of the problem. I include all of them since I can't see what could possibly be the cause of the problem. Event log: http://hlekkir.com:800/htmltable.htm

    Read the article

  • Failover strategy for a 4 servers scenario

    - by Joao Villa-Lobos
    Hi all, I am trying to figure out how to set up replication and failover in a scenario with 4 servers (2 per location) where any server may assume the Master role. My initial scenario is the following: 2 servers in location A (one Master, one Slave); 2 servers in location B (two Slaves). For this I'm thinking of using the Master-Master Active-Passive configuration suggested in O'Reilly's "High Performance MySQL" on all of them, so each one can become a Master when needed. If the Master "dies", the other server from location A assumes the Master role whenever possible. It will always have a higher priority than the servers in location B. A server in location B will only switch to Master if no server in location A is able to do so. Since MySQL can't handle this automatically, I need some other way to implement it. I've already read about heartbeat and Maatkit. Is this the way to go? Has anyone used this in a similar scenario? Is there some other way to achieve this? Any pointers about failover will be appreciated. I want to keep this as simple as possible, avoiding stuff such as DRBD. I'm not concerned about high availability, just a way to switch roles automatically without too much hassle. I'm using SUSE Enterprise 10 and MySQL 5.1.30-community. Thanks in advance, João
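
The priority rule described here (prefer the other server in location A, fall back to location B only when nothing in A responds) amounts to picking the first reachable host from an ordered list. A minimal sketch of just that selection step follows; the function name and host names are hypothetical, and actually promoting a slave or repointing replication is outside its scope.

```shell
#!/bin/bash
# pick_master PROBE_CMD HOST...
# Prints the first host, in the order given, for which `PROBE_CMD HOST`
# succeeds. Returns 1 (and prints nothing) if no host responds.
pick_master() {
    local probe=$1 host
    shift
    for host in "$@"; do
        if $probe "$host" >/dev/null 2>&1; then
            echo "$host"
            return 0
        fi
    done
    return 1
}
```

With a probe such as `probe() { mysqladmin --host="$1" ping; }`, calling `pick_master probe a1 a2 b1 b2` would list the location A servers first so that they always win when available. A real setup would also need to make sure the old Master cannot come back and accept writes on its own.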

    Read the article

  • SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance

    SQL Server Failover Clustering, which includes support for both local and multisite failover configurations, is part of the SQL Server 2012 AlwaysOn implementation suite, designed to provide high availability and disaster recovery for SQL Server. The multisite failover clustering technology has been enhanced significantly in SQL Server 2012. The multisite failover cluster architecture, enhancements in SQL Server 2012 to the technology, and some best practices to help with deployment of the technology are the primary focus of this paper.

    Read the article

  • SQL Server Database In Single User Mode after Failover

    - by jlichauc
    Here is a weird situation we experienced with a SQL Server 2008 Database Mirroring Failover. We have a pair of mirrored databases running in high-availability mode and both the principal and mirror showed as synchronized. As part of some maintenance I triggered a manual failover of the principal to the mirror. However after the failover the principal was now in single-user mode instead of the expected "Principal/Synchronized" state we usually get. The database had been in multi-user mode on the previous principal before this had happened. We ended up stopping all applications, restarting the SQL Server instances, and executing "ALTER DATABASE ... SET MULTI_USER" to bring the database back to the expected "Principal/Synchronized" state in a multi-user mode. Question. Does anyone know where SQL Server stores information about whether a database should be in single-user mode or not? I'm wondering if there is some system database or table that has this setting recorded somewhere. In particular we had an incident once with the database on the original principal (the one I was failing over to) where when trying to detach the database it was put into single-user mode. I'm wondering if that setting is cached somewhere and is the reason that SQL Server put it back into single-user mode after a failover.

    Read the article

  • Handling database failover for Rails applications on FreeBSD

    - by bianster
    I'm working on implementing database (PostgreSQL) failover for a Rails app that runs with Passenger/FreeBSD. Due to certain constraints regarding the server OS, it's necessary to continue using FreeBSD (as opposed to, say, Ubuntu). I'm finding it quite a challenge to have failover handled within the Rails application by way of a customised database adapter, because this application will be load-balanced between several webservers, each running multiple Rails processes spawned by Passenger. I previously looked at setting up Pacemaker/Corosync to manage database server failover on a common IP, but unfortunately I wasn't able to get past building the packages on FreeBSD. It does work rather well on Ubuntu 10.04, but I'm not likely to be able to use Ubuntu due to the OS constraints. I'm considering a custom witness daemon that simply pings the primary DB server and switches all the webservers to the standby DB server when the primary becomes uncontactable (permanently or temporarily), to avoid split-brain. Though I would really like to know if there is a way to get Pacemaker (or something similar) to do the switch on FreeBSD.

    Read the article

  • SQL Cluster on Hyper V Failover Cluster

    - by Chris W
    We have a VM running SQL Server on a 6-node cluster of blades. The VM's data files are stored on a SAN attached using a direct iSCSI connection. As this SQL Server will be running a number of important databases, we're debating whether we should be clustering the SQL Server or whether the fact that the VM is running in the cluster is itself sufficient to give us high availability. I'm used to running SQL clusters when dealing with physical servers, but I'm a bit sketchy on what is best practice when all the servers are just VMs sitting on Hyper-V. If a blade running the VM fails, I presume the VM will be started up on another node. I'm guessing the only benefit that adding a SQL cluster to the setup will give us is that the recovery time after a failure will be a little quicker? Are there any other benefits?

    Read the article

  • How to cluster two IIS servers for failover?

    - by Ram Gopal
    We have IIS servers running on 2 machines, hosting a few web services which provide integration services to an old document management system, Word/Excel-related services, etc. We need to cluster/load balance these 2 IIS servers in order to achieve failover, i.e. if one of the IIS servers is down, the other one should be able to handle the requests. The reverse proxy used in the DMZ is also IIS 7.5. Our overall business application is in fact a J2EE one, and we have successfully deployed it on a WebLogic cluster installed on the same two machines, load balanced from the above-mentioned IIS reverse proxy in the DMZ. But we do not know how to achieve this in the case of IIS.

    Read the article

  • CentOS Failover Cluster - SIOCADDRT: No such process (when adding a loopback)

    - by Steve Rolfe
    I'm trying to configure two web servers for a load balancing server. The load balancing aspect works fine (it sees both servers, kills 'em if it needs to, and seems to direct traffic fine). The only issue is with the loopback on the servers:

    /etc/sysconfig/network-scripts/ifcfg-lo:0

        DEVICE=lo:0
        IPADDR=<Virtual IP>
        NETMASK=255.255.255.255
        ONBOOT=yes
        NAME=loopback

    Every time I try a "service network restart" I get "SIOCADDRT: No such process" when loading the loopback interface. Anyone have an idea what's causing this?

    Read the article

  • How to manage service failover?

    - by Jader Dias
    I am using Windows Network Load Balancing to keep my apps available even when one of the servers is down. But when all servers are up, but one instance of a service in one of them is down, I would like to not send requests to it, because those requests will be lost. Is there any solution that addresses this problem?

    Read the article

  • Router failover not detecting outside interface link lost

    - by Matt
    Suppose I have two routers configured in a master/slave configuration. They look something like this (the addresses are not the real ones):

        123.123.123.10 <===> [eth0] Router 1 (10.1.1.2) [eth1] ===> +----------+
                                                                    | 10.1.1.1 | ===> LAN
        172.123.123.10 <===> [eth0] Router 2 (10.1.1.3) [eth1] ===> +----------+

    10.1.1.1 is the default route for the network (10.1.1.0). What's slightly different in this config from others I've seen is that I don't have an external virtual IP. Also, the 10.1.1.x addresses are, in real life, public IPs (not the private ones shown here). This is more of a router setup than a firewall setup, so I'm not using NAT here.

    Now the issue that I'm having is that I can't see any way to configure UCARP or VRRP to monitor both eth0 and eth1 and fail over to the backup router should either of them go down. What I'm seeing is that if Router 1 is the master and I unplug eth0 on Router 1, it doesn't fail over to Router 2. However, it will if I instead unplug eth1 of Router 1. In VRRP I see there is a cluster group, but it seems that for this to work you need to have virtual IPs or VRRP instances rather than actual interfaces assigned to it. I hope my explanation is clear. How do I get around this?

    Read the article

  • Configuring HAProxy with memcache with failover

    - by Lawrie Matthews
    I'm configuring a new set of servers for an existing Wordpress site, and it's been requested that memcache be available and made more resilient. The idea proposed is to have HAProxy send requests to one of the two servers; if that memcache instance is inaccessible, then it should switch to the second, but should not switch back to the first if it comes back up unless the second is then unavailable. This doesn't appear to be a particularly common use case and I've not found much along these lines except to possibly set up the first node with an enormous rise value, such as:

        server server1 10.112.58.16:11211 check inter 5s fall 3 rise 99999999
        server server2 10.112.58.19:11211 check backup

    which falls over as expected when server1 is unavailable. It won't ever fall back to server1, though, even if server2 goes offline. Can this be made to work?

    Read the article

  • Free DNS software with failover support?

    - by Lin
    I'm looking for DNS software that can accomplish the following:

    - Check the health of all A records at set intervals
    - If a server is unresponsive after multiple successive checks, replace its A record with a working server
    - When a server is down, check it periodically. Once it's up, restore the normal A records

    Here's an equivalent I thought of:

    - Run DNS servers with a very low TTL (minutes)
    - Use a cron job to periodically query all webservers
    - Use sed to replace A records if need be, and then restart the DNS server

    I have a hard time believing there isn't already something that can accomplish the above. I'm not looking for a paid service, and I'm restricted to anything I can run with root access to a VPS. Any suggestions would be great. Thanks!
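
The sed step of the cron-based equivalent could look roughly like the sketch below. It assumes GNU sed, a BIND-style zone file with records written as `name [ttl] IN A address`, and a simple record name without regex metacharacters; the function name is made up for illustration.

```shell
#!/bin/bash
# set_a_record ZONEFILE NAME NEW_IP
# Rewrites, in place, the address of every "NAME [ttl] IN A <ip>" line in
# ZONEFILE. Other record types (for example AAAA) and other names are left
# untouched.
set_a_record() {
    local zone=$1 name=$2 ip=$3
    sed -i -E "s/^(${name}[[:space:]]+([0-9]+[[:space:]]+)?IN[[:space:]]+A[[:space:]]+)[0-9.]+/\1${ip}/" "$zone"
}
```

A cron job would call this only after several failed health checks in a row. The zone's serial number would also have to be increased and the DNS server reloaded before the change becomes visible, and clients may keep the old address until the TTL expires.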

    Read the article

  • Ways to do simple failover with one server and two IPs

    - by CrassHoppr
    The setup is one server (Windows 2008) at one location with two incoming connections. As the server has to interface with various on-site devices, and will have a small number of incoming connections, a data center is not an option, and instead cable/dsl connections must be used. The goal is that users visit https://service.site.com and are sent to either the primary IP address or a secondary IP if the primary is down. I've seen advice to use round robin DNS for this, but caching an IP for a downed interface is something I'd like to avoid. Is something like this possible with these constraints?

    Read the article
