Search Results

Search found 8495 results on 340 pages for 'high availability'.

Page 22/340 | < Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >

  • Drbd Primary/Primary + iSCSI: accessing to different files avoids split brain?

    - by Eddie C.
    I have a question / curiosity about split-brain on a Drbd Primary/Primary configuration. Supposing two nodes (hosts), host1 and host2 configured with Drbd Primary/Primary and two different shares (NFS, CIFS o iSCSI) of a replicated area (saying /drbd) /drbd/file1.data /drbd/file2.data If a pool of client would access only by host1 share reading and wrinting only file1.data and another pool only by host2 share to file2.data, this scenario should avoid split brain situation in case of one node failure or it's just a conjecture? The final purpose is load balance between the two nodes in normal condition and collapsing to one node only in case of failure. Thank you! Eddie

    Read the article

  • HAProxy MySQL Failover is not starting

    - by thiesdiggity
    I am trying to setup HAProxy with MySQL failover with Ubuntu. I used a setup similar to this serverfault question, however I am getting the following error when starting haproxy: [ALERT] 341/220001 (17405) : parsing [/etc/haproxy/haproxy.cfg:29] : unknown option 'mysql-check'. [ALERT] 341/220001 (17405) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg [ALERT] 341/220001 (17405) : Fatal errors found in configuration. I even tried installing the lastest version of HAProxy (1.4.22). Does anyone know how to fix this? I have Google'd the heck out of it and can't find any solution. Thanks for your time!

    Read the article

  • Stability, x86 Vs Sparc

    - by Jason T
    Our project are plan to migrate from Sparc to x86, and our HA requirement is 99.99%, previous on Sparc, we assume the hardware stability would like, hardware failure every 4 month or even one year, and also we have test data for our application, then we have requirement for each unplanned recovery (fail over) to achieve 99.99% (52.6 minutes unplanned downtime per year). But since we are going to use Intel x86, it seems the hardware stability is not so good as Sparc, but we don't have the detail data. So compare with Sparc, how about the stability of the Intel x86, should we assume we have more unplanned downtime? If so, how many, double? Where I can find some more detail of this two type of hardware?

    Read the article

  • HAProxy NGInx SSL setup

    - by Niclas
    I've been looking around different setups for a server cluster supporting SSL and I would like to benchmark my idea with you. Requirements: All servers in the cluster should be under the same full domain name. (http and https) Routing to subsystems is done on URI matching in HA proxy. All URIs have support for SSL support. Wish: Centralizing routing rules ---<----http-----<-- | | Inet -->HA--+---https--->NGInx_SSL_1..N | | +---http---> Apache_1..M | +---http---> NodeJS Idea: Configure HA to route all SSL traffic (mode=tcp,algorithm=Source) to an NGInx cluster turning https traffic into http. Re-pass the http traffic from NGInx to the HA for normal load-balancing which performs load balancing based on HA config. My question is simply: Is this the best way to to configure based on requirements above?

    Read the article

  • Free DNS software with failover support?

    - by Lin
    I'm looking for DNS software that can accomplish the following: Check health of all A records at set intervals If server is unresponsive after multiple successive checks, replace A record with a working server When a server is down, check it periodically. Once it's up, restore normal A records Here's an equivalent I thought of: Run DNS servers with very low TTL (minutes) Use a cron job to periodically query all webservers Use sed to replace A records if need be, and then restart DNS server I have a hard time believing there isn't already something that can accomplish the above. I'm not looking for a paid service, and I'm restricted to anything I can run with root access to a VPS. Any suggestions would be great. Thanks!

    Read the article

  • Website not available everywhere

    - by Cedric Reichenbach
    Today I noticed my website http://mint-nachhilfe.ch/ was down, but other people (located in different networks) said it looks up from there. When I came home, I double-checked, and I can really reach it from here. Also, this website considers it down. Some facts: It's a Tomcat webapp, connected to an Apache2 server. I restarted both, no change. Another (ruby on rails) application is connected to this Apache2, which I couldn't reach either, but is considered online by above check website. At any point, I could directly connect to the Tomcat over http://mint-nachhilfe.ch:8080! I don't know how to go on searching for the root error. I assume it's related to the Apache2 server, but how could that be?

    Read the article

  • master-slave datastore replication, automatic failover, and wackamole

    - by z8000
    I have 2 dedicated servers provisioned for my next project's datastores. The datastores are configured for master-slave replication. There's no inherent automatic failover but I of course want this. That is, I'd love for access to the master datastore to always just work without having to configure a client library to detect when a master is down and failover to the slave. I've seen Wackamole which is based on the Spread Toolkit. You provide Wackamole with a set of IPs and a bunch of nodes, and regardless of the up/down state of any of the nodes, those IPs will stay available/up. Wackamole detects when a node goes down and ARPs the IP(s) that were up on the now-down node. It's pretty neat actually. So, my thought was to use Wackamole to keep the 2 virtual private IPs available/up. Clients would then just always use the same private IP to access the master datastore and the same but distinct IP for the slave datastore, even if those IPs were hosted on the same node. My datastore servers are accessed over a private network. I am unsure if this messes with Wackamole though. Is this lunacy? How do you generally handle automatic failover of private services like a datastore. FWIW, it shouldn't matter but the datastore is Redis. I don't want to hear "use mySQL" please :) Thanks.

    Read the article

  • distributed, fault-tolerant network block device

    - by gucki
    I'm looking for a distributed, fault-tolerant network storage system which exposes block devices (not filesystems) on the clients. A client's block device should write simultaneously to several storage nodes A client's block device should not fail as long as not all storage nodes backing it went down The master should automatically redistribute storages' data when a storage node fails or gets added/ removed A single master (which is for metadata only) is fine So ideally the architecture would be very similar to moosefs (http://www.moosefs.org/) but instead of exposing a real filesystem mounted using a fuse client it'd expose block devices on the clients. I know of iscsi and drbd but both don't seem to offer what I'm looking for. Or am I missing something?

    Read the article

  • SBD killing both cluster nodes when there are even small SAN network problems

    - by Wieslaw Herr
    I am having problems with stonith SBD in a openais-based cluster. Some background: The active/passive cluster has two nodes, node1 and node2. They are configured to provide an NFS service to users. To avoid problems with split-brain, they are both configured to use SBD. SBD is using two 1MB disks available to the hosts via an multipath fibre-channel network. The problems start if something happens with the SAN network. For example, today one of the brocade switches got rebooted and both nodes lost 2 out of 4 paths to each disks, which resulted in both nodes committing suicide and rebooting. This, of course, was highly undesirable because a) there were paths left b) even if the switch would be out for 10-20 seconds a reboot cycle of both nodes would take 5-10 minutes and all NFS-locks would be lost. I tried increasing the SBD timeout values (to 10sec+ values, dump attached at the end), however a "WARN: Latency: No liveness for 4 s exceeds threshold of 3 s" hints that something isn't working as I would it expect to. Here is what I would like to know: a) Is SBD working as it should killing nodes when 2 paths are available? b) If not, is the multipath.conf file attached correct? The storage controller we use is an IBM SVC (IBM 2145), should there be any specific configuration for it? (as in multipath.conf.defaults) c) How should I go about increasing the timeouts in SBD attachements: Multipath.conf and sbd dump (http://hpaste.org/69537)

    Read the article

  • How do I load balance between two Linux machines?

    - by William Hilsum
    Inspired by the Stack Overflow network, I am now obsessed with HAProxy and trying to use it myself. At the moment, each HAProxy box has got two network cards (well, two configured, I can have a maximum of 4 and wasn't sure if they needed their own one for management between the boxes). On both machines, the backend one (eth1) is a private IP that goes to a switch connected to the webservers, and the front facing one (eth0) has a public internet IP that is routed straight though. In addition, I have created an additional virtual ip for eth0 called eth0:0 which has got a third public ip address. I just about get how to use it for load balancing between multiple web servers that are behind it, but, I am failing to load balance between the two HAProxy boxes - they appear to fight for the virtual IP, but, this does not appear to be a smart solution. Now, by using the virtual shared IP address, this solution appears to work and does seem to give me maximum uptime, but, is this the correct way to do it, or is there a smarter way? I have been looking at other Linux packages such as keepalived, but, I have only been using Linux (server) for a week now and am at the limits of my understanding. Is there anyone who has done this before and can you advise anything for maximum uptime?

    Read the article

  • Fortigate 200A firewall CPU high resource usage

    - by user119720
    This morning I'm receiving complaints from several end users that saying their whole department network are slow and have intermittent. Therefore I've checked our firewall to see whether if something goes wrong with the device.From my observation in the FortiGate dashboard status, the CPU resources is very high (99 percent). My first assumption is to clear the log since in the alert log the Fortigate log mention that it is already 90% full.Based on my understanding,the log can be cleared by restarting the firewall. After restarting the firewall the network seems okay again but then after several minutes it went up again.The condition still persist until now. Can someone show me where else I can check to fix this issue? I've really appreciate any help that I can get here.Thanks. Edit: diag sys top command

    Read the article

  • How to make AD highly available for applications that use it as an LDAP service

    - by Beaming Mel-Bin
    Our situation We currently have many web applications that use LDAP for authentication. For this, we point the web application to one of our AD domain controllers using the LDAPS port (636). When we have to update the Domain Controller, this has caused us issues because one more web application could depend on any DC. What we want We would like to point our web applications to a cluster "virtual" IP. This cluster will consist of at least two servers (so that each cluster server could be rotated out and updated). The cluster servers would then proxy LDAPS connections to the DCs and be able to figure out which one is available. Questions For anyone that has had experience with this: What software did you use for the cluster? Any caveats? Or perhaps a completely different architecture to accomplish something similar?

    Read the article

  • DRBD Replication failure

    - by user62513
    A couple of weeks ago I setup a 2 nodes CRM system with one of the resources managed being MySQL over DRBD. Today for maintenance reasons I restarted both nodes but now they can't connect to each other anymore. DRBD fell out of sync and I followed this guide to get it back connected but it's only able to run successfully on one node. But this strange thing happens: If I crm node standby both nodes and I try: crm node online node0 before crm node online node1, all the CRM resources start successfully but the DRBD partitions are still running in StandAlone state. crm node online node1 beofre crm node online node0, the DRBD resource fails to start, thus causing mysql not to start. If I standby both resources and call crm node online node0 then it times out and prints this error: Running crm node online node0 produces this output after timing out Error setting standby=off (section=nodes, set=<null>): Remote node did not respond Error performing operation: Remote node did not respond Is there anything I'm doing wrong here? An alternative will be just do MySQL replication but I'm not sure how to promote a slave to master when the master database is not available.

    Read the article

  • Corosync :: Restarting some resources after Lan connectivity issue

    - by moebius_eye
    I am currently looking into corosync to build a two-node cluster. So, I've got it working fine, and it does what I want to do, which is: Lost connectivity between the two nodes gives the first node '10node' both Failover Wan IPs. (aka resources WanCluster100 and WanCluster101 ) '11node' does nothing. He "thinks" he still has his Failover Wan IP. (aka WanCluster101) But it doesn't do this: '11node' should restart the WanCluster101 resource when the connectivity with the other node is back. This is to prevent a condition where node10 simply dies (and thus does not get 11node's Failover Wan IP), resulting in a situation where none of the nodes have 10node's failover IP because 10node is down 11node has "given back" his failover Wan IP. Here's the current configuration I'm working on. node 10sch \ attributes standby="off" node 11sch \ attributes standby="off" primitive LanCluster100 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.100" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive LanCluster101 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.101" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive Ping100 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive Ping101 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive WanCluster100 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.100" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive WanCluster101 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.101" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive Website0 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf" options="-DSSL" \ operations $id="Website-one" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" primitive Website1 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf.1" options="-DSSL" \ operations $id="Website-two" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" group All100 WanCluster100 LanCluster100 group All101 WanCluster101 LanCluster101 location AlwaysPing100WithNode10 Ping100 \ rule $id="AlWaysPing100WithNode10-rule" inf: #uname eq 10sch location AlwaysPing101WithNode11 Ping101 \ rule $id="AlWaysPing101WithNode11-rule" inf: #uname eq 11sch location NeverLan100WithNode11 LanCluster100 \ rule $id="RAND1083308" -inf: #uname eq 11sch location NeverPing100WithNode11 Ping100 \ rule $id="NeverPing100WithNode11-rule" -inf: #uname eq 11sch location NeverPing101WithNode10 Ping101 \ rule $id="NeverPing101WithNode10-rule" -inf: #uname eq 10sch location Website0NeedsConnectivity Website0 \ rule $id="Website0NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 location Website1NeedsConnectivity Website1 \ rule $id="Website1NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 colocation Never -inf: LanCluster101 LanCluster100 colocation Never2 -inf: WanCluster100 LanCluster101 colocation NeverBothWebsitesTogether -inf: Website0 Website1 property $id="cib-bootstrap-options" \ dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ no-quorum-policy="ignore" \ stonith-enabled="false" \ last-lrm-refresh="1408954702" \ maintenance-mode="false" rsc_defaults $id="rsc-options" \ resource-stickiness="100" \ migration-threshold="3" I also have a less important question concerning this line: colocation NeverBothLans -inf: LanCluster101 LanCluster100 How do I tell it that this collocation only applies to '11node'.

    Read the article

  • Wiki/CMS with synchronization?

    - by Clinton Blackmore
    We're looking into putting up a wiki or CMS for internal use by our IT department. One of the big things we want to use it for is disaster recovery procedures. Given that a disaster, such as a power or network outage, might render the wiki inaccessible, it seems sensible to to host the wiki in two places so that if one is inaccessible, we can fall back to the other. Are there any wikis or CMSes that synchronize (or an alternate way to achieve a similar end)?

    Read the article

  • Tools to manage sql 2008 database mirroring?

    - by lemkepf
    We are going to be moving about 20 databases that live on a single instance of sql 2000 to a sql 2008 r2 environment with database mirroring. What I'm looking for is a tool or scripts that will help me manage the conversion and management of those 20db's onto this new mirrored environment easily. There are many steps in setting each DB up and I want to automate as much as possible. Edit: Here are the steps I've been doing manually: Create the same username/passwords from the old sql 2000 server onto new sql 2008 server. Then sync those users/passwords onto the other sql 2008 server with the same SSID's so when we do the db backup and restore they match up. Take a backup of each sql 2000 db's. Copy them to server A. Restore the backup to server A. Backup from server a, copy to server b, restore there. Run the mirror "configure security" wizard. Start mirroring. I've love to be able to script this out or have a tool that does it for me. Thanks! Paul

    Read the article

  • SOLR high CPU usage in amazon EC2

    - by user644745
    I installed solr-3.6 in my local windows box and it worked fine. I installed solr-4.0 in amazon ec2 linux large instance and the cpu usage shot upto 100%. It maintained at 80-90% average cpu power. I thought it could be because of 4.0, So I installed 3.6 in EC2 again. But again the CPU usage was 80-90% average. With both the versions, solr works in EC2. dont know why CPU usage is so high. i started the solr server using "sudo nohup java -jar start.jar &" In my local box java 1.7 is installed and in EC2 it is 1.6.0_24. I have mapped solr dir to an EBS volume. /dev/mapper/vg1-solr 8361916 1935928 6342128 24% /home/ec2-user/SOLR/solr/example/solr Is there any known issue ?

    Read the article

  • GlusterFS as elastic file storage?

    - by Christopher Vanderlinden
    Is there any way to run GlusterFS in a replicated mode, but with the ability to dynamically scale the volume up and down? Say you have 3 servers all running glusterd. your Gluster volume would have to be setup with replica 3 gluster volume create test-volume replica 3 192.168.0.150:/test-volume 192.168.0.151:/test-volume 192.168.0.152:/test-volume You would then mount it as say \mnt\gfs_test What happens when I want to add 2 more servers to the storage pool and then also use them in this volume? Is there any easy way to expand the volume AND increase that replica count to 5? My end goal is to run this on EC2 instances, say 3 Apache front ends, with the webroot setup on the gluster volume mount. My concern is that if I ever need to spin up a server, I would want the server to not only be an additional Apache front end, but also another server in the gluster file system, adding to fault tolerance as well as possibly giving a slight boost in read speed. Maybe there are better options that would fit the bill here? Thanks.

    Read the article

  • High fan speed after return from suspend (on Ubuntu)

    - by Bolo
    Hi, I've got a HP ProBook 5310m laptop with Ubuntu 10.04 (32 bit). When I return from suspend, the fan speed is usually very high: FDTZ sensor reports "90 °C". Yes, the units are wrong, since FDTZ does not report temperature, but fan speed – that's probably just a small bug in reporting. Interestingly, when I plug or unplug the power cable for a moment, the fan speed returns back to normal. My questions: Where can I report this problem? Is it about ACPI support in the kernel? What is the address of the relevant bug tracker? As a workaround for now, how can I programmatically trigger a behavior equivalent to (un)plugging the power cable. More generally, how can I force ACPI to recalculate fan speed? Ideally, I'm looking for something like echo foo > /proc/bar. Thanks in advance!

    Read the article

  • Very High Network out in ec2 instance

    - by Jatin
    I launched an ubuntu-14.04-64bit instance in Amazon EC2 two days back. And I started Tomcat 7.0.54 in that instance and deployed my application war files. It has no other software installed other than tomcat and the default ones. In the past 2 days, its shows 858 GB of Data Transfer(Network Out) from that instance. I have attached a graph of Amazon CloudWatch Metric "Network Out" My application does not do any data download/upload. Its a Java Spring application and the front end is in HTML&Javascript. My application traffic was very low (less than 20 hits) in those 2 days. Is there a way to find out why these data transfers happened and also to find what data has been transferred. If you can see in graph, network out was 20gb per minute. Some more info: Network in was negligible CPU Utilization was very high Everything else was low

    Read the article

  • BGP Router reccomendations for simple redundancy [closed]

    - by Jona
    We have two sites that each have an internet connection and have a dedicated dark fibre between them. Each site has it's own IP space and we have an AS number. We're looking to be resilient to failure of the internet connection to either site and so need to buy a pair of approriate routers. Requirements are: Able to run 2 bgp sessions (one with the ISP, one with the other site router) Option to take a full table from the upstream ISPs would be nice. Able to provide HA gateways on the LAN side (e.g. 192.168.0.254 will automatically migrate if it's host router lost power) A dedicated device rather than a server running Linux / BSD Not crazy expensive. Any help / advice much appreciated.

    Read the article

  • http server connectivity puzzle

    - by jpmartins
    I have been seeing some strange connection issue in the production environment. The setup has two IBM Http Server's (IHS) and a network IP load-balancer in front of them (round-robin). One instance the system is working fine, the next requests stop arriving at the IHS. Telnet directly to port 80 of the IHS is established sucessfully, but connection to the port 80 through the IP of the load-balancer fails! The puzzle comes next, the network admins say the load-balancer is working fine. When we finally reboot the IHS servers and request start flowing... The situation happened three times the last month and no obvious pattern was found. Any debug ideas?

    Read the article

  • Should an HA failover occur in this scenario?

    - by joeqwerty
    I'm running vSphere 5 in an HA cluster across two hosts (vsphereA and vsphereB). I have the HA cluster configured for host monitoring and datastore heartbeat monitoring with admission control disabled (hopefully I rightfully understand that datastore heartbeat monitoring prevents inadvertent and unwanted HA failovers due to management network isolation). Each host has a single connection to a dedicated iSCSI network and iSCSI target (no MPIO). All vmdk's for all VM's exist on the iSCSI datastore. As a test of HA I disconnected the iSCSI connection on vsphereB and was surprised to see that the running VM's on vsphereB continued to run on vsphereB. The powered off VM's were showing as inaccessible (which I expected due to the fact that they weren't running and the connection from vsphereB to the iSCSI target was severed) but the running VM's continued to run and continued to be "owned" by vsphereB. I expected to see an HA failover occur for those VM's and expected to see them "owned" by vsphereA after the HA failover (which didn't occur). I'm at a loss to understand why an HA failover didn't occur for those VM's. Am I misunderstanding in which cases an HA failover should occur?

    Read the article

  • MySQL HA and Magento DB

    - by Raj
    Is it possible to use MySQL cluster for Magento DB? I have Web app developed in Magento E-commerce platform and I want to make DB highly available using the MySQL cluster. Magento supports only InnoDB database engine and MySQL HA uses it's own engine NDB. The Percona XtraDB Cluster, Does it change the InnoDB storage engine to XtraDB? Can I rollback to the MySQL native replication from Percona XtraDB Cluster?

    Read the article

  • Service haproxy error

    - by user128296
    I want to configure Haproxy for outgoing mail load balancing. my configuration file /etc/haproxy.cfg is. global maxconn 4096 # Total Max Connections. This is dependent on ulimit daemon nbproc 4 # Number of processing cores. Dual Dual-core Opteron is 4 cores for example. defaults mode tcp listen smtp_proxy 199.83.95.71:25 mode tcp option tcplog balance roundrobin # Load Balancing algorithm ## Define your servers to balance server r23.lbsmtp.org 74.117.x.x:25 weight 1 maxconn 512 check server r15.lbsmtp.org 199.71.x.x:25 weight 1 maxconn 512 check And when i start service haproxy i get this error. Starting HAproxy: [ALERT] 244/172148 (7354) : cannot bind socket for proxy smtp_proxy. Aborting. Please tell me where i am doing mistake.help will appreciated.

    Read the article

< Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >