Search Results

Search found 1419 results on 57 pages for 'availability'.

Page 9/57 | < Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16 | Next Page >

Website not available everywhere

- by Cedric Reichenbach

Today I noticed my website http://mint-nachhilfe.ch/ was down, but other people (located in different networks) said it looks up from there. When I came home, I double-checked, and I can really reach it from here. Also, this website considers it down. Some facts: It's a Tomcat webapp, connected to an Apache2 server. I restarted both, no change. Another (ruby on rails) application is connected to this Apache2, which I couldn't reach either, but is considered online by above check website. At any point, I could directly connect to the Tomcat over http://mint-nachhilfe.ch:8080! I don't know how to go on searching for the root error. I assume it's related to the Apache2 server, but how could that be?

Read the article
Stability, x86 Vs Sparc

- by Jason T

Our project are plan to migrate from Sparc to x86, and our HA requirement is 99.99%, previous on Sparc, we assume the hardware stability would like, hardware failure every 4 month or even one year, and also we have test data for our application, then we have requirement for each unplanned recovery (fail over) to achieve 99.99% (52.6 minutes unplanned downtime per year). But since we are going to use Intel x86, it seems the hardware stability is not so good as Sparc, but we don't have the detail data. So compare with Sparc, how about the stability of the Intel x86, should we assume we have more unplanned downtime? If so, how many, double? Where I can find some more detail of this two type of hardware?

Read the article
distributed, fault-tolerant network block device

- by gucki

I'm looking for a distributed, fault-tolerant network storage system which exposes block devices (not filesystems) on the clients. A client's block device should write simultaneously to several storage nodes A client's block device should not fail as long as not all storage nodes backing it went down The master should automatically redistribute storages' data when a storage node fails or gets added/ removed A single master (which is for metadata only) is fine So ideally the architecture would be very similar to moosefs (http://www.moosefs.org/) but instead of exposing a real filesystem mounted using a fuse client it'd expose block devices on the clients. I know of iscsi and drbd but both don't seem to offer what I'm looking for. Or am I missing something?

Read the article
The downsides of using nginx as a primary web server?

- by FractalizeR

Hello. I've seen millions of websites using nginx as a proxifying webserver working together with Apache. But I've seen very few servers running nginx only as their default webserver. What are the main downsides of such config? I can see some: Inability to use per-directory config files like .htaccess so every configuration change should be done to main server config file and requires server reload. But pecl htscanner can compensate them for php settings Unavailability of mod_php for nginx, which can be compensated by php-fpm for example. What are others? Why don't people just drop Apache and move to nginx or any other lightweight solution? May be, there are some special reasons?

Read the article
reaching 99.9999% uptime

- by user35204

I am currently developing a project which is mission-critical. The actual domain name is registered with 1 & 1 and I plan on purchasing DynDNS Custom DNS service (which has 5 different geographical locations for DNS) and then another secondary DNS service to make sure my DNS is as failover safe as possible. Does it matter that the registration is with 1 & 1 - are they a weak link in the chain? All I really use them for is to say that DynDNS is my primary DNS nameserver and then my secondary DNS is my other nameserver. I can transfer the registration to DynDNS - Im just not sure if it really matters or not. Thanks

Read the article
How do I load balance between two Linux machines?

- by William Hilsum

Inspired by the Stack Overflow network, I am now obsessed with HAProxy and trying to use it myself. At the moment, each HAProxy box has got two network cards (well, two configured, I can have a maximum of 4 and wasn't sure if they needed their own one for management between the boxes). On both machines, the backend one (eth1) is a private IP that goes to a switch connected to the webservers, and the front facing one (eth0) has a public internet IP that is routed straight though. In addition, I have created an additional virtual ip for eth0 called eth0:0 which has got a third public ip address. I just about get how to use it for load balancing between multiple web servers that are behind it, but, I am failing to load balance between the two HAProxy boxes - they appear to fight for the virtual IP, but, this does not appear to be a smart solution. Now, by using the virtual shared IP address, this solution appears to work and does seem to give me maximum uptime, but, is this the correct way to do it, or is there a smarter way? I have been looking at other Linux packages such as keepalived, but, I have only been using Linux (server) for a week now and am at the limits of my understanding. Is there anyone who has done this before and can you advise anything for maximum uptime?

Read the article
Free DNS software with failover support?

- by Lin

I'm looking for DNS software that can accomplish the following: Check health of all A records at set intervals If server is unresponsive after multiple successive checks, replace A record with a working server When a server is down, check it periodically. Once it's up, restore normal A records Here's an equivalent I thought of: Run DNS servers with very low TTL (minutes) Use a cron job to periodically query all webservers Use sed to replace A records if need be, and then restart DNS server I have a hard time believing there isn't already something that can accomplish the above. I'm not looking for a paid service, and I'm restricted to anything I can run with root access to a VPS. Any suggestions would be great. Thanks!

Read the article
master-slave datastore replication, automatic failover, and wackamole

- by z8000

I have 2 dedicated servers provisioned for my next project's datastores. The datastores are configured for master-slave replication. There's no inherent automatic failover but I of course want this. That is, I'd love for access to the master datastore to always just work without having to configure a client library to detect when a master is down and failover to the slave. I've seen Wackamole which is based on the Spread Toolkit. You provide Wackamole with a set of IPs and a bunch of nodes, and regardless of the up/down state of any of the nodes, those IPs will stay available/up. Wackamole detects when a node goes down and ARPs the IP(s) that were up on the now-down node. It's pretty neat actually. So, my thought was to use Wackamole to keep the 2 virtual private IPs available/up. Clients would then just always use the same private IP to access the master datastore and the same but distinct IP for the slave datastore, even if those IPs were hosted on the same node. My datastore servers are accessed over a private network. I am unsure if this messes with Wackamole though. Is this lunacy? How do you generally handle automatic failover of private services like a datastore. FWIW, it shouldn't matter but the datastore is Redis. I don't want to hear "use mySQL" please :) Thanks.

Read the article
DRBD Replication failure

- by user62513

A couple of weeks ago I setup a 2 nodes CRM system with one of the resources managed being MySQL over DRBD. Today for maintenance reasons I restarted both nodes but now they can't connect to each other anymore. DRBD fell out of sync and I followed this guide to get it back connected but it's only able to run successfully on one node. But this strange thing happens: If I crm node standby both nodes and I try: crm node online node0 before crm node online node1, all the CRM resources start successfully but the DRBD partitions are still running in StandAlone state. crm node online node1 beofre crm node online node0, the DRBD resource fails to start, thus causing mysql not to start. If I standby both resources and call crm node online node0 then it times out and prints this error: Running crm node online node0 produces this output after timing out Error setting standby=off (section=nodes, set=<null>): Remote node did not respond Error performing operation: Remote node did not respond Is there anything I'm doing wrong here? An alternative will be just do MySQL replication but I'm not sure how to promote a slave to master when the master database is not available.

Read the article
How to make AD highly available for applications that use it as an LDAP service

- by Beaming Mel-Bin

Our situation We currently have many web applications that use LDAP for authentication. For this, we point the web application to one of our AD domain controllers using the LDAPS port (636). When we have to update the Domain Controller, this has caused us issues because one more web application could depend on any DC. What we want We would like to point our web applications to a cluster "virtual" IP. This cluster will consist of at least two servers (so that each cluster server could be rotated out and updated). The cluster servers would then proxy LDAPS connections to the DCs and be able to figure out which one is available. Questions For anyone that has had experience with this: What software did you use for the cluster? Any caveats? Or perhaps a completely different architecture to accomplish something similar?

Read the article
Corosync :: Restarting some resources after Lan connectivity issue

- by moebius_eye

I am currently looking into corosync to build a two-node cluster. So, I've got it working fine, and it does what I want to do, which is: Lost connectivity between the two nodes gives the first node '10node' both Failover Wan IPs. (aka resources WanCluster100 and WanCluster101 ) '11node' does nothing. He "thinks" he still has his Failover Wan IP. (aka WanCluster101) But it doesn't do this: '11node' should restart the WanCluster101 resource when the connectivity with the other node is back. This is to prevent a condition where node10 simply dies (and thus does not get 11node's Failover Wan IP), resulting in a situation where none of the nodes have 10node's failover IP because 10node is down 11node has "given back" his failover Wan IP. Here's the current configuration I'm working on. node 10sch \ attributes standby="off" node 11sch \ attributes standby="off" primitive LanCluster100 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.100" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive LanCluster101 ocf:heartbeat:IPaddr2 \ params ip="172.25.0.101" cidr_netmask="32" nic="eth3" \ op monitor interval="10s" \ meta is-managed="true" target-role="Started" primitive Ping100 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive Ping101 ocf:pacemaker:ping \ params host_list="192.0.2.1" multiplier="500" dampen="15s" \ op monitor interval="5s" \ meta target-role="Started" primitive WanCluster100 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.100" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive WanCluster101 ocf:heartbeat:IPaddr2 \ params ip="192.0.2.101" cidr_netmask="32" nic="eth2" \ op monitor interval="10s" \ meta target-role="Started" primitive Website0 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf" options="-DSSL" \ operations $id="Website-one" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" primitive Website1 ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf.1" options="-DSSL" \ operations $id="Website-two" \ op start interval="0" timeout="40" \ op stop interval="0" timeout="60" \ op monitor interval="10" timeout="120" start-delay="0" statusurl="http://127.0.0.1/server-status/" \ meta target-role="Started" group All100 WanCluster100 LanCluster100 group All101 WanCluster101 LanCluster101 location AlwaysPing100WithNode10 Ping100 \ rule $id="AlWaysPing100WithNode10-rule" inf: #uname eq 10sch location AlwaysPing101WithNode11 Ping101 \ rule $id="AlWaysPing101WithNode11-rule" inf: #uname eq 11sch location NeverLan100WithNode11 LanCluster100 \ rule $id="RAND1083308" -inf: #uname eq 11sch location NeverPing100WithNode11 Ping100 \ rule $id="NeverPing100WithNode11-rule" -inf: #uname eq 11sch location NeverPing101WithNode10 Ping101 \ rule $id="NeverPing101WithNode10-rule" -inf: #uname eq 10sch location Website0NeedsConnectivity Website0 \ rule $id="Website0NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 location Website1NeedsConnectivity Website1 \ rule $id="Website1NeedsConnectivity-rule" -inf: not_defined pingd or pingd lte 0 colocation Never -inf: LanCluster101 LanCluster100 colocation Never2 -inf: WanCluster100 LanCluster101 colocation NeverBothWebsitesTogether -inf: Website0 Website1 property $id="cib-bootstrap-options" \ dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ no-quorum-policy="ignore" \ stonith-enabled="false" \ last-lrm-refresh="1408954702" \ maintenance-mode="false" rsc_defaults $id="rsc-options" \ resource-stickiness="100" \ migration-threshold="3" I also have a less important question concerning this line: colocation NeverBothLans -inf: LanCluster101 LanCluster100 How do I tell it that this collocation only applies to '11node'.

Read the article
SBD killing both cluster nodes when there are even small SAN network problems

- by Wieslaw Herr

I am having problems with stonith SBD in a openais-based cluster. Some background: The active/passive cluster has two nodes, node1 and node2. They are configured to provide an NFS service to users. To avoid problems with split-brain, they are both configured to use SBD. SBD is using two 1MB disks available to the hosts via an multipath fibre-channel network. The problems start if something happens with the SAN network. For example, today one of the brocade switches got rebooted and both nodes lost 2 out of 4 paths to each disks, which resulted in both nodes committing suicide and rebooting. This, of course, was highly undesirable because a) there were paths left b) even if the switch would be out for 10-20 seconds a reboot cycle of both nodes would take 5-10 minutes and all NFS-locks would be lost. I tried increasing the SBD timeout values (to 10sec+ values, dump attached at the end), however a "WARN: Latency: No liveness for 4 s exceeds threshold of 3 s" hints that something isn't working as I would it expect to. Here is what I would like to know: a) Is SBD working as it should killing nodes when 2 paths are available? b) If not, is the multipath.conf file attached correct? The storage controller we use is an IBM SVC (IBM 2145), should there be any specific configuration for it? (as in multipath.conf.defaults) c) How should I go about increasing the timeouts in SBD attachements: Multipath.conf and sbd dump (http://hpaste.org/69537)

Read the article
Insufficient channel capacity of 1GBit

- by Roman S

There is a Caching Server (Varnish): it receives data from Amazon S3 on request, saves it for some time and gives it to the client. We have encountered the problem of insufficient channel capacity of 1GBit. Peak load within 4 hours completely chokes the channel. Server performance is sufficient for now. Approximately 4.5TB of data are transmitted per day. More than 100TB are accumulated per month. The first thought that comes to mind is simply to add one more 1GBit port and sleep peacefully until 2GBit are not enough (it may happen quite quickly) or one server is not able to handle it. And then we just need to add new Caching Servers. But now we need a Load Balancer, which will send requests on one and the same URL, always on one and the same server (to avoid multiple copies of the same cached objects). Here are the questions: Does a Balancer need a band equal to sum of all bands of Caching Servers? What shall we do in case there are no ports in a Balancer? Should we add more Balancers or solve the problem by means of Round robin DNS? What are the standard approaches to such problems? Can anyone advise hosting-companies, which can solve this problem? We are interested in American and European markets.

Read the article
Tools to manage sql 2008 database mirroring?

- by lemkepf

We are going to be moving about 20 databases that live on a single instance of sql 2000 to a sql 2008 r2 environment with database mirroring. What I'm looking for is a tool or scripts that will help me manage the conversion and management of those 20db's onto this new mirrored environment easily. There are many steps in setting each DB up and I want to automate as much as possible. Edit: Here are the steps I've been doing manually: Create the same username/passwords from the old sql 2000 server onto new sql 2008 server. Then sync those users/passwords onto the other sql 2008 server with the same SSID's so when we do the db backup and restore they match up. Take a backup of each sql 2000 db's. Copy them to server A. Restore the backup to server A. Backup from server a, copy to server b, restore there. Run the mirror "configure security" wizard. Start mirroring. I've love to be able to script this out or have a tool that does it for me. Thanks! Paul

Read the article
How should I perform database maintenance on a 24x7 system

- by solublefish

I'm a software developer who inherited a part-time DBA role. I'm responsible for an application backed by a small, high-volume 24x7 database on SQL Server 2008. While there's other stuff in the DB, the critical piece is a 50GB, 7.5M row table that serves 100K requests/sec during peak load, and about half that at "night". This is 99%+ read traffic, but the writes are constant, and required. I need to be able to perform periodic maintenance without a maintenance window. Say an index rebuild, a job to purge old data, Windows Update, or hardware upgrade. Most of the advice I've seen is along the lines of "MAKE a maintenance window." While I appreciate the sentiment, I hope there's another way. If it will solve this problem, I do have the ability to purchase new hardware or modify the database, the clients (a set of web services servers), and much of the application code (ADO.NET + ASP.NET). I've been thinking along the lines of using the warm spare (or a 3rd server) to do the maintenance, and then "swap" it into production. 1 Synchronize the spare by restoring backups, including a current transaction log. 2 Perform the maintenance tasks. 3 Reconfigure clients to connect to the spare server. Existing connections are finished within a minute or so. 4 The spare server is now the production server. The problem remaining is that the new production server is now out of date by however long it took to perform maintenance. Is there some way that the original production server can be made to queue up changes and merge them to the spare between steps 2 and 3? Any other ideas?

Read the article
Wiki/CMS with synchronization?

- by Clinton Blackmore

We're looking into putting up a wiki or CMS for internal use by our IT department. One of the big things we want to use it for is disaster recovery procedures. Given that a disaster, such as a power or network outage, might render the wiki inaccessible, it seems sensible to to host the wiki in two places so that if one is inaccessible, we can fall back to the other. Are there any wikis or CMSes that synchronize (or an alternate way to achieve a similar end)?

Read the article
GlusterFS as elastic file storage?

- by Christopher Vanderlinden

Is there any way to run GlusterFS in a replicated mode, but with the ability to dynamically scale the volume up and down? Say you have 3 servers all running glusterd. your Gluster volume would have to be setup with replica 3 gluster volume create test-volume replica 3 192.168.0.150:/test-volume 192.168.0.151:/test-volume 192.168.0.152:/test-volume You would then mount it as say \mnt\gfs_test What happens when I want to add 2 more servers to the storage pool and then also use them in this volume? Is there any easy way to expand the volume AND increase that replica count to 5? My end goal is to run this on EC2 instances, say 3 Apache front ends, with the webroot setup on the gluster volume mount. My concern is that if I ever need to spin up a server, I would want the server to not only be an additional Apache front end, but also another server in the gluster file system, adding to fault tolerance as well as possibly giving a slight boost in read speed. Maybe there are better options that would fit the bill here? Thanks.

Read the article
MySQL HA and Magento DB

- by Raj

Is it possible to use MySQL cluster for Magento DB? I have Web app developed in Magento E-commerce platform and I want to make DB highly available using the MySQL cluster. Magento supports only InnoDB database engine and MySQL HA uses it's own engine NDB. The Percona XtraDB Cluster, Does it change the InnoDB storage engine to XtraDB? Can I rollback to the MySQL native replication from Percona XtraDB Cluster?

Read the article
Service haproxy error

- by user128296

I want to configure Haproxy for outgoing mail load balancing. my configuration file /etc/haproxy.cfg is. global maxconn 4096 # Total Max Connections. This is dependent on ulimit daemon nbproc 4 # Number of processing cores. Dual Dual-core Opteron is 4 cores for example. defaults mode tcp listen smtp_proxy 199.83.95.71:25 mode tcp option tcplog balance roundrobin # Load Balancing algorithm ## Define your servers to balance server r23.lbsmtp.org 74.117.x.x:25 weight 1 maxconn 512 check server r15.lbsmtp.org 199.71.x.x:25 weight 1 maxconn 512 check And when i start service haproxy i get this error. Starting HAproxy: [ALERT] 244/172148 (7354) : cannot bind socket for proxy smtp_proxy. Aborting. Please tell me where i am doing mistake.help will appreciated.

Read the article
BGP Router reccomendations for simple redundancy [closed]

- by Jona

We have two sites that each have an internet connection and have a dedicated dark fibre between them. Each site has it's own IP space and we have an AS number. We're looking to be resilient to failure of the internet connection to either site and so need to buy a pair of approriate routers. Requirements are: Able to run 2 bgp sessions (one with the ISP, one with the other site router) Option to take a full table from the upstream ISPs would be nice. Able to provide HA gateways on the LAN side (e.g. 192.168.0.254 will automatically migrate if it's host router lost power) A dedicated device rather than a server running Linux / BSD Not crazy expensive. Any help / advice much appreciated.

Read the article
Good Enough Failover Strategy for DNS / MySQL / Email

- by IMB

I've asked and read a lot questions regarding DNS failover but the more I read the more complicated it becomes, some people say it's good enough some say it isn't. No clear answers from what I read. I was wondering if we can set it straight once and for all, at least for the requirements of most websites out there. Right now let's assume the following: We don't need really need load-balancing, what we need is a failover solution. We are running a website based on LAMP on a VPS. We need to make sure that the Web Server, MySQL, Email are always accessible if not 99%. Basically here's my idea and questions about it: Web Server: We need at least one failover server (another VPS on a separate data center). Is DNS Failover via Round Robin good, if not, what's the best? And how do you exactly implement it? How do you make the files you upload/delete on Server A is also on Server B? MySQL: I've only read a brief intro to MySQL replication and I assume that I can replicate Server A to Server B and vice versa on the fly right? So just it case Server A fails and Server B is now running, it will continue to work and replicate to Server A when it becomes available. So in essence Server B is now the primary server, and will later on failover to Server A, should a failure happen again. Email: If we are gonna use DNS Failover, using webmail or relying on emails stored on the server is probably not a good idea right? Since some emails might be on Server A while some might be on Server B? I assume a basic email forwarder to a 3rdparty is good enough (like Gmail for example) to ensure all emails are kept in one place. Here's a basic diagram for a better picture: http://i.stack.imgur.com/KWSIi.png

Read the article
Should an HA failover occur in this scenario?

- by joeqwerty

I'm running vSphere 5 in an HA cluster across two hosts (vsphereA and vsphereB). I have the HA cluster configured for host monitoring and datastore heartbeat monitoring with admission control disabled (hopefully I rightfully understand that datastore heartbeat monitoring prevents inadvertent and unwanted HA failovers due to management network isolation). Each host has a single connection to a dedicated iSCSI network and iSCSI target (no MPIO). All vmdk's for all VM's exist on the iSCSI datastore. As a test of HA I disconnected the iSCSI connection on vsphereB and was surprised to see that the running VM's on vsphereB continued to run on vsphereB. The powered off VM's were showing as inaccessible (which I expected due to the fact that they weren't running and the connection from vsphereB to the iSCSI target was severed) but the running VM's continued to run and continued to be "owned" by vsphereB. I expected to see an HA failover occur for those VM's and expected to see them "owned" by vsphereA after the HA failover (which didn't occur). I'm at a loss to understand why an HA failover didn't occur for those VM's. Am I misunderstanding in which cases an HA failover should occur?

Read the article
http server connectivity puzzle

- by jpmartins

I have been seeing some strange connection issue in the production environment. The setup has two IBM Http Server's (IHS) and a network IP load-balancer in front of them (round-robin). One instance the system is working fine, the next requests stop arriving at the IHS. Telnet directly to port 80 of the IHS is established sucessfully, but connection to the port 80 through the IP of the load-balancer fails! The puzzle comes next, the network admins say the load-balancer is working fine. When we finally reboot the IHS servers and request start flowing... The situation happened three times the last month and no obvious pattern was found. Any debug ideas?

Read the article
Suggestions on providing HA access to an external (fibre) RAID subsystem

- by user145198

We are looking at upgrading our storage capacity with an external RAID subsystem that has redundant (2) fibre controllers, each controller has 4 x 8 Gbps fibre ports. I would like to make access to this storage system occur via HA Linux. Ideally I would connect 2 fibre ports from each controller into each Linux server, and then export either NFS or iSCSI via a 10 Gbe interface. I have seen plenty of references to DRBD, however all of those references tend to use block storage that is solely attached to each machine, rather than having a shared block storage device, so I am unsure if DRBD could (or should) be used in this case. Ideas?

Read the article
How to check if redis master is OK?

- by e-satis

On the documentation, they advice the monitor command. But it has a 50% performance penalty for the whole system, and how should I do that ? Whatching the ouput using SSH until I don't see anything ? Let's say I have 3 servers: 1 with a redis master, 1 with a redis slave, and one with my website querying the redis master. How can I, from my website server, make cleany the decision to fallback to the slave by sending the SLAVEOF NO ONE command ? My first step would be to put some kind of timeout check with a simple ping, just to be sure the server is online. But for redis specifically, I have no clue.

Read the article

< Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16 | Next Page >