Dropped connections between Linux Servers in Data Center
- by Emil H
I have a number of Linux servers at a US-based data center. The servers were installed by the hosting company, and are running Fedora Core.
We're experiencing problems with dropped connections. When we attempt to connect to one of the other servers after a period of inactivity, the first connection attempt fails, and sometimes the second as well. After that, the connection succeeds and keeps working for a while. This happens for both MySQL connections and raw socket connections, but only when connecting to some of our servers. The confusing part is that some of the servers that behave differently have identical hardware and software configurations. For example, it happens when connecting to a server called mysql2, but not to a server called mysql3. These servers were installed at the same time, with the same specifications.
The problem can be reproduced somewhat reliably, but only after waiting fifteen minutes to half an hour. This makes it hard to diagnose, and even harder since I'm not really sure what to look for.
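Since the failure only shows up after an idle period, one way to collect evidence is a small probe that waits, then makes a few back-to-back connection attempts and records which ones fail and how long each takes. The following is a minimal sketch, assuming Python 3 is available on the client machine. The function names `probe` and `probe_after_idle`, the timeout, and the number of attempts are illustrative choices, not anything we currently run.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection.

    Returns (succeeded, elapsed_seconds, error_text).
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, time.monotonic() - start, str(exc)


def probe_after_idle(host, port, idle_seconds, attempts=3):
    """Stay idle, then make several attempts in a row and collect the results.

    Returns a list of (attempt_number, succeeded, elapsed_seconds, error_text).
    """
    time.sleep(idle_seconds)
    results = []
    for attempt in range(1, attempts + 1):
        ok, elapsed, err = probe(host, port)
        results.append((attempt, ok, elapsed, err))
    return results
```

Calling, for example, `probe_after_idle("mysql2", 3306, 1800)` and then the same for `mysql3` would give a side-by-side record (3306 is the default MySQL port and is assumed here). Whether the first attempt times out or is refused outright, and how long it takes, is the kind of detail that could be passed on to support.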
I realize that connections sometimes fail, and that we should write our applications to compensate for this, but these servers are all in the same data center. Why would it matter if two servers haven't communicated for a while?
Does anybody have an idea what might be causing this? Is it a server configuration problem, or a network problem that I should contact the hosting company about? What should I tell them to look for? Unfortunately, our experience has been that the support staff don't investigate problems in depth unless we give them detailed directions.