Find slow network nodes between two data centers
- by 2called-chaos
I've got a problem with syncing big amount of data between two data centers. Both machines have got a gigabit connection and are not fully occupied but the fastest that I am able to get is something between 6 and 10 Mbit = not acceptable!
Yesterday I made some traceroute which indicates huge load on a LEVEL3 router but the problem exists for weeks now and the high response time is gone (20ms instead of 300ms).
How can I trace this to find the actual slow node? Thought about a traceroute with bigger packages but will this work?
In addition this problem might not be related to one of our servers as there are much higher transmission rates to other servers or clients. Actually office = server is faster than server <= server!
Any idea is appreciated ;)
Update
We actually use rsync over ssh to copy the files. As encryption tends to have more bottlenecks I tried a HTTP request but unfortunately it is just as slow.
We have a SLA with one of the data centers. They said they already tried to change the routing because they say this is related to a cheap network where the traffic gets routed through. It is true that it will route through a "cheapnet" but only the other way around. Our direction goes through LEVEL3 and the other way goes through lambdanet (which they said is not a good network).
If I got it right (I'm a network intermediate) they simulated a longer path to force routing through LEVEL3 and they announce LEVEL3 in the AS path.
I basically want to know if they're right or they're just trying to abdicate their responsibility. The thing is that the problem exists in both directions (while different routes), so I think it is in the responsibility of our hoster. And honestly, I don't believe that there is a DC2DC connection which only can handle 600kb/s - 1,5 MB/s for weeks! The question is how to detect WHERE this bottleneck is