Winsock TCP/IP socket listening but connection refused: race condition?
- by Wayne
Hello folks.
This involves two automated unit tests, each of which starts a TCP/IP server that creates a non-blocking socket, calls bind() and listen(), and then loops on select() waiting for a client that connects and downloads some data.
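For reference, here is a minimal Python sketch of the server pattern described above. It is not the actual test code (which uses Winsock); the names make_listener and serve_once, the loopback address, and the timeout values are all hypothetical, chosen only for illustration.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a non-blocking listening socket; port=0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setblocking(False)
    listener.bind((host, port))
    listener.listen(backlog)
    return listener


def serve_once(listener, payload, timeout=5.0):
    """Wait on select() for one client, send it the payload, then close.

    Returns True if a client was served, False if select() timed out.
    """
    readable, _, _ = select.select([listener], [], [], timeout)
    if not readable:
        # The listening socket never became readable: nothing to accept.
        return False
    conn, _addr = listener.accept()
    conn.setblocking(True)  # blocking send keeps the sketch simple
    try:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)  # signal end of data before closing
    finally:
        conn.close()
    return True
```

A real server would call serve_once in a loop and log each timeout, which is where the "select never fires" symptom would show up.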
The catch is that they work perfectly when run separately, but when run as a test suite, the second test's client fails to connect with WSAECONNREFUSED...
UNLESS
there is a Thread.Sleep() of several seconds between them?!
Interestingly, there is a retry loop that attempts to connect again 1 second after any failure, so the second test keeps looping until it times out after 10 minutes.
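The retry behaviour can be sketched roughly as follows in Python. This is an illustration under assumptions, not the real client: connect_with_retry and its parameters are hypothetical, and the sketch opens a fresh socket for every attempt, which may or may not match what the test client does.

```python
import socket
import time


def connect_with_retry(host, port, retry_interval=1.0, deadline=600.0,
                       connect_timeout=5.0):
    """Retry connecting every `retry_interval` seconds until `deadline` elapses.

    Returns (connected_socket, number_of_attempts).
    Raises TimeoutError if no attempt succeeds in time.
    """
    deadline_at = time.monotonic() + deadline
    attempts = 0
    while True:
        attempts += 1
        try:
            # create_connection() builds a brand-new socket on each call.
            sock = socket.create_connection((host, port), timeout=connect_timeout)
            return sock, attempts
        except OSError as exc:  # includes ConnectionRefusedError
            if time.monotonic() + retry_interval > deadline_at:
                raise TimeoutError(
                    "gave up after %d attempts" % attempts) from exc
            time.sleep(retry_interval)
```

With the defaults (1 second interval, 600 second deadline) this mirrors the "retry every second, give up after 10 minutes" behaviour described above.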
During that time, netstat -na shows the correct port number in the LISTEN state for the server socket. So if it is in the listen state, why won't it accept the connection?
In the code, there are log messages that show select() NEVER reports the socket as ready to read (which, for a listening socket, means a connection is ready to accept).
Obviously the problem must be related to some race condition between the end of one test, which means shutdown() and close() on each end of the socket, and the startup of the next.
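For context, one common teardown order is to half-close, drain, then close. The Python sketch below shows that sequence; it is an assumption about what an orderly teardown could look like, not a description of what these tests actually do, and graceful_close and drain_timeout are hypothetical names.

```python
import socket


def graceful_close(sock, drain_timeout=2.0):
    """Half-close the sending side, drain what the peer still sends, then close."""
    try:
        sock.shutdown(socket.SHUT_WR)  # tell the peer we are done sending
        sock.settimeout(drain_timeout)
        while sock.recv(4096):  # b"" means the peer has closed its side too
            pass
    except OSError:
        # Peer already gone, or the drain timed out; fall through to close.
        pass
    finally:
        sock.close()
```

Calling this on both ends means each side sees the other's end-of-stream before its own socket is released.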
This wouldn't be so bad if the retry logic allowed it to connect eventually after a couple of seconds. However, it seems to get "gummed up", and none of the retries ever succeeds.
Yet for some strange reason, the listening socket SAYS it's in the LISTEN state even though it keeps refusing connections.
So that means it's the Windows O/S itself that is catching the SYN packet and returning an RST packet (which means "Connection Refused").
The only other time I ever saw this error was when the code had a problem that caused hundreds of sockets to get stuck in TIME_WAIT state. But that's not the case here. netstat shows only about a dozen sockets with only 1 or 2 in TIME_WAIT at any given moment.
Please help.