I have a file server running Win2k8R2 on an older HP DL380G4. It has nothing set up on it other than file sharing, and all drivers, firmware, and updates are installed. The file server is used as a dump for a bunch of test machines, so essentially a lot of small files are being written to it. It was working fine until it started showing the following symptoms:
1. Shares became either very slow/intermittent or could not be accessed at all.
2. After logging in to the server, you could use it like normal, but Windows would start freezing and eventually you had to hard reboot it because nothing was responsive.
3. After rebooting, it would work fine for 20 minutes to 2 hours and then degrade into this broken state again.
Some info after investigation:
1. The HP RAID Config utility shows the RAID array as functioning properly (RAID5, btw).
2. The event log shows a bunch of DoS attack entries naming the test machines, each saying the server has disconnected the connection.
   a. AFAIK (not part of my job), the test machines haven't changed the way they log information to this server, and the number of them hasn't increased.
   b. Nothing is infected: this server was scanned fully, and the test machines are re-imaged almost daily.
3. Nothing in Performance Monitor shows as pegged at maximum (CPU/HD/network/RAM).
4. I installed MS Network Monitor and it is showing a lot of traffic.
5. The server was using one gigabit Ethernet connection. I connected the second one as well, with the same results.
Forgot to add - one of the commonly written to dirs on the share has over 16k subdirs in it, with a crapton of small files within those dirs. Part of the OS instability showed up as slow access to the drive that holds this directory. Perfmon doesn't show much activity on the HD though, so I'm not sure if this crowded dir is the cause.
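To put numbers on how crowded that directory really is, something like the following could be run against it. This is only a rough sketch: the function name `summarize_tree` is made up for illustration, and it assumes Python is available on a machine that can reach the share. It counts the immediate subdirectories and all files below the root, and times the walk so a slow enumeration shows up directly.

```python
import os
import time

def summarize_tree(root):
    """Return (immediate_subdirs, total_files, elapsed_seconds) for root.

    Hypothetical helper for illustration only.
    """
    start = time.perf_counter()

    # Count only the first-level subdirectories (the "16k subdirs" figure).
    immediate_subdirs = 0
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                immediate_subdirs += 1

    # Count every file anywhere below root.
    total_files = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total_files += len(filenames)

    elapsed = time.perf_counter() - start
    return immediate_subdirs, total_files, elapsed
```

If the walk takes a long time while perfmon still shows the disk as mostly idle, that would at least hint that the bottleneck is somewhere other than raw disk throughput.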
Here is one important fact: I ran into this issue 2-3 months ago, couldn't figure it out, but I had a spare identical machine so I swapped them out (thought it was related to the machine), and now I have the same issue.
Also, the computer will be stable if I turn off file sharing.
So is the server just getting DoS'd by the test machines? I've never dealt with such an issue. Is instability in the server's OS common when getting DoS'd? Is there anything I can do to confirm this before telling the owners of the test machines to optimize their traffic? (I'm not sure what they'll be able to do).
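One way I could imagine quantifying this is to capture the output of `netstat -n` on the server and count established connections to the SMB port (445) per remote address. The sketch below assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State); the function name `count_smb_clients` is hypothetical.

```python
from collections import Counter

def count_smb_clients(netstat_text, port=445):
    """Count ESTABLISHED TCP connections to the given local port, per remote host.

    Assumes `netstat -n` style lines: Proto, Local Address, Foreign Address, State.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and non-TCP rows.
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != "ESTABLISHED":
            continue
        # Split on the last colon so bracketed IPv6 addresses still work.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts
```

Calling `count_smb_clients(text).most_common(10)` on a capture taken while the server is degrading would show whether a handful of test machines hold most of the connections.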
Is there something within Win2k8R2 that can balance the traffic across the two NICs?
Any help would be appreciated.
Update: Another thought - the drive with the share is RAID5 across 6 SCSI320 300GB HDs. It is near full capacity, with about 100GB left out of 1TB. Could the number of tiny files be causing some weirdness with the parity in this array? I think I've read something about this in the past, but I'm no expert on RAID.
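For reference, here is the textbook RAID5 arithmetic as I understand it, as a small sketch. It assumes the classic read-modify-write behaviour for small writes (read old data, read old parity, write new data, write new parity) and ignores any controller write cache, so what this particular controller actually does may differ. Both function names are made up for illustration.

```python
def raid5_usable_gb(disks, disk_gb):
    """Nominal usable capacity of a RAID5 set: one disk's worth goes to parity."""
    if disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disks - 1) * disk_gb

def raid5_small_write_ios(logical_writes):
    """Back-end disk I/Os for small writes, assuming uncached read-modify-write.

    Each small write costs 4 I/Os: read data, read parity, write data, write parity.
    """
    return 4 * logical_writes

# Nominal figures for the array described above (6 disks of 300GB each).
usable = raid5_usable_gb(6, 300)          # 1500
backend = raid5_small_write_ios(1000)     # 4000
```

The capacity is a nominal decimal figure before filesystem overhead, so it need not match what Windows reports for the volume.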