PDU management interface has low availability - product flaw or isolated issue
- by DeanB
Our colocation provider has supplied us with APC AP7932 switched 0U PDUs as part of several cabinets they provide us. We have had a lot of trouble with the network management aspect of these PDUs, which I'll describe below. We are moving to cage space in the same datacenter, and plan to provide our own PDUs, so I'd like to determine which enterprise-grade PDUs have been reliable performers from a remote management perspective.
Our colo-provided PDUs are configured to support management via an SSL web UI and via telnet. We updated the firmware on all of them to the current version as of NOV2011. They respond to pings reliably, and we have no reason to suspect a network layer issue. However, we experience frequent hangs, timeouts, disconnects, and general unavailability from the embedded management host in all of the PDUs. We occasionally have to restart the microcontroller on the PDU to recover from what appears to be an occasional hard fault. The outlets stay powered (thankfully), but the management aspect is so unreliable that it has become an ops liability - we can't be confident that we could get into the PDU to power cycle a host if we needed to. We have 3 PDUs that all exhibit identical behavior.
There are many manufacturers of enterprise-grade 0U switched PDUs, all with comparable features. If I looked at the datasheet for our current PDUs, they would appear to be a good fit -- only with the benefit of suffering through using them do we know to avoid them. I'd like to avoid picking a PDU that looks fine on paper, but has similar reliability issues.
What has been others' experience with switched PDUs? Is this level of flakiness normal?