Is current SATA 6 Gb/s equipment simply unreliable?

Posted by korkman on Server Fault
Published on 2012-04-14T13:32:04Z

I have a 45-disk array of Seagate Barracuda 3 TB ST3000DM001 drives (yes, these are desktop drives, I'm aware of that) in a Supermicro SC847 JBOD, connected via an LSI 9285. I have found a solution for the problem described below by reducing the link speed via

# Set link speed value 2 (3 Gb/s, per the result described below) on PHYs 0 through 48 of adapter 0
for i in $(seq 0 48); do MegaCli -PhySetLinkSpeed -phy${i} 2 -a0; done

and rebooting.

The question remains: Is this typical for current 6 Gb/s equipment? Is this the sad state of SATA storage? Or is some of my equipment bad (the SFF-8088 cables come to mind)?

The problem was:

While synchronizing a hardware RAID-6, disks kept going offline. Fetching SMART values revealed that the powered-on hours of the offlined disks no longer increased. That is, their firmware (CC4C) seems to crash.
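
One way to make this check repeatable is to record the Power_On_Hours raw value of each disk at two points in time and compare them. The following is a minimal sketch, not the procedure actually used here. It assumes `smartctl -A` prints the usual attribute table with the raw value in the tenth column, and the function name `power_on_hours` is made up for illustration. Some drives report the raw value in a different format (for example with minutes appended), in which case the comparison would need adjusting.

```shell
#!/bin/sh
# Sketch: extract the raw Power_On_Hours value from `smartctl -A` output on stdin.
# Assumes the standard attribute table layout (raw value in column 10).
power_on_hours() {
    awk '$2 == "Power_On_Hours" { print $10 }'
}

# Hypothetical usage (device path and megaraid index are placeholders):
#   before=$(smartctl -A -d megaraid,9 /dev/sda | power_on_hours)
#   sleep 7200
#   after=$(smartctl -A -d megaraid,9 /dev/sda | power_on_hours)
#   [ "$after" -gt "$before" ] || echo "power-on hours did not advance"
```

A disk whose counter stays flat over a couple of hours while it is supposedly powered on would match the symptom described above.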

Digging into the matter by switching to software RAID-6 with the disks passed through, I got tons of kernel messages like the following, scattered across all disks, at 6 Gb/s:

sd 0:0:9:0: [sdb]  Sense Key : No Sense [current]
Info fld=0x0
sd 0:0:9:0: [sdb]  Add. Sense: No additional sense information

And finally, when a disk offlines:

megasas: [ 5]waiting for 160 commands to complete
...
megasas: [35]waiting for 159 commands to complete
...
megasas: [155]waiting for 156 commands to complete
...
megaraid_sas: pending commands remain after waiting, will reset adapter.

Ugly controller reset here, then minutes later:

megaraid_sas: Reset successful.
sd 0:0:28:0: Device offlined - not ready after error recovery
...
sd 0:0:28:0: [sdu] Unhandled error code
sd 0:0:28:0: [sdu]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
sd 0:0:28:0: [sdu] CDB: Read(10): 28 00 23 21 2f 40 00 00 70 00
sd 0:0:28:0: [sdu] killing request
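
To see which disks were affected across a long log, the offline events can be pulled out of the kernel log. This is a minimal sketch, assuming the messages have exactly the form shown above. The function name `offlined_devices` is hypothetical.

```shell
#!/bin/sh
# Sketch: list the SCSI addresses (host:channel:target:lun) of devices
# that the kernel reported as offlined. Reads kernel log text on stdin.
offlined_devices() {
    sed -n 's/.*sd \([0-9:]*\): Device offlined.*/\1/p' | sort -u
}

# Hypothetical usage:
#   dmesg | offlined_devices
```

If the addresses it prints are spread across many bays rather than clustered on a few, that would point toward the link or controller rather than individual drives.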

After I reduced the speed to 3 Gb/s as described above, all problems vanished.
