Can enabling a RAID controller's writeback cache harm overall performance?
- by Nathan O'Sullivan
I have an 8 drive RAID 10 setup connected to an Adaptec 5805Z, running Centos 5.5 and deadline scheduler.
A basic dd read test shows 400mb/sec, and a basic dd write test shows about the same.
When I run the two simultaneously, I see the read speed drop to ~5mb/sec while the write speed stays at more or less the same 400mb/sec. The output of iostat -x as you would expect, shows that very few read transactions are being executed while the disk is bombarded with writes.
If i turn the controller's writeback cache off, I dont see a 50:50 split but I do see a marked improvement, somewhere around 100mb/s reads and 300mb/s writes. I've also found if I lower the nr_requests setting on the drive's queue (somewhere around 8 seems optimal) I can end up with 150mb/sec reads and 150mb/sec writes; ie. a reduction in total throughput but certainly more suitable for my workload.
Is this a real phenomenon? Or is my synthetic test too simplistic?
The reason this could happen seems clear enough, when the scheduler switches from reads to writes, it can run heaps of write requests because they all just land in the controllers cache but must be carried out at some point. I would guess the actual disk writes are occuring when the scheduler starts trying to perform reads again, resulting in very few read requests being executed.
This seems a reasonable explanation, but it also seems like a massive drawback to using writeback cache on an system with non-trivial write loads. I've been searching for discussions around this all afternoon and found nothing. What am I missing?