SQL IO and SAN troubles
- by James
We are running two servers with an identical software setup but different hardware.
The first is a VM on VMware, hosted on an ordinary tower server with dual-core Xeons, 16 GB RAM and a 7200 RPM drive.
The second is a VM on XenServer, hosted on a powerful, brand-new rack server with quad-core Xeons and shared storage.
We are running Dynamics AX 2012 and SQL Server 2008 R2. As a test, I inserted 15,000 records into a table: the slow tower server does it in 13 seconds, while the fast rack server takes 33 seconds. I re-ran the test several times with the same results.
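To put the two timings on a common scale, here is a minimal Python sketch that converts them to rows per second. The row count and durations come from the test described above; the function and variable names are only illustrative.

```python
# Convert the insert timings into throughput so the two servers can be compared.
def rows_per_second(rows: int, seconds: float) -> float:
    """Average insert rate over the whole test."""
    return rows / seconds

ROWS = 15_000          # records inserted in the test
TOWER_SECONDS = 13     # slow tower server (VMware)
RACK_SECONDS = 33      # fast rack server (XenServer, shared storage)

tower = rows_per_second(ROWS, TOWER_SECONDS)
rack = rows_per_second(ROWS, RACK_SECONDS)
slowdown = RACK_SECONDS / TOWER_SECONDS

print(f"tower: {tower:.0f} rows/s")
print(f"rack:  {rack:.0f} rows/s")
print(f"rack takes {slowdown:.2f}x as long")
```

That works out to roughly 1154 rows/s on the tower against 455 rows/s on the rack, so the rack needs about 2.5 times as long for the same work.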
I suspect some sort of I/O bottleneck, so I ran SQLIO on both servers.
Here are the results for the slow tower server:
C:\Program Files (x86)\SQLIO>test.bat
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -frandom -b8 -BH -LS C:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
8 threads writing for 120 secs to file C:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: C:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 226.97
MBs/sec: 1.77
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 281
Max_Latency(ms): 467
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -frandom -b8 -BH -LS C:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
8 threads reading for 120 secs from file C:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: C:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 91.34
MBs/sec: 0.71
latency metrics:
Min_Latency(ms): 14
Avg_Latency(ms): 699
Max_Latency(ms): 1124
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS C:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
8 threads writing for 120 secs to file C:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: C:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 1094.50
MBs/sec: 68.40
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 58
Max_Latency(ms): 467
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS C:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
8 threads reading for 120 secs from file C:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: C:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 1155.31
MBs/sec: 72.20
latency metrics:
Min_Latency(ms): 17
Avg_Latency(ms): 55
Max_Latency(ms): 205
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
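One thing worth checking in these numbers: with 8 threads (-t8) and 8 outstanding I/Os per thread (-o8), up to 64 requests are in flight at once, so the average latency should be roughly 64 divided by the IOs/sec figure (Little's law). The sketch below runs that check against the tower results above. It assumes the queue stays full for the whole run, which is an assumption on my part and not something SQLIO reports.

```python
# Sanity-check reported average latency against queue depth (Little's law).
THREADS = 8        # from -t8
OUTSTANDING = 8    # from -o8 (outstanding I/Os per thread)
IN_FLIGHT = THREADS * OUTSTANDING  # 64 requests queued at any moment

def expected_latency_ms(iops: float, in_flight: int = IN_FLIGHT) -> float:
    """Average latency implied by a full queue at the given IOs/sec."""
    return in_flight / iops * 1000.0

# (IOs/sec, reported Avg_Latency in ms) copied from the tower output above
tower_runs = {
    "8KB random write":      (226.97, 281),
    "8KB random read":       (91.34, 699),
    "64KB sequential write": (1094.50, 58),
    "64KB sequential read":  (1155.31, 55),
}

for name, (iops, reported) in tower_runs.items():
    implied = expected_latency_ms(iops)
    print(f"{name:<22} implied {implied:6.1f} ms, reported {reported} ms")
```

The implied and reported values agree to within a couple of milliseconds, which suggests the high average latencies mostly reflect the deep queue in this test and should be read together with the throughput figures.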
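Here are the results for the fast rack server. The first run of test.bat pointed at E:\TestFile.dat and exited because the path could not be found; the second run, against c:\TestFile.dat, completed: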
C:\Program Files (x86)\SQLIO>test.bat
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -frandom -b8 -BH -LS E:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads writing for 120 secs to file E:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
open_file: CreateFile (E:\TestFile.dat for write): The system cannot find the path specified.
exiting
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -frandom -b8 -BH -LS E:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads reading for 120 secs from file E:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
open_file: CreateFile (E:\TestFile.dat for read): The system cannot find the path specified.
exiting
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS E:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads writing for 120 secs to file E:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
open_file: CreateFile (E:\TestFile.dat for write): The system cannot find the path specified.
exiting
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS E:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads reading for 120 secs from file E:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
open_file: CreateFile (E:\TestFile.dat for read): The system cannot find the path specified.
exiting
C:\Program Files (x86)\SQLIO>test.bat
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -frandom -b8 -BH -LS c:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads writing for 120 secs to file c:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: c:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 2575.77
MBs/sec: 20.12
latency metrics:
Min_Latency(ms): 1
Avg_Latency(ms): 24
Max_Latency(ms): 655
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 5 8 9 9 9 8 5 3 1 1 1 1 0 0 0 0 0 0 0 0 0 37
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -frandom -b8 -BH -LS c:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads reading for 120 secs from file c:\TestFile.dat
using 8KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: c:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 1141.39
MBs/sec: 8.91
latency metrics:
Min_Latency(ms): 1
Avg_Latency(ms): 55
Max_Latency(ms): 652
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 91
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS c:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads writing for 120 secs to file c:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: c:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 341.37
MBs/sec: 21.33
latency metrics:
Min_Latency(ms): 5
Avg_Latency(ms): 186
Max_Latency(ms): 120037
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
C:\Program Files (x86)\SQLIO>sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS c:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 62500000 counts per second
8 threads reading for 120 secs from file c:\TestFile.dat
using 64KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 5120 MB for file: c:\TestFile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 1024.07
MBs/sec: 64.00
latency metrics:
Min_Latency(ms): 5
Avg_Latency(ms): 61
Max_Latency(ms): 81632
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
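Reading the figures out of that much console output by eye is error-prone, so here is a small Python sketch that pulls the throughput and latency metrics out of pasted SQLIO output. It relies only on the line labels visible above (`CUMULATIVE DATA:`, `IOs/sec:`, and so on); the function name `parse_sqlio` is my own, and the sample is the rack server's 64KB sequential write block.

```python
import re

# Matches the metric lines SQLIO prints, e.g. "IOs/sec: 341.37".
METRIC = re.compile(
    r"^(IOs/sec|MBs/sec|Min_Latency\(ms\)|Avg_Latency\(ms\)|Max_Latency\(ms\)):"
    r"\s*([\d.]+)\s*$"
)

def parse_sqlio(text: str) -> list[dict[str, float]]:
    """Return one dict of metrics per completed run in the pasted output."""
    runs: list[dict[str, float]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("CUMULATIVE DATA"):
            # A new results block starts here; failed runs never print this.
            current = {}
            runs.append(current)
            continue
        match = METRIC.match(line)
        if match and current is not None:
            current[match.group(1)] = float(match.group(2))
    return runs

sample = """\
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 341.37
MBs/sec: 21.33
latency metrics:
Min_Latency(ms): 5
Avg_Latency(ms): 186
Max_Latency(ms): 120037
"""

runs = parse_sqlio(sample)
print(runs[0])
```

Runs that exited with "cannot find the path specified" print no `CUMULATIVE DATA:` block, so they are skipped. Note also that the Max_Latency of 120037 ms in this block is about the same as the 120-second test length (-s120), which suggests at least one write stayed pending for nearly the whole run.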
Three of the four tests are, to my mind, within reasonable parameters for the rack server. However, the 64 KB sequential write test is incredibly slow on the rack server (68 MB/s on the slow tower vs. 21 MB/s on the rack). The 64 KB sequential read speed also seems slow.
Is this enough to say there is some sort of bottleneck in the shared storage? I need to know whether I can take this evidence and say we need to launch an investigation.
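For a side-by-side view, this short Python sketch computes the rack-to-tower throughput ratio for each of the four tests. The MBs/sec values are copied from the SQLIO output above; the dictionary layout and names are just for illustration.

```python
# MBs/sec per test as (tower, rack), copied from the SQLIO output above.
results = {
    "8KB random write":      (1.77, 20.12),
    "8KB random read":       (0.71, 8.91),
    "64KB sequential write": (68.40, 21.33),
    "64KB sequential read":  (72.20, 64.00),
}

def rack_to_tower(tower_mbs: float, rack_mbs: float) -> float:
    """Ratio above 1 means the rack server is faster than the tower."""
    return rack_mbs / tower_mbs

for name, (tower_mbs, rack_mbs) in results.items():
    ratio = rack_to_tower(tower_mbs, rack_mbs)
    print(f"{name:<22} tower {tower_mbs:6.2f}  rack {rack_mbs:6.2f}  ratio {ratio:5.2f}")

# Tests where the rack server is slower than the tower.
slower_on_rack = [name for name, (t, r) in results.items() if r < t]
print("slower on rack:", slower_on_rack)
```

The rack server is more than ten times faster on both random tests, but only about 0.31 times the tower's speed on 64KB sequential writes and about 0.89 times on 64KB sequential reads.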
Any help is appreciated.