Monday, April 30. 2007
The discussion on using SAN- vs. DASD-based storage is nearly a religious war (as can be seen in many discussions on pgsql-performance) and in many ways similar to the infamous emacs vs. vi debate.
However - for some workloads these kinds of SANs are not really appropriate. A DS4300 (now withdrawn from marketing) can do only a bit above 100MB/s of sequential IO per controller (nearly independent of the number of disks!), or about 135MB/s with both controllers used together, which is really not much when one considers how fast modern hard drives are.
I recently got a SAN array to play with that looks quite interesting: while expensive, it still seems reasonably priced compared to what companies like IBM want for similar gear.
The array I got for testing is basically a non-branded LSI/Engenio 3994 with 16 2Gbit/s 10k RPM 146GB FC drives and 2GB of battery-backed cache per controller.
Because the disks are only capable of 2Gbit/s, the speed of the two drive channel loops is also limited to 2Gbit/s (with 4Gbit/s FC drives the drive channels can be configured to run at 4Gbit/s).
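As a rough sanity check of what such a loop can deliver: 2Gbit/s FC runs at a 2.125Gbaud line rate with 8b/10b encoding (8 data bits per 10 line bits), which works out to roughly 212MB/s of payload per loop. A quick back-of-the-envelope calculation:

```shell
# 2Gbit/s FC: 2.125 Gbaud line rate, 8b/10b encoding (8 data bits
# per 10 line bits), 8 bits per byte.
awk 'BEGIN { printf "%.1f MB/s\n", 2.125e9 * 8 / 10 / 8 / 1e6 }'
```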
In the following test (test case 1) we use two volume groups, each an 8-disk RAID10, with a software RAID0 over them in the OS, and write cache mirroring between the controllers enabled (this keeps both controller caches in sync, so if one controller fails the other can take over). To utilize both controllers, the HBAs are set up so that controller A serves one logical volume and controller B the other.
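A software RAID0 stripe over the two RAID10 LUNs can be set up with mdadm roughly like this; the device names, chunk size, filesystem and mount point are assumptions - substitute whatever your HBA/multipath setup actually exposes:

```shell
# /dev/sdb and /dev/sdc are assumed to be the two 8-disk RAID10 LUNs,
# one owned by controller A and one by controller B.
mdadm --create /dev/md0 --level=0 --raid-devices=2 \
      --chunk=256 /dev/sdb /dev/sdc

# Put a filesystem on the stripe for the sequential IO tests.
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/bench
```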
and the same with write mirroring disabled for both logical volumes (test case 2):
so write cache mirroring seems to carry a roughly 20% penalty for sequential writes and rewrites but has little impact otherwise - it might be worth keeping it turned on for the additional data integrity guarantees it provides.
and now, for comparison, a test using only one volume group and a single controller (test case 3):
so let's see what PostgreSQL is able to do in terms of sequential IO on such device:
simple sequential scan on a large table (pgbench schema generated with a scale of 10000) using only a single controller (same setup as in test case 3):
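The scan itself can be driven from psql along these lines; the database name is an assumption, and on the PostgreSQL versions of that era the pgbench table is simply called accounts:

```shell
# \timing makes psql report the wall-clock time of each query;
# count(*) forces a full sequential scan of the table.
psql -d pgbench <<'EOF'
\timing
SELECT count(*) FROM accounts;
EOF
```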
so we are getting about 215MB/s out of 250MB/s, which looks OK. So what happens with software RAID0 over two 8-disk RAID10 volume groups on different controllers (same setup as test case 1):
so that is more interesting - it seems that PostgreSQL is getting CPU-bottlenecked here (the array/filesystem can do >370MB/s), and those ~280MB/s are pretty much in line with what Luke usually quotes (PostgreSQL getting CPU-bottlenecked at around 300MB/s even on very fast AMD Opteron based boxes).
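Throughput figures like the ones above can be derived from the table size and the scan's wall-clock time; the numbers below are illustrative placeholders (a ~150GB table scanned in 715 seconds), not the actual measurements:

```shell
# Hypothetical values: ~150GB table, 715 seconds of scan time.
table_bytes=$((150 * 1024 * 1024 * 1024))
scan_seconds=715
awk -v b="$table_bytes" -v s="$scan_seconds" \
    'BEGIN { printf "%.0f MB/s\n", b / s / (1024 * 1024) }'
```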
for those curious, here are some other random tests (presented without commentary, so judge for yourself):
a single RAID5 with 4 logical volumes (each 500GB) and software RAID0 in the OS - two volumes per channel:
A single RAID5 array over all 16 disks and two identically sized logical volumes each around 1TB in size.
on one LUN:
using both LUNs and software RAID0:
with disabled write cache mirroring:
pgcon, pgday and blogging
I just got booked to go to pgday in Prato, Italy. It looks like it's going to be a great gathering of the European PostgreSQL community. I'm really looking forward to meeting those from the EU group that I haven't already had a chance to meet.