Saturday, May 19. 2007
Today I spent some time doing a bit of maintenance work on some of my buildfarm boxes, and suddenly I thought it would be nice to present some of the weirder ones to others.
I have a total of 14 registered hosts on the buildfarm, nine of them actively reporting, and four of them (those that I think are the most weird and interesting ones) are worth mentioning in a bit more detail:
issues found by lionfish:
issues found by quagga:
issues found by spoonbill:
issues found by sponge:
While hardware of that kind is not likely to be found in any serious or performance-critical production use (at least I hope so!), this summary clearly shows the importance of the buildfarm as well as the value of having not-so-mainstream boxes there :-)
Saturday, May 5. 2007
In an apparent moment of brain fade I volunteered to add an 8.3 patch status page to the developer wiki.
Monday, April 30. 2007
The discussion on using SAN vs. DASD based storage is nearly a religious war (as can be seen in a lot of discussions on pgsql-performance) and in many ways similar to the infamous emacs vs. vi debate.
However - for some workloads those types of SAN are not really that appropriate. A DS4300 (now withdrawn from marketing) can do only a bit over 100MB/s of sequential IO per controller (nearly independent of the number of disks!), about 135MB/s if both controllers are used together, which is really not much considering how fast modern hard drives are.
I recently got a SAN array to play with that looks quite interesting: while expensive, it still seems reasonably priced compared to what companies like IBM or others want for similar gear.
The array I got for testing is basically a non-branded LSI/Engenio 3994 with sixteen 2Gbit 10k 146GB FC drives and 2GB of battery-backed cache per controller.
Because the disks themselves only support 2Gbit/s, the speed of the two drive channel loops is also limited to 2Gbit/s (with 4Gbit FC drives the drive channels can be configured to run at 4Gbit/s).
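As a rough back-of-the-envelope check (assuming standard 8b/10b FC encoding - this is a theoretical ceiling, not a measured figure), the payload bandwidth of a single 2Gbit loop works out to:

```python
# "2Gbit" FC runs at a nominal line rate of 2.125 Gbaud and uses 8b/10b
# encoding, i.e. 10 bits on the wire per payload byte.
line_rate_baud = 2.125e9
payload_mb_per_s = line_rate_baud / 10 / 1e6  # bytes/s -> MB/s

print(f"{payload_mb_per_s:.1f} MB/s per 2Gbit loop")  # 212.5 MB/s
```

So with two drive loops a single controller has roughly 400MB/s of raw back-end bandwidth to play with, before any protocol or RAID overhead.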
In the following test (test case 1) we use two volume groups - each an 8-disk RAID10 - striped together with software RAID0 in the OS, and write cache mirroring enabled between the controllers (this keeps both controller caches in sync so that if one controller fails the other can take over). To utilize both controllers, the HBAs are set up so that one volume group is accessed through controller A and the other through controller B.
and the same with write mirroring disabled for both logical volumes (test case 2):
so write cache mirroring seems to carry a roughly 20% penalty for sequential writes and rewrites but has little impact on the other tests - it might therefore be worth keeping it turned on for the additional data integrity guarantees it provides.
and now for comparison a test using only one volume group and a single controller (test case 3):
so let's see what PostgreSQL is able to do in terms of sequential IO on such device:
simple sequential scan on a large table (pgbench schema generated with a scale of 10000) using only a single controller (same setup as in test case 3):
so we are getting about 215MB/s out of 250MB/s, which looks OK. So what happens with software RAID0 over two 8-disk RAID10 volume groups on different controllers (same setup as test case 1)?
so that is more interesting - it seems that PostgreSQL is getting CPU-bottlenecked here (the array/filesystem can do >370MB/s), and those ~280MB/s are pretty much in line with what Luke usually quotes (PostgreSQL hitting a CPU bottleneck at around 300MB/s even on very fast AMD Opteron based boxes).
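Putting the figures quoted above side by side (the numbers are taken straight from the tests, nothing new is measured here) makes the bottleneck shift quite visible:

```python
# Throughput figures from the tests above (MB/s).
single_controller_fs = 250  # what the filesystem delivers on one controller
single_controller_pg = 215  # what the sequential scan achieved there
dual_controller_fs = 370    # software RAID0 over both controllers
dual_controller_pg = 280    # sequential scan on the same setup

print(f"single controller: {single_controller_pg / single_controller_fs:.0%} of raw bandwidth")
print(f"both controllers:  {dual_controller_pg / dual_controller_fs:.0%} of raw bandwidth")
```

On one controller the scan runs at about 86% of what the storage can do; with both controllers the storage still has plenty of headroom while PostgreSQL does not - which is exactly what one would expect from a CPU bottleneck.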
for those curious, here are some other assorted tests (without commentary, so judge for yourself):
a single RAID5 with four logical volumes (500GB each) and software RAID0 in the OS - two volumes per channel:
A single RAID5 array over all 16 disks and two identically sized logical volumes each around 1TB in size.
on one LUN:
using both LUNs and software RAID0:
with disabled write cache mirroring:
Friday, April 13. 2007
well it's done ... wwwmaster.postgresql.org has now moved to tribble.
The most important one is probably that we are now running on FreeBSD 6.2/amd64 instead of the more or less EOLed 4.11 we had before.
Some other goodies/improvements we are getting from that move:
*) up-to-date versions of the installed ports and packages
Thursday, March 1. 2007
System monitoring is both an art and a pain. It is nice to have pretty graphs that show what's going on with a server or a service, as well as something that does proper notification of current or potential issues, but on the other hand there is also a lot of pain and (boring) work involved in getting this up and running properly.
I'm quite a fan of doing proper and detailed monitoring of systems, and after the latest issues with tribble I took a stab at improving the monitoring of that box. But tribble is running FreeBSD, and doing hardware-related monitoring (as opposed to checking for things in the OS) is often more difficult there for various reasons.
The first thing I wanted monitored is the hardware itself - modern servers usually carry some sort of BMC (Baseboard Management Controller) or even more sophisticated solutions (RSAII and iLO, just to name a few) that are basically small independent computers on the mainboard.
Accessing the data those BMCs provide is often done through complex, binary-only drivers available only for Microsoft Windows and a limited number of commercially supported Linux distributions (and some of them are even bloated Java based GUI things). However, in the last few years a standards-based solution to this kind of task has appeared: the Intelligent Platform Management Interface (IPMI).
IPMI provides a standardized interface to manage and monitor servers even in the absence(!) of an operating system - a cool idea, though in practice it bears a lot of similarity to ACPI in the sense that every vendor implements it a bit differently and especially early implementations are buggy as hell.
Sample output of that script for tribble looks like:
which is a bit verbose but I will work on that later ;-)
The script will also check the System Event Log (SEL) - basically a small NVRAM-backed memory on the BMC that holds all kinds of hardware monitoring events - for entries (in this case there are none) and will return a warning if it finds any.
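The SEL part of the check essentially boils down to counting log entries. A minimal sketch of that logic (the function name and the sample line are made up for illustration; real `ipmitool sel list` output varies by version and vendor):

```python
def check_sel(sel_list_output: str) -> list:
    """Return the event lines found in `ipmitool sel list`-style output."""
    events = []
    for line in sel_list_output.splitlines():
        line = line.strip()
        # an empty log usually announces itself instead of printing entries
        if not line or "SEL has no entries" in line:
            continue
        events.append(line)
    return events

# empty log -> OK; anything else -> warn in the monitoring system
assert check_sel("SEL has no entries\n") == []
sample = "1 | 05/19/2007 | 10:01:02 | Temperature #0x30 | Upper Critical going high"
print(f"WARNING: {len(check_sel(sample))} entry in SEL" if check_sel(sample) else "OK")
```

The real script does more (thresholds, sensor readings), but the "warn on any SEL entry" rule is deliberately paranoid - on a healthy box the log should simply stay empty.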
OK, now that we have the basic hardware covered, only one major thing is left: monitoring of the integrated IBM ServeRAID 7k adapter, which hosts two arrays (a 2-disk RAID1 for the OS and related data and a 4-disk RAID10 for the VMs).
Monitoring hardware RAID is a delicate thing on most BSDs (though OpenBSD made some promising progress on that front lately) - the lack of vendor support often results in only rudimentary drivers at best and useful tools to check the array status or even initiate rebuilds are often simply not available.
so at the end of the day we have nice hardware monitoring for at least one of the project's servers - but there is still a lot to do in the future ...
Thursday, February 15. 2007
So this is it - my first venture into this whole blogging thingy (and I blame magnus for tricking me into starting one).
It is a pretty powerful server from a respected tier-1 vendor running FreeBSD 6.2, yet it has failed twice since it went into production in early October 2006, when it was donated to the project by the company I work for.
Both times the same thing happened - the box suddenly stopped responding to the network but would still display (though not react to) the console. The only way to get it back to life was to power cycle it.
However, tribble is one of the fastest dedicated boxes the project has, and it seemed critical to investigate why it failed twice before we decide to move other services to it - there is really no value in a box one cannot trust.
"Operating System locks, hangs or resets after 76.5 days." - this description sounded familiar but was it really the cause for all the troubles?
the first unexplained crash was around 2006-12-19 17:30 CET, and according to our monitoring the box was last power cycled early in the morning of the 4th of October - so let's do some math:
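The math is simple enough - taking "early in the morning" as 06:00 (an assumption on my part) and ignoring the CEST-to-CET switch in between, which only shifts things by an hour:

```python
from datetime import datetime

power_cycled = datetime(2006, 10, 4, 6, 0)     # last power cycle (time assumed)
first_crash  = datetime(2006, 12, 19, 17, 30)  # first unexplained crash

uptime_days = (first_crash - power_cycled).total_seconds() / 86400
print(f"uptime at crash: {uptime_days:.1f} days")  # 76.5 days
```

That lands almost exactly on the 76.5 days from the advisory.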
So looks like we just found the reason for the first crash ...