<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>mastermind's weblog - hardware</title>
    <link>http://www.kaltenbrunner.cc/blog/</link>
    <description>random things on life,hardware and postgresql</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.5.3-2 - http://www.s9y.org/</generator>
    <pubDate>Thu, 01 Nov 2012 12:06:54 GMT</pubDate>

    <image>
        <url>http://www.kaltenbrunner.cc/blog/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: mastermind's weblog - hardware - random things on life,hardware and postgresql</title>
        <link>http://www.kaltenbrunner.cc/blog/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Benchmarking 8.4 - Chapter 2/bulk loading</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html</link>
            <category>hardware</category>
            <category>PostgreSQL</category>
            <category>work</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=27</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=27</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;As promised in the previous &lt;a href=&quot;http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;post&lt;/a&gt;, this is the second part in a series on testing/benchmarking 8.4 under various circumstances.&lt;br /&gt;
The topic in this post is bulk loading of data. Knowing the expected and theoretical bulk loading performance is very important for a DBA. It affects not only data warehouse style operations but also plays an important part in disaster recovery scenarios, because the total time it takes to restore your database from your backup is directly related to its bulk loading performance.&lt;/p&gt;

 &lt;br /&gt;&lt;a href=&quot;http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html#extended&quot;&gt;Continue reading &quot;Benchmarking 8.4 - Chapter 2/bulk loading&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Tue, 16 Jun 2009 09:28:00 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/27-guid.html</guid>
    
</item>
<item>
    <title>Benchmarking 8.4 - Chapter 1/Read-Only workloads</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html</link>
            <category>hardware</category>
            <category>PostgreSQL</category>
            <category>work</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=26</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=26</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;Computing platforms are constantly changing, evolving and improving. This holds true for both the hardware and the software running on that hardware.&lt;br /&gt;
With PostgreSQL 8.4 just around the &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2009-06/msg00469.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;corner&lt;/a&gt; and a 2U Nehalem based IBM x3650M2 on my hands I thought I would do some benchmarking and see how well PostgreSQL does on that kind of hardware under a range of different workloads. This is planned as a series of posts starting with read-only benchmarking, followed up by bulk load testing and maybe some OLTP benchmarks as well.&lt;/p&gt;

 &lt;br /&gt;&lt;a href=&quot;http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html#extended&quot;&gt;Continue reading &quot;Benchmarking 8.4 - Chapter 1/Read-Only workloads&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Fri, 12 Jun 2009 22:21:00 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/26-guid.html</guid>
    
</item>
<item>
    <title>new tech toy ...</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/14-new-tech-toy-....html</link>
            <category>hardware</category>
            <category>life</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/14-new-tech-toy-....html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=14</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=14</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;ok just bought me a new (and rather expensive) &lt;a href=&quot;http://www.us.playstation.com/PS3/About&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;toy&lt;/a&gt; after some &lt;a href=&quot;http://www.madness.at/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;people&lt;/a&gt; said &quot;ok if you buy one today I will buy you a game for that&quot;.&lt;/p&gt;

&lt;p&gt;So I went to the electronic shop and bought one (including the optional remote and two games) but I guess that might have only been the start because now a nice new LCD television would suddenly make sense ...&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Tue, 22 May 2007 20:59:00 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/14-guid.html</guid>
    
</item>
<item>
    <title>weird animals on the postgresql buildfarm</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/13-weird-animals-on-the-postgresql-buildfarm.html</link>
            <category>hardware</category>
            <category>PostgreSQL</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/13-weird-animals-on-the-postgresql-buildfarm.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=13</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=13</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;Today I spent some time doing a bit of maintenance work on some of my &lt;a href=&quot;http://buildfarm.postgresql.org/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;buildfarm&lt;/a&gt; boxes and suddenly I thought it would be nice to present some of the weirder ones to others.&lt;br /&gt;
Of course there is also Magnus&#039; &lt;a href=&quot;http://people.planetpostgresql.org/mha/index.php?/archives/148-pgcon,-pgday-and-blogging.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;complaint&lt;/a&gt; about a lack of blog posts on &lt;a href=&quot;http://www.planetpostgresql.org/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;planetpostgresql.org&lt;/a&gt; that I just had to react to ;-)&lt;/p&gt;

&lt;p&gt;I have a total of 14 registered hosts on the buildfarm, with &lt;a href=&quot;http://buildfarm.postgresql.org/cgi-bin/show_status.pl?member=spoonbill&amp;amp;member=emu&amp;amp;member=cockatoo&amp;amp;member=galah&amp;amp;member=dove&amp;amp;member=lionfish&amp;amp;member=seahorse&amp;amp;member=sponge&amp;amp;member=leveret&amp;amp;member=zebra&amp;amp;member=impala&amp;amp;member=Shad&amp;amp;member=quagga&amp;amp;member=clownfish&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;nine&lt;/a&gt; of them actively reporting, and four of them (those that I think are the weirdest and most interesting ones) are worth mentioning in a bit more detail:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=lionfish&amp;amp;br=HEAD&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;lionfish&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;Hardware: &lt;a href=&quot;http://en.wikipedia.org/wiki/Cobalt_Qube&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Cobalt Qube 2&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;CPU: 250MHz MIPS in little endian mode&lt;/li&gt;
    &lt;li&gt;Bogomips: ~250&lt;/li&gt;
    &lt;li&gt;Memory: 48MB of RAM (+196MB of swap)&lt;/li&gt;
    &lt;li&gt;Disk: 4GB IDE&lt;/li&gt;
    &lt;li&gt;Time to complete a buildfarm run: ~5.5-6 hours (this makes lionfish by far the slowest box on the farm)&lt;/li&gt;
    &lt;li&gt;On the farm since: 2004&lt;/li&gt;
    &lt;li&gt;Operating System: Debian/Sarge 3.1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;issues found by lionfish&lt;/em&gt;&lt;em&gt;:&lt;/em&gt;&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2006-07/msg01543.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;race condition&lt;/a&gt; in ALTER INDEX  RENAME&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://archives.postgresql.org/pgsql-committers/2007-05/msg00035.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;crashes&lt;/a&gt; with the new multiple autovacuum worker code&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2005-08/msg00975.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;buggy&lt;/a&gt; MIPS spinlock code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=quagga&amp;amp;br=HEAD&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;quagga&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;Hardware: &lt;a href=&quot;http://www.allnet.de/product_info_allnet.php?cPath=_&amp;amp;products_id=99967&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;ALLNET6500&lt;/a&gt; (identical to the &lt;a href=&quot;http://www.thecus.com/products_over.php?cid=1&amp;amp;pid=1&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Thecus 2100&lt;/a&gt; but with 256MB of RAM instead of 128MB)&lt;/li&gt;
    &lt;li&gt;CPU: Intel IOP 80219 ARMv5TE running at 600MHz&lt;/li&gt;
    &lt;li&gt;Bogomips: ~250&lt;/li&gt;
    &lt;li&gt;Memory: 256MB DDR-SDRAM&lt;/li&gt;
    &lt;li&gt;Disk: 250GB SATA&lt;/li&gt;
    &lt;li&gt;Time to complete a buildfarm run: ~2.5 hours&lt;/li&gt;
    &lt;li&gt;On the buildfarm since: January 2007&lt;/li&gt;
    &lt;li&gt;Operating System: Debian/Etch 4.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;issues found by quagga&lt;/em&gt;&lt;em&gt;:&lt;/em&gt;&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;tcl upstream &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2007-01/msg00377.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;bug&lt;/a&gt; on ARM and MIPS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=spoonbill&amp;amp;br=HEAD&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;spoonbill&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;Hardware: &lt;a href=&quot;http://sunsolve.sun.com/handbook_pub/Systems/U10/spec.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Sun Ultra 10 Workstation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;CPU: 300MHz UltraSPARC-IIi&lt;/li&gt;
    &lt;li&gt;Memory: 1GB&lt;/li&gt;
    &lt;li&gt;Disk: 40GB IDE&lt;/li&gt;
    &lt;li&gt;Time to complete a buildfarm run: ~1.5 hours&lt;/li&gt;
    &lt;li&gt;On the buildfarm since: at least autumn 2004&lt;/li&gt;
    &lt;li&gt;Operating System: OpenBSD 3.9/Sparc64 (upgraded a few times though)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;issues found by spoonbill&lt;/em&gt;&lt;em&gt;:&lt;/em&gt;&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;gcc (&amp;lt;3.4) &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2004-11/msg00710.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;optimizer bug&lt;/a&gt; on Sparc64 platforms&lt;/li&gt;
    &lt;li&gt;pg_database flushing &lt;a href=&quot;http://archives.postgresql.org/pgsql-committers/2004-11/msg00225.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt; race condition&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2006-06/msg00902.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;portability issues&lt;/a&gt; in the regression test framework on BSD-platforms&lt;/li&gt;
    &lt;li&gt;configure &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2005-08/msg00066.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;portability fixes&lt;/a&gt; for OpenBSD&lt;/li&gt;
    &lt;li&gt;tsearch2 &lt;a href=&quot;http://archives.postgresql.org/pgsql-committers/2005-11/msg00475.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;failures/crashes&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;helped with &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2006-09/msg00564.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;diagnosing&lt;/a&gt; an OpenBSD &lt;a href=&quot;http://marc.info/?l=openbsd-cvs&amp;amp;m=115970282714109&amp;amp;w=2&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;libc bug&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;WAL page split patch &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2007-02/msg00390.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;use after free&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=sponge&amp;amp;br=HEAD&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;sponge&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;Hardware: IBM RS/6000 7046-B50&lt;/li&gt;
    &lt;li&gt;CPU: PowerPC 604e 375MHz&lt;/li&gt;
    &lt;li&gt;Bogomips: ~41&lt;/li&gt;
    &lt;li&gt;Memory: 256MB&lt;/li&gt;
    &lt;li&gt;Disk: 18GB SCSI&lt;/li&gt;
    &lt;li&gt;Time to complete a buildfarm run: 1-1.5 hours&lt;/li&gt;
    &lt;li&gt;On the farm since: spring 2006&lt;/li&gt;
    &lt;li&gt;Operating System: Fedora Core 5/ppc&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;issues found by sponge&lt;/em&gt;&lt;em&gt;:&lt;/em&gt;&lt;/p&gt;


&lt;ul&gt;
    &lt;li&gt;regression test &lt;a href=&quot;http://archives.postgresql.org/pgsql-committers/2006-08/msg00091.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;race condition&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;XML regression test &lt;a href=&quot;http://archives.postgresql.org/pgsql-hackers/2007-02/msg00828.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;problem&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While hardware of that kind is not likely to be found in any serious or performance critical production use (at least I hope so!), this summary clearly shows the importance of the buildfarm as well as the value of having not-so-mainstream boxes there :-)&lt;br /&gt;
I would be interested in getting details on other weird boxes people have on the farm or are planning to add ...&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Sat, 19 May 2007 14:02:00 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/13-guid.html</guid>
    
</item>
<item>
    <title>&quot;cheap&quot; SAN gear </title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/10-cheap-SAN-gear.html</link>
            <category>hardware</category>
            <category>PostgreSQL</category>
            <category>work</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/10-cheap-SAN-gear.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=10</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=10</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;The discussion on using &lt;a href=&quot;http://en.wikipedia.org/wiki/Storage_area_network&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;SAN&lt;/a&gt; vs. &lt;a href=&quot;http://en.wikipedia.org/wiki/Direct_access_storage_device&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;DASD&lt;/a&gt; based storage is nearly a religious war (as can be seen in a lot of discussions on &lt;a href=&quot;http://archives.postgresql.org/pgsql-performance/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;pgsql-performance&lt;/a&gt;) and in many ways similar to the infamous emacs vs. vi debate.&lt;br /&gt;
From personal experience I have found the &lt;a href=&quot;http://www-304.ibm.com/jct01004c/systems/support/supportsite.wss/supportresources?brandind=5000028&amp;amp;familyind=5329604&amp;amp;taskind=1&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;IBM DS4300&lt;/a&gt; and IBM DS4300 Turbo (basically the same as the DS4300 but with more memory/cache and a hefty markup in price) to be quite a reliable and basically maintenance free solution.&lt;/p&gt;

&lt;p&gt;However - for some workloads those types of SAN are not really that appropriate. A DS4300 (which is now withdrawn from marketing) can only do a bit above 100MB/s of sequential IO (nearly independent of the number of disks!) per controller (about 135MB/s if both are used together), which is really not much when one considers how fast modern hard drives are.&lt;/p&gt;

&lt;p&gt;I recently got a SAN Array to play with that looks quite interesting since while expensive it still seems reasonably priced compared to what companies like IBM or others want for &lt;a href=&quot;http://www-03.ibm.com/systems/storage/disk/midrange/index.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;similar gear&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The array I got for testing is basically a non-branded &lt;a href=&quot;http://www.lsi.com/storage_home/products_home/external_raid/3994_storage_system/index.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;LSI/Engenio 3994&lt;/a&gt; with 16 2Gbit 10k 146GB FC drives and 2GB of battery backed cache per controller.&lt;br /&gt;
It is directly connected via two &lt;a href=&quot;http://support.qlogic.com/support/product_resources.asp?id=934&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;QLogic QLE2460 PCI-Express adapters&lt;/a&gt; to an &lt;a href=&quot;http://h10010.www1.hp.com/wwpc/us/en/en/WF05a/15351-15351-3328412-241644-241475-1121516.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;HP DL380 G5&lt;/a&gt; running &lt;a href=&quot;http://www.centos.org&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;CentOS 5&lt;/a&gt; for testing.&lt;br /&gt;
The first impression of the array is a solid one - it looks very familiar to people who are used to the IBM DS4000 storage line and the management GUI is basically the same (with an Engenio logo in place of the IBM one).&lt;br /&gt;
The controller chassis can hold 16 disks (up from 14 in the older designs) in 3U, and the available expansion enclosures have the same capacity and dimensions (up to 6 are supported) and can be added online (untested!) without disruption to ongoing IO.&lt;/p&gt;

&lt;p&gt;Due to the use of disks that are only capable of 2Gbit/s, the speed of the two drive channel loops is also limited to 2Gbit/s (using 4Gbit FC drives it can be configured to use 4Gbit/s on the drive channels).&lt;br /&gt;
The following is not meant as a thorough benchmark of either the array or PostgreSQL but rather as some ad-hoc testing and playing around to get an impression of the overall performance characteristics of the device. All tests were done using ext3 in the default journaling mode (I&#039;m fully aware of the fact that other file systems - especially XFS - might provide noticeably better streaming performance, but I have a much higher level of trust in ext3 and that&#039;s my choice in production environments).&lt;/p&gt;
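
&lt;p&gt;As a rough sanity check on what the two drive channel loops can carry, here is a minimal Python sketch. It assumes the usual rule of thumb for 8b/10b encoded Fibre Channel (roughly 100MB/s of payload per 1Gbit/s of nominal link speed); the function name and the constant are illustrative and not something the array itself reports.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Assumption: 8b/10b encoded Fibre Channel carries roughly 100 MB/s of
# payload per 1 Gbit/s of nominal link speed (rule of thumb, not measured).
PAYLOAD_MB_PER_GBIT = 100


def loop_ceiling_mb(nominal_gbit, loops):
    """Approximate aggregate payload ceiling in MB/s for the drive loops."""
    return nominal_gbit * PAYLOAD_MB_PER_GBIT * loops


if __name__ == "__main__":
    # Two drive channel loops, at the 2Gbit/s and 4Gbit/s settings.
    for gbit in (2, 4):
        print(gbit, "Gbit/s x 2 loops: about", loop_ceiling_mb(gbit, 2), "MB/s")
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under that assumption the two loops together top out at roughly 400MB/s with 2Gbit/s disks and at roughly 800MB/s with 4Gbit/s disks.&lt;/p&gt;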

&lt;p&gt;In the following test (test case 1) we use two volume groups - each a RAID10 (8 disks) - with a RAID0 on top in the OS, and write cache mirroring between the controllers (which keeps both controller caches in sync so that if one controller fails the other one can take over). To utilize both controllers the HBAs are set up so that controller A is using one and controller B the other.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51121  98 188347  76 97426  28 58961  98 378240  38 732.5   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 19599  94 257598 100  8272  27 19270  91 331205  99  4879  17
convm004,16G,51121,98,188347,76,97426,28,58961,98,378240,38,732.5,2,512,19599,94,257598,100,8272,27,19270,91,331205,99,4879,17&lt;/code&gt;&lt;/pre&gt;
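
&lt;p&gt;The last line of each bonnie++ run repeats the table in comma separated form, which is handy for comparing runs. A minimal Python sketch for reading such a line could look like the following; the field names are illustrative labels that simply follow the column order of the table above and are not taken from the bonnie++ documentation.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Field names are illustrative labels, following the column order of the
# human-readable bonnie++ 1.03 table (they are not official names).
FIELDS = [
    "machine", "size",
    "out_chr_k", "out_chr_cpu", "out_block_k", "out_block_cpu",
    "rewrite_k", "rewrite_cpu",
    "in_chr_k", "in_chr_cpu", "in_block_k", "in_block_cpu",
    "seeks", "seeks_cpu", "files",
    "seq_create", "seq_create_cpu", "seq_read", "seq_read_cpu",
    "seq_delete", "seq_delete_cpu",
    "rnd_create", "rnd_create_cpu", "rnd_read", "rnd_read_cpu",
    "rnd_delete", "rnd_delete_cpu",
]


def parse_bonnie_csv(line):
    """Turn one bonnie++ CSV summary line into a dict of named values."""
    values = line.strip().split(",")
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of fields: %d" % len(values))
    row = dict(zip(FIELDS, values))
    # Everything except the machine name and the size label is numeric.
    for key in FIELDS[2:]:
        row[key] = float(row[key])
    return row


if __name__ == "__main__":
    line = ("convm004,16G,51121,98,188347,76,97426,28,58961,98,378240,38,"
            "732.5,2,512,19599,94,257598,100,8272,27,19270,91,331205,99,4879,17")
    row = parse_bonnie_csv(line)
    print("block write K/sec:", row["out_block_k"])
    print("block read K/sec:", row["in_block_k"])
&lt;/code&gt;&lt;/pre&gt;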

&lt;p&gt;and the same with write mirroring disabled for both logical volumes (test case 2):&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51372  99 235020  96 122183  35 58880  98 369848  37 723.0   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 19888  95 256732  99 10037  32 19704  93 332286  99  5541  19
convm004,16G,51372,99,235020,96,122183,35,58880,98,369848,37,723.0,1,512,19888,95,256732,99,10037,32,19704,93,332286,99,5541,19&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;so write cache mirroring seems to carry a roughly 20% penalty for sequential block writes and rewriting but has not much impact on the other tests - so it might be worth keeping it turned on due to the additional data integrity guarantees it provides.&lt;br /&gt;
It further seems that the device is bottlenecked by the speed of the drive channels (there are two loops in the array head, with half the drives on one and the other half on the other) due to the 2Gbit disks.&lt;br /&gt;
But it also shows that the device seems to scale fairly well - until it hits the bandwidth limit - at least for RAID10.&lt;/p&gt;
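
&lt;p&gt;The penalty can be checked directly from the block write and rewrite columns of test case 1 and test case 2. The following Python sketch only redoes that arithmetic; the helper name is illustrative.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Block write and rewrite K/sec copied from test case 1 (mirroring on)
# and test case 2 (mirroring off).
MIRRORED = {"block write": 188347, "rewrite": 97426}
UNMIRRORED = {"block write": 235020, "rewrite": 122183}


def penalty_percent(with_mirroring, without_mirroring):
    """Throughput lost with mirroring on, in percent of the unmirrored rate."""
    return (1 - with_mirroring / without_mirroring) * 100


if __name__ == "__main__":
    for name in MIRRORED:
        loss = penalty_percent(MIRRORED[name], UNMIRRORED[name])
        print("%s: %.1f%% slower with write cache mirroring" % (name, loss))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Block writes come out a bit under 20% slower and rewrites a bit over 20% slower with mirroring enabled.&lt;/p&gt;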

&lt;p&gt;and now for comparison a test using only one volume group and a single controller (test case 3):&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51346  98 134822  56 69414  17 58651  97 251779  23 758.8   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 20048  91 259694  99  5846  18 18707  84 338671  99  2935   9
convm004,16G,51346,98,134822,56,69414,17,58651,97,251779,23,758.8,1,512,20048,91,259694,99,5846,18,18707,84,338671,99,2935,9&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;so let&#039;s see what PostgreSQL is able to do in terms of sequential IO on such a device:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;bench=# select version();
                                                   version                                                    
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3devel on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)

bench=# &lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;simple sequential scan on a large table (pgbench schema generated with a scale of 10000) using only a single controller (same setup as in test case 3):&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;bench=# select count(1) from accounts;
   count    
------------
 1000000000
(1 row)

Time: 619865.939 ms
bench=# select pg_relation_size(&#039;accounts&#039;)/619::float;
     ?column?     
------------------
 216998258.558966
(1 row)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;so we are getting about 215MB/s out  of 250MB/s which looks ok.&lt;/p&gt;
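
&lt;p&gt;For reference, the same rate can be worked out from the full elapsed time instead of the rounded 619 seconds. This is a minimal Python sketch: the table size is back-calculated from the query output above, and 1 MB is assumed to mean 10**6 bytes.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Table size recovered from the query output above:
# 216998258.558966 bytes/s * 619 s = 134321922048 bytes.
RELATION_BYTES = 134321922048


def scan_rate_mb(relation_bytes, elapsed_ms):
    """Sequential scan rate in MB/s (1 MB taken as 10**6 bytes)."""
    return relation_bytes / (elapsed_ms / 1000.0) / 10**6


if __name__ == "__main__":
    # Elapsed time as reported by psql for the count(1) query.
    print("%.1f MB/s" % scan_rate_mb(RELATION_BYTES, 619865.939))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That works out to a little under 217MB/s, so rounding the elapsed time to whole seconds makes no practical difference here.&lt;/p&gt;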

&lt;p&gt;so what happens with software RAID0 over two 8 disk RAID10 volume groups on different controllers (same setup as test case 1):&lt;/p&gt;
 
&lt;pre&gt;&lt;code&gt;bench=# select count(1) from accounts;
   count    
------------
 1000000000
(1 row)

Time: 478785.617 ms
bench=# select pg_relation_size(&#039;accounts&#039;)/478::float;
     ?column?     
------------------
 281008205.121339
(1 row)

Time: 265.791 ms&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;so that is more interesting - it seems that PostgreSQL is getting CPU bottlenecked(the array/file system can do &gt;370MB/s) here and those ~280MB/s are pretty much in line with what &lt;a href=&quot;http://archives.postgresql.org/pgsql-performance/2006-12/msg00448.php&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Luke&lt;/a&gt; usually quotes (PostgreSQL getting CPU bottlenecked at around 300MB/s even on very fast AMD Opteron based boxes).&lt;/p&gt;

&lt;p&gt;for those curious here are some other random tests (uncommented, so judge for yourself):&lt;/p&gt;

&lt;p&gt;single RAID5 with 4 logical volumes (each 500GB) and software RAID0 in the OS - two volumes per channel:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 50978  99 208098  82 89274  25 59058  98 236993  24 488.5   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 20860  94 258368  99  7770  25 21087  95 335918 100  4291  14
convm004,16G,50978,99,208098,82,89274,25,59058,98,236993,24,488.5,1,512,20860,94,258368,99,7770,25,21087,95,335918,100,4291,14
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A single RAID5 array over all 16 disks and two identically sized logical volumes each around 1TB in size.&lt;/p&gt;

&lt;p&gt;bonnie++:&lt;/p&gt;

&lt;p&gt;on one LUN:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51245  98 121190  49 69406  17 56902  94 256111  22 840.9   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 20781  94 235507  91  7233  23 18685  84 338017  99  4125  14
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;using both LUNs and software RAID0:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51423  99 204881  84 83740  23 59040  98 232573  23 554.7   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 20303  93 259481  99  7230  23 20357  93 337312 100  3793  13
convm004,16G,51423,99,204881,84,83740,23,59040,98,232573,23,554.7,1,512,20303,93,259481,99,7230,23,20357,93,337312,100,3793,13
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;with disabled write cache mirroring:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
convm004        16G 51751  99 242637  97 105392  30 58859  98 235541  23 563.2   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 21034  95 255377  99  6485  21 19269  88 337095  99  4119  14
convm004,16G,51751,99,242637,97,105392,30,58859,98,235541,23,563.2,1,512,21034,95,255377,99,6485,21,19269,88,337095,99,4119,14&lt;/code&gt;&lt;/pre&gt;

 
    </content:encoded>

    <pubDate>Mon, 30 Apr 2007 16:51:00 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/10-guid.html</guid>
    
</item>
<item>
    <title>fixing e1000 TX transmit timeouts (at least some of them)</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/8-fixing-e1000-TX-transmit-timeouts-at-least-some-of-them.html</link>
            <category>hardware</category>
            <category>work</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/8-fixing-e1000-TX-transmit-timeouts-at-least-some-of-them.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=8</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=8</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;While trying to put new hardware into production today I found that the box (running Debian Etch/i386 with a 2.6.18 based kernel) would start to drop network connections during large transfers at Gigabit speeds. &lt;br /&gt;
Simply copying a large file over with scp from a nearby box would result in stalled transfers and a large number of &quot;TX unit hang&quot; errors appearing in the kernel log, as well as debugging output similar to:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Mar 29 17:30:05 xxx kernel:   Tx Queue             &amp;lt;0&amp;gt;
Mar 29 17:30:05 xxx kernel:   TDH                  &amp;lt;56&amp;gt;
Mar 29 17:30:05 xxx kernel:   TDT                  &amp;lt;57&amp;gt;
Mar 29 17:30:05 xxx kernel:   next_to_use          &amp;lt;57&amp;gt;
Mar 29 17:30:05 xxx kernel:   next_to_clean        &amp;lt;56&amp;gt;
Mar 29 17:30:05 xxx kernel: buffer_info[next_to_clean]
Mar 29 17:30:05 xxx kernel:   time_stamp           &amp;lt;1eed17&amp;gt;
Mar 29 17:30:05 xxx kernel:   next_to_watch        &amp;lt;56&amp;gt;
Mar 29 17:30:05 xxx kernel:   jiffies              &amp;lt;1eefd1&amp;gt;
Mar 29 17:30:05 xxx kernel:   next_to_watch.status &amp;lt;0&amp;gt;
Mar 29 17:30:07 xxx kernel:   Tx Queue             &amp;lt;0&amp;gt;
Mar 29 17:30:07 xxx kernel:   TDH                  &amp;lt;56&amp;gt;
Mar 29 17:30:07 xxx kernel:   TDT                  &amp;lt;57&amp;gt;
Mar 29 17:30:07 xxx kernel:   next_to_use          &amp;lt;57&amp;gt;
Mar 29 17:30:07 xxx kernel:   next_to_clean        &amp;lt;56&amp;gt;
Mar 29 17:30:07 xxx kernel: buffer_info[next_to_clean]&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The NIC in question is an embedded Intel 82573E/L on a &lt;a href=&quot;http://www.supermicro.com/products/motherboard/Xeon3000/3010/PDSM4+.cfm&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Supermicro PDSM4+&lt;/a&gt; with the latest BIOS update available (1.2):&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;0d:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
0e:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A fair bit of research turned up this small &lt;a href=&quot;http://e1000.sourceforge.net/files/fixeep-82573-dspd.sh&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;shell script&lt;/a&gt;. The script basically greps the output of ethtool -e interface (which dumps the EEPROM contents) and flips a bit in the EEPROM:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;~# sh fixeep-82573-dspd.sh eth0
eth0: is a &amp;quot;82573E Gigabit Ethernet Controller&amp;quot;
This fixup is applicable to your hardware
executing command: ethtool -E eth0 magic 0x108c8086 offset 0x1e value 0xdf
Change made. You *MUST* reboot your machine before changes take effect!&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After the reboot the NIC works fine - no more stalls and transmit timeouts ...&lt;/p&gt;
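
&lt;p&gt;For those curious about what the command line in the script output is made of, here is a rough Python sketch that rebuilds it without running anything. It rests on assumptions that are not confirmed by the script itself: the magic value appears to be the PCI device ID in the upper 16 bits and the Intel vendor ID 0x8086 in the lower 16 bits, and the byte at offset 0x1e is assumed to have been 0xff with bit 5 being the one that gets cleared. All function names are made up for illustration.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import shlex

INTEL_VENDOR_ID = 0x8086
DEVICE_ID_82573E = 0x108C  # ASSUMPTION: read off the magic value in the log


def eeprom_magic(vendor_id, device_id):
    """Combine the PCI IDs: device ID in the upper 16 bits, vendor ID below."""
    return device_id * 0x10000 + vendor_id


def clear_bit(value, bit):
    """Return value with the given bit number cleared (arithmetic form)."""
    mask = 2 ** bit
    return value - mask if (value // mask) % 2 else value


def ethtool_write_command(iface, magic, offset, value):
    """Build (but do not run) the ethtool -E command line."""
    args = ["ethtool", "-E", iface, "magic", hex(magic),
            "offset", hex(offset), "value", hex(value)]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    old_byte = 0xFF  # ASSUMPTION: hypothetical EEPROM byte at offset 0x1e
    new_byte = clear_bit(old_byte, 5)  # ASSUMPTION: bit 5 is the one cleared
    magic = eeprom_magic(INTEL_VENDOR_ID, DEVICE_ID_82573E)
    print(ethtool_write_command("eth0", magic, 0x1E, new_byte))&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under those assumptions the sketch prints the same ethtool -E line that the script reported executing. Writing to the EEPROM for real should be left to the script, which first checks that the fixup applies to the hardware.&lt;/p&gt;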

 
    </content:encoded>

    <pubDate>Thu, 29 Mar 2007 19:38:39 +0200</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/8-guid.html</guid>
    
</item>
<item>
    <title>the art^pain of system monitoring</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/7-the-artpain-of-system-monitoring.html</link>
            <category>hardware</category>
            <category>PostgreSQL</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/7-the-artpain-of-system-monitoring.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=7</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=7</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;System monitoring is both an art and a pain. It is nice to have pretty graphs that show what&#039;s going on with a server or a service, as well as something that does proper notification of current or potential issues, but on the other hand there is also a lot of pain and (boring) work involved in getting this up and running properly.&lt;/p&gt;

&lt;p&gt;I&#039;m quite a fan of doing proper and detailed monitoring of systems - and after the latest issues with &lt;a href=&quot;http://www.kaltenbrunner.cc/blog/index.php?/archives/2-tribble-down-or-the-sudden-freeze-of-a-server.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;tribble&lt;/a&gt; I took a stab at improving the monitoring of that box but - well tribble is running FreeBSD and doing hardware related monitoring (vs. checking for things in the OS) is often more difficult there for various reasons.&lt;/p&gt;

&lt;p&gt;The first thing I wanted to get monitored is the hardware itself - modern servers usually carry some sort of BMC (Baseboard Management Controller) or an even more sophisticated solution (&lt;a href=&quot;http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-50116&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;RSAII&lt;/a&gt;, &lt;a href=&quot;http://h18000.www1.hp.com/products/servers/management/ilo/index.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;iLO&lt;/a&gt; - just to name a few) that is basically a small independent computer on the mainboard.&lt;/p&gt;

&lt;p&gt;Accessing the data those BMCs can provide is often done through complex, binary-only drivers available only for Microsoft Windows and a limited number of commercially supported Linux distributions (and some of them are even bloated Java based GUI things) - however in the last few years a standards based solution to that kind of task has appeared - Intelligent Platform Management Interface (IPMI).&lt;/p&gt;

&lt;p&gt;IPMI provides a standardized interface to manage and monitor servers even in the absence(!) of an operating system - it is a cool idea, though in practice it bears a lot of similarity to ACPI in the sense that every vendor implements it a bit differently and especially early implementations are buggy as hell.&lt;br /&gt;
Luckily for us tribble is running &lt;sup&gt;&lt;a href=&quot;http://www.freebsd.org&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; FreeBSD 6.2, which is the first FreeBSD release to support &lt;a href=&quot;http://www.freebsd.org/cgi/man.cgi?query=ipmi&amp;amp;apropos=0&amp;amp;sektion=4&amp;amp;manpath=FreeBSD+6.2-stable&amp;amp;format=html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;ipmi(4)&lt;/a&gt; despite the fact that the man page claims it got added in 7.0 ...&lt;/p&gt;

&lt;p&gt;For integration into the postgresql.org monitoring infrastructure I hacked up a small &lt;a href=&quot;http://www.nagios.org&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;nagios&lt;/a&gt; check script which simply calls &lt;a href=&quot;http://ipmitool.sourceforge.net/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;ipmitool&lt;/a&gt; and looks for interesting output.&lt;/p&gt;

&lt;p&gt;Sample output of that script for tribble looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;[stefan@tribble ~]$ sudo /usr/local/libexec/nagios/check_ipmi
OK - IPMI: (Ambient_Temp = 23 degrees C, CPU_1_Temp = 34 degrees C, CPU_2_Temp = 34 degrees 
C, DASD_Temp = 31 degrees C, Fan_10_Presence = 0x02, Fan_10_Tach = 1830 RPM, Fan_11_Presence 
= 0x02, Fan_11_Tach = 1800 RPM, Fan_12_Presence = 0x02, Fan_12_Tach = 1740 RPM, 
Fan_1_Presence = 0x02, Fan_1_Tach = 1710 RPM, Fan_2_Presence = 0x02, Fan_2_Tach = 1650 RPM, 
Fan_3_Presence = 0x02, Fan_3_Tach = 1830 RPM, Fan_4_Presence = 0x02, Fan_4_Tach = 1830 RPM, 
Fan_5_Presence = 0x02, Fan_5_Tach = 1890 RPM, Fan_6_Presence = 0x02, Fan_6_Tach = 1680 RPM, 
Fan_7_Presence = 0x02, Fan_7_Tach = 1680 RPM, Fan_8_Presence = 0x02, Fan_8_Tach = 1680 RPM, 
Fan_9_Presence = 0x02, Fan_9_Tach = 1800 RPM, PS_1_Fan_Fault = 0x01, PS_1_Status = 0x01, 
PS_2_Fan_Fault = 0x01, PS_2_Status = 0x01)&lt;/code&gt;&lt;/pre&gt;

which is a bit verbose but I will work on that later ;-)&lt;br /&gt;
 &lt;p&gt;The script will also check the System Event Log (SEL) - which is basically a small NVRAM backed memory on the BMC holding all kinds of hardware monitoring events - for entries (in this case there are none) and will return a warning if it finds something.&lt;/p&gt;
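
&lt;p&gt;The logic of such a check can be sketched in a few lines of Python. This is not the actual check_ipmi script, just an illustration under stated assumptions: sensor lines are assumed to arrive as pipe separated name, reading and status columns, every non-empty SEL line is assumed to be one logged event, and all function names are made up. The only fixed part is the usual nagios plugin convention of exit codes 0 to 3 for OK, WARNING, CRITICAL and UNKNOWN.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_sensors(text):
    """Parse lines of the assumed form 'name | reading | status'."""
    sensors = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and parts[0]:
            sensors.append((parts[0].replace(" ", "_"), parts[1], parts[2]))
    return sensors


def check_ipmi(sensor_text, sel_text):
    """Return (exit_code, message) for a sensor listing and a SEL listing."""
    sensors = parse_sensors(sensor_text)
    if not sensors:
        return UNKNOWN, "UNKNOWN - IPMI: no sensor data"
    summary = ", ".join(f"{name} = {reading}" for name, reading, _ in sensors)
    # ASSUMPTION: any status other than "ok" counts as a problem.
    failed = [name for name, _, status in sensors if status != "ok"]
    if failed:
        return CRITICAL, "CRITICAL - IPMI: " + ", ".join(failed) + " not ok (" + summary + ")"
    # ASSUMPTION: every non-empty SEL line is one logged event.
    events = [line for line in sel_text.splitlines() if line.strip()]
    if events:
        return WARNING, f"WARNING - IPMI: {len(events)} SEL entries ({summary})"
    return OK, f"OK - IPMI: ({summary})"


if __name__ == "__main__":
    # Hypothetical input; the readings are borrowed from the output above.
    sample = "Ambient Temp | 23 degrees C | ok\nCPU 1 Temp | 34 degrees C | ok"
    code, message = check_ipmi(sample, "")
    print(message)  # a real plugin would also pass code to sys.exit()&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Keeping the parsing separate from the call to ipmitool makes it possible to test the check against saved output without touching the BMC.&lt;/p&gt;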

&lt;p&gt;OK, now that we have the basic hardware covered only one major thing is left - the monitoring of the integrated IBM ServeRAID 7k adapter which has two arrays (a 2 disk RAID 1 for the OS and related data and a 4 disk RAID 10 for the VMs).&lt;/p&gt;

&lt;p&gt;Monitoring hardware RAID is a delicate thing on most BSDs (though OpenBSD made some promising progress on that front lately) - the lack of vendor support often results in only rudimentary drivers at best and useful tools to check the array status or even initiate rebuilds are often simply not available.&lt;br /&gt;
A bit of research turned up the following &lt;a href=&quot;http://lists.freebsd.org/pipermail/freebsd-scsi/2006-February/002304.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;post&lt;/a&gt; on the freebsd-scsi mailing list.&lt;br /&gt;
Once compiled this tool indeed gives basic information about the status of ips(4) based RAID controllers on FreeBSD - wrapping it once again into a nagios compatible check script results in:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;[stefan@tribble ~]$ sudo /usr/local/libexec/nagios/check_raid
OK: /dev/ips0 - Volume: 0, ArrayState: OK; Volume: 1, ArrayState: OK; &lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So at the end of the day we have nice hardware monitoring for at least one of the project&#039;s servers - but there is still a lot to do in the future ...&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Thu, 01 Mar 2007 19:59:20 +0100</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/7-guid.html</guid>
    
</item>
<item>
    <title>LSIlogic MegaRAID SAS and the self explaining CLI</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/4-LSIlogic-MegaRAID-SAS-and-the-self-explaining-CLI.html</link>
            <category>hardware</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/4-LSIlogic-MegaRAID-SAS-and-the-self-explaining-CLI.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=4</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=4</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;I have been playing with an &lt;a href=&quot;http://www.lsilogic.com/storage_home/products_home/internal_raid/megaraid_sas/megaraid_sas_8480e/index.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;LSILogic MegaRAID 8480E&lt;/a&gt; lately (with 24 disks in two IBM EXP3000 Enclosures attached).&lt;br /&gt;
Those who have used SCSI based products from said company before might know that there was a useful little curses based tool on Linux for managing arrays (I think even Dell shipped it for their LSI based PERCs).&lt;br /&gt;
But hey - times have changed and SAS is the new hip thing now and new technology requires new tools ...&lt;br /&gt;
So what did it get replaced with? Well, there is the inevitable &lt;a href=&quot;http://www.lsi.com/files/support/rsa/MR_SAS_1.0/Linux_LSI_MSM_1.13-07.zip&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Java based GUI monster&lt;/a&gt; called MegaRAID Storage Manager that nobody really wants to have on a server, and then there is MegaCLI.&lt;br /&gt;
&lt;a href=&quot;http://www.lsi.com/files/support/rsa/MR_SAS_1.0/Linux_MegaCli_1.01.09.zip&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;MegaCLI&lt;/a&gt; comes as an RPM containing only a single statically linked 32 bit Linux binary (since I&#039;m running Debian here I just used alien to extract the binary).&lt;br /&gt;
Given that the package comes with NO documentation, the &quot;-help&quot; output was a &lt;b&gt;bit&lt;/b&gt; irritating:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;                                     

      MegaCLI SAS RAID Management Tool  Ver 1.01.09 May 25, 2006

    (c)Copyright 2006, LSI Logic Corporation, All Rights Reserved.
MegaCli -v 
MegaCli -help|-h|? 
MegaCli -adpCount 
MegaCli -AdpSetProp {CacheFlushInterval -val}|{ RebuildRate -val} 
    |{PatrolReadRate -val}|{BgiRate -val}|{CCRate -val} 
    |{ReconRate -val}|{SpinupDriveCount -val}|{SpinupDelay -val} 
    |{CoercionMode -val}|{ClusterEnable -val}|{PredFailPollInterval -val} 
    |{BatWarnDsbl -val} |{EccBucketSize -val} | {EccBucketLeakRate -val} 
    | AlarmEnbl | AlarmDsbl | AlarmSilence -aN|-a0,1,2|-aALL 
MegaCli -AdpGetProp CacheFlushInterval | RebuildRate | PatrolReadRate | BgiRate 
    | CCRate | ReconRate | SpinupDriveCount | SpinupDelay | CoercionMode 
    | PredFailPollInterval | EccBucketSize | EccBucketLeakRate | EccBucketCount
    | ClusterEnable | BatWarnDsbl | AlarmDsply -aN|-a0,1,2|-aALL 
MegaCli -AdpAllInfo -aN|-a0,1,2|-aALL  
MegaCli -AdpGetTime -aN|-a0,1,2|-aALL  
MegaCli -AdpSetTime yyyymmdd hh:mm:ss -aN   
MegaCli -AdpSetVerify -f fileName -aN|-a0,1,2|-aALL  
MegaCli -AdpBIOS {-Enbl [SOE|BE]}|-Dsbl|-Dsply -aN|-a0,1,2|-aALL 
MegaCli -AdpBootDrive {-Set -Lx}|-Get -aN|-a0,1,2|-aALL 
MegaCli -AdpAutoRbld -Enbl|-Dsbl|-Dsply -aN|-a0,1,2|-aALL
MegaCli -AdpCacheFlush -aN|-a0,1,2|-aALL
MegaCli -AdpPR -Dsbl|EnblAuto|EnblMan|Start|Stop|Info|{SetDelay Val} 
         -aN|-a0,1,2|-aALL
MegaCli -FwTermLog -BBUoff|BBUoffTemp|BBUon|BBUGet|Dsply|Clear -aN|-a0,1,2|-aALL
MegaCli -AdpDiag [val] -aN|-a0,1,2|-aALL
          val - Time in second.
MegaCli -AdpBatTest -aN|-a0,1,2|-aALL
MegaCli -PDList -aN|-a0,1,2|-aALL 
MegaCli -PDGetNum -aN|-a0,1,2|-aALL 
MegaCli -pdInfo -PhysDrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL  
MegaCli -PDOnline  -PhysDrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PDOffline -PhysDrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PDMakeGood -PhysDrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PDHSP {-Set [-Dedicated [-ArrayN|-Array0,1,2...]] [-EnclAffinity] [-nonRevertible]} 
         |-Rmv -PhysDrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PDRbld -Start|-Stop|-ShowProg |-ProgDsply 
        -PhysDrv [E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL  
MegaCli -PDClear -Start|-Stop|-ShowProg |-ProgDsply 
        -PhysDrv [E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL  
MegaCli -PdLocate {[-start] | -stop} -physdrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PdMarkMissing -physdrv[E0:S0,E1:S1,...] -aN|-a0,1,2|-aALL 
MegaCli -PdGetMissing -aN|-a0,1,2|-aALL 
MegaCli -PdReplaceMissing -physdrv[E0:S0] -arrayA, -rowB -aN 
MegaCli -PdPrpRmv [-UnDo] -physdrv[E0:S0] -aN|-a0,1,2|-aALL  
MegaCli -EncInfo -aN|-a0,1,2|-aALL 
MegaCli -PhyInfo -phyM -aN|-a0,1,2|-aALL  
MegaCli -LDInfo -Lx|-L0,1,2|-Lall -aN|-a0,1,2|-aALL 
MegaCli -LDSetProp  {-Name LdNamestring} | -RW|RO|Blocked | WT|WB|RA|NORA|ADRA 
        | Cached|Direct | -EnDskCache|DisDskCache -Lx|-L0,1,2|-Lall -aN|-a0,1,2|-aALL 
MegaCli -LDGetProp  -Cache | -Access | -Name | -DskCache -Lx|-L0,1,2|-LALL  
        -aN|-a0,1,2|-aALL 
MegaCli -LDInit {-Start [-full]}|-Abort|-ShowProg|-ProgDsply -Lx|-L0,1,2|-LALL -aN|-a0,1,2|-aALL 
MegaCli -LDCC -Start|-Abort|-ShowProg|-ProgDsply -Lx|-L0,1,2|-LALL -aN|-a0,1,2|-aALL 
MegaCli -LDBI -Enbl|-Dsbl|-getSetting|-Abort|-ShowProg|-ProgDsply -Lx|-L0,1,2|-LALL -aN|-a0,1,2|-aALL  
MegaCli -LDRecon {-Start -rX [{-Add | -Rmv} -Physdrv[E0:S0,...]]}|-ShowProg|-ProgDsply 
        -Lx -aN 
MegaCli -LdPdInfo -aN|-a0,1,2|-aALL 
MegaCli -LDGetNum -aN|-a0,1,2|-aALL 
MegaCli -CfgLdAdd -rX[E0:S0,E1:S1,...] [WT|WB] [NORA|RA|ADRA] [Direct|Cached]
        [-szXXX [-szYYY ...]] [-strpszM] [-Hsp[E0:S0,...]] [-AfterLdX] -aN 
MegaCli -CfgEachDskRaid0 [WT|WB] [NORA|RA|ADRA] [Direct|Cached][-strpszM] -aN|-a0,1,2|-aALL
MegaCli -CfgClr -aN|-a0,1,2|-aALL 
MegaCli -CfgDsply -aN|-a0,1,2|-aALL 
MegaCli -CfgLdDel -LX|-L0,2,5...|-LALL -aN|-a0,1,2|-aALL 
MegaCli -CfgFreeSpaceinfo -aN|-a0,1,2|-aALL 
MegaCli -CfgSpanAdd -r10 -Array0[E0:S0,E1:S1] -Array1[E0:S0,E1:S1] [-ArrayX[E0:S0,E1:S1] ...] -aN 
MegaCli -CfgSpanAdd -r50 -Array0[E0:S0,E1:S1,E2:S2,...] -Array1[E0:S0,E1:S1,E2:S2,...] 
        [-ArrayX[E0:S0,E1:S1,E2:S2,...] ...] [WT|WB] [NORA|RA|ADRA] [Direct|Cached] 
        [-strpszM] -aN 
MegaCli -CfgSave -f filename -aN   
MegaCli -CfgRestore -f filename -aN   
MegaCli -CfgForeign -Scan -aN|-a0,1,2|-aALL    
MegaCli -CfgForeign -Dsply [x] -aN|-a0,1,2|-aALL    
MegaCli -CfgForeign -Preview [x] -aN|-a0,1,2|-aALL    
MegaCli -CfgForeign -Import [x] -aN|-a0,1,2|-aALL    
MegaCli -CfgForeign -Clear [x] -aN|-a0,1,2|-aALL    
        x - index of foreign configurations. Optional. All by default. 
MegaCli -AdpEventLog -GetEventLogInfo -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -GetEvents         -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -GetSinceShutdown  -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -GetSinceReboot    -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -IncludeDeleted    -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -GetLatest n -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpEventLog -Clear -aN|-a0,1,2|-aALL 
MegaCli -AdpBbuCmd -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -GetBbuStatus -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -GetBbuProperties -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -BbuLearn -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -BbuMfgSleep -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -BbuMfgSeal -aN|-a0,1,2|-aALL  
MegaCli -AdpBbuCmd -SetBbuProperties -f &amp;lt;fileName&amp;gt; -aN|-a0,1,2|-aALL 
MegaCli -AdpFacDefSet -aN 
MegaCli -AdpFwFlash -f filename [-NoSigChk] [-NoVerChk] -aN|-a0,1,2|-aALL  &lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Somebody who knows a bit about the underlying technology and has worked with previous SCSI based LSI products can actually guess most of these - but guessing is not what one should have to do with enterprise class RAID hardware that is often used to protect valuable data.&lt;/p&gt;

LSI really needs to look into bundling proper docs with this tool because &quot;let&#039;s guess what this cryptic switch means&quot; is NOT appropriate at all.&lt;br /&gt;
 &lt;p&gt;Oh - by the way there is a &lt;a href=&quot;http://www.lsilogic.com/files/support/rsa/MR_SAS_1.0/Linux_MegaCli_1.01.09.txt&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;README&lt;/a&gt; linked on the lsilogic website for that tool - but guess what? Except for a bit of revision history it only contains the very same output I showed above ...&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Sun, 18 Feb 2007 09:33:00 +0100</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/4-guid.html</guid>
    
</item>
<item>
    <title>factory defaulting an IBM Remote Supervisor II Slimline (RSAII) card</title>
    <link>http://www.kaltenbrunner.cc/blog/index.php?/archives/3-factory-defaulting-an-IBM-Remote-Supervisor-II-Slimline-RSAII-card.html</link>
            <category>hardware</category>
    
    <comments>http://www.kaltenbrunner.cc/blog/index.php?/archives/3-factory-defaulting-an-IBM-Remote-Supervisor-II-Slimline-RSAII-card.html#comments</comments>
    <wfw:comment>http://www.kaltenbrunner.cc/blog/wfwcomment.php?cid=3</wfw:comment>

    <wfw:commentRss>http://www.kaltenbrunner.cc/blog/rss.php?version=2.0&amp;type=comments&amp;cid=3</wfw:commentRss>
    

    <author>nospam@example.com (Stefan Kaltenbrunner)</author>
    <content:encoded>
    &lt;p&gt;This is actually pretty easy to do but the information on how to do it is pretty well hidden on the IBM website.&lt;br /&gt;
The first thing one needs is &lt;a href=&quot;http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-55020&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;ASU&lt;/a&gt;, IBM&#039;s Advanced Settings Utility. The other component needed is the &lt;a href=&quot;http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-64585&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Remote Supervisor Adapter II USB Daemon&lt;/a&gt;. &lt;br /&gt;
After compiling and loading the driver (though not officially supported it works fine on Debian Etch/amd64) one can simply:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;[root@somewhere ~]# ./asu resetrsa
Resetting RSA/RSA2..........done&lt;/code&gt;&lt;/pre&gt;

 
    </content:encoded>

    <pubDate>Fri, 16 Feb 2007 15:21:59 +0100</pubDate>
    <guid isPermaLink="false">http://www.kaltenbrunner.cc/blog/index.php?/archives/3-guid.html</guid>
    
</item>

</channel>
</rss>