Monday, June 18, 2007

Rebuilding a 3Ware RAID set in Linux

This information is specific to the 3Ware 9500 series controller (more specifically, the 9500-4LP). However, the 3Ware CLI seems to be the same on the other 3Ware 9XXX controllers I have worked with (the 9550 for sure).


Under Linux, the 3Ware cards can be managed with the "tw_cli" command. (The CLI tools can be downloaded for free from 3Ware's support website.)

A healthy RAID set looks like this:

dev306:~# /opt/3Ware/bin/tw_cli
//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     OK               u0     372.61 GB   781422768     3PM0Q3B7


A RAID set with a failed disk looks like this (the unit and the bad port both show DEGRADED):

dev306:~# /opt/3Ware/bin/tw_cli
//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    DEGRADED       -      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     DEGRADED         u0     372.61 GB   781422768     3PM0Q3B7
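
The two listings above differ only in their Status columns, so the check is easy to script. Here is a minimal sketch, not something from the original setup: check_3ware is a hypothetical helper name, and the usage line assumes tw_cli accepts "info c0" as command-line arguments instead of at its interactive prompt. It treats any status other than OK as needing attention, which may be stricter than you want.

```shell
#!/bin/sh
# check_3ware (hypothetical helper): reads `info c0`-style text on stdin,
# prints every unit or port whose Status is not OK, and returns 1 if any is found.
check_3ware() {
    awk '
        # Unit rows start with u<N> and carry Status in field 3;
        # port rows start with p<N> and carry Status in field 2.
        $1 ~ /^u[0-9]+$/ && $3 != "OK" { print "unit " $1 " is " $3; bad = 1 }
        $1 ~ /^p[0-9]+$/ && $2 != "OK" { print "port " $1 " is " $2; bad = 1 }
        END { exit bad + 0 }
    '
}

# Assumed non-interactive usage, e.g. from cron:
#   /opt/3Ware/bin/tw_cli info c0 | check_3ware || echo "RAID needs attention"
```

Header and separator lines are ignored because they do not start with u<N> or p<N>.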


Now I will remove this bad disk from the RAID set:


//dev306> maint remove c0 p3
Exporting port /c0/p3 ... Done.




I now need to physically replace the bad drive. Unfortunately, since our vendor wired some of our cables cockeyed, the port number does not always match the drive bay. I will usually generate some I/O on the disks at this point to see which of the four disks is actually bad. (Hint: the one with no activity light is the bad one.)


dev306:~# find /opt -type f -exec cat '{}' > /dev/null \;


With the bad disk identified and replaced, I now need to go back into the 3Ware CLI, find the new disk, and tell the array to start rebuilding.


dev306:~# /opt/3Ware/bin/tw_cli
//dev306> maint rescan
Rescanning controller /c0 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [/c0/p3].


//dev306> maint rebuild c0 u0 p3
Sending rebuild start request to /c0/u0 on 1 disk(s) [3] ... Done.

//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    REBUILDING     0      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     DEGRADED         u0     372.61 GB   781422768     3PM0Q3B7


Note that p3 still shows a status of "DEGRADED", but the array itself is now "REBUILDING". Under minimal I/O load, a RAID-5 set with 400GB disks such as this one takes about 2.5 hours to rebuild.
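
Since the rebuild takes a couple of hours, it can be handy to poll the %Cmpl column instead of re-running "info c0" by hand. This is a rough sketch only: unit_progress and wait_for_rebuild are hypothetical names, the 300-second interval is arbitrary, and the loop assumes tw_cli can be run non-interactively as "tw_cli info c0".

```shell
#!/bin/sh
# unit_progress (hypothetical helper): reads `info c0`-style text on stdin and
# prints "<Status> <%Cmpl>" for the unit named in $1.
unit_progress() {
    awk -v unit="$1" '$1 == unit { print $3, $4 }'
}

# wait_for_rebuild (hypothetical helper): polls u0 until it leaves REBUILDING.
# Assumes tw_cli lives in /opt/3Ware/bin and accepts commands as arguments.
wait_for_rebuild() {
    while :; do
        progress=$(/opt/3Ware/bin/tw_cli info c0 | unit_progress u0)
        case $progress in
            REBUILDING*) echo "u0: $progress%"; sleep 300 ;;
            *)           echo "u0: $progress"; break ;;
        esac
    done
}
```

Defining the functions does not touch the controller; nothing runs until wait_for_rebuild is called.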

Supermicro H8DAR-T BIOS Settings

We run a lot of Supermicro H8DAR-T motherboards in production. These are the BIOS settings that work well for us. I have not spent much time tuning BIOS settings for extra performance, since stability is key.

Note that unless specified here, we leave the settings at their default values. (Some of these settings are default values, but they are documented because we need them set that way.) Especially important options are in bold.


Advanced->ACPI Settings->Advanced ACPI Settings
ACPI 2.0 [No]
ACPI APIC Support [Enabled]
ACPI SRAT Table [Enabled]
BIOS->AML ACPI Table [Enabled]
Headless Mode [Enabled]
OS Console Redirection [Always]

Advanced->AMD PowerNow Configuration
PowerNow [Disabled]

Advanced->Remote Access
Remote Access [Enabled]
Serial Port [COM2]
Serial Port Mode [19200,8,N,1]
Flow Control [None]
Redirection After Post [Always]
Terminal Type [vt100]
VT-UTF8 Combo Keys [Enabled]
SRedir Memory Display [No Delay]

Advanced->System Health->System Fan
Fan Speed Control [1) Disable - Full Speed]

PCIPnP
Plug and Play OS [No]
PCI Latency [64]
Allocate IRQ to PCI VGA [Yes]
Palette Snooping [Disabled]
PCI IDE BusMaster [Disabled]

Boot->Boot Device Priority
1) Floppy
2) PC-CD-244E (cdrom)
3) MBA Slot 218 (first ethernet)
4) 3Ware (or Onboard SATA)
5) MBA Slot 219 (second ethernet)

Chipset->NorthBridge->ECC Configuration
DRAM ECC [Enabled]
MCA ECC Logging [Enabled]
ECC Chipkill [Enabled]
DRAM Scrub Redirect [Enabled]
DRAM BG Scrub [163.8us]
L2 Cache BG Scrub [ 10.2us]
Data Cache BG Scrub [ 5.12us]

Chipset->NorthBridge->IOMMU Options
IOMMU Mode [Best Fit]
Aperture Size [64MB]
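
The Remote Access settings above (COM2 at 19200,8,N,1 with no flow control) only cover BIOS redirection; for Linux to keep using the same serial port, the kernel needs a matching console= argument. The snippet below is an illustration under that assumption, not part of the original settings. serial_console_arg is a hypothetical helper that relies on the usual mapping of COM1 to ttyS0, COM2 to ttyS1, and so on.

```shell
#!/bin/sh
# serial_console_arg (hypothetical helper): builds a Linux kernel console=
# argument from a BIOS-style COM port name and baud rate.
# "n8" means no parity and 8 data bits, matching the 8,N,1 BIOS setting.
serial_console_arg() {
    com=$1
    baud=$2
    echo "console=ttyS$(( ${com#COM} - 1 )),${baud}n8"
}

serial_console_arg COM2 19200   # prints console=ttyS1,19200n8
```

The resulting string would go on the kernel command line in the boot loader configuration, usually alongside console=tty0 if the VGA console should stay active too.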

Supermicro H8DAR-T version detection

The Supermicro H8DAR-T motherboard comes in (at least) two flavors. The differences that I know about between the two versions are:

* The version 2.01 board will run OpenSolaris/Nexenta out of the box. This is because of a difference in the SATA controller hardware. The version 1.01 board will not run OpenSolaris without an add-on controller card.

* The 1.01 and 2.01 boards use different hardware sensors (for temperature, fan speed, etc.). We get sensor stats through our IPMI cards, so the IPMI cards need to be flashed with firmware for the specific board version. With mismatched firmware, the IPMI cards still work for poweron/poweroff and console redirection; only the sensors fail.

Unfortunately, I do not see enough of a difference at POST time to be able to tell them apart. However, there are two ways I know of to do the detection.

1. With the cover of the machine off, the version can be seen in the back left corner of the board. (Will post pics later)

2. Under Linux, use the "dmidecode" command. The system board uses "Handle 0x0002". What works well for me is "dmidecode | grep -A3 'Base Board'". v1.01 boards report their Version as "1234567890" (way to go, Supermicro!). v2.01 boards report Version "2.0". Examples:

v1board:~# dmidecode |grep -A3 "Base Board"
Base Board Information
Manufacturer: Supermicro
Product Name: H8DAR-T
Version: 1234567890


v2board:~# dmidecode |grep -A3 "Base Board"
Base Board Information
Manufacturer: Supermicro
Product Name: H8DAR-T
Version: 2.0
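
For scripting (for example, picking the right IPMI firmware per machine), the two Version strings above can be mapped to board revisions. A minimal sketch follows; h8dar_revision is a hypothetical helper name, and it only knows the two values shown in the examples.

```shell
#!/bin/sh
# h8dar_revision (hypothetical helper): reads dmidecode text on stdin and maps
# the Base Board "Version" field to the H8DAR-T board revision.
h8dar_revision() {
    version=$(grep -A3 'Base Board' \
        | sed -n 's/^[[:space:]]*Version:[[:space:]]*//p' \
        | head -n 1)
    case $version in
        1234567890) echo "1.01" ;;
        2.0)        echo "2.01" ;;
        *)          echo "unknown ($version)" ;;
    esac
}

# Usage (as root):
#   dmidecode | h8dar_revision
```

Newer dmidecode releases can also print the field directly with "dmidecode -s baseboard-version", which would avoid the grep.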