Monday, June 18, 2007

Rebuilding a 3Ware RAID set in Linux

This information is specific to the 3Ware 9500 series controllers (more specifically, the 9500-4LP). However, the 3Ware CLI appears to be the same on the other 3Ware 9XXX controllers I have experience with (the 9550, for sure).


Under Linux, 3Ware cards can be managed through the "tw_cli" command. (The CLI tools can be downloaded for free from 3Ware's support website.)
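
tw_cli can be run interactively, as in the sessions below, or one-shot from the shell, which is handy in scripts and cron jobs:

dev306:~# /opt/3Ware/bin/tw_cli info c0
dev306:~# /opt/3Ware/bin/tw_cli info c0 unitstatus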

A healthy RAID set looks like this:

dev306:~# /opt/3Ware/bin/tw_cli
//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     OK               u0     372.61 GB   781422768     3PM0Q3B7


A failed RAID set looks like this:

dev306:~# /opt/3Ware/bin/tw_cli
//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    DEGRADED       -      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     DEGRADED         u0     372.61 GB   781422768     3PM0Q3B7


Now I will remove this bad disk from the RAID set:


//dev306> maint remove c0 p3
Exporting port /c0/p3 ... Done.


I now need to physically replace the bad drive. Unfortunately, since our vendor wired some of our cables cockeyed, the port numbers don't reliably match the physical drive bays, so I will usually cause some I/O on the disks at this point to see which of the four disks is actually bad. (Hint: the one with no activity light is the bad one.)


dev306:~# find /opt -type f -exec cat '{}' > /dev/null \;
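

If the machine is otherwise idle, a sustained raw read against the RAID unit's block device works just as well for lighting up the activity LEDs. A minimal sketch; /dev/sda is an assumption here, so substitute whatever device node the 3Ware unit shows up as:


# /dev/sda assumed to be the 3Ware unit; read 4GB to generate steady I/O
dev306:~# dd if=/dev/sda of=/dev/null bs=1M count=4096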


With the bad disk identified and replaced, I need to go back into the 3Ware CLI, find the new disk, and tell the array to start rebuilding.


dev306:~# /opt/3Ware/bin/tw_cli
//dev306> maint rescan
Rescanning controller /c0 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [/c0/p3].


//dev306> maint rebuild c0 u0 p3
Sending rebuild start request to /c0/u0 on 1 disk(s) [3] ... Done.

//dev306> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    REBUILDING     0      256K    1117.56   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     372.61 GB   781422768     3PM0Q56Z
p1     OK               u0     372.61 GB   781422768     3PM0Q3YY
p2     OK               u0     372.61 GB   781422768     3PM0PFT7
p3     DEGRADED         u0     372.61 GB   781422768     3PM0Q3B7


Note that p3 still shows a status of "DEGRADED", but the array itself is now "REBUILDING". Under minimal I/O load, a RAID-5 built from 400GB disks like this one takes about 2.5 hours to rebuild.
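
To keep an eye on the rebuild without babysitting the CLI, a quick polling loop does the job. A minimal sketch, assuming tw_cli lives in /opt/3Ware/bin as above:

#!/bin/bash
# Print the u0 status line every 5 minutes until the unit stops rebuilding.
while /opt/3Ware/bin/tw_cli info c0 u0 | grep -q REBUILDING ; do
    /opt/3Ware/bin/tw_cli info c0 u0 | grep '^u0'
    sleep 300
done
# Show the final state once the loop exits.
/opt/3Ware/bin/tw_cli info c0 u0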

7 comments:

  1. We have 9550SX-4LPs and 8000s here, and I had to write a quick-and-dirty bash script to make sure all of the 9550 RAIDed disks were identically configured. Here's what it looks like:

    #!/bin/bash
    #
    # Check for a 3ware 9550SX-4LP controller, then check and set the RAID settings.
    #
    CTLR=`/usr/sbin/tw_cli info c0 model | /bin/awk '{print $NF}'`
    if [ "${CTLR:=NULL}" != "NULL" -a "$CTLR" == "9550SX-4LP" ] ; then

        CACHE=`/usr/sbin/tw_cli /c0/u0 show cache | /bin/awk '{print $NF}'`
        if [ "${CACHE:=NULL}" != "NULL" -a "$CACHE" != 'on' ] ; then
            echo "activating cache on 9550SX (incorrectly set to \"$CACHE\" at install-time)" && \
                /usr/sbin/tw_cli /c0/u0 set cache=on quiet
        fi

        VERIFY=`/usr/sbin/tw_cli /c0/u0 show autoverify | /bin/awk '{print $NF}'`
        if [ "${VERIFY:=NULL}" != "NULL" -a "$VERIFY" != 'off' ] ; then
            echo "deactivating auto-verify on 9550SX (incorrectly set to \"$VERIFY\" at install-time)" && \
                /usr/sbin/tw_cli /c0/u0 set autoverify=off quiet
        fi

        IGNOREECC=`/usr/sbin/tw_cli /c0/u0 show ignoreECC | /bin/awk '{print $NF}'`
        if [ "${IGNOREECC:=NULL}" != "NULL" -a "$IGNOREECC" != 'on' ] ; then
            echo "activating ignoreECC on 9550SX (incorrectly set to \"$IGNOREECC\" at install-time)" && \
                /usr/sbin/tw_cli /c0/u0 set ignoreECC=on
        fi

        QUEUE=`/usr/sbin/tw_cli /c0/u0 show qpolicy | /bin/awk '{print $NF}'`
        if [ "${QUEUE:=NULL}" != "NULL" -a "$QUEUE" != 'on' ] ; then
            echo "activating queueing on 9550SX (incorrectly set to \"$QUEUE\" at install-time)" && \
                /usr/sbin/tw_cli /c0/u0 set qpolicy=on
        fi

        STORSAVE=`/usr/sbin/tw_cli /c0/u0 show storsave | /bin/awk '{print $NF}'`
        if [ "${STORSAVE:=NULL}" != "NULL" -a "$STORSAVE" != 'balance' ] ; then
            echo "setting StorSave on 9550SX to Balanced (incorrectly set to \"$STORSAVE\" at install-time)" && \
                /usr/sbin/tw_cli /c0/u0 set storsave=balance quiet
        fi

    else
        echo "failed to find 3ware 9550SX-4LP controller."
    fi

  2. Good stuff, thanks for the quickie script. Does tw_cli work mostly the same way on the 8000 series cards as well?

  3. The 8000 series is a much less capable card. We're running tw_cli version 2.00.00.032b to interact with the 8500s, and it seems the only thing you can enable or disable is the cache (which suggests to me that the rest of these nifty features are not available on the 8500s).

    As a matter of fact, it seems that version 2.00.03.017 may work against an 8000 series card, but its 'help' output indicates that certain functionality is only available on the 9000 series cards.

    When you send a simple "tw_cli info c0 unitstatus", you get...

    2.00.00.032b response:
    # of units: 1
    Unit 0: RAID 10 465.77 GB ( 976790016 blocks): OK

    2.00.03.017 response:
    Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
    ------------------------------------------------------------------------------
    u0    RAID-10   OK             -      64K     465.77    ON     -        -


    [root@host sbin]# ./tw_cli-2.00.00.032b help set

    This command will adjust and display certain settings on the controllers

    set [ c<c> |
          rebuild c<c> <1..5> |
          cache c<c> u<u> [on|off]
        ]

      c - the controller id
      u - the unit id

      - If set is called without a sub-command and only a specified
        controller, it will display all the settings for that controller.

    [root@host sbin]# ./tw_cli-2.00.03.017 help set

    set rebuild c<c> <1..5>
    set cache c<c> u<u> on|off
    set verify c<c> <1..5>             (Note: 9000 series)
    set autoverify c<c> u<u> on|off    (Note: 9000 series)
    set overwriteECC c<c> u<u> on|off  (Note: 9000 series)

  4. An important missing instruction: if at all possible, unmount all of the volumes that live on the failed RAID set, as the rebuild will be faster.
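
    For example (a minimal sketch; /mnt/array is a hypothetical mount point, substitute your own):

    # see what is still holding the volume open, then unmount it
    fuser -vm /mnt/array
    umount /mnt/array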

  5. Wow, I've been struggling with a 9500S-12 and twelve 1.5TB HDDs. Can't figure it out. They're in two sets of 6x1.5TB; the first RAID does fine and never drops a drive, but the second drops a random drive/port every day now. Not sure what the problem could be.

  6. I know this sounds weird, but check all of the fans in your chassis, and make sure you don't have any other source of vibration. If possible, run the computer with the case open, and feel the fans when they're running to see if they're producing any vibration.

    We have a model of Supermicro chassis where the fans fail pretty frequently. The fan starts to vibrate in the chassis, and the hard drive is no longer able to seek in a timely fashion because it has to correct for the vibration so often. This causes the drives to start timing out - and then the rebuild takes for-freaking-ever - sometimes 2-3 days even with 400GB drives. We were having issues where once a drive failed, we could almost never get it to rebuild due to timeouts, even with brand new drives.


    We ended up throwing away probably 30-40 hard drives before we realized that the majority of the time, a timeout detected by the 3Ware card was actually caused by bad fans. We have monitoring on the fans, but oftentimes the broken fans keep spinning, so they don't trip the alarm. Now when we see RAID failures on these types of machines, we'll replace the fans before we replace the disks!

    This is an article that talks about it a little bit:
    http://www.zdnet.com/blog/storage/bad-bad-bad-vibrations/896
    It covers "normal" datacenter vibrations, not the kind of direct, harsh vibration that can happen with a broken fan spinning at 10K RPM a few inches away from the disk!

  7. Thanks for the tutorial, Ops Monkey! You just saved me a ton of time!
