RAID
uhryab, 2011-09-28 14:51:18

RAID controller does not see the new disk

The server runs SUSE Linux Enterprise Server 11 (x86_64) and has three 3ware controllers, each with a RAID-5 array of 8 disks. A disk dropped out of one of the arrays. I inserted a new disk, but the controller does not detect it. When I put the failed disk back in, the controller detects it without any trouble. What could be the problem?




teradata:/ # tw_cli show

Ctl Model (V)Ports Drives Units NotOpt RRate VRate BBU
-------------------------------------------------- ----------------------
c2 9690SA-8I 8 8 1 0 1 1 -
c3 9690SA-8I 8 8 1 0 1 1 -
c4 9690SA-8I 7 7 1 1 1 1 -
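The controller that needs attention can also be picked out programmatically. Below is a minimal Python sketch that assumes the nine-column layout shown in the `tw_cli show` output above and treats a non-zero NotOpt value as "this controller has a unit that is not optimal". The function name `controllers_needing_attention` is hypothetical and used only for illustration.

```python
# Sample rows copied from the `tw_cli show` output in this post.
SUMMARY = """\
Ctl Model (V)Ports Drives Units NotOpt RRate VRate BBU
c2 9690SA-8I 8 8 1 0 1 1 -
c3 9690SA-8I 8 8 1 0 1 1 -
c4 9690SA-8I 7 7 1 1 1 1 -
"""


def controllers_needing_attention(text):
    """Return the ids of controllers whose NotOpt column is non-zero."""
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 9 columns and start with an id such as "c4";
        # the header row ("Ctl ...") and separator lines are skipped.
        if len(fields) == 9 and fields[0][:1] == "c" and fields[0][1:].isdigit():
            not_opt = int(fields[5])  # sixth column is NotOpt
            if not_opt > 0:
                flagged.append(fields[0])
    return flagged


print(controllers_needing_attention(SUMMARY))
```

On the sample above this flags only c4, which is also the controller that reports 7 drives instead of 8.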

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 REBUILD-PAUSED 0% - 256K 6519.19 OFF ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 DEGRADED u0 931.51 GB SATA 3 - ST31000524NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS


teradata:~ # tw_cli maint remove c4 p3
Removing port /c4/p3 ... Done.


I replace the failed disk with a new one.

teradata:~ # tw_cli /c4 rescan
Rescanning controller /c4 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [none].

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 OFF ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

The controller does not see the new disk.
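With eight ports the gap is easy to spot by eye, but the same check can be scripted. The following Python sketch assumes the port-row layout shown in the `tw_cli /c4 show` output above and an 8-port controller; `parse_ports` and `missing_ports` are hypothetical helper names, not part of any 3ware tool.

```python
# Port rows copied from the `tw_cli /c4 show` output after the disk swap.
PORT_TABLE = """\
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS
"""


def parse_ports(text):
    """Map port number -> status, unit and model for every port row."""
    ports = {}
    for line in text.splitlines():
        fields = line.split()
        # A port row starts with "p<N>" and has at least 9 columns;
        # the model may contain spaces, so it takes all remaining fields.
        if len(fields) >= 9 and fields[0][:1] == "p" and fields[0][1:].isdigit():
            ports[int(fields[0][1:])] = {
                "status": fields[1],
                "unit": fields[2],
                "model": " ".join(fields[8:]),
            }
    return ports


def missing_ports(ports, expected_count=8):
    """Return port numbers in 0..expected_count-1 that have no row."""
    return [n for n in range(expected_count) if n not in ports]


print(missing_ports(parse_ports(PORT_TABLE)))
```

For the table above the only missing port is 3, the slot that holds the new disk.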

I put the failed disk back in. After that we see:


teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 OFF ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 DEGRADED u0 931.51 GB SATA 3 - ST31000524NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

I remove the failed disk from the controller and rescan, without physically pulling the disk out.

teradata:~ # tw_cli maint remove c4 p3
Removing port /c4/p3 ... Done.

teradata:/ # tw_cli /c4 rescan
Rescanning controller /c4 for units and drives ...
Done.
Found the following unit(s): [none].
Found the following drive(s): [/c4/p3].

teradata:/ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 OFF ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 OK u? 931.51 GB SATA 3 - ST31000524NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

I try to start a rebuild onto the failed disk:
teradata:/ # tw_cli maint rebuild c4 u0 p3
The following drive(s) cannot be used [3].
Error: (CLI:144) Invalid drive(s) specified.

For several hours the failed disk keeps this status:

p3 OK u? 931.51 GB SATA 3 - ST31000524NS

After a few hours, the disk status changes to DEGRADED:

p3 DEGRADED u0 931.51 GB SATA 3 - ST31000524NS

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 REBUILD-PAUSED 0% - 256K 6519.19 OFF ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 DEGRADED u0 931.51 GB SATA 3 - ST31000524NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS


I tried five new disks as replacements, and the controller did not see any of them. If I insert the failed disk, the controller sees it.

The failed disk is a Seagate ST31000524NS, and the replacements are the same model. All the new disks are known to be good. What could be the problem?

teradata:~ # vgdisplay -v vg_data
    Using volume group(s) on command line
    Finding volume group "vg_data"
  --- Volume group ---
  VG Name vg_data
  System ID
  Format lvm2
  Metadata Areas 3
  Metadata Sequence No 2
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 1
  Open LV 1
  Max PV 0
  Cur PV 3
  Act PV 3
  VG Size 19.00 TB
  PE Size 64.00 MB
  Total PE 311318
  Alloc PE / Size 311318 / 19.00 TB
  Free PE / Size 0 / 0
  VG UUID zoSzgL-Jkcr-fYEW-Ic4x-33R8-mSqU-Y34Su8

  --- Logical volume ---
  LV Name /dev/vg_data/lv_data
  VG Name vg_data
  LV UUID lp1gcy-ecZI-F5QU-pFIX-77UA-urfv-uKUBi4
  LV Write Access read/write
  LV Status available
  # open 1
  LV Size 19.00 TB
  Current LE 311318
  Segments 3
  Allocation inherit
  Read ahead sectors auto
  - currently set to 1024
  Block device 253:8

  --- Physical volumes ---
  PV Name /dev/sdb1
  PV UUID e0hITf-ntw8-wzak-vIrk-8J3B-2FST-YqQ03v
  PV Status allocatable
  Total PE / Free PE 102706 / 0

  PV Name /dev/sdc1
  PV UUID owvZVB-9yIz-aA3F-9lLB-oYc6-7UV6-Lu1Lmu
  PV Status allocatable
  Total PE / Free PE 104306 / 0

  PV Name /dev/sdd1
  PV UUID IgtT05-xMXW-Jn1P-Y8H7-kHMn-sfaY-qUpdMf
  PV Status allocatable
  Total PE / Free PE 104306 / 0
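The extent counts above are enough to work out how much of the volume group sits on each physical volume. The Python sketch below does that arithmetic, assuming the "64.00 MB" PE size means 64 MiB; the helper name `pv_size_gib` is hypothetical. Note that this gives each PV's capacity, not which files live on it, because one filesystem spans all three PVs.

```python
PE_SIZE_MIB = 64  # "PE Size 64.00 MB" from vgdisplay, assumed to mean MiB

# "Total PE" per physical volume, copied from the vgdisplay output.
PV_EXTENTS = {
    "/dev/sdb1": 102706,
    "/dev/sdc1": 104306,
    "/dev/sdd1": 104306,
}


def pv_size_gib(extents, pe_size_mib=PE_SIZE_MIB):
    """Convert a physical-extent count to GiB."""
    return extents * pe_size_mib / 1024


total_extents = sum(PV_EXTENTS.values())
total_tib = pv_size_gib(total_extents) / 1024

for name, extents in PV_EXTENTS.items():
    share = extents / total_extents
    print(f"{name}: {pv_size_gib(extents):.2f} GiB ({share:.1%} of the VG)")
print(f"total: {total_extents} PE = {total_tib:.2f} TiB")
```

The three counts add up to the 311318 extents reported for the volume group, and 104306 extents come to about 6519 GiB, which is close to the 6519.19 GB unit size that tw_cli reports for each array.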

One idea is to recreate the RAID array from scratch, but that would require backing up 13 TB of data first.
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_system-lv_root
                      2.0G 276M 1.6G 15% /
devtmpfs 1.9G 156K 1.9G 1% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
/dev/sda1 1004M 43M 911M 5% /boot
/dev/mapper/vg_system-lv_home
                      2.0G 500M 1.4G 27% /home
/dev/mapper/vg_system-lv_opt
                       20G 324M 19G 2% /opt
/dev/mapper/vg_system-lv_srv
                      2.0G 68M 1.9G 4% /srv
/dev/mapper/vg_system-lv_tmp
                      3.0G 1.9G 946M 68% /tmp
/dev/mapper/vg_system-lv_usr
                       15G 2.4G 12G 17% /usr
/dev/mapper/vg_system-lv_var
                       20G 883M 18G 5% /var
/dev/mapper/vg_data-lv_data
                       19T 13T 6.1T 68% /data

The OS sees the RAID units as /dev/sdb1, /dev/sdc1 and /dev/sdd1. How can I find out which of these OS devices sits on the controller with the failed disk?
Suppose we have identified /dev/sdd1 as the one with the bad disk. How can I find out which data is stored on it? I need this so that I can back up only about 6.5 TB instead of all 13 TB.
I would like to hear your advice and comments on this issue. Maybe someone has run into something similar. Thank you in advance for your replies.


4 answer(s)
BasilioCat, 2011-09-30
@BasilioCat

Some RAID controllers require a disk to be initialized before it can be added to the array, such as on adapters. Perhaps you also

nicolnx, 2011-10-01
@nicolnx

For 3ware there is a utility that listens on port 888 once started and provides a web interface.
I replaced disks from there: it showed that the disk was found but not initialized. I did not look into how to initialize it from the CLI, but with that web-based tool it is done in a couple of clicks.

uhryab, 2011-10-10
@uhryab

I used the web interface, but it did not help.

uhryab, 2011-10-10
@uhryab

Here is the continuation of the story.
I decided to try a different disk. The only one I had was a 2 TB drive. The controller detected it physically right away: the blue LED on it blinked happily several times.
Let's look at the console.

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 REBUILDING 0% - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 DEGRADED u0 1.82 TB SATA 3 - WDC WD20EARS-00MVWB0
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

teradata:~ # tw_cli maint rebuild c4 u0 p3
The following drive(s) cannot be used [3].
Error: (CLI:144) Invalid drive(s) specified.


teradata:~ # tw_cli maint remove c4 p3
Removing port /c4/p3 ... Done.


teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

teradata:~ # tw_cli /c4 rescan
Rescanning controller /c4 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [/c4/p3].

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 DEGRADED - - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 OK - 1.82 TB SATA 3 - WDC WD20EARS-00MVWB0
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

teradata:~ # tw_cli maint rebuild c4 u0 p3
Sending rebuild start request to /c4/u0 on 1 disk(s) [3] ... Done.


teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 REBUILDING 0% - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 DEGRADED u0 1.82 TB SATA 3 - WDC WD20EARS-00MVWB0
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS

Note that the array is REBUILDING while the disk itself still shows DEGRADED.
The rebuild took 4 hours 20 minutes.
After that, everything looked like this:

teradata:~ # tw_cli /c4 show

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
-------------------------------------------------- ----------------------------
u0 RAID-5 OK - - 256K 6519.19 Ri ON

VPort Status Unit Size Type Phy Encl-Slot Model
-------------------------------------------------- ----------------------------
p0 OK u0 931.51 GB SATA 0 - ST31000340NS
p1 OK u0 931.51 GB SATA 1 - ST31000340NS
p2 OK u0 931.51 GB SATA 2 - ST31000340NS
p3 OK u0 1.82 TB SATA 3 - WDC WD20EARS-00MVWB0
p4 OK u0 931.51 GB SATA 4 - ST31000340NS
p5 OK u0 931.51 GB SATA 5 - ST31000340NS
p6 OK u0 931.51 GB SATA 6 - ST31000340NS
p7 OK u0 931.51 GB SATA 7 - ST31000340NS
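The sequence that finally worked in this thread was: remove the port, rescan, confirm the new disk shows up as OK and unassigned, then start the rebuild. The Python sketch below only assembles and prints those commands as a dry run; it does not execute anything. The helper names `replacement_plan` and `ready_for_rebuild` are hypothetical, and the readiness rule (status OK, unit column "-") is an assumption drawn solely from the transcripts above, where rebuild requests against a port shown as `u?` or DEGRADED were rejected with CLI:144.

```python
def replacement_plan(ctl, unit, port):
    """Return the tw_cli invocations used in this thread, in order."""
    return [
        ["tw_cli", "maint", "remove", ctl, port],         # drop the port
        ["tw_cli", f"/{ctl}", "rescan"],                  # detect the new disk
        ["tw_cli", f"/{ctl}", "show"],                    # inspect port status
        ["tw_cli", "maint", "rebuild", ctl, unit, port],  # start the rebuild
    ]


def ready_for_rebuild(port_row):
    """True if a port row shows status OK and no unit assignment ("-").

    Assumption based on the transcripts only, not on vendor documentation.
    """
    fields = port_row.split()
    return len(fields) >= 3 and fields[1] == "OK" and fields[2] == "-"


# Dry run: print the commands instead of executing them.
for command in replacement_plan("c4", "u0", "p3"):
    print(" ".join(command))

# Port rows copied from the transcripts above.
print(ready_for_rebuild("p3 OK - 1.82 TB SATA 3 - WDC WD20EARS-00MVWB0"))
print(ready_for_rebuild("p3 OK u? 931.51 GB SATA 3 - ST31000524NS"))
```

Checking the port row before issuing the rebuild avoids repeating the rejected attempts shown earlier, although the underlying reason the controller ignored the same-model replacements remains unanswered in the thread.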
