linux
alexbyk, 2013-03-17 23:46:21

RAID 1 - how to find out which drive is the "master"

Good afternoon. Out of nowhere, RAID 1 started falling apart and resyncing on its own (for the second time in 2 days). How can I find out which drive holds the more up-to-date data, and which drive is currently being synced from which, so that I can pull the bad one out?

mdadm -D /dev/md2
/dev/md2:
Version: 0.90
Creation Time: Sat Oct 9 11:50:17 2010
Raid Level: raid1
Array Size: 483925888 (461.51 GiB 495.54 GB)
Used Dev Size: 483925888 (461.51 GiB 495.54 GB)
Raid Devices: 2
Total Devices: 2
Preferred Minor: 2
Persistence: Superblock is persistent

Update Time: Sun Mar 17 22:37:41 2013
State: active, resyncing
Active Devices: 2
Working Devices: 2
Failed Devices: 0
Spare Devices: 0

Rebuild Status: 10% complete

UUID: 849b7744:4c4357bc:adf99ef5:44d98796
Events: 0.97

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 8 19 1 active sync /dev/sdb3


cat /proc/mdstat
Personalities: [raid1]
md0: active raid1 sdb1[1] sda1[0]
361344 blocks [2/2] [UU]

md1: active raid1 sdb2[1] sda2[0]
4096448 blocks [2/2] [UU]

md2: active raid1 sdb3[1] sda3[0]
483925888 blocks [2/2] [UU]
[==>..................] resync = 10.4% (50333376/483925888) finish=16164.6min speed=444K/sec


The only thing that comes to mind is to look at iostat: the drive md reads from more is the one with the current data, and the drive it writes to more is the one being rebuilt. But I have my doubts.
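Roughly what I mean (iostat is in the sysstat package; device names are the ones from the output above):

# While the array is rebuilding, the source disk should show mostly reads
# and the target disk mostly writes; compare the r/s and w/s columns.
iostat -dx 5 sda sdb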


2 answers
Semyon Dubina, 2013-03-18
@alexbyk

I simulated the failure; here is the log (a sketch of the commands used to reproduce it follows below):

sda1 failed:
[3777928.350815] md/raid1:md0: Disk failure on sda1, disabling device.
[3777928.350816] md/raid1:md0: Operation continuing on 1 devices.
[3777928.383958] RAID1 conf printout:
[3777928.383960] --- wd:1 rd:2
[3777928.383963] disk 0, wo:1, o:0, dev:sda1
[3777928.383964] disk 1, wo:0, o:1, dev:sdb1
[3777928.420261] RAID1 conf printout:
[3777928.420263] --- wd:1 rd:2
[3777928.420265] disk 1, wo:0, o:1, dev:sdb1
[3778072.565288] md: unbind[sda1]
[3778072.601454] md: export_rdev(sda1)
[3778082.179715] md: export_rdev(sda1)
[3778082.287766] md: bind[sda1]
[3778082.302899] RAID1 conf printout:
[3778082.302902] --- wd:1 rd:2
[3778082.302904] disk 0, wo:1, o:1, dev:sda1
[3778082.302905] disk 1, wo:0, o:1, dev:sdb1
[3778082.302948] md: recovery of RAID array md0
[3778082.302950] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[3778082.302951] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[3778082.302955] md: using 128k window, over a total of 12581816k.
[3778267.435015] md: md0: recovery done.
[3778267.585332] RAID1 conf printout:
[3778267.585335] --- wd:2 rd:2
[3778267.585337] disk 0, wo:0, o:1, dev:sda1
[3778267.585339] disk 1, wo:0, o:1, dev:sdb1
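Roughly, the fail/remove/re-add cycle that produces a log like this (just a sketch, do not try it on an array you cannot afford to resync):

# Mark the member as failed, drop it from the array, then add it back;
# md rebuilds it from the surviving member (sdb1 here).
mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove /dev/sda1
mdadm /dev/md0 --add /dev/sda1
cat /proc/mdstat    # watch the recovery progress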

Now sda1 is listed under device number 2:
root# mdadm --detail /dev/md0
/dev/md0:
Version: 1.2
Creation Time: Mon Dec 10 11:16:41 2012
Raid Level: raid1
Array Size: 12581816 (12.00 GiB 12.88 GB)
Used Dev Size: 12581816 (12.00 GiB 12.88 GB)
Raid Devices: 2
Total Devices: 2
Persistence: Superblock is persistent
Update Time: Mon Mar 18 00:54:20 2013
State: clean
Active Devices: 2
Working Devices: 2
Failed Devices: 0
Spare Devices: 0
Name: rescue:0
UUID: 06492cd4:f4a865a7:9060d9a1:7f306487
Events: 64
Number Major Minor RaidDevice State
2 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1

Here's what the man page says: Also, if you have a failure, the failed device will be marked with (F) after the [#]. The spare that replaces this device will be the device with the lowest role number n or higher that is not marked (F). Once the resync operation is complete, the device's role numbers are swapped.
As I understand it, in your case it's sdb that is failing, since the disk numbering doesn't change ...
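And if a member has actually been kicked out of the array, it shows up with the (F) flag in /proc/mdstat (the marker the man excerpt above is talking about), so a quick check is simply:

# A failed member is listed like "sda1[0](F)"
grep '(F)' /proc/mdstat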

Semyon Dubina, 2013-03-18
@sam002

An interesting question. I hadn't thought about it myself before, but my stronger suspicion (after reading the man page) is that it goes by timestamps, which it pulls from the metadata ...
You can look at each RAID member partition on the individual disks with "mdadm -E /dev/****".

Here is the output from one of my drives
root# mdadm -E /dev/sda1
/dev/sda1:
Magic: a92b4efc
Version: 1.2
Feature Map: 0x0
Array UUID: 06492cd4:f4a865a7:9060d9a1:7f306487
Name: rescue:0
Creation Time: Mon Dec 10 11:16:41 2012
Raid Level: raid1
Raid Devices: 2
Avail Dev Size: 25163776 (12.00 GiB 12.88 GB)
Array Size: 12581816 (12.00 GiB 12.88 GB)
Used Dev Size: 25163632 (12.00 GiB 12.88 GB)
Data Offset: 2048 sectors
Super Offset: 8 sectors
State: clean
Device UUID: 9bc877ed:e32304d7:996f7f50:276d80e3
Update Time: Thu Mar 14 23:34:44 2013
Checksum: 7e6e6594 - correct
Events: 25
Device Role: Active device 0
Array State: AA ('A' == active, '.' == missing)

Look at the "Update Time" field. I'm not going to break up my own array to check, otherwise I might not manage to reassemble it afterwards ))
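So, to decide which member holds the more current data, comparing those fields on both halves should be enough, something like this (device names taken from the question):

# The member with the newer "Update Time" / higher "Events" count
# is the more up-to-date one.
mdadm -E /dev/sda3 | grep -E 'Update Time|Events'
mdadm -E /dev/sdb3 | grep -E 'Update Time|Events'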
