Debian
vlarkanov, 2018-12-04 16:36:32

How to replace a failed drive in mdadm RAID10?

So, I had a RAID10 array built from sda1, sdb1, sdc1 and sdd1.
Then sdd1 up and died. After a reboot I see:


# cat /proc/mdstat
Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid10 sdb1[1] sdc1[2] sda1[0]
5860268032 blocks super 1.2 512K chunks 2 near-copies [4/3] [UUU_]
bitmap: 21/44 pages [84KB], 65536KB chunk
unused devices: <none>
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 3.7T 0 disk
└─sda1 8:1 0 2.7T 0 part
└─md0 9:0 0 5.5T 0 raid10 /mnt/hdd-backup
sdb 8:16 0 3.7T 0 disk
└─sdb1 8:17 0 2.7T 0 part
└─md0 9:0 0 5.5T 0 raid10 /mnt/hdd-backup
sdc 8:32 0 2.7T 0 disk
└─sdc1 8:33 0 2.7T 0 part
└─md0 9:0 0 5.5T 0 raid10 /mnt/hdd-backup
sdd 8:48 0 1.8T 0 disk
└─sdd1 8:49 0 1.8T 0 part /mnt/OLD
sde 8:64 1 7.5G 0 disk
└─sde1 8:65 1 7.5G 0 part /
# blkid
/dev/sda1: UUID="71dc28fb-c4eb-6bd8-557a-83f4bd8af796" UUID_SUB="220cfe51-b94f- f982-3fa5-a06aa058dacd" LABEL="backup02:0" TYPE="linux_raid_member" PARTUUID="9cdec4d9-3c43-4c46-b4ee-9248827d835c"
/dev/sdb1: UUID="71dc28fb-c4eb-6bd8-557a-83f4bd8af796" UUID_SUB="03abb76d-a067-601a-44f9-4b36126baf3a" LABEL="backup02:0" TYPE="linux_raid_member" PARTUUID="9586288e-7416-4526-ab68-8cff59ff7df3"
/dev/sdc1: UUID="71dc28fb-c4eb-6bd8-557a-83f4bd8af796" 4a4f-b321-074628d3f94b"
/dev/sdd1: UUID="fb396bb5-e210-4e80-9a55-eb59546fdd28" TYPE="ext4"
/dev/sde1: UUID="2ab65528-f2bb-4f57-8d84-6b5e717853d4" TYPE="ext4" PARTUUID="6bf31661-01"
/dev/md0: UUID="9efb1dc4-d84e-4796-9689-18589abb54c0" TYPE="ext4"

And here is the question. According to the how-tos, before adding a new disk to the array (once I physically install it in this machine and partition it, of course; for now it is still in its box), I first have to remove the failed member (partition /dev/sdd1) from the array. The problem is that after the reboot, /dev/sdd is an entirely different disk that has nothing to do with the RAID.

# mdadm /dev/md0 --fail /dev/sdd1
mdadm: set device faulty failed for /dev/sdd1: No such device
# mdadm /dev/md0 --remove /dev/sdd1
mdadm: hot remove failed for /dev/sdd1: No such device or address
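Since the kernel no longer has a /dev/sdd1 node, mdadm cannot address the dead member by name. For exactly this case the mdadm(8) man page provides the keywords `failed` and `detached`, which can be used in place of a device name (a sketch; run against your own array only after backing up):

```shell
# Drop any members already marked faulty:
mdadm /dev/md0 --remove failed

# Drop members whose device node has disappeared (this case):
mdadm /dev/md0 --remove detached
```

Note that the mdstat output above (`[4/3] [UUU_]`, only three members listed) suggests the kernel has already dropped the dead disk, so the array may accept a plain `--add` of the replacement without any explicit `--remove` at all.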

How to proceed in such a case?


2 answers
Zettabyte, 2018-12-05

If you are not 100% sure you can do everything right, and the array is still readable, I strongly recommend making a full copy of all important data before the replacement.
You can also make a sector-by-sector copy of each disk (if you have enough free space); that gives you even more room to recover from mistakes, should you need it.
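Such a sector-by-sector copy can be taken with `dd`, or better with GNU `ddrescue`, which retries read errors and keeps a resumable map file. The target path /mnt/spare below is a placeholder for wherever you have enough free space (a sketch, one disk shown):

```shell
# Plain dd: pad unreadable sectors with zeros instead of aborting.
dd if=/dev/sda of=/mnt/spare/sda.img bs=1M conv=noerror,sync status=progress

# GNU ddrescue: more robust on failing drives; the .map file lets
# an interrupted copy resume where it left off.
ddrescue /dev/sda /mnt/spare/sda.img /mnt/spare/sda.map
```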

klepiku, 2018-12-04

Have a look at the basic commands, for example here:
https://www.stableit.ru/2009/11/linux-raid0.html
or
https://unixmin.com/blog/raid-10-array-recovery
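Once the dead member is out of the array (or has already been dropped by the kernel, as the `[UUU_]` state suggests), the replacement boils down to partitioning the new disk like a healthy member and adding it back. A hedged sketch, assuming the new disk shows up as /dev/sdd — verify with `lsblk` first, since device names can shuffle between reboots:

```shell
# Copy the GPT partition table from a healthy member onto the new disk,
# then give the copy fresh unique GUIDs:
sgdisk --replicate=/dev/sdd /dev/sda
sgdisk --randomize-guids /dev/sdd

# Add the new partition; the RAID10 resync starts automatically:
mdadm /dev/md0 --add /dev/sdd1

# Watch the rebuild progress:
watch cat /proc/mdstat
```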
