RAID
Grigory Bondarenko, 2021-09-13 17:16:15

The HPE H240 HBA RAID controller hides all volumes if at least one disk is not connected. Is this normal?

Good day, colleagues!
At work we have this marvel running in a server: an HPE H240 HBA. It seems to work well, but I have questions. Right now I am experimenting on a test machine with the same controller and I see strange behavior: if you disconnect one of the disks (which are in a mirror) and try to start the server, not a single disk is available (none are shown in the OS), and Smart Storage Administrator reports:

"Critical Status Message(s). Bay 6 is bad or missing. To correct this problem, check the data and power connections to the physical drive."

And ssacli gives an even more interesting message:
One or more physical drives in array <0ID> on the cache module has been moved, are missing, or have failed. To correct this problem, restore the configuration to its original state or delete the array and save your configuration.

That is, if one of my drives suddenly fails, my server will not come up at all on the next reboot, and I will have only two options:
1. Find a suitable spare drive;
2. Delete the problematic mirror, losing the data on it, after which the remaining volumes become available.
So my question is: what is the point of mirroring, then? In theory, the controller should bring up all volumes even if one disk drops out of each mirror. Isn't that the whole point of mirroring? What if I don't have a suitable drive on hand? What if it's New Year's Eve?
Maybe this behavior can be configured somewhere? Please share your experience and thoughts.
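
The expectation in the question can be written down as a tiny model. This is only a sketch of how a RAID 1 volume is normally expected to behave, not a description of the H240 firmware; the `Mirror` class and both function names are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class VolumeState(Enum):
    OK = "ok"
    DEGRADED = "degraded"  # still usable, but without redundancy
    FAILED = "failed"


@dataclass
class Mirror:
    # Hypothetical model of one mirrored volume.
    name: str
    members_total: int
    members_present: int


def expected_state(mirror):
    """State a RAID 1 volume is normally expected to report."""
    if mirror.members_present <= 0:
        return VolumeState.FAILED
    if mirror.members_present < mirror.members_total:
        return VolumeState.DEGRADED
    return VolumeState.OK


def available_volumes(mirrors):
    """Names of the volumes the OS should still be able to see."""
    return [m.name for m in mirrors
            if expected_state(m) is not VolumeState.FAILED]


mirrors = [Mirror("system", 2, 2), Mirror("data", 2, 1)]
print(available_volumes(mirrors))  # ['system', 'data']
```

In this model a mirror with one missing member is degraded but still visible, and the healthy mirror is not affected at all, which is exactly what the test machine does not do after a cold start.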


3 answer(s)
Saboteur, 2021-09-13
@saboteur_kiev

I have not dealt with this controller, but it is strange that it does not start.
With sane controllers, the choices at power-on are usually as follows:
1. Start the OS as is, on one disk (ignoring that the second one has failed).
2. Insert a new disk, start the rebuild onto it, and start the OS after the rebuild finishes.
3. Insert a new disk, run the rebuild in the background, and start the OS immediately. Usually you can configure the percentage of resources the background rebuild task may use.
4. If the RAID supports a hot spare, it can be configured so that when one of the disks fails, the mirror automatically starts a rebuild onto the disk designated as the hot spare.
Deleting the RAID configuration used to mean, as a rule, reformatting the disk.
That is, the stock RAID utilities did not support converting a disk from a RAID member to a standalone disk while preserving its data, even if the only difference is in the boot sector. Maybe that is no longer the case. I agree with Alexey Cheremisin:
software RAID is perfectly adequate nowadays, and hardware RAID is usually needed only for high-end solutions, where the controller has its own large cache, a battery and a decent processor to handle all of this, and typically for setups with a large number of disks and a drive cage for them.

Alexey Cheremisin, 2021-09-13
@leahch

For automatic RAID repair to happen, you need one extra disk (though it all depends on the specific configuration).
This extra drive is commonly called a hot spare. When one of the disks in the array fails, the failed disk is automatically replaced by the spare.
In all other cases, yes, keep spare parts on hand and replace them manually :)
Or give up on the hardware controller and use software RAID, which I have been doing for the last 20 years. (...), but there is no similar brand at hand. I recommend it to everyone. (Just don't start going on about "speed", IOPS, CPU offloading and so on; there is no need, you simply have to choose the right hardware.)
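
For anyone who goes the software RAID route on Linux, a degraded mirror shows up in `/proc/mdstat` as a member counter such as `[2/1] [U_]`. Below is a minimal sketch of a check for that; the function name and the sample text are illustrative, and on a real machine you would pass in the contents of `/proc/mdstat` instead of `SAMPLE`.

```python
import re

# Member counters printed by md, e.g. "[2/1] [U_]" for a degraded mirror.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active)} for arrays missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = (expected, active)
            current = None
    return degraded


# Illustrative sample: md1 has lost one of its two members, md0 is healthy.
SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sda2[0]
      976630464 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sdb1[1] sda1[0]
      524224 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))  # {'md1': (2, 1)}
```

A degraded md mirror still assembles and stays usable with one member, so a check like this is about noticing the problem in time rather than about getting the volume back.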

Grigory Bondarenko, 2021-09-14
@yurybx

I repeated the experiment, but this time I pulled out the SATA cable while the machine was powered on and running. The system kept working as if nothing had happened (as it should), and all volumes, including the degraded one, remained available. Then I rebooted the computer, and nothing changed: all volumes were still available. So the controller blocks the volumes only when a disk goes missing while the machine is powered off. That situation is extremely unlikely but still possible: if a disk fails right at the next power-on, we will lose access to all volumes. Well, I'll keep that in mind.
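
Since a degraded mirror keeps running quietly while the system is up, it may be worth checking drive status before any planned power-off. The sketch below parses the text printed by `ssacli ctrl slot=0 pd all show status`; the slot number, the exact line shape and both function names are assumptions for illustration, so compare them with the real output on your machine first.

```python
import re

# Assumed line shape (verify against your own ssacli output):
#   physicaldrive 1I:1:6 (port 1I:box 1:bay 6, 1 TB): OK
PD_RE = re.compile(r"physicaldrive\s+(\S+)\s+\(.*\):\s*(.+?)\s*$")


def unhealthy_drives(status_text):
    """Return [(drive_id, status)] for every drive whose status is not OK."""
    bad = []
    for line in status_text.splitlines():
        match = PD_RE.search(line)
        if match and match.group(2) != "OK":
            bad.append((match.group(1), match.group(2)))
    return bad


def safe_to_power_off(status_text):
    """True only if at least one drive was listed and all of them are OK."""
    listed = [ln for ln in status_text.splitlines() if PD_RE.search(ln)]
    return bool(listed) and not unhealthy_drives(status_text)


# Hypothetical sample, not captured from a real controller.
SAMPLE = """
   physicaldrive 1I:1:5 (port 1I:box 1:bay 5, 1 TB): OK
   physicaldrive 1I:1:6 (port 1I:box 1:bay 6, 1 TB): Failed
"""

print(unhealthy_drives(SAMPLE))   # [('1I:1:6', 'Failed')]
print(safe_to_power_off(SAMPLE))  # False
```

This cannot catch a disk that dies during power-up itself; it only makes sure you do not shut down without knowing that a mirror is already degraded.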
