RAID
Dmitry, 2022-03-03 15:39:03

"Host adapter abort request" error on an Adaptec ASR-6805 RAID controller. What could be the cause?

I have an old Supermicro X9DRD-7LN4F server running Proxmox 5.1. Storage is provided by an Adaptec ASR-6805 RAID controller (without a BBU) with three RAID-1 arrays built from six SATA drives. Periodically, usually under heavy disk load (for example, during backups), the following error occurs:

dmesg output snippet:

[539389.708095] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539389.709267] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539389.710232] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539389.711192] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539389.712187] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539397.644089] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539397.645791] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539397.647398] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539397.744266] aacraid: Host adapter reset request. SCSI hang ?
[539397.745096] aacraid 0000:04:00.0: outstanding cmd: midlevel-0
[539397.745098] aacraid 0000:04:00.0: outstanding cmd: lowlevel-0
[539397.745100] aacraid 0000:04:00.0: outstanding cmd: error handler-0
[539397.745101] aacraid 0000:04:00.0: outstanding cmd: firmware-306
[539397.745103] aacraid 0000:04:00.0: outstanding cmd: kernel-2
[539397.745149] aacraid 0000:04:00.0: Controller reset type is 3
[539397.745964] aacraid 0000:04:00.0: Issuing IOP reset
[539446.464399] aacraid 0000:04:00.0: IOP reset succeded
[539446.465372] resource sanity check: requesting [mem 0xdf900000-0xdfcfffff], which spans more than PCI Bus 0000:04 [mem 0xdf900000-0xdfafffff]
[539446.465380] caller aac_src_ioremap+0x54/0xe0 [aacraid] mapping multiple BARs
[539446.488170] aacraid: Comm Interface type1 enabled
[539459.311116] aacraid 0000:04:00.0: Scheduling bus rescan


When the error occurs, all virtual machines freeze; once the controller has been reset, they resume working. We assumed the controller was failing, but after replacing it with a similar ASR-6805E nothing changed, and the problem still appears periodically. It is infrequent, though: everything can run normally for weeks. The controller can stay hung for up to several minutes.
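To put numbers on how often this happens and how long each hang lasts, the dmesg text can be summarized with a small script. This is only a sketch: the function name `summarize_aacraid_events` is made up for illustration, and it assumes the driver messages look exactly like the lines in the snippet above (including the driver's own "succeded" spelling).

```python
import re

# A timestamped dmesg line, e.g.
# "[539397.744266] aacraid: Host adapter reset request. SCSI hang ?"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

def summarize_aacraid_events(dmesg_text):
    """Return (abort request count, list of reset durations in seconds)."""
    aborts = 0
    reset_started = None
    reset_durations = []
    for raw in dmesg_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # continuation lines carry no timestamp
        ts, msg = float(m.group(1)), m.group(2)
        if "Host adapter abort request" in msg:
            aborts += 1
        elif "Host adapter reset request" in msg:
            reset_started = ts
        elif "IOP reset succ" in msg and reset_started is not None:
            # Time from the reset request until the driver reports success.
            reset_durations.append(round(ts - reset_started, 3))
            reset_started = None
    return aborts, reset_durations

# Lines copied from the dmesg snippet in this question.
sample = """\
[539397.647398] aacraid: Host adapter abort request.
                aacraid: Outstanding commands on (0,0,0,0):
[539397.744266] aacraid: Host adapter reset request. SCSI hang ?
[539397.745964] aacraid 0000:04:00.0: Issuing IOP reset
[539446.464399] aacraid 0000:04:00.0: IOP reset succeded
"""
print(summarize_aacraid_events(sample))
```

Run against the full log, the reset durations can then be lined up with the backup schedule to check whether the hangs really coincide with heavy disk load.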

Where should I look? We suspected overheating, but a temperature graph built from the output of the arcconf getconfig 1 command shows a stable 35-40 degrees. All other indicators also seem normal.
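For reference, the temperature samples for such a graph can be pulled out of the saved command output roughly as follows. This is a hedged sketch, not the script we actually used: it assumes the output contains a line shaped like `Temperature : 38 C/ 100 F (Normal)`, which is an assumption about the arcconf format and not something shown in this post, and the helper names `controller_temp_c` and `out_of_range` are hypothetical.

```python
import re

# ASSUMED line shape (not taken from the post):
#   "Temperature                  : 38 C/ 100 F (Normal)"
TEMP_RE = re.compile(r"^\s*Temperature\s*:\s*(\d+)\s*C", re.MULTILINE)

def controller_temp_c(arcconf_output):
    """Return the controller temperature in Celsius, or None if absent."""
    m = TEMP_RE.search(arcconf_output)
    return int(m.group(1)) if m else None

def out_of_range(samples, low=35, high=40):
    """Return the samples outside the band observed on the graph."""
    return [t for t in samples if t is not None and not (low <= t <= high)]

# Example with text saved from the command (contents are illustrative).
saved = "   Temperature                  : 38 C/ 100 F (Normal)\n"
print(controller_temp_c(saved))
```

Collecting one sample per minute and flagging anything returned by `out_of_range` around the time of a hang would show whether short temperature spikes are being smoothed out of the graph.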
