Database
Gaikotsu, 2011-09-08 14:03:38

A drive and a strange glitch: the databases stored on it rolled back to an older state?

Setup:
An SSD with an ext4 file system on it; the drive is used to store MySQL databases, and everything runs under Ubuntu.
At some point, because of some kind of failure, the system switched the drive to read-only mode.

Sep 8 05:01:00 new-server kernel: [1233345.472262] pa ffff8805a0f629c0: logic 878, phys. 23922707, len 146
Sep 8 05:01:00 new-server kernel: [1233345.472546] EXT4-fs error (device sdb1): ext4_mb_release_inode_pa: free 144, pa_free 143
Sep 8 05:01:00 new-server kernel: [1233345.472986] Aborting journal on device sdb1-8.
Sep 8 05:01:00 new-server kernel: [1233345.473262] EXT4-fs (sdb1): Remounting filesystem read-only
Sep 8 05:01:00 new-server kernel: [1233345.473657] EXT4-fs error (device sdb1) in ext4_reserve_inode_write: Journal has aborted
Sep 8 05:01:00 new-server kernel: [1233345.473991] EXT4-fs error (device sdb1) in ext4_reserve_inode_write: Journal has aborted
Sep 8 05:01:00 new-server kernel: [1233345.474317] EXT4-fs error (device sdb1) in ext4_orphan_del: Journal has aborted
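
Since this has now happened twice, it may be worth watching the log for these messages so that a read-only remount is noticed right away. Below is a minimal Python sketch that picks the EXT4-fs lines out of syslog-style text in the format shown above. The function name `ext4_events` and the field names are made up for illustration, and the regular expression is only fitted to these particular lines; other kernels may word the messages differently.

```python
import re

# Fitted to lines like:
#   "... kernel: [1233345.472546] EXT4-fs error (device sdb1): ext4_mb_release_inode_pa: ..."
#   "... kernel: [1233345.473262] EXT4-fs (sdb1): Remounting filesystem read-only"
EXT4_LINE = re.compile(
    r"kernel: \[\s*(?P<ts>\d+\.\d+)\] EXT4-fs (?P<kind>error )?"
    r"\((?:device )?(?P<dev>\w+)\)(?P<rest>.*)"
)


def ext4_events(lines):
    """Return one dict per EXT4-fs log line; all other lines are skipped."""
    events = []
    for line in lines:
        m = EXT4_LINE.search(line)
        if not m:
            continue
        rest = m.group("rest")
        events.append({
            "uptime_s": float(m.group("ts")),   # kernel timestamp in brackets
            "device": m.group("dev"),
            "is_error": m.group("kind") is not None,
            "remounted_ro": "Remounting filesystem read-only" in rest,
            "message": rest.lstrip(": ").strip(),
        })
    return events


if __name__ == "__main__":
    sample = [
        "Sep 8 05:01:00 new-server kernel: [1233345.472546] EXT4-fs error (device sdb1): "
        "ext4_mb_release_inode_pa: free 144, pa_free 143",
        "Sep 8 05:01:00 new-server kernel: [1233345.472986] Aborting journal on device sdb1-8.",
        "Sep 8 05:01:00 new-server kernel: [1233345.473262] EXT4-fs (sdb1): "
        "Remounting filesystem read-only",
    ]
    for event in ext4_events(sample):
        print(event)
```

Feeding it the lines of the system log and alerting on any event with `remounted_ro` set would be one way to catch the failure before a week of writes is at risk.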

It seems like nothing serious: after remounting, everything should work again without problems, and at worst the data that had not yet been written out at the moment of the failure is lost.
But here the strange things begin. After remounting the drive, the state of the databases on it (i.e. all the data in them) for some reason turned out to be as of September 1, i.e. everything written to the databases was rolled back by a whole week, as if no one had written anything to them for a week (whereas reads and writes to the databases are very active, constant, and high-volume).
And the copy of the databases that was made, just in case, before unmounting and copied to another drive suddenly turned out to have a lot of damaged tables and lost records.
Something like this already happened once, about a month ago (and the data also rolled back by about a week), but back then we did not investigate: an isolated case, who knows what the cause was, so we simply restored everything from the last backup. But twice is already a pattern...
Does anyone have any ideas why this could happen?

4 answer(s)
XuMiX, 2011-09-08
@XuMiX

Well, it seems to me there are several problems here:
1) Ext4 - I would use it VERY carefully in production
2) I don't trust Ubuntu either, to be honest (I would take Debian / CentOS)

Gaikotsu, 2011-09-08
@Gaikotsu

Well, in general, the question is no longer relevant.
An interesting thing turned out: how it happened is unclear, but the state of the data on the drive somehow got frozen, so to speak, at what it was a week ago, and on every unmount / remount or reboot it was reset back to that point. How it nevertheless saved and returned new data normally during regular operation, and where it managed to put so much of it, I will never know.
Any attempts to delete the partitions, or at least just format this drive, alas, ended in failure, so apparently that's it: the drive is dead.

Puma Thailand, 2011-09-08
@opium

And I would think about backups, and about the fact that SSDs fail quickly under certain loads; it's easier to move the database to another drive and forget about this SSD.

Temikus, 2011-09-08
@Temikus

What SSD? Manufacturer? Model? Controller firmware version?
Did you tune the file system (noatime, nodiratime), the kernel (swappiness, vm.vfs_cache_pressure), and the I/O scheduler, like this?
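
To answer the tuning part quickly, the current values can be read straight from `/proc` and `/sys`. Here is a rough Python sketch of how one might do that. The mount point `/var/lib/mysql` is only an assumption (the thread does not say where the drive is mounted), `sdb` is taken from the log in the question, and the helper names are invented for this example.

```python
from pathlib import Path


def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if not mounted.

    mounts_text is the content of /proc/mounts:
    "device mountpoint fstype options dump pass" per line.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return set(fields[3].split(","))
    return None


def active_scheduler(scheduler_text):
    """Return the scheduler shown in [brackets] in /sys/block/<dev>/queue/scheduler."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_text(path):
    """Read a small text file, returning None if it does not exist."""
    p = Path(path)
    return p.read_text().strip() if p.exists() else None


if __name__ == "__main__":
    mount_point = "/var/lib/mysql"  # assumption: where the database drive is mounted
    block_device = "sdb"            # from the "sdb1" in the kernel log

    opts = mount_options(read_text("/proc/mounts") or "", mount_point)
    print("mount options:", sorted(opts) if opts else "not mounted")
    print("swappiness:", read_text("/proc/sys/vm/swappiness"))
    print("vfs_cache_pressure:", read_text("/proc/sys/vm/vfs_cache_pressure"))
    sched = read_text(f"/sys/block/{block_device}/queue/scheduler") or ""
    print("I/O scheduler:", active_scheduler(sched))
```

The parsing is kept in separate functions from the file reads so that it can be checked against pasted output from the server without running anything on it.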
