How do I make my disk unmap pending unreadable sectors?
I have a disk with some pending unreadable sectors, according to smartd. What would be the easiest way to make the disk remap them and stop smartd from complaining?
Right now, I get two of these every hour:
Sep 10 23:15:35 hylton smartd: Device: /dev/sdc, 1 Currently unreadable (pending) sectors
The system is an x86 system running Ubuntu Linux 9.10 (jaunty). The disk is part of an LVM group. This is how smartctl identifies the disk:
Model Family: Western Digital Caviar Second Generation Serial ATA family
Device Model: WDC WD5000AAKS-00TMA0
Serial Number: WD-WCAPW4207483
Firmware Version: 12.01C01
User Capacity: 500,107,862,016 bytes
A pending unreadable sector is one that returned a read error and which the drive has marked for remapping at the first possible opportunity. However, it cannot do the remapping until one of two things happens:
- The sector is reread successfully
- The sector is rewritten
Until then, the sector remains pending. So you have two corresponding ways to deal with this:
- Keep trying to reread the sector until you succeed
- Overwrite that sector with new data
Obviously, the first option (rereading) is non-destructive, so you should probably try it first, although bear in mind that if the drive is starting to fail in a big way then continual reading from a bad area is likely to make it fail much more quickly. If you have a lot of pending sectors and other errors, and you care about the data on the drive, I recommend taking it out of service and using the excellent tool ddrescue to recover as much data as possible. Then discard the drive.
If the sector in question contains data you don't care about, or can restore from a backup, then overwriting it is probably the quickest and simplest solution. You can then view the reallocated and pending counts for the drive to make sure the sector was taken care of.
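As a rough sketch of checking those counts, the function below picks the two relevant rows out of the attribute table printed by smartctl -A. The function name smart_counts is my own, and it assumes the usual ten-column layout with the raw value in the last column and the attribute names Reallocated_Sector_Ct and Current_Pending_Sector; other drives may report these differently.

```shell
#!/bin/sh
# Print the name and raw value of the reallocated and pending sector counters.
# Reads `smartctl -A` output on stdin, so it also works on saved output.
# Assumption: the raw value is the 10th whitespace-separated column.
smart_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" { print $2, $10 }'
}

# Typical use (needs root; /dev/sdc is the device from the log line above):
#   smartctl -A /dev/sdc | smart_counts
```

Run it before and after rereading or overwriting the sector and compare the two numbers.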
How do you find out what the sector corresponds to in the filesystem? I found an excellent article on the smartmontools website, although it's quite technical and is specific to ext2/3/4 and reiser file systems.
A simpler approach, which I used on one of my own (Mac) drives, is to use find / -xdev -type f -print0 | xargs -0 ... to read every file on the system. Make a note of the pending count before running this. If the sector is inside a file, you will get an error message from the tool you used to read the files (e.g. md5sum) showing you the path to it. You can then concentrate on rereading just this file until it reads successfully. Often this will solve the problem, if it's an infrequently used file which just needed to be reread a few times. If the error goes away, or you don't encounter any errors in reading all the files, check the pending count to see if it has decreased. If it has, the problem was solved by reading.
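Here is a minimal sketch of that scan, with md5sum filled in as the reading tool (any program that reads the whole file would do). The function name read_all_files is hypothetical. Checksums are thrown away and only the error messages are kept, so a clean run prints nothing.

```shell
#!/bin/sh
# Read every regular file under a directory, staying on one filesystem,
# and print only the error messages from files that could not be read.
# md5sum is used purely as a convenient way to force a full read.
read_all_files() {
    # "2>&1 >/dev/null" sends errors to stdout and discards the checksums.
    find "$1" -xdev -type f -print0 | xargs -0 md5sum 2>&1 >/dev/null
}

# Typical use (as root, so that every file is readable):
#   read_all_files /
```

Any path that shows up in the output is a candidate for the file that contains the pending sector.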
If the file cannot be read successfully after multiple tries (e.g. 20) then you need to overwrite the file, or the block within the file, to allow the drive to reallocate the sector. You can use ddrescue on the file (rather than the partition) to overwrite just the one sector, by copying to a temporary file and then copying back again. Note that just removing the file at this point is a bad idea, because the bad sector will go into the free list where it will be harder to find. Completely overwriting it is bad too, because again the sectors will go into the free list. You need to rewrite the existing blocks. The notrunc option of dd is one way to do this.
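To illustrate the notrunc idea, here is a sketch that copies one block from a recovered copy back over the same position in the original file. The function name rewrite_block and its argument order are my own, and the 512-byte default block size is an assumption. This relies on the filesystem overwriting data in place, which copy-on-write filesystems do not do.

```shell
#!/bin/sh
# Overwrite block number $3 of file $2 with the same block taken from $1.
# conv=notrunc stops dd from truncating the target first, so the file keeps
# its existing length and the rest of its contents.
# $4 is the block size in bytes (assumed default: 512).
rewrite_block() {
    src=$1; dst=$2; block=$3; bs=${4:-512}
    dd if="$src" of="$dst" bs="$bs" skip="$block" seek="$block" count=1 conv=notrunc 2>/dev/null
}

# Typical use, with hypothetical paths, followed by a sync to flush the write:
#   rewrite_block /tmp/recovered.copy /path/to/damaged.file 1234 && sync
```

Block numbers here count from zero and are relative to the start of the file, not the disk.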
If you encounter no errors, and the pending count did not decrease, then the sector must be in the free list or in part of the filesystem structure (e.g. an inode table). You can try filling up all the free space with cat /dev/zero >tempfile, and then check the pending count. If it goes down, the problem was in the free list and has now gone away. Remove tempfile afterwards to get the space back.
If the sector is in the structure, you have a more serious problem, and you will probably encounter errors just walking the directory tree. In this situation, I think the only sensible solution is to reformat the drive, optionally using ddrescue to recover data if necessary.
Keep a very close eye on the drive. Sector reallocation is a good canary in the coal mine, potentially giving you early warning of a drive that is failing. By taking early action you can prevent a later catastrophic and very painful failure. I'm not suggesting that a few sector reallocations mean you should discard the drive. All modern drives need to do some reallocation. However, if the drive isn't very old (< 1 year) or you are getting frequent new reallocations (> 1/month) then I recommend you replace it as soon as possible.
I don't have empirical evidence to prove it, but my experience suggests that disk problems can be reduced by reading the whole disk occasionally, either by a dd of the raw disk or by reading every file using find. Almost all the disk problems I've experienced in the past several years have cropped up first in rarely used files, or on machines that are not used much. This makes sense heuristically, too, in that if a sector is being reread frequently the drive has a chance to reallocate it when it first detects a minor problem with that sector rather than waiting until the sector is completely unreadable. The drive is powerless to do anything with a sector unless the host accesses it somehow, either by reading or writing it or by running one of the SMART tests.
I'd like to experiment with the idea of a nightly or weekly cron job that reads the whole disk. Currently I'm using a "poor man's RAID" in which I have a second hard drive in the machine and I back up the main disk to it every night. In some ways, this is actually better than RAID mirroring, because if I goof and delete a file by accident I can get yesterday's version back immediately from the backup disk. On the other hand, I believe a hardware RAID controller does a lot of good work in the background to monitor, report and fix disk problems as they emerge. My current backup script uses rsync to avoid copying data that hasn't changed, but in view of the need to reread all sectors maybe it would be better to copy everything, or to have a separate script that reads the entire raw disk every week.
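A separate whole-disk read script could look something like the sketch below. I haven't run this as a long-term job, so treat it as a starting point. The function name read_whole, the script path and the schedule in the comment are all assumptions. Without extra options, dd should stop at the first read error and return a non-zero status, which cron can then report.

```shell
#!/bin/sh
# Read a block device (or any file) from start to finish, discarding the data.
# The only purpose is to make the drive touch every sector.
read_whole() {
    dd if="$1" of=/dev/null bs=1048576 2>/dev/null
}

# Hypothetical crontab entry (script path and schedule are assumptions),
# running every Sunday at 03:00 against the disk from the question:
#   0 3 * * 0 /usr/local/sbin/read-whole-disk.sh /dev/sdc
```

Reading a 500 GB disk end to end takes a while, so a weekly run at a quiet time is probably more practical than a nightly one.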