Making a corrupt RAID drive readable

Sometimes when working with Linux RAID devices, the array can become out of sync and ultimately appear corrupt to Linux. However, all is not always lost: it may just be that the array looks corrupt when in reality no data has been lost. This article will show you how to attempt to make the drive readable again so you at least have access to the data locked away on it.

What you need to know

  • You will need to know the file system of the corrupt drive, where the array was originally mounted and which drives are in the array (example commands for finding this out follow this list).
  • This guide is for a Linux software RAID setup using mdadm. It will not work with any other RAID setup such as hardware RAID.
  • To my knowledge, this will not work with a RAID-0 setup. The method has only been tested with RAID-1, but it may work with other mirrored RAID setups with a bit of tinkering.
  • Once the procedure is complete, it is advised that you copy your data to another source and re-establish a new RAID setup.
  • You should attempt this on the drive that had been running in the array the longest before the problems occurred, so that the data you recover is as current as possible (a way of checking this is sketched after the list).
  • If you don't feel confident, it is advisable to take an image of the corrupt drive to fall back on if there are problems later (also sketched below).
  • This may not work at all and you may lose some or all of your data. There is no 100% guarantee this will work, so try at your own risk!
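
A quick way to gather that information is sketched below. This assumes the array is /dev/md0; adjust the device name to match your system.

    # Show which partitions belong to the array and the array's current state
    mdadm --detail /dev/md0

    # Show the file system type on the array device (e.g. ext4)
    blkid /dev/md0

    # Check where the array was mounted, as recorded in /etc/fstab
    grep md /etc/fstab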
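
To judge which member drive was in the array most recently, and to take a safety image first, something like the following may help. This is only a sketch: /dev/sda1, /dev/sdb1 and the backup path are assumptions, and the exact --examine output varies between mdadm versions.

    # Compare event counts and update times of the members; the member with the
    # higher Events count / newer Update Time holds the most current data
    mdadm --examine /dev/sda1 | grep -E 'Update Time|Events'
    mdadm --examine /dev/sdb1 | grep -E 'Update Time|Events'

    # Take a raw image of the chosen drive to fall back on (needs enough free space)
    dd if=/dev/sda of=/mnt/backup/sda.img bs=4M conv=noerror,sync status=progress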

Attempt to repair and mount a damaged drive from a RAID array

  1. If your RAID array is still mounted (which it shouldn't be if the system thinks it's corrupt!), unmount it as soon as possible. This will stop the data becoming any more inconsistent than it already is. In a console, type umount /dev/md#, where # is the number under which the array is mounted. If you only have one array, popular defaults include /dev/md0 and /dev/md127. Once this step is complete, nothing else can be changed on the drives, so there is no harm in running it just to check.
  2. Check the array and see which drives are still part of it. In a console, type cat /proc/mdstat. This shows the status of all active software RAID arrays. You should see something similar to the following:
    md0: active raid1 sda1[0] sdb1[1]
    976760696 blocks super 1.0 [2/1] [_U]
    bitmap: 3/8 pages [12KB], 65536KB chunk
  3. In square brackets you will see either [_U] or [U_]. This shows that a drive has failed, as a healthy array shows [UU]. The number after each device name on the top line (for example sda1[0]) gives its position in the array, and the status string follows that order, so [_U] in the example above means sda1 has failed. A worked console session combining these steps follows.
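
Putting the steps together, a session might look something like the following. This is only a sketch; /dev/md0, sda1 and sdb1 are assumptions taken from the example above, and your device names will differ.

    # Step 1: unmount the array so nothing further can be written to it
    umount /dev/md0

    # Step 2: check the state of all active software RAID arrays
    cat /proc/mdstat

    # Step 3: interpret the output. [_U] means the first listed member (sda1 here)
    # has failed, while the remaining member (sdb1) still holds a copy of the data.
    # mdadm --detail gives a more verbose view of the same information:
    mdadm --detail /dev/md0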


For guidance later: the file system on the drive can then be checked and repaired with, for example, fsck.ext4 /dev/sda1 -y (adjust the device name and fsck variant to match your file system).
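
Before letting fsck fix anything with -y, it may be worth running a read-only pass first. A minimal sketch, assuming an ext4 file system on /dev/sda1:

    # Dry run: report problems without changing anything on the disk
    fsck.ext4 -n /dev/sda1

    # If the report looks sane, let fsck repair the file system automatically
    fsck.ext4 -y /dev/sda1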

Helpful source: http://linuxfanboy.com/index.php/Linux_Software_RAID (see the screenshots there).