Joined: Aug 13, 2004 Posts: 343 Location: Maysville, KY
Posted: Sat Sep 09, 2006 8:07 am Post subject: e2fsck: unable to set superblock flags on....
Well folks, you see that error message, and you are in for a ride. Why you may ask? Well, more than likely you have a hard drive going bad or you have had a power outage and you have a corrupt file system.
Want to know how I know this? Cause I have one of those nice Toshiba laptops that chews hard drives like bubble gum, and it's on its 5th hard drive in about as many years. Yeah, I know...should have sent it back in the class action lawsuit.
Anyways, here is some helpful information....mostly posting this cause I get tired of having to look it up all over the net.
If you can't boot linux due to filesystem corruption or hard drive going out, but you want to try to salvage the system...follow these steps.
Grab your favorite linux rescue disk - if you have any copy of Redhat version 8 or above or Fedora Core 1 or above, the very first cd of these sets has a rescue mode built in. Otherwise, grab some other disk that has a rescue mode. These instructions assume a redhat or fedora core disk is being used.
Boot the first cd and at the boot prompt, type linux rescue
Press enter
Enter whatever language you use for the language and keyboard options
Skip setting up networking
Skip attempting to mount your disks
You will be presented a shell
At the shell, run the following command:
fdisk -l /dev/hda
Substitute whatever your hard disk is; I used hda in the example. This command shows all the partitions created on the hard drive. The bootable partition is marked with an *. Mine looks like this:
Device     Boot   Start    End     Blocks   ID  System
/dev/hda1             1     36     289138   83  Linux
/dev/hda2            37   7296   58315950    5  Extended
/dev/hda5          1557   7296   46106515   83  Linux
/dev/hda6            37    167    1048173   82  Linux swap
/dev/hda7   *       167   1556   11161225   83  Linux
In the example above, /dev/hda7 is the bootable partition (my root filesystem). From memory, I know that /dev/hda1 is the /boot partition and /dev/hda5 is the /home partition.
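If the partition listing is long, the bootable entry can be picked out mechanically. This is only a minimal sketch and not part of my original procedure: the function name bootable_partitions is made up for illustration, and it assumes the classic fdisk -l layout where the Boot column holds a * right after the device name.

```shell
# Hypothetical helper: print every device whose Boot column is "*".
# Assumes the classic `fdisk -l` column layout (Device, Boot, Start, ...).
bootable_partitions() {
    awk '$1 ~ /^\/dev\// && $2 == "*" { print $1 }'
}

# Usage (needs root to read the partition table):
#   fdisk -l /dev/hda | bootable_partitions
```

Newer fdisk versions may lay out their columns differently, so check the output by eye before trusting it.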
Knowing the above, we can attempt to repair the system using the command e2fsck. This is done like this:
e2fsck /dev/hda7
which will check the /dev/hda7 partition for errors.
You need to do the above for each partition that might have corrupt data. I ran the tests on each of these: /dev/hda1, /dev/hda5 and /dev/hda7. The first time I did this, about 4 days ago in order to try to salvage the data that is there, each of the above commands worked flawlessly and allowed me to boot into linux, where I then started recovering data to an external usb drive I bought just for this - a Western Digital 250GB USB 2 drive.
However, after recovering data for several days, the system locked up again and I started this process over and ran into a problem. Each time I ran e2fsck on a partition, I was presented with this error message:
e2fsck: unable to set superblock flags on /dev/hda7
This is not a good error message, and basically requires us to use some more commands to find out what is going on. The commands we will use are tune2fs with the -l (list) option, and mke2fs with the -n option.
Run the following command first: tune2fs -l /dev/hda7
This will show a lot of information, however, the important part is Block size. Mine says 4096 which means I have a 4k block size. This is helpful for specifying the next superblock to use with e2fsck and the -b option
You can also use mke2fs with the -n option to display all of the backup superblocks on the partition. With -n, mke2fs only prints what it would do and does not actually create a filesystem. Mine looks like this for /dev/hda7:
mke2fs -n /dev/hda7
Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
Now, I can attempt to have e2fsck use a backup superblock with the -b option, like this:
e2fsck -b 98304 /dev/hda7
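The backup locations that mke2fs -n reports follow a fixed pattern, so they can be worked out by hand if mke2fs is not on your rescue disk. With the sparse_super feature, a backup lives in block group 1 and in every group whose number is a power of 3, 5 or 7. The sketch below is my own illustration, not something from the e2fsprogs tools: the function name is made up, and it assumes sparse_super and a block size above 1024 so that group g starts at block g times the blocks per group.

```shell
# Hypothetical helper: list the expected backup superblock locations.
# $1 = total number of filesystem blocks, $2 = blocks per group
# (tune2fs -l shows these as "Block count" and "Blocks per group").
# Assumes sparse_super and a block size above 1024 bytes.
backup_superblocks() {
    total=$1
    per_group=$2
    groups=$(( (total + per_group - 1) / per_group ))
    {
        # Group 1 always carries a backup, if it exists.
        if [ "$groups" -gt 1 ]; then
            echo "$per_group"
        fi
        # So does every group whose number is a power of 3, 5 or 7.
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo $(( g * per_group ))
                g=$(( g * base ))
            done
        done
    } | sort -n
}

# Roughly my /dev/hda7: about 2790306 blocks of 4k, 32768 blocks per group
backup_superblocks 2790306 32768
```

Any of the printed numbers can be tried with e2fsck -b. If e2fsck cannot work out the block size on its own, its -B option can be used to force it, for example -B 4096.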
Posted: Tue Sep 12, 2006 2:51 am Post subject:
So, if you have made it this far and still can't get the drive to boot - here is what I would recommend:
have a usb drive of some sort for storing the most important data (I went and purchased a Western Digital 250GB USB drive from wally world simply cause I couldn't wait on one to come in from my distributor)
Power the unit off
Have some sort of rescue cd lying around - Redhat version 8 or 9 and all versions of Fedora Core's first cd serve as rescue disks, as does a gentoo installation cd or any type of live cd you may have around (Ubuntu, Kubuntu, Knoppix, etc). If you don't have this type of rescue cd, here is a link to one that is only 80mb in size - http://rescuecd.sourceforge.net/
With the unit powered off, hook up the usb drive and power on the unit and insert the cd into the cdrom drive. Make sure you boot to the cdrom.
For redhat/fedora core - select F5 at the installation screen and then type linux rescue followed by return. When it asks about setting up networking, you can select no. When it asks about locating your installation, select skip because we do not want to mount any of our partitions.
For gentoo or the cdrom I linked to above, just allow the cd to boot until you have a bash prompt
For ubuntu/kubuntu - just allow the cd to boot into whatever desktop environment the cd uses (gnome for ubuntu, kde for kubuntu)
Once that is finished, create the mount point for your usb hard drive using this command:
sudo mkdir /mnt/usb
and then mount the drive using this command:
sudo mount /dev/sda1 /mnt/usb
Make sure you substitute whatever your drive is for /dev/sda1.
With your usb drive mounted, now it's time to do some serious file system recovery. We will be using the badblocks and debugfs utilities. You should have a look at the badblocks manpage and the debugfs manpage. I will go over the options to these in order to show you how to recover files from the drive.
Using the fdisk utility, we need to get a listing of our drive partitions (unless you already know which partitions you need to recover files from). I will use /dev/hda in this example. You will need to modify all steps for your particular hard drive. Use the following fdisk command to show the partition structure of your disk:
sudo fdisk -l /dev/hda
Mine is already displayed in the first post above, so I won't show it again, but to recap - I will be working on /dev/hda7 in my examples, which is my root partition, first. This might not sound logical at the moment, but fortunately I had a recent backup of my home directory. /dev/hda5 is my home partition, and /dev/hda1 is my boot partition.
Next, we will get a listing of all of the bad blocks on the partition and determine if those blocks are occupied by any files. We need to run the following command; again, make sure you substitute whatever partition you are working on for /dev/hda7:
sudo badblocks /dev/hda7
The command above will print a listing of bad blocks, one per line. You should save this to a file if it is longer than your shell screen. You can save to a file using the -o filename option to the badblocks command like this:
badblocks -o blocks.txt /dev/hda7
which will save it to your current working directory within your rescue cd. This command performs a read-only test to determine if the blocks are bad.
After the command above completes, we now have a listing of all the bad blocks within that partition.
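One thing to watch here: badblocks counts in 1024-byte blocks unless told otherwise with its -b option, while the debugfs commands used later expect filesystem block numbers (4096-byte blocks on my partition, per tune2fs -l). The simplest fix is to run badblocks -b 4096 -o blocks.txt /dev/hda7 so the numbers line up. If a list was already made with the default, the sketch below converts it. The function name is made up for illustration.

```shell
# Hypothetical helper: convert a badblocks list made with the default
# 1024-byte block size into filesystem block numbers.
# $1 = filesystem block size in bytes (e.g. 4096); list is read from stdin.
to_fs_blocks() {
    awk -v bs="$1" '/^[0-9]+$/ { printf "%d\n", int($1 * 1024 / bs) }' | sort -n | uniq
}

# Usage:
#   to_fs_blocks 4096 < blocks.txt > fs_blocks.txt
```

Several 1k blocks fall inside one 4k filesystem block, which is why duplicates are removed at the end.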
Next we will use debugfs to work on the actual hard drive to first determine if the bad blocks and inodes within it contain data that we need to recover. If so, we will recover the files to our usb drive. If not, we will mark the block as bad and move to the next one until we are finished.
debugfs is a very powerful program - able to work on drives that are completely inaccessible. It works at the lowest level possible. It can destroy data too if you are not careful. There are many options you should look over in the manpage. However, I will only cover the most important options.
The command debugfs /dev/hda7 opens that partition in read-only mode. This is the way I encourage you to first open the drive. Recover as much data as you can from it, then we can exit and open the partition back up in read-write mode to actually mark the blocks bad. I will cover how to open the partition in read-write mode later.
When you run the command, you are presented with a prompt that resembles this: debugfs:
You can type a ? and press enter to get a listing of all commands and options available to you.
The command params will show you if the drive is in read-only or read-write mode. Just type params and press enter to see.
The command open will open another filesystem if you are working on more than one. It uses the syntax open /dev/whatever
The command close closes the currently open filesystem. It does not need a device name.
The stats command will show information about the superblock by group
The command ls will list the files and directories on the partition you are working on - it's also the easiest way to make sure we are working on the correct partition if you are unsure what each of the partitions returned by fdisk -l holds. So, right now, type ls and press enter. Is the file listing what it should be? If so, continue. If not, either type quit followed by return and re-run debugfs /dev/whatever to open the correct partition, or type close to close the current partition followed by open /dev/whatever to open the next partition.
The command pwd will print the current working directory
This is a basic overview of actually checking the bad blocks listing. The first command we will use is testb followed by the bad block number. In my example, I will use 890624 which is the first bad block listed on my list. My screen looks like this: debugfs: testb 890624
If the test says not in use, we know we have not lost any data in that block. On a piece of paper, create a column for in-use and another for not-in-use. Write down that bad block number in the not-in-use column and then test the next one. Later we will go back in read-write mode and set the block as bad so that it is not used.
If the block is marked as in use, we need to determine what file occupies it and if it is a file we need, we will attempt to back it up or recover it. However, more than likely if the block is bad, so is the file inside it and you will need to restore from backup. If the block is in use, write down that block number on our piece of paper under the column in-use
We will use the command icheck 890624 (make sure you substitute your bad block number for 890624), which will give us the inode that owns the block. Once we have the inode, we will use the command ncheck to get the name of the file using that inode. The format is ncheck inode_number. You can also write down the filename if you would like next to the block number in the in-use column on your piece of paper. Later we will go back in read-write mode and mark them as bad.
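Typing testb for every block gets old fast on a long list. debugfs can read its commands from a file with the -f option (or run a single one with -R), so the block list can be turned into a command file. This is just a sketch: the function name and the file names blocks.txt and testb.cmds are made up, and it assumes the list holds one filesystem block number per line.

```shell
# Hypothetical helper: turn a list of block numbers (one per line on stdin)
# into debugfs "testb" commands, skipping blank or non-numeric lines.
make_testb_cmds() {
    awk '/^[0-9]+$/ { print "testb " $1 }'
}

# Usage (read-only, so nothing on the partition is changed):
#   make_testb_cmds < blocks.txt > testb.cmds
#   debugfs -f testb.cmds /dev/hda7
```

The output reports for each block whether it is in use, which saves the piece of paper for the blocks that actually need icheck and ncheck.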
Assuming you have made it through all of the bad blocks and wrote down on a piece of paper whether they were in-use or not, we can continue to recover as much data as possible. Since my /dev/hda7 is my root partition, I want to recover my entire /etc/ directory and all of its files and folders. Why? Well, this directory contains most of the configuration files I will need when I reinstall linux. It's just helpful to have a backup of those files for things such as my tweaked xorg.conf file and other things I customized.
The command we will use to recover a directory is rdump - which means to recursively dump the directory and all files. Its format is rdump folder path_to_save. In my example then, saving to my usb drive, I would use the following command to save the entire /etc/ directory:
debugfs: rdump etc /mnt/usb
When that completes, you can move on to other files and directories. Just make sure you do not try to recover a file or directory that is bad, or the system may hang.
If you need to recover a subfolder, for example, /etc/X11 you can use the command cd /etc to actually change directories to the /etc/ directory of the partition you are working on. You can use ls to make sure that is where you are after you change directories.
If you only want to dump a single file, such as /etc/X11/xorg.conf, you can use the dump command followed by the path to the file and then the name of the file to save it as. So, using the same file as above, the command would be: dump /etc/X11/xorg.conf /mnt/usb/xorg.conf
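The same dumps can also be run without sitting at the debugfs prompt, by passing one command at a time with the -R option. The sketch below only prints the commands so they can be checked before anything is run. The function name is made up for illustration, and it does not handle paths containing spaces.

```shell
# Hypothetical helper: print (not run) one debugfs command per directory.
# $1 = partition, $2 = destination directory, remaining args = directories.
rdump_cmds() {
    dev=$1
    dest=$2
    shift 2
    for dir in "$@"; do
        printf 'debugfs -R "rdump %s %s" %s\n' "$dir" "$dest" "$dev"
    done
}

# Usage:
#   rdump_cmds /dev/hda7 /mnt/usb /etc /root
```

Running each directory as its own debugfs call has one practical upside: if a dump hangs on a bad file, only that one command has to be killed.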
Now with as much data recovered as possible, it's time to make a few choices.....first, is the hard drive so bad that it's not usable? If your answer is yes, simply quit debugfs, exit your rescue cd rom, and power off to replace the drive. If your answer is no or maybe, you need to mark the blocks as bad and then either attempt to boot the system up or reformat and reinstall, or you can also replace damaged files from cdrom, etc to your hard drive.
I will assume you think the drive is still usable and you want to permanently mark the blocks as bad.....it's time to mark the blocks as bad and unallocate them. We need to open debugfs in write mode. The easiest way is to use the quit command to exit, and then use debugfs -w /dev/hda7 to open in write mode. The -w option does just that.
In read-write mode, we can use clri, which clears the inode, so you would type clri inode_number (make sure you use the inode number you found for the block you are working on). This clears the contents of the inode so it no longer points at its blocks. Remember, you'll have to be in read-write mode to do this. Note that these commands are irrevocable in read-write mode.
Next we use setb block_number to mark the bad block as permanently allocated, so that it is never handed out to a file again.
When all blocks are marked bad, you can now quit and reboot to either reinstall linux or see if you can now boot into a state where you can replace damaged files.
Posted: Tue Sep 12, 2006 4:07 am Post subject:
One option I forgot to mention about debugfs is the -c option - this option is for catastrophic mode. In the event you are unable to recursively copy a directory or dump a file using the rdump or dump commands inside debugfs, you can attempt to quit debugfs if it didn't lock up on you, and then open it using this option:
debugfs -c /dev/hda7
again using the same examples as previously.
This allows debugfs to not read inode or group bitmaps and may allow you to complete the rdump or dump commands.
Posted: Tue Sep 12, 2006 9:14 am Post subject:
I might also add that when I first got the usb drive, it was partitioned as a single 250GB fat32 drive for windows. When recovering linux files, a fat32 drive is limited in several areas: it cannot store user and group settings, and file sizes are limited to 4GB. If you have files larger than 4GB (very possible, especially if you are having to resort to the next steps below to recover entire partitions), you will want to create an ext3 filesystem, or ext2, or whatever filesystem you prefer.
What I did to overcome this was use the qtparted utility included with kubuntu 6.06 to resize the existing partition down to about 100GB and then create an ext3 filesystem with the remaining available disk space (not quite 150GB). I don't ever use windows myself personally, but sometimes I have to work on client/customer computers, and having a fat32 partition might come in handy during those times.
The ext3 overcomes the file size limitation by allowing me to save very large files to the ext3 filesystem and to also keep file permissions and ownership of the files the same.
When all else fails with the above posts to recover data, what you can do is use the dd command to make an exact image of the partition and save it to a file on the usb drive's ext3 filesystem. The dd man page is helpful, but some rescue cd's do not include man pages in order to limit the size of the download. For that reason, here is some helpful information regarding dd:
You can then mount that file like a disk image/cdrom image and work with it there to extract files. I am using the same hda7 as the partition to copy. I also created a second mount point for the ext3 filesystem in /mnt/usb2 and mounted it like so:
sudo mount -t ext3 /dev/sda2 /mnt/usb2
The dd command to copy the partition to that location is then:
dd if=/dev/hda7 of=/mnt/usb2/hda7.img conv=noerror
The option conv=noerror above instructs dd to continue in the event of an error.
If you are mirroring the partition to another hard drive instead, the command is:
dd if=/dev/hda7 of=/dev/hdb7 conv=noerror,sync
This would copy /dev/hda7 partition to /dev/hdb7 partition. The noerror option means the same as above and the sync option means to write 0's in place of the failed copy (sync).
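The sync part matters more than it looks. As far as I understand it, with noerror alone a block that fails to read is simply left out of the output, so everything after it lands at the wrong offset in the copy. With sync, every short or failed read is padded with zeros up to the full block size, which keeps the rest of the data where the filesystem expects it. The padding can be seen safely on a pipe without touching any disk. The function name below is made up for the demonstration.

```shell
# Hypothetical demo: feed a short input through dd with conv=sync and
# report how many bytes come out. $1 = input text, $2 = block size.
padded_size() {
    printf '%s' "$1" | dd bs="$2" conv=sync 2>/dev/null | wc -c | tr -d ' '
}

# A 3-byte input is padded with zero bytes up to one full 512-byte block.
padded_size abc 512    # prints 512
```

The flip side is that a large bs with sync zero-fills a whole block around each bad sector, so a small block size loses less data at the cost of speed.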
You can then mount the created disk image much like a normal cdrom iso using the command:
mount -t ext3 /mnt/usb2/hda7.img /mountpoint -o loop
And, in the event that dd seems too slow, or it errors, you can also try dd_rescue
dd_rescue is used on drives that have a lot of errors.
I borrowed the information below from this web site:
The old and slow method using dd
Run
dd if=/dev/old_disk of=/dev/new_disk conv=noerror,sync
to copy the data, or to create an image file:
dd if=/dev/old_disk of=image_file conv=noerror
To speed up the copy process, you can append bs=8k, it will read/write the disk by 16 sectors at a time.
Kurt Garloff's 'dd_rescue'
If you believe there are many damaged sectors on the drive, you can try using Kurt Garloff's 'dd_rescue' (dd_rescue) instead of dd.
The best method: Antonio Diaz's 'ddrescue'
The best solution, both faster and more efficient, seems to be Antonio Diaz's 'ddrescue' (ddrescue)
# first, grab most of the error-free areas in a hurry:
ddrescue -B -n /dev/old_disk /dev/new_disk rescued.log
# then try to recover as much of the dicey areas as possible:
ddrescue -B -r 1 /dev/old_disk /dev/new_disk rescued.log
Added 09-21-2006
If the hard drive has a lot of errors and you wish to quickly recover the good information, have a look at this dd_rescue helper script - dd_rhelp
From the dd_rhelp web site - In short, it'll use dd_rescue on your entire disc, but will try to gather the maximum valid data before trying for ages on badsectors. So if you leave dd_rhelp work for infinite time, it'll have the same effect as a simple dd_rescue. But because you might not have this infinite time (this could indeed take really long in some cases... ), dd_rhelp will jump over bad sectors and rescue valid data. In the long run, it'll parse all your device with dd_rescue. You can Ctrl-C it whenever you want, and rerun it at will, it'll resume its job as it depends on the log files dd_rescue creates.
In addition, progress will be shown in an ASCII picture of your device being rescued.
As stated by Kurt Garloff for his dd_rescue program : "Just one note: It does work. I unfortunately did not just create this program for fun ..."
As it is for dd_rhelp, which has saved me YEARS on my hard drive.
Why do people want to use dd_rhelp ?
-----------------------------------
Well, you do not WANT to use dd_rhelp. I hope you'll never have to use it.
Basically, if you have bad sector problems you'll have several solutions depending on the filesystem, the partition table, and what is accessible...
Sometimes, you'll have to copy all the valid partition data in a file on a healthier filesystem. Then mount the file with loopback device to rebuild the damaged filesystem information. This is where dd_rhelp and dd_rescue are meant to be used.
dd_rescue is a great program, but using it sometimes can be time consuming as it won't stop on errors but will take long to get over them. This can really take a long time if you have many bad sectors. (and I had this problem).
As bad sectors tend to be in large groups and these groups seem to be scattered across the drive, if you just launch dd_rescue on the beginning of your drive and there is a large group of bad sectors coming next, you could be waiting for years ! (and without rescuing any data). And you cannot know if there is any valid data to rescue AFTER this chunk and how long it will take...
So your solution with dd_rescue is to jump ahead randomly and try to copy from a chosen offset. Then you could again fall on a group of bad sectors...and then you should stop dd_rescue and jump somewhere else on your drive. This behavior involves the user's constant presence (you !).
The idea of the dd_rhelp shell script is to do this job : launching dd_rescue for you on the disc while trying to get the max amount of data out of your disc in a minimum of time. It'll be jumping over bad blocks, using the reverse copy option of dd_rescue to pin down bad sector groups and rescue as much data as you could have rescued manually.
Why use dd_rhelp and not dd_rescue ?
------------------------------------
This is a good question. dd_rhelp uses dd_rescue to compute a recovery path through the device that will focus on valid data recovering. This recovery path will go through all the device, exactly as dd_rescue could do it on its own without any path. This means that dd_rhelp will save you time ONLY IF YOU INTEND TO CANCEL ITS JOB BEFORE THE END of a full recovery.
Why wouldn't you want a full recovery ? Because a considerable amount of time is taken to try to rescue badsectors. This amount of time can be measured in days, months, years, depending on your device capacity and its defectiveness. You might not want to spend this time knowing that 99 percent of this time will be taken to look at badsectors and won't lead to any more data recovering...
dd_rhelp shifts this useless waiting time to the end. Using dd_rescue straight throughout your device makes your waiting time dependent on the badsector distribution.
Think about dd_rescue standalone if you only intend (and can afford) to wait until a full dd_rescue scan. dd_rhelp optimizes only the order in which this full scan will occur to focus on recovery of what will be recoverable first. So in the end, launching dd_rhelp for a full scan will take exactly the same time dd_rescue would have taken plus a considerable time which corresponds to the overhead of calculating its path.
How should I use it ?
---------------------
First build it from sources, with "./configure && make". Optionally run "make install"...
This shell script is very basic and not well written, but it supports the "--help" and "--version" of GNU Coding Standard. It should be quite clear.
so go for a :
dd_rhelp --version
When running dd_rhelp you can safely Ctrl-C, or kill dd_rhelp, it'll resume its job the next time you call it.
Olivier SANTIANO, a french dd_rhelp user shared his experience of complete process of recovering his hard drive with dd_rhelp and post-dd_rhelp recovery work : http://f1efq.free.fr/save.htm (in french)
How do I install this package ?
-------------------------------
basic : "./configure && make && make install" should do the trick.
How does it work ?
-------------------
dd_rhelp uses log files made by dd_rescue. Precisely, it searches for the "Summary report" that dd_rescue prints when its job is over.
1 - dd_rhelp creates for itself an internal representation of what has been parsed with dd_rescue.
2 - It'll find the greatest part of the disk that hasn't been tested and will launch dd_rescue from the middle of this part backwards and forwards until it rescues without error all data, or until it falls on 5 consecutive read errors.
3 - go back to step 1 unless everything has been dd_rescued...
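To make step 2 a little more concrete, here is a rough sketch of the "find the biggest untested part and start from its middle" idea. This is my own illustration and not dd_rhelp's actual code: the function name, the input format (one "start end" pair per line, end exclusive, sorted by start) and the use of block numbers are all assumptions.

```shell
# Hypothetical helper: given already-tested ranges on stdin and the device
# size in $1, print the midpoint of the largest untested gap.
largest_gap_midpoint() {
    awk -v size="$1" '
        BEGIN { pos = 0; best = 0; mid = -1 }
        {
            # Gap between the end of what we have seen and this range
            if ($1 > pos && $1 - pos > best) {
                best = $1 - pos
                mid = pos + int(($1 - pos) / 2)
            }
            if ($2 > pos) pos = $2
        }
        END {
            # Gap between the last tested range and the end of the device
            if (size > pos && size - pos > best) {
                best = size - pos
                mid = pos + int((size - pos) / 2)
            }
            print mid
        }'
}

# Tested so far: blocks 0-100 and 400-500 of a 1000-block device.
# The largest gap is 500-1000, so the next start point is its middle.
printf '0 100\n400 500\n' | largest_gap_midpoint 1000    # prints 750
```

From that midpoint, the description above says dd_rescue is then run both forwards and backwards until it hits the edges of the gap or too many read errors.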
Requirements ?
--------------
If you can ./configure && make... then it should execute fine. It worked fine for me (Home made distrib) on big harddrives (partitions of 15 Gigs). Received positive feedback on large partitions (60 Gigs and 200 Gigs), and it should only be limited by the linux kernel limitation. Though the bash script could take longer to compute the next position on a very large disk with lots of bad sectors scattered all over it.
It worked on Debian, and on a Knoppix CD. Tested it with a 1.44M diskette with badsectors and it worked fine. (This was for version 0.0.2)
Darwin/MacOSX should be supported with GNU sed, GNU bc installed. Has been working since 0.0.6 .
Tested 0.0.4 on : (with a 1.44 diskette with badsectors)
- Sys Rescue CD (http://www.sysresccd.org)
  * Note : you'll need to "make" it somewhere else.
- KNOPPIX 3.3 (http://www.knoppix.com)
  * Note : perfect, you can configure && make and download dd_rescue v1.03 and start recovery.
If you have any other experiences of dd_rhelp, please let me know.
IMPORTANT NOTE :
This shell script needs version >= 1.03 of dd_rescue !!!! It won't detect your version in "./configure && make" but at runtime !
Edited: 09-21-2006 to add links about dd and to add all information about dd_rhelp
Posted: Tue Sep 19, 2006 3:15 am Post subject:
so, once you have as much data off the drive as possible, now what?
You have a couple of choices. If, like me, your hard drive is still under warranty and you are sending it back to the manufacturer....you might not want to send it back with sensitive information on it. If this is your case, read Scenario 1.
If your hard drive is not under warranty, read Scenario 2.
Scenario 1
Use this if you are sending the hard drive back to the manufacturer and you have sensitive information on the device. This also ensures the data is erased.
First thing I did was use dd to erase the mbr. The format was this:
dd if=/dev/urandom of=/dev/hda bs=512 count=1
If /dev/urandom is not available on your linux bootable disk, you can use /dev/zero instead, which writes zeros rather than random data.
I then chose to selectively overwrite parts of the partitions with garbage. Why? Well, the more times I write to the disk, the lower the chances that data can be recovered.
I used this dd command format for doing this:
dd if=/dev/urandom of=/dev/hda1 bs=1024 count=37
dd if=/dev/urandom of=/dev/hda2 bs=4096 count=50
dd if=/dev/urandom of=/dev/hda5 bs=4096 count=150
The important thing to note about the above is that the blocksize is the actual block size for that partition; the count number is just a random number. I did the above for each partition twice using various count numbers.
The next thing I did was erase the entire hard drive. The format I used was:
dd if=/dev/urandom of=/dev/hda bs=1M
Expect up to 1 hour per gig for the above command to finish, i.e. a 60GB hard drive might take as long as 60 hours....or longer.
The above command literally erased the entire hard drive. It removes the mbr and all partition information.
Just out of curiosity, when I was done I tried to recover a sample of the boot partition (amounting to about 3/4 of the total partition size). It wasn't full, but I knew it was almost full....so chances are that there should have been data there if there was any at all. Using the tools mentioned for recovering data in the previous posts, I found basically no recoverable and readable data.
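If the wipe was done from /dev/zero rather than /dev/urandom, there is a quicker way to check the result: every byte on the device should be zero. The sketch below is only an illustration with a made-up function name. It does not apply to a /dev/urandom wipe, where the disk is full of random bytes by design.

```shell
# Hypothetical helper: succeed only if the file or device in $1 contains
# nothing but zero bytes. Reads the whole thing, so it is slow on a disk.
is_zeroed() {
    # Count the bytes that remain after deleting every NUL byte
    left=$(tr -d '\000' < "$1" | wc -c | tr -d ' ')
    [ "$left" -eq 0 ]
}

# Usage:
#   is_zeroed /dev/hda1 && echo "partition is blank"
```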
This step is really optional, but I thought it would make this post complete: I would also create a blank partition that fills the entire disk and format it. Then run the badblocks program to locate all bad blocks....if it will actually run at all and not lock up.
You can now proceed with sending your hard drive back to the manufacturer. You can also, if you haven't already, restore your backed up data to the new hard drive.
Scenario 2
Your hard drive isn't under warranty but it does have sensitive information that you don't want someone to possibly get. Well, follow the steps above and erase the disk. When you are done, you can either destroy the hard drive or dispose of it in the trash. I would recommend you destroy the drive as an added measure of safety.
Restoring Data
So, you may be asking....how do I restore the data to the new hard drive? It really depends on what you want restored and the way you recovered it. I basically created images of the partitions, then mounted the images and checked what was still good. If I needed it, I copied the data to a usb hard drive. I then installed the new hard drive, installed Linux, and selectively restored the data that I recovered.
If you recovered entire partitions without errors, you could also install the new drive, format the drive the same way as the old one, and restore from the images created using the dd utility. For example, let's assume I was able to recover the entire home partition, which on my drive was hda5. I would reinstall linux on the new drive, create a user account the same as the old failed drive had and boot to make sure it is working. After I verified the system is working, I would then boot to a rescue cdrom and restore the image using dd. The format would be like this (notice that the partitions have to be the exact same size between the failed hard drive and new hard drive for this to work):
dd if=/path_to/ddimage.hda5 of=/dev/hda5 bs=4096
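Since dd will happily write past what you intended, it is worth comparing sizes before running the restore. The sketch below only checks that the image is not larger than the target partition. The function name is made up, and the partition size is assumed to be the 1K "Blocks" figure from fdisk -l.

```shell
# Hypothetical helper: succeed only if the image file is no larger than
# the target partition. $1 = image file, $2 = partition size in 1K blocks
# (the "Blocks" column of fdisk -l).
image_fits() {
    img_bytes=$(wc -c < "$1" | tr -d ' ')
    [ "$img_bytes" -le $(( $2 * 1024 )) ]
}

# Usage, with the Blocks value of my old hda5 as the example:
#   image_fits /path_to/ddimage.hda5 46106515 && echo "ok to restore"
```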
You will also likely need to fix the user and group ids of the files since they are probably different. This is done with a simple command:
chown -R username:group /home/username
Another method is to mount the created image and then cp or cp -R the files back. This requires you to boot into the new hard drive with linux, create a mountpoint for the image, mount the image file on the usb drive, and then cp (selectively) or cp -R (all files and folders) the files you need to the /home partition. You will then need to manually add and fix usernames, passwords, group and user id numbers....or manually create the accounts and then chown the files to the correct user and groups.
All content Copyright 2000 - 2008, Maysville Linux Users Group unless otherwise credited.