Space aliens ate my HFS headers

Code and wisdom in this article have not been kept up-to-date. Use them at your own peril.

I am pretty good about keeping backups. (I learned the hard way.) Yet, due to a variety of problems, I ended up not being backed up for about two weeks at the end of November 2002.

I was sitting in front of my PowerBook G4. The new backup media was due to arrive tomorrow. I casually unmounted my external FireWire disks, and moved to the living room to watch a DVD.

An hour and a half later, I plugged the external disks back into the laptop.

What do you mean, “Mac OS X doesn’t recognize volumes”?

The dawning comprehension

A trip to Mac OS 9, and I knew that both my external disks had an “error Disk First Aid cannot repair”, that “The directory was too severely damaged for DiskWarrior to repair it”, and that “Norton Utilities encountered an error #-39 and cannot continue”.

Not good.

The program that had the most useful error message was DiskWarrior. Not because I know what directory structure is, nor because I particularly wanted to hear “too severely damaged” that day. It was because it told me to email their support and they might be able to help me.

Alsoft Tech Support

So I did, and they told me to download a copy of Sedit.

Then we got on the phone, and they walked me through looking at various sections of my disk, trying to figure out where the missing bits of the data were.

They weren’t there. It was clear that Mike Rodgers, the Alsoft tech support guy, was doing all he could, but that my disk was beyond his ability to resurrect.

The optimism

I thanked Mike, hung up, and started thinking about my options. I could wipe the disks and restore from two-week-old backups. I would lose a few days’ worth of code (everything else was in CVS). A few things here and there. Nothing I couldn’t bear to part with.

However, there was one thing that really kept me thinking. When I was on the phone with Alsoft and we looked at the damaged part of the disk, a lot of things in that area seemed valid. It’s just that a whole lot of really important bits were completely bogus.

Hope springs eternal; I figured there was a slim chance that I could work out which few bits were trashed, and recover them.

The happy ending

I spent two days reading the HFS+ format technote. I analyzed the extent and the nature of the damage, and repaired it by hand. I got all my data back without any problems.

I emerged victorious, and came to two conclusions:

  • Blame most likely falls on Mac OS X. It either caused the damage, or failed to prevent someone else from causing the damage; either way, the operating system should not do this to my disk.
  • This, therefore, may happen to other people.

And hence I present you with…

The super-scary instructions for fixing your trashed volume headers

I cannot emphasize this enough: if you do something wrong while following these instructions, you can easily trash your data beyond repair. Even beyond my ability to repair it. If you don’t know what you are doing, stop here.

The tools

You will need the following tools:

  • Any hexadecimal calculator. I used MacsBug, but that’s really heavy-weight. Find one that you like.
  • Sedit

All numbers below here are hexadecimal, unless noted otherwise.

Finding your HFS+ volume headers

The volume header is the part of your disk which tells Mac OS all sorts of valuable information about the location of the data on the disk. The first step is to find the volume headers to determine whether they are damaged in a way that can be repaired with what I learned.

Start up your computer in Mac OS 9 with the damaged FireWire disk unplugged. Launch Sedit and plug in the disk. From the File menu in Sedit, choose Open Disk Thru Driver. Sedit will present you with a list of disks, one of which will be listed with either the name of your damaged disk or “Macintosh HD”. It is likely to be the last disk in the list. Choose that disk from the list and click OK.

An Sedit window will open showing you a dump of the data on your disk.

From the Block menu, choose Read Block Number, enter the number 2 in the dialog that appears, and click OK.

From the Template menu, choose HFS MDB. The data dump in the window will be labelled and rearranged. Write down the values of the following fields: Sig, ABlkSz, AlBlSt, VCSize, VCBM, CtlCSz.

If Sig has a value other than 4244, or VCSize has a value other than 482B, check that “Displaying Block” in the top left corner of the window claims you are looking at block number 2. If it does, stop; these instructions are no good to you. If it doesn’t, start over; you are looking at the wrong block.
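If you have saved block 2 to a file and would rather check it with a script than by eye, the following is a minimal Python sketch of the same check. The byte offsets are an assumption on my part: they follow the classic HFS master directory block layout described in Apple's HFS+ technote, and I am assuming that Sedit's labels Sig, ABlkSz, AlBlSt, VCSize, VCBM and CtlCSz correspond to the fields at those offsets. The function names are hypothetical.

```python
import struct

def read_mdb_fields(block: bytes) -> dict:
    """Extract the six fields from a raw copy of block 2 (big-endian on disk)."""
    if len(block) < 130:
        raise ValueError("need at least the first 130 bytes of the block")
    (sig,) = struct.unpack_from(">H", block, 0)      # Sig: assumed offset 0
    (ablksz,) = struct.unpack_from(">I", block, 20)  # ABlkSz: assumed offset 20
    (alblst,) = struct.unpack_from(">H", block, 28)  # AlBlSt: assumed offset 28
    # VCSize, VCBM, CtlCSz: three consecutive 16-bit values, assumed offset 124
    vcsize, vcbm, ctlcsz = struct.unpack_from(">HHH", block, 124)
    return {"Sig": sig, "ABlkSz": ablksz, "AlBlSt": alblst,
            "VCSize": vcsize, "VCBM": vcbm, "CtlCSz": ctlcsz}

def looks_like_wrapped_hfs_plus(fields: dict) -> bool:
    """True when Sig is 4244 and VCSize is 482B, as the text requires."""
    return fields["Sig"] == 0x4244 and fields["VCSize"] == 0x482B
```

The sketch only reads; it never writes anything back to the disk or the file.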

Next, calculate the following values:

  • HFS+ extent start: extentStart = VCBM
  • Number of allocation blocks: allocationBlocks = CtlCSz
  • allocation block size: allocationSize = ABlkSz
  • allocation start: wrapperOffset = AlBlSt
  • HFS+ block ratio: blockRatio = allocationSize / 200
  • HFS+ volume offset: volumeOffset = blockRatio * extentStart + wrapperOffset
  • HFS+ volume header: volumeHeader = volumeOffset + 2
  • HFS+ backup header: backupHeader = allocationBlocks * blockRatio + volumeHeader - 4

For example, I got the following values for one of my damaged volumes:

  • extentStart = 5
  • allocationBlocks = FD36
  • allocationSize = 71000
  • wrapperOffset = 18
  • blockRatio = 71000 / 200 = 388
  • volumeOffset = 388 * 5 + 18 = 11C0
  • volumeHeader = 11C0 + 2 = 11C2
  • backupHeader = FD36 * 388 + 11C2 - 4 = 37E386E

and these for the other:

  • extentStart = 5
  • allocationBlocks = FF83
  • allocationSize = 1CD000
  • wrapperOffset = 18
  • blockRatio = 1CD000 / 200 = E68
  • volumeOffset = E68 * 5 + 18 = 4820
  • volumeHeader = 4820 + 2 = 4822
  • backupHeader = FF83 * E68 + 4822 - 4 = E613F56
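If you do not trust your hexadecimal arithmetic, the formulas above are easy to script. Here is a minimal Python sketch; the function name and the SECTOR constant are mine, and the 0x prefixes are just Python's way of writing the hexadecimal numbers used throughout.

```python
SECTOR = 0x200  # the 200 that allocationSize is divided by

def header_locations(extent_start, allocation_blocks, allocation_size, wrapper_offset):
    """Return (volumeHeader, backupHeader) as block numbers for Read Block Number."""
    if allocation_size % SECTOR:
        raise ValueError("allocationSize should be a multiple of 200 (hex)")
    block_ratio = allocation_size // SECTOR
    volume_offset = block_ratio * extent_start + wrapper_offset
    volume_header = volume_offset + 2
    backup_header = allocation_blocks * block_ratio + volume_header - 4
    return volume_header, backup_header

# The first example volume from the text
primary, backup = header_locations(0x5, 0xFD36, 0x71000, 0x18)
print(f"{primary:X} {backup:X}")
```

Printing with the X format keeps the results in hexadecimal, so they can be typed straight into Sedit.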

The last two values you calculated give you the location of your volume header and backup volume header. Next we will look at them and try to figure out whether they can be fixed using this procedure.

Sanity-checking the HFS+ volume headers

Go to the volume header, using the Read Block Number command in Sedit, and giving it volumeHeader as calculated above. Select HFS+ Volume Header from the Template menu and write down the following values: Sig, Vers, Attrib, LastVer, BlkSiz, TotBlk, BitBlk, Bit1.

Go to the backup volume header, using the Read Block Number command in Sedit, and giving it backupHeader as calculated above. Write down the following values: Sig, Vers, Attrib, BlkSiz, TotBlk, BitBlk, Bit1.

Compare the corresponding values from your volume header and the backup volume header. If they are not the same, stop now. These instructions do not apply to you. If they are the same, that’s a good sign.

Next, look at the Bit1 value you have for the volume header and the backup volume header. Its first half should be 00000001. Its second half should be equal to the value of BitBlk. If either of these is not true, stop now. These instructions do not apply to you.

Finally, look at the value of LastVer that you wrote down for the volume header. It should be 31302E30 (ASCII for “10.0”). If it is not, stop. These instructions do not apply to you.
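The three checks above can be written down as one function. This is a minimal Python sketch, assuming you have typed the values you wrote down into two dictionaries as integers, with Bit1 entered as a single 16-digit hexadecimal number. The dictionary keys and the function name are my own; the key LastVer follows the spelling in the list of fields to write down.

```python
COMPARED = ("Sig", "Vers", "Attrib", "BlkSiz", "TotBlk", "BitBlk", "Bit1")

def headers_are_repairable(primary: dict, backup: dict) -> bool:
    """Apply the sanity checks from the text to the two sets of values."""
    # 1. Corresponding values must be the same in both headers.
    if any(primary[name] != backup[name] for name in COMPARED):
        return False
    # 2. First half of Bit1 must be 00000001, second half must equal BitBlk.
    for header in (primary, backup):
        first_half = header["Bit1"] >> 32
        second_half = header["Bit1"] & 0xFFFFFFFF
        if first_half != 1 or second_half != header["BitBlk"]:
            return False
    # 3. The last-mounted version in the volume header must be 31302E30.
    return primary["LastVer"] == 0x31302E30
```

A result of False means the same thing as "stop now" in the text: the damage is not of the kind these instructions repair.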

Calculating the correct HFS+ volume size and block size

If you are still with me, then you are ready to calculate the correct values for the damaged fields. First, some intermediate values:

  • blockSizeSquare = allocationBlocks * blockRatio * 40 / BitBlk

Now, round blockSizeSquare up to a power of 4. In hexadecimal, a power of 4 is a 1 or a 4 followed only by zeros, so: if the first non-zero digit of blockSizeSquare is 1, 2, or 3, replace it with 4; if it’s 4 or higher, replace it with 10; replace all other digits with zero.

Next, take the square root of blockSizeSquare to calculate the HFS+ block size. If your hexadecimal calculator has a square root function, use it. Otherwise, follow this algorithm:

  • If blockSizeSquare has an even number of zeros after the first non-zero digit, replace the non-zero digit with its square root (1 stays 1, 4 becomes 2), and remove half of the zeros that come after it.
  • If it has an odd number of zeros and starts with 10, replace the 10 with a 4, and remove half of the remaining zeros. For example, 100000 becomes 400.
  • If it has an odd number of zeros and starts with 40, replace the 40 with an 8, and remove half of the remaining zeros. For example, 400000 becomes 800.

The number you thus get is the blockSize. Finally, calculate totalBlocks = allocationSize / blockSize * allocationBlocks.

Again, here are the examples of my two disks. For the first:

  • BitBlk = E0
  • blockSizeSquare = FD36 * 388 * 40 / E0 = FF78C4
  • blockSizeSquare rounded up to a power of 4 = 1000000
  • blockSize = 1000
  • totalBlocks = 6FC4D6

and for the second:

  • BitBlk = 399
  • blockSizeSquare = FF83 * E68 * 40 / 399 = FFCA05
  • blockSizeSquare rounded up to a power of 4 = 1000000
  • blockSize = 1000
  • totalBlocks = 1CC1EE7

Note that I got the same block size for both; this is expected. You should expect to get 1000 as well. If you don’t, it is much more likely that you did some math wrong than anything else.
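For anyone who would rather not round and take square roots by hand, here is a minimal Python sketch of the same calculation. The function names are mine. One assumption: a blockSizeSquare that is already an exact power of 4 is left unchanged by the rounding step here, which is how I read the intent of "round up".

```python
import math

def round_up_to_power_of_4(value: int) -> int:
    """Smallest power of 4 that is greater than or equal to value."""
    power = 1
    while power < value:
        power *= 4
    return power

def repaired_sizes(allocation_blocks: int, allocation_size: int, bit_blk: int):
    """Return (blockSize, totalBlocks) from the values read off the disk."""
    block_ratio = allocation_size // 0x200
    block_size_square = allocation_blocks * block_ratio * 0x40 // bit_blk
    # The square root of a power of 4 is exact, so isqrt loses nothing.
    block_size = math.isqrt(round_up_to_power_of_4(block_size_square))
    total_blocks = allocation_size // block_size * allocation_blocks
    return block_size, total_blocks

# The first example volume from the text
block_size, total_blocks = repaired_sizes(0xFD36, 0x71000, 0xE0)
print(f"{block_size:X} {total_blocks:X}")
```

math.isqrt needs Python 3.8 or later. If the script and your hand calculation disagree, treat that as a reason to recheck both before writing anything to disk.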

Repairing the volume headers

That’s it. Now we need to assemble all the repaired values and write them to disk. They are:

  • Sig = 482B
  • Vers = 0004
  • Attrib = 00000000
  • BlkSiz = blockSize, as calculated
  • TotBlk = totalBlocks, as calculated

To be extra safe, you should save the old (bad) values in your volume headers before you overwrite them. Use the following procedure:

  • Choose Save Blocks to File, and save 1 block starting at volumeHeader
  • Choose Save Blocks to File, and save 1 block starting at backupHeader
  • Go to the volume header, using Read Block Number and volumeHeader
  • Edit the five values you calculated
  • Choose Write Block to write the volume header
  • Choose Write to Block Number using backupHeader, to write the backup volume header
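To see exactly which bytes the five edits touch, here is a minimal Python sketch that applies them to an in-memory copy of a saved header block. It does not write to any disk; I did the real repair in Sedit as described above. The byte offsets are an assumption taken from the HFS+ volume header layout in Apple's technote (signature at 0, version at 2, attributes at 4, block size at 40, total blocks at 44), and the function name is hypothetical.

```python
import struct

def patch_volume_header(block: bytes, block_size: int, total_blocks: int) -> bytes:
    """Return a copy of a saved header block with the five repaired values."""
    if len(block) < 48:
        raise ValueError("need at least the first 48 bytes of the header block")
    patched = bytearray(block)
    # Sig = 482B, Vers = 0004, Attrib = 00000000 (big-endian, assumed offset 0)
    struct.pack_into(">HHI", patched, 0, 0x482B, 0x0004, 0x00000000)
    # BlkSiz and TotBlk as calculated (assumed offsets 40 and 44)
    struct.pack_into(">II", patched, 40, block_size, total_blocks)
    return bytes(patched)

# Example with the values calculated for the first volume in the text
saved = bytes(512)  # stand-in for a block saved with Save Blocks to File
fixed = patch_volume_header(saved, 0x1000, 0x6FC4D6)
print(fixed[:8].hex(), fixed[40:48].hex())
```

Comparing the output against what Sedit shows after you edit the fields is a cheap way to catch a typo before you choose Write Block.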

You are almost done. Your disk is now in the state where DiskWarrior can repair it. You must run DiskWarrior now to complete the repair. Enjoy.