Every day, I get asked one question over and over—namely:
“My backup is failing, and it’s saying that it couldn’t disable ignore permissions! What’s going on?”
This started the day Tiger shipped, and has continued on a hundreds-of-times-a-week basis since. And the worst part? It’s a bug in Tiger. So, put on your waders—we’re going in, hip-deep.
Basically, here’s the deal. OSX has a file called “volinfo.database”, which is stored in a normally-hidden-from-you folder named “/var/db”. It’s a simple text file that looks something like this:
290AC3DC4B28C8FC: 00000001
9A081A3451A451C7: 00000001
435F42E92DC994B5: 00000001
C3301F4FEBB1BC41: 00000001
F36BADBD7ACFBD6F: 00000001
AB47F9CD780BF0A8: 00000001
76E8B50DFBBBD9BB: 00000001
893D1C0A293603C7: 00000001
That looks like gobbledegook to most, but it’s really a database of the volumes you’ve got connected to your Mac, listed by UUID (a low-level internal identifier that’s constant, even if you rename the volume), followed by the state of their “Ownership” flag: 0 if ownership is off, 1 if it’s on.
When you attach a new drive to the system, the UUIDs of the drive’s volumes are added to the database, along with their “ownership” settings. And, when you check/uncheck the ownership box in Finder’s Get Info panel, the value changes here, too.
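To make the format concrete, here is a minimal Python sketch of reading such a file. It is only an illustration: the line pattern (16 hex digits, a colon, a space, 8 hex digits) is inferred from the sample above, and the function name parse_volinfo is hypothetical, not anything Apple or SuperDuper! ships.

```python
import re

# Assumed line format, inferred from the sample above:
# 16 hex digits (the UUID), ": ", then 8 hex digits (the ownership flag).
ENTRY = re.compile(r"^([0-9A-F]{16}): ([0-9A-F]{8})$")


def parse_volinfo(text):
    """Return {uuid: ownership_on} for every line; raise on a malformed line."""
    volumes = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            raise ValueError(f"unexpected line: {line!r}")
        uuid, flag = match.groups()
        # Per the description above: 0 means ownership is off, 1 means on.
        volumes[uuid] = int(flag, 16) != 0
    return volumes
```

Because it is strict about the line format, a parser like this rejects a file in which two entries have run together on one line.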
Apple ships a command-line tool, vsdbutil, that allows programs to check and adjust the ownership state of a given volume.
It’s very important for SuperDuper! to ensure that ownership is turned on when it makes a copy. If it’s off, the system does strange things with the file’s owner and group (see “Floating ownership nearly sinks us” for the story of that particular horror). So, we check it, with vsdbutil, during the “preparation” phase before starting a copy.
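For the curious, a check like that could be scripted roughly as follows. This is a sketch, not SuperDuper!’s actual code: the helper names are hypothetical, the -c/-a/-d flags are the ones described in the vsdbutil man page, and the tool’s output is returned unparsed because its exact wording isn’t covered here.

```python
import subprocess

# Flags as described in the vsdbutil man page:
# -c checks the ownership state, -a enables it, -d disables it.
ACTIONS = {"check": "-c", "enable": "-a", "disable": "-d"}


def vsdbutil_command(action, volume_path):
    """Build the argument list for one vsdbutil invocation."""
    return ["vsdbutil", ACTIONS[action], volume_path]


def run_vsdbutil(action, volume_path):
    """Run vsdbutil (Mac only) and return whatever text it prints."""
    result = subprocess.run(
        vsdbutil_command(action, volume_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool reports failure
    )
    return result.stdout.strip()
```

Something like run_vsdbutil("check", "/Volumes/Backup") would query a volume; changing the setting typically requires administrator rights.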
Which brings us to Tiger. When Tiger installs, it goofs up when it manipulates this file. So, instead of what you see above, you get:
290AC3DC4B28C8FC: 00000001
9A081A3451A451C7: 00000001
435F42E92DC994B5: 00000001
C3301F4FEBB1BC41: 00000001F36BADBD7ACFBD6F: 00000001
F36BADBD7ACFBD6F: 00000001
F36BADBD7ACFBD6F: 00000001
F36BADBD7ACFBD6F: 00000001
It’s a really simple bug. And while I’ve reported it, it hasn’t been fixed. We’ve waited, provided our users with a workaround, and waited, and waited… and it’s just not getting fixed.
This is really frustrating for our users, because things don’t work in a mysterious (and ungrammatical) way. And it’s frustrating for us, because it makes us look bad, incompetent and/or lazy. Honestly, we’re not.
So, since it’s not getting fixed by Apple, we decided on an alternate approach. I did the research necessary to ensure the problem was consistent, and to figure out how we could fix it for them. After looking at hundreds of these, and confirming that the problem is as described above, we’ve integrated a fix into v2.0.
Specifically, we examine the structure of this text file. If it’s missing a line break between entries, we put it back, leaving the semantics intact (all volumes with ownership on/off stay the way they are).
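A repair along those lines can be sketched in a few lines of Python. To be clear, this is not the code that ships in SuperDuper!; it assumes the entry format shown earlier, and the name repair_volinfo is made up for the example.

```python
import re

# Assumed entry format: 16 hex digits, ": ", 8 hex digits.
# No line anchors, so entries that have run together are still found one by one.
ENTRY = re.compile(r"([0-9A-F]{16}): ([0-9A-F]{8})")


def repair_volinfo(text):
    """Rewrite the database text with exactly one entry per line.

    UUIDs and flag values are copied through unchanged, so every volume
    keeps whatever ownership setting it already had.
    """
    return "".join(f"{uuid}: {flag}\n" for uuid, flag in ENTRY.findall(text))
```

Running it on an already well-formed file returns the same text, so a fix of this shape is safe to apply unconditionally. It deliberately leaves repeated entries alone, since it only restores line breaks.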
In our testing, this solves the problem 100% of the time… and I, for one, won’t miss directing people to the FAQ entry that, as of this writing, has been viewed 5325 times since April.
I’m sure our users will be much happier, too!
26 Oct 2005 at 07:10 pm | #
Dave,
Is there a public API which will get me the UUID of a given volume?
26 Oct 2005 at 09:55 pm | #
THANK YOU, DAVE!
For the explanation, the solution, the fight.
We’ve been fighting with them over the fact that if you download a ClarisWorks/AppleWorks file from a website using Safari 2.0, the extension is .cwk.txt instead of .cwk. Maddening. Simple if you know to remove the .txt, but otherwise ... Finally, today, they admitted it was their issue.
26 Oct 2005 at 10:26 pm | #
Anonymous—try man hfs.util or man getattrlist. One’ll work from the shell, one from code.
Tim—yeah, we have a similar issue with, of all things, IE/Mac. Most people have stopped using it, but there are still some, and it doesn’t work with our download links on the Shirt Pocket web site…
26 Oct 2005 at 11:32 pm | #
When you post things like this, it’s great if you can include the Radar ID of the bug you’ve reported. That way anyone from Apple who’s browsing can trivially find your bug. (Which brings up the point that I’d like to have that number...not because I’m likely to be able to do anything about it, but I’m curious.)
27 Oct 2005 at 07:26 am | #
Thanks Dave. I couldn’t get hfs.util to work from the shell (the man page says -k, but the actual tool doesn’t provide -k in the help.)
I’ve looked at the xattrs on the root of the drive and didn’t see the UUID there either.
27 Oct 2005 at 07:28 am | #
Eric—the original follow-up # is 9598976. However, that’s in a parallel system to Radar, contact me directly for clarification. I’ve re-entered the bug as rdar://4318548.
27 Oct 2005 at 11:39 am | #
Yeah, that parallel system unfortunately doesn’t get the bug over to the engineering side (or at least didn’t in this case; maybe it does in others). Hopefully now that you have it in Radar it’ll be fixed.
27 Oct 2005 at 11:46 am | #
This did bounce back/forth to engineering a few times, actually, so I think it did make it in some form. I don’t have the associated Radar #, though, as it wasn’t provided to me.
28 Oct 2005 at 11:53 am | #
Honestly, I’m not surprised the bug wasn’t fixed. Think about it. You have potentially millions of lines of code on your system that could be affecting this one file. That’s not “really simple”. Even Dave has admitted that he has no idea what is messing it up (and moreover that it doesn’t happen on all systems). What does that kind of bug report look like to Apple? “Some code somewhere is messing up some file”.
If you can reproduce this bug, the non-lazy and/or non-incompetent thing to do would be to turn on fs_usage or some other form of accounting, and see who the offender is. It might not even be part of OS X, especially seeing that the problem doesn’t happen to everyone.
[john setStatus:@"developer who deals with incomplete bug reports all the time"];
28 Oct 2005 at 03:19 pm | #
Well, John, I’m not sure calling me lazy and/or incompetent is fair (though perhaps it’s “Fair and Balanced”). Nor is your characterization of the nature of the problem, namely that “millions of lines of code” “could be” affecting the one file.
Let’s look at what happens. The problem occurs during install—from all indications, it’s the installer—or the scripts it runs—that are messed up: specifically, the part that imports existing volumes. That limits the actual amount of code involved considerably (and makes it rather difficult to trace with fs_usage or other accounting, since the boot environment used is both limited and read-only).
On top of that, there are only two bits of code authorized to manipulate this database: hfs.util and vsdbutil. Both are relatively small and pretty easy to investigate. As would be any part of the installer that attempts to directly import attached volumes.
And it’s definitely OSX, because it often happens during a clean install of Tiger.
So, the issue is relatively well defined and quite limited: simple, even. And while the report might not be ultra detailed, it’s not really my job to “debug” the bug for the developer, nor point to the specific line of code that’s broken.
As a fellow developer, I don’t expect that in my own reports. Sure, it’s nice when it happens, and saves me time. But most users don’t have the time or ability to “nail down” a bug for me, nor should they: they’re not paying for the privilege to be my QA department. They’re users. Their “job” is to let me know that something went wrong, and give me the details they can. Sometimes, the problem is unique to their system, or the way they used one of my programs, or something they have installed, or whatever. It doesn’t matter. Even if intermittent, I spend the time necessary to figure out what the problem is, and—if I can—to fix it.
Don’t our users deserve that? I certainly think they do.
29 Oct 2005 at 01:01 pm | #
Look, I’m not calling you lazy or incompetent, those words were directed at actions not people, and they aren’t even my words so much as my turning your words away from Apple.
All that about who is authorized to write the file, forget it. Obviously anyone can write the file. You know that, because you do it. If Apple thinks only vsdbutil and hfs.util write the file, I have to think that they would have pored over those programs upon receiving your bug. I don’t think it happens as a part of clean installs because I have done literally hundreds without ever seeing this problem, and it sounds like a much more substantial percentage of your users have this issue. So what are they doing or running that I don’t?
We’re back at “something somewhere messes up some file”, and that’s just not a useful bug report.
What someone who CAN reproduce it needs to do is go into FW disk mode post-clean-install, pre-macbuddy. This gives us the ability to inspect the file directly after installation (telling us if the bug was part of a clean install or not) as well as alter the system before its first-ever boot (allowing us to enable some kind of accounting). No one is being asked to point to a specific line of code, just being asked to provide a bug report that is useful. Of course, just sharing the reproduction steps in the first place would suffice.
I suppose what I’m getting caught up on here is the simultaneous confession that you don’t know how to reproduce it along with the claim that it is a really simple bug (in bold letters, even). It’s one or the other.
[john attemptAction:@"move forward instead of grandstanding on blogs"];
29 Oct 2005 at 01:47 pm | #
I guess I don’t see why you feel the need to turn my words away from Apple, John. I think Apple can handle it.
Anyway, regardless, the main thrust of the post was to explain the problem and show an example of when this kind of thing can cause significant support issues for 3rd party developers.
Furthermore, I now know that Apple knows what the problem is. It’s complicated by the fact that the bug occurs during installation, and—obviously—is on the distribution DVDs. So, even if fixed, the fix itself won’t propagate to the field… so we have to work around it.
As we have… and with that, we’re back to the end of the original post.
03 Nov 2005 at 08:31 am | #
how goeth this for tiger 10.4.3? i just checked my volinfo.database on 10.4.3 and everything looks sweet :thumbs: giving superduper a go now…
03 Nov 2005 at 08:42 am | #
As far as we can tell, there’s no fix for this problem in 10.4.3. (I removed your email from the comment, as requested, and deleted the request.)
04 Jan 2006 at 08:51 pm | #
Yeah, you’re right, there’s probably no real need for me to turn away your ugly words for Apple, and they can probably handle it. So, call me a stickler for correctness and anal-retentive about setting things straight on the internet. After all, you’re not complaining about Tim in post #2 thanking you for “the fight”. Like I said, at my company I get crappy bug reports all the time, and rather than engaging in “the fight” I’d much rather see the reporter provide information that is actually useful. That way, I can fix the bug, making the user happy. They get the fix instead of grandstanding on the internet and making a stink, and I’m happy. Simple.
So, through some stupid (complicated) shell tricks and a little luck I found the source of the bug. The file is corrupted by Migration Assistant. Yeah, that’s Apple’s software, and ships with OS X. Yes, that makes it their bug, but it’s absolutely NOT corrupt on a clean install as you claimed it was. How do you expect them to fix a bug when you send them on a wild goose chase like that? There’s a LOT of code in OS X. Finding a bug like that is hard enough without objectively wrong information in bug reports.
And yes, a bug fix won’t fix the corrupted files already out in the wild, so something needs to be done about that.
[john setStatus:@"usually right in the end"];
14 Feb 2006 at 11:12 am | #
Sorry I didn’t respond to you, John—I didn’t get notification of this particular post.
Yes, we’d determined post-post that the bug was in the migration assistant (which we were using, during a clean install, to bring over an account and SuperDuper to speed testing). It wasn’t something we expected, either, which was obviously our own fault.
That said, the problem slipped through Apple’s own QA, and the code paths involved were quite limited. Only a few things touch volinfo.database, and thus the problem was relatively easy to find. Alas, since it’s the Installer on the CD, the fix we implemented was still needed, and has been working great.
I’m sure it’ll be fixed “for real” in whatever follows Tiger.