AaronCrane.co.uk

Unix filesystem semantics are useful

There’s been a minor furore lately about ZFS and Mac OS and laptops.

First, AppleInsider reported that ZFS would play a larger role in future versions of Mac OS X. MWJ took that article to task, contending (once you ignore the snarky asides) that ZFS is inappropriate for laptops. Drew Thaler, previously a filesystem engineer for Apple, chimed in to extol the good qualities of ZFS.

Then MWJ responded, explaining their position in more depth. Which is all well and good, except that some of the things they said made no sense at all.

“The point of the operating system is to make the computer easier for the customer to use, not for the programmer to maintain. Writing easy code and pushing the learning curve onto the user is how we got command-line systems in the first place.”

Uh, what? When interactive command-line user interfaces first appeared, they were a modern alternative to all the painful punched-card batch-submission systems, and relied on fancy-pants hardware. That is very much how the Mac later kick-started the widespread adoption of a better class of user interface, while requiring cutting-edge hardware to provide its features.

“Thaler contends that case-insensitivity could be enforced in the human interface in the ‘Save’ dialog boxes and in the Finder, ‘which are just about the only two places you actually need it.’ And in the custom Adobe file boxes. And in Path Finder. And File Buddy. And in Terminal, unless you’re going to allow people to create case-sensitive filenames that duplicate case-insensitive ones that they then couldn’t access in any other way.”

Thaler’s dead right on that.

There are already two distinct user interfaces (Gnome and KDE) designed for systems with case-sensitive filenames, where you can both (a) not notice that the filesystem is case-sensitive, and (b) sneakily create pairs of files with letter-case-only distinctions, and yet access both through the normal user interface. This works in just the same way as having one file called, say, ‘Tea’, and another called ‘Теа’ (which you may not notice is in the Cyrillic script), and the user just copes with that sort of distinction, because humans, it turns out, are quite good at disambiguating things based on context.
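
To see why the ‘Tea’ and ‘Теа’ pair is a genuine distinction rather than a rendering quirk, here is a small Python sketch. The variable names and the `describe` helper are invented for illustration; the only library used is the standard `unicodedata` module. It shows that the two names share no code points at all, and that case-folding does nothing to unify them.

```python
import unicodedata

latin = "Tea"                      # Latin T, e, a
cyrillic = "\u0422\u0435\u0430"    # Cyrillic Те, ie, a: renders as "Теа"


def describe(name: str) -> list[str]:
    """Return the Unicode character name of each code point in a filename."""
    return [unicodedata.name(ch) for ch in name]


# The two strings look alike on screen but are entirely different sequences.
print(latin == cyrillic)
print(describe(latin))
print(describe(cyrillic))

# Case-folding only relates letters within a script, so it does not help.
print(latin.casefold() == cyrillic.casefold())
```

So even a filesystem that folded case would still happily store both names side by side, and the user would still have to tell them apart from context.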

Furthermore, the idea of putting case-folding for filenames into the kernel is just broken. Case-folding rules are language-specific. Turkish is the canonically-awkward example; where most Latin-script writing systems have i–I as a case pair, Turkish has i–İ and ı–I, so case-folding text containing either I or i requires knowing the language of that text. This means that the software layer that handles filename case-insensitivity needs to know the current user’s preferred language. Putting that knowledge into the kernel is fundamentally incompatible with the notion of offering file services to other computers nearby.
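
The Turkish problem can be made concrete with a minimal Python sketch. Everything named here (`fold_filename`, `same_file`, the `lang` parameter) is hypothetical and invented for illustration; it is not how any real filesystem or kernel does it, and it handles only the two case pairs described above.

```python
DOTLESS_SMALL_I = "\u0131"   # ı
DOTTED_CAPITAL_I = "\u0130"  # İ


def fold_filename(name: str, lang: str) -> str:
    """Lower-case a filename for comparison, using Turkish pairs if lang is 'tr'."""
    if lang == "tr":
        # Turkish pairs are I <-> ı and İ <-> i, so handle those two capitals
        # before falling back to the default, language-neutral mapping.
        name = name.replace("I", DOTLESS_SMALL_I).replace(DOTTED_CAPITAL_I, "i")
    return name.lower()


def same_file(a: str, b: str, lang: str) -> bool:
    """Would these two names collide on a case-insensitive filesystem?"""
    return fold_filename(a, lang) == fold_filename(b, lang)


# The same pair of names gives a different answer depending on the language.
print(same_file("FILE.TXT", "file.txt", "en"))  # True
print(same_file("FILE.TXT", "file.txt", "tr"))  # False: I folds to ı
print(same_file("F\u0130LE.TXT", "file.txt", "tr"))  # True: İ folds to i
```

The answer to “are these the same file?” depends on an argument that only the user’s session knows. A kernel serving files to several users at once has no single correct value to pass for `lang`.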

The central MWJ thesis (that the cool features of ZFS aren’t well suited to laptops) is extremely cogent, but ranting so incorrectly about Teh Evils Of Teh Unix does them no favours.

Update, 11 Oct: Thaler also responds to some of the MWJ complaints here.