
Using the Linux kernel's Case-insensitive feature in Ext4 - mfilion
https://www.collabora.com/news-and-blog/blog/2020/08/27/using-the-linux-kernel-case-insensitive-feature-in-ext4/
======
kasabali
> The case insensitivity is just a horribly bad idea, and Apple could have
> pushed fixing it. They didn't. Instead, they doubled down on a bad idea, and
> actively extended it - very very badly - to unicode. And it's not even
> UTF-8, it's UCS2 I think.

> There's some excuse for case insensitivity in a legacy model ("We didn't
> know better"). But people who think unicode equivalency comparisons are a
> good idea in a filesystem shouldn't be allowed to play in that space. Give
> them some paste, and let them sit in a corner eating it. They'll be happy,
> and they won't be messing up your system.

Linus Torvalds, 2014,
[https://web.archive.org/web/20150112214037/https://plus.goog...](https://web.archive.org/web/20150112214037/https://plus.google.com/+JunioCHamano/posts/1Bpaj3e3Rru)

~~~
colejohnson66
The problem with case sensitivity is UX. Sure, one can explain to someone that
“A” and “a” are different to a computer, but will they _understand_?

~~~
mkhpalm
I think so. They understand that when writing a sentence you don't start it
with a lowercase character. They know to capitalize names. They recognize that
every time they see it not done, it's considered a mistake.

Why is this the exception?

~~~
saagarjha
I have met very many people who have trouble doing that–and no, it's not
because they thought it was a username, or they were too lazy to use the shift
key.

~~~
mkhpalm
Are you suggesting that these people would argue that it's not wrong if you
pointed at it and asked?

~~~
saagarjha
Yes, I think there is a fair number of people who don’t know that some nouns
should be capitalized.

------
hajile
Linux systems are more than just a kernel. Does every program across the
system now understand that direct comparison of strings doesn't work on
filenames? Does every regex touching or analyzing filenames know that they
must now be case insensitive? How long until people stop getting bitten by
this issue?
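To make the concern concrete, here is a minimal Python sketch of the gap between a direct string comparison and a case-folded one. Both helper names are hypothetical, and `str.casefold()` is only a stand-in: nothing here claims to reproduce the folding rules the ext4 feature applies.

```python
def same_name_exact(a: str, b: str) -> bool:
    """Direct comparison, which is what most existing programs do."""
    return a == b


def same_name_folded(a: str, b: str) -> bool:
    """Comparison after Unicode case folding (an approximation only)."""
    return a.casefold() == b.casefold()


print(same_name_exact("README.md", "readme.md"))   # False
print(same_name_folded("README.md", "readme.md"))  # True
```

A program that only uses the first form will treat two names as different files even when a case-insensitive directory resolves them to the same one.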

That aside, inferring semantic meaning based on a few cultures at this moment
in time is a dumb choice. There are tons of such language-specific semantics
that could be implemented and the result is needing to memorize all those
rules about when it does and doesn't actually matter (and arguing that only
certain cultures and languages should have their own semantics encoded in the
OS is its own problem).

~~~
james412
This was discussed to death on the kernel mailing lists, you should go read
them.

The principal question is whether the tool is more important, or the end user.
Why did I pay for this machine if it weren't intended to facilitate me? That's
the bottom line with most of these kinds of technical "correctness" arguments.

And as for whether userspace should catch up, thanks to OS X that for the most
part already happened a long time ago for a ton of open source packages.

~~~
asveikau
> Why did I pay for this machine if it weren't intended to facilitate me?

I happen to agree with the idea that the filename should be a dumb blob of
bytes and the kernel should not do case folding, as it is the wrong layer for
that. E.g., the user can change their language, but that won't update what has
been written to the disk in thousands or millions of places, so you could
suddenly have a filename collision somewhere based on those rules changing.
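A rough Python sketch of how such collisions could be detected ahead of time. `case_collisions` is a hypothetical helper, the sample names are made up for illustration, and `str.casefold()` is assumed as the folding rule, which may differ from what any given filesystem does.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that become identical after case folding."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one member are actual collisions.
    return [group for group in groups.values() if len(group) > 1]


print(case_collisions(["makefile", "Makefile", "README"]))
# [['makefile', 'Makefile']]
```

If the folding rule changes, the grouping key changes with it, which is exactly why names already on disk can start colliding.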

But, I do hope you get that refund for your Linux.

~~~
kochthesecond
> dumb blob of bytes

Well, now your filename is invalid UTF-8. How should programs display it or
even address such a file?

~~~
jcelerier
> How should programs display it

what's wrong with foo����.txt

> or even address such a file?

... by using the array of bytes?
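For what it's worth, a minimal Python sketch of both halves: lossy decoding for display, exact bytes for addressing. The filename bytes are a made-up example (0xff and 0xfe never occur in valid UTF-8).

```python
raw = b"foo\xff\xfe.txt"  # hypothetical filename containing invalid UTF-8

# For display: substitute U+FFFD for each undecodable byte.
shown = raw.decode("utf-8", errors="replace")

# For addressing: keep the exact bytes. The "surrogateescape" error
# handler maps them into a str and back without loss.
as_str = raw.decode("utf-8", errors="surrogateescape")
back = as_str.encode("utf-8", errors="surrogateescape")

print(ascii(shown))  # 'foo\ufffd\ufffd.txt'
print(back == raw)   # True
```

Python's `os` functions also accept `bytes` paths directly, so a program never has to decode the name in order to open the file.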

~~~
ygra
It's ambiguous, for example.

~~~
jcelerier
so are a file named Hello.txt and another one named Нello.txt

------
qalmakka
Does case folding in the kernel pass the Turkey test? I.e., are different
locales taken into account in order for the correct string to be matched? As
far as I remember this was a big deal for supporting Unicode on
case-insensitive filesystems, because it then means that a file stops existing
depending on the current locale.

For instance, take a file named "ivory.txt": `stat("IVORY.TXT")` on a
case-insensitive filesystem would succeed if the locale is en_US but fail on
tr_TR due to the uppercase version of 'i' being 'İ' there instead.
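A small Python sketch of the mismatch described here. It says nothing about what the kernel actually does; `upper_turkish` is a hypothetical helper that hard-codes the Turkish dotted/dotless i mapping, while Python's own `str.upper()` ignores the locale.

```python
def upper_default(name: str) -> str:
    """Python's built-in uppercasing, which does not consult the locale."""
    return name.upper()


def upper_turkish(name: str) -> str:
    """Uppercase with Turkish rules: i -> U+0130, U+0131 -> I."""
    return name.replace("i", "\u0130").replace("\u0131", "I").upper()


print(upper_default("ivory.txt") == "IVORY.TXT")  # True
print(upper_turkish("ivory.txt") == "IVORY.TXT")  # False
```

Under the Turkish mapping the name becomes U+0130 followed by "VORY.TXT", so a lookup for "IVORY.TXT" would only match if the comparison is locale-independent.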

------
mixmastamyk
As someone who grew up on Commodore, DOS/Win, and later Unix, I just never had
a problem with case sensitivity or lack of. _shrug_

99% of the time I use lower-case only filenames, as they are easier to read.
The times I don't, shell completion and/or GUI selection obviate the need to
care anyway. Given the significant complexity of Unicode I'm not sure
insensitive is the way of the future.

There was a window of time where insensitive made the most sense, the time of
DOS. 8-bit per character filenames, with a very primitive CLI shell, i.e. no
assistance. Now? In the days when a majority of users don't even see
filenames? Meh.

------
muststopmyths
Until I read the comments to TFA I had no idea how passionately people get
their knickers in a twist over case-insensitivity of filenames.

Very interesting feature, of course.

~~~
bxparks
I definitely prefer case-sensitive. I didn't realize that MacOS had switched to
case-INsensitive at some point. So the following drove me crazy for several
minutes:

    $ ls -l /usr/local/bin/virtualbox
    -rwxr-xr-x 1 root wheel 77 Oct 2 2015 /usr/local/bin/virtualbox*
    $ ls /usr/local/bin | grep virtualbox
    <nothing, WTF??>
    $ ls /usr/local/bin | grep -i virtualbox
    VirtualBox*

My coworker uses MacOS, I use Linux. Several times, changing the case of a
directory or file caused MacOS to mess up the local git repo so badly that
we had to blow it away and refetch from remote.

At least the Mac Finder allows changing the case of a filename. Windows File
Explorer simply refuses. Instead, I have to change the case of the file name,
add a random character, and save the file name; then edit the file name again,
remove the random character, and save it once more. It's needlessly annoying.

I think case-sensitive makes more sense for software developers, since there
are many instances where things like "string.h" are _not_ the same as
"String.h". But for normal people, case-INsensitive may be more useful.
However, even for normal users, I think there are situations where "bob.html"
(a noun) is different from "Bob.html" (a person).

~~~
muststopmyths
> At least the Mac Finder allows changing the case of a filename. Windows File
> Explorer simply refuses

Just tested on Windows 10 and this is definitely not true. I could have sworn
you could always do it, but my memory of Windows < 10 is foggy. And you can
also do the renaming from any console app (cmd, powershell, etc), FYI.

NTFS is case-preserving, but case-insensitive. That, IMO, is a good balance.
When they were first making NT with the POSIX subsystem, I suppose they found
that the number of cases where two files that differed only in case lived in
the same directory was small enough to not matter.

I personally don't think case-sensitivity in file names adds anything useful.

~~~
bxparks
I swear that my Windows 10 File Explorer had this problem.

Here is a user complaint about this problem from Nov 2019:
[https://answers.microsoft.com/en-us/windows/forum/all/cant-r...](https://answers.microsoft.com/en-us/windows/forum/all/cant-rename-case-in-file-explorer/cc4f69e9-cc3b-44c1-b81c-773470d36b18)

Here's another post from May 2018, explicitly mentioning Windows 10:
[https://answers.microsoft.com/en-us/windows/forum/all/why-ca...](https://answers.microsoft.com/en-us/windows/forum/all/why-cannot-change-filename-case/2f5f4e6d-9721-44f4-96ec-882fe66d8aa0)

But I checked again on my own Windows 10 machine (Windows 10 Pro, Version
2004, build 19041.450) and holy shit, it now works. It must have been fixed in
a recent Windows 10 update.

------
laurentoget
I have always felt putting case-insensitivity at the filesystem level was a
bad idea which would have died in the 80s if it were not for some Apple product
manager being stubborn.

But making it so that it requires a kernel compile-time option to make your
filesystem mountable is a level of absurdity that I would not have believed
possible before 2020 made unbelievable the new normal.

------
akdor1154
What are the benefits to doing this? From the article I got two, "wine can get
the kernel to do case insensitive path stuff instead of emulating it itself",
and "users don't need special userspace magic to treat their filenames as
strings, not bytes".

I have never seen anyone at all mope over lack of the latter, and the former
seems quite a specific reason to land such a big feature. What other use cases
are there that benefit from this being in the FS?

~~~
myself248
I mope at the lack of the latter, and it kept me off Linux for twenty-some
years. Only now that I do most of my work in a GUI and I just click on
filenames rather than typing them, does a case-sensitive filesystem not grind
my gears. Every time I have to cd Downloads instead of cd downloads, I wonder
who thought that was a good idea.

Case-sensitivity is a classic case of users being forced to comply with the
computer's needs, rather than the other way around. I contend that that is
Wrong, period.

To put it another way: Unicode is the opposite. It is computers adopting
complexity to serve the needs of humans. If we can do Unicode, we can do
case-insensitive filename matching. If we're going to insist on
case-sensitivity and ignore the needs of human language, we should just go
back to plain old ASCII and force the humans to comply with that too.

~~~
mixmastamyk
Try setting the insensitive option in your shell.

With hidden filenames on mobile, desktop GUIs, terminal CLI completion, and
shells with insensitive matching like fish (bash via option), the benefits of
insensitive fs are not as high as they used to be.

Meanwhile, the complexity of matching modern Unicode causes performance
degradation and exposes many edge cases.

In short, the window of time where insensitive made sense has largely closed.

~~~
lmm
> With hidden filenames on mobile, desktop GUIs, terminal CLI completion, and
> shells with insensitive matching like fish (bash via option), the benefits
> of insensitive fs are not as high as they used to be.

We've got better at working around case sensitivity, sure. I don't see that as
a good reason not to fix the problem properly.

~~~
mixmastamyk
> Meanwhile, the complexity of matching modern Unicode causes performance
> degradation and exposes many edge cases.

As mentioned, there is no way to "fix the problem properly" and a close-enough
solution, rarely needed these days, slows performance.

------
tzs
Almost 20 years ago, the place I worked decided it wanted to make something
like Wine, except going the other way--it would run Linux binaries on Windows.

We got it working fairly well. We expected that we'd have to put in some kind
of hack to deal with filename case, and I think we eventually did.

But before that, when it was still just passing Linux case-sensitive filenames
through to the case-insensitive Windows filesystem, I tried installing most of
whatever was the current release of Red Hat at the time.

It almost all worked fine. The only thing I remember being a problem was that
some things, such as some Perl modules from CPAN, had both "makefile" and
"Makefile" in the same directory.

~~~
colejohnson66
Curious: Cygwin’s been available since 1995. Why’d your company go its own
way?

~~~
tzs
They wanted to be able to run binaries that were built for Linux, including
some commercial binaries for which we did not have source. Things have to be
recompiled for Cygwin.

------
jasoneckert
Case sensitivity is actually a powerful feature of Unix systems as it allows
for multiple valid string variants of a single word.

For example, common Unix convention in the 1980s-1990s was to name user-
created directories with a capital to make them easier to see in a regular
directory listing without color (e.g. Poems is a directory, while poems is
just a file).

I've used it heavily with content revision (e.g. CATHENA02.yaml is the second
cathena configuration file in testing while cathena02.yaml is the production
version of it).

Plus, making a filename case-insensitive for processing purposes in a
scripting language is very easy.
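As a rough illustration of that point, a Python sketch along these lines would do it. Both function names are hypothetical, and `str.casefold()` is just one reasonable choice of folding rule.

```python
import os


def match_ignoring_case(names, wanted):
    """Return every entry in names equal to wanted after case folding."""
    key = wanted.casefold()
    return [n for n in names if n.casefold() == key]


def find_in_dir(directory, wanted):
    """Apply the same matching to a real directory listing."""
    return match_ignoring_case(os.listdir(directory), wanted)


print(match_ignoring_case(["Poems", "poems", "notes.txt"], "POEMS"))
# ['Poems', 'poems']
```

Note that the script gets back both entries and can decide what to do about the ambiguity, which a case-insensitive directory would not allow in the first place.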

Consequently, I can't imagine a reason why I'd use case-insensitivity in ext4.

~~~
saagarjha
This sounds like it could get quite confusing.

