> In German, it's the letter "A" with an umlaut mark, which is not a distinct letter of the alphabet, but it is a distinct vowel.
The umlauts are distinct letters of the German alphabet.
The English Wikipedia says so but still lists them as special characters, which is confusing at best. The German Wikipedia just lists the umlauts as ordinary characters.
>German uses three letter-diacritic combinations (Ä/ä, Ö/ö, Ü/ü) using the umlaut and one ligature (ß (called Eszett (sz) or scharfes S, sharp s)) which are officially considered distinct letters of the alphabet.[1]
I believe there are just two cases:
- A umlaut: German, Swedish, etc.
- A with diaeresis: French, Catalan, etc.
It is important to distinguish the two cases because they differ at least in collation and typography.
In a German dictionary, A with diaeresis should always be treated like an A, while A umlaut should be treated according to the different collation standards that exist in German.
In a German font the dots are closer to the letter than in a French font.
Confusingly, there are two different collation standards when treating German umlauts: One for lists of names and one for everything else.
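To make the two orderings concrete, here is a rough Python sketch. It assumes the "everything else" ordering files an umlaut with its base vowel and the name-list ordering expands it to vowel plus "e". The function names and the sample names are made up for illustration, and real collation rules cover far more than this.

```python
import unicodedata

# Simplified mappings (assumption): dictionary order folds the umlaut into
# its base vowel, name-list order expands it to vowel + "e".
DICTIONARY_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
NAME_LIST_FOLD = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def _key(word, table):
    # Normalize so that "u" + combining diaeresis becomes the single letter "ü".
    folded = unicodedata.normalize("NFC", word).lower()
    # Primary key: the folded spelling. Secondary key: the spelling itself,
    # so words that fold to the same string still sort deterministically.
    return (folded.translate(table), folded)


def dictionary_key(word):
    return _key(word, DICTIONARY_FOLD)


def name_list_key(word):
    return _key(word, NAME_LIST_FOLD)


names = ["Mutter", "Müller", "Muller", "Mueller"]
print(sorted(names, key=dictionary_key))
print(sorted(names, key=name_list_key))
```

With the dictionary key "Müller" lands next to "Muller", while with the name-list key it lands next to "Mueller".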
When reciting the alphabet German school kids don't add the umlauts (at least I didn't – I don't think that has changed in the last decades) and if you ask someone how many letters the alphabet has they will answer "26" without hesitation while in Sweden they are treated as distinct characters of the alphabet in every sense.
> Confusingly, there are two different collation standards when treating German umlauts: One for lists of names and one for everything else.
Yes, this alone is confusing. My point is that it is even more complicated because, strictly speaking, the standard applies only to umlauts and not to letters with diaeresis.
Unicode offers a way to treat these two cases differently using the combining grapheme joiner (CGJ).
>> The CGJ can also be used in German, for example, to distinguish in sorting between “ü” in the meaning of u-umlaut, which is the more common case and often sorted like <u,e>, and “ü” in the meaning u-diaeresis, which is comparatively rare and sorted like “u” with a secondary key weight. [1, page 850]
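A small Python sketch of how that could look in practice. It assumes the rare diaeresis case is the one that gets marked, written as base letter + CGJ (U+034F) + combining diaeresis (U+0308), and that an unmarked "ü" means umlaut. `german_key` and the sample strings are hypothetical, and this is not how a real collation library is tailored.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS
UMLAUT_EXPANSION = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def german_key(word):
    text = unicodedata.normalize("NFC", word).lower()
    # A diaeresis marked with CGJ is ignored at the primary level, so the
    # letter sorts like its base letter.
    primary = text.replace(CGJ + DIAERESIS, "")
    # Any remaining precomposed "ü" is taken to be an umlaut: sort like <u,e>.
    primary = primary.translate(UMLAUT_EXPANSION)
    # The spelling itself serves as a secondary key.
    return (primary, text)


umlaut = "M\u00fcller"             # ü as a single code point
diaeresis = "Mu\u034f\u0308ller"   # u + CGJ + combining diaeresis

# CGJ blocks canonical composition, so the marker survives NFC.
print(len(unicodedata.normalize("NFC", diaeresis)))
print(german_key(umlaut)[0], german_key(diaeresis)[0])
```

Both strings render the same, but the keys differ: the umlaut sorts as "mueller", the marked diaeresis as "muller".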
> When reciting the alphabet German school kids don't add the umlauts (at least I didn't – I don't think that has changed in the last decades) and if you ask someone how many letters the alphabet has they will answer "26" without hesitation while in Sweden they are treated as distinct characters of the alphabet in every sense.
You have a point here and maybe the English Wikipedia isn't so wrong in listing them as special characters. Being special doesn't make A umlaut a funky A though. It still is a letter in its own right.
For what it's worth, Dutch has "ij", which could be a letter, or a diphthong, or two unrelated letters (in words borrowed from French, for example), and could also be written as "ij", or as "ÿ", or even "y".
There is no dedicated key for it on a computer keyboard, but you are supposed to remember it's a unit when capitalizing, for example IJmuiden.
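A naive Python sketch of the capitalization rule. `capitalize_dutch` is a made-up helper, and it assumes every word-initial "ij" is the unit, so it cannot tell apart the borrowed words where "i" and "j" are unrelated letters.

```python
def capitalize_dutch(word):
    # Treat a leading "ij" as one unit and uppercase both letters.
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    # Otherwise uppercase only the first letter and keep the rest as typed.
    return word[:1].upper() + word[1:]


print("ijmuiden".capitalize())       # the generic rule uppercases only the "i"
print(capitalize_dutch("ijmuiden"))
print(capitalize_dutch("amsterdam"))
```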
> When reciting the alphabet German school kids don't add the umlauts (at least I didn't – I don't think that has changed in the last decades) and if you ask someone how many letters the alphabet has they will answer "26" without hesitation while in Sweden they are treated as distinct characters of the alphabet in every sense.
Meanwhile, the Swedish alphabet has 29 letters. Since 2006. Before 2006, "W" wasn't considered a letter of the alphabet, since it was just a double "V", and only existed in German and English loan-words and names.
So before that, this would have been the correct sort order of some names: "Viktor, William, Vincent" (with W filed as if it were V), but these days they would probably be sorted like "Viktor, Vincent, William".
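Here is a rough Python sketch of the difference, assuming the old rule simply filed "W" as if it were "V" and the current rule follows the 29-letter alphabet order. The key functions and the name lists are only illustrative.

```python
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def _ranks(text):
    # Characters outside the alphabet sort after every letter.
    return [RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in text]


def modern_key(name):
    # Current rule: "w" is its own letter, and å, ä, ö come after z.
    return _ranks(name.lower())


def pre_2006_key(name):
    # Assumed old rule: "w" is filed as "v"; the real spelling breaks ties.
    folded = name.lower()
    return (_ranks(folded.replace("w", "v")), _ranks(folded))


names = ["Vincent", "William", "Viktor"]
print(sorted(names, key=pre_2006_key))
print(sorted(names, key=modern_key))
print(sorted(["Östen", "Åke", "Ärla", "Zara"], key=modern_key))
```

Under the old key "William" is compared as if spelled "Villiam", which is why it ends up between two names starting with "Vi".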
Because ÅÄÖ are distinct letters of the alphabet in Swedish, they are not treated differently from any other vowel. They're not variants of base letters or anything like that.
I was wrong about ÄÖÜ not being part of the German alphabet. They are, but they're not letters (Buchstaben), they're umlauts. So the Swedish alphabet contains 29 letters, but the German alphabet contains 26 letters, 3 umlauts, and the Eszett.
In the end, it's all arbitrary as fuck, with tons of historical reasons for the way things are, and god help the localization engineer who thinks it's all easy peasy. :-)
[1] https://en.wikipedia.org/wiki/German_orthography#Alphabet