I have never understood the idea of protecting a language against foreign words....

jacquesm · on July 3, 2017

I can see a lot of trouble coming from this. Sort order, for one. If a language this old didn't need the glyph until 2017, it can do without it, especially if the use case is limited to all-caps words. But I guess in that case you could convert each occurrence to SS first and then do the sort.
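The "convert to SS first, then sort" idea can be sketched in a few lines of Python. This is only an illustration: `caps_sort_key` and the word list are made up for the example, and it assumes all-caps input where the only special case is the capital sharp s (U+1E9E).

```python
CAPITAL_SHARP_S = "\u1E9E"  # the capital sharp s, "ẞ"


def caps_sort_key(word: str) -> str:
    """Hypothetical helper: expand the capital sharp s to SS for comparison only."""
    return word.replace(CAPITAL_SHARP_S, "SS")


words = ["MASZ", "MAẞE", "MASSE", "MAST"]

# Plain code point order pushes the capital sharp s behind every ASCII letter.
by_code_point = sorted(words)

# With the key, "MAẞE" compares like "MASSE"; the stored strings are unchanged.
by_ss_key = sorted(words, key=caps_sort_key)

print(by_code_point)
print(by_ss_key)
```

The key only affects comparison, so the original spelling survives in the sorted output.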

germanier · on July 3, 2017

Actually I was a bit surprised when I learned that Unicode only added a codepoint for it in 2008 (and even rejected an earlier proposal). While the spelling rules have said for some time to replace it with "SS", the glyph was used quite often, including in official documents such as passports.

For sorting there is a standard (DIN 5007). Replacing ß with ss is correct (even in the lower case variant). The other letters are more fun: ä is replaced with a for sorting, except if you sort a list of names. In that case you need to replace it with ae. Probably not something international software is aware of. The Austrian sorting of names is different, and other languages that have the same glyph (e.g. Swedish) also have other rules (e.g. placing ä at the very end of the alphabet).
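A rough Python sketch of the two replacement schemes described above. It assumes the same treatment applies to ö and ü as to ä, and it leaves out the tie-breaking details of the actual standard; `din_key` and the sample names are hypothetical, so treat this as an approximation rather than a DIN 5007 implementation.

```python
# Word (dictionary) variant: umlauts sort like their base letters.
DIN_WORDS = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# Name-list variant: umlauts sort like base letter + e.
DIN_NAMES = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def din_key(text: str, names: bool = False) -> str:
    """Hypothetical sort key approximating the two variants (no tie-breaking)."""
    table = DIN_NAMES if names else DIN_WORDS
    # lower() maps the capital sharp s to ß, which the table then expands to ss.
    return text.lower().translate(table)


entries = ["Mutter", "Müller", "Muffin", "Mueller"]

as_words = sorted(entries, key=din_key)
as_names = sorted(entries, key=lambda s: din_key(s, names=True))

print(as_words)
print(as_names)
```

In the word variant "Müller" sorts like "Muller" and lands after "Muffin"; in the name variant it sorts like "Mueller" and moves to the front.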

Freak_NL · on July 4, 2017

> Probably not something international software is aware of.

Collation rules that vary by locale exist for this reason, and all major programming languages and OSes support them. Of course, whether the software you use does this or not depends on the developers who wrote it.
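As one example of that support, Python's standard `locale` module exposes the operating system's collation through `locale.strxfrm`. The sketch below assumes a locale named `de_DE.UTF-8` is installed, which is not guaranteed; `sort_with_locale` is a hypothetical helper, and the resulting order depends entirely on the platform's locale data.

```python
import locale


def sort_with_locale(words, locale_name):
    """Hypothetical helper: sort with the OS collation rules for locale_name.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale not installed; nothing was changed
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting


words = ["Zebra", "Ärger", "Apfel", "Straße", "Strasse"]
print(sort_with_locale(words, "de_DE.UTF-8"))  # assumed locale name
```

Changing the locale is process-wide, which is why the sketch restores the previous setting. Libraries built on ICU avoid that global state, but they are not part of the standard library.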

germanier · on July 4, 2017

Thanks for the term, I wasn't aware of it. This looks to be a standard problem in DBMSs at least. While not all of the rules I mentioned seem to be shipped by default, it looks fairly straightforward to add them. I just remember our development team moaning quite a bit after I added the requirement to sort names differently.

matt4077 · on July 4, 2017

Funny you mention Swedish. Let's all bond over the shared experience of "why tf is this using latin1_swedish_ci collation?"

ygra · on July 4, 2017

Actually, there was a time when many typefaces included a capital sharp s and the letter was in somewhat common use, cf. https://de.m.wikipedia.org/wiki/Gro%C3%9Fes_%C3%9F, especially the second section. Limited availability of a glyph and no way of representing the letter in any digital text encoding probably meant it fell out of use and was more or less forgotten.