
I have never understood the idea of protecting a language against foreign words. I think languages naturally strive to be as expressive as possible and will adopt words accordingly.

However, I think it is important to note that the ẞ is not a letter they just made up. Heck, even the picture in the article shows an old book cover that uses this never-standardized letter. To me this is just correcting a previous oversight and formally allowing (not forcing) people to use the 'capital sharp s', for example in print, where they want to typeset a title in capitals.



I can see a lot of trouble coming from this. Sort order, for one. If a language this old didn't need the glyph until 2017, it can do without it, especially if the use case is limited to all-caps words. But I guess in that case you could convert each occurrence to SS first and then do the sort.
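A minimal sketch of that "convert to SS, then sort" idea, in Python. The function name `sort_key` and the word list are made up for illustration; this is plain code-point sorting with one substitution, not a full German collation.

```python
CAPITAL_SHARP_S = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S


def sort_key(word: str) -> str:
    """Hypothetical helper: treat ẞ as SS before comparing."""
    return word.replace(CAPITAL_SHARP_S, "SS")


# Illustrative all-caps words, not taken from any real data set.
words = ["STRAUSS", "STRAẞE", "STRAND", "STRASSE"]

# Plain code-point order pushes ẞ (U+1E9E) behind every ASCII letter.
naive = sorted(words)

# With the substitution, STRAẞE sorts alongside STRASSE.
# Equal keys keep their input order because Python's sort is stable.
converted = sorted(words, key=sort_key)

print(naive)
print(converted)
```

The original strings are left untouched; only the comparison key changes, so the list still displays ẞ wherever it was written.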


Actually I was a bit surprised when I learned that Unicode only added a codepoint for it in 2008 (and even rejected an earlier proposal). While the spelling rules have said for some time to replace it with "SS", the glyph was used quite often, including in official documents such as passports.

For sorting there is a standard (DIN 5007). Replacing ß with ss is correct (even in the lower-case variant). The other letters are more fun: ä is replaced with a for sorting, except if you sort a list of names. In that case you need to replace it with ae. Probably not something international software is aware of. The Austrian sorting of names is different, and other languages that have the same glyph (e.g. Swedish) also have other rules (e.g. placing ä at the very end of the alphabet).
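A rough sketch of the two replacement schemes in Python. The function names and sample names are invented for illustration. Applying the same pattern to ö and ü is an assumption that goes beyond the ä and ß cases described above, and the code ignores the tie-breaking details of the actual standard.

```python
# Variant for ordinary words: umlauts sort like their base vowels.
# (Extending the "ä -> a" rule to ö and ü is an assumption.)
WORD_TABLE = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})

# Variant for lists of names: umlauts sort like vowel + e.
NAME_TABLE = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def word_key(s: str) -> str:
    """Hypothetical sort key for the word variant."""
    return s.lower().translate(WORD_TABLE)


def name_key(s: str) -> str:
    """Hypothetical sort key for the name variant."""
    return s.lower().translate(NAME_TABLE)


# Illustrative names only.
names = ["Goldmann", "Götz", "Göbel", "Göthe"]

as_words = sorted(names, key=word_key)
as_names = sorted(names, key=name_key)

print(as_words)
print(as_names)
```

The same four names come out in a different order under each key, which is exactly the kind of requirement that makes a development team moan.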


> Probably not something international software is aware of.

Collation rules that vary by locale exist for this reason, and all major programming languages and operating systems support them. Of course, whether the software you use applies them depends on the developers who wrote it.


Thanks for the term, I wasn't aware of it. This looks to be a standard problem in DBMSs at least. While not all of the rules I mentioned seem to be shipped by default, it looks fairly straightforward to add them. I just remember our development team moaning quite a bit after I added the requirement to sort names differently.


Funny you mention Swedish. Let's all bond over the shared experience of "why tf is this using latin1_swedish_ci collation?"


Actually there was a time when many typefaces included a capital sharp s and the letter was in somewhat common use, cf. https://de.m.wikipedia.org/wiki/Gro%C3%9Fes_%C3%9F, especially the second section. Limited availability of a glyph and no way of representing the letter in any digital text encoding probably meant it fell out of use and was more or less forgotten.



