I keep alternating between those positions

And I use FreePascal. Its stdlib is vastly larger than the C++ stdlib, but also completely untested/unusable, because no one is using it.

I wrote my own unicode handling functions for everything.

Yesterday I found a new test case and noticed that my convert UTF-8 to lowercase function did not handle the Turkish İ correctly (it should turn into two codepoints rather than one, for reasons; I did have a check for that symbol, but it was in the wrong branch). My function also had quadratic runtime, so it was nearly unusable anyway.
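To illustrate the two-codepoint point, here is a minimal sketch in Python (not the FreePascal code discussed in this comment). It assumes Python's built-in `str.lower()`, which applies the locale-independent Unicode full case mapping; the helper `describe` is made up for this example.

```python
import unicodedata


def describe(s: str) -> list[str]:
    """Return each code point of s as a 'U+XXXX NAME' string."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s]


capital_dotted_i = "\u0130"          # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = capital_dotted_i.lower()   # default (non-Turkish) full case mapping

print(describe(capital_dotted_i))
print(describe(lowered))

# The lowercased form is longer in code points and in UTF-8 bytes, so a
# lowercase routine cannot assume the output fits in the input's buffer.
print(len(capital_dotted_i), len(lowered))
print(len(capital_dotted_i.encode("utf-8")), len(lowered.encode("utf-8")))
```

Under a Turkish or Azerbaijani locale the expected result is a plain `i` instead, which `str.lower()` does not model; that tailoring would need a locale-aware library.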

So I fixed my function. Then I thought, why am I even writing a convert to lowercase function? FreePascal already has a convert to lowercase function. There is a lesson here: do not write your own functions, you will miss cases.

So I loaded the stdlib Unicode functions to compare my function to their function. And, segmentation fault. Not in the convert to lowercase function, but just loading that Unicode part of the stdlib broke something.

While writing this post, though, I thought I should test it again: just the convert to lowercase function, without the crashing part. It also fails to handle the İ. And worse, it returns a string that is one byte too long, like a garbage null terminator. There is a lesson here: do not use the stdlib, it is just broken.

Then I compared it to a convert to lowercase function from another library I had included. It is twice as fast as even my new fixed function. But it also does not handle the İ symbol. There is a lesson here: do not use other libraries, they do not do what you need them to do.
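The comparison described above amounts to differential testing: run the same inputs through several implementations and report where they disagree. A rough sketch in Python follows. `lower_simple` is a hypothetical stand-in for a fast routine that maps one codepoint to one codepoint; it is not any of the FreePascal functions mentioned here, and the test cases are invented for illustration.

```python
from typing import Callable


def lower_simple(s: str) -> str:
    """Hypothetical fast routine: one code point in, one code point out."""
    out = []
    for c in s:
        low = c.lower()
        # Keep the original character when the mapping would expand.
        out.append(low if len(low) == 1 else c)
    return "".join(out)  # single join keeps the loop linear in len(s)


def find_mismatches(
    reference: Callable[[str], str],
    candidates: dict[str, Callable[[str], str]],
    cases: list[str],
) -> list[tuple[str, str, str, str]]:
    """Return (name, case, got, expected) for every disagreement."""
    mismatches = []
    for case in cases:
        expected = reference(case)
        for name, fn in candidates.items():
            got = fn(case)
            if got != expected:
                mismatches.append((name, case, got, expected))
    return mismatches


CASES = ["HELLO", "Straße", "ΑΒΓ", "\u0130stanbul"]
mismatches = find_mismatches(str.lower, {"simple": lower_simple}, CASES)
for name, case, got, expected in mismatches:
    print(f"{name}: {case!r} -> {got!r}, expected {expected!r}")
```

Which implementation counts as the reference is itself an assumption; here it is Python's `str.lower()`, and a harness like this only shows disagreement, not which side is right.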



Sorry for the İ/i I/ı bug. Even modern software in other languages falls into that trap from time to time, so it's probably not handled well in other languages either.



