
This has the advantage that your database operations run a heck of a lot faster, but has the potential disadvantage that primary key uniqueness may not be maintained, if you think that alternate ways of writing the same characters in Unicode matter for that.



As you say, I'd question the technical motivation for enforcing uniqueness on Unicode data in the first place (and as a primary key on top of that?).

However, if someone really wanted to accomplish this, they could probably use PostgreSQL's functional indexes and Unicode normalization to do it.
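To make the idea concrete, here is a minimal sketch of a unique index over a normalized expression. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, purely so the example is self-contained. The table `users`, the index `users_name_nfc`, and the SQL function `nfc` are all hypothetical names chosen for illustration.

```python
import sqlite3
import unicodedata


def nfc(text):
    # Canonical composition: "e" + combining acute becomes a single "é".
    return unicodedata.normalize("NFC", text)


def make_db():
    conn = sqlite3.connect(":memory:")
    # Functions used in an index expression must be registered as deterministic.
    conn.create_function("nfc", 1, nfc, deterministic=True)
    conn.execute("CREATE TABLE users (name TEXT NOT NULL)")
    # Uniqueness is enforced on the normalized form, not on the raw bytes.
    conn.execute("CREATE UNIQUE INDEX users_name_nfc ON users (nfc(name))")
    return conn


def try_insert(conn, name):
    # Returns True if the row was accepted, False if the unique index rejected it.
    try:
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
composed = "caf\u00e9"     # precomposed é
decomposed = "cafe\u0301"  # e followed by a combining acute accent
first = try_insert(conn, composed)
second = try_insert(conn, decomposed)
```

The two spellings are different strings, yet the second insert is rejected because both normalize to the same NFC form. This only covers canonical equivalence; it says nothing about locale-specific notions of "the same string".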


Normalization is a separate issue: you can normalize and then use the C collation order.
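A rough sketch of what "normalize, then use the C collation" amounts to, assuming NFC as the normalization form and UTF-8 as the storage encoding (both are assumptions for illustration; the helper name `c_collation_key` is made up):

```python
import unicodedata


def c_collation_key(text):
    # Normalize first, then compare raw UTF-8 bytes, which is how a
    # bytewise ("C") collation orders strings.
    return unicodedata.normalize("NFC", text).encode("utf-8")


# Equivalent spellings collapse to the same key after normalization.
same_key = c_collation_key("cafe\u0301") == c_collation_key("caf\u00e9")

# Byte order is fast and stable, but it is not dictionary order:
# uppercase sorts before lowercase, and accented letters sort after "z".
words = ["zebra", "\u00e9cole", "apple", "Zoo"]
sorted_words = sorted(words, key=c_collation_key)
```

Here `sorted_words` comes out as `["Zoo", "apple", "zebra", "école"]`, an order that does not depend on any locale data but also does not match what a French or English reader would expect.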


Sure, but either you are talking about a fixed normalization algorithm that is not locale-aware, in which case it doesn't solve the locale-specific unique key issue, or it is locale-aware and hence suffers from the same problem of time-varying behavior.


You are using string/text values as a PK and trying to sort on them? I'd say this is another reason not to do that.


Well, I am not doing any of the things in the comment chain leading up to this, but the mention of primary key was probably a red herring. The GP's [1] point was that some index constraint (they mentioned PK uniqueness, but it could really be any constraint) might not be correctly maintained if the DB was not aware of the correct collation order. So from the point of view of the renderer, which is locale-aware and uses locale-based collation, the DB is violating the constraints.

---

[1] The GP relative to my reply



