
Interestingly, the JavaScript scratchpad in Firefox requires two deletes to fully delete that character. This is consistent with the fact that its length is 2. The first delete actually changes it to some other character.

From what I understand, this is not a case of a Unicode character with a second character applied as an accent, although it may be behaving the same way.

In any case, it's interesting to see the delete key change a character to something else rather than fully delete it.




JavaScript's access to individual parts of a string works on code units, which are distinct from code points, which in turn are distinct from characters. A single (visible) character can be composed of multiple code points, such as a character plus an accent. However, a single code point can also be represented by multiple code units, where a code unit is a fixed number of bits. UTF-16 is a variable-width encoding, so a given code point can be either one code unit or two; in this case, the cat face is a single character, composed of a single code point, composed of two code units.
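To make the three levels concrete, here is a small sketch. I'm assuming the cat face in question is U+1F431 (CAT FACE); the thread doesn't say which one it was, but any code point above U+FFFF behaves the same way. The variable names are just mine, and `Intl.Segmenter` is only an approximation of "visible characters" that needs a reasonably recent engine.

```javascript
// Assumption: the cat face is U+1F431; any code point above U+FFFF works the same.
const cat = "\u{1F431}";

// .length counts UTF-16 code units, not code points or characters.
const catCodeUnits = cat.length;        // 2
// The string iterator (used by spread) walks code points instead.
const catCodePoints = [...cat].length;  // 1

// The two code units are a surrogate pair.
const catUnitsHex = Array.from({ length: cat.length }, (_, i) =>
  cat.charCodeAt(i).toString(16)
);                                      // ["d83d", "dc31"]

// The opposite case: one visible character made of two code points
// ("e" followed by U+0301 COMBINING ACUTE ACCENT).
const accented = "e\u0301";
const accentedCodeUnits = accented.length;        // 2
const accentedCodePoints = [...accented].length;  // 2

// Counting user-perceived characters (grapheme clusters) needs a segmenter.
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
const countGraphemes = (s) => [...segmenter.segment(s)].length;

console.log(catCodeUnits, catCodePoints, countGraphemes(cat));
console.log(accentedCodeUnits, accentedCodePoints, countGraphemes(accented));
```

So the cat face is 2 code units, 1 code point, 1 character, while the accented "e" is 2 code units, 2 code points, 1 character.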

(UTF-8 is another variable-width encoding where a code point can be anywhere from one to four 8-bit code units, and UTF-32 is a fixed-width encoding where every code point is exactly one 32-bit code unit. Because code points can be accents or modifiers, UTF-32 is still variable-width with respect to characters, because there's no guarantee that a given character is composed of a single code point.)
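For anyone who wants to see the numbers, this rough sketch counts code units per encoding for a few sample code points. The samples and the `unitCounts` helper are my own picks for illustration (the cat face is again assumed to be U+1F431). It relies on `TextEncoder`, which always produces UTF-8, and on the fact that a UTF-32 code unit count equals the code point count.

```javascript
// Hypothetical helper: number of code units a string needs in each encoding.
function unitCounts(s) {
  return {
    utf8: new TextEncoder().encode(s).length, // 8-bit code units (bytes)
    utf16: s.length,                          // 16-bit code units
    utf32: [...s].length,                     // 32-bit code units == code points
  };
}

// Sample code points chosen to hit each UTF-8 width (my choice, not from the thread).
const samples = {
  "U+0041 (A)": "\u0041",
  "U+00E9 (e with acute)": "\u00E9",
  "U+20AC (euro sign)": "\u20AC",
  "U+1F431 (cat face)": "\u{1F431}",
};

for (const [label, s] of Object.entries(samples)) {
  console.log(label, unitCounts(s));
}
```

The UTF-8 count goes 1, 2, 3, 4 across those samples, UTF-16 goes 1, 1, 1, 2, and UTF-32 stays at 1 throughout.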

Because UTF-16 is a variable-width encoding, and because JavaScript exposes UTF-16 code units (instead of code points or characters), it is possible to delete half of a code point and even end up with an invalid UTF-16 string in some cases (IIRC). As another comment mentions, some languages (e.g. Python 3) expose code points instead, which still isn't the same as characters.
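Here is a minimal sketch of what "deleting half a code point" looks like, which I suspect is what the scratchpad's first delete is doing. Same assumption as before that the cat face is U+1F431; `hasLoneSurrogate` and `deleteLastCodePoint` are helper names I made up, not built-ins.

```javascript
// Assumption: the cat face is U+1F431, stored as the surrogate pair D83D DC31.
const catFace = "\u{1F431}";

// Hypothetical helper: true if the string contains an unpaired surrogate,
// i.e. it is not well-formed UTF-16.
function hasLoneSurrogate(s) {
  for (const ch of s) {            // iterates by code point; pairs stay together
    const cp = ch.codePointAt(0);
    if (cp >= 0xd800 && cp <= 0xdfff) return true;
  }
  return false;
}

// Deleting one code unit leaves only the high surrogate behind.
const halfDeleted = catFace.slice(0, -1);
console.log(halfDeleted.length);              // 1
console.log(halfDeleted === "\uD83D");        // true
console.log(hasLoneSurrogate(catFace));       // false
console.log(hasLoneSurrogate(halfDeleted));   // true

// Hypothetical helper: delete a whole code point instead of a code unit.
function deleteLastCodePoint(s) {
  return [...s].slice(0, -1).join("");
}

console.log(deleteLastCodePoint("a" + catFace) === "a"); // true
```

Note that deleting by code point still isn't deleting by character: it would strip just the accent from an "e" plus combining accent sequence.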



