
Having been around for the creation of Unicode, I remember plenty of people arguing that 16 bits was insufficient to encode all the characters and fighting for the competing ISO standard, which was 32 bits. A final compromise made Unicode a subset of ISO 10646, and the two character sets are now synchronized and act as parallel standards. UTF-8/16/32 were part of things from the beginning, not because of the 16-bit assumption behind Unicode 1.0, but for the sake of keeping data reasonably sized. UTF-8 lets 7-bit ASCII files (the majority of text files back in the day) be compliant without change, and it kept the default size of text files from being doubled or quadrupled in an era when that was a serious concern for both bandwidth (dial-up connections at 2400 baud or lower were quite common) and storage (20 MB was considered a HUGE amount of storage).
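
To make the size argument concrete, here is a minimal Python sketch. The sample string and variable names are my own, and I use the BOM-less little-endian codecs so the byte counts are not skewed by a byte-order mark. It checks that pure ASCII text is byte-for-byte identical in UTF-8, while the same text takes twice the space in UTF-16 and four times in UTF-32:

```python
# Hypothetical sample: any text made only of 7-bit ASCII characters works.
text = "Hello, world!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
# The "-le" codecs do not prepend a byte-order mark, so sizes are comparable.
utf16_bytes = text.encode("utf-16-le")
utf32_bytes = text.encode("utf-32-le")

# An ASCII file is already a valid UTF-8 file: the bytes are the same.
same_as_ascii = utf8_bytes == ascii_bytes

sizes = {
    "ascii": len(ascii_bytes),
    "utf-8": len(utf8_bytes),
    "utf-16": len(utf16_bytes),
    "utf-32": len(utf32_bytes),
}

print(same_as_ascii)
print(sizes)
```

This only covers ASCII-range text; characters outside that range take two to four bytes each in UTF-8, so the ratios differ for other scripts.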
