
> We collectively decided that Unicode is a thing now (including terminal emulator support), and that emoji are a part of it.

That sounds like democracy applied to the IT world, and we all know that "democracy is the least worst system". Democracy doesn't necessarily imply rational thinking; e.g., we are building websites using PHP and WordPress, not Lisp (well, except HN, perhaps).

My understanding (probably erroneous) of the article is: we, as developers, are condemned to use not-so-well-designed software because other developers don't know better and keep a) building bad software, b) using bad software.

As usual, the majority is the driving force; the question is: are you part of the majority, or are you like Ryan? As for me: I'm like Ryan only on Mondays, mostly.



Sorry, I edited my comment. I realised that wasn't quite what I wanted to say.

But to respond to your comment, I treat Unicode as a special case, because it is the result of a reform of text encoding from the ground up. Before Unicode, we had proprietary character sets and codepages, mostly mutually unintelligible.

Unicode brought order, interoperability, and the ability to represent in digital writing whatever might conceivably need such a representation, now or in a hundred years. It's forwards-extensible, so future cultures and alphabets will also be well served.

I think it's just complex enough for the size of the problem, so it's actually a counter-example to the contrived complexity represented by PHP and WordPress.


U+0007 represents the character for ‘make the bell on the teletypewriter ring.’ Hardly anyone uses it in practice today, but it remains a part of the Unicode standard and probably will forever. I don’t want to disparage Unicode, because it does provide a good solution to the problem. But it has its own issues and its own degree of complexity.

This issue is something I’ve been thinking about lately, since beginning to use Gemini. Gemini is a minimalistic file transfer protocol, like HTTP but for transferring plaintext. But it builds on top of Unicode, of course, and so it inherits all of Unicode’s complexity. Suddenly, the most difficult part for Gemini client authors is not implementing the protocol itself, but correctly handling the complexity of Unicode text rendering. (E.g. https://gmi.skyjake.fi/gemlog/2021-07_lagrange-1.6.gmi)

Obviously, I’m not proposing replacing Unicode. But Unicode is complex, and that complexity ends up being inherited by all projects built on top of it. That’s an example of complexity that’s super easy to miss unless you’re looking for it. Which I think is the original author’s point. Unless you actively try to decrease complexity, complexity only increases.


> U+0007 represents the character for ‘make the bell on the teletypewriter ring.’ Hardly anyone uses it in practice today

Sure it is. In some (many?) terminal emulators it makes the screen flash.

Some programs use it to signal that a significant error has occurred, so that it might be noticed amongst all the blather that command line programs barf out these days.
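To make that concrete, here is a minimal Python sketch of what such a program does. The helper names `describe` and `ring_bell` are made up for illustration, and whether the terminal beeps, flashes, or does nothing at all depends entirely on the emulator and its settings.

```python
import sys
import unicodedata

BEL = "\a"  # the same character as "\x07", i.e. U+0007


def describe(ch):
    """Return the code point label and Unicode general category of ch."""
    return f"U+{ord(ch):04X}", unicodedata.category(ch)


def ring_bell(stream=sys.stderr):
    """Write BEL to the stream; the terminal decides how to react."""
    stream.write(BEL)
    stream.flush()


if __name__ == "__main__":
    # "Cc" is the general category for control characters.
    print(describe(BEL))
    ring_bell()
```

Writing to stderr is just a choice made here so that the bell is not swallowed when stdout is piped somewhere else.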


A better example might be country flags. When the political situation changes or the flag is redesigned, how should Unicode and the various fonts handle this? Presumably text written in the past shouldn't change for viewers in the future.

Unicode mixes abstract concepts and concrete representations together on a number of levels. It seems like a legitimately difficult problem to me though. An 'a' character might be written differently but remains fundamentally the same thing since ancient times. Meanwhile the spelling and meaning of words shifts over time. So how are you supposed to handle pictograms that can change simultaneously in both commonly accepted representation and fundamental meaning?

It's also ... weird ... that a TTY bell has a dedicated not-glyph in a writing system, isn't it? Colors are (appropriately, imo) handled via escape codes. I certainly don't want support for HTML flashing text to be added to Unicode! I do appreciate the historical context that led to the current situation though.


Unicode handles country flags well precisely because they are a controversial topic. It doesn't define which flags are available; the ISO country codes, font vendors, and OS vendors (or emoji picker vendors) do. As a fallback, we see the country code instead of the flag.
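For anyone who hasn't looked at the mechanics: a flag is encoded as a pair of regional indicator symbols (starting at U+1F1E6 for "A") that spell out a two-letter country code. A rough Python sketch, with the function names `flag_sequence` and `country_code` invented for the example:

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag_sequence(code):
    """Map a two-letter country code to its pair of regional indicators."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter code such as 'JP'")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


def country_code(flag):
    """Recover the letters from a string of regional indicator symbols."""
    return "".join(chr(ord(ch) - REGIONAL_INDICATOR_A + ord("A")) for ch in flag)


if __name__ == "__main__":
    jp = flag_sequence("JP")
    # Whether this renders as a flag or as two boxed letters is up to the font.
    print(jp, [f"U+{ord(ch):X}" for ch in jp], country_code(jp))
```

Nothing in the encoded text says what the flag looks like; that part is left to whoever draws the glyph.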


I'm aware, and don't think I'd agree that flags (or emoji more generally) were handled well at all. I generally think that such things should be handled via an escape code that invokes some other (higher level) standard, the same as colored or bold font in the terminal. It would arguably make much more sense to encode flags (and emoji-like things in general) as some sort of compressed SVG representation. At least then (among other things) I wouldn't have to wonder how things would look to message recipients or someone reading old messages in the future.


>Gemini is a minimalistic file transfer protocol, like HTTP but for transferring plaintext

It is my understanding that Gemini allowed you to transfer any file type and came with one simple file type defined. Is this an incorrect understanding?


No, you’re correct. I was oversimplifying in an attempt to ensure my comment didn’t devolve into a description of Gemini. For people interested in learning more, the Gemini project page is very readable.

=> https://gemini.circumlunar.space/docs/


Maybe that was a tradeoff to reduce complexity in converting ASCII to Unicode?


Unicode is basically a superset of ASCII that can encode original ASCII text losslessly, so I imagine that's the case.
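A quick way to check the superset claim in Python, assuming UTF-8 as the encoding (UTF-16, for instance, would not be byte-identical). The function names are only for illustration:

```python
def same_bytes_in_both(text):
    """True if the text encodes to identical bytes in ASCII and UTF-8."""
    return text.encode("ascii") == text.encode("utf-8")


def all_ascii_code_points_match():
    """True if every 7-bit byte decodes to the same character either way."""
    return all(
        bytes([i]).decode("ascii") == bytes([i]).decode("utf-8") == chr(i)
        for i in range(128)
    )


if __name__ == "__main__":
    # Control characters such as BEL ("\a") are included in the 0..127 range.
    print(same_bytes_in_both("bell: \a"), all_ascii_code_points_match())
```

So the first 128 code points, control characters included, carry over unchanged, which is why existing ASCII files are already valid UTF-8.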



