Ask HN: Why doesn't HN scrub this train emoji?
8 points by isoprophlex 29 days ago | 5 comments
See: https://news.ycombinator.com/item?id=28633934

I came across this post that accidentally contains a train emoji. Copy/pasting into a new post worked, even though I thought emoji were scrubbed from submitted text. And indeed when I select a train emoji on my phone keyboard... no trains!

I can't figure out what makes the train in that post special. Does anyone know what's going on here? User Zokier mentions "private area unicode" but I'm having a hard time grokking how that causes this issue.

Train from post: 

Train from my phone keyboard: Gets redacted

It’s not a defined emoji. In fact it’s not an assigned character at all. Private Use Area means that the code point is specified as having no standard semantics, so you can use it for whatever you like. You just happen to have a font that puts a train glyph there. For me, I see a box containing the characters “E01F”, corresponding to the scalar value U+E01F.
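If you want to check this yourself, here is a minimal sketch using Python's standard `unicodedata` module. The `describe` helper is just an illustrative name, and U+1F682 (a real train emoji) is only there for comparison. Private-use code points report the general category `Co` and have no character name.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, general category, and name of one character."""
    code_point = ord(ch)
    category = unicodedata.category(ch)       # "Co" = Other, private use
    name = unicodedata.name(ch, "<no name>")  # PUA code points have no name
    return f"U+{code_point:04X} {category} {name}"

print(describe("\ue01f"))      # the "train" from the post
print(describe("\U0001F682"))  # an actual emoji, category "So"
```

The exact category assigned to a character depends on the Unicode version bundled with your Python, but the private-use ranges themselves are fixed.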


It would probably make sense for HN to strip out PUA stuff, given its decision to strip emoji.
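A rough sketch of what such stripping could look like, in Python. This is not HN's actual filter (I don't know how that is implemented); `strip_private_use` is a hypothetical helper that simply drops every character whose general category is `Co`, which covers U+E000 to U+F8FF plus the two supplementary private-use planes.

```python
import unicodedata

def strip_private_use(text: str) -> str:
    """Drop every Private Use Area character (general category "Co")."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Co")

# The PUA "train" disappears; ordinary text is left untouched.
print(strip_private_use("Train from post: \ue01f!"))
```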

Ah thanks! Now I get the Private Use Unicode stuff.

So the question should rather be addressed to my iOS device: "why the hell does iOS render a train glyph for U+E01F?"

Same reason it renders U+F8FF as something (I've seen it render as the Klingon logo, as Tux; currently some font on my system is using it for a stylized 't' surrounded by brackets): someone wanted a glyph available that isn't in Unicode. You'll have to ask Apple about the why.

Just bear in mind that those glyphs have no meaning at all without the font that includes them. Without the font, you get a □ or, better, a block of Unicode tofu: a square filled with the hex digits of that code point.

Interestingly, my Mac does not…

Characters in the Private Use Area do not have any predefined meaning. You happen to have a font which shows a train when those characters are used, but most people do not.
