
Comparing how different devices display the SSID “á̶̛̛̓̿̈͐͆̐̇̒̑̈́͘͝aaa” - herohamp
https://hamptonmoore.com/posts/weird-wifi-name-display/
======
mitchs
Fun story about one of the devices mentioned there that I worked on. We used
to store the saved wifi creds in a file named exactly what the SSID was.

Some user managed to break things, and with their permission we gathered
detailed wifi logs and found they were connected to an SSID that was an ASCII
depiction of the equation: boobs plus penis equals a smiley face. The issue
was the forward slashes, presumably there to add fingers to the scene. Must
have been an awkward customer service follow up when we told them to change
their SSID while they waited for an update.

~~~
SCHiM
Sounds like a directory traversal to me :)

It's generally a bad idea to have the user in control of filenames you create
if those files are not on a device they own.
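
A minimal sketch of one way to avoid that, assuming the stored file only needs a name that is unique per SSID. `creds_path`, the `ssid-` prefix, the `.conf` suffix and the storage directory are all hypothetical names for illustration, not what the device mentioned above actually did:

```python
from pathlib import PurePosixPath


def creds_path(base_dir: PurePosixPath, ssid: bytes) -> PurePosixPath:
    """Return a per-network file path whose name is derived from the SSID bytes."""
    if len(ssid) > 32:
        raise ValueError("an 802.11 SSID is at most 32 bytes")
    # Hex-encode instead of using the SSID verbatim: the result contains only
    # [0-9a-f], so '/', '..' and NUL bytes can never reach the file system.
    return base_dir / f"ssid-{ssid.hex()}.conf"


base = PurePosixPath("/var/lib/wifi")  # hypothetical storage directory
print(creds_path(base, b"home"))       # /var/lib/wifi/ssid-686f6d65.conf
print(creds_path(base, b"../../etc/passwd"))  # still directly inside base
```

The trade-off is that the file names are no longer human-readable, but they also no longer depend on whoever configured the access point.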

~~~
thaumasiotes
In this case, it sounds like the files were on a device owned by the user?

~~~
account42
The user in control here is the one configuring the SSID, which is not
necessarily the same one owning the device used to connect to it.

------
GekkePrutser
I used to have something like this as my SSID: ʕ•̫͡•ʕ _̫͡_ ʕ•͓͡•ʔ-̫͡-ʕ•̫͡•ʔ
_̫͡_ ʔ-̫͡-ʔ (Not this particular one as it was too long though!) Many nice
examples at:
[https://1lineart.kulaone.com/#/](https://1lineart.kulaone.com/#/)

It was fun but some OSes didn't show it correctly, in particular Windows. It
would just show it in HEX. And more annoyingly, some devices refused to
connect to it at all, especially IoT crap like those WiFi power sockets.

So eventually I gave up.

PS: Something with more vertical stuff would also be really fun, some of these
can write across multiple lines of unrelated content! Unfortunately most OSes
block this from happening now. Example:

Ỏ̷͖͈̞̩͎̻̫̫̜͉̠̫͕̭̭̫̫̹̗̹͈̼̠̖͍͚̥͈̮̼͕̠̤̯̻̥̬̗̼̳̤̳̬̪̹͚̞̼̠͕̼̠̦͚̫͔̯̹͉͉̘͎͕̼̣̝͙̱̟̹̩̟̳̦̭͉̮̖̭̣̣̞̙̗̜̺̭̻̥͚͙̝̦̲̱͉͖͉̰̦͎̫̣̼͎͍̠̮͓̹̹͉̤̰̗̙͕͇͔̱͕̭͈̳̗̭͔̘̖̺̮̜̠͖̘͓̳͕̟̠̱̫̤͓͔̘̰̲͙͍͇̙͎̣̼̗̖͙̯͉̠̟͈͍͕̪͓̝̩̦̖̹̼̠̘̮͚̟͉̺̜͍͓̯̳̱̻͕̣̳͉̻̭̭̱͍̪̩̭̺͕̺̼̥̪͖̦̟͎̻̰_Ỏ̷͖͈̞̩͎̻̫̫̜͉̠̫͕̭̭̫̫̹̗̹͈̼̠̖͍͚̥͈̮̼͕̠̤̯̻̥̬̗̼̳̤̳̬̪̹͚̞̼̠͕̼̠̦͚̫͔̯̹͉͉̘͎͕̼̣̝͙̱̟̹̩̟̳̦̭͉̮̖̭̣̣̞̙̗̜̺̭̻̥͚͙̝̦̲̱͉͖͉̰̦͎̫̣̼͎͍̠̮͓̹̹͉̤̰̗̙͕͇͔̱͕̭͈̳̗̭͔̘̖̺̮̜̠͖̘͓̳͕̟̠̱̫̤͓͔̘̰̲͙͍͇̙͎̣̼̗̖͙̯͉̠̟͈͍͕̪͓̝̩̦̖̹̼̠̘̮͚̟͉̺̜͍͓̯̳̱̻͕̣̳͉̻̭̭̱͍̪̩̭̺͕̺̼̥̪͖̦̟͎̻̰

So the Unicode above this would write through the next lines on some
platforms, even system screens like the wifi chooser :)

But these quickly get too long for an SSID too.

~~~
joshschreuder
This worked for me on Windows + Chrome :)

[https://i.imgur.com/bJHlPb9.png](https://i.imgur.com/bJHlPb9.png)

EDIT: though not on native HN, I think it might be a result of having HNES
installed.

~~~
GekkePrutser
Lol, that's really bad!

On native HN in Firefox Windows it also works but it stops at the "reply"
button under my post.

And on native HN on Firefox Mac it doesn't work at all, strangely enough.
Firefox must rely strongly on the platform rendering.

------
barbegal
The 802.11 standards have always allowed up to 32 bytes, which can be filled
with any data; it does not have to be in a particular encoding. In 802.11-2012
there is a separate tag SSIDEncoding which can be used to specify if these
bytes are in UTF-8 or "unspecified". If the UTF-8 option is set, the SSID
should be interpreted as UTF-8.

It is not clear in this case if the router sets this flag or not. Either way
there is no stipulation in the spec about how the UTF-8 characters should be
displayed so many of these options are potentially valid.

~~~
ynik
The bytestring was truncated after 32 bytes, in the middle of a UTF-8 byte
sequence. This means the resulting truncated string is not valid UTF-8
anymore. So my guess is that most devices decide "if it's not valid UTF-8, it
must be $LEGACY_ENCODING".
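
A small sketch of that truncation, assuming the untruncated SSID was the 32 bytes reported in the article followed by `\x84` and three `a` characters (as noted elsewhere in this thread). `is_valid_utf8` is just a helper name for this example:

```python
full = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd\x84"
    b"aaa"
)
truncated = full[:32]  # what fits into the 32-byte SSID field


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as UTF-8 without errors."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(len(full), is_valid_utf8(full))            # 36 True
print(len(truncated), is_valid_utf8(truncated))  # 32 False
print(hex(truncated[-1]))                        # 0xcd, a lead byte with no continuation
```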

~~~
tialaramex
Unicode offers two ways forward when you can't decode what you have. One
alternative is an exception: you just fail because you weren't able to decode
something.

The other is that, for any code unit that won't decode, you emit U+FFFD, the
Unicode Replacement Character, and then you carry on decoding.

For humans U+FFFD makes it obvious something is wrong, it's typically
visualised as a black diamond with a white question mark. And for a machine it
shouldn't match parsing rules, it isn't an alphanumeric, it isn't any of the
common separator or spacing characters, so it's unlikely to be of use in an
attack.
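
In Python those two options correspond to the `strict` and `replace` error handlers of `bytes.decode`. A short sketch, using a made-up byte string that ends in a dangling lead byte:

```python
raw = b"caf\xc3\xa9 \xcd"  # valid UTF-8 "café " followed by an incomplete sequence

# Option 1: fail loudly.
try:
    raw.decode("utf-8", errors="strict")
except UnicodeDecodeError as exc:
    print("strict decoding failed at offset", exc.start)  # offset 6

# Option 2: substitute U+FFFD and keep going.
repaired = raw.decode("utf-8", errors="replace")
print(repaired == "caf\u00e9 \ufffd")                   # True
print("\ufffd".isalnum(), "\ufffd".isspace())           # False False
```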

~~~
account42
That is a reasonable approach if you know that what you are decoding is
supposed to be UTF-8.

If you don't know the text encoding because there is no information to
indicate it (or you don't trust that information to be correct) then you will
have to guess and "decode as UTF-8 for valid UTF-8, use some legacy encoding
otherwise" is a common approach (used e.g. by many text editors).
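
A rough sketch of that guessing strategy. `guess_decode` is a hypothetical helper, and cp1252 as the legacy encoding is only an assumption; a real device would pick whatever its platform considers the default:

```python
def guess_decode(raw: bytes, legacy: str = "cp1252") -> str:
    """Decode as UTF-8 if the bytes are valid UTF-8, else fall back to a legacy codec."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the legacy codec does not define become U+FFFD instead of raising.
        return raw.decode(legacy, errors="replace")


print(guess_decode("café".encode("utf-8")))  # café (valid UTF-8)
print(guess_decode(b"caf\xe9"))              # café (invalid UTF-8, read as cp1252)
```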

------
saagarjha
> Both the s8 and the Firestick are rendering the result in what I deem as the
> correct way with it showing the name just with some of the vertical
> characters cutoff.

At least one is doing a poor job, though, because the diacritics look nothing
alike…

> After asking around on the Apple discord server someone said it might be
> using the Mac OS Roman character set. It turns out it is, which is strange
> because iOS used UTF-8 internally and not Mac OS Roman as that was phased
> out with the release of Mac OS X.

I would guess that some part of IOKit is passing a C or C++ string to
CoreFoundation using an inappropriate function or using the “system encoding”.
I can’t remember off the top of my head, but Mac OS Roman might also be
encoding 0. In any case there’s certainly a convention going on there with a
poor default or some sort of strange compatibility story.

(I’m actually curious if there is “supposed” to be an encoding for this.
Perhaps Mac OS Roman is just as correct and more convenient?)

~~~
kalleboo
The first Apple Airport routers predate MacOS X, so it wouldn’t be crazy for
the initial MacOS X implementation to fall back to MacOS Roman as backcompat
to routers configured with MacOS 8.6/9. And then if they never changed it
since for 99% of users the UTF8 auto detect works fine...

------
unnouinceput
And out of curiosity, taking some from this:

[https://github.com/minimaxir/big-list-of-naughty-strings/blo...](https://github.com/minimaxir/big-list-of-naughty-strings/blob/master/blns.txt)

especially the Asian ones, the effects seem to vary from mildly amusing to
interesting when you try to set them as SSID.

~~~
unphased
What's so naughty about Lightwater Country Park?

~~~
lozf
twat

(I'm not calling you a twat, I'm pointing out that twat is probably the
problem.)

------
nix23
Ok i am a bit angry, first i was thinking that a fly shit is on my screen,
then that my GPU has a problem, then i read the Title ;)

It's really crazy, looks completely different on my bsd-box compared to my
linux-laptop LOVE IT!!

~~~
herohamp
If you want to share screenshots I'll happily put them up on the site. My email
is me (at) hampton {dot} pw

~~~
nix23
Sure, here you go:

[https://imgur.com/a/Sqjh2TZ](https://imgur.com/a/Sqjh2TZ)

------
bouke
My Canon printer won’t join my SSID containing an emoji, helpfully throws
generic E36 (or something like that). All Apple devices show and connect to
the SSID just fine.

------
sdedovic
I'd be curious to see how a car may display that.

I've paired my phone with a family member's Volkswagen SUV and it could not
display the SSID properly, an emoji.

Most laptops are capable of displaying emoji SSIDs (bluetooth and wifi).

------
tarsius
In my firefox it looks like four "a"s with a little rat sitting on top of the
first of them.

~~~
seanalltogether
Same, I actually assumed that's what the string was supposed to be, now
opening this in chrome I can see how wildly different it looks

~~~
sundarurfriend
This[1] is how it looks to me on my Firefox, is that the Chrome version or the
Firefox version on your side?

[1]
[https://i.postimg.cc/nVPBqXjV/fireunicode.png](https://i.postimg.cc/nVPBqXjV/fireunicode.png)

~~~
seanalltogether
Oh weird, are you on windows? I'm on mac and I see all the diacritics squished
down into a small pile in firefox

------
bobowzki
I've got some dirt on my screen. Be right back.

~~~
jyriand
I tried to use my nail to scratch it off, no luck.

------
devadvance
Very cool. It's pretty interesting to see the various failure modes. Some seem
straightforward (e.g., the font is missing the glyphs) while others seem to be
parsing limitations.

As an aside, this finally convinced me to explore using additional SSIDs in
creative ways with emojis.

------
GlitchMr
Out of curiosity, I ran this test on Nintendo Switch:
[https://i.imgur.com/8o2LLUm.png](https://i.imgur.com/8o2LLUm.png)

It seems like its OS doesn't support combining characters.

~~~
laken
My SSID is a single emoji, and the Switch displays just the missing char/"box"
for my SSID as well.

------
tzs
For most of the Western world, if you take the set of all commonly used
characters in the language(s) that are widely recognized in each country and
form their intersection, you'll have at least the Arabic numerals and plain
A-Z.

If SSIDs were restricted to just those characters, it would be fine in the
Western World. But of course there is more to the world than the West.

Question: do most or all non-Western languages also have small subsets of
characters that would be fine to restrict SSIDs to? For instance, Wikipedia
tells me that Persian is written with a 32 character alphabet, and Arabic uses
28 characters for its alphabet.

I'd expect that for every alphabet-based language, there is a similar base set
of characters you could reasonably limit SSIDs to, and so avoid all the
problems you get with allowing full Unicode.

How about the languages that use logographic writing systems, such as Chinese,
Japanese, Korean, and Vietnamese? Do they all have reasonable (albeit probably
very large) subsets SSIDs could be limited to that would avoid all their weird
stuff that can happen in Unicode but still allow most reasonable names to be
used?

~~~
jmiserez
Don't forget that some of these are right-to-left (e.g. Hebrew, Arabic). Words
are rendered right-to-left, and early email software would just expect each
word to be sent reversed so that simple LTR rendering could be used. Unicode
solves this (and many other issues) quite nicely.

------
yunruse
I tested this out of curiosity, and all iPhones I could find in my household
rendered correctly in UTF-8 with only 12 octets [0]. This is replicated on
iPhone 7, SE and XR, all running 13.5.1. So it may well be the issue was fixed
in 6s or 7.

[0] [https://i.imgur.com/KDau4PP.jpg](https://i.imgur.com/KDau4PP.jpg)

------
tinus_hn
At least it nowhere caused an exploitable crash

~~~
dannyw
On popular, actively maintained operating systems.

Plug in your cheap Chinese IoT device and see what happens...

~~~
saagarjha
It might actually do well if you feed it Chinese…

------
yrwwywtywsrty
Last I checked late last year, my PlayStation 4 was unable to connect to my
network when I used a single emoji in the SSID.

------
bravoetch
My Logitech device won't even acknowledge an SSID with Japanese katakana.

------
worewood
Tried to set the SSID of an Android Phone Wi-Fi tethering, it said it exceeds
the maximum character limit and does not let it set. Bummer

------
yrlf
This is a really good post that shines some light on how the insanity of
encodings still isn't fixed today, since so many operating systems still don't
completely use Unicode everywhere.

Some of the reasonings behind why the characters are displayed like that are
slightly incorrect, though, so here are some corrections:

I'm going to supply each example here with some python3 code to reproduce
with, with the following definition:

`data = b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"`

First, let's start at the beginning:

> My router just cut the name down to 32 octets though to stay compliant
>
> This was what was being sent according to iw
>
> `a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd`

If you look at this closely, the last byte in this sequence is `\xcd`, which
is an incomplete UTF-8 character. It's missing the final `\x84` that the
router cut off (along with the three additional `a` characters).

> with the raw hex being
>
> `97ccb6cc81cc93ccbfcc88cc9bcc9bcd90cd98cd86cc90cd9dcc87cc92cc91cd`

small mistake: the hex of `a` is `61`, not `97` (that's decimal), but
otherwise correct.
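
A quick way to check this, repeating the `data` definition from above so the snippet runs on its own:

```python
data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

print(ord("a"))       # 97   (decimal)
print(hex(ord("a")))  # 0x61 (hexadecimal)
print(data.hex())     # 61ccb6cc81... : same as the article's hex, but starting with 61
```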

> Galaxy S8 running Android 9 with Kernel 4.4.153
> Amazon Firestick

Everything correct, except for a small detail:

These two devices render the result of UTF-8 decoding while ignoring bytes
that are invalid UTF-8 (in python3: `data.decode('utf-8', 'ignore')`)
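
Spelled out as a runnable sketch (with `data` repeated so it stands alone), this keeps the `a` and the 15 complete combining marks and silently drops the dangling `\xcd`:

```python
import unicodedata

data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

text = data.decode("utf-8", "ignore")
print(len(data), len(text))  # 32 16

# Everything after the leading "a" is a nonspacing combining mark (category Mn),
# which is why it all stacks on top of a single letter.
marks = text[1:]
print(all(unicodedata.category(ch) == "Mn" for ch in marks))  # True
```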

> iPhone 6 running iOS 13.5.1
> Apple TV Second Generation

Completely correct. This is definitely Mac OS Roman (in python3:
`data.decode('mac_roman')`)
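
As a standalone sketch: Python's `mac_roman` codec assigns a character to every byte value, so no error handler is needed and all 32 bytes become 32 visible characters:

```python
data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

roman = data.decode("mac_roman")
print(len(roman))  # 32: one character per byte, nothing dropped
print(roman)       # compare this with the iPhone / Apple TV screenshots
```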

> Windows 10 Pro 10.0.19041

This one is incorrect again:

Windows is interpreting the characters in the "Windows Codepage 1252" (also
known as "Western") encoding and ignoring invalid characters (in python3:
`data.decode('cp1252', 'ignore')`)
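
A standalone sketch of why the `'ignore'` is needed: Python's `cp1252` codec leaves five byte values undefined (`\x81`, `\x8d`, `\x8f`, `\x90`, `\x9d`), and four of the 32 bytes here hit them:

```python
data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

try:
    data.decode("cp1252")
except UnicodeDecodeError as exc:
    print("strict decoding fails at byte", hex(data[exc.start]))  # 0x81

undefined = {0x81, 0x8D, 0x8F, 0x90, 0x9D}
dropped = [b for b in data if b in undefined]
win = data.decode("cp1252", "ignore")
print(len(dropped), len(win))  # 4 28
```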

Decoding every byte separately as UTF-8 would fail (since every byte that
can be a continuation of a UTF-8 character is not a valid start byte).
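
To illustrate (a sketch with `data` repeated; `decodes_alone` is just a helper name for this example), only the ASCII `a` survives when each byte is decoded on its own, because lead bytes are incomplete without their continuation and continuation bytes cannot start a sequence:

```python
data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)


def decodes_alone(byte: int) -> bool:
    """Return True if this single byte is valid UTF-8 by itself."""
    try:
        bytes([byte]).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


ok = [b for b in data if decodes_alone(b)]
print(len(ok), len(data) - len(ok))  # 1 31
```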

Interpreting every byte as a Unicode code-point number would give
something very similar, but not exactly the same: What Windows decodes as
quote, caret-y thing, angle bracket-y thing, tilde, dagger, double dagger, and
single quote fall into a control character block at the start of the Unicode
"Latin-1 Supplement" block (U+0080 to U+009F).

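Treating each byte as a code point is what Python's `latin-1` codec does, so the difference from cp1252 can be sketched like this (with `data` repeated so the snippet stands alone):

```python
import unicodedata

data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

latin1 = data.decode("latin-1")  # byte value N becomes code point U+00NN
c1 = [ch for ch in latin1 if 0x80 <= ord(ch) <= 0x9F]
print(len(latin1), len(c1))                                # 32 13
print(all(unicodedata.category(ch) == "Cc" for ch in c1))  # True: invisible controls

# cp1252 instead maps most of that range to printable punctuation:
print(b"\x86\x87".decode("cp1252"))  # dagger and double dagger
```
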
> Chromebook running ChromeOS 83.0.4103.97

Correct.

The Chromebook seems to have rendered the ASCII a, but replaced all other 31
characters with question marks.
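
One way to reproduce that output (a guess at the behaviour, not necessarily what ChromeOS does internally) is to keep ASCII bytes and substitute a question mark for everything else:

```python
data = (
    b"a\xcc\xb6\xcc\x81\xcc\x93\xcc\xbf\xcc\x88\xcc\x9b\xcc\x9b\xcd\x90"
    b"\xcd\x98\xcd\x86\xcc\x90\xcd\x9d\xcc\x87\xcc\x92\xcc\x91\xcd"
)

# Keep 7-bit ASCII as-is, show "?" for every byte with the high bit set.
shown = "".join(chr(b) if b < 0x80 else "?" for b in data)
print(shown)  # an "a" followed by 31 question marks
```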

> Kindle Paperwhite running Firmware 5.10.2
> Vizio M55-C2 TV

Also correct.

Those two devices seem to opt to display hex instead of falling back to
question marks as the Chromebook does.

I hope this comment gave some useful insight into why these devices decoded it
this way :)

~~~
herohamp
Hey, I am the OP. Thank you so much! I will go through and amend what I got
wrong. Any way that you wish for me to credit you?

~~~
yrlf
If you want to credit me, just tag my twitter :)

(@theFerdi265)

------
cl3misch
How are you running iOS 13 on an iPhone 6? Or did you mean 6S?

------
app4soft
> _Comparing how different devices display the SSID “á̶̛̛̓̿̈͐͆̐̇̒̑̈́͘͝aaa”_

I always thought that such Unicode characters were not allowed in HN titles.

------
tonetheman
This is a wonderful article and great work. I love this type of content.
Brilliant!

