Hacker News
Pokemon Sword and Shield Are Crashing Roku Devices (gamerant.com)
195 points by chromaton on Nov 18, 2019 | 100 comments



> Most people familiar with the situation are attributing the problem to a Roku, in that it's the Roku devices that are malfunctioning and not the Nintendo Switch.

I would certainly hope that "entering an endless boot loop" is considered a malfunction no matter what network traffic is occurring.


It's a good example of not following Postel's law:

TCP implementations should follow a general principle of robustness: be conservative in what you do, be liberal in what you accept from others.

As originally mentioned here in RFC 761 but credit to Wikipedia for having the citation. https://tools.ietf.org/html/rfc761


> I would certainly hope that "entering an endless boot loop" is considered a malfunction no matter what network traffic is occurring.

Once the Switch is in airplane mode, the endlessly looping device seems fixed.

The fact that it reboots is probably a bug.

The fact that it reboots endlessly is caused by the Nintendo Switch endlessly broadcasting the same signal.


That's still not really Nintendo's fault though. Network services are allowed to repeatedly broadcast the same packets, and they do so for legitimate purposes. If competently written, the Roku should have no trouble discarding invalid input.


>Network services are allowed to repeatedly broadcast the same packets, and they do so for legitimate purposes.

To add to this, even if it weren't for legitimate purposes, a device shouldn't enter a boot loop because another device is malfunctioning.


Malfunctioning or malicious.


Counterexample:

  1. Hello!
  2. Three-way handshake.
  3. Authentication.
  4. Please reboot.
  GOTO 1


If you can auth without intending to that seems like a bug to me.


Oh, in all likelihood the actual system doesn't have any auth at all! But there are legitimate reasons for network traffic to toggle the "power state" of a machine (not that I think such features are useful). Some computers implement it. It's just unfortunate they didn't implement it right in the Rokus.


From GP: > no matter what network traffic is occurring.

The chain described there is "possible network traffic", and would be a valid type of network traffic, "while-meaning-to" if we so please, but there is no way to distinguish that and it is thus meaningless to code.

A device that would never reboot, no matter what network traffic is occurring, would never allow you to reboot it remotely... SSH is network traffic. Your SSH authentication also is.

The point wasn't about the Roku, it was a pointed reply to that particular point, I think. :)


I really wish we could stamp out this excessive pedantry. It is completely clear what OP meant. There is no need to pretend we are robots.


I really wish people would be more precise in their language. Often, it is completely unclear to me what people mean. I'm too autistic to properly read "between the lines".


Pedantry is necessary if we are going to talk about computer bugs and flaws in the intended behaviour of systems (AKA the "hacker" part of "hackernews"). I hope we never change.


I know many hackers who can communicate just fine without redirecting conversation to a debate over the meaning of words. Semantics are trivial to clarify. Pedantry honestly has very little to do with building things except when resolving miscommunication.

Besides, words have multiple meanings, and there are many floating signifiers in the world. Correcting diction without an acknowledgment of intent might as well be pissing into the wind.

On a personal level, aimless pedantry is a terrible attribute in people you work with, and these people can be toxic to productivity.


We may not be robots, but our code is still run by machines which are incredibly pedantic. Any imprecisions which require human intuition to untangle won't help solve practical problems.


Hackernews comments don’t get run on anything. Not to mention that they are imprecise anyway because they are written in English.


Explanation, from Reddit:

> Pokémon sends a network discovery packet to each device on port 26037. Roku also listen on that port for LAN based updates so that multiple devices on the same network can update each other. It was an obvious decision. Saved Roku around a quarter million dollars in CDN traffic costs. Roku is popular in the commercial space where it’s often used as a media source to control sometimes 100s of TVs on the same network. It just so happens that Pokémon’s network discovery packet shares the exact same bytes as Roku’s signed bytecode to reboot.

> The odds are astronomically low. We could have wound up with an alien planet full of Justin Timberlake clones, but the universe decided this was our colossal fluke.


I'm skeptical of this. Did they provide any proof to that effect? The odds are indeed INCREDIBLY low, low enough for me to suspect Roku doing something lazy in their network traffic sanitization.

EDIT: Nope, nothing. They just stated it as fact and left it at that. I'm calling shenanigans. Amusing theory, but it really is too incredible to believe without any evidence.


yeah I mean why don't you also provide some kind of lengthy code (probably 128 characters for that?) I mean in no way does nintendo send multiple big packets for network discovery. and a reboot command should not be a single udp/tcp packet, I mean wtf.


Look at the name of the subreddit.


Yes, it's called ProgrammerHumor. No, that doesn't automatically mean someone's posting something sarcastic/bullshit.

I'd find a collision of such low probability very humorous if it actually happened.


It’s more than humorous - assuming the signature is secure, nothing that amazing has yet happened in the history of the universe.


It probably happens all the time. We just realize it's astronomically unlikely, assume the easier-to-understand, safer explanation (our own incompetence) instead, and go back to saying we never see that sort of thing.


"There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers." - Richard Feynman


It makes it very likely it's a joke. There's really no reason to treat it as anything else, just because it got mistaken for not-a-joke.


If it is a joke, it is very elaborate. Roku published a service alert about it: https://support.roku.com/article/115012351368


The joke is the explanation not the fact that this issue exists.


What is the size of an accepted message in bytes?


What use is signing a reboot request if the bytes are the same every time? At that point it's just a secret and might as well be random, rather than a signed request. (If you're going to use signing, include a timestamp and/or request id for replay prevention.)
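
To make the replay-prevention point concrete, here is a minimal sketch of what "include a timestamp and/or request id" could look like. Everything in it is an assumption for illustration: the shared key, the 30-second freshness window, the packet layout and the function names are hypothetical, and an HMAC stands in for whatever signature scheme a real device might use. Nothing here is known about Roku's actual protocol.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional, Set

SECRET_KEY = b"hypothetical-shared-secret"  # assumption: a pre-shared key
MAX_AGE_SECONDS = 30                        # assumption: freshness window
HEADER_LEN = 16                             # 8-byte timestamp + 8-byte request id
TAG_LEN = hashlib.sha256().digest_size      # 32 bytes for HMAC-SHA256


def sign_command(command: bytes, request_id: int, now: Optional[float] = None) -> bytes:
    """Prefix the command with a timestamp and request id, then append an HMAC tag."""
    timestamp = int(time.time() if now is None else now)
    body = struct.pack("!QQ", timestamp, request_id) + command
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return body + tag


def verify_command(packet: bytes, seen_ids: Set[int], now: Optional[float] = None) -> Optional[bytes]:
    """Return the command if the packet is authentic, fresh and unseen, else None."""
    if len(packet) < HEADER_LEN + TAG_LEN:
        return None  # too short to be one of ours: drop it
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # wrong or missing tag: drop it
    timestamp, request_id = struct.unpack("!QQ", body[:HEADER_LEN])
    current = time.time() if now is None else now
    if abs(current - timestamp) > MAX_AGE_SECONDS:
        return None  # stale: a recorded packet replayed later fails here
    if request_id in seen_ids:
        return None  # already handled: an immediate replay fails here
    seen_ids.add(request_id)
    return body[HEADER_LEN:]
```

Because the timestamp and request id are covered by the tag, the bytes on the wire differ for every request, so a fixed byte string sent over and over would be rejected after the first time at most.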


> > The odds are astronomically low

Not if both teams, or lazy developers on both teams, picked some sample code from a book, blog or public GitHub repo and didn't bother to make obvious changes.


Given that it's posted in /r/ProgrammerHumor, I don't think this is an explanation or intended to be one.

https://www.reddit.com/r/ProgrammerHumor/comments/dy0p86/how...


/r/ProgrammerHumor occasionally used to have good technical insight behind the jokes; I used to enjoy reading them and writing my own. Sadly I've stopped visiting because the subreddit's been overrun by "JavaScript bad" and bad volume controls (which isn't even programmer humor?)


I mean, I'm sure /r/ProgrammerHumor has had real knee-slappers of great technical sophistication. This, obviously, is not one of them. Still, it's not that unusual for a joke to be mistaken for a real thing, reported in the media and then further repeated by others - as has happened here. It boggles my mind, though, that instead of having a friendly chuckle at this perfectly human mistake, we have it as a top comment on a thread - complete with subthread where 'shenanigans' are being called, it's subject to some earnest (albeit quite iffy) speculative analysis and, most astoundingly, a claim is made (by the shenanigans-caller) that even though it's plainly a joke posted on what is plainly a joke forum, it might contain a true description of the bug.

The only thing we're missing are some back-of-the-envelope estimates of whether Roku really saved $250k on CDN fees and whether Roku devices are, in fact, used in commercial settings.

It's one of the weirdassest things I've seen on HN (a forum, it is worth recalling, on which the intellectual rigour the discipline of programming instills in its practitioners is frequently extolled) and I've been here a while.


Amen!


> It just so happens that Pokémon’s network discovery packet shares the exact same bytes as Roku’s signed bytecode to reboot.

> The odds are astronomically low.

Interesting. Any chance anyone here would know exactly how many bytes we're talking about?


Assuming by "signed" they mean cryptographically signed, and they're not doing something dumb, then the signature itself would be at least 16 bytes. It'd be ~impossible for a collision to ever occur by accident.

So either "signed" means something different in this context or GameFreak replicated Roku's reboot packet for some reason.


> then the signature itself would be at least 16 bytes.

Indeed, and 16-bytes is SHA1 or MD5, both considered insecure at this point. 32-bytes (SHA256) seems more likely to me.

So I'm going to bet that they aren't "signing" the packet.


AES-128 is also 16 bytes (128 bits), which is still (and likely will be for many, many years) completely secure, for all intents and purposes.

(SHA1 is 20 bytes, BTW.)


AES-128 is encryption, not a hash. AES-128 will be broken when computers get enough compute power to calculate 2^127 keys (the brute force attack: after covering 50% of the keyspace).

Cryptographic hashes are prone to the birthday attack instead. MD5 hashes (128-bits) are broken when a computer has enough power to calculate 2^64 keys (birthday attack: they found a hash collision).
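
The two figures above (2^127 and 2^64) can be checked with a few lines of arithmetic. This is only a sketch of the generic bounds for an ideal cipher and an ideal hash; the function names are made up for illustration, and it says nothing about structural weaknesses of any particular algorithm.

```python
import math


def expected_brute_force_trials(key_bits: int) -> int:
    """On average, an exhaustive key search succeeds after half the keyspace."""
    return 2 ** (key_bits - 1)


def birthday_collision_probability(num_hashes: int, hash_bits: int) -> float:
    """Approximate chance that any two of num_hashes random outputs collide.

    Uses the standard approximation p = 1 - exp(-n(n-1) / (2N)),
    where N = 2**hash_bits is the number of possible outputs.
    """
    space = 2 ** hash_bits
    exponent = num_hashes * (num_hashes - 1) / (2 * space)
    return -math.expm1(-exponent)


if __name__ == "__main__":
    print(expected_brute_force_trials(128) == 2 ** 127)
    print(birthday_collision_probability(2 ** 64, 128))
    print(birthday_collision_probability(2 ** 32, 128))
```

With 2^64 random 128-bit outputs the collision chance is already substantial, while with 2^32 outputs it is still negligible, which is why the birthday bound sits at roughly half the output length in bits.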


SHA-1 is 20 bytes.


Maybe he means some leading magic bytes which match for both protocols. The Roku could then fail parsing the package with an inappropriate failure mode since the actual signature is bad or missing.

I think it's funny.


They probably both copypasted the same code from StackOverflow. /s


C'mon guys, if it's sharing the port AND the signing request packet then the same libraries are being used

Jfc


The title is wrongfully accusatory. The Switch is very bad when it comes to network "cleanliness" with its propensity to broadcast for anything. But the fact that some Roku devices are unable to handle proprietary network calls is on Roku, not on Nintendo...


What is wrong with the Switch broadcasting?


It is indiscriminately connecting to every IP on the local network on port 26037. Should be using mdns/bonjour or other well established local discovery protocols. Or at the minimum sending a packet to the broadcast address.
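
For anyone unsure what the difference looks like in practice, here is a rough sketch contrasting "one packet per host" with "one packet to the broadcast address". The port number is the one quoted in this thread and is unverified; the function names and the example subnet are hypothetical, and this is not a description of what the Switch actually does.

```python
import ipaddress
import socket
from typing import List

DISCOVERY_PORT = 26037  # port as quoted in this thread; unverified


def unicast_targets(cidr: str) -> List[str]:
    """Every usable host address in the subnet: one send per address."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [str(host) for host in network.hosts()]


def broadcast_target(cidr: str) -> str:
    """The single address that reaches every host on the subnet."""
    return str(ipaddress.ip_network(cidr, strict=False).broadcast_address)


def send_discovery(cidr: str, payload: bytes) -> None:
    """Send one UDP datagram to the subnet's broadcast address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # SO_BROADCAST must be enabled before sending to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, (broadcast_target(cidr), DISCOVERY_PORT))


if __name__ == "__main__":
    print(len(unicast_targets("192.168.1.0/24")), "unicast sends")
    print("or a single send to", broadcast_target("192.168.1.0/24"))
```

On a /24 that is 254 separate sends versus one. Either way, every listener on that port still has to cope with a payload it does not recognise.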


Why? All that assumes a home network is set up correctly, which is almost never the case. One huge use case would be all the shitty SOHO routers that don't/can't bridge wired and wireless networks so broadcast and high level discovery/advertising are completely useless between wired and wireless segments. In some cases useless between cells in a mesh, even with 'enterprise' gear this is not as reliable as it should be.

Nintendo made an informed decision and their hardware is welcome on my networks.


> all the shitty SOHO routers that don't/can't bridge wired and wireless networks so broadcast and high level discovery/advertising are completely useless

If you assume basic _routing_ doesn't work, why do you assume a TCP connection from one to another would work? If an AP implements client isolation (as many home models do now for "guest" networks) broadcast would be the only thing that works.

Pretty much every printer you buy now uses Bonjour for printing, so your average SOHO router is going to make sure that works.

I can't fault them, at one point I thought this was the right way to do autodiscovery on a LAN too. Then I learned from my mistakes.


Being pedantic, because there's all sorts of reasons a TCP connection works in the face of buggered routing, not the least of which is that the endpoints are on the same L3 network. Yeah, guest networks are a thing, but a postit note with the WLAN PW is an even bigger thing if most of the home networks I'm familiar with are any indication.

Bottom line is that nothing on an IP network should do something st00pid if it gets a random weird packet from a random IP, because the whole history of IP shows that you will, eventually, get one.


Have you considered what kind of networks a Nintendo device is expecting to interact with?


it probably sends a packet to the broadcast address. I mean "sends a network discovery packet to each device on port" probably is a fucking broadcast.


Especially considering malicious actors don't play by network "cleanliness" rules either. Being able to handle malformed packets is essential to the security of pretty much anything.


Allowing videogames to make network calls using arbitrary port numbers seems like poor engineering by Nintendo; videogames should have to use an API to do network calls regardless of what port the console uses to connect to such networks (internet or other network).


Peer to peer protocols use arbitrary ports in almost every implementation.

Edit to add: and Pokémon games have used peer to peer protocols before on different systems, so this use isn’t new to Pokémon or Nintendo systems.


Says who?


Nobody you consider important.


Well, yeah...anyone who doesn't bother or isn't capable of explaining why they make absolutist pronouncements as to how other people should do things isn't particularly important to anyone.


Pokemon is using a local WiFi feature I didn’t know existed until I started playing. The WiFi symbol on the Switch has an L on it, meaning it’s searching for users connected to WiFi but it’s not connected to the internet.

I might be able to sniff the packets later today.

To me it looks like a Roku issue. A device shouldn’t go into a reboot loop if it encounters something unexpected. I also heard the Switch uses its own version of Bluetooth to add 8 players but since it's nonstandard they won’t let you connect any BT devices to it.


What about going into a reboot loop if you see a signed request to reboot? [0] It's unverified but this story makes sense in a cosmic coincidence sort of way.

[0] https://www.reddit.com/r/ProgrammerHumor/comments/dy0p86/how...


Why sign the reboot code if you're always sending the same code? Like, the signing is useless. The signed code is now the new reboot code and it's effectively unsigned.


True, but omitting any challenge-response means the master node can just push the info out without having to communicate with any of the child nodes. It's true it's basically just a reboot code, but a cryptographically random number sent to a randomly chosen port is honestly a fairly safe choice. Roku just got cosmically unlucky, if the theory is correct.


I agree with your point, but there's just no way that's true. A bare-minimum MD5 signature is 16 bytes (obviously not secure, but this isn't safe from replay attacks anyway), with a more acceptable SHA256 being 32 bytes. Any type of signature should be sufficiently random that Nintendo is never going to accidentally match it, so the odds of matching even a random 64-bit integer are already about 18 quintillion to one, and just for MD5 we're talking 18 quintillion times 18 quintillion - that's beyond hopeless. It's the same thing as calling heads or tails correctly either 128 or 256 times in a row.

If I had to guess it's probably something silly like not actually checking the signature for validity, or (more likely, IMO) incorrectly checking the length of the packet and getting a buffer overflow/underflow that eventually crashes the Roku.
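
The arithmetic behind those odds is easy to reproduce. A quick sketch, assuming the bytes to be matched are uniformly random and the other side sends one fixed, unrelated value (the helper name is made up for illustration):

```python
from fractions import Fraction


def chance_of_exact_match(num_bits: int) -> Fraction:
    """Probability that one fixed value equals a uniformly random num_bits value."""
    return Fraction(1, 2 ** num_bits)


if __name__ == "__main__":
    for bits in (64, 128, 256):
        # Format the denominator in scientific notation for readability.
        print(f"{bits:>3} bits: 1 in {2 ** bits:.3e}")
    # Matching 128 bits is as likely as matching 64 bits twice in a row.
    print(chance_of_exact_match(64) ** 2 == chance_of_exact_match(128))
```

2^64 is about 1.8 x 10^19, so a 128-bit match is that number squared, and a 256-bit match is the 128-bit figure squared again.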


Maybe. Could also be they did include some half-hearted validation like the message includes "reboot after" with some long or variable validity period. That would increase the number of possible valid codes.

Also I get the huge unlikeliness of this happening but massively unlikely things do occasionally happen.


> Also I get the huge unlikeliness of this happening but massively unlikely things do occasionally happen.

This is less likely than two people generating the same random GUID. For SHA256, it's the same as generating two GUIDs in the same message and having them both be identical.


More like "Roku Devices Are Letting Themselves Crash When They Receive Packets They Don't Understand"


According to a reddit post it's actually that the Roku understands perfectly but the packet happens to exactly match the request for a reboot that Roku uses for managing P2P updates:

https://www.reddit.com/r/ProgrammerHumor/comments/dy0p86/how...


Roku has released an update to hopefully fix this.

https://www.theverge.com/2019/11/18/20970743/roku-pokemon-sw...


And this, boys and girls, is why they don't let us run vuln scans on production during work hours (or unplanned). I've worked at at least two companies where a scan caused some important device to crash and have business impact because the scan packets were unusual. They will fire you if you mistakenly run the scan against a prod asset.


I understand that it should be structured to minimize risk, but if your prod stack can be taken down by a scan... you really do want to find that out in a controlled manner and not because a script kiddie decided to do the same test.


You don’t know what you don’t know. There was a UNIX bug that caused the system to crash or execute code just by being fingered.[1]

> Connect to your fingerd daemon and type more than 528 (= 512 + 16) characters (any will do). If your daemon crashes or terminates the connection with no data sent back, you probably have the vulnerability.

[1] http://seclab.cs.ucdavis.edu/projects/vulnerabilities/doves/...


Isn't this an attack vector? Why would a device even respond to wireless signals in such a way that it reboots? So now you can send annoying packets instead of jamming the network.


I imagine that was not the exact intent. Perhaps it was designed to respond to wireless signals, but something triggered a fault.




I'm not clear if this happens only if the Switch is on the same network as the Roku, or if it can affect Rokus in range even on different networks. The first seems most plausible, but the article seems to imply the second in a couple of places.


The Switch is capable of communicating both over an existing network through a wired or wireless router, and through wireless peer-to-peer communications with other nearby Switches. Previous Pokemon games (and other games on the Nintendo 3DS) have used local wireless as a constant background feature in this manner, but to my knowledge, most Switch games that use local wireless only do it at specific times (when playing multiplayer) instead of always-on in the background, so it doesn't seem unreasonable that only the Pokemon game causes this problem, or that it would be heard by nearby devices on other networks. Other games would break it only when someone is actually playing local multi-system multiplayer Smash Bros or something, which is going to be rarer and you're probably less likely to be trying to use the Roku at the same time.

The only thing I would find implausible about breaking Rokus by proximity would be why the Rokus are picking up the communication from a Switch, when presumably they didn't break whenever someone used a 3DS within ten yards (or we'd have heard these complaints years ago). But that could easily be down to changes in protocol by Nintendo between the two systems, such that one is ignored and the other is mistaken for relevant data.

I don't have a Roku or an easy way to inspect nearby wifi packets, so I can't easily test this theory.


Sounds like a pesky Rotom to me ;)


Unrelated: Sword and Shield still have the Rotomdex, except it's a smartphone rather than a 3DS. In French, Rotom is called Motisma; the Rotomdex is called a Motismart (as in, smartphone). Pretty neat, I thought.


Pokémon localisation for European languages (and English too) has always been full of puns and jokes like that; they really put a lot of care into them.


This has to be one of the most random issues of all time.


At One Laptop Per Child we found an issue where the mesh networking wifi packets crashed (actually caused a packet storm) Linksys routers, so just bringing an OLPC laptop somewhere would DoS wifi there. This kind of bug where one device's packets cause another device that it was never tested against to fail is not that uncommon.

Of course, it's funny that a specific game is involved here. Perhaps there are other Switch games that do it too?


When the first-gen iPhone was released, they could break certain wireless networks built on Cisco hardware.

https://hardware.slashdot.org/story/07/07/21/1212217/duke-wi...


TBH legendary issues like mails failing over 500 miles [0] or the recent Reddit post about an MRI disabling all iPhones in the hospital [1] take the cake

[0] http://web.mit.edu/jemorris/humor/500-miles [1] https://www.reddit.com/r/sysadmin/comments/9mk2o7/mri_disabl...


Your first story, in turn, reminds me of this: https://www.reddit.com/r/talesfromtechsupport/comments/420oa...


Here's a good list of debugging stories: https://github.com/danluu/debugging-stories


I've seen similar. Very curious about the fix and post-mortem though


This device complies with part 15 of the FCC Rules. Operation is subject to the following two conditions: (1) This device may not cause harmful interference, and (2) this device must accept any interference received, including interference that may cause undesired operation.


That's not what the FCC means by interference: https://en.wikipedia.org/wiki/Title_47_CFR_Part_15


Yes, this is not what FCC means by interference, and no Part 15 regulation is violated here. But there's no need to downvote. If we review the terms of Part 15,

> (1) This device may not cause harmful interference

> (2) this device must accept any interference received, including interference that may cause undesired operation.

You'll see it's actually a general, universal principle of good communication, and applies to many cases other than RF spectrum. Internet pioneer Jon Postel once said TCP implementations should follow a general principle of robustness:

> (1) Be conservative in what you send

> (2) Be liberal in what you accept.

It's almost identical to the principles of Part 15.

A few years ago, when the Mirai botnet launched its massive DDoS attack, a HN user used Part 15 as an analogy of the future directions of IoT's security model. And in this example of protocol conflict between Pokemon and Roku, it also applies.


So, is the Switch (in certain circumstances caused primarily/only by this game) breaking 1, or is the Roku breaking 2? Neither seems to have broken either until this particular interaction.


Neither. The FCC literally means radio interference, not random protocol clashes [0] like what seems to be happening here. That rule basically means 'don't pollute the radio spectrum' and 'you have to accept the spectrum you receive'.

[0] https://www.reddit.com/r/ProgrammerHumor/comments/dy0p86/how...


I bet there was someone on the Pokemon team that wanted to exercise esoteric, obscure networking features just because they read about it in some spec and wanted to try it out.

It's like the person on your team that insists on shoehorning language features into the application from the dustiest corners of the language instead of sticking to the tried-and-true idioms.


Using a broadcast address for service discovery is a perfectly normal use case, the Roku should be able to handle it.


These kinds of features have been in Pokemon games for over a decade. All the DS Pokemon games search for other local Nintendo DSes.


And the later ones (and all the 3DS ones) have the option to do that perpetually in the background, which would probably be why this issue comes up here instead of on other local multisystem Switch games that only do it at specific times.


They've always been pretty conservative as far as game design goes; somehow I doubt they'd be pushing any boundaries on the backend.


> They've always been pretty conservative as far as game design goes

Uh, considering that Nintendo became popular by breaking the arcade game mold with Donkey Kong, and continued that trend with Super Mario Bros, The Legend of Zelda, Pokemon, etc. then went on to create things like Wii Music, not to mention the Wiimotes, building a tablet into the WiiU, Amiibos, Labo...

Conservative isn't the word I'd associate Nintendo with.


Specifically on Pokemon (so Game Freak), not everything Nintendo's done. Grandparent specifically said "the Pokemon team"


Announcement using datagram broadcast is not esoteric or obscure. It's how mDNS and a multitude of LAN games work. Likely reason that the Roku reboots is that it accepts the packet and fails gracelessly to parse it instead of just discarding it and carrying on. That's what not sticking to the tried-and-true idioms is like.
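
"Discarding it and carrying on" might look something like the sketch below: validate length and header before touching the payload, and return nothing for anything that does not fit. The 4-byte marker, the header layout, the size limit and the function name are all invented for illustration and have nothing to do with Roku's or Nintendo's real formats.

```python
import struct
from typing import Optional, Tuple

MAGIC = b"HYPO"                  # hypothetical 4-byte protocol marker
HEADER = struct.Struct("!4sBH")  # magic, opcode, payload length (7 bytes)
MAX_PAYLOAD = 512                # hypothetical upper bound on payload size


def parse_datagram(data: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (opcode, payload) for a well-formed datagram, else None."""
    if len(data) < HEADER.size:
        return None  # too short to contain a header
    magic, opcode, length = HEADER.unpack_from(data)
    if magic != MAGIC:
        return None  # someone else's protocol: not an error, just ignore it
    if length > MAX_PAYLOAD or len(data) != HEADER.size + length:
        return None  # declared length disagrees with what actually arrived
    return opcode, data[HEADER.size:]


if __name__ == "__main__":
    good = HEADER.pack(MAGIC, 1, 3) + b"abc"
    print(parse_datagram(good))
    print(parse_datagram(b"\x00" * 40))  # unrelated traffic on the same port
```

The receive loop would then simply skip any datagram for which this returns None, so stray traffic on the same port costs a few comparisons instead of a crash.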


Why do you bet that?



