Had the same problem with Raspi 4B. Problem was dependent on screen resolution (!!). With 1920x1080, wlan0 became disconnected after it was ok at lower screen resolutions. Was in 2.4GHz band. After turning on 5GHz in the router and going into the network preferences (right click the network icon top right on screen) and SSID ... and checking "automatically configure options" the connection remains stable (so far :D ).
This is a good reminder that every digital circuit is really a fancy analog circuit. Your program may be perfectly correct, but when it runs on faulty hardware, you are hosed no matter how many times you’ve formally proven it.
Reminds me of the MacBook Air that I have (2018 model).
When I connect my USB 3 hub to it, I lose my WiFi :(
I Googled it when it happened a while back, and apparently other people have this problem with the MacBook Air too.
Some choose to shield the USB 3 cable with tinfoil. Personally I opted to connect a USB 3 Ethernet interface to the hub and use wired Ethernet when I use the hub.
In the very distant past (late 1980s, early 90s), Olivetti changed their PC keyboard design; it went from having the keyboard PCB assembly inside a metal clamshell to a bare board with a metallised plastic sheet on the back of the PCB (they 'cheaped out').
We got to know about the design change when a rather large and sprawling local leisure centre reported that often when someone used their walkie talkie near a PC, the screen filled with random characters and sometimes the dot matrix printers would 'go haywire' (CTRL-P = Print what's displayed on the screen).
In effect, the rf signal from the walkie talkies was 'mashing the keys'.
The immediate fix was to swap in some older keyboards, the longer fix was down to Olivetti using better shielding and some appropriately-placed capacitor decoupling on the power and signal lines.
The first time I saw a room full of non-CRT computer monitors was after Merrill Lynch moved to a new building to save money. What they discovered is that the CRTs would weird out every time the subway train went by. All the cash saved by moving was lost buying the stupidly expensive flat-panel monitors.
I wouldn't have considered trains to be a source of EM interference, but in retrospect it makes sense. One thing that's amazed me is that military jets and warships cause some nearby consumer electronics to go wonky, but as they literally run systems that are trying to jam radio waves, it makes a ton of sense that e.g. garage doors have trouble keeping up.
Doesn't have to be. If I TX with 5W near my new Latitude laptop, the cursor goes all over the place.
I've heard of electricians using uhf radios tripping GFI breakers.
I guess the normal power levels in ham/professional radios are much higher than most other equipment is designed to withstand.
Similar problem with my 2014 MBP: I lose wifi if I plug it into my UHD monitor with a DisplayPort cable, but not if I use HDMI. Probably has to do with cable quality (bought it on Amazon).
I have a thunderbolt to 4 port usb 3 hub that when connected to a 2019 MacBook Pro interferes with the mouse pointer (it sticks and the pointer goes large every few seconds). Could this be related?
Yeah, we have crap power where I work and have cut down a lot of weird, random issues by buying a decent UPS for each worker machine. Even the people with laptops.
5GHz wifi is a lot more stable. Mine used to disconnect with the microwave running. Also, your hub might have leaky cables. I had an issue like this with bad HDMI cables too.
Microwave ovens and wifi use the same frequency range for the same reason: 2.4 GHz is available as unlicensed spectrum. In theory someone could build a 5 GHz microwave oven, but it wouldn't be cost-effective for a consumer appliance.
I once heard something about that frequency working better with water molecules, but after looking it up I think it's a myth.
A 5 GHz microwave would be more efficient but you don't necessarily want more efficient for cooking - that would just heat the outside of the food fastest. Commercial microwaves apparently run somewhere in the 900 MHz band. There are some neat graphs here:
Yes, although Eben Upton suggests trying a better grade of HDMI cable... the higher frequency of the higher-resolution mode will radiate enough noise out of a poorly shielded cable to interfere with Wi-Fi in the 2.4GHz range.
Don't use a cheap cable, use one rated for 4k, and see if that helps.
I would really like to hear what the root cause of this is.
This makes me think of the bug that the QCA AR9331 SoC has. The AR9331 is extremely common in small travel routers, but it has a fun bug where one of its clock sources is shared between the 802.11 wifi and the USB port. If the USB port is negotiated at USB 1.x speeds AND the 802.11 radio is scanning, the USB will freak out and die. This generally requires the 802.11 radio to be in client mode rather than AP mode. You can read some details about this on the old OpenWRT forum, if it survived the great forum purge of 2018.
Unfortunately not only with cheap external USB 3.x drives and not only with old MacBook Airs. Connecting an Anker USB-C dock/hub to my MacBook Pro 2016 and later 2018 would consistently cause interference with the Microsoft Sculpt Ergonomic and Apple Magic Trackpad (both on 2.4GHz). Then I got an Aukey hub (since my wife was happy with one) and the problems have vanished.
> the noise from USB 3.0 data spectrum can be high (in the 2.4–2.5 GHz range). This noise can radiate from the USB 3.0 connector on a PC platform, the USB 3.0 connector on the peripheral device or the USB 3.0 cable. If the antenna of a wireless device operating in this band is placed close to any of the above USB 3.0 radiation channels, it can pick up the broadband noise. The broadband noise emitted from a USB 3.0 device can affect the SNR and limit the sensitivity of any wireless receiver whose antenna is physically located close to the USB 3.0 device. This may result in a drop in throughput on the wireless link.
The money quote:
> With the HDD connected, the noise floor in the 2.4 GHz band is raised by nearly 20 dB. This could impact wireless device sensitivity significantly.
Besides having properly shielded devices and cables (which manufacturers often don't bother doing), they also recommend that the plug in the laptop be fully shielded or enclosed in a metal chassis (which is fulfilled by having an entirely metal case).
I don't know of a cheap RF analyzer but I'd like to get one at some point. I'm curious how many common devices actually adhere to the FCC regulations and/or standards like USB 3, compared to how many are just cheaply made.
> I don't know of a cheap RF analyzer but I'd like to get one at some point.
That’s one of the features I love about my Aruba APs/controller. Not only do the APs dedicated as AirMonitors/SpectrumMonitors watch both the Wi-Fi level and the raw radio spectrum and shift frequencies as needed, but I can also get a high-quality live visualization of radio spectrum interference. Definitely not cheap though!
Aw, I was hoping the explanation would be in the thread. Is it high-frequency noise from an unshielded clock? Surpassing a current or temperature limit due to the stress of the high resolution and resulting high memory bandwidth? Time will tell!
> Is it high-frequency noise from an unshielded clock?
I bet that's it.
2560x1440 @ 60 Hz with CVT-RB timings has a pixel clock of 241.5 MHz. The TMDS bit rate is 10x the pixel clock, and 2415 MHz is right in the lower end of the 802.11 band.
If the Pi can be convinced to use CVT blanking, that'll raise the pixel clock to 312 MHz, which should be fine.
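For anyone wanting to check other modes, here's a quick Python sketch of that arithmetic (the pixel clocks are the CVT-RB/CVT figures above, not measured on a Pi; the band edges are the usual 2.4 GHz ISM limits):

    # Does a mode's TMDS bit rate (10x the pixel clock) land in the 2.4 GHz band?
    ISM_MHZ = (2400.0, 2500.0)

    def tmds_mhz(pixel_clock_mhz):
        # TMDS encodes 10 bits per pixel clock on each data lane
        return pixel_clock_mhz * 10

    for name, pclk in [("2560x1440@60 CVT-RB", 241.5), ("2560x1440@60 CVT", 312.0)]:
        rate = tmds_mhz(pclk)
        in_band = ISM_MHZ[0] <= rate <= ISM_MHZ[1]
        print(f"{name}: {rate:.0f} MHz -> {'IN BAND' if in_band else 'clear'}")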
This is the answer. That mode puts the second harmonic going out of the HDMI cable smack in the 2.4GHz WiFi band (the fundamental would be at 1/2 the bit rate).
Now the question is who screwed up. Is it a leaky cable? Is it bad PCB design? Is it a problem internal to the SoC? Is it a power delivery issue? Knowing the RPi foundation and Broadcom, I bet one of them screwed this up for all RPis and it isn't just a bad cable.
I thought something similar: Pixel clock or some harmonic of it feeding back into the wifi chip via the power lines (image search says the chip is in a metal box, so should not be via over-the-air-RF). Thanks for doing the math.
Forcing a different pixel clock is probably the easiest fix, and since the ports claim to do 4kp60, that should be possible (if the display supports it).
Not the numbers the calculator I found online came up with, but I had the same thought. I'm thinking that changing the refresh rate could solve the problem.
Former communications semiconductor FAE here. We would troubleshoot issues like this all day, every day, for years, usually under NDA before the product was ever released to production. The solutions are routinely as weird as some of the "voodoo" hypotheses tossed around here - wait for it and you'll see. After a while it seems normal that all unverified combinations are broken, and the moments of delight are when an unverified configuration completely works.
Why do all the proposed avenues of future investigation, and all of the current comments on this thread, focus on voodoo instead of the far more likely explanation that the display driver is just stomping on the memory of the network interface? If there's software anywhere in a system, 99% of the time that's the problem.
This is not true when radios are involved. In my experience, wireless connectivity issues are rarely caused by software; the problem is much more often caused by interference.
The interference can be internal interference in the device, or interference from other wireless devices. In many cases, the culprits are even devices that shouldn't emit RF at all, like power supplies, switches, light bulbs...
Another common issue is poor antenna design (e.g., attenuation when you hold the device, or strong directionality of an antenna that should not be directional).
And, last but not least, physical obstacles. Most people understand that concrete walls with rebar will block signal, but a surprisingly large number of people try to use aluminum stands or cases for devices with wireless radios.
All those factors will cause connection issues, and they are really common because debugging them is so hard (who has a spectrum analyzer at home? How do you find out which one of dozens of electronic devices is emitting RF that it shouldn't?)
In addition, the linked forum thread includes a user describing how high resolutions break 2.4GHz networks for them, but 5GHz networks work fine. The display driver is stomping on memory responsible for 2.4GHz, but not 5GHz? I'm really not seeing that as the more likely problem here.
5GHz WiFi has more bandwidth than 2.4GHz, so it typically involves larger IO buffers in the driver, which could easily be enough to expose a memory scribbler (I imagine a bunch of other features are enabled/disabled by the frequency band switch too). However, I think asdfasgasdgasdg's answer is the correct reason not to suspect a memory scribbler - i.e. a memory scribbler would cause the driver to crash/fail, and the kernel would log a message.
Remember the Pi has an odd architecture and all the IO passes through the GPU. The GPU doesn't log human-readable messages anywhere. There's a good chance the GPU did log a crash or failure, but only Broadcom's engineers can see it.
It's a BCM2711, and the datasheet is NDA only - typical Broadcom!
The VideoCore (Broadcom's GPU) is the main processor on the thing, and the cluster of ARM cores that runs Linux is more of a coprocessor which can only see some of RAM.
> 5GHz WiFi has more bandwidth than 2.4GHz, so typically will involve larger IO buffers in the driver, which could easily be enough to expose a memory scribbler
He's saying 5 GHz will expose the scribbler, and the opposite is happening, only 2.4 GHz fails.
@StavrosK Thanks for wading in in my defence, but I had actually misunderstood the situation :-)
Although, if my theory that the IO buffers are different sizes is true, then that could perturb memory layout enough to expose/hide the bug in either direction.
So the display driver is meant to be mutating memory also owned by the network controller, but not in a way that causes a crash, log messages, or a kernel panic? That doesn't seem so likely to me. I mean it's not impossible but it's rare to see memory corruption/interference cause a clean breakage like this. In my experience it usually causes things to become extremely funky for a short while, then a crash.
Every SoC I've dealt with containing a WiFi core has a dedicated coprocessor (RPU is a common name, depending on vendor) running its own firmware. So more likely, _that_ core would go funky, then crash. The kernel might have code to recover that, but I doubt it, and it certainly would complain the whole way as you say.
In the Pi, the coprocessor is the GPU, and it is the first to initialize on boot and runs all the firmware-like stuff and handles all IO and does memory allocations/mappings.
Because what if it's not? My first thought is that the HDMI is radiating and interfering with the wifi antenna.
As an embedded engineer, it was a hard lesson for me to learn that not all issues are software issues and the hardware may need to be investigated.
This is especially true where there is different behaviour between units. You can't just assume that your 99% estimation (plucked out of thin air) is correct and discredit other potential explanations.
Then, after you're done ruling out the most likely and easiest explanations to test, you can start exploring the remaining possibilities. Skipping to the more exotic explanations sounds more interesting, but it's a poor use of time if there's still low-hanging fruit out there.
Improper shielding is an assumption with no evidence as yet. I also mentioned that the ease of verifying the explanation should be a factor. Changing software is usually very easy.
It's so common that it's not an unlikely starting point. EMC is a major issue in high frequency electronics design and the raspberry pi had a history of having to redesign certain parts because of not having enough shielding.
Absolutely, and this was before the Pi had built-in Wifi. The norms you have to comply for are immediately a lot stricter as your device falls into a different category (telecommunications devices).
Wrapping tinfoil around an HDMI plug/cable isn't particularly hard either :) Chips are harder, but at least you rule out the cable. HDMI cables are ridiculously finicky if you've ever tried to get anything more than the lowest-common-denominator 1080p going over them.
I don't agree that wrapping foil is a great way to 100% rule that out as there is room for error. Using different cables/dongles would be better and they already tried that.
There are several small-scale WiFi chips that share a clock source with USB - it would be unsurprising to find that the WiFi and video interface share the same clock, so drawing too much from either could directly affect the other.
These kinds of problems are common in embedded computers, like the Pi. Just as common as software.
I don't know much about the Raspberry Pi, but it looks like they chose an ARM core variant without IOMMU, so this might actually be plausible, even though it's such a computer architecture anti-pattern to share system memory DMA across devices.
Can you list which ARM cores you know of that include an IOMMU? I’m personally unaware of any, as that is typically bundled as a separate IP package that must be integrated separately into the system, and is usually customized based on the number of supported masters that require virtualization.
E.g. the Xilinx ZynqMP includes the same Cortex-A53 complex the Raspberry Pi 3 has. They also included CCI-400 coherent interconnect switch to it, and also included the SMMU-500 IOMMU that partially interfaces with the A53 interconnect, but is effectively independently programmed and also controls access to DDR3/4 from the SATA, Displayport and PCIe controllers.
Per the original topic, have they released a full datasheet/reference manual for the Pi 4 SoC yet? I’ve yet to see one, other than a VERY high-level overview of its new pieces.
Huh, so that's why the iPhone 6s's SecureROM memory regions weren't MMU-locked... IOMMU doesn't come in ARM by default! So you have to wire it up yourself (in your own IP blocks), and then hook it up in software everywhere you want it to work.
And all that costs extra developer time - and money.
"kernel module" together with "should absolutely not be able to interact with each other" are an impossible requirement with Linux.
I think the other operating systems available for the Pi are roughly in the same boat (Windows & RiscOS). There was a nascent Minix port at some point, I wonder if it was abandoned.
Maybe the misbehaving driver is writing past the end of its requested space though, inadvertently? (I don't know if this is always called a "heap overflow" or if that's just Clang AddressSanitizer.)
That resulted in a wide variety of different failures, from the kernel oopsing to various userspace components crashing. It would be very unusual to have unexpected DMA trigger such a specific failure.
Mostly because of a known history over the past couple years of USB, WiFi, and/or HDMI causing direct interference with each other. See lots of other comments upthread about similar RF issues people have had, stretching all the way back to 486 laptop keyboards :)
EMI is a headache I deal with daily, on far more sensitive receivers, so voodoo is likely. Though just moving the unit next to the AP (increasing RX signal strength) is an easy diagnosis.
It certainly sounds more like a software issue than some arcane effect from RF interference or the like. Could be memory getting smashed, a bus getting saturated, an interrupt not getting serviced, or any similar thing.
Meh. I've done low-level embedded/mobile for a long time now. This actually sounds like a totally reasonable RF interference issue. 2.4GHz is funky and has desense issues with lots of internal buses (not a HW engineer, so not sure why that band specifically). Also, radios typically have to accept interference, which means the radio would "stop working" rather than causing the display to work weirdly (ironically a much easier failure mode to spot/diagnose/notice).
When the late-2016 MacBook Pro came out with only USB-C, I had to buy a USB dongle from Amazon (the one included didn't have enough ports). If I booted the MacBook into Windows with the dongle connected, the wifi would stop working (the 2.4GHz one) and the 5GHz would work.
Duly noted! I've been out of the embedded space for a long time (I think the last board I worked with was i386EX based) but I'm getting back into it now with an ESP32 so this might actually come in handy. Thanks! :)
That was my first thought as well, but scanning out that resolution should use less than 1 GB/s of memory bandwidth which is nowhere close to the DRAM speed. And usually in that situation you get horizontal speckles in the video output.
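Back-of-the-envelope, assuming a 32-bit framebuffer and no compression (my numbers, not measured on a Pi):

    # Scanout bandwidth for 2560x1440 @ 60 Hz at 4 bytes per pixel
    width, height, bytes_per_px, hz = 2560, 1440, 4, 60
    gb_per_s = width * height * bytes_per_px * hz / 1e9
    print(f"{gb_per_s:.2f} GB/s")  # ~0.88 GB/s - well under the Pi 4's DRAM throughput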
I know Pis soft-require more power than the supplies most people give them, and they do things like CPU throttling when undersupplied. I think I've also read that WiFi and/or Bluetooth may stop working when underpowered. So it may not be a balancing thing so much as graceful degradation.
I don't know why it became such a piece of breaking news.
When a current flows through a p-n junction, photons are emitted (and an LED is just a diode that happens to emit photons at the wavelengths of visible light). And it works both ways: if you hit a p-n junction with photons, you produce a current. Not only LEDs - any diode will do that; they're all potential photodiodes, it's just that some are more sensitive than others.
You can cause a lot of chips to reset by shining a bright beam of light at the exposed die - a common way to test chips.
It's also one reason (in addition to cost) that most diodes are sealed in plastic packages, not glass. Fun experiment: buy some 1N4148 small-signal diodes in a glass package, connect one to a Darlington-pair transistor amplifier, and shine a flashlight at it - you'll see some funny things on the oscilloscope.
I think the newsworthy part of all that wasn't the physics involved, but more the design-choices on the Rpi 2.
Putting a light-sensitive chip (I think it was a wafer-scale package with no casing) on a board that's intended for use outside of an enclosure was a really big oversight.
> I don't know why it became such a piece of breaking news.
Popular buzzword (Raspberry Pi), surprising unexpected outcome (most consumer electronics people are familiar with don’t react to light), manufactured outrage/schadenfreude (look how they screwed this up!)
> Popular buzzword + Surprising unexpected outcome
Good analysis. In both cases, I see the popular press reporting them in personified language: "Xenon Death Flash, or Why the Raspberry Pi 2 is Camera Shy", and "Why the new iPhone is Allergic to Helium". If we replace "Raspberry Pi 2" with "semiconductor p-n junctions", and replace "new iPhone" with "MEMS oscillators", it probably won't be news anymore.
> manufactured outrage/schadenfreude
An interesting case as well. I see both incidents as undesirable side effects that are better prevented, but I don't think they are major design flaws.
Your version doesn't just remove any personification, it gets rid of the entire concept that these are exposed parts on consumer devices causing them to fail.
One of my professors told a story of reporters crashing one of AT&T's new digital exchanges because their camera flashes erased some of the EPROMS used in the system.
The whole thing, actually, not just the gyro. The clock is nanomechanical and helium breaks it, apparently. The newer models have a version that fixed that particular problem. It was in the fine print in the manual somewhere.
While it might seem a bit overkill, try restarting your router's WiFi to see if it magically works. I had a war with a Pi Zero W not that long ago... turns out the 2.4GHz band wireless would just die sometimes. Turned out to be an issue with the router.
It was ultimately the router. Specifically, with NAT and the 2.4GHz band; 5GHz kept going just fine. After looking at the manufacturer's website, the solution was to either buy a new router or disable port mapping.
Monitors creating harmonic interference with wifi is a very common problem, and most computers don't give any warning of this. I don't see why it wouldn't be possible to say "the wifi won't connect, it may be because of your refresh rate".
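As a sketch of what such a warning could look like - assuming the OS can query the active mode's pixel clock (the function and check here are hypothetical, not any existing API):

    # Hypothetical diagnostic: warn when a mode's TMDS bit rate (which is also
    # the second harmonic of the signal's fundamental) falls in the 2.4 GHz band.
    def refresh_rate_wifi_warning(pixel_clock_mhz):
        tmds = pixel_clock_mhz * 10
        if 2400 <= tmds <= 2500:
            return (f"The wifi won't connect: this video mode radiates near "
                    f"{tmds:.0f} MHz, inside the 2.4 GHz band. Try another "
                    f"resolution/refresh rate, or a 5 GHz network.")
        return None

    print(refresh_rate_wifi_warning(241.5))  # the problematic 2560x1440@60 mode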
The plane can't take off because the carpet is the wrong kind of orange.
ObligAnecdote:
I once had a keyboard that wouldn't work when the monitor was outputting at 75 Hz. Had to be 60 Hz or else nothing. The joys of wireless keyboards.
I have this issue with a Zero W. Wlan0 would just disappear. I tried jessie, stretch, and now buster. It had been connected to an ultra-wide monitor (but not at 2560 width, obviously).
I can't get it to fail now, but it is not connected to a monitor anymore.
> It is a very slow computer for “normal computing purposes” though.
Can you elaborate on what you're doing that is slow? In my experience, CPU-wise it's plenty fast, matching a typical desktop from 2008. Although memory bandwidth could be better...
Please don't use code blocks to quote text, they are impossible to read on mobile.
> Had the same problem with Raspi 4B. Problem was dependent on screen resolution (!!). With 1920x1080, wlan0 became disconnected after it was ok at lower screen resolutions. Was in 2.4GHz band. After turning on 5GHz in the router and going into the network preferences (right click the network icon top right on screen) and SSID ... and checking "automatically configure options" the connection remains stable (so far :D ).
Don’t worry about it too much. A large part of the problem is that HN comments don’t have a method of delineating quotes, so people default to this because it looks fine on desktop.
I wish the RPi was this standalone microserver that had its own flash memory (SD cards are known to fail) that you could plug into your home network and act as a personal server.
Maybe one day it will be possible to do some basic GSM data with a RPi.
If they didn't use SD cards, the storage would be more reliable, but users would spend a lot more time fixing bricked boards. By allowing removable storage (in a format that can be plugged into any other computer natively) they solved that problem, because I can just re-image the card and get going again.
The philosophy of the RPi is that they won't really add features unless the bulk of the user base would use them. For example, they were hesitant to even build the WiFi in, because users who wanted that could always get a USB chipset, and building it in adds BOM cost.
Because with GSM you'd also need a plan for it, I don't really predict they'd add that. Especially since you can get it in hat format already.
You're talking about a $40 tool just to flash the storage vs a part (micro SD adapter) that they give you with the SD card for free because it's so cheap.
Also doing a quick survey for the rated cycle counts on M.2 vs SD card slots:
M.2: I found this one [1] which is $0.768 for only 60 cycles
SD: This one [2] is $0.6256 for 5,000 cycles
I'm not sure why you'd say M.2 is more reliable, considering users often cycle storage dozens if not hundreds of times.
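Working that out per rated cycle (prices and cycle counts as quoted above):

    # Cost per rated mating cycle for the two connectors quoted above
    m2_per_cycle = 0.768 / 60      # ~$0.0128 per cycle
    sd_per_cycle = 0.6256 / 5000   # ~$0.000125 per cycle
    print(f"M.2 costs ~{m2_per_cycle / sd_per_cycle:.0f}x more per cycle")  # ~102x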