Linus is right. Don't use spinlocks in user-land. The scheduler can't know you're spinning on a lock, so it can't help you. Don't do dumb things -- spinning on a lock in user-land is dumb. Mutexes are as fast as spinlocks in the uncontended case, so use mutexes.
Linus is right (for the kinds of code Linus tends to think about). There are good reasons videogame engines tend to use spinlocks, and if the takeaway is "Linux pessimizes for that use case," the implication is "Linux pessimizes for high-performance games atop its kernel."
No, Linux pessimizes for that case if you do not cooperate with the scheduler.
You can implement spinlocks in userspace under specific circumstances. You will need to mark your threads as realtime threads[0] and have a fallback to futex if the fast path (spinning) doesn't work out. And even then you need to benchmark on multiple machines and with realistic workloads (not microbenchmarks) to see whether it's actually worth it for your application. It's complex; Linus goes into more detail here[1].
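To make that first step concrete, here is a minimal sketch of the "mark your threads as realtime" part on Linux (the function name and error handling are mine, and it only succeeds with CAP_SYS_NICE or an appropriate RLIMIT_RTPRIO grant):

    // Promote the current thread to SCHED_FIFO so ordinary CFS threads
    // can't preempt it while it holds the lock. If this fails, fall back
    // to a futex-based mutex instead of spinning.
    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>

    bool make_realtime(int priority /* 1..99 */) {
        sched_param sp{};
        sp.sched_priority = priority;
        int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
        if (err != 0) {
            std::fprintf(stderr, "SCHED_FIFO refused (%d); staying on CFS\n", err);
            return false;
        }
        return true;
    }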
If you have a fallback to futex, then what you have isn't a spinlock, but an adaptive mutex (which spins, then blocks). Lots of operating systems have those already and you don't need to build your own.
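For what it's worth, glibc already exposes one as a (non-portable) mutex type, so on Linux asking for it can be as simple as this sketch:

    // Request glibc's adaptive "spin briefly, then futex-wait" mutex.
    // PTHREAD_MUTEX_ADAPTIVE_NP is a non-portable glibc extension.
    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif
    #include <pthread.h>

    pthread_mutex_t lock;

    void init_adaptive_mutex() {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
        pthread_mutex_init(&lock, &attr);   // spins a bit, then sleeps in the kernel
        pthread_mutexattr_destroy(&attr);
    }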
If you really want to use a pure spinlock, you'll either have to accept worst-case behavior that's worse than a mutex's, or you'll have to jump through a lot of hoops (pinning threads, isolating cores, reassigning interrupts) to make sure you don't get descheduled. The scheduler won't know anything about priority inversions or even which threads are waiting; they all look runnable to it.
In Windows, a CRITICAL_SECTION can spin for a user-defined count before blocking, which is more or less what you want. Although it would probably make more sense for it to have a "spin until my quantum is over" option, because how is the caller supposed to know the right count?
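For reference, that spin count is something the caller has to pick explicitly at initialization time, roughly like this, where the 4000 is an arbitrary guess (which is exactly the problem):

    // Win32: a critical section that spins ~4000 iterations before
    // falling back to a kernel wait.
    #include <windows.h>

    CRITICAL_SECTION g_cs;

    void init_lock() {
        InitializeCriticalSectionAndSpinCount(&g_cs, 4000);
    }

    void do_work() {
        EnterCriticalSection(&g_cs);   // spins, then blocks on a kernel event
        // ... short critical section ...
        LeaveCriticalSection(&g_cs);
    }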
> There are good reasons videogame engines tend to use spinlocks
Care to elaborate upon this statement?
> "Linux pessimizes for high-performance games atop its kernel."
Games can run at quite a high performance (which feels like a really odd statement) atop the general purpose Linux kernel. Even with the windows-specific APIs being emulated with Wine/Proton.
There are many modern games (including Doom 2016, as I mentioned below) which run beautifully on Linux.
I wouldn't redo the work of people who have literally written books on the topic (https://books.google.com/books?id=1g1mDwAAQBAJ&pg=PA319&lpg=...). Copyright 2019, so even if spinlocks being the right tool is no longer conventional wisdom, the "best practices" in game education are still teaching them, and OS's may need to account for the code actually written, not the code they wish were written.
> OS's may need to account for the code actually written, not the code they wish were written.
I don't think this should be asserted in such a black-and-white way. Yes, platforms need to accommodate the programming styles of users. But accommodating them blindly, when better solutions exist, is how you get bad APIs that age poorly.
In platforms that are aiming for long-term stable APIs, it may make more sense in some cases to prioritize the cleanliness of the API over developer patterns, given that developer patterns are easier to change.
> OS's may need to account for the code actually written
I absolutely disagree with this idea, particularly as it relates to spinlocks. Spinlocks are such a low-level, inefficient method of acquiring a lock that using them as the de facto way to acquire one is a terrible idea.
Especially as more and more gaming occurs on devices with batteries. Any moderately sane OS will treat the resulting device heat increase and increased battery draw as a trigger to degrade CPU performance, hurting the game performance even more than using a sane lock acquisition method.
If gamedev books are teaching specific quirks of the Windows scheduler, those books should be updated, rather than making Linux emulate those quirks of Windows.
Anticipated response:
> "but that won't happen and gaming on linux will be too niche and will die."
To be quite frank, that sounds like a satisfactory outcome to me. If Google wants a different status quo, they should throw more money at the problem rather than telling game developers to moan about it. You don't hear game developers moaning about this sort of shit when porting their games to the FreeBSD-backed PS4, presumably because Sony, unlike Google, actually gave a damn and made sure the platform worked and made sure developers knew how it works. If the requisite work has already been done on FreeBSD, then maybe Google should have decided to use that instead. Either way, it's on them. They made this mess for themselves.
Definitely. It's not out of the realm of possibility (especially for Google) that the easiest way to ease the woes of Stadia developers is to fork Linux and swap out the scheduler algorithm for their special-purpose application. The only downside to that solution is that games ported to run on Stadia wouldn't be expected to perform particularly well on other Linux architectures.
No, there are not good reasons to use spinlocks in userland. Period. Even in a game. You always have the spinning while preempted problem. You can't blame a preempting operating system for preempting! Any situation you rig up where you don't preempt is very fragile.
You should always use waiting primitives when possible instead of just spinning. If you tell the kernel what you're doing, the kernel can help you. If you hide the wait DAG from the kernel, it can't.
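For concreteness, "telling the kernel" on Linux ultimately comes down to a futex call. Here is a toy lock in that style, loosely based on Drepper's "Futexes Are Tricky" (real implementations use a three-state variant so an uncontended unlock can skip the syscall):

    #include <atomic>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static std::atomic<int> state{0};   // 0 = unlocked, 1 = locked

    static long futex(void* addr, int op, int val) {
        return syscall(SYS_futex, addr, op, val, nullptr, nullptr, 0);
    }

    void lock() {
        while (state.exchange(1, std::memory_order_acquire) != 0) {
            // Sleep in the kernel until an unlocker wakes us; the scheduler
            // now knows this thread is blocked, not runnable.
            futex(&state, FUTEX_WAIT, 1);
        }
    }

    void unlock() {
        state.store(0, std::memory_order_release);
        futex(&state, FUTEX_WAKE, 1);   // wake one waiter, if any
    }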
If people are learning bad practice from the texts, and the bad practice hits Linux's scheduler asymmetrically hard (relative to other game platforms), one can expect outsized impact on the performance of games atop Linux architectures.
Those books are wrong and dangerous. There's no other way to describe the situation: they advocate a technique that's both counterproductive and detrimental to game and system performance.
This is where technology meets social network effects. Which is easier (modifying the scheduler for better spinlock support, or getting developers to stop using spinlocks when they're easy to implement and work on their other OS targets) is an open question.
"Don't use spinlock" is like "the user is holding it wrong" for API. The user can either choose to modify their behavior or throw up their hands and not port to Linux.
You cannot "fix" the kernel. Fundamentally, the owner of spinlock can get preempted for a large number of different reasons, and in such a case, the spinlock will spin until the owning thread gets scheduled and runs again. The kernel can't prioritize re-scheduling the owning thread because it doesn't know that some other thread is waiting on that thread.
Fundamentally, you need to tell the kernel what you're doing so it can help you. There's no fix here.
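For reference, this is the kind of naive userspace spinlock being discussed (a bare sketch, not anyone's actual engine code). Nothing in it gives the kernel any hint that the spinning thread is waiting on the lock owner:

    #include <atomic>

    class Spinlock {
        std::atomic<bool> locked{false};
    public:
        void lock() {
            // To the scheduler, a thread stuck here is just a busy, runnable
            // thread. If the owner was preempted, we can burn an entire
            // timeslice before the owner gets to run again.
            while (locked.exchange(true, std::memory_order_acquire)) {
            }
        }
        void unlock() { locked.store(false, std::memory_order_release); }
    };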
And given the over-arching topic of performance in Stadia, it may behoove Google to fork Linux and specialize the scheduler implementation to be more spinlock-friendly. Especially if Linux's benevolent dictator for life is telegraphing he doesn't consider it an interesting problem to solve.
No, it cannot be fixed, because the kernel simply does not have enough information in the case of naive spin locks to make good scheduling decisions. I understand the Chesterton's-fence argument in favor of learning why game developers use spin locks before telling them to do something else, but in this case, I really think simple ignorance is the most likely explanation.
You never want a spinlock in userspace. Period. An adaptive mutex will give you what you want without the pathological downsides.
Look: your whole line of reasoning here is just bizarre and invalid. You can't say "well, maybe there's a reason they're doing it!" without actually backing this claim up with specific reasoning. Sorry, but you can't keep using your spinlocks even if they "work" on other platforms. If they "work" there, it's an accident.
What if you've already used them in code that runs in N other architectures, the game works fine in those architectures, and Linux is architecture N+1, which you're trying to decide to support or not?
The less the code has to be changed, the easier it is to support architecture N+1.
> "Linux pessimizes for high-performance games atop its kernel."
Yes, because that's not the main focus of a general-purpose kernel. Which doesn't mean there aren't ways around it if your objective is running Linux for a specific application.
The real-time patches have been in the mainline kernel for a while now, but there are other options for softer RT in the kernel as well.
If you read Linus' post, he says as much immediately after. He says he includes himself in that "you", and notes that the kernel is the result of decades of development by many people working to fix the issues that pop up when implementing the hard stuff.
They're pretty amazing, it feels truly magical, but in my experience they're very fragile. Back in ~2014/2015 I was able to play Skyrim on my cheap, trashy laptop on Arch Linux (I don't remember the settings and fps, but the game ran "flawlessly" for me; I have to admit I'm not a gamer, so I can't tell the difference between 30 fps, 60 fps, etc.).
I tried the same in 2018 just for fun and I wasn't able to run Skyrim; there were always some minor issues that made the game unplayable, like the mouse not working, crashes on certain screens, etc. I essentially kept trying older versions of Wine, and eventually found some version that made Skyrim playable (still not as flawless as it felt in 2015, but playable). I don't know if it was my mistake, some Arch Linux lib weirdness, or something else, but I had to try dozens of Wine versions to find something that worked. This was fine for me, since I wasn't intending to play Skyrim, I was just trying to run it (for the "WOW" effect), but I imagine people who actually need Windows software might be frustrated by this.
FWIW, I've found that using the Steam launcher eliminated any compatibility issues which may have cropped up. I'm not a linux desktop power user, and didn't need to be.
> I'm not a gamer so I can't tell the difference between 30fps 60 fps etc
It's like driving a car with a rough idle vs. a smooth idle. You can get used to driving a car with a rough idle, and not particularly care. But once you've driven both, you can identify which is which pretty easily, and the rough idle starts to bug you when you know it can be smoother.
Everyone who has played more than one video game knows that some just feel better than others. In one, your character is fluid and nimble, and in another, slow and clunky.
This is the result of many things, but a large portion comes from input lag and frame rate. More advanced users aren't necessarily more perceptive, they're just better at pinpointing what they're reacting to. (And in some cases, conscious of how an experience could be improved.)
This is similar to the study of any type of art. Whereas a casual observer might look at a painting and say "meh, I don't like it", someone who has studied art pedagogically can describe precisely what isn't working for them.
To add, how you play and what you play matters too. 30fps on a TV with a controller in a racing game won't feel that bad compared to 30fps in CS:GO with a mouse - where 30fps will make the game a LOT harder to play.
If you have an iPhone a quick way to see the difference between 30fps and 60fps is to open the camera app and move the phone around in Photo mode, then switch to Slo-mo and try it again. You’ll notice that slo-mo feels way smoother. On an iPad Pro with the 120Hz Liquid Retina display it’s even more noticeable.
I'd say it's the opposite: Windows gamers are more used to this. I think by now one of the biggest parts of Nvidia/ATI drivers is patches and workarounds for specific games. Add the patches to the games themselves, and you've got a very fragile landscape. I've known a lot of people who won't upgrade certain games and drivers for fear of upsetting their frame rate, mod compatibility, or overall stability.
Heck, there are people keeping older PCs around just for that, so you can e.g. run your Skyrim LE with Win7 and an older DirectX release etc.
"Retrogaming" is quickly approaching a very small time delta.
I guess that's the price you pay for performance and availability. Games shouldn't be that brittle, but any fast-paced industry is likely to run into these problems, and with high-speed internet for delivering patches and only a GPU duopoly to target, it's too easy to just fix things after the fact.
Indeed, I remember being quite surprised when I found that Doom 3 ran noticeably smoother on Linux than on Windows on the same machine. I think I was even using Wine back then.
I use some CAD/CAM programs under Wine and they're noticeably faster in both rendering and general usage compared to a Windows install (I have a dual-boot system).
I can only complain about some minor font rendering issues and the lack of child window rendering [0], which is frequently used in programs like these.
Once child window rendering is fixed, I will actually consider ditching Windows entirely and going Wine-only.
I noticed this as well. Old Windows programs, like programs from the XP era and earlier, ran better in Wine than on Windows 7 back when I was experimenting with Wine.
I wonder if that's due to the change in the way apps are composited.
In the pre-Vista era, the operating system handed control over to the app to draw itself on screen. With Vista they moved to a model, like OS X's, where each app drew to a buffer and the OS composited those buffers. That's part of the reason Vista was hot steaming garbage: they were still working that transition out.
Can't disagree more. I don't live in a big city. My best guess at where the data center is would be a mid-size city about an hour away, or more likely, a big city about 2.5 hours away (driving time). I didn't have high hopes for Stadia but wanted to check it out. I have zero problems with latency, and never think about it when playing. It does have its bugs...random crashes, a few seconds of LoFi here or there probably due to jitter somewhere...but overall, it really works well for me.
Stadia is a bad product, not because of the technical difficulties, but because it makes no sense to pay for a service, then pay for access to the content in that service (the games), and then lose access to that content if you stop paying for the service (content you paid full price for, by the way). I wouldn't mind a "Netflix for video games", but Stadia isn't it.
I mean, it's not that different than the current model. With XB1 for example, you lose access to the 'free' games, as well as the ability to play online, if you don't have a monthly subscription. For single player offline games, it might be ok. But Stadia supposedly will have a free tier I think, I don't keep up with its news much.
You do know that they are going to release a free 1080p version this year right (although you still have to pay for the games)? You only need to pay the subscription if you want 4k @ 60 fps.
There is a free tier coming, so even if you stopped paying for "pro" you would be able to access any games you had purchased.
This is the same as Xbox, PS Now, etc. You have to pay for the service via a game pass or pro membership, on top of buying the game and possibly the system, before you can stream it.
Even in the case of shadow, you'd have to pay for the service to stream games you'd buy and access off your own/their system.
I would not get it for this exact reason. However, as a counterpoint, this will work for some people, like my high-school kids.
They stream Spotify because they don't want to buy songs, and once the current song is no longer popular they move on to another one and forget the first song ever existed. They play video games, and when they are done they sell them and buy a new one, never wanting to revisit the old one. They spend $50 on the new game and sell it for $15-ish. For that mindset, this is a great product, along with the digital-only Xbox One.
I am not implying all kids or youth are like this; mine are, and a bunch of their friends are. I prefer GOG or a physical copy, but then again I don't buy that many games.
That's completely wrong. Any games you paid for are yours to play without any monthly subscription. The monthly subscription gives you access to a free game or so each month (which you would lose access to), and without it you can't play at 4K, but you don't actually lose access to the games you paid for.
But the issue isn’t bandwidth, really, is it? I would think the ping would be the killing factor. Do they have a plan for that? Server farms no more than 20ms away from anywhere in the US?
These seem like pretty common latency numbers for online gaming. I guess it feels worse when there isn't the movement prediction and other techniques to make the latency feel better since the game isn't running locally.
Well, the issue is that you have input lag on your controller inputs. When I move my crosshair it takes a long time to actually show the crosshair moving on screen.
In normal online gaming you do not have that issue. When I move my crosshair in cs:go it moves instantly.
Right, yeah, that's kind of what I was trying to suggest. Honestly, I don't really see how this can work all that well in the general case, given exactly what you're saying. I have seen some examples of fighting games being played over Stadia, but many fighting games rely on frame-perfect input (and purists in communities like Melee don't even use certain monitors because of this...)
Some games can be far more demanding than video streaming: HDR content, 4K, >60 fps, 4:4:4 chroma (text looks horrible at 4:2:0). The requirement for realtime encoding instead of offline encoding also means the codec gets less lookahead (as few frames as possible), which results in higher bandwidth requirements for a given quality level compared to VoD.
Absolutely. Doing realtime video encoding is costly as it is, and the kind of hardcore gamers Stadia seems to be marketing for are definitely going to notice this. Some games are absolute codec poison, too.
Check out some of the Games Done Quick runs from last summer on Youtube, for example. A lot of the FPS games especially look dreadful. I'm sure Stadia can do better, but there's only so much bandwidth you can throw at this problem before it becomes untenable: https://twitter.com/dada78641/status/1207751665752911872
To me, the main issue with Stadia is not the product itself but the massive amount of extra CO2 that's generated trying to solve all these issues that were already solved decades ago by having your own cpu in your own personal computer in your own house.
You hit the nail on the head. The distribution of local server farms in Google's network is much, much larger than people assume (has been for awhile).
There's also a chicken-egg effect that Google's still interested in for secondary reasons. Reverse the question: assume Stadia takes off and becomes the "killer app" of gaming (big assumption, but still). Will local municipalities continue to tolerate crap networks if it means they're left out in the cold on this thing? Google's still interested in improved network infrastructure and has the clout to play incentive games to make that scenario more likely.
It exists. RetroArch has it in some cores, for example. It was originally conceived of and made by emulator developers.
Of course, it gets really hairy really quickly. Technically, perfect runahead/negative latency requires ([all possible input states] to the power of [number of frames of lookahead]) frames to be rendered.
A single-player NES game has 8 binary controller buttons, so 2 to the power of 8 = 256 possible input states. You need to render 256 frames for one frame of runahead, 256^2 for two frames, and so on.
Now surely you're thinking you'll just prune the set of possible input states through smart prediction of what the player will probably do next, but it doesn't matter. The number very quickly grows to ludicrous amounts of processing for anything that isn't an antiquated console. Let alone for 3D FPSes that use float values for e.g. mouse state.
Edit: and I should add that if they even try to do this, the amount of CO2 they'll waste is going to be horrifying.
GGPO (the fighting-game rollback implementation) simply predicts that each player's inputs on frame N+1 are the same as on frame N. This works well enough for fighting games but may not generalize, especially in games with precise analog inputs.
Also, there's a hard requirement that, in order to roll back 3 frames, the engine has to be able to re-simulate those 3 frames within the timespan of one, which is sometimes an issue when trying to patch this onto a finished game.
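A rough sketch of that predict-and-rollback idea (this is just the shape of the technique, not GGPO's actual API; every name here is made up):

    #include <vector>

    struct Input     { unsigned buttons = 0; };
    struct GameState { /* positions, health, RNG state, ... */ };

    // Advance the game by one fixed tick; the real engine provides this.
    GameState simulate(const GameState& s, const Input&) { return s; }

    struct Rollback {
        std::vector<GameState> history;  // saved state at the start of each frame
        std::vector<Input>     inputs;   // inputs actually used for each frame
        GameState              current;

        // "Frame N+1 looks like frame N."
        Input predict(int frame) const {
            return frame > 0 ? inputs[frame - 1] : Input{};
        }

        // Called when the real remote input for an old frame arrives.
        // Assumes history/inputs already hold entries for every frame < now.
        void on_remote_input(int frame, Input real, int now) {
            if (real.buttons == inputs[frame].buttons) return;  // prediction held
            // Misprediction: rewind and re-simulate up to the present.
            // All of this must fit in one frame's budget, which is the hard
            // part when retrofitting rollback onto a finished engine.
            current = history[frame];
            inputs[frame] = real;
            for (int f = frame; f < now; ++f)
                current = simulate(current, inputs[f]);
        }
    };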
It's primarily latency but bandwidth can also be an issue. Assuming you have a fast enough Internet connection (possibly not a safe assumption here in the U.S.), imagine your PC or mobile device(s) decide to update themselves or just generally get 'chatty' while you're playing. I've had this create problems for relatively bandwidth-sipping VoIP sessions so I'd imagine it would be much worse for a game experience.
Yes, the problem is latency. Which is actually not that big of a problem in some parts of the world. Also, if you have a metered connection bandwidth can become a problem.
I have tested Stadia at a family member's place and I was really impressed by how well it works. Personally I am not going to pay for it and will keep using my consoles, but Stadia certainly works.
Local conditions (in the US, for example) are not necessarily a showstopper in Japan, South Korea, or parts of Europe. Low-latency, extremely-high-bandwidth connectivity is the new playground.
No it's not, haha. I live 20 minutes from the center of London and the best broadband I can get at my address is 18 Mbit. Yes, megabits, not megabytes. This sort of shit-tier internet is very common in England; Stadia is a non-starter for many people here.
I mean, you should get 50 Mbit/s from very affordable mobile broadband. If you have an ADSL landline, you can get a hybrid solution (combined mobile + landline that works as one) and you should get 100 Mbit/s.
As long as you're on fibre, I don't think the link is relevant. You've got much more latency to nibble at elsewhere. AFAIK the rendering pipeline is where most of the work is. It certainly seemed that way back when OnLive was a thing (the first service of this type I can think of).
Absolutely. One could even make a decent claim that (barring some truly ancient hardware between input, CPU, and monitor / audio circuits) it's impossible.
It's surprising how much latency a human will tolerate without notice, however.
>One could even make a decent claim that (barring some truly ancient hardware between input, CPU, and monitor / audio circuits) it's impossible.
I think that's only if we ignore monitor refresh rates. Most of us are maxed out at 60 FPS, which, with a lot of hand-waving, is equivalent to ~17 ms of latency. It's relatively easy to get a network with latency better than that. If you can achieve the rendering + data round trip in under 17 ms, you're going to be functionally equivalent to a local computer.
Given that I'm running fiber that connects to the local university backbone, I'm running the fastest possible link in my area (excepting those in the university).
So, I'd say that I'm running the best case scenario available within my area.
That's not accurate. A university is a relatively small population and not something that Google will necessarily optimize for. And there's no guarantee that the university isn't doing something wrong that is artificially increasing latency.
My anecdata counterpoint is that I pay for the cheapest/slowest broadband available in a midsized city and my ping to google.com is 14ms.
I live near London central (20 mins from Oxford circus), the best internet I can get is 18 Megabit, and this is in a relatively modern building (< 15 years old). My situation is not uncommon here in the UK, Internet infrastructure here is terrible. Google Stadia is a non-starter for me.
Most of the Seattle area. Anywhere there is only Comcast, or Frontier, or the parts of Orange County without AT&T U-verse fiber (Orange, Santa Ana, Anaheim, etc.; I'm sure most of Irvine is there). I have doubts about shared Google Fiber in high-density housing as well. 35 MB is a big number.
35 MB would be a big number, but luckily the requirement is 35 Mbps. My experience with Comcast is limited to 2 areas: Northern Virginia, where Comcast has competition, and rural central Florida, where they don't have any. In both locations Comcast is more than able to provide the required 35 Mbps, even during busy times of the day.
So it's not a good product for one US area. What's your exact point?
Product doesn't have to cater to every single place on earth to be viable. It's like arguing that ARPAnet was a terrible product because not everyone had T1 lines.
"200mb uncapped internet in the UK is super common and pretty cheap (£30-£35 a month), as it is in most of western Europe."
This isn't accurate, coming from someone living in London. Most of my co-workers do not have access to "200mb" internet, and the most common is 25 Mb or less. I personally only get 18 Mb on my home connection, i.e. roughly 2 MB/sec, and that was the best I could get from any provider.
The state of Internet infrastructure is fucking terrible in the UK given how small a country it is. Coming from California to the UK felt like going back in time in terms of Internet speeds.
FWIW, Seattle's always been behind the curve. I don't know why, but even in the AOL days they were bad relative to the mid-sized east coast town I grew up in.
In 2017, an average 20% of households in the EU did not have access to internet connections faster than 30 megabit/s and a report a year ago found that the original goal of 100% coverage by 2020 would not be reached.
If you zoom in you'll see that the average is misleading: there are a lot of areas with little coverage hidden in that average. It also includes LTE connectivity, which theoretically gives you > 30 Mbit/s and covers a large area, but practically doesn't.
Germany is pretty bad when it comes to internet speeds. Broken market.
You can get cheap 1 Gbps wired Internet in France, yet, in the 5 cities I've lived since 2016:
- I actually have > 100 Mbps in 2 of them (both in the 20 biggest cities);
- I painfully had > 5 Mbps in 1 of them (in the 20 biggest cities);
- I painfully have > 500 kbps in 2 of them (much smaller cities, but still in big urban areas).
I assume France counts as the developed world as of 2020, and I believe I've had better luck with Internet access than most people here.
Most of America that isn't in a major coastal city. Cable internet promises more, but it's a shared backbone that is not properly provisioned in most locations during primetime. Streaming from Netflix and YouTube in 1080p is problematic in many of those areas.
To pick a state as an example, Iowa speed tests had a mean download speed of 71.39 Mbps in 2018. That is twice what Google says is the requirement for 4k Stadia. So even if prime time speeds are lower, there is still a healthy margin.
I have 250/100 Mbit/s, my brother has 1000/1000 Mbit/s. Both are included in our rent.
I don't like Stadia for other reasons, but saying that consumer access isn't up to it because you live in a country that's bad in that department is not really a good argument.
Do you think modern cars are a bad idea as well because there are parts of Africa that don't have good roads?
I think the more interesting question here is why Stadia games don't use isolcpus and sched_setaffinity to ensure they have the best CPU cache performance and the lowest latencies possible.
They control the whole box, it should be more like a console than a regular PC.
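A sketch of what the pinning half could look like (the helper is made up, and the chosen core would have to be one reserved via isolcpus= on the kernel command line):

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif
    #include <sched.h>

    // Pin the calling thread to one reserved core so the game/render
    // thread keeps its cache and never competes with ordinary tasks.
    bool pin_to_core(int core) {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return sched_setaffinity(0, sizeof(set), &set) == 0;  // pid 0 = calling thread
    }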
There are fairly standard real-time scheduling techniques to give whatever time slice you want to each process. Usually you architect it with a small hard real-time process and another process for soft real-time work (e.g. paging data), and you use a priority queue for world updates (intersection tests, gun/missile flyout, updating other player positions). The update thread gets a minimum time quantum and then yields, round-robin sharing.
What Linus is really saying is that the way Skarupke measured the spinlock being "bad" is wrong. Skarupke tried to measure it in C, but from userspace you can never be sure you aren't being descheduled by something else in the middle of the measurement.
Skarupke inaccurately framed this as "mutex vs. spinlock", as if there were an ongoing debate in which the two are comparable, when they aren't comparable at all: in userland you simply cannot use spinlocks.
The Windows scheduler apparently doesn't do this and handles spinlocks better, whatever that means. But the way it works is that spinlocks should behave poorly if you use them poorly. When you take full manual control by running a spinlock in userland, you will still get descheduled by other things, and you should know that.
That's what Linux is really about: the ability to do something and not have the system "better-ize" it the way Windows does, with some black magic running an unknown process alongside your spinlock to make it run better.
A spinlock running in userland SHOULD get scheduled around by other things, as Linus says, and as Linus implemented in the kernel.
Kinda. Normal Windows applications use a system call to wait for new messages to arrive, and then react to those. This is very similar to waiting for a mutex, and it allows the kernel to put the thread to sleep until a message comes along.
Games often integrate the message loop into their game loop. In that loop, they poll to see if any new message has arrived and, if so, handle it. Either way, the loop then does its game thing.
If there's not much to do, because the game is paused while minimized for example, this spins much like a spinlock spins. If, in addition, it has some other threads it's synchronizing with spinlocks, then yes, it can end up burning several cores' worth of CPU doing nothing.
Better games handle the loop differently while the game is minimized, for example by falling back to the default wait-for-message variant.
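Roughly the two loop styles side by side (Win32; the surrounding functions are my own sketch, the calls are the standard ones):

    #include <windows.h>

    bool g_running = true;
    bool g_minimized = false;   // toggled by the window procedure on WM_SIZE

    // Typical application loop: GetMessage blocks in the kernel,
    // so the thread sleeps whenever there is nothing to do.
    void app_loop() {
        MSG msg;
        while (GetMessage(&msg, nullptr, 0, 0) > 0) {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
    }

    // Typical game loop: PeekMessage never sleeps, so with nothing to do
    // (a paused, minimized game, say) this spins much like a spinlock.
    void game_loop() {
        MSG msg;
        while (g_running) {
            while (PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE)) {
                if (msg.message == WM_QUIT) g_running = false;
                TranslateMessage(&msg);
                DispatchMessage(&msg);
            }
            if (g_minimized) {
                WaitMessage();   // the "better games" behavior: block while hidden
            } else {
                // update + render one frame here
            }
        }
    }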