> Here's the thing, we want AMD to join the graphics community not hang out inside the company in silos. We need to enable FreeSync on Linux, go ask the community how would be best to do it, don't shove it inside the driver hidden in a special ioctl. Got some new HDMI features that are secret, talk to other ppl in the same position and work out a plan
for moving forward. At the moment there is no engaging with the Linux
stack because you aren't really using it, as long as you hide behind
the abstraction there won't be much engagement, and neither side
benefits, so why should we merge the code if nobody benefits?
> The platform problem/Windows mindset is scary and makes a lot of decisions for you, open source doesn't have those restrictions, and I don't accept drivers that try and push those development model problems into our codebase.
I agree with Dave. If they need an HAL, that is why standards are for. USB does not need any HAL. It is standardized. And this is where Khronos group falls short.
They provide a standard implemented by the driver and not the hardware. There is not even a standard to get performance metrics for GFX cards. Nothing.
I agree with Dave. If you do not want to create the standard, leave others to do it. But having an HAL inside the driver is problematic.
Shall we have a cross-platform standard for writing cross-platform drivers? Write once, run everywhere?
Why not as long as it is open source.
But it still needs someone to govern it, like linux kernel project and device companies do not seem interested. Which says a lot for their intentions.
That's an interesting take from someone who develops a kernel that the rest of the world has to rip out all sorts of "linux'isms" from code daily and deal with the pain of porting non portable Linux code to their platform. Jesus the hypocrisy is deep.
So Linux is not allowed to demand quality code that is following their conventions and has to be satisfied with the minimum amount of effort for AMD's Windows code to run?
Why should the Linux kernel include an abstraction layer for AMD's Windows code?
Why would any sane person agree to that?
> all sorts of "linux'isms" from code daily and deal with the pain of porting non portable Linux code to their platform.
If you're developing for Linux, using Linux specific technology, then of course there would be porting effort required.
Same as if you want to make you Windows stuff work on Linux, there should be porting required - after all, it's a different platform,
What AMD wants to do is to sidestep as much of the porting as possible, by effectively shipping their Windows code inside the Linux kernel.
> So Linux is not allowed to demand quality code that is following their conventions and has to be satisfied with the minimum amount of effort for AMD's Windows code to run? Why should the Linux kernel include an abstraction layer for AMD's Windows code? Why would any sane person agree to that?
Linux is open source, so if the kernel developers desire better designed code, they are free to change the code up to their quality levels. If the kernel development team does not have the manpower for this, they should better think about a way to maintain the kernel that involves less work. One example (among many) would be to think about a way to keep the internal kernel interfaces typically stable over many years so that only rarely there is a lot work to be done for updating all the drivers to the new internal kernel interfaces.
> If you're developing for Linux, using Linux specific technology, then of course there would be porting effort required.
The released open source drivers seem to work quite well (as they do on Windows). The problem is that they don't fit the taste of the kernel developers.
Open source does not mean, "any code accepted here!"
> if the kernel developers desire better designed code, they are free to change the code up to their quality levels.
They are also free to reject bad code and demand that if you want you code in, you should improve it.
I don't get this mentality at all; why should the kernel developers accept inferior code and then improve it? Isn't that the responsibility of the vendor who designed the product? After all, AMD is a for profit company, not a charity. Why should the other developers provide charity to make AMD code better? So that AMD can sell more units or have better PR? What?
>The problem is that they don't fit the taste of the kernel developers.
That's not the problem.
The problem is that the drivers were designed for Windows, not Linux and such code is not suitable for inclusion in the Linux kernel.
If I'm following the conversation correctly, it doesn't break anything, the kernel developers don't understand it. Also, the followup from Alex Deucher of the AMD team is interesting: https://lists.freedesktop.org/archives/dri-devel/2016-Decemb... Basically, he reckons that the atomic modesetting code is a poorly-thought out and maintained disaster that regularly breaks multiple drivers - and having somewhat followed the changelogs, I can entirely believe this. (Dave Airlie responds by blaming AMD for not sufficiently testing the upstream kernel developers' buggy changes to their drivers.)
> If the kernel development team does not have the manpower for this, they should better think about a way to maintain the kernel that involves less work.
Or, they could have AMD do the work, since apparently they're the ones who didn't listen -- after being told months ago -- that this rejection could happen when they tried to include a HAL with their driver. I think that perfectly works: the kernel developers don't have to "do all that work" involving un-fucking AMD's driver, and AMD instead has to do the work. Sounds good to me.
There's literally 0 point in accepting the code as-is, because everyone would be on the hook for maintenance it in the mean time, while it got un-crappified, and it would make graphics subsystem maintainers life worse. No. Kick it out, make them do it the right way, and when they come back -- they can talk.
This whole thread has got plenty of of entitled whiners and people with a bone-to-pick against Linux, like yourself, bitching,about OSS maintainers not making their own lives harder because you want feel good about your graphics driver. Get over it. Or, get involved -- maybe you could send a few patches to AMD maintainers to clean out some of the crap.
AMD is not being banned from kernel development, but they're going to have to do it right. Just like the other 20-30 companies that regularly contribute upstream to Linux with their money/developers.
In fact, Linux often accepts drivers for hardware that only certain companies have access to and are of no use to anyone else in the general public. Why? Because they played by the rules, meaning the overall maintenance cost to include those drivers becomes far smaller in the long run. The cost of including the AMDGPU driver, as it is now, is astronomically high in comparison.
> The released open source drivers seem to work quite well (as they do on Windows). The problem is that they don't fit the taste of the kernel developers.
No. Let's be clear: the problem is AMD can't listen, and people like you apparently, can't read. That's about all it comes down to. You are free to now whine and complain about how incredibly important this is and how it's definitely worth breaking the rules over and how much it means to you, and I'm sure the kernel developers owe you this feature, or something (after all, developers are just robots with no lives and if they don't work hard enough, they're bad.)
Meanwhile, it will be ignored, Linux will go on (and continue to crush its competitors in the spaces it matters in), and the world will still turn. And maybe in a year from now AMD will actually have something worth merging. In the mean time, Nvidia will continue to dominate them in the compute market. Maybe actually listening 9 months ago would have saved them some time and market share.
> In the mean time, Nvidia will continue to dominate them in the compute market. Maybe actually listening 9 months ago would have saved them some time and market share.
NVIDIA is about the furthest thing from a shining example of good behavior in the Linux kernel. By not having open source drivers at all, they're far worse.
(By the way, their closed source driver has a HAL too.)
Yeah, this seems like the kernel devs are making the classic "the perfect is the enemy of the good" mistake.
They're sending the message that AMD would be better off shipping closed source drivers, like Nvidia, because that would save them the headache of trying to get their merely good but not perfect open source drivers into the hands of users via dealing with Kernel politics.
I don't think there is much difference between blob and such code from the kernel maintainers' point of view. From users' point of view the difference is huge (freedom to tinker / fix bugs), but maintainers need to keep the whole kernel maintainable. Merging such code would be irresponsible.
And no, I don't want just "good" code in kernel I am using. This is not business. Make it maintainable so it can get better (perfect) in the long run.
(yes, I know there are places in kernel where code is not even good, let alone perfect, but that's another issue altogether)
The kernel maintainers can choose between continuing with reverse engineering and hacking together their own driver, or starting with the code of AMDs driver and adapting that.
And you really think continuing with the hacky reverse engineering is the better solution?
> Since when did writing a driver become the kernel maintainers' problem?
Since the kernel maintainers, and other open source contributors wrote drivers?
> I don't think the AMDGPU driver not being mainlined really affects Linux. On the other hand, AMD will really benefit from it.
You remember the days when Linux never would work on any real PC because audio drivers, GPU drivers, everything was missing? Do you want those back?
Having shitty drivers in the kernel isn’t ideal either, just like microkernels vs. monolithic kernels are a tradeoff, but at least cooperating with them in how to get it best into the kernel (AMD rewrote ~100k LOC since the last "Nope" from the maintainers) would be a lot more helpful than this.
The thing is though, if they shipped a driver like Nvidia does, where it's this blob that just wraps its tendrils into what ever systems are needed, that's not the graphics maintainers problem either
It's a net loss for users of AMD hardware (though AMD's hardware on Linux has been a dumpster fire for as long as I can remember) because they have to suffer arguably worse driver support, but that's on AMD for wanting to have their cake and eat it too
If someone writes application software that depends on the linux kernel, and someone else wants to port that code to another platform, how is the fact that code changes are required the fault of the kernel developers? I don't see any hypocrisy.
I feel X86BSD here, because Linux has a history of being different for the sake of being different. A relatively recent example could be getentropy (BSD) vs getrandom (Linux).
OTOH Linux has some extremely useful syscalls that others just don't have, sometimes causing major performance regressions. Case in point: sync_file_range.
But then Linux also has a history of screwing up and having a bunch of different syscalls on different platforms because somebody introducing a syscall didn't quite think it through. Case in point: sync_file_range. This "only" causes extra work for developers directly working with syscalls - so usually libc devs.
> because Linux has a history of being different for the sake of being different.
Heh.
I get the same feeling about git when I'm working with hg, the Canada[1] of git. For svn, darcs, bzr, and Mercurial, "revert" means to change the working directory to the repo state. Why did git use it to mean creating a new commit that undoes an old commit?
There's no use rallying against the majority software when it doesn't follow the standard. When you're the majority, you just get to make your own standards because you are the standard. BSD being the Canada of Linux, it's a little funny to hear them complaining about things that Linux does.
I upvoted your comment because it is a real pain. But I don't agree with blaming the Linux kernel developers for it.
Neither do I think it's analogous to this particular discussion either, because a, Linux kernel developers can optimize/prioritize their kernel code as they please.
b, In this case, AMD is submitting a patch to make it easier for them to support linux for their GPU. So the "ball is in their court" to find a solution the community finds acceptable.
Your case is about porting code between different OS kernels.
dave airlie forces people at gunpoint to only write software for Linux and nothing else, meaning HEROES like myself have to pick up the slack. fact-a-mundo. it's also perfectly well known that if you dump 100,000 lines of code on a project, they are MANDATED BY LAW to accept it and not have standards to abide by. who do these jerks think they are?????????
i'm definitely not mad and bitter about Linux in any way, grasping at straws in an attempt for relevance. i promise you. pinky swear.
Ah, I believe he is talking about the general pain all non-linux distros have faced over the years where we have to painfully keep replacing linuxisms in the code with portable code.
The problem is: Good driver developers are not always good politicians/diplomats and vice versa. What this followup demands is that developers are suddenly supposed to become politicians to get their implementation accepted. I mean: Lots of developers have a deep hate for "organizational politics" (IMHO rightfully so) - and this is suddenly something that should be accepted/loved when it is about Linux? I prefer to hold up the same standards to all kinds of organisations and don't make exceptions just because it is "for the good/open source/Linux".
That's not how I read it. The driver developers are now acting as politicians, trying to merge their own AMD-specific worldview into an open structure and trying to convince the rest of the DRI world that that is a good thing.
DA instead is saying to the developers that they need to play ball and work with the existing Linux DRI world and not silo themselves off.
> That's not how I read it. The driver developers are now acting as politicians, trying to merge their own AMD-specific worldview into an open structure and trying to convince the rest of the DRI world that that is a good thing.
If AMD decided to simply leave their driver as closed-source blob, they would not have this problem. But all the Linux fanboys that they want AMD to open source their graphics drivers. Just to state one thing clear: AMD already released specifications beforehand such that the kernel developers could have developed an independent graphics driver if they wanted. But it is better to shitstorm companies not to develop a driver than to sully one's hands.
AMD was nice and did its job. But instead of being satisfied with the result they now want AMD to start playing the political game with them (for the protocol: IMHO the correct thing to do would be to bring the driver to the staging area and now it's part of the kernel developers to sully their hands to bring the part up to the standards they want).
NVidia simply refuses to open source their drivers and does not get into this kind of trouble. Lesson learned: Never negotiate with terrorists.
You already lost the argument, if you have to go with that, but I'll try my best to explain anyway; what the kernel developers want is for people to use the standard, already maintained interfaces, instead of companies developing their own and adding unnecessary complexity to the kernel, which would then need to be maintained by someone for a long time.
AMD is basically writing an abstraction layer to allow them to use their Windows driver code, the Linux maintainers are saying; why should we have an "inferior" driver, that's basically "ported" from Windows? If you want to support Linux, write code that interacts with standard Linux interfaces, follows our conventions and benefits the community as a whole.
> Lesson learned: Never negotiate with terrorists.
So not wanting to merge shitty* code into your codebase is now terrorism?
> the Linux maintainers are saying; why should we have an "inferior" driver, that's basically "ported" from Windows?
As I wrote: AMD already indulged a lot (first specifications, then even an open source Linux driver). Even after the first step the kernel developers would be able to write a Linux drivers(though it is a lot of work). I already clarified: AMD did all this to satisfy what the Linux fanboys wanted (while NVidia did nothing). But instead of saying "thanks for all the work you did, AMD. The driver is currently not up to the standards that we desire, but it still helped us to make the driver development a lot less work. We [kernel developers] will do the remaining job and lift the kernel up to the superior quality that we want.". But that is not what the kernel developers did. Instead they want AMD to dance to the kernel developer's piping.
> The driver is currently not up to the standards that we desire, but it still helped us to make the driver development a lot less work. We [kernel developers] will do the remaining job and lift the kernel up to the superior quality that we want.
Do you know who the elusive kernel developers that you keep referring to are? They're mostly employees of various other companies who want their code merged into the kernel and have to follow the same standards to get their code in!
Why should AMD be any different? Why should somebody else pick the slack up for AMD? Does AMD pick the slack for Intel as well? And who would them keep up with their upcoming silicon etc. (It wouldn't be AMD, since this approach makes it easier for them). Why would one company be allowed to sidestep what is required to get your code into the kernel? Is it because they showed some good will in the past? Is that why inferior code should now be allowed?
> they want AMD to dance to the kernel developer's piping.
So let me get this straight. AMD wants to merge their code into the Linux kernel, (not the other way around), but it's the kernel developers who should instead "dance to the AMD developer's piping"?
Look, It's good that they are trying and obviously, the code may be accepted at some point in the future, if it's good enough, but for now, let's not let our emotion, (AMD are the good guys), override our rational thinking.
> They're mostly employees of various other companies who want their code merged into the kernel and have to follow the same standards to get their code in!
Hardware differs a lot in complexity. GPUs are very complicated.
Independently: The job security of these people also depends on the fact that it is so politically involved to get "their" driver into the kernel. So they surely have no incentive to make it easier for other companies/developers to get their drivers in (for example by very stable internal kernel interfaces).
> AMD wants to merge their code into the Linux kernel
AMD indulged on the desire of lots of Linux users for open source drivers. They did their job. NVidia did nothing.
> Hardware differs a lot in complexity. GPUs are very complicated.
Yes, Intel has an open-source GPU driver in the kernel, AMD can too, they just need to follow the conventions.
> they surely have no incentive to make it easier for other companies/developers to get their drivers in (for example by very stable internal kernel interfaces).
The kernel interface around DRI is actually quite stable, I don't think making it hard for AMD to merge in their driver would help with anybody's job security. It would certainly not be enough to affect GPU market share in a significant way, so it would be very risky for little gain?
Why invent conspiracy theories, rather than just accept the far more likely explanation that AMD's code is not up to the job?
> AMD indulged on the desire of lots of Linux users for open source drivers. They did their job.
No they didn't. If they wanted to fulfil the promise of delivering a kernel driver because their users demanded it, they would have done their job, had they produced code that follows the conventions and work with the maintainers to get the code accepted.
Throwing some code over the wall, does not meet any reasonable definition of "doing their job".
> NVidia did nothing.
Ah, so now it's about, "but look over here, they're even worse!"
I mean, OK, NVidia did nothing, AMD did something, but not enough. Intel did even more than AMD did and is still not perfect. Why couldn't AMD be at least as good, if not better than Intel? If you want to compare, why compare against the worst, rather than the best player?
And, why can't we judge this independently?
Irrespectively of NVidia or Intel, this is what AMD produced and it's not yet good enough.
They ignored feedback from february[0]. How can that be "doing their job"? Do you think Ms would sign drivers that did the opposite of their guidelines?
the central component that Microsoft requires is a code signing certificate (i.e. money and perhaps a little boring bureaucracy, which still can be done in a very systematic way). What Microsoft tests internally at the drivers is whether they have potential dangerous security bugs (e.g. buffer overflows). You can architect the drivers as you want to (though Microsoft provides guidelines and reference source code to make it easier to write drivers "the officially desired way").
Getting the drivers into the Linux kernel means - as one can see - going deeply into kernel politics. If Microsoft required something similarly, the hardware vendors would tap their forehead at Microsoft.
Great point! Microsoft does require something similar. You _only get to use the APIs they provide_ which absolutely means that you don't get to write kernel patches to change the source of the kernel to invent new APIs which you want to use to write drivers for your product. So yeah, Microsoft does exactly this.
It's not as though if AMD had developed drivers for Linux first that they could then go bully MS into allowing them to patch the Windows kernel so that they didn't have to modify their Linux drivers to get them ported to Windows.
> Getting the drivers into the Linux kernel means - as one can see - going deeply into kernel politics. If Microsoft required something similarly, the hardware vendors would tap their forehead at Microsoft.
I'd imagine if you wanted to include code in the NT kernel they would. But since they don't allow you to include your code in the NT kernel at all, they don't require what the Linux kernel developers require,
i.e. MS is not providing the same level of access, so it doesn't have the same requirements.
There is also that the quality of the old closed source fglrx driver was just ass. It would be naive to think all code coming from AMD has acceptable quality.
Quite the opposite, somebody at AMD wanted to play along with the openness of the Linux kernel. Then to help unify their codes bases they push this - and get rejected. Somebody at NVIDIA is laughing their ass off, saying it would have much easier if AMD just did what NVIDIA does!
> Quite the opposite, somebody at AMD wanted to play along with the openness of the Linux kerne
Oh please.
No, the boss said "merge this" and the code has been developed "corporate style" (with a HAL, etc) now the "mean" kernel developers won't approve this
But the Kernel people are right, because the other option would be to introduce code that breaks every now and then and is unmaintainable. See all the ACPI issues for example, that only stopped when Linus said "no changes can break existing functionality anymore"
This depends on whether there is at least 1 person at AMD who has experience with kernel development. If all of the team members all kernel outsiders, then this might confuse/surprise them. Otherwise someone likely raised this as a potential issue.
Last time I heard NVIDIA had to divert some of their people to work on Tegra support in nouveau because their ARM customers want open drivers so I'm not sure if they are in position to laugh their asses off.
Quite professional, well written and clearly the words of someone who cares about the product.
Lack of profanities already exceeded the LKML reputation.
I think the point about rules being applied consistently is very true. If Alice does the work to comply then Bob shouldn't be able to get away without doing it just because he's bigger.
Very short - AMD made an abstraction layer, to share effort between Linux and other platforms (i.e. Windows and etc.). Kernel/DRM maintainers don't like that, since it causes several issues detailed in that thread (harder to understand logic of the driver, slowdown of DRM improvement itself, indirect workflow of AMD developers and so on). For the reference, DRM here is Direct Rendering Manager[1], nothing to do with crooked Digital Restrictions Management.
Nvidia uses largely the same driver code for both linux and windows in their proprietary driver (I believe they call it a unified driver).
AMD tried the same in their open source driver and were rejected by the kernel maintainer. Unified drivers have code sharing advantages but don't follow the practices of the linux kernel.
My naive assumption is that code reuse across platforms is a good thing, I'd love to understand why this isn't the case here or what the concrete arguments are against it.
> My naive assumption is that code reuse across platforms is a good thing, I'd love to understand why this isn't the case here or what the concrete arguments are against it.
A driver is inherently platform-specific. It's glue that ties the hardware to the operating system. The only "correct" way to have one driver work on multiple operating systems is for the operating systems to all use the same driver model.
The ugly way is to create your own hardware abstraction layer and then write a translation layer between that and each operating system, because that's complicated and hideous.
But it's especially silly because Linux accepts suitable contributed code, so you could instead use the native Linux model as your "intermediary layer" and fix Linux if it isn't suitable in some way. And then translate that to what the closed operating system you can't modify uses.
The result is that the Linux people are happier and you have one less translation layer to maintain.
That might run into license issues. If you want to avoid licensing the other versions of your driver under GPLv2, you'd have to carefully avoid copying any code from the main kernel into your translation layer (rewriting any helper functions you end up using), and even then there's the idea of API copyright to contend with.
One might ask whether it is desirable to avoid the GPL, and there are a lot of arguments on both sides there, but it's certainly easy to run into issues when you have a GPL licensed module designed to be linked into a proprietary program (kernel).
> If you want to avoid licensing the other versions of your driver under GPLv2, you'd have to carefully avoid copying any code from the main kernel into your translation layer (rewriting any helper functions you end up using), and even then there's the idea of API copyright to contend with.
Isn't the point supposed to be to not have other versions of your driver, so you can use the same one on every platform?
By "other versions" I mean the codebase used for a given non-Linux platform, which would (hypothetically) include most of the Linux driver's code plus a translation layer from Linux APIs to that platform's.
The translation layer isn't where the interesting bits are. The parts hardware companies want to keep secret are the hardware-specific parts, not the OS-specific parts. It might even help them to open source the translation layers because then others could potentially use them and shoulder some of the maintenance cost.
I can't speak to the legal status of GPL drivers for Windows, but several seem to exist already (e.g. Windows ext4 driver), and if they were actually worried about it they could always get explicit permission from the copyright holders of the relevant code. Either they say yes and you're fine or they say no you know what pieces of code to replace.
>But it's especially silly because Linux accepts suitable contributed code, so you could instead use the native Linux model as your "intermediary layer" and fix Linux if it isn't suitable in some way. And then translate that to what the closed operating system you can't modify uses.
But Linux repesents a tiny portion of the gaming community, so that approach would make no sense at all for a GPU vendor. C'mon.
Then they aren't going to get their driver upstream. End of story. Kernel developers have already done this once (Dave hinted at Exynos drivers in the past in his other posts) and it was a large amount of work to un-screw the pooch once all this crap came along.
I know that Linux people really really just want the kernel to take one for the team so they can have GPUs because that's just the goal, and clearly the goal is good and the means don't matter at all and everything else is irrelevant. 100,000 lines of crap code, 200k? 500k? Who cares, it's all in the name of GPUs clearly. It's obviously worth it no matter what.
But the kernel developers do not see it that way, and for good reason -- because once it's in tree, they are all on the hook for it and they all have to deal with the swamp, the added complexity, the maintenance, the un-fucking of this entire HAL, etc etc.
Having worked on a large open source project, I can assure you, it sucks when you have to say "This isn't acceptable and we aren't merging it", even when it's a feature the users want, and one someone worked on for a long time. It is also, almost always, the right thing to do in the long run (and several of those features did come back, in acceptable ways, in our case).
> But Linux repesents a tiny portion of the gaming community, so that approach would make no sense at all for a GPU vendor. C'mon.
The growth market for GPUs is GPGPU and servers. And Linux represents a large portion of the programming and server communities.
More to the point, as soon as you support Linux at all then it doesn't matter who has more share, it's still less work to do the above than have to maintain another translation layer.
But AMD doesn't. GPGPU is already supported on nvidia drivers with their opaque blob. AMD has a more-transparent blob. People who want this to work already have a solution. This kernel change is probably important to some people, but those who simply want to run a GPGPU cluster on linux already have workable solutions.
The GPGPU market is the polar opposite of the gaming market.
Game developers might like to see clean driver source but they don't get to choose what kind of GPU their customers have already bought. And 99% of gamers are not going to choose their GPU based on Linux drivers. So nobody has any leverage and vendors have no incentive to change.
Meanwhile thousands of universities and institutions are each going to be looking for 25,000 GPUs and they can choose what brand they buy based on what makes their internal developers happy. Hosts like Amazon and Google are each going to be buying millions of GPUs, and having better and more transparent drivers so they can more easily e.g. improve power consumption by a small percentage, can save them a million dollars/year in electricity.
Someone like Google could come to each vendor and say "first to have mainline kernel drivers gets all our business" at any point. Or the same result in the other order; once there are clean drivers third parties are more likely to make power consumption and performance improvements that give AMD the edge when the major customers crunch the numbers.
There is a significant competitive advantage in it for AMD to get this right.
I agree with you almost entirely, except the part about fixing Linux. If the abstraction that Linux provides isn't suitable for some reason, it probably isn't straightforward to change it because of compatibility with existing code.
That's not so much a concern within the kernel boundary, which is the case that applies here. If you have a compelling reason to redesign an internal API, you "just" have to fix up all the code across the tree that consumes it. Changes are regularly made to the internal VFS interfaces, for example.
It's also often the case that kernel-driver interfaces are extended without breaking compatibility. In those cases, you want to ensure that the extensions are suitable for more than one driver to consume.
The problem isn't that sharing code across platforms is bad, it's that not sharing code within Linux is bad. Airlie is basically saying that if the kernel API and subsystems are somehow inadequate, AMD should improve them directly instead of covering them up with a bunch more code.
> Airlie is basically saying that if the kernel API and subsystems are somehow inadequate, AMD should improve them directly instead of covering them up with a bunch more code.
And you really believe that the maintainers will be accepting a giant patch that changes the API and subsystem completely (though into something better) that has the risk of causing lots of regressions to existing drivers? And you believe that AMD is supposed to fix all the regressions that are caused in drivers by other vendors that this change causes?
Of course not, the maintainers will accept a well thought out series of patches that each make one small logical change towards the better interface.
And yes - who else is supposed to fix all the regressions caused by changes that AMD wants? Volunteers who would rather work on something else? If you want a change, you get to support the regressions - and if AMD's work gets merged, then anyone ELSE who wants to make a change in that page needs to support AMD's regressions.
Hence wanting to make sure that the changes from AMD are manageable and flexible enough to allow further changes.
Linux got where it is by evolving how the kernel and drivers interact whenever needed, without waiting to coordinate with outsiders and their closed work.
(without looking at details) The problem is that Windows and Linux expose hardware and drivers in different ways. You can shim things up to make the code work, but you end up with a driver that doesn't look like a Linux driver and doesn't work like a Linux driver and can't easily be maintained by people working in the Linux graphics drivers is going to be a problem.
If the driver doesn't really belong in the Linux kernel source for those reasons, it's better to keep it outside the kernel tree.
The open source developers don't care about invisible code reuse in a closed source driver. HALs across open source codebases do exist too (eg for ZFS) but Linux in particular does not like them.
AMD should move their HAL code into their Windows driver, making it a superset of the Linux driver. AMD would get to reduce driver code duplication and Linux kernel developers don't need to merge the AMD's ugly Linux/Windows HAL.
> AMD should move their HAL code into their Windows driver, making it a superset of the Linux driver.
This might theoretically make sense if the Linux subsystem was very stable over many years. Practice shows that the Windows interfaces are what are a lot more stable over the years and changes in them are communicated for a long time beforehand so that hardware vendors can begin changing their drivers long beforehand.
Regardless of one's thoughts on AMD, I think this is broadly true. Microsoft may do a lot of things poorly, but one thing they are good at (arguably, the only thing they're good at, hell maybe the key to their success, really) is maintaining compatibility and not breaking stuff.
Linux maintains compatibility by fixing the driver themselves when they break it. Microsoft cannot (actually, can, and does) break their interfaces since they don't control the drivers.
This allows Linux to keep improving without breaking things in production; while Microsoft has to either maintain huge backward compatibility abstractions for changes, go YOLO and break stuff (often unknowingly) or abstain from improving their OS.
It is a good thing. For the developers of that piece of code (AMD in this case).
However, it is introducing a second API for a very specific subset of hardware into a kernel that is being developed by not just AMD people. Dave Airlie is rightly saying that the second API and hence two different code structures makes the whole DRI infrastructure harder to maintain for everyone else.
And Dave's responsibility is to everyone else, not to AMD.
It is a good thing for the driver writer as they have less difference between their targets.
It is a bad thing for the targets as they implement both the driver functionality and the abstractions required to make the same code work cross platform. The response linked describes the cost of those abstractions to the target (Linux kernel in this case).
Nvidia's proprietary driver breaks upon every new kernel release, which is why they have a shim. Furthermore, Nvidia can't ship their driver in the official Linux kernel due to copyright issues, and they're forced to handle all the maintenance burden of their driver (whereas AMD reaps the benefits of Intel's GPU driver bugfixing, and vice versa, thus lowering both Intel's and AMD's driver costs on Linux).
Besides, Nvidia's been having trouble with their Tegra GPUs on Android, and as a result have been forced to pitch in a bit on Nouveau (the reverse-engineered open-source Nvidia driver). They're still having trouble with their driver situation on mobile, as a result of their unwillingness to play ball with the kernel.
> Nvidia's proprietary driver breaks upon every new kernel release, which is why they have a shim.
ELI5: why does each Linux kernel release break driver code? It can't be THAT hard to just have a stable interface and leave it for long periods of time, e.g. only bumping it on major version bumps in the Kernel?
Because in practice, APIs inside Linux do, in fact, change quite a bit -- and by itself maybe that wouldn't matter so much, but the nvidia driver has an insane amount of surface area on top of it. It's a massive driver. You can imagine then, that breaking it is actually easier than you might think.
There is no rule kernel interfaces can only change on major bumps. In reality, they change quite frequently, as new APIs and drivers are merged in, which requires generalization, refactoring, etc across API boundaries to keep things sane. Kernel developers specifically reject the notion of a "stable ABI" like this because they feel it would tie their hands, and lead them to design APIs and workarounds for things which would otherwise be fundamentally simple if you "just" break some function and its call sites. APIs in Linux tend to organically grow, and die, as they are needed, by this logic.
Why wait 5 years for a "major version bump" to delete an API call, you could just do it today and fix the callers, since they're all right there in the kernel tree? It's far easier and more straightforward to do this than attempting to work around "stable" systems for very long periods of time, which is likely to accumulate cruft.
Because they do not care about out-of-tree code, when an API changes, their obligations are to refactor the code using that API, inside the kernel, and nothing else. That means the person making the change also has to fix all the other drivers, too, even if they don't necessarily maintain them. Out of tree users will have to adapt on their own.
This also explains why they do not want a HAL. When a Linux driver interface changes, the person changing it is responsible for changing everything else and fixing other drivers. That means if AMD wants a large change, it may have to go and touch the Intel driver and refactor it to match the new API. If Intel wants something new, they may have to touch the AMD driver in turn. This, in effect, helps reduce the burden and share responsibilities among the affected people.
They don't want a HAL because a HAL is a massive impediment to exactly that workflow. If Intel wants to improve a DRM/DRI interface in the kernel for their GPUs, they could normally do so and touch all the other drivers. Out with the old, in with the new. But now, they'd have to also wade through like 50,000 lines of AMD abstraction code that no other system, no other driver, uses. It effectively makes life worse for every graphics subsystem maintainer when this happens, except for AMD I guess since they can pawn off some of the work. But if AMD plays by the rules -- Intel fixing their AMDGPU driver when they make a change shouldn't be that unusual, or any more difficult compared any other graphics driver. And likewise -- AMD making a change and having to fix Intel's driver? That's just par for the course.
Obviously Linux isn't perfect here and they do, and have, accepted questionable things in the past, or have rejected seemingly reasonable API changes out of stability fear (while simultaneously not wanting a stable ABI -- which is fair). But the logic is basically something like the above, as to why this is all happening.
AMD's recent strategy has been to try to confine the proprietary stuff to userspace, and to implement an open-source kernel driver that can be used by either the proprietary userspace driver or open-source userspace stack.
The difference between open source and free software is mostly in the political camp the word comes from. Read the OSI definition of open source if you don't believe:
Kernel driver of proprietary AMDGPU-PRO are licensed exactly same way as modules in mainline kernel. Most of them dual-licensed under MIT and GPL so BSD and other projects can use them.
Note that both the amd and the nvidia kernel modules always have been FOSS because of the GPL license. It's just that nvidia provides it by its own ways, not through the official linux branch, and thus doesn't have to respect linux rules nor to document the driver.
I think the idea is that the kernel maintainers can't break the code that the AMD driver relies on and in order to properly do that they need to be able to easily grok all the driver implementations, and abstraction layers make that more difficult.
I don't know anything about linux display architecture, but this point sounds really weird to me from general software engineering perspective. Isn't one of the goals of having drivers in an OS to establish a formalized interface between drivers and kernel, and thus achieve separation of concerns between driver maintainers and kernel maintainers? Requiring that kernel maintainers understand how all drivers work does not sound very scalable.
Linux developers want to be able to share code across drivers for similar devices, and they want to be able to refactor and improve that shared code without worrying about out-of-tree drivers. That strategy has worked well for eg. WiFi drivers where the shared mac80211 subsystem allows a lot of logic to be pulled out of individual NIC drivers, and improvements to mac80211 more or less automatically benefit all participating drivers.
Makes sense, but it doesn't have to be mutually exclusive. It is possible to have a fixed network driver interface, and some common helpers orthogonal to it. This way driver developers could chose whether to benefit from common code or not. I guess if linux developer want to enforce code sharing, this wouldn't work, but I wonder why they would do it. Seems like it just makes life harder for both parties.
Many network drivers in Linux do opt out of the common frameworks, and changes like BQL [0] and some of the recent WiFi restructuring [1] that require driver-level modifications are usually not introduced as a mandatory compatibility break that requires modifying every driver at once.
Sometimes drivers opt out because the hardware is odd enough that it needs to do things differently, or because it offloads to hardware (or proprietary firmware) functionality that is usually done in software. But often, it's just because a driver was written in isolation by the manufacturer and then dumped on the community. (See [2], from the comments in [1], about the work required to clean up some Realtek WiFi drivers enough to be merged to the kernel staging area.) If a driver unnecessarily opts out of common frameworks and does things internally and differently, it can be hard to even evaluate whether problems fixed in the standard frameworks exist in the special snowflake drivers. Even after identifying a problem, the recipes that worked to fix the standard drivers won't apply.
Linux very intentionally and explicitly do not guarantee a stable interface for drivers. If you want to maintain your driver out of tree, then that's your right, but you have to be prepared for Linux kernel maintainers to break your driver regularly. The point is they are not interested in sacrificing ability to improve the kernel just in order to make it easier for people to maintain proprietary drivers.
Proprietary drivers are tolerated, not liked, and people aren't interested in making it easier for them.
These interfaces aren't broken gratuitously. They change to accommodate new features or to allow for better code reuse within the kernel. In the long term it's only a significant maintenance burden compared to letting the driver stagnate and never gain new functionality or optimization after initial release. Manufacturers might prefer that development model since it helps sell more chips of the next generation, but it's really not what should be used as the baseline.
You only have to fix your driver if it's outside the kernel tree. Drivers in the tree get fixed by whoever's making the breaking change (which is why it's important that certain hackers be able to read and understand all related drivers).
I wouldn't say avoided.. A lot of the interfaces are still compatible with prior versions as is the software that runs on it, or we'd be on Kernel v50+ by now. Not all software, but a bit.
That said, the community isn't afraid of breaking changes to push future versions forward.
Short version: AMD as a company is dysfunctional. Perfectly happy to throw away $300Mil on failed acquisition, unwilling to hire sufficient number of competent driver developers.
Where did you get this idea from? Just because a patch set got rejected, we can dismiss the fact that AMD had been providing excellent open source driver support for years, and just call them incompetent?
AMD is prioritizing the business here, which makes perfect sense. Why would they spend more money than necessary to appease the Linux maintainers in order to serve the tiny population of Linux gamers who probably don't give a crap about how the kernel is maintained? Their business is on Windows.
Neither AMD nor Nvidia develop Linux drivers for Linux gamers. Not that they don't pay attention to us with fixes and optimizations here and there, but a good part of that is games simply being applications that make use of a lot of driver functionality that might not receive enough testing otherwise.
This whole project from AMD was to try to unify their driver infrastructure between Linux and Windows, because their old FGLRX driver was trash and was a hacked together mess of Windows bits tied to the Linux system with its own binary kernel driver.
So they wanted to get rid of the ugly kernel part that was a PITA to install and update and are now pushing a lot of their hardware abstraction code from Windows into kernel patches so the AMDGPU driver can just talk to the Windows blob pretty much verbatim (which is now AMDGPU Pro).
The practical effect is a continuation of the status quo. AMDGPU Pro, without a lot of this functionality, is either broken or underperforming across all distros. It is still better than what the last FGLRX was, but nowadays the Gallium free driver they also develop is beating the blob in almost everything except the latest driver-level optimized games.
Most distros have completely dropped all proprietary AMD support. Going forward, it will be up to them to ship a proprietary driver and maintain installation faculties pretty much everywhere. AMDGPU with Mesa is going to continue working fine, new GPUs are getting supported still, and a lot of what this HAL does (display / window management) has had usable support that has worked for years in various parts of Gallium / Mesa / DRM.
The optimistic future is that AMD drops AMDGPU Pro, refocuses developer effort on AMDGPU / Gallium, and works with the rest of the Linux graphics community to implement Freesync / Trueaudio / whatever other tech AMD has buzzwords on into shared kernel code rather than trying to stick it in a HAL from their Windows driver.
The pessimistic view has AMD just fire or reassign a lot of its Linux staff, leaving its hardware on the platform to wilt. It would never stop entirely, AMD provides programming manuals for their hardware and most of their ASM in new platforms to enable almost anyone to program their GPUs (unlike Nvidia, who publishes nothing, requiring devs to reverse engineer their hardware and ASM) so the support would still be better than Nouveau.
It's worth pointing out that apart from the display code (which is what this thread is about), the diff between amdgpu and amdgpu-pro is very small. Most of amdgpu (and hence amdgpu-pro) is actually very specific to Linux and shares no code with Windows.
From that perspective, saying that "the optimistic future is that AMD drops AMDGPU Pro" is a bit silly, since it's largely the same code base as AMDGPU and the same people working on both.
It also makes no sense at all to say that "a lot of what this HAL does has had usable support [...] in various parts of Gallium / Mesa", since Gallium and Mesa are purely concerned with rendering and video. They don't care about display. In fact, you can actually use radeonsi and the rest of the open-source stack on top of the amdgpu-pro kernel module. (And for that matter, the closed source Vulkan driver is supposed to be compatible with an otherwise open source stack.)
Also, AMDGPU is not necessarily fine going forward, precisely because of this display code issue. Yes, the memory management and rendering/video engine parts are going to be just fine, but that won't do you a lot of good (outside of compute purposes) if you can't light up a display...
I wanted to comment on/respond to both sibling posts to mine (by pjmlp and witty_username), but you seem quite knowledgeable so I'll ask you in reference to them:
I have run desktop Linux across a dozen (maybe slightly more) machines over a decade, and friends will ask me for advice stepping into that world. On graphics drivers, my safest recommendation has always been:
- If AMD, use the open source version.
- If Nvidia, use proprietary.
- If Intel integrated, thank whatever god you believe in for Mesa.
What about Nvidia's GPUs, community relations or [insert other topic] hobbles their open source one so thoroughly compared to AMD? Or alternatively: Why is AMD's (presumably) deeper knowledge of their graphics hardware unable to be more stable than the open source equivalent when Nvidia's is?
AMD largely is working on the open-source driver, but they need the proprietary driver to continue to limp along for the sake of existing customers, until the open-source driver is regression-less.
SteamOS and SteamMachines will be delayed as well.
Nvidia support for Linux is hard to get as well. Now AMD support is also hard to get.
Hardware OEM companies make Windows drivers, and don't seem to care about Linux. This is the same thing that happened to IBM and OS/2 not good third party driver support.
There are open source drivers that work, but are not as fast as the proprietary drivers.
Linux needs better display drivers and for that OpenCL or Vulkan support as well. Windows uses dotnet and DirectX for games.
Windows games are generally not made in dotnet, and games which are made in dotnet (mostly Unity) are actually run on Mono. AMD's Mesa RadeonSI(free) driver frequently outpaces their fglrx(proprietary) driver on recent hardware. NVIDIA and AMD distribute good quality proprietary drivers for Linux, but given the pace the Linux kernel takes, and the benefits of integrating with the native interfaces of the kernel, AMD has decided to write good quality open source drivers as well as their proprietary ones. NVIDIA blocks community open source efforts by delaying the release of firmware images vital to enabling basic features of their GPUs. SteamOS is shipping fglrx for now, and fglrx doesn't currently rely on the DC kernel code. Most SteamMachines seem to be NVIDIA-based for now, and while it's inconvenient to ship their proprietary driver, it is in no danger of failing to work.
What are you talking about? The amdgpu driver has been in the kernel for over a year and has been working just fine ever since.
Source: My desktop PC has an R9 Nano, and I've used the amdgpu driver since Linux 4.5 (when it good power management support for my card and thus could enable usual GPU clock speeds).
While HAL does stand for Hardware Abstraction Layer in general, the page you linked is for a specific piece of software that was used by previous versions of KDE and Gnome for this purpose.
As far as I'm concerned, it would be great to have more sophisticated modesetting available on my new AMD hardware; but if it means a compromise in the DRI I won't stand for it. There's no sense in having a review process if a vendor can largely ignore a review and go silent for three quarters, then ask for whatever they have to be merged because they'd like it that way.
It seems people think only about games for now, but taking a maintenance perspective is also wise from the perspective of using GPUs for other purposes.
One of these is cloud computing on large clusters of headless machines using the parallelization that GPUs are known for. If you want to do this right you definitely need input from a lot of sources, not just hacks in AMD delivered code.
That's an interesting take... this is why Linux gaming sucks. I get the pragmatism but we also need to be realistic that they aren't going to maintain a completely separate driver for Linux when the Linux gaming market share is barely a rounding error.
For workplace, sure. But the only reason I don't run exclusively on Mac or Linux is gaming. I've spent a lot of money and time in that ecosystem for that single reason.
There's a lot of people that care about gaming, even if you don't.
This will hardly change, back on my hard GNU/Linux days I came to realise that the gamer/demoscene culture and UNIX culture are totally opposed to each other.
The gaming and demoscene cultures don't care 1 second how much their tools cost, the openess of hardware and software tooling, rather the achieved results and getting their stuff on the hands of users, regardless how.
The GNU/Linux culture is all about the ideology of having stuff for free, replicating a desktop experience as if CDE was the epitome of UX, fulled with xterms.
Of course I am generalising and might get tons of counter examples, just noting my personal experience regarding friends and co-workers.
> The GNU/Linux culture is all about the ideology of having stuff for free
Uh, no. It's about having the freedom to fix, improve, or otherwise modify the software you use. Being free-as-in-beer happens to be a requirement for that, but it isn't the goal. Think about it this way: free software developers get paid to do work, instead of getting paid for having done work like proprietary software developers. You pay me to implement feature X, which is then released to the world for further improvement in the future.
The point I'm trying to make is no one wants commercial desktop software on Linux. We want software we can fix and improve without requiring anyone's permission.
If the desktop APIs break all the time and there are lots of incompatibilities between distributions, it is really hard to sell commercial applications on GNU/Linux.
Lol KDE has many flaws but bloat isn't one of them, I get the feeling sometimes that the presence of several ultra-ultra-light "desktop environments" on Linux causes people to think the major ones are "bloated" when in fact they're much smaller than the DEs on Windows/OS X. FWIW "slow under certain circumstances" does not necessarily imply bloat b/c a single badly implemented feature can cause slowness but that's not the same as bloat; try enabling compositing in some old versions of Xfce (basically right after it was added, maybe Xfce 4.2?) and you'll see what I mean real quick.
It is a matter of taste, of course, but compared to Windows 8 and Windows 10, common Linux/Unix desktop environments like MATE or Xfce are very pleasant to use.
Reliable wifi and streaming video appeal to a significantly larger market than GPU-heavy video games. Having a good, easy-to-use image editor (competes with paint.NET and old versions of photoshop) and music player (competes with foobar2k et al/the surprisingly good performance of iTunes on os x) doesn't hurt.
("Do users do X often?" isn't the question; "do they get annoyed when they can't?" is the question, and hardcore gamers tend to have one computer for gaming and oftentimes other computers for other stuff; if they were even using Linux on those it'd be a paradigm shift)
A forked version of it because Google didn't want to deal with "No." from the Linux maintainers when they were massacring the kernel into mobile ready form.
The point is that distro patches are trivial compared to Android patches. Your distro will continue to work fine with a vanilla kernel unless it's using an esoteric configuration, and even with esoteric configurations it's relatively easy to include necessary patches to vanilla kernel.
Compiling and using a vanilla kernel instead of the distro one is straightforward and easy job for people with basic knowledge of source building. It's not a job for "very few brave souls". The reason many people use distro kernels is because they're good enough.
OTOH any consumer Android device requires millions of loc patching to a several years old version of Linux kernel just to boot.
AFAIK, vanilla kernel improved a lot in terms of functionality, so Android can work on vanilla kernel if code will be adapted to use these new APIs instead of Android homegrown APIs. Anyway, it's open source, so anybody can use Android patches without restriction. I see no problem here. Kernel version 4.4 is fresh enough for me: https://android.googlesource.com/kernel/common/+/android-4.4... .
BTW. I'm not sure if my distro (Fedora) will work with vanilla kernel without Redhat patches. There was times when it wasn't. I compiled kernel myself with my own patches and configuration in between 2001-2008.
> AFAIK, vanilla kernel improved a lot in terms of functionality, so Android can work on vanilla kernel if code will be adapted to use these new APIs instead of Android homegrown APIs. Anyway, it's open source, so anybody can use Android patches without restriction. I see no problem here. Kernel version 4.4 is fresh enough for me: https://android.googlesource.com/kernel/common/+/android-4.4.... .
What I'm trying to tell is it being possible doesn't mean it is practically possible.
Android specific API or subsystems are one part of it, then there is device specific patches. Kernel and patches being open source makes it possible to switch to a new kernel version but I've almost never seen that exercised. Few years ago the stats were that a typical consumer phone contains millions of line of patches on top of the selected upstream kernel version. It is not feasible to rebase a typical device to use a newer kernel version, and it really shows: I've seen only few phones that got a newer kernel version than it originally shipped with, switching from an ancient kernel version to only slightly newer but ancient kernel version.
> BTW. I'm not sure if my distro (Fedora) will work with vanilla kernel without Redhat patches. There was times when it wasn't. I compiled kernel myself with my own patches and configuration in between 2001-2008.
I'm pretty sure it'll. Linus himself uses Fedora and he likes his kernel pure vanilla :)
Isn't this true for desktop GNU/Linux too? I mean, last I checked, Debian kFreeBSD would be similar enough to Debian GNU/Linux as long as you don't drop to the console - the equivalent of which would be adb shell.
No, because you get full access to the OS APIs, so any application is exposed to the implementation specific behaviours of POSIX and OS specific syscalls and paths.
Where in Android, you just get Java and a tiny bit of C and C++.
Check the NDK documentation, Google provides a list of the set of APIs any NDK application is allowed to use.
Since many used to ignore that list, starting with Android 7, any app that uses unauthorised native libraries will get killed.
I'm not sure of numbers but AFAIR both kfreebsd and khurd have 80% of packages in the archive and even getting to this level requires Linux compatibility hacks like procfs support
> I get the pragmatism but we also need to be realistic
I think what you mean is "I get the idealism but you also need to be realistic." It's not pragmatic to stand your guns and ask a multi-million dollar company to change the code they submit to your open-source project.
It's not pragmatic to allow a specific vendor to dump 100k LOC in your project that you have to maintain indefinitely so their developers have an easier time writing for both Windows and Linux.
In my opinion the existing AMDGPU code isn't exactly spectacular as it is. They barely have comments or commit messages and there's a ton of duplication. Linux has issues with keeping driver contributions up to snuff as it is, without enormous vendor-specific HALs everywhere.
There's also nothing pragmatic in one-sidedly denying contributions without understanding the case that rewriting a whole business stack for a minor player makes no sense.
What I see here is (sadly, again) two groups of developers unwilling to meet halfway and understanding each others problems. Expecting AMD to support a completely separate driver just for Linux is unrealistic. Expecting a 100kLOC code dump do be accepted is unrealistic as well. I don't see anyone talking about how to get over this hurdle on lkml, I just see the single least constructive word: "No."
Meanwhile, 3D support in Linux will remain a crappy tire fire which works well only if you use a completely proprietary nVidia driver.
I've only got one true power as a maintainer, and that is to say No. The other option is I personally sit down and rewrite all the code in an acceptable manner, and merge that instead. But I've discovered I probably don't scale to that level, so
again it leaves me with just the one actual power.
This position is so self-defeating, I really don't know where to start.
1. There is absolutely value in rejecting bad, or even good but unmaintainable code from your codebase. How is this even an argument?
2. The 'devs' don't just all meet and then decide to blow each other off anyway, AMD is simply in a position with Steam where they want it to "just work" for most games at the lowest investment cost possible. They took a gamble and lost.
3. A updated proprietary driver is not ideal, but works better than making the OS worse. Again, not sure you can really disagree.
I feel like the LKML is the most hypocritical place I've ever seen. One one hand, we want that more people GNU/Linux which depends on the quality and performance and number of application programs that work on Linux, while on the other we seem to encourage developers to make closed source binary blobs because we keep on showing them that we aren't willing to accept code.
It is a lose-lose situation for a developer. If I write open source software, people will demand more and more from me. If I say, "screw this, I'm just gonna release a blob." they will ridicule me while ignoring the fact that I am releasing the blob only because they pushed me to. /rant
> It's not pragmatic to stand your guns and ask a multi-million dollar company to change the code they submit to your open-source project.
The maintainer explains the pragmatism explicitly:
> AMD can't threaten not to support new GPUs in upstream kernels without merging this, that is totally something you can do, and here's the thing Linux will survive, we'll piss off a bunch of people, but the Linux kernel will just keep on rolling forward, maybe at some point someone will get pissed about lacking upstream support for your HW and go write support and submit it, maybe they won't. The kernel is bigger than any of us and has standards about what is acceptable
Rejecting half-assed patches is pretty pragmatic, no matter who the author is. Maintaining standards is pragmatic because 'your open-source project' is the one that will be maintaining (refactoring/rewriting) the code in the future, not the muliti-million dollar company.
> I think what you mean is "I get the idealism but you also need to be realistic." It's not pragmatic to stand your guns and ask a multi-million dollar company to change the code they submit to your open-source project.
That has basically been Linus' Torvalds job for the last 20 years. People want to contribute to the Linux kernel to get support for the thing that they are interested in, but often the code that they are offering should not be accepted as-is, because it will make Linux as a whole that bit worse. See DBus for an example where clever people strongly put forward useful functionality, and got push-back. The end result was that they went back to the drawing board, and designed something better.
AIUI, the reason that the AMD and nVidia proprietary graphics drivers are a terrifying mass of hacks on top of hacks is trying to say yes to everything. Years later, the vendors can only move forward by setting fire to the whole lot.
Thing is, Linux was never intended to run the latest greatest desktop gaming hardware. It was intended to be an open architecture where people contribute from all around the world to make something larger then themselves.
All of these people pretty much contribute their free time to it.
If they can't make basic architectural decisions that improve the worst kinds of work (driver authorship and maintenance is awful drudgery) how can you expect them to feel any kind of ownership over their fate?
You're asking unpaid people to do the work people get paid for. Even worse, when this work just gets dumped on those unpaid people by people who are paid quite well.
Most work on Linux is paid[0]. However, asking paid people to do work that does not align with their employers interest nor the project maintainers, is on par with what you're asking.
>However, asking paid people to do work that does not align with their employers interest nor the project maintainers, is on par with what you're asking.
Oh but it is in their employers' interest. It's the price of admission for mainline. And if they want in on mainline, wether simply to harvest PR or to net a contract that demands a mainlined kernel; they have to pay it. Just another case of the well-known "cost of doing business".
In the case of AMD I believe they want to reap the benefits of mainline (that is, not having to support the breakage that comes with being out of tree) and to be able to compete better with Nvidia; since AMD is unlikely to ever develop an OpenGL implementation as good as theirs but Nvidia cannot or is unlikely to be able to open source their driver.
It might be different if AMD, Nvidia and Intel all agreed on a longer term HAL, but that isn't the case... and 100kloc is pretty huge for a single vendor.
They do, if that's what it takes to keep the qualities that make Linux different. If it turns itself into a shitty Windows clone, then one might as well use the original.
No, on the desktop, they absolutely do not. It may please a tiny subset of people who understand what the kernel is and read patch notes, but no one else.
It works fine for members of my family and extended family that have Ubuntu installs on Intel GPUs. Light gaming and more importantly, hardware accelerated YouTube works just fine.
I'd be willing to bet that most people really only want windows to appear quickly, scrolling in web pages to work well, and to watch videos online.
Lots of casual gaming has moved to mobile, and never left the consoles. Hardcore gaming -- not most people.
I used to think like you too. But spending around 3 to 4 hours a month (that's being very modest) trying to fix video tearing, hibernation issues, wifi issues etc. after every kernel release is a pain that I won't endure for long.
Wow, you got really unlucky. I'm on Debian Unstable yet I spend less than that per year. Are you sure you just didn't happen to hit upon a particularly badly supported hardware configuration?
In any case, why are you updating the kernel version every month?
Well, I use the mainline since I would like to start contributing sometime later. Also, I have a custom bootloader set up that managed to integrate nicely with Secure Boot, VMWare modules have to be recompiled with almost every kernel update since it breaks the Virtual device monitor and I had to write a script to automate that patching process.
Video tearing has been a constant problem if you use any type of compositer like Compton or the one that comes with XFCE or GNOME. I tried it on various systems and the tearing is there. A lot of people don't seem to mind though. For some reason, the Ubuntu maintainers don't think my hardware (or rather all laptops) should have the capability to hibernate to disk so they disable the /sys/disk (Im not sure I got the correct filename) which enables suspend to disk (this is one of the reasons why I need to use mainline anyway). PulseAudio doesn't play nice with DACs, ALSA is a pain to set up.
I really like Linux (so much that I keep 'ricing' my system) but these are the kinds of things that I'd rather not spend my time on.
That sounds like someone who has no idea what they are talking about. The open source drivers are a bit buggy. The binary drivers work just fine on Linux. Their performance in games is lower compared to Windows, but that is the case on OSX too.
Many people in this thread implicitly suggest that this will have a negative impact on end users. I'm not so sure, amdgpu still can be distributed separately to the vanilla kernel.
So this rejection is about maintainership that negatively affects distribution of the amdgpu module as a side effect. It's nothing that can't be solved by linux distributions though.
I'm sure there's at least a few who saw this coming, argued against writing a HAL that the kernel developers didn't want, and feel at least a little vindicated right now.
Can anyone explain what the acronym DC stands for in these emails? (managed to get most of the others, but googling DC kernel doesn't turn up much :) )
We propose to use the Display Core (DC) driver for display support on AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to avoid a flag day the plan is to only support uGPU initially and transition to older ASICs gradually.
The DC component has received extensive testing within AMD for DCE8, 10, and 11 GPUs and is being prepared for uGPU. Support should be better than amdgpu's current display support.
I commented a summary on the phoronix forums already, copypasting here. I'm the good cop in the good cop/bad coop game Dave&I have been playing in this. Maybe this helps clear up some of the confusion and anger here.
This is me talking with my community hat on (not my Intel maintainer hat), and with that hat on my overall goal is always to build a strong community so that in the future open source gfx wins everywhere, and everyone can have good drivers with source-code. Anyway:
- "Why not merge through staging?" Staging is a ghetto, separate from the main dri-devel discussions. We've merged a few drivers through staging, it's a pain, and if your goal is to build a strong cross-vendor community and foster good collaboration between different teams to share code and bugfixes and ideas then staging is fail. We've merged about 20 atomic modeset drivers in the past 2 years, non of them went through staging.
- "Typing code twice doesn't make sense, why do you reject this?" Agreed, but there's fundamentally two ways to share code in drivers. One is you add a masive HAL to abstract away the differences between all the places you want your driver to run in. The other is that you build a helper library that programs different parts of your hw, and then you have a (fairly minimal) OS-specific piece of glue that binds it together in a way that's best for each OS. Simplifying things of course here, but the big lesson in Linux device drivers (not just drm) is that HAL is pain, and the little bit of additional unshared code that the helper library code requires gives you massive benefits. Upstream doesn't ask AMD to not share code, it's only the specific code sharing design that DAL/DC implements which isn't good.
- "Why do you expect perfect code before merging?" We don't, I think compard to most other parts in the kernel DRM is rather lenient in accepting good enough code - we know that somewhat bad code today is much more useful than perfect code 2 years down the road, simply because in 2 years no one gives a shit about your outdated gpu any more. But the goal is always to make the community stronger, and like Dave explains in his follow up, merging code that hurts effective collaboration is likely an overall (for the community, not individual vendors) loss and not worth it.
- "Why not fix up post-merge?" Perfectly reasonable plan, and often what we do. See above for why we tend to except not-yet-perfect code rather often. But doing that only makes sense when thing will move forward soon&fast, and for better or worse the DAL team is hidden behind that massive abstraction layer. And I've seen a lot of these, and if there's not massive pressure to fix up th problem it tends to get postponed forever since demidlayering a driver or subsystem is very hard work. We have some midlayer/abstraction layer issues dating back from the first drm drivers 15 years ago in the drm core, and it took over 5 years to clean up that mess. For a grand total of about 10k lines of code. Merging DAL as-is pretty much guarantees it'll never get fixed until the driver is forked once more.
- "Why don't you just talk and reach some sort of agreement?" There's lots of talking going on, it's just that most of it happens in private because things are complicated, and it's never easy to do such big course correction with big projects like AMD's DAL/DC efforts.
- "Why do you open source hippies hate AMD so much?" We don't, everyone wants to get AMD on board with upstream and be able to treat open-source gfx drivers as a first class citizen within AMD (stuff like using it to validate and power-on hardware is what will make the difference between "Linux kinda runs" and "Linux runs as good or better than any other OS"). But doing things the open source way is completely different from how companies tend to do things traditinoally (note: just different, not better or worse!), and if you drag lots of engineers and teams and managers into upstream the learning experience tends to be painful for everyone and take years. We'll all get there eventually, but it's not going to happen in a few days. It's just unfortunate that things are a bit ugly while that's going on, but looking at any other company that tries to do large-scale open-source efforts, especially hw teams, it's the same story, e.g. see what IBM is trying to pull off with open power.
Hope that sheds some more light onto all this and calms everyone down ;-)
> Cleaning up that is not enough, abstracting kernel API like kmalloc or i2c, or similar, is a no go. If the current drm infrastructure does not suit your need then you need to work on improving it to suit your need. You can not develop a whole drm layer inside your code and expect to upstream it.
> Cleaning up that is not enough, abstracting kernel API like kmalloc or i2c, or similar, is a no go. If the current drm infrastructure does not suit your need then you need to work on improving it to suit your need. You can not develop a whole drm layer inside your code and expect to upstream it.
But abstracting all this is fitting the infrastructure to their needs (which is having a common driver infrastructure for many operating systems).
Sure, but then why not just continue with the binary blob?
Kernel development is largely an optimization process of the internal model of what a kernel should do, and for that developers need first-class raw data, the DRM maintainer wants to know how you'd like to change DRM. And though you can put that into a translation/abstraction layer, it's not helpful, because that doesn't scale, because the maintainer would have to look at every such layers and come up with a common one, and repeat this grueling task every time they want to move forard with DRM itself.
Well, there are ways to make a point and emphasizing an idea without resorting to swearing. The Linux kernel project has lost some maintainers because of this.
Now to be fair, one merit of Linus is that he is direct and says he doesn't like something, and not weasel language like "maybe later" or other time-wasting expressions that force people to guess what he is thinking.
Exactly what I'm wondering since they've been working on this for quite some time and asked for feedback a long time ago. How do you sit on this until they are this far along?
They've been getting this feedback - and apparently ignoring it - since February:
> Cleaning up that is not enough, abstracting kernel API like kmalloc or i2c,
or similar, is a no go. If the current drm infrastructure does not suit your
need then you need to work on improving it to suit your need. You can not
redevelop a whole drm layer inside your code and expect to upstream it.
> Linux device driver are about sharing infrastructure and trying to expose
it through common API to userspace.
> So i strongly suggest that you start thinking on how to change the drm API
to suit your need and to start discussions about those changes. If you need
them then they will likely be usefull to others down the road.
How does this relate in any way to this discussion? Do you think the kernel will be more user-friendly when the maintainers cannot understand their code anymore?
They were advised months ago that this could happen and they kept going with that. It might suck as a user, but you have to respect whichever standards the Linux kernel asks for.
a 100kloc hal shim for a single vendor is unacceptable. I don't blame them... If they'd presented this as a joint approach with nvidia and intel, maybe, but that is not the case... Why not make a HAL for Windows/osx and use a linux mode as the base?
In the end, it kind of sucks, but was the right thing to do.
I wonder how that proposal would go inside a corporation - "we need to rewrite our driver at huge development costs instead of further competing with nVidia because we need to make OS developers of marginal market importance happy".
One of the problems for AMD is that everybody uses nVidia for computing, and nobody uses AMD. One of the reasons is thaf they have been ignoring non-mainstream GPU market. Well, those non-mainstream markets become mainstream and if you fall behind it is almost impossible to catch up. I personally prefer AMD's GPUs for computing and prefer OpenCL to CUDA. But, they are minor, since nvidia has much better software offering. AMD absolutely need to offer superb Linux story if they want to get people to use their hardware for computing instead of nvidia. They need Linux more than Linux needs them.
AMD used to have much a much better hardware story for compute. I don't know how it looks now, but nVidia have absolutely stolen the market due, in part, to their excellent software -- even on Linux.
My only experience of compute on AMD was a FirePro v7900 -- an expensive, workstation-class card. With both the latest, and the 'workstation-class' Catalyst Linux drivers, my LuxMark tests came out very vast, but very red.
With nVidia, I can simply add a repo to my Ubuntu machine and have the latest stable drivers every time I do a dist-upgrade. If I want a solid, tested CUDA dev environment, I can install the CUDA repo and do likewise.
AMD have to make sure the end-user experience with these cards is as smooth as that, and that everything works.
I really hope AMDGPU-PRO is that experience. I haven't tried it yet, so I can't comment.
It's easy to dismiss enthusiasts / hobbyists / developers / gamers as a 'small market'. There was no 'pro gamer' market until ~10 years ago. Now there are entire companies built off the back of it. AMD cannot continue to leave a sour taste in end-users' mouths, otherwise there might not be any left soon.
Yes they can, they'll just have to commit resources to keeping up with kernel changes instead of having it done for them upstream. You can't have your cake and eat it as well.
If you want the kernel devs to maintain your code, it has to be maintainable. If not, you have to fix it yourself every time the kernel changes. That's the cost NVIDIA pays for their abomination.
They can still do it. If they don't have resources for further development, then just ship it out of kernel as a separate patch/driver for now. That way it will still be open source and will be included in Ubuntu, Archlinux, etc.
> Here's the thing, we want AMD to join the graphics community not hang out inside the company in silos. We need to enable FreeSync on Linux, go ask the community how would be best to do it, don't shove it inside the driver hidden in a special ioctl. Got some new HDMI features that are secret, talk to other ppl in the same position and work out a plan for moving forward. At the moment there is no engaging with the Linux stack because you aren't really using it, as long as you hide behind the abstraction there won't be much engagement, and neither side benefits, so why should we merge the code if nobody benefits?
> The platform problem/Windows mindset is scary and makes a lot of decisions for you, open source doesn't have those restrictions, and I don't accept drivers that try and push those development model problems into our codebase.