Hacker News new | past | comments | ask | show | jobs | submit login
AMD responds to Linux kernel maintainer's rejection of AMDGPU patch (freedesktop.org)
350 points by belltaco on Dec 9, 2016 | hide | past | favorite | 271 comments

I really don't understand why AMD cares so much about getting their driver upstreamed. If their code doesn't meet the kernel standards and they don't want to fix it, then just package it up as a kernel module and ship it like Nvidia does. Distributions will package it using DKMS, and other than some occasional troubleshooting they won't even notice. It's really no different from Windows in that respect.

His arguments seem really weak. Apparently the AMD team believes that the kernel should accept this code despite it not meeting their standards because:

* Otherwise they'll write angry blog posts about how mean the kernel maintainers are.

* They're a primarily Windows shop.

* They don't have the resources to do it right.

* They don't have the time to do it right.

* They're 'trying' to do the right thing.

* They believe that AMD is a big enough company to get special treatment.

* "But Andrrroid gets special treatment"

* There are lots of people with unreleased hardware that desperately need driver support.

* They're doing it wrong, but the current kernel maintainers are doing it wrong in a different way, so it should be okay.

* Graphics drivers are what's preventing the 'year of the Linux desktop'.

The claim that AMD, a company with gross profit in the $1 - 2 billion range, does not have the resources to support a dedicated development team that can do a good job on upstream-able code is just farcical. That's a question of corporate priorities, not resources, and gets precisely to the critique of corporate culture that was in question.

> There's only so much time in the day; we'd like to make our code perfect, but we also want to get it out to customers while the hw is still relevant.

And this has always been the damn problem with AMD's drivers, even in their wheelhouse, Windows. They just have a slipshod attitude toward the software end of their core business.

AMD doesn't have "gross profit in the $1 - 2 billion range". They have about $1B in revenue per quarter and have had negative profits (i.e. they lose money) for quite some time now. They were on the verge of bankruptcy about a year ago. Maybe Zen can turn their fortunes around, but they're correct in that they don't have the resources for a dedicated development team right now.

It does, in fact, report around the gross profit I specified. Why don't you look at the financials [1]. I think many people in this thread, including you, don't know what the term "gross profit" means.

[1] http://ir.amd.com/phoenix.zhtml?c=74093&p=irol-fundIncomeA

I had to look it up. And you're right, I didn't know what the accountants think gross profit means. The thing is, gross profit doesn't include operating expenses[0]: "rent, equipment, inventory costs, marketing, payroll, insurance and funds allocated toward research and development."[1] As you can see on the page you linked, for a company like AMD, their operating expenses are huge. And that's not even including expenses from taxes and debts. So the number we really want, and what I imagine most people are thinking you meant, is net income. Which, you'll notice, is sadly rather negative.

[0] http://www.investopedia.com/ask/answers/031015/what-differen...

[1] http://www.investopedia.com/terms/o/operating_expense.asp

Gross profit doesn't include things like R&D and sales; it's just revenue minus cost of goods sold. They have a negative net profit margin. They are hemorrhaging cash and have a lot of debt. They have negative shareholder equity. They are in a bad financial position. Therefore it's kind of commendable that they have someone trying to improve their Linux drivers. They can't afford to just throw more people at the problem because of their financial issues.
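To put numbers on the distinction, here's the basic income-statement arithmetic with made-up round figures (hypothetical, not AMD's actual numbers):

```python
# Income-statement arithmetic; all figures hypothetical (in $ millions),
# not AMD's actual numbers.
revenue = 4000                 # product sales
cogs = 2800                    # cost of goods sold (wafers, assembly, etc.)

gross_profit = revenue - cogs  # 1200: the "gross profit" figure being cited

opex = 1100                    # R&D, sales, marketing, payroll, rent
interest_and_taxes = 200

operating_income = gross_profit - opex              # 100
net_income = operating_income - interest_and_taxes  # -100: a net loss

print(gross_profit, operating_income, net_income)
```

So a company can post a billion-plus in gross profit and still lose money once operating expenses, interest, and taxes come out.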

But just because they have a seemingly sufficient gross margin doesn't mean they don't have significant unavoidable costs that can't neatly be attributed to cost of sales.

Writing the software that makes your hardware work is much more of a revenue-based issue than a profit-based issue. Even if you're losing money you need to balance your development budget, and not cut any segment too far.

Yes, AMD should definitely dedicate an entire team to focusing on writing drivers for desktop Linux. An OS that currently captures less than 1% of the overall desktop gaming market[1]. In other news, McDonald's should really start tailoring their marketing and in-store experience to people making over $10 million/year.

[1] https://en.wikipedia.org/wiki/Linux_gaming#Market_share_vs._...

But then why bother? Why even pretend to support Linux? The fact that the Linux team at AMD exists means that upper management must see some value in keeping them around.

Fear of missing out on market segment, as well as support for super-computer configurations. Without keeping a foot in the door for these segments, they risk fading away into obscurity like 3dfx did back in the days (for other reasons, however).

For perspective, 3dfx had stellar drivers for their cards, and never had issues with linux kernel maintainers refusing merges of their code for lack of quality.

> some value

I think you have answered your own question here :)

They are doing it to compete in the high-margin part of the market, which is workstations, cloud rendering, GPGPU, and console contracts. They care zero about the low-end Linux desktop. But what Nvidia learned is that the people who do all that high-end stuff would much prefer it to just work, and will steer business accordingly.

Isn't the high-end hardware becoming useless because of that very lack of drivers? Nobody wants to spend a lot of money on hardware and not get to use it optimally.

Driver support on platform -> Games released on platform -> Gamers using the platform.

Linux drivers on the nVidia side are OK. I've had excellent performance and no trouble with them for what must be 10 years now, if not more. Lack of driver support is not what's stopping people from releasing Linux games.

And they are closed source. AMD tried to go open source and upstream and found out that that is not viable, because they would need to duplicate a lot of work in order to avoid putting abstractions in the kernel.

The upshot is, kernel developers have been complaining about nVidia a lot for the route they took. When I install Linux on a reasonably recent desktop I still have to fiddle with boot options and/or blacklist the nouveau driver for some nVidia cards. And now nVidia's approach has been completely vindicated.

This. Exactly this. The treatment AMD got from LKML was slightly deserved, but in doing so they alienated a hardware partner and indirectly set a precedent for doing closed-source binary releases that use a HAL anyway. It sucks for consumers, and that's what drives the installed base of Linux.

> slightly deserved

ATi was legendary for their terrible drivers. The fglrx drivers took literally 5 years to not suck on first-generation radeon cards. By "not suck", I mean that the system would not have kernel panics at least once a day due to them. While they were piling on support for newer drivers in their proprietary stack, they were leaving triaged bugs open for years. When AMD bought them out, they didn't fire every developer they had and replace them with better talent, which is why we're seeing this LKML thread today.

I attempted to use ATi/AMD video cards in systems every time they'd make a big linux announcement, to attempt to support them for their decisions. Every time, I would have new linux converts telling me they were going back to windows due to how unstable their linux systems were, all thanks to these terrible bug-ridden drivers.

I and many other linux zealots now refuse to use ATi/AMD video cards to this day for that reason. It's clear that AMDGPU is only slightly improved from that situation, so I don't anticipate myself changing that opinion any time soon. The open radeon/radeonsi/radeonhd drivers are substantially more stable than the proprietary ones, adhere to kernel standards, and are now starting to reach feature/performance-parity with the proprietary drivers, with barely any help from ATi/AMD.

The Linux kernel will be just fine without ATi's terrible code infecting the tree. Eventually, the vendors start playing ball once they realize the kernel doesn't budge on code quality/standards, as the net80211 debacle proved. If they had just released all the specs without substantial NDA's in the way, ATi could rely on the volunteer kernel hackers to produce a driver that outperformed their windows equivalents, just like how Intel is currently benefitting.

You make some really valid points. I'm sorry I overlooked the part about past drivers because I haven't had the misfortune of experiencing them (from what you say I feel lucky for that).

I don't have an AMD card handy but could you tell me more about the open radeon drivers? And if we do have those, why did this discussion start in the first place?

Are you talking about the Intel firmware or the OpenGL mesa drivers? I would say it has improved but the one pain of tearing videos and panning windows makes me really sad.

The open radeon drivers were developed outside of ATi, using the (limited/sparse) documentation they were provided under extraordinarily strict NDA. As a result, we didn't have things like Z Buffering in the drivers until this year. But despite that, the drivers were in a much better shape.

This discussion (which if you read the dri-devel thread, ended with the ATi guys saying "we're sorry. we'll do better") is happening because AMD's marketing department dictates that they keep as much of the driver closed and developed in-house as possible. They're completely opening the kernel driver, but every layer above that is planned to be closed. There will be a free Mesa/Xorg/Wayland stack developed by volunteers in parallel with the proprietary drivers, for those that actually want to use their computer and not experience kernel panics (at a performance cost, due to not having full documentation of the card). Oh yeah, the proprietary drivers haven't even begun support for Wayland yet, and probably won't for another year or two.

This is a step up from before, mind you; AMD's previous setup under fglrx was a closed-source kernel driver with binary blobs, which would usually require a recompile every time you upgraded your kernel. In addition, because their software team was insane, they would take a snapshot once a year of the mainline kernel/X11, and build against that instead of merging to the latest tree. This means that all distros would have to pin their kernel/Xorg packages to a specific version (again, a year out of date), otherwise you just plain wouldn't have accelerated video. Arch (and other rolling distros) got so fed up with it that they stopped building fglrx in their repository, forcing people to use the open radeon/si/hd drivers. You had to do a rather immense amount of voodoo (add third-party repository, pin xorg/kernel to ancient versions included only in that repository, downgrade packages) just to get them working. If there's an exploit in the kernel or xorg, tough. If you want to have system uptime measured in days/weeks/months, tough. If you want Xinerama (good multi-monitor that plays nice with xrandr) support, tough.

In comparison, this new situation is a lot better. AMD is at least playing nice with the kernel developers, and the development team has fought enough with the marketing department to embrace a semi-open development model, which is where this dri-devel discussion comes in. We also got a pretty good inside view of the way that AMD develops drivers/hardware, and some amusing commentary by the dri-devel maintainers (including Intel employees) about how screwed-up AMD's internal culture is to produce these sorts of problems. For instance, their software team cannot communicate with the hardware team, because by the time the drivers are started, the hardware team has moved on to another card, and all the knowledge is pretty much lost/changed/irrelevant. This bears repeating: the hardware and drivers are developed separately and at different times.

The end result of the thread is that they're going to split this 100,000 lines of hardware abstraction into much smaller chunks and merge it piece by piece. Hopefully this means that AMDGPU will be worth using in Kernel 4.10 or 4.11, depending on how long it takes.

I'm still using intel's mesa/kernel drivers in my chromebook. Yes, tearing/panning is gross, but if I had to pick between Intel and AMD's code on my machine, it's a no-contest.

Thanks for taking the time out for such a super detailed reply.

> Linux drivers on the nVidia side are OK.

Except for the annoying tearing. I'm considering a RX480 for my next GPU thanks to it.

Their revenue is from Windows, Apple and consoles, not from Linux.

It's impossible to tell whether things would be better for them if their drivers sucked less. It's certainly a massive barrier to Linux being usable.

Maybe this is why their net is negative; because they don't allocate money to the right things.

In my experience supporting Linux desktop is hardly ever the right thing.

Example: Ubuntu cannot consistently ship a version of NetworkManager that supports reconnecting to a wifi network after suspend without being manually restarted.

I suspect that at a dollar value there is almost no point, but I don't have access to enough data to prove it.

Your example re wifi is not great, works fine for me. Dell hardware.

> Lennart Poettering's NetworkManager

Your experience is limited, and tainted. wpa-supplicant by itself can accomplish what you desire; NetworkManager (poorly) adds three or four extra layers of abstraction on top of that.

Meanwhile, Wicd and connman do exactly what you need, and don't constantly crash while doing it.

Blame ubuntu for following Red Hat's lead, and using their "solutions" to the problem.

That may have been true of their driver quality in the past, but it's really not so true now. On Windows, their constant work at improving drivers, leading up to the release of their Crimson driver software, has been evident over the last two years. I have not had an AMD driver issue since maybe late last year (when there was a persistent incompatibility for a long time between Oculus Rift and the drivers). I run a dual-GPU card (R9-295X2) so my setup isn't exactly vanilla either (admittedly though I only use a single monitor, multi-monitor seems to have been a persistent pain point for people).

I can understand AMD wanting to unify their codebase. I agree with the parent that they shouldn't worry so much about upstreaming into the kernel. I think their HAL approach is the right engineering one, given the constraints that their team is working under. The team seems to be trying to do the best with what they've been given.

Also, hasn't the graphics division been spun off into a separate company now (Radeon Technologies Group)? Or is it just a more focused internal division within AMD?

In The Thread: Windows users thinking improved windows performance somehow equates to improved linux driver quality.

Their paid developers are being outclassed by volunteers who write more solid drivers. Somehow this doesn't sink in.

> In The Thread: Windows users thinking improved windows performance somehow equates to improved linux driver quality.

Well, as far as I understood, that's exactly what the entire discussion is about: their HAL that would allow them to reuse large parts of their well optimized and tested Windows code and integrating that into the kernel.

> well-optimized

Later in the dri-devel thread, they even admit that DC isn't even being used in the Windows drivers, because by the time the software devs get to write drivers, the hardware devs they should be hooking up with have already moved to the latest/greatest thing, and can't be bothered with legacy.

AMD's drivers are far from "well-optimized", and their internal developer culture perpetuates this problem.

You've confused revenue with profit. They are not the same thing. AMD has been losing money for most of its existence and almost went bankrupt on a number of occasions. This renders your points completely invalid.

I have not. I used the term "gross profit," which is reported on their annual income statement exactly as I suggested. You can argue whether I should have talked about their net losses, but you can't claim I've confused anything. You don't need to operate at a net profit in order to be able to devote resources to a strategy that you value as a company, but you do need cash flow.


For those of us out of the loop, if AMD has been "losing money for most of its existence" and "almost went bankrupt on a number of occasions", then how is it still alive? Did it make a ridiculous amount of money at some points?

AMD has had ups and downs, but over the past 15 years they've had a net loss of $7 billion. Yes, they've had some good years too. They made over a billion dollars in 2000 alone, and they've been knocking around since 1969. And they've been written off for dead many, many times.

That being said, the current situation is bleak, and no, they never made a ridiculous amount of money. They've had to go to pretty extreme lengths not to go under already (eg, selling off their headquarters building, then leasing it back, just to scrape together some quick cash). They're very, very cash strapped. So much so that the news they signed a somewhat nebulous licensing deal with a Chinese company to help them make servers caused their share price to jump 52%, just because they'd be getting ~$300m in licensing fees.

Thanks! But I still don't understand how they're still alive?

How can you lose billions of dollars and still be in business? Are they borrowing the money from someone? Where is the money they don't have coming from?

They have raised cash by taking on investment from the Abu Dhabi Investment Authority (the sovereign wealth fund) and also selling off some major assets, most notably their fabs (GlobalFoundries) and even their HQ building. It is also worth noting that large accounting losses do not always correspond to large negative cash flows. Stuff like depreciation (though less of an issue now without the fabs) and writing off "goodwill" from e.g. their ATI acquisition cause large paper losses without actually adversely affecting their cash position.

Huh, not sure I understand all of it but good information nonetheless; thanks!

Let's say you sell $9 billion of shares, and build a massive fab with it. You believe it'll have a working life of about 15 years, before changing technology makes it worthless.

Each year you spend $100 million on salaries, rent, and materials, and you earn $500 million in sales, leaving you with $400 million in the bank at the end of year 1, $800 million after year 2, $1.6 billion after year 4, etc.

Pretty good, right? Not really.

You're cash flow positive to the tune of $400 million/year, but you're not profitable. You spent $9 billion on that fab; since it'll last for 15 years that means each year costs about $600 million. Or to put it another way, at the end of 15 years you'll have $6 billion in the bank, but you started with $9 billion. Turning $9 billion into $6 billion is the opposite of a profit. And since it's not enough to build a new fab, it's also the opposite of "having a functional business".

Another example might be selling off a profitable business for an injection of cash. The cash helps you pay salaries and keep the lights on, but if that's all you do with it you're now even less profitable than when you started. Or as in AMDs case, you could sell off your headquarters, then lease it back. You get a pile of cash initially, but you then have to pay it all back and more just to keep using your headquarters, and the increased costs will lower profits.

Similarly, if you can convince people to keep investing, you can keep running a loss but not run out of cash.

(All numbers utterly hypothetical. I'm also simplifying a lot.)
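The fab example above reduces to a few lines of arithmetic (same utterly hypothetical numbers):

```python
# The hypothetical fab from the comment above, as arithmetic ($ millions, all made up).
fab_cost = 9000
fab_life_years = 15
annual_depreciation = fab_cost / fab_life_years  # 600/year of the fab "used up"

revenue_per_year = 500
opex_per_year = 100                              # salaries, rent, materials
annual_cash_flow = revenue_per_year - opex_per_year  # +400: cash in bank grows

# Accounting profit subtracts the fab's depreciation, not just cash expenses:
annual_profit = annual_cash_flow - annual_depreciation  # -200: a loss every year

cash_after_15y = annual_cash_flow * fab_life_years  # 6000, vs. the 9000 you started with
print(annual_cash_flow, annual_profit, cash_after_15y)
```

Positive cash flow every single year, yet the business turned $9 billion into $6 billion over the fab's life.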

Essentially: accounting is an art. If you do not have profits, you pay less in taxes, for example. Paper "losses" are a thing. Money losses are another.

That's a very clueless way to answer their question.

Accounting has several layers:

- Cash flow: this is what you'd look at for your lemonade stand. Actual money comes in and goes out (either "cash cash" or you bank balance, both is "cash" in this regard)

But that layer isn't the most important one for incorporated companies. Yes, running out of cash is a problem. But what usually / actually happens is failure on the "value" level:

- Your company has a value of which cash is only one, usually small, part. Stuff you own, like buildings and patents and brands are another. So is debt your customers have with you. On this level, you can spend money without any effect on the value: If you buy a skyscraper in Manhattan, you may spend $2 billion in cash, but you get a $2 billion building in return. You can also increase the value ("make a profit") without actually getting any money: if you sell the skyscraper for $4 billion on December 20th, 2016, you've made a $2b profit in 2016, even though the money will only arrive in 2017.

The reasoning is that this system results in a more accurate picture of a company's finances.
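The skyscraper example can be sketched as a cash view vs. an accrual view (hypothetical figures, $ billions):

```python
# Cash vs. accrual view of the skyscraper example above ($ billions, hypothetical).
purchase_price = 2   # paid in cash: the balance sheet swaps $2B cash for a $2B building
sale_price = 4       # sale agreed December 20th, 2016; cash arrives in 2017

# Accrual accounting recognizes profit when the sale happens, not when cash moves:
profit_2016 = sale_price - purchase_price  # 2, booked in 2016

cash_received_2016 = 0           # no money actually shows up in 2016...
cash_received_2017 = sale_price  # ...it all lands in 2017

print(profit_2016, cash_received_2016, cash_received_2017)
```

Which is why a company's reported profit and its cash position can tell very different stories in any given year.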

Accounting is a con art.

Well, no; it doesn't really even impact whether you pay dividends/taxes or not, but it does impact when you pay them. It's not that difficult to shift things across financial years, although eventually you'll show the profit/loss accurately, cumulatively.

You can see this by looking at their balance sheet or cash flow statement. Easiest to see these on Yahoo Finance or Google Finance.

Looking at the quarterly data, as of Sept 2016, you can see in the Balance Sheet, there is a "Capital Surplus" line showing they have raised $8.2B of equity over the life of the Corp. Now look at the "Retained Earnings", they have lost $7.7B of it. They also have $1.6B in debt.
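Those three lines already tell most of the story. Roughly (ignoring other equity line items on the balance sheet):

```python
# Back-of-envelope from the balance-sheet lines quoted above ($ billions);
# other equity line items are ignored for simplicity.
capital_surplus = 8.2     # equity raised over the life of the corporation
retained_earnings = -7.7  # cumulative losses eating into that equity
debt = 1.6

# Nearly all the equity ever raised has been consumed by losses:
equity_remaining = capital_surplus + retained_earnings  # roughly 0.5

print(round(equity_remaining, 1), debt)
```

In other words, investors put in about $8.2B over the decades, and losses have burned through almost all of it, with $1.6B of debt on top.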

Intel infused them with money the last time they were about to go under.

AMD's existence prevents Intel from having to face prosecution for monopoly status.

Infused them with money? If you are talking about the legal settlement, then this is a very charitable description of what happened, from Intel's side. https://en.wikipedia.org/wiki/Advanced_Micro_Devices,_Inc._v....

Your link is broken; the correct one is below. More importantly, Intel wants AMD to stay alive and keep vaguely competing with them, since that lets Intel avoid significantly worse anti-trust rulings in the future (here is a current one[1]) due to there being no active x86-64 competitors.

Additionally, think about Google, Amazon, MS Azure and all the other big players who push large volume on Intel's higher end SKUs, what do you think they will do if Intel becomes their sole source vendor? I'd predict all 3 will take their current dabbling in ARM servers and amp it up, since being captured by a single vendor is a serious issue for all of them.

Correct Link: https://en.wikipedia.org/wiki/Advanced_Micro_Devices,_Inc._v....

1 - http://www.theverge.com/2014/6/12/5803442/intel-nearly-1-and...

Your correct link also broke. I didn't see why, but now I realized the period at the end is what's messing it up. You have to put it manually.

I don't remember where I read this, but as I understand it, Intel is required by various contracts (military, aerospace, various other mission-critical stuff) to not become a monopoly on certain classes of products. In other words, if AMD went under, Intel would have to give the x86 license to someone else.

Thanks! But what does "infused them with money" money mean? They wrote them a check for a few billion dollars? Would you mind explaining like I'm completely clueless (which I kind of am)? I literally don't know (or if I did, remember) anything about this so the missing details are not helping. Thanks!

Intel actually gave about $1.25B to AMD in 2009, in order to rid itself of an unfair competition and patent claim. This helped AMD to stay afloat. Without it, Intel would be in an effective monopoly position, and that would bring in all sorts of restrictions (due to competition laws, particularly in the US and EU) it doesn't want to have.


You can see a table of AMD quarterly net income back to 2004 here.


The use of gross profit here is severely misleading. AMD has famously been losing money for a long time - they made a loss of half a billion last year.

"they made a loss of half a billion last year"

Even that is misleading... More recently, AMD lost $406 million between June to Sept 2016.

The funny thing is, had they followed some of the subclassing practices already in the kernel, it sounds like they could have had cleaner, more easily testable code, and from what I can tell, a pretty unified interface anyway.

The reply below, however, is really an appalling response:

I realize you care about code quality and style, but do you care about stable functionality? Would you really merge a bunch of huge cleanups that would potentially break tons of stuff in subtle ways because coding style is that important? I'm done with that myself. I've merged too many half-baked cleanups and new features in the past and ended up spending way more time fixing them than I would have otherwise for relatively little gain. The hw is just too complicated these days. At some point people want support for the hw they have and they want it to work. If code trumps all, then why do we have staging?

Wut? If you want stable functionality, then you follow the kernel conventions, practices and coding styles.

This whole screed equates to: "We've got some code we've put together in our silo, you amateurs who want perfection are wrong and our unified layers and push to get our processors to market mean you can't comment on our code."

The bit about cleanups makes little sense to me, they seem to have some fundamental issues baked into their code that will necessitate these very cleanups...

I think the argument here is that stability is gained from having code shared as closely as possible with the Windows codebase, which is where most of the QA work happens, and what the real experts debugging their ASICs use.

From a hardware and driver development standpoint, I think that makes a decent amount of sense. Hardware is often weird and quirky, so making sure you handle all the edge cases in two different codebases is a lot harder than doing it once. Certainly too much work for a handful of people who just want to plug their code into Linux.

Stability here is on the driver/hardware interaction, not the kernel/driver boundary, which is what the kernel folks seem to care about, naturally.

Flipside being that if AMD's drivers are developed in the open with proper conventions, then kernel developers will help maintain them, alleviating cost for AMD. If AMD doesn't want the public maintaining their open source driver, then they don't want their driver in the kernel.

Stability for whom? Trying to contort Linux code to follow the practices that work well for Windows does not strike me as either maintainable, or even particularly stable!

What does this have to do with Windows vs Linux? It's about closely working with hardware engineers.

"I think the argument here is that stability is gained from having code shared as closely as possible with the Windows codebase".


You have to read the rest of the comment too. "HTH"

Indeed, I did, though you seem to have assumed I didn't. So, do you think that bunging code designed for Windows, which specifically uses a HAL, into Linux, which tries its best to avoid one, with the philosophy that code going into Linux should be as clear as possible with as small an abstraction layer as possible, is good for Linux, or for any of the hundreds of developers or companies relying on the maintainers to keep to this coding and architecture style?

I think not. So no, it didn't help.

You missed the bit where AMD chides kernel maintainer for attacking AMD's corporate culture rather than technical merits, and then goes on to chide maintainer culture in the same paragraph.

And it's not even true what AMD says - that to commit to the kernel you need to be a funky part-time hacker or an immense behemoth. There are tons of small hardware shops out there with drivers in the kernel, written by paid developers. True, a graphics card is much more complex than any other appliance, but the characterisation is extremely wrong - the most active part of the kernel is the drivers, and they're not written by "redhat + movie-style hackers", but largely by the companies that make the hardware.

> You missed the bit where AMD

I don't believe he is speaking on behalf of AMD. He is speaking for himself and trying very desperately to shift his blame onto anything he could think of at that time.

He clearly tried to blame the problems caused by his technical decisions on subjective and social reasons.

I don't believe that an AMD PR employee would be so stupid as to ever think of saying such nonsense, particularly in such an egregious manner.

The developer claims he has been working for AMD for 10 years, and yet he still is unable to write a driver, let alone get one accepted. Facts speak for themselves.

They don't have the resources to do it right.

I loved that line from his post, in particular. "We have the resources to design a billion-transistor GPU, but not to write a Linux driver for it."

Poor management on AMD's part does not constitute an emergency on the Linux kernel's part.

How is that poor management?

Linux gaming is a blip when you compare it to (a) the total gaming market, (b) the professional creative market and (c) the big data/deep learning market. All of which AMD is selling graphics cards into. And it's not like Linux is even some massive growth market that justifies some risky investment.

> the professional creative market

This is not true for the animation/CGI/effects industry where Linux not only looms large but is growing.

I'm also sceptical of Linux having a small share of big data/machine learning since this is adjacent to servers where Linux dominates- would you mind sharing your sources? Nvidia's CUDA is cross-platform, this sets a baseline for whatever response AMD is planning.

Also GPU-as-a-service seems to be a growth market, and Linux is usually the first choice in X-as-a-service.

That's what's frustrating to me. I have an AMD R9 290 and an Intel integrated GPU. I want to use them together for GPGPU stuff (simulating dynamical systems), but the driver situation has consistently been kind of a mess. I get it working and then all of a sudden things shift and I have to configure again... Reminds me of the good old days of configuring xorg.conf in 2004.

if you're still using fglrx, configuring xorg.conf is still a part of your life.

the sooner fglrx dies, the better off the planet will be.

GPU support means high-performance computing.

Today's 3rd spot on Top500 is occupied by a GPU-based supercomputer.

...which is equipped with NVidia cards.


Graphics cards are used for professional work too, not only gaming.

I recently bought an AMD card for the first time, because I was annoyed with nVidia jerk behaviour.

And... I am not sure AMD has the resources to design good GPUs either; the GPU I own, the 380X, is hot, power-hungry and buggy.

The RX480, launched recently to replace it, draws too much power from the PCI-E slot and can damage people's motherboards and riser cards/cables.

So I am not entirely sure they have the resources for their hardware division either; in fact, the recent card launches from them all looked "rushed" in some way, and undertested. Some bugs have been haunting their cards for 3 years now, and they don't even bother putting them in the "known issues" list anymore, because they have no idea how to fix them, despite having a 300+ page thread on their support forums about it with people contributing lots of information.

It's the labor theory of value in action. "We put in the labor, labor is intrinsically worth something, because we're giving you something that's worth something, it should be appreciated by you and by the wider community."

But that only works when you're in the programming equivalent of a 20 member socialist commune. It doesn't work at scale, and the Linux kernel is the effing king of large open-source projects. At scale, you must have standards to ensure the value added is higher than the cost of maintenance. Who's going to set the standards if not the kernel maintainers? The contributors? That's anarchy, leading to collapse. The wider community, in some democratic fashion? Easiest way to kill momentum and thus kill the project.

Or even better, give us an open source out-of-tree driver, and wait. Someone in the community will do the cleanup and get it upstream eventually. For free.

While that's probably true, it won't happen overnight, and I understand their position of wanting it merged before rolling out new hardware, or more generally, as soon as possible.

...fast, correct, cheap: pick two.

How come Intel can do this, but ATi/AMD can't?

> They don't have the resources to do it right.

Honestly, I hear this quite often from big companies, and it is always the biggest load of BS ever.

Correction: we have the resources. We just don't want to allocate them and would prefer that you did the work instead.

I would think that any company that's lost as much money for as long as AMD has has a right to say that they don't have the necessary resources.

The mere fact they are losing all this money means they DO have the resources, they just routinely waste them on stupid shit (seamicro/raptr etc).

> Correction: we have the resources. We just don't want to allocate them and would prefer that you did the work instead.

Perhaps it's just internal politics: the company did in fact allocate the resources, the people put in charge failed to do their job, and once their screwup went public they opted to shift the blame elsewhere.

It is hard to get resources to do things right sometimes, even in big companies. There are a lot of competing priorities.

>> It is hard to get resources to do things right sometimes, even in big companies. There are a lot of competing priorities.

AMD has the resources. The problem seems to be getting them allocated to this particular issue. I also find your phrasing interesting. Doing it "right" is always my top priority, and every compromise from that is considered and balanced. In my experience, not doing it right is almost always more costly, but sometimes necessary to appease someone. It's interesting that Dave is standing his ground against a multi-billion dollar company for their own long-term good - or at least what he perceives it to be.

AMD could likely allocate the resources, but what will they cut to do so? At this point they are a token competitor to Intel, and they are losing money and have been losing money for years. So how much of their financial runway are they willing to burn on a platform that doesn't use them for servers and minimally uses them for gaming (where the platform has no market share)?

As a full-time Debian user for the past few years, I get it. AMD's Linux devs heavily dislike the direct rendering manager and want to provide all the cool crap their proprietary driver does to Mesa, because DRM is legacy tech and they have over 1000 SKUs of cards to support, 2 competitors breathing down their neck on the GPU, CPU, and SoC sides, and they want to get this shipped.

I hope I've illuminated the state of the situation for you. I do think the right decision was made in not merging this code, but be realistic about AMD's position: they are so far in the hole financially that customers are having chips made to order after paying, meaning it's at least 3 months from the time of order until you get your chip. They spend nearly all of that time making the silicon, then packaging it so it can go onto your motherboard.

OMG, I didn't know they were making customers pay in advance. That's really bad. I do understand their position on the issue, but I also get the kernel guys' point. I have some faith that a compromise will be found, because it's in the best interest of both AMD and Linux.

Yeah, the outlook for AMD is poor, but their new ARM chips at least look appealing, with 2x 10GBe on the SOC at well below market price.

If only there was some way to use leverage to require them to assign resources to a project.

> the 'year of the Linux desktop'.

What was that, 2001? I can probably dig up an old Slashdot post and find it...

That's a lot of strawmen you managed to build here :/

That list really is sourced from Alex's email.

It's a really unfair reading of that email. Yes, some of them are true, but that post is no better than turning a random quote into a clickbait headline.

Coming from the FreeBSD perspective, I would freaking LOVE it if some drivers had a HAL (or, really, an OSAL). Instead, for the vendors that cannot allocate the resources to properly support FreeBSD, we have a Linux kernel compat shim that tries to translate from the Linux kernel APIs to FreeBSD ones. This is even harder to deal with than a HAL, because it makes the code that much harder to reason about.

One recent example is that I was debugging a rogue DMA issue in a driver that uses the Linux KPI shims. I wanted to use the Intel DMAR to try to catch the DMA, but because of the use of the Linux shims, the driver would not work at all with the DMAR. We had to improve the Linux KPI shims to do busdma, rather than just use pmap_kextract() to convert kernel virtual addresses to physical addresses (and this was a hack, because there is a gigantic impedance mismatch between dma_map_single and busdma). And, as soon as we had the driver working with the DMAR, we caught the rogue DMA.

Weeks of my time could have been saved if, instead of writing to Linux, the vendor driver had included a full hal that supported busdma. Instead, they wrote to the linux kpi.

And I fully blame the Linux kernel maintainers for this. They're on top of the world and can dictate to hardware vendors to remove their portability shims. Meanwhile, other OS projects get the dregs.

I'm sympathetic to BSD, but that seems like a lot of misplaced blame to me.

Companies don't write FreeBSD drivers because there's no ROI for the hardware companies. Making the drivers doesn't help them sell more hardware. Likewise, the LK devs made these decisions because their vision of the kernel doesn't involve HALs, not to spite FreeBSD.

I don't know if HALs in the kernel would have been the smart thing to do here (I'm not familiar enough with the problem to comment; generally, abstracting hardware access behind a HAL is a smart thing, but "generally" is a really bad snare trap).

On the other hand, I'd really love to see the Linux team getting the same treatment that Microsoft gets whenever they encourage lack of portability, even where portability would be irrelevant. A whole thread about this, and not once was the word "EEE" uttered.

I'm sure business had nothing to do with the Linux team's decision here, I'm just a little pissed at our double standards ("our" as in the open source/free software community of users and developers). AMD's criticism is not without valid points. Getting drivers to work (let alone in upstream) while the hardware is still relevant is difficult and requires a lot of maintenance, hacking and testing due to things like API changes, undocumented/shifting ad-hoc conventions and so on. Driver development on Linux is very much unlike what you expect with a Windows background; the sheer fact that they managed to convince a largely Windows-only shop to let them do it, with an eye to the future, is amazing.

I'd have expected to see questions like "why did these guys write the whole fsckin thing, all 100,000 lines of it, and only found out it's not upstreamable now". I've been hearing of AMD trying to get their Linux drivers in good shape for a long time now. "We don't do the thing that is most fundamental to your architecture" looks like the kind of problem that could have surfaced within, I don't know, two emails?

Edit: I do think that the LK maintainer was right not to merge this. What bothers me is that everyone's focusing on everything except the examination of the technical issues and what would benefit Linux users.

> I'd have expected to see questions like "why did these guys write the whole fsckin thing, all 100,000 lines of it, and only found out it's not upstreamable now"

AMD was told 6 months ago that it wouldn't be merged if they didn't follow certain guidelines. Then they didn't follow the guidelines.

What bothers me here, is the way AMD's management has handled the whole thing.

It seems like management demanded it be a certain way, and the coders were forced to build something they knew would be unmergable. And then management chucked a hissy fit.

There's been really good work here, and management has got upset, rather than follow guidelines, or nVidia's example.

It reflects really badly on the company, which is sad considering the space left for AMD by nVidia's Optimus kerfuffle.

There is a market for GPUs here, but it does need to show some professionalism, which they (management) haven't.

I agree that there is room for disagreement between the two approaches. However, I don't think that AMD did things this way just because of short-sighted management. A HAL-like layer is a solution that I've seen or heard of in a lot of places, from a lot of hardware manufacturers that want to support Linux.

It may seem -- and may well be -- a sub-optimal solution, but it's not worse than what we have now, and AMD looks willing to commit to the long-term support of the HAL and the drivers. This is likely something that they want to do not just because they're lazy and would rather spend the money on something else -- it's likely that their management genuinely sees the development and maintenance of an entirely unabstracted set of drivers for Linux as inefficient, especially when you look at how much money they make out of it. And they aren't entirely wrong.

Deucher's remark about the Red Hat silo may look malicious and abrasive, but it has a glimmer of truth. I could make a really cool photo album out of snapshots of the faces of developers and managers who are only familiar with Windows as they hear about the challenges involved in writing (and upstreaming) a non-trivial Linux driver.

I'm not saying that the driver should have been merged as it is just because there's no alternative. I do think, however, that it's a little presumptuous to think its architecture is the way it is just because managers are stupid. Maybe a third option, that's not HAL but also addresses the concerns and requirements of AMD exists.

You may be right but do you have any evidence? It's too convenient for developers to just blame managers when stupid things happen.

I agree with you. We really are a hypocritical bunch. This move may essentially alienate any new hardware maker from making open source drivers. They too may go the closed-source way, taped together with a HAL, and that's not good for the long run.

This essentially gives a nod to the way Nvidia's been treating Linux.

The kernel maintainers don't necessarily want your code.

Additional shit in mainline increases the maintenance burden. If the code isn't putting upstream maintainability above all else in its implementation, then it's generally not getting merged unless someone wasn't paying attention or some other forces compelled an exception.

Alex's first reply reads like he's willfully ignoring that aspect of the NACK. It's not a coding style issue, it's a HAL issue. Upstream isn't interested in maintaining a HAL and living with the impedance mismatch from day zero. The driver can stay out of tree. This is just about code getting into mainline.

I'm personally very happy to see the gpu subsystem maintainers paying attention and having the maturity to know a fool's errand when they see it.

Then again, we're not talking about some weird network card driver here.

We're talking about having a constant up-to-date driver on par with Windows for a major GPU card manufacturer. A driver which has caused a large amount of desktop users to return to Windows due to its historical issues.

While I get the reasoning for the rejection, the typical OSS rejection attitude is also problematic. I kinda don't see a dialog happening here on how to resolve the problems both sides have, just a repeat of the posturing which has historically brought us awesome things like binary blob drivers that only work with a single kernel version (yes, I'm looking at you, every ARM GPU ever!).

> We're talking about having a constant up-to-date driver on par with Windows for a major GPU card manufacturer

Nvidia manages to do this without any conflict with the kernel maintainers by simply not mainlining the code. If functionality trumps code quality, AMD can release kernel modules.

Perhaps the best current solution for all parties is for AMD to publish the sources on GitHub and release kernel modules, while taking PRs from the community to bring the code up to kernel standards. When the time comes, it can be duly mainlined.

Without any conflict, huh?


> Nvidia has been the single worst company we've ever dealt with.

> - Linus

Kernel devs are antagonizing the only two GPU makers that matter. Besides the kernel, there has been some flame going between NVIDIA and the Wayland devs too. Open source devs are immature men who do not understand the word compromise.

> Without any conflict, uh?

> Nvidia has been the single worst company we've ever dealt with.

> - Linus

This had nothing to do with the kernel and everything to do with the lack of Optimus support (years later, it's still shit).

> Kernel devs are antagonizing the only two GPU makers that matters

Maybe the 2 should ask Intel for some pointers on how to contribute to the kernel the right way.

> Open source devs are immature men who do not understand the word compromise.

I suspect that's part of the reason the kernel is stable, and for that I'm thankful. Not compromising on code quality is something I wish more projects would do, if they had the well-earned political/social capital the Linux kernel has.

> Maybe the 2 should ask Intel for some pointers on how to contribute to the kernel the right way.

Intel is not waiting for them to ask:


> This is something you need to fix, or it'll stay completely painful forever. It's hard work and takes years, but here at Intel we pulled it off. We can upstream everything from a _very_ early stage (can't tell you how early). And we have full marketing approval for that. If you watch the i915 commit stream you can see how our code is chasing updates from the hw engineers debugging things.

(More specific advice follows.)

Out of the three common GPU brands found in computers nowadays, Intel's integrated GPUs have mostly been a very pleasant experience for the end user when installing any GNU/Linux distribution. It just works, with no additional effort.

If AMD manages to reach that level of out-of-the-box working graphics driver, that will definitely reflect positively on their brand of graphics cards, and might give them an advantage over Nvidia.

I suspect that working with the kernel developers and maintainers will also be beneficial to the driver itself. It seems to me that by cooperating you gain valuable feedback on your code from people well-versed in kernel and driver code.

Sarah Sharp worked for Intel and apparently didn't manage to contribute to the kernel in a way that would make communication with Torvalds particularly pleasant.

I extend my sympathy to anyone paid to contribute code to Linux.

She never had any unpleasant communication herself, for what it's worth.

Most people who are paid to contribute to Linux, including myself, are very happy to do so, and no, it's not a case of Stockholm syndrome.

To me, Sharp came across as trying to score social points by being a "woman in tech" rather than being honestly interested in tech. Something I fear is going on a whole lot in recent years in the FOSS world.

I keep seeing already cash-strapped projects go off the rails because someone decides that they need a gender-oriented outreach program, complete with elaborate gatherings and whatnot.

And when that crashes and burns, their excuse for the cash bonfire is that the FOSS world is misogynistic...

AMD was told in February that they would have to obey kernel code standards / no HAL.

In late October they dropped a steaming pile of 100k lines of code with a HAL and are whining it isn't getting merged.

The solution is AMD keeps code out of mainline or obeys kernel code standards.

My favorite part is back in February they made mention of shrinking it to 1/3 of what it was (93KLOC).

Now we're up to 100K+


> A driver which has caused a large amount of desktop users to return to Windows due to its historical issues.

Who's to say its replacement will be any better? If the hygiene isn't there, maybe the quality isn't either.

> A driver which has caused a large amount of desktop users to return to Windows due to its historical issues.

Kernel developers do not care. They care about good software.

Define "good" and "software."

This got into negative territory, but I was being serious.

Define "good." Is quality a thing just defined by code style and properness, or is it defined as fitness for a purpose, connection to human use and usefulness, usability or function, or lack of defects to the end user?

Define "software." Is software just code, or does it also include the experience the code generates for the user? Is software only considered in terms of what developers of software are interested in, or does it include what users need and want?

Of course Linux kernel developers are only interested in narrow definitions of those things, and that's surely part of the problem. It's good to think about what "good software" really means, in my opinion.

Developers are also users, and a part of the application they use is the source code. Having good quality source code benefits all users, not just developers. Linux is where it is today because of the quality of the code. Things like "use and usefulness, usability or function" have different meanings to a developer looking at the source code.

This is why—effectively—use and application of Linux is limited to developers.

I do feel Dave went a bit over the top in his response. If he keeps it technical it's easier to fix, people want to see their stuff merged. When he makes it personal (which he kind of did) it turns people off.

That being said, I do think Dave probably made the right call, though he may need to compromise on this one. At the same time, AMD should strategically try to get their GPUs supported. Couple that with merging an OpenCL version of TensorFlow or something, and AMD would probably be more valuable than Nvidia. I think both groups could benefit, and they should probably sit down and try to work it out in person.

From his point of view the AMD guys were told that this code wouldn’t be merged as is six months ago and now they come back with a massive code drop and very effectively put him in the position of having to be the bad guy.

I think we’d all have some sharp words we’d like to use in that situation even if the better part of our natures might counsel us to keep them to ourselves.

The kernel essentially demanded they drop the idea of cross-platform driver (which is what makes nVidia drivers work so well on Linux and is keeping them in lockstep with Windows releases) and maintain a full separate copy with a small team.

It was as unrealistic six months ago as it is now. And yet, I still don't see a constructive debate from the Linux (or AMD) side on how to sync the goals. All I see is Linux people posturing about how they don't want to be the second platform and AMD refusing to rebuild the driver just for them.

These kinds of attitudes bring us awful situations like Android Linux devices each having a broken fork of the Linux kernel to accommodate closed proprietary blobs, because neither side is willing to take a step toward the other.

The goal of Linux governance is that all the video developers can understand and maintain all the in-kernel video drivers in order to evolve the internal API as needed. The goal of AMD is to reuse whatever the hell they had to write for Windows. Since Linus and friends are not in charge of Microsoft's driver API, these can't be reconciled.

From the AMD point of view, the Linux upstream's approach to modifying the internal API and drivers is a problem itself: "I understand forward progress on APIs, but frankly from my perspective, atomic has been a disaster for stability of both atomic and pre-atomic code. Every kernel cycle manages to break several drivers. What happened to figuring out how to do in right in a couple of drivers and then moving that to the core. We seem to have lost that in favor of starting in the core first. I feel like we constantly refactor the core to deal with that or that quirk or requirement of someone's hardware and then deal with tons of fallout."

Certainly, from what I've seen the first few stable releases after every new kernel version are often full of fixes for modesetting-related regressions.

The kernel demanded they drop the idea of a cross-platform driver to get merged into the kernel. They can still develop a standalone driver module, same as everybody else. Code doesn't need to be in kernel to run.

But then they don't get the free lunch of kernel people maintaining and fixing their stuff.

Which ends up in the same situation as older catalyst.

Looks like the free lunch isn't that free after all then. Tough luck.

They can still get the free lunch, but they have to deserve it.

Nvidia also doesn't merge into the kernel. "Nvidia does it" is not a good argument IMO.

> which is what makes nVidia drivers work so well on Linux

Maybe for you; every kernel upgrade I have done in the past would break my desktop for a day or two until a patch was released. Not to mention the oddities and artifacts I get from time to time. Things I would find unacceptable and embarrassing if they happened on Windows or Mac, and that I have had to learn to live with on Linux.

I am probably out of luck, but I think I would start to see if FreeBSD would be a good replacement for Linux.

Breaking on every kernel upgrade is a configuration issue (and a result of Linux's driver model); the "quality" of the NVIDIA Linux driver being discussed is an issue of the actual performance, compliance, stability, and feature parity/freshness relative to others, once configured correctly.

Reconfiguration/rebuild on every kernel update is definitely a hassle, and the NVIDIA scripts which try to automate the process still occasionally fail (IME, generally due to conflicts with the system package manager constantly deciding it knows best and trying to displace them with its own versions). But that's essentially an unavoidable consequence of combining out-of-tree drivers with Linux's extreme no-stable-driver-ABI-ever design, which requires all drivers to be rebuilt directly against the exact current kernel source: you always have to rebuild the bindings on kernel upgrade. The bindings can still be maintained by the distro vendor and its package manager, so they're centrally rebuilt (and fixed if needed) and updated in lockstep with the kernel. This works well in Ubuntu, at least, these days. But absent that, the Linux model fundamentally requires local rebuild automation (or full manual rebuilds), which is prone to breakage, like any other "just download and rebuild from scratch against a new version" you've ever done, kernel/driver or otherwise.

That's why I switched to the open source nVidia driver long ago. I got tired of upgrades screwing up the driver. It's not a configuration issue; it's because the other nVidia driver doesn't get rebuilt along with the kernel, due to not being upstream.

To be clear, were you using DKMS? It's not clear at all to me if that's what the OP is talking about using (he mentions scripts in the nvidia installer), but DKMS has worked quite well for me for a long time when I needed to keep out-of-tree drivers up to date.

Of course, if your distro tracks the latest kernel upgrades in a frequent fashion (e.g. ArchLinux) then it's more problematic, since API changes in the kernel will actually cause compilation failures. But actually making a rebuild happen? Easy. Debian derivatives tend to have stable kernel versions for years and years, so DKMS works quite well here for modules that are out of tree (another example is ZFS on Linux).
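For what it's worth, hooking an out-of-tree module into DKMS mostly comes down to dropping a small config file next to the module source; DKMS then rebuilds the module against the new headers on every kernel install. A minimal sketch (the `examplegpu` package name, version, and paths are made up for illustration, not any vendor's actual packaging):

```shell
# /usr/src/examplegpu-1.0/dkms.conf -- hypothetical out-of-tree GPU module
PACKAGE_NAME="examplegpu"
PACKAGE_VERSION="1.0"

# Module the build produces, and where to install it under /lib/modules/<kver>
BUILT_MODULE_NAME[0]="examplegpu"
DEST_MODULE_LOCATION[0]="/kernel/drivers/gpu"

# Build against whichever kernel source tree DKMS points us at
MAKE[0]="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build modules"
CLEAN="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build clean"

# Rebuild automatically whenever a new kernel is installed
AUTOINSTALL="yes"
```

After a one-time `dkms add -m examplegpu -v 1.0` and `dkms install -m examplegpu -v 1.0`, later kernel upgrades trigger the rebuild automatically, which is exactly the class of breakage being discussed here (assuming, as noted above, no in-tree API changes that make compilation fail).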

> The kernel essentially demanded they drop the idea of cross-platform driver

Don't you agree that it is very silly to try to force cross-platform code into the very source of a very specific platform?

After all, the people working for that platform need to actually maintain that code.

No, it works very well for nVidia.

NVidia does nothing of the sort. NVidia's driver is a kernel module they distribute themselves, and that NVidia users need to download and install themselves.

why does AMD need to be in the kernel? Why can't they release the way nVidia does?

or open source it, develop along with upstream but don't make it part of upstream.

You're asking a question with a very complex answer.

I'll start with a short answer - AMD wants to have at minimum a reasonably good baseline functionality on linux, out of the box. If that answer doesn't make much sense, please read on for the complex bits.

To provide some historical context, Linus gave a VERY public, VERY brutal rant on NVIDIA and their drivers four years ago.[0] At the time, NVIDIA's closed drivers were an opaque blob which basically reimplemented the entirety of OpenGL. These collided with everything else.

The open drivers (nouveau) were slow and far behind in features. The situation was bad enough that you couldn't necessarily even install a linux distro on a system with NVIDIA chip because the open drivers wouldn't work well enough for X and/or desktop environment to initialise properly, let alone remain up and functional.

In effect, if you wanted to run linux, you were best off without NVIDIA.

Fast forward a year and a half. NVIDIA had come out and committed to improving the linux driver situation. They still couldn't open source their current-generation drivers, but they had realised the horrible reputation of their hardware on linux was going to be a persistent PR nightmare, and thus an existential threat to their growing mobile division. A space where they were up against Imagination Technologies. (Don't get me started on ImgTec, please...)

NVIDIA needed to get their hardware and software processes aligned in a way that they would work reliably out of the box on linux, everywhere. Even if the user only needed to keep the freshly installed or upgraded system up long enough for them to download and install the closed drivers, the system really should not break. Incidentally this meant that even the open drivers should be "good enough".

Over the past 3+ years, the situation has improved. NVIDIA has managed to shed their reputation of being completely broken on linux, they have worked closely with kernel folks to get their more recent hardware supported sensibly out of the box and in the process (I believe) they have managed to reduce the code delta between their windows drivers and linux drivers.

It helps that during the same 4 years we have had OpenGL on mobile drive the separation of duties between EGL and GLES (v2+). These changes, with all the refactorings, have also provided a cleaner split on desktop - to the point that it is no longer absolutely necessary to provide a full OpenGL implementation. You can, for the most part, expect that EGL just works; that DRM and KMS both just work; and that your highly optimised GLES implementation can happily live on top of these layers.

As a result, your closed driver offering has less to override. Of course it's going to be less unstable!

Disclosure: in my previous job, I helped integrate a couple of EGL+GLES driver stacks with Wayland on mobile systems. I learned to hate mobile GPU drivers with a passion.

P.S.: I haven't read enough about Vulkan to know if it improves things or not.

0: https://www.youtube.com/watch?v=IVpOyKCNZYw

Although you don't say it directly, by saying that "The open drivers (nouveau) were slow and far behind in features." and then that the situation improved because of Nvidia's involvement, you are not being honest.

Nvidia hasn't improved performance or contributed features to nouveau.

Their changes are limited to code that is mobile chip-specific, and from time to time they contribute some reliability fix that affects non-mobile chips, but only if it benefits mobile.

Since Maxwell 2 (9xx+), their chips are designed to be hostile to nouveau, by requiring firmware loaded by the driver to be signed by Nvidia (the hardware refuses to load firmware that wasn't signed by them). It means that without Nvidia's blessing nouveau can't, for example, change the fan speed (but can still change clocks! How ridiculous is that?).

Nvidia contributed signed firmware loading for non-mobile chips (so-called SecureBoot), but only because it's also required by mobile. And they still have not released enough firmware for desktop cards to be usable...

> Although you don't say it directly but by saying that "The open drivers (nouveau) were slow and far behind in features." and then that the situation improved because of Nvidia involvement you are not being honest.

> Nvidia hasn't improved performance or contributed features to nouveau.

Fair point, I didn't realise it could be read that way. Thank you.

Nvidia has contributed enough fixes to make their hardware sort of respond out of the box. Yes, primarily for devices in the mobile space, and occasionally non-mobile when the same fixes happen to apply. I believe this is a direct result of the same hardware designs being used across the board.

I didn't mean NVIDIA were being particularly nice, although I didn't know about the active hostility against nouveau. (IIRC their driver employees are contractually prevented from contributing to nouveau, but I can't find a reference. At least there I can understand the reason.)

> In effect, if you wanted to run linux, you were best off without NVIDIA.

Depends on your choices. I guess if you throw Intel in there as an option, then maybe. But as of 4 years ago, if you were running a desktop -- Nvidia was pretty much the only acceptable choice if you wanted discrete graphics on Linux of any form, IMO, and driver quality was a massive part of why this is true.

Optimus, though -- that's what was really the debbie-downer for Linux/Nvidia, and Optimus is what inspired the question which led to Linus's "Fuck you" rant, because Optimus support was so bad, and is still bad.

But the mainstream cards have worked fine for a long time, and, at least for me and everyone I know -- were the only acceptable ones for Linux, until relatively recently, and most certainly as of ~5yrs ago. As opposed to AMD, where I don't think I ever heard of a single instance of fglrx ever being anything but a nightmare.

> why does AMD need to be in the kernel? Why can't they release the way nVidia does?

Perhaps it was strategically important to them to be upstream (marketing point, driver ubiquity)? It may not be the year of Linux on the Desktop, but GNOME and the like are well placed to grow in usage over the next 5 or so years. As others here have said, other than for average desktop users, I don't see this as being that important to more technical Linux users (including corporate).

If it is strategically important, then they should allocate enough resources to do it right.

> why does AMD need to be in the kernel?

Speaking as an ATI user, having the drivers in the kernel means that the GPU card just works right out of the box.

No need to mess with drivers. Just install a distro and you're good to go.

This is exactly how I feel. I don't use Linux, I'm an outsider, and the whole thing just makes no sense to me.

Dave's response: https://lists.freedesktop.org/archives/dri-devel/2016-Decemb...

"I brought up the AMD culture because either one of two things have happened here, a) you've lost sight of what upstream kernel code looks like, or b) people in AMD aren't listening to you, and if its the latter case then it is a direct result of the AMD culture, and so far I'm not willing to believe it's the former (except maybe CGS - still on the wall whether that was a good idea or a floodgate warning)."

And, Dave's absolutely key point:

"Code doesn't trump all, I'd have merged DAL if it did. Maintainability trumps all."

Just FYI, but Torch supports OpenCL - which I have played with since that "should" allow models to be worked with on either card rather than having a CUDA and an AMD version...

Or you can run CPU only - if you have an eternity to let your training loop run.

AMD's response is 25% talking about unrelated Exynos code, 35% sarcastic complaints about Linux culture and 20% explaining why it's naive to care about code quality and style. It concludes with a barely veiled threat about how they won't change and that not merging their code is what is keeping Linux off the desktop. I think we're done here.

> It concludes with a barely veiled threat about how they won't change and that not merging their code is what is keeping Linux off the desktop.

It seems to me that they made no threat. They simply endorsed NVidia's product line for anyone interested in running Linux on the desktop.

Gotta say this is my takeaway. If near latest kernels work with nvidia, I'll buy nvidia in the future (I'm all amd at the moment).

And yet AMD releases far more of its specification for its chipsets than Nvidia does (which releases next to none). This is the reason the AMD open source drivers are within 10~20% of the performance of the closed source drivers on a number of chipsets (not all), whereas on Nvidia chipsets the open source driver is often good only for booting a system and getting X running well enough to install the closed source driver.

> And yet AMD releases far more of its specification for its chipsets than Nvidia does (which releases next to none).

That is only relevant if any third-party has any interest in taking up the challenge of developing an alternative driver.

And the open source drivers that do exist (and are far better than the Nvidia open source drivers) are proof that said third parties exist. So the net takeaway is that if you are a pragmatic open sourcer then you should buy AMD hardware. And even better, support the open source coders who write these drivers: cash, bug reports, testing, documentation, etc.

On a tangent: what the fuck is up with the giant quotes at the top of these messages? They're technically fulfilling the letter of netiquette by not top-posting, but they serve absolutely no purpose.

If you aren't going to take the trouble to pull out short quotes to respond to, please just delete the message altogether. All our clients have excellent support for threading these days, and we can find the message you're responding to without needing it repeated in full every. single. time.

(This criticism also applies to the message this was responding to: https://news.ycombinator.com/item?id=13136426)

Some clients just fold the quote.

I also stand on the kernel maintainer's side.

The kernel is not AMD's kitchen sink :) The maintainers are always fighting against holes like this. They became maintainers because the community believes they have the necessary capabilities. The best way forward is to argue on technical grounds: for example, why you need this complexity, how you will keep it under control, and so on. A non-technical response makes no sense and only lowers your reputation in the community.

I didn't see where Dave slagged AMD's culture, and I think everyone was very polite and detailed about why this doesn't fit.

I understand why AMD wants to have one code base, but the kernel needs to have one consistent code base and style and can't afford to do otherwise. There's nothing stopping AMD distributing the code, it just won't be upstreamed.

Because rather than believing the AMD kernel devs are misguided about the kernel, he suspects the higher-ups at AMD are not listening to them. See https://news.ycombinator.com/item?id=13143371

That's probably far more accurate and much more fair than assuming they're just shit programmers.

This is yet another situation where I (as a user) am being held hostage [1].

The net result of attitudes like these is that I can't have a working multi-monitor desktop Linux setup, because everything is hopelessly broken and requires many hours of tinkering just to kind-of sometimes work.

I find it sad.

[1] Apple has been excelling at this recently, too, with the forced move to USB-C and dropping of the headphone jack — to "make progress" at my expense, where I am held hostage in the dongle-world as companies "move to new standards".

I understand and sympathize with your frustration. A month ago I threw out about 10 gallons of cables (it was a 10 gallon garbage bag) that I had collected over the years. I have a pile of drive enclosures with at least 4 different port connectors on the outside and at least 3 different interfaces connecting the drives to the ports. I have to use two different dongles to connect the 3 monitors I have to my two different machines, one of which I can't upgrade beyond OS X Leopard because it's PowerPC.

I take issue with the held hostage hyperbole, though. Yeah, you're inconvenienced by the OS/hardware options you have available. Your freedom of movement is not at stake. Your life is not at risk. I know it's fashionable to take poetic license and raise the stakes for everyday inconveniences, but let's keep things in perspective. Nothing's preventing you from using your old phone, or switching to another platform, or buying a couple of inexpensive dongles (very inexpensive relative to the hardware you've chosen to buy into).

Hm, some folks have installed Ubuntu, plugged in monitors, and it works for them.

KDE usually screws up when I plug a monitor into my notebook, but Xfce doesn't.

It's not a kernel issue.

KDE has a guy, he's singlehandedly awesome, yet of course a bit strange when it comes to interacting with mortals, who works on "these" things: https://blog.martin-graesslin.com/blog/2016/09/to-eglstream-...

And note that it's NVIDIA that's currently holding back the year of the multi-monitor Linux desktop, because they're not supporting Wayland like the others (Intel, AMD, Android).

Best I can tell, Xfce gets it right because, by virtue of being short-staffed, they trust X not to screw up. KDE and Gnome would rather see X get out of the way and let them do their "magic".

Sadly the latter people are now in control of "X" development, or perhaps I should say post-X development, thanks to Wayland.

"This is something you need to fix, or it'll stay completely painful forever. It's hard work and takes years, but here at Intel we pulled it off. We can upstream everything from a _very_ early stage (can't tell you how early). And we have full marketing approval for that. If you watch the i915 commit stream you can see how our code is chasing updates from the hw engineers debugging things. Daniel Vetter Software Engineer, Intel Corporation"

Intel got their sh*t together, why can't AMD?

Well, whatever Intel has been doing, they're just successful at merging their codebase. It doesn't mean the driver is stable though.

The KMS/DRI driver itself has certainly had _major_ unsolved issues since basically forever. The KMS driver for Broadwell graphics was trashing the screen with massive horizontal flicker until kernel 4.8. That's not long ago. It's still not solved, BTW, just less frequent. Random pipeline stalls happen on an hourly basis. Tons of graphical glitches and random performance issues with the "glamor" accel path. I've stopped reporting issues at their tracker, as they just get ignored.

The performance of the KMS driver is also inferior to their existing xorg driver in a number of very important scenarios (Xrender is particularly affected).

Sure, I do have vulkan drivers as first class, but my screen flickers, I get graphical corruption and the driver hangs the entire system with certain shaders. I can see inkscape repainting beneath my eyes like it's '84. Wow. And I'm trying that with 4.9 rc8.

I've been using laptops with integrated graphics for almost 10 years. The moment you can get the intel driver to work half-decently, they're already rewriting it. I wish I was kidding.

Intel does have the money. They're actually doing worse in my mind.

Money (developer time) and some deluded idea that drivers, of all things, can be made portable across kernels.

Did I miss the mud-slinging and corporate culture bashing from the original rejection? I can get the AMD dev being a bit sour by the whole situation if they personally spent time working on this rejected patch, but damn.

There were certainly implications that the AMD coders were making technically compromised decisions:

> The reason the toplevel maintainer (me) doesn't work for Intel or AMD or any vendors, is that I can say NO when your maintainers can't or won't say it.

And telling them to sit in a corner and really THINK about what you've done, young man:

> I'd like some serious introspection on your team's part on how you got into this situation and how even if I was feeling like merging this (which I'm not) how you'd actually deal with being part of the Linux kernel and not hiding in nicely framed orgchart silo behind a HAL. I honestly don't think the code is Linux worthy code

All pretty mild, really, but I'm not the one who got the email telling me my months of work was not Linux-quality and would not be merged.

(I am giggling at the idea of "Linux quality" being held up as an ideal)

> ... but I'm not the one who got the email telling me my months of work ... would not be merged.

See, the thing is that they were told previously -- back in February? March? -- that this wasn't gonna happen. Instead of trying to work out a better way forward, AMD effectively ignored that, went back to work, and just now came back and said "here's 90k LOC, please merge".

Dave replied -- rightfully, in my opinion -- "no".

Ahh, thanks for pointing those out. I completely skipped over those portions...I was expecting some Linus-level chewing out that I'd missed given the tone of the response, haha.

Certainly pretty mild considering the source, though I understand why the second quote would stir things up a bit, especially making it into an "Us vs Them" situation.

While I agree with Dave that the patch should not have been merged, I can understand the tone of the reply, not just from frustration but maybe even some actual panic about job security. Depending on how AMD's management takes this news, some of the people who wrote this patch may not have a job in a little while if the merge doesn't happen. I'm trying to be sympathetic.

Considering AMD is bleeding money, I assume it was pretty hard as well to get someone to sign off on getting headcount/resources to write Linux drivers.

I think the issue is more or less clarified behind the scenes. Have a look here from AMD's bridgman


And this is the thread that bridgman points to: https://lists.freedesktop.org/archives/dri-devel/2016-Decemb...

Much more constructive interactions, I think.

> Is all we care about android? I constantly hear the argument, if we don't do all of this android will do their own thing and then that will be the end. Right now we are all suffering and android barely even using this yet. If Linux will carry on without AMD contributing maybe Linux will carry on ok without bending over backwards for android.

Could someone here fill in the details for the uninitiated? Do the kernel devs feel a pressure for the kernel to stay relevant in the Android world?

I don't know the details either, but honestly, I'm amazed that Google hasn't just thrown Linux out and built a microkernel with a Linux-compatible syscall API. That way there's a single binary that only cares about standard ARM things like memory management and scheduling and doesn't require hardware vendors to tweak it for every new SoC. The SoC-specific drivers can be written and updated separately just like apps. Maybe then it would be easier for vendors to keep their shit updated so I don't have to throw out my perfectly good phone just because a company doesn't feel like porting their changes to a new kernel version.

They have, it's called Fuchsia. I think most of the arguments for monolithic kernels have weakened with age. You have L4 microkernels in every iPhone A7 processor now. The demands of security, crash resistance, and portability, I think, outweigh any small advantages in performance. We've got performance to waste these days; what we're lacking is more secure, crash-resilient software.

And it's also not GPL, which means that vendors will be able to create fully proprietary versions of it. Yay?

The kernel developers don't really care about the GPL either, sorry to say. They don't go after violators, nor does the Linux Foundation, it seems. Linus himself has said that going after violators is in bad taste and poisons the well, vs just trying to be buddy-buddy and get them involved. Because obviously they'll just move away from Linux or whatever if you're a dick to them (except in the case where Linux effectively subsidizes their existence, making their product even possible). So, don't be a dick, and they'll come to you when you buy them a beer or whatever.

Which is clearly the reason why there aren't actually 14,000 different Android kernels and 2,000,000 different kernels for your $50 router running around. Because they all get involved, of course, from being so buddy buddy.

Naturally, this attitude costs Linux developers nothing at all for the most part, and keeps their lives easy (no legal shit, no hard times) -- while absolutely hurting users who can't get the source to their devices, and completely eviscerating the social/political capital of a license like the GPL, and all the people who use it.

When literally the biggest GPL success story can't get off their ass and prosecute license violators, who actually will care when you try to use it as a tool, one which actually has teeth to back it up? Why use a license if its major champion treats it like a complete piece of trash, a worthless bargaining chip, a chip which is only possible because of an effectively unique, lightning-in-a-bottle position?

If the kernel developers just don't give a shit about proprietary vendor kernel forks (I really, really don't think they really do, at least nobody with actual meaningful, large scale influence cares at all), and want to force involvement by "getting them in the cycle" and being buddy-buddy and just not-giving-a-fuck about people outside the source tree, making sure constant churn is how people have to keep up -- they should just use the BSD license.

It seems to have worked out pretty OK for LLVM, and this is basically their operating philosophy, too. At least then, maybe another project can arise that actually takes its own license terms halfway seriously...

Linus made the right engineering choice when he took the monolithic route in the 90's. However, the technical advantages of that architecture are now moot. Sooner-or-later, Google is going to get tired of Linux's shortcomings and build a replacement OS.

They also have some folks working on Akaros. http://akaros.cs.berkeley.edu

Good luck getting the drivers working for all the esoteric USB/Bluetooth devices in the wild. IMO, Linux's greatest advantage is not its technical architecture, but the breadth and depth of people and organizations working on it.

https://lwn.net/Articles/446297/ - 2011 https://lwn.net/Articles/481661/ - 2012 https://lwn.net/Articles/602521/ - 2014

Android is important, because it's successful, despite how crude it is. And the Kernel maintainers - basically as a modern-day abstracted version of self-preservation - want to see the mainline kernel in/on Android devices, and so they are thinking about __why__ Android had to fork. (Had to? Or were they just that stubborn and time-limited?)

Discussion of the rejection here.


I admire AMD for even trying to work with Linux on a driver solution for the kernel. I am worried about this development, because Nvidia fucked me over with utterly crappy hardware in the past, right as AMD started making decent hardware. I use three different kinds of operating systems at the same time, and Linux is in last place for stability and usability. There's always something broken in Linux, not working, or patched into something worse; it's a mess. I'm not trying to defend Windows or something, it's been rough for Windows too. MacOS leads the way. But Linux has existed as long as those two have, and it's a huge pile of patchwork. Maybe it's supposed to be that way. But then I understand why it never had a year of the Linux desktop.

For the record... AMD are perfectly welcome to publish their patches and try to convince Ubuntu/Red Hat/etc to include them in their own forks of the kernel (no distro uses a completely vanilla kernel, they all have their own patchsets they apply on top of it for various reasons). The only great loss here is that AMD can't force kernel developers to maintain substandard code, and will have to do it themselves.

Which they generally fail at due to lack of manpower and pace of changes in the kernel.

Alex, from AMD, has now posted a response saying he over-reacted https://lists.freedesktop.org/archives/dri-devel/2016-Decemb...

Everyone should take a deep breath and read this - it wasn't even a merge request originally, just an RFC.

I don't know about AMD drivers, but Nvidia's Linux drivers are so iffy. For one thing, I often notice some flickering of one of the windows that I have on the desktop (it's totally random and it usually involves some window in the background). But the biggest issue is that I can't suspend my PC, because once it wakes up, the GFX driver just goes nuts, everything starts to stutter, every window's corrupted, and the only option is to hit reset. Drives me nuts :(

I'm glad that AMD is still not giving up on Linux!

I think this is a great situation to quote Linus on Linux supporters inside companies (not 100% same, but I think related):

(on litigation against companies that contribute)

> Anyone in the company that pushed to use Linux is now seen as "wrong" and instantly is pissed off that external people just messed up their employment future.

> - Anyone in the company that resisted the use of Linux (possibly in ways that caused the code not to be released over the objection of the previously mentioned people) are vindicated in their opinion of those "hippy"[2] programmers who develop Linux.


> Now, even if, after many years of work on your part, you do get that code, what is the end result? You have made an enemy for life from the people who brought Linux into the company, you have pissed off the people who didn't like Linux in the first place as they were right and yet you "defeated" them. Both of those groups of people will now work together to never use Linux again as they don't want to go through that hell again, no matter what.

(source and context: https://lists.linuxfoundation.org/pipermail/ksummit-discuss/... )


I think this is the worst part of this SNAFU - Dave has just proven every naysayer in AMD right that Linux is not worth supporting. He cut the feet out from under the team that has supported and fought for equivalent open source driver support on Linux, released in lockstep with Windows releases. He has also proven right everyone at nVidia who has argued against open sourcing their drivers and for keeping them as a horrendous binary blob. He has also pissed off his greatest ally inside AMD.

All with a single "no". Not "We can't accept this, let's talk about how to make us both happy". With that he gave new ammunition to every manager at every large hardware corporation that's fighting against open sourcing their drivers, and made every Linux-supporting team lead in such a corporation less likely to push the open source world forward. It seems we're stuck with shitty Android kernel forks and shitty GPU binary blobs in the near future, with only Windows as a proper contender for good 3D performance.

EDIT: To be clear, I'm not blaming Dave for refusing the patch on technical grounds. I AM blaming him for refusing it so flatly and not actively working more with AMD to get the situation fixed. This is not a minor thing - having stable AMD drivers in the kernel would really push the Linux desktop forward, make sure Linux is compatible with several Macs among other machines, and put pressure on nVidia to open source theirs. But you don't get there by belittling contributors and cultivating an "us vs. them" mentality.

All with a single "no". Not "We can't accept this, let's talk about how to make us both happy". With that he gave new ammunition to every manager at every large hardware corporation [...] I'm not blaming Dave for refusing the patch on technical grounds. I AM blaming him for refusing it so flatly and not actively working more with AMD to get the situation fixed.

He's been giving the "let's talk about how to make us both happy" answers for months and months. That's not a single "no." That's not flat. What that is, is a lot of design review and "soft nos." (I do still agree that it's ammunition. People who are looking for a reason not to contribute to the kernel will have no problem using this email out of context either.)

Other than eventually "taking one for the team," caving, and merging what AMD wants, what level of support are you talking about here? As far as I can tell, your post takes the stance that a "hard no" to any patchset is going too far.

> Not "We can't accept this, let's talk about how to make us both happy".

He gave them that answer 6 months ago, and they ignored it. This was not an abrupt rejection out of nowhere.

The "answer" 6 months ago was "Yeah, you'll have to write a full Linux driver yourself and maintain it." type deal.

It was as unreasonable then as it is now.

If I tell you "You need to rewrite your product in Brainfuck or I'll kick you out in 6 months.", the fact that I told you that doesn't make it any less insane.

AMD is trying to merge a Windows driver with enough shims to make it sort of interact with Linux. That might as well be written in Brainfuck from the point of view of a Linux kernel hacker trying to figure out WTF it does.

To merge a driver into the kernel is to accept responsibility for maintaining it, which they recognize they are not prepared or even interested in doing.

They've made it clear that this code doesn't come from the Windows driver, it was specifically written with the intention of being usable on any platform. Also, they seem quite willing to maintain it. The problem is that the Linux developers aren't willing to merge code that's usable on anything other than Linux.

Thanks for the correction. They aren't relying upon a Windows API that Linux kernel hackers have no experience with, but instead a new API that nobody has experience with. That's still a huge problem for the maintenance that in-kernel drivers need; they can't have a driver which only one company understands.

This was pitched by AMD as "Yeah, our new proprietary driver can use the same kernel stub as the libre driver" with minor modifications a year ago, and instead what was delivered was a giant HAL that only AMD can maintain. I get why it's not being mainlined; this is AMD's screwup.

> I can respect your technical points and if you kept it to that, I'd be fine with it and we could have a technical discussion starting there.

Why is that so unreasonable?

By the look of it, he's trying to be reasonable now; after half a year of hope that the code will just get checked in because it's coming from AMD, regardless of quality.

If I recall correctly, Google tried the same thing some time ago and it didn't go well for them either.

That argument is too broad, because it could be used against /any/ effort to get code into the kernel by a large corp.

Linux is no longer in the weak position it once was. It has won the war for server market share (at least to a degree where using it has nothing to do with being a "hippy" programmer), and it has lost the desktop war so thoroughly that at this point it really doesn't matter anymore.

In this position, Linux can live without AMD drivers, probably more so than AMD can.

A lot of people are unhappy with Windows 10 and would love to switch.

Now if Linux only wants to be a server OS then that is fine, but if it wants to try to grow then it does need either AMD or Nvidia. I think that for now Nvidia will continue to do what they have always done (binary blob), but if AMD gains market share by being more open then Nvidia will respond in turn.

GNU/Linux will never be anything more than a server or embedded device OS.

Anyone that is serious about graphics programming is on Windows and Mac.

I learned the hard way that FOSS religion and the graphics programming industry don't mix (lost a few job opportunities due to that).

As for the movie industry, they basically have heavily customised GNU/Linux workstations, using the workflows they ported from SGI workstations into GNU/Linux.

This is nonsense, a NACK is nowhere near the table-flip you're portraying here.

This actually feels kind of good to me as someone who's trying to get his code into the kernel. Big guys are not getting any special treatment. It feels fair.

It's very simple, Linux won't bend the rules for anyone, and that includes AMD.

The sooner they realize that, the faster they can actually work together with Linux kernel maintainers.

Do these drivers have any relevance to OpenCL etc? It seems that could be a bigger market for AMD on Linux than graphics/gaming.

The GPU driver that's required for both gaming and OpenCL is already in the kernel. The current issue, if I understand correctly, is related to display management stuff.

Seems kind of silly that the kernel should be nanny to all drivers. Drivers, to me, should live in their own repos and provide proper integration tests before they are blessed. Some sort of --externDrivers="foo2016" flag you pass to the build. No drama over how horrid their code is, because the kernel shouldn't care.

It's not that easy though. The kernel doesn't have a stable internal API, which means that it's nearly impossible to maintain an out-of-tree driver. The kernel operates on a classic google-style monorepo model (well, maybe google-style isn't a good term, after all, they certainly didn't invent it...) because when someone changes a kernel API they are obligated to ensure the change is propagated to all related code. You get maintenance basically for free in the kernel, but only if you build an in-tree driver.

FWIW, a proprietary "binary blob" driver built out-of-tree has already been available for a while, but it constantly has to be updated for the latest kernel and doesn't release a version for every single kernel, so it's very difficult to use unless you either 1) use the exact kernel version the driver requires (which is ridiculous; you should be able to dictate which kernel version you want) or 2) use a well-known and well-supported repo (which limits your choice as well). Ultimately, the best way to get driver support into the kernel is to go through the standard kernel channels.
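For what it's worth, this rebuild-per-kernel dance is exactly what DKMS automates for out-of-tree modules. A minimal sketch of a dkms.conf, with a hypothetical module name and version:

```shell
# /usr/src/examplegpu-1.0/dkms.conf -- hypothetical out-of-tree GPU module
PACKAGE_NAME="examplegpu"
PACKAGE_VERSION="1.0"
BUILT_MODULE_NAME[0]="examplegpu"
DEST_MODULE_LOCATION[0]="/kernel/drivers/gpu/drm"
# Rebuild automatically whenever a new kernel (and its headers) is installed
AUTOINSTALL="yes"
```

With that in place, something like `dkms add -m examplegpu -v 1.0` followed by `dkms install -m examplegpu -v 1.0` builds the module against the currently installed headers. Note that DKMS only re-runs the build: if an internal kernel API changed underneath the driver, the build still breaks, which is exactly the maintenance burden described above.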

And also FWIW, the kernel does care about how drivers are written; even if a driver is out-of-tree, its possible shittiness reflects back (totally unfairly, I know) on the kernel. By having a gatekeeper for device drivers, they can ensure that the kernel is as stable as possible for as many users as possible, and that's a laudable goal. It's also orthogonal to wide hardware support: all that was required in this situation was for AMD to adhere to the guidelines, but they didn't, so they have no right to be upset that their code got rejected.

Would love to see FreeBSD take their code.

https://github.com/FreeBSDDesktop/freebsd-base-graphics just takes whatever crap is in Linux. Very few people are there to maintain all that, so the focus is on reducing the diff with Linux and improving stability on Intel GPUs. Maintaining an additional patch written against Linux doesn't make sense.

This (and the replies on this thread from Linux supporters) shows how far away Linux is from being an acceptable alternative to Windows on the desktop. Instead of welcoming such a big hardware manufacturer with open arms and trying to reach a compromise (and understanding that a company is not going to devote infinite resources to something that's not nearly as profitable as Windows, because, you know, companies are created to make money, not lose it - they aren't not-for-profit orgs), the kernel maintainers choose to focus on purity and style crap. Well, enjoy your pure code and your 2% desktop market share. The next time you recommend Linux to someone, when he finds out his graphics card doesn't work well you can put the blame on the hardware maker, as always.

I absolutely love that the maintainers can stick up to any company that tries to add shit code to the kernel.

There is no excuse for AMD. The response reads like a total whine to me.

It's not the kernel maintainers' fault that your company does not give you proper resources or approach Linux the correct way.

If you had proper opensource drivers I'm sure the "in" crowd hackers you speak of would take it from there...

I've personally had horrible experiences with AMD drivers on both Windows and Linux. I had a GPU that worked well for my purposes, even ran games well enough for me, but you dropped driver support for it, so I was left with no option but to buy another card...

This was all a big ado over nothing. John Bridgman and two other AMD devs have backed away from Alex's initial flame-y post, and are trying to de-silo their codebase, break up the commit into smaller patches, and move forward (despite their marketing dept. getting in the way of fully-upstreaming beta cards the way Intel does).

AMD has been good to Linux, and the Linux community has always supported AMD. Its founding principle of creating an x86 clone is similar in spirit to what Linux is to other OSes. There will always be some ups and downs, but AMD has never strayed too far from Linux.

I am not familiar at all with this stuff, but honestly, people should be working on a way to isolate external code in the kernel at runtime somehow. I wouldn't want to be in either position: I don't want to maintain somebody else's shitty code or get bugs in my own code from it, but I also wouldn't want to be in AMD's position and have to adhere to standards or rewrite code that I'm absolutely fine with as it is. There is a lack of project vision if people reject code that would otherwise lead to great "commercial" aka user-adoption success. The maintainers are understandably reluctant to accept new code, especially if it doesn't even try to adhere to the coding standards. He even acknowledged the political situation, but it wasn't even his job to care about that. There needs to be somebody above him whose job it is to make him accept that code or figure out a better solution.

The problem isn't runtime isolation. The problem is that if you merge a Windows driver into the Linux kernel, the Linux hackers won't be able to make effective and safe changes to it as the rest of the kernel evolves, because it's nearly unreadable to them.

An unmaintainable mess with lots of users is not a victory unless users are paying you for shitty work.

Runtime isolation would absolutely solve that problem; it would safely shift the blame to AMD if the thing becomes broken. Without it, the kernel devs now have to maintain more and more code that they probably don't know anything about. I don't see how that can possibly be a good solution. I don't see how anyone can even argue that. Why should a 3rd-party graphics driver NOT be a plugin instead of core code? It's stupidly obvious to make this an isolated plugin.

The whole point of putting drivers in the kernel tree is that they get properly maintained as part of kernel development. All the kernel hackers are responsible for keeping all the in-kernel drivers working. If it's at all acceptable for a kernel change to break a driver with no fix, that driver doesn't belong in the kernel tree.

The whole point of plugins is that they don't need to be maintained as part of the core product. Seeing AMD's response, it's obvious that they don't expect Linux kernel devs to maintain this thing. They should offer a proper and easy plugin interface for the kernel where devs can make drivers for it without having to merge code into the kernel itself. This really seems too obvious; at some point the kernel will have too much code, and will have to support too many different pieces of new hardware, to be understood or maintained by anybody. I'm sure Windows doesn't merge 3rd-party graphics driver code into their subversion repo, that would be insane. But just because Linux is open source, it has to do that... no, of course not.

They do already have clear interfaces that do this. Some modules have less clear interfaces, but if you followed what they were saying they actually said that it would have been easier if they had subclassed some of the code and followed the way that most folks were writing atomic code.

And there was a function with the name "validate" that didn't, well, validate. In a commit this large, that rang alarm bells.

An email from one of the Intel devs clarified that the validation was actually happening in the correct place, it just was hard to see that on first reading because the code was too foreign:

> And by following that pattern (and again you can store whatever you want in your own private dc_surface_state) it makes it really easy for others to quickly check a few things in your driver, and I wouldn't have made the mistake of not realizing that you do validate the state in atomic_check.


> And there was a function with the name "validate" that didn't, well, validate

So what? You still don't seem to grasp the concept of plugins. Plugin = the 3rd-party developer can do whatever he wants and it doesn't hurt the core product.

You say "so what", but that is the so what. A core Linux developer saw a massive commit come through from AMD and couldn't understand it easily.

The point that has been made over and over in these threads is that if you want to develop Linux code then you can't just stick a development team to work in complete isolation from everyone else in the Linux development community and expect to be able to submit grand unifying architectures you designed to make things easier for your company but that make them harder for everyone else.

If you want to do this, then you really need to work within this particular community to effect change. For instance, there apparently are some standard idioms that have emerged from within the atomic code. The way AMD have done things is different enough to confuse the core maintainer, and he has reasonably said that he doesn't want to accept a commit like this. Hence his comments about the HAL and a massive middleware layer.

The bottom line is: AMD want to merge this into the kernel's main tree. But if they want to do this, they have to get through the maintainers, and the maintainers have to consider the whole picture and not just AMD's team, no matter how hard they have worked on their code.

The AMD team seem to have worked in a silo, not released to the CI servers, and, from what I'm reading, broke stuff that others then fixed. So when the AMD guys did a big release all at once like this, they got told - politely! - that their code wasn't up to scratch.

I'm not arguing with you over what they did, I have not read about it enough. I didn't even read the exchange in full. This is purely political and a sign of a lack of leadership. The Linux guys should be so grateful for these drivers that they do everything they can to keep AMD happy.

> The point that has been made over and over in these threads is that if you want to develop Linux code then you can't just stick a development team to work in complete isolation from everyone else

that's a problem for Linux

> The bottom line is: AMD want to merge this into the kernel's main tree

No, AMD wants to have working AMD drivers on Linux. It's more than likely that they were told to do it this way and this way sucks. A lot.

But hell, maybe Linux devs think that Linux is so important now that they can pressure AMD devs into doing whatever they want from them. Maybe it works, maybe it won't.

> I'm not arguing with you over what they did, I have not read about it enough. I didn't even read the the exchange in full. This is purely political and a sign of a lack of leadership.

"I literally don't even know what I'm talking about at all, I'll admit it -- but definitely, trust me and my immediate assessment of the situation, it's accurate"


>> The bottom line is: AMD want to merge this into the kernel's main tree

> No, AMD wants to have working AMD drivers on Linux.

Are you even reading the words you type? AMD _already has working drivers_. They're right there. You can go look at the code right now, 'git pull' it and install it on your machine. What's stopping you? Your inability to read, apparently?

No, it is literally -- by the definition of the above email -- the case that they want to merge already existing code upstream, into the kernel, and have upstream share the maintenance burden. That's part of the deal -- if AMD code goes upstream, everyone helps maintain it, and in turn, they help maintain everyone else's.

But it turns out, upstream doesn't want their code in its current state. Of course, they don't have to merge it upstream -- they just want to. They don't even have to merge it upstream now or "soon", but they would have liked that. They could easily ship the AMDGPU driver as an external module using DKMS or something, just as things like ZFS-on-Linux do, and start ironing out problems for upstreamability while actually shipping drivers to people.
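For context, shipping that way is mostly a packaging exercise: a `dkms.conf` in the module source tree tells DKMS how to rebuild the driver for every installed kernel. A minimal sketch of what such a file looks like (the package name, version, and paths here are hypothetical illustrations, not AMD's actual packaging):

```shell
# dkms.conf -- hypothetical out-of-tree packaging for a GPU driver
PACKAGE_NAME="amdgpu-dkms"
PACKAGE_VERSION="1.0"

# Build the module against whichever kernel DKMS is targeting,
# using the kernel's own kbuild external-module mechanism (make M=...)
MAKE[0]="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build modules"
CLEAN="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build clean"

# Name of the resulting .ko and where to install it
BUILT_MODULE_NAME[0]="amdgpu"
DEST_MODULE_LOCATION[0]="/kernel/drivers/gpu/drm"

# Rebuild automatically whenever the distro installs a new kernel
AUTOINSTALL="yes"
```

This is exactly how out-of-tree modules like ZFS-on-Linux stay usable across kernel upgrades without being in the mainline tree.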

They have drivers. The drivers work already, in fact. Having them upstream is totally different. Try reading the article and doing some digging through this thread to understand the context.

> But hell, maybe Linux devs think that Linux is so important now that they can pressure AMD devs into doing whatever they want from them. Maybe it works, maybe it wont.

You realize that given AMD's history -- it's entirely possible AMD needs Linux more than Linux needs AMD, right? Linux doesn't need to win the desktop or win over AMD, it thrives in its own market and has been surviving perfectly well without them.

Linux has supported separately compiled kernel modules for decades. If kernel developers are not expected to maintain this code, it need not and should not be merged into the kernel source tree.

Windows presents an API to drivers that's very painful for them to change. This is a problem that Linux can avoid by not treating drivers as black boxes.

>We don't happen to have the resources to pay someone else to do that for us.

Typical AMD bullshit. They do have resources for developing garbage like 'Gaming Evolved App' just to abandon it 6 months later tho.

Gaming Evolved was not developed by AMD, it's a branded version of raptr.com.

Didn't the AMD guy respond months ago that he expects they'll reduce it to like 30k loc? What happened to that?

Not that this is the crux of the problem, just curious.

It is not about compromise, it is about the long term maintainability of the code.

Nice: so many we's in the last paragraph.

At this point in time, those who don't support Linux are only hurting themselves.

oh this is good *gets popcorn* *crunch crunch*

We should be thankful to AMD for keeping thousands of poor people warm this winter.

Well, there's a reason I only ever buy NVidia video cards, even for AMD CPUs. I'm fine with AMD not being interested in Linux support, I'll just vote with my wallet.

Yes, because NVIDIA has been great contributing kernel code. /irony

What I don't get is how one kernel maintainer can make such a massive decision that affects all of Linux. That's some near-totalitarian level of power.

After reading the arguments, I'm kind of on AMD's side. I get what Dave wants, but it seems extremely idealistic.

> That's some near-totalitarian level of power.

AMD is perfectly within their rights and abilities to ship an out-of-tree driver. As I understand it, DKMS exists to make that use case easier for end users. The popular consumer-facing distros would probably make it easy as dirt to install too.
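And indeed, from the end user's point of view, installing a DKMS-packaged out-of-tree driver is only a handful of commands (or zero, when the distro package runs them in a post-install hook). Roughly, assuming a hypothetical `amdgpu-dkms/1.0` source tree already unpacked under `/usr/src`:

```shell
# Register the module source with DKMS, build it for the running
# kernel, install the resulting .ko, and load it
sudo dkms add -m amdgpu-dkms -v 1.0
sudo dkms build -m amdgpu-dkms -v 1.0
sudo dkms install -m amdgpu-dkms -v 1.0
sudo modprobe amdgpu

# From here on, DKMS rebuilds the module automatically whenever
# a new kernel is installed; check with:
dkms status
```

The package and module names above are illustrative; the point is that none of this requires the code to live in the mainline tree.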

The difference between that and upstreaming into the kernel is primarily shifting some of the maintenance burden onto the kernel developers. That comes with the condition of adding stakeholders to the driver design, not just using them as a code dumping ground.

"Totalitarian" seems a bit over the top.

Linus delegates authority to people he trusts. And in this case, Dave is likely representing well what Linus would do.

> near-totalitarian level of power.

I don't think you meant to use that word. Linus and the kernel contributors together own the copyrights on the kernel. It's not totalitarian for them to dictate its future. It's their creative work. It's downright egalitarian that they let you use the kernel however you like, provided that you give the source in turn to whoever you hand it to.

I suppose the intended notion would be: given the number of Linux distributions and use cases out there, the number of companies involved, and the number of products built on top of the Linux kernel... it's amazing that one guy with little financial interest in the results or effects can make that call for everyone.

I'm not sure if that's a good thing or a bad thing, to be honest, it's just... particularly interesting about the way Linux works.

> It's amazing that one guy with little financial interest in the results or effects can make that call for everyone.

Impartial expert judges are usually considered a good thing. Note that Dave Airlie is not acting on a whim here; the Linux development community has discussed HALs for probably 20 years and has come to the conclusion that they are a net negative. For better or worse, the Linux development process does not care about democracy or market share or vendor relations. It's a somewhat unusual way to build software, but the result pretty much speaks for itself.

Yep, each vendor that doesn't want to deal with it, just forks it, never contributing anything back.

That's not true, really. Many vendors keep their own in-house patches to the mainline tree that never make it upstream for various reasons (see Android's kernel, for instance). There is nothing stopping a vendor, or anyone else, from publishing their own Linux tree that is a fork of mainline with this AMD patch applied. Linus maintains the 'mainline' tree just because of the history and because most kernel hackers believe in his ability to run with it, but if Linus and friends ever did something horrid, there is nothing to stop some other group of people from forking and everyone moving to said new fork. That's part of the whole point of the GPLv2 and other OSS licenses.

Right. And this is an indicator that the Good King and his generals have done an exemplary job keeping it solid: Everyone uses their product.

Except the largest deployment of Linux in the world: Android devices.

Which are running a severely modified kernel because the process of going through mainline to get it mobile ready would be far too painful.

Not really severely. It has a few questionable tweaks here and there.

The questionable part is CPU companies not pushing for upstream integration, so you get e.g. Qualcomm Linux. In fact, the process is not pushing, it is pulling. (Some people have started working on it; it's still a long way off.)

It can be done, as shown by efforts by TI, ARM and many more...

Hell, even Rockchip, Allwinner and Sony can do it, Qualcomm, Mediatek, Google and Samsung are just not doing it because they are able to skate by without doing it.

The funny part is that on Windows Phone, Microsoft learned from the kernel development process and made Samsung, LG, HTC, et al. upstream their drivers, whereas Google, in the same position with the same vendors, has not done this, and thus Sony is the only OEM pushing drivers upstream.

Upstream Allwinner support is a community effort, and it takes so long to convince the kernel developers to merge code that parts are generally years obsolete by the time they're fully supported. I think the upstream support may almost be at the point that C.H.I.P could use it - but they're using a single-core SoC from early 2012. I've seen patches for basic functionality like clock controllers (without which nothing works) get stuck in endless mailing list arguments about the most elegant approach, contradictory requirements from different maintainers, all sorts. It's not working.

The kernels used by actual Allwinner-based Android devices have almost nothing in common with the upstream support. They use a completely different mechanism for describing the hardware configuration, a completely different set of drivers, and are based on a kernel that predates upstream support.

True, but what has been accomplished with the Linux Sunxi community is a shortening of that loop to bring up new boards with modern kernels, the Pine A64 for example took only a year to gain support[1], and that was around the time they actually started shipping in quantity. Additionally, many of the ethernet drivers, HDMI PHY drivers, Cedrus (provides H.264 & H.265 decoding @ 10bit) are reusable on new chips like the H5[2].

I would not expect much from Allwinner directly, but at this point there is a sizable ecosystem and mainline support for most of their chips, which can't be said for most other ARM vendors. Another aspect of this is Allwinner sells most of their chips thru multiple business units, with the silicon being exactly the same, just what is silkscreened on the top of the package being different[3].

1 - http://www.cnx-software.com/2015/11/10/allwinner-a64-datashe... and https://forum.armbian.com/index.php/topic/1917-armbian-runni... 2 - http://www.cnx-software.com/2016/08/17/allwinner-h5-is-a-qua...

3 - https://forum.armbian.com/index.php/topic/2099-crypto-engine...

The A64 was one of the chips whose clock controller support got stuck in mailing list hell for over a year. Currently that's planned for 4.10, basic stuff like Ethernet, USB and SD card support is still pending though: http://linux-sunxi.org/Linux_mainlining_effort I imagine they must be using a ton of patches on top of mainline to make that work.

If it was a big secret that this was how Linux worked and it took the companies by surprise, I could see how this could be a bad thing, but it never has been: Linus has always been very upfront that he is Benevolent Dictator for Life and always has the last word. Don't get onboard if you don't like that system.

Then all the Linux fans who complained to AMD about not delivering open source drivers should, from now on, direct their shitstorms at the kernel maintainers instead, since AMD delivered.

There is absolutely nothing preventing AMD from delivering their own open-source drivers to their customers.

They want it in kernel, however, and that means they have to play by the in kernel rules.

AMD didn't deliver a Linux driver. They delivered a Windows driver plus a bunch of glue code that nobody really wants. In the best of all possible worlds, somebody would dissect the Windows driver for instruction and write a proper Linux driver and nobody would try to run the Windows driver as-is.

This wasn't an open source driver, this was a hardware abstraction layer that they wanted in the kernel so their Windows driver could be distributed under a proprietary license and used without a kernel recompile.

A year ago AMD said they would just use what is currently in the kernel, with a few minor modifications, to get their driver working in userspace. Apparently they did not follow through on that, though.

No one is being prevented from using AMD's code, it's just not going to be baked into the official kernel.

That's how kernel development works. As far as I can tell, if you get maintainer status from Linus you can do as you please, the understanding being that you're acting as he would.

As Dave said:

> Given the choice between maintaining Linus' trust that I won't merge 100,000 lines of abstracted HAL code and merging 100,000 lines of abstracted HAL code I'll give you one guess where my loyalties lie.

He is acting in a way he thinks would keep Linus' trust. Also, usually the maintainers have far superior technical knowledge of their areas to Linus. I mean, Linus is just one guy, and each maintainer has their own specialty.

Why don't you try to get some code merged into the Windows kernel and tell us how that goes?

> After reading the arguments, I'm kind of on AMD's side.

And that's why stuff like this doesn't get put to a vote.

The Good King is how Linux has come to run the world's servers. I am perfectly fine missing out on some incidental functionality here or there in order to keep this power structure, which has protected the Linux kernel thus far. It seems to work extremely well.

I'd prefer to see NVidia style drivers, personally. I don't give a rat's beans if optimized video card, nic, etc drivers are open.

I just want a system that will work properly without them. Not 60FPS on Ultra settings in <game> work, I mean boot and function as a normal user's desktop.

Yeah, I don't care about nvidia drivers, either. AMD doesn't need this in the kernel anymore than nvidia does (and they don't). Kernel modules work just fine for drivers. I don't want some convoluted shit crammed into mainline because <company> chooses to ignore feedback about their code. I am perfectly fine with this one guy shooting down AMD's pretentious attitude here. If it weren't for those kernel devs, good luck getting a Linux system to "boot and function as a normal user's desktop" because of all the crap that would be in there. Like systemd, just piles of crap in a confused, kitchen-sink orgy.

I have come to fear that we will eventually need GPU drivers in the kernel directly, thanks to more and more of the graphics code out there assuming DRM/DRI etc.

Being a free beer clone of expensive UNIX workstation OSes helped.

Of course companies had an agenda to improve it to the point they wouldn't need to keep paying Sun, SGI, HP, IBM, Compaq, Unisys,...

There's a reason Linus is called the BDFL of Linux.
