Certainly if the lifespan of Intel chips turns out to be much shorter than the marketplace expected (because Intel is unable to provide security updates), that affects the value of Intel products and ought to inform future buying decisions.
Whether it is the unfortunate materialization of Spectre-style bugs or the deliberately insecure-by-design ME, Intel's inability to support its products is dismaying.
The ordinary life cycle of an Intel CPU is the five-year depreciation schedule in the US tax system. The life cycle for Intel's most important customers is shorter, driven by operating costs in large data centers: density, throughput, and energy utilization. Traditionally this has been two years or less, as reflected in Intel's tick-tock iteration strategy. The critical life cycle for Intel is not consumer/SMB sales, which don't generate the sales frequency and volume of leading-edge products.
Usually the amortization of such systems is ~4 years, but lately many smaller companies choose to stay with the old systems a little longer, 5 or even 6 years, simply because there is no performance-wise push to upgrade.
The main motivation for upgrade is software support (usually for the OS, driven by Microsoft), or failure rates for the older systems.
And that's for the desktop side. For servers they tend to be taken out of commission when the service they provide is migrated to a whole new platform and the legacy one goes to a better place along with the metal. Servers are more reliable than your regular desktops, they get better support, and most companies don't upgrade running systems.
P.S. I don't know of any enterprise environment where ME is used for managing servers. It's usually ILO, ILOM, DRAC, IMM, RMM. I think ME is mostly for desktops or SOHO, Intel has RMM for servers.
Also, on servers, Intel has (requires?) SPS: server platform services. It's like an ME, but worse, and without a way to neuter it.
Edit: a quick search yields a common core (or division at Intel) behind the ME and SPS, so that makes a bit of sense. There has been at least one exploit in the wild for that SPS; it also lists TXT and ME, so I guess it's a shared (MINIX?) kernel that had the bug.
> For servers they tend to be taken out of commission when the service they provide is migrated to a whole new platform
Or when the service contract expires or is too expensive to extend. You can't run a server of any importance without a service contract; it can be the difference between the server's users and services being down for hours versus more than a week, and between IT management keeping their jobs or not.
That's just the "enterprise" hardware model. It's overall extremely expensive to begin with and may have made sense back in the days of proprietary hardware, but makes no sense for a commodity hardware installation beyond a trivial size.
It's hard to imagine not being able to find a replacement within a week, especially if one skipped the service contract and just bought a spare or two with a fraction of the savings.
It's why moving to cloud infrastructure can be so much cheaper for these shops.
Of course, if it is a trivial size, like a single server at a small business, that's a different story.
Yes, somehow I thought the GP was talking about small business, but I see there is nothing in the text that says so (at least not now). Yes, of course, with cloud infrastructure or just virtualization and spare hardware, service contracts have much less value.
> makes no sense for a commodity hardware installation beyond a trivial size
It's not the commodity hardware - x86 servers have been mostly commodity hardware for decades - it's virtualization (or other rapid recovery and migration tech) that makes it work.
> Yes, of course, with cloud infrastructure or just virtualization and spare hardware, service contracts have much less value.
My assertion is that service contracts for hardware have zero value for commodity hardware installations (beyond trivial size). To wit, they are, invariably, a scam.
Server hardware fails vanishingly rarely, with specific, notable exceptions of certain components. Those exceptions have predictable [1] failure rates that are therefore straightforward to budget and/or engineer around. Managers don't know to insist on this, so the scam persists.
This was true even before the advent/popularity of virtualization (and therefore "cloud" techniques).
It's also true regardless of whether or not one keeps spares on hand. With commodity hardware, vendors always have plenty of spares; one just hasn't purchased them in advance, so there's an increased latency (not entirely unlike the spares available under a service contract).
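To make the "predictable, straightforward to budget" point concrete, here's a back-of-the-envelope sketch. Every failure rate below is a made-up placeholder; substitute your vendor's numbers or Backblaze-style published stats:

    # Expected annual failures and a crude spares budget for a fleet.
    # All annual failure rates (AFRs) here are hypothetical placeholders.
    SERVERS = 200
    FLEET = {
        # component: (count per server, hypothetical AFR)
        "hdd":  (4, 0.03),
        "psu":  (2, 0.01),
        "fan":  (6, 0.02),
        "dimm": (8, 0.005),
    }

    for part, (per_server, afr) in FLEET.items():
        expected = SERVERS * per_server * afr
        spares = int(expected * 1.5) + 1  # ~50% headroom over the expectation
        print(f"{part}: ~{expected:.0f} failures/yr -> stock ~{spares} spares")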
> It's not the commodity hardware - x86 servers have been mostly commodity hardware for decades - it's virtualization (or other rapid recovery and migration tech) that makes it work.
That's where I disagree, at least partly, if I understand correctly the kind of tech you're referring to.
Those software solutions are generally just about convenience, or, ideally, reducing recovery time in the case of a non-redundant architecture. Even then, they might turn hours into minutes, not weeks into minutes.
More importantly, it's not that software that makes this work. It's the commodity hardware (and, arguably, the commodity software/firmware hiding within) that makes it work. If a disk fails, any same-size or larger one [2] can be used as a replacement. Same for RAM or even full servers. It doesn't have to be the same brand, because that's the point of it being a commodity. That fact completely eliminates any problem of being down for a week due to hardware failure. In a major tech hub city, getting something delivered same day might not even require paying a premium.
Before virtualization or any similar abstraction layer, one could just move disks to a new server (best case) or restore from a backup (worst case).
[1] Other than "black swan" events like the flooding in Thailand that wrecked predictability (and reliability overall) for a generation of HDDs. Even then, if one made assumptions based on warranty length changes, those would have been close enough.
[2] Well, OK, one does have to be careful to meet minimum performance, especially for SSDs, but even a failure there won't kill basic functionality.
I didn’t have one, but it seemed like a worthwhile thing to have, so I spent a few minutes putting the following list together (these are all announcement dates; a quick parsing sketch follows the list):
August 2006 – m1.small.
October 2007 – m1.large, m1.xlarge.
May 2008 – c1.medium, c1.xlarge.
October 2009 – m2.2xlarge, m2.4xlarge.
February 2010 – m2.xlarge.
July 2010 – cc1.4xlarge.
September 2010 – t1.micro.
November 2010 – cg1.4xlarge.
November 2011 – cc2.8xlarge.
March 2012 – m1.medium.
July 2012 – hi1.4xlarge.
October 2012 – m3.xlarge, m3.2xlarge.
December 2012 – hs1.8xlarge.
January 2013 – cr1.8xlarge.
November 2013 – c3.large, c3.xlarge, c3.2xlarge, c3.4xlarge, c3.8xlarge.
November 2013 – g2.2xlarge.
December 2013 – i2.xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge.
January 2014 – m3.medium, m3.large.
April 2014 – r3.large, r3.xlarge, r3.2xlarge, r3.4xlarge, r3.8xlarge.
July 2014 – t2.micro, t2.small, t2.medium.
January 2015 – c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge, c4.8xlarge.
March 2015 – d2.xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge.
April 2015 – g2.8xlarge.
June 2015 – t2.large.
June 2015 – m4.large, m4.xlarge, m4.2xlarge, m4.4xlarge, m4.10xlarge.
December 2015 – t2.nano.
May 2016 – x1.32xlarge.
September 2016 – m4.16xlarge.
September 2016 – p2.xlarge, p2.8xlarge, p2.16xlarge.
October 2016 – x1.16xlarge.
November 2016 – f1.2xlarge, f1.16xlarge.
November 2016 – r4.large, r4.xlarge, r4.2xlarge, r4.4xlarge, r4.8xlarge, r4.16xlarge.
November 2016 – t2.xlarge, t2.2xlarge.
November 2016 – i3.large, i3.xlarge, i3.2xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge.
November 2016 – c5.large, c5.xlarge, c5.2xlarge, c5.4xlarge, c5.8xlarge, c5.16xlarge.
July 2017 – g3.4xlarge, g3.8xlarge, g3.16xlarge.
September 2017 – x1e.32xlarge.
October 2017 – p3.2xlarge, p3.8xlarge, p3.16xlarge.
November 2017 – x1e.xlarge, x1e.2xlarge, x1e.4xlarge, x1e.8xlarge, x1e.16xlarge.
November 2017 – m5.large, m5.xlarge, m5.2xlarge, m5.4xlarge, m5.12xlarge, m5.24xlarge.
November 2017 – h1.2xlarge, h1.4xlarge, h1.8xlarge, h1.16xlarge.
November 2017 – i3.metal.
June 2018 – m5d.large, m5d.xlarge, m5d.2xlarge, m5d.4xlarge, m5d.12xlarge, m5d.24xlarge.
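In case anyone wants to chew on the list, here's a quick Python sketch that parses it (pasted verbatim, en-dashes and all; only the first three entries are inlined here) and prints the announcement span per instance family:

    from datetime import datetime
    from itertools import groupby

    # Paste the full list above here, one "Month Year – types." line each.
    announcements = """
    August 2006 – m1.small.
    October 2007 – m1.large, m1.xlarge.
    May 2008 – c1.medium, c1.xlarge.
    """

    entries = []
    for line in announcements.splitlines():
        line = line.strip().rstrip(".")
        if " – " not in line:
            continue
        date_s, names = line.split(" – ")
        date = datetime.strptime(date_s, "%B %Y")
        for name in names.split(", "):
            entries.append((name.split(".")[0], date))  # family: "m1", "c1", ...

    entries.sort(key=lambda e: e[0])
    for family, group in groupby(entries, key=lambda e: e[0]):
        dates = sorted(d for _, d in group)
        span = ((dates[-1].year - dates[0].year) * 12
                + dates[-1].month - dates[0].month)
        print(f"{family}: {dates[0]:%Y-%m} to {dates[-1]:%Y-%m} ({span} months)")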
Unfortunately, announcement dates only give one endpoint of the timeline. The other endpoint, the retirement date (from even previous-generation availability), is crucial to any analysis.
Yeah, exactly. No argument that Intel keeps improving their processors. But at what point is the improvement so great that you are throwing money away by not replacing the hardware with the next generation?
It's also never quite as simple as looking at each generation as a discrete unit of upgradability.
Not only can the price:performance spread vary between generations, but this can change over time, particularly because the model availability within a generation broadens over time.
Add to this the dimension of low-power versions of certain processor models (whose selection is strictly a cost/longevity decision, presumably invisible to someone like a cloud end user), and one can't safely generalize.
The other problem is that the CPU isn't even, necessarily, the majority of the purchase cost of a server.
The intention of my post was to look at the announcement dates as possibly relating to the roll-out of new CPU capacity within AWS, which may tie into the retirement of older capacity and compute types. If we had that information, then you could answer some of the questions asked surrounding the deprecation schedules of HW in a DC the size of Amazon's.
Specifically as it relates to the resources required to provide the same compute power for any given pool of instance capacity.
I know for a first-hand fact that AWS will not reveal its compute capacity, but in conversations with them, when we were actively monitoring the availability and continual use of all spot prices across every region, we had the data, according to them, to infer what their capacities were.
We had the data at the time, but not the interest/need, to be able to surmise ballpark compute capacity across all instance types in the spot pools - which is to say, what AWS' spare (i.e. non-fully reserved/dedicated) capacity was, and what the demand was for it. Additionally, we could have also been publishing AWS' hourly revenues, dammit - I wish I had thought of that - as people would have been interested in that data...
> If we had that information, then you could answer some of the questions asked surrounding the deprecation schedules of HW in a DC the size of Amazon.
But we still don't, so we still can't infer anything at all from the announcement schedule.
Moreover, assuming AWS is growing fast enough, new instance type announcements could be entirely decoupled from deprecations. Even with both sets of data, it may not tell us anything about overall replacement rate, without also knowing the overall growth rate.
Yes, but it's also used as a reference point for when you can upgrade. It's not the trigger, but way back when YoY performance improvements were substantial, companies waited for the amortization cycle to complete and jumped on the next gen. Not anymore; there's no point.
But the XP example you gave kind of undermines the point. Nobody should be using XP. Not even on air-gapped networks, totally cut off, with a special support contract from MS, etc. If anyone is still using it, they clearly have nothing but disregard for any kind of rules (like all the XP ATMs still out there).
>Nobody should be using XP. Not even on air-gapped networks, totally cut off, with a special support contract from MS, etc.
I think you're severely over-estimating the criticality of stuff that's still running XP. It's mostly used for antiquated industrial hardware that gets used 5x a year, maybe (old CNC mills and whatnot), oftentimes with no network. If it gets cryptowalled via flash drive, then someone will reinstall it and carry on with life.
More accurately, current stats put XP at around 4-6% of the world's desktops. Only Windows 7, 10, and 8.1 beat it. That's more than Linux and more than any macOS version (more than almost all of them put together). That would be ~5 million XP machines, give or take.
So we're talking about 5 million machines with an OS designed in the late '90s (20 years ago) and that stopped receiving any meaningful updates 4 years ago.
Most of them are actually ATMs in developing countries like India and they are definitely not air-gapped [1].
POSReady XP, with its bare modicum of support until April 2019, was taken by companies as a green light to keep using XP in embedded systems.
Other systems running XP: many of the NHS systems hit by WannaCry last year, many of the systems in UK Police stations, most electronic voting machines and gas stations in the US, most digital signage in train stations, airports, hospitals, or cinemas, parking garage payment machines, so many POSes, even passport security in some airports!
Does this put the magnitude of the problem in perspective? It's a threat from so many perspectives. Your safety, your data, your money, you name it.
Well, the distributors stamp the top of the server with a sticker that says it's safely operable for 2 years, so no one is going to insure it for more, of course. Might be a European thing, though; the good thing for enthusiasts is that it's common to contact a company, join their next 2-year buyout, and acquire cheap hardware.
I've never seen this 2 year thing in Europe. There are plenty of companies that stick to their servers for over 5 years. Some servers stay there even for more than a decade because they deliver a service using software that is no longer developed or supported, and there's no replacement for it in the company. So they keep them there, chugging along.
But with x86 being basically commodity and virtualization being used everywhere this is less of an issue in most cases. It means your VMs can mostly run on whatever metal you throw under them and that x86 metal can just be propped up to keep working for many, many years.
I consult in verticals where uptime really matters. It's not unusual though; yes, I know that companies do this. They usually don't care about insurance/hardware SLAs, though. Old hardware needs to be emulated.
Oh I'm sure some companies do it. Whatever it is you can be certain someone is doing it :). But it's nowhere near being a rule in Europe.
And TBH replacing servers after 2 years because you're worried about uptime feels like a horrible overreaction and self-harm at the same time. Servers that are meant to provide 99.999% uptime (5 nines or above, i.e. at most ~5-6 minutes of downtime per year) are built to run for far longer than 2 years. And they must be supported by the manufacturer of the system and the manufacturer of every sub-component for longer than that.
So unless I'm missing something I really can't see the benefit of such upgrade cycles.
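For reference, the arithmetic behind the "nines" (nothing assumed here beyond the definition of availability):

    # Allowed downtime per year for N nines of availability.
    SECONDS_PER_YEAR = 365 * 24 * 3600

    for nines in range(2, 7):
        unavailability = 10 ** -nines
        minutes_down = SECONDS_PER_YEAR * unavailability / 60
        print(f"{nines} nines: {1 - unavailability:.5%} up, "
              f"~{minutes_down:,.1f} min/yr down")

Five nines works out to about 5.3 minutes per year, so a single multi-hour outage while waiting on a parts shipment blows the budget for decades.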
> And TBH replacing servers after 2 years because you're worried about uptime feels like a horrible overreaction and self harming at the same time.
I agree. Such a tactic, in the face of modern high-availability systems design (including the notion of servers being "cattle, not pets"), seems actively harmful, other than, perhaps, introducing something akin to Netflix's "chaos monkey" into the system.
I could understand pre-emptive replacement of non-hot-swappable components, but only if they're known to degrade over time [1] and only if the server is already otherwise out of service. Even then, I'd consider 3 years the minimum.
> Servers that are meant to provide 99.999% uptime (so 5 nines or above, or maximum 6min downtime per year) are built to run for far longer than 2 years.
This kind of design sounds like it's from a different "world" (mainframes) or era (proprietary, even if x86-based, Unix hardware of the 90s), not commodity x86 servers.
[1] so, maybe RAM, which has increasing CEs with age/usage, though UEs appear to be skewed toward early RAM life and therefore the bad apples are eliminated early. non-pluggable PSUs. fans. very old internal-only HDDs.
I'm just wondering if "those systems" include any present-day x86 servers. If not, then it's safe to say they're a red herring, since this whole discussion is based on an article about, specifically, Intel CPUs.
I now see that the content is behind a paywall but the title is still visible. It's a statistic for 2017/2018:
> Share of servers with four or more hours of unplanned downtime due to hardware flaws worldwide in 2017/2018, by hardware platform
Systems like IBM's System Z had 0 servers with more than 4 hours unplanned downtime over this period. At the other end Cisco and Fujitsu x86 servers had ~16% of servers experiencing 4 or more hours of unplanned downtime over the period.
We can infer that the vast majority of the x86 machines will be Intel considering AMD's market share in the server space.
P.S. The article is about Intel's ME flaws and patches so probably our whole discussion branched off in the wrong direction :).
> P.S. The article is about Intel's ME flaws and patches so probably our whole discussion branched off in the wrong direction :).
Maybe that's why I was confused? I thought we were talking about commodity, x86 servers all along, particularly the GGP about "where uptime really matters" and other comments by the same person about warranty lengths (which seemed irrelevant, other than being indicative of commodity hardware).
Of course, that commenter's very terse remarks, combining "safety", "uptime", and "insurance" (and, later, billions in damages) into an environment where commodity (or not?) x86 servers are proactively replaced started me off in a state of confusion, like there was some very major assumption about the system design that I (and everyone else here, too, it seems) was missing.
Servers are "safely" operable until they die. The failure mode is that the machine stops operating. If your system has redundancy built in it keeps working possibly at reduced capacity. If not it stops. If its essential that service not be interrupted you build in redundancy and ensure that it fails in a safe way if it does.
I'm not sure where safety or insurance comes into this discussion at all.
People upgrade sooner because they stand to gain more than the upgrade costs, not because it's not safe to operate.
Yes. I guess you're trying to point out that we're increasing the early failure rate, right? We're insured against that. We can't insure ourselves against the right part of the curve.
For what it's worth, graphics on Linux are working great with amdgpu. Their efforts to develop and support open source mainline graphics have really paid off at last. Support is usually mainlined before hardware release now. I don't think "just works" integrated graphics are an issue for either company (Intel/AMD) in general on the Linux front.
That said, the amdgpu driver is still relatively new, and major changes and improvements are still ongoing, but overall it's definitely production-stable, and for dedicated GPUs performance is generally very good.
Not trying to sound like a fanboy; it's just nice to see someone else push as hard for open source mainlined graphics as Intel has for all these years.
Except for those that happen to own an older card that used to work perfectly fine on the former AMD driver.
Case in point: on the ATI Radeon HD 6310 that came on Eee PC models, video hardware acceleration no longer works as it used to.
Sure, I can probably hack the older driver into newer Ubuntu releases or track down someone that has already done it, but that is exactly what I don't want to spend my time doing outside work.
FWIW, it’s a completely different architecture. GCN has support all the way down to 1.0. The graphics card in your Eee PC is upgradable, if you’re feeling adventurous[1].
I know, and this kind of attitude regarding drivers is what, as a graphics-oriented person, eventually pushed me back into the Windows/OS X world.
The graphics card was working perfectly fine before they decided to reboot driver support.
Now with the legacy driver I have to force enable acceleration and even then I sometimes get the feeling it isn't really working, given how the fan behaves when watching movies on the go.
> Now with the legacy driver I have to force enable acceleration and even then I sometimes get the feeling it isn't really working, given how the fan behaves when watching movies on the go.
If you want zero-copy video playback for optimizing battery life use mpv with --hwdec=vaapi. Or vdpau or whatever API is supported with that driver. You can also try switching -vo to vdpau/vaapi from OpenGL.
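If you'd rather script the fallback than try flags by hand, a sketch along these lines works - with the caveat that it assumes "mpv --hwdec=help" lists the compiled-in APIs as bare tokens, which may need adjusting for your mpv build:

    import subprocess
    import sys

    def available_hwdec():
        # "mpv --hwdec=help" prints the hardware decoding APIs mpv was built with
        out = subprocess.run(["mpv", "--hwdec=help"],
                             capture_output=True, text=True).stdout
        return out.split()

    def play(path):
        apis = available_hwdec()
        for api in ("vaapi", "vdpau"):  # preference order from the comment above
            if api in apis:
                # matching --vo to the hwdec API is what gets you zero-copy playback
                subprocess.run(["mpv", f"--hwdec={api}", f"--vo={api}", path])
                return
        subprocess.run(["mpv", path])  # software decode fallback

    play(sys.argv[1])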
> I know, and this kind of attitude regarding drivers is what as graphics oriented person, eventually pushed me back into the Windows/OS X world.
On Windows you get legacy drivers for older architectures as well. Also the case with NVIDIA. It’s legacy hardware, after all… AMD’s new driver is completely open source, but it targets GCN, which is a completely different type of hardware.
Nvidia has historically supported new drivers/Xorg for old hardware for approximately 10 years, whereas AMD/ATI cards still available at retail have been unsupported in as little as 3 years' time, leaving you with the open source drivers as your only option if you want to install a new version of your distro with your older hardware.
The fact that they are open source is of course a good thing; what's not is that they were at one time less than half the performance of the binary drivers.
The new drivers from AMD gpus are both open source AND performant.
Basically, the proper strategy from 2003-2017 was to buy Nvidia and install the binary drivers.
At present you can go with either so long as you aren't buying hardware too old to be supported by the new amd drivers. I'm still using nvidia on all my hardware but maybe I will give amd a try again next time around.
It sucks that it's complicated, but it's not as complicated as it seems.
I've known Linux since Slackware 2.0, so I am quite used to these issues regarding graphics cards, including the fun days of manually writing my own xorg.conf file.
Eventually one gets fed up and wants the laptop just to work.
This laptop came with Linux; that was the point of buying it in the first place.
Asus used to sell their netbooks with Ubuntu pre-installed on the German Amazon store.
It already went through a phase where I couldn't use WiFi for a couple of months, as Ubuntu decided to replace the binary driver with a partially working open source implementation and then fix issues as they came.
One thing the PSP doesn't have is AMT style remote management.
AMD has their own kind of management system available on some machines (DASH, using "smart" NICs like Broadcom), but the PSP isn't even involved when DASH is available and in use, as far as I know.
However, on my Intel machine with AMT, there's a network port opened by the ME itself (TCP/16992). It can use the same IP as the main OS, or a different IP entirely if desired. It uses the same ethernet port as the main OS though, splitting the packets that are directed to one of the ME ports and selectively allowing the rest to continue to the main OS (there's a low-level ME firewall[1]).
On that port, there is a full remote desktop with mouse and keyboard, the ability to remotely connect small drives and/or ISO files, a remote serial console, a low level firewall configuration utility, and power/reboot controls. Even on machines where AMT is not even supposed to be available, there have been PoC demonstrated using one of the ME flaws to turn it back on[2].
You know, the shame of the ME is that the concept and intended use aren't terrible, and the engineering behind it is rather cool. It is, however, unfortunate how poorly Intel implemented it and how it was essentially forced onto personal systems that have no need for it to begin with.
I have a Ryzen 2400G, which has Vega 11 integrated graphics. The GPU is supported only by the latest versions of the graphics stack (kernel 4.17+, Mesa 18), and the only problem I have is that in Debian testing the kernel flag to enable support for that GPU family is off, so I had to compile it myself (which may be outdated info, because I was away for a month and haven't checked the updates).
I have recently built a machine with a Ryzen 2700X & a GeForce GTX 1050 Ti (a GPU is required, as this Ryzen doesn't have an integrated one, but there are plenty of cheaper cards like mine), running Ubuntu, and it's just fantastic. Docker builds and compilation take seconds compared to my MacBook with its i7. Plus there are no overheating issues when I am running Kubernetes all the time, and I can still code.
TL;DR Ryzen 2700X is great value for money, would recommend it any day :)
I've been running on a Ryzen 5 1600 for a year and agree with you entirely, the performance is amazing for the price. Currently dual booting with Win10 and Void Linux and haven't had any processor-related issues so far!
It's much worse than that. This "Intel patches" thing is a lie - or at least it doesn't mean that your systems are patched, which is what 99.9% of people reading such headlines believe happened.
Intel only patches its own firmware, but it's normally up to manufacturers to update that firmware for devices. So most PC/laptops users really won't even see these patches.
And I agree with your main point. For one of the more recent Spectre-class flaws, Intel basically said "it's third-party developers' problem to fix."
How is this Intel ME CPU patch deployed and where does it actually go? Is there some tiny flash in the CPU itself where the Intel ME code resides? Or does the patch get deployed as part of a UEFI firmware update, but isn't actually part of UEFI firmware, and somehow the CPU can reach out and grab its own updates from UEFI?
Usually a separate patch, an ME firmware patch. The ME is physically located in the chipset but I'm not entirely sure where the FW resides, whether the chipset or a flash on the motherboard (sharing with the system UEFI/BIOS).
The ME firmware lives on an SPI flash chip on the motherboard. It can either be the same chip where the BIOS is stored or a separate one (which is often the case, because two smaller chips cost less than one big one).
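If you're curious where exactly on that chip it sits: the region layout is recorded in the flash descriptor at the start of the image. Here's a minimal Python sketch of reading it from a full dump - offsets are from memory, matching what I recall of ifdtool/me_cleaner, so verify against one of those tools before trusting the output:

    import struct
    import sys

    MAGIC = b"\x5a\xa5\xf0\x0f"  # flash descriptor signature (0x0FF0A55A, LE)

    def me_region(image):
        off = image.find(MAGIC)
        if off < 0:
            raise ValueError("no flash descriptor found")
        flmap0 = struct.unpack_from("<I", image, off + 4)[0]
        frba = ((flmap0 >> 16) & 0xff) << 4             # flash region base address
        flreg2 = struct.unpack_from("<I", image, frba + 8)[0]  # region 2 = ME
        base = (flreg2 & 0x1fff) << 12
        limit = (((flreg2 >> 16) & 0x1fff) << 12) | 0xfff
        return base, limit

    with open(sys.argv[1], "rb") as f:
        base, limit = me_region(f.read())
    print(f"ME region: 0x{base:x}-0x{limit:x}")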
Actually, you're right: most of the analysis of the ME FW was done after dumping the flash from the physically removed chip on the motherboard. That would have been more difficult with a chipset-embedded one.
In all honesty "most PC/laptops users" will never need these patches because their systems don't have the ME firmware. You need specific CPU, specific chipset, specific NIC, and the ME FW. Which you're only going to find in OEM systems marked as such - vPro.
It's the same as the Meltdown/Spectre patches where Intel updates the code but it's up to the manufacturer to include it where applicable.
A regular desktop motherboard might include the correct HW but the manufacturer won't bother including the ME FW.
And companies like Lenovo, HP, and Dell already offer the updated ME firmware.
> In all honesty "most PC/laptops users" will never need these patches because their systems don't have the ME firmware. You need specific CPU, specific chipset, specific NIC, and the ME FW.
Every Intel system shipped in the last few years has the correct CPU, chipset and some version of the ME firmware (it's involved in initial platform boot, including Boot Guard which validates the BIOS before the CPU even gets a chance to run it), however some of them have "diet" ME firmware with fewer modules present.
The supported NIC is the only one that common desktops and laptops may not have at all. There's a side-channel required between the NIC and ME for certain features like AMT (remote desktop/management), and some NICs don't/can't support it. However, I recall seeing something about Intel allowing non-Intel NICs to be used at some point.
> Which you're only going to find in OEM systems marked as such - vPro.
I just realized I was actually thinking about AMT FW not being present (the management over the network part). ME is indeed there since 2006 and can be removed with varying degrees of success.
But I was under the impression that the ME FW on non vPro systems is there just for things like BootGuard or TXT, which (presumably) wouldn't need the full ME functionality. I did not expect that some manufacturers ship the AMT-enabled FW without marketing it as such.
I can't see the actual demo so I imagine they found some non-vPro system where the manufacturer actually shipped the full AMT FW. I'm curious if that's a common occurrence.
Non-AMT firmware is about 3-4 times smaller than AMT-enabled firmware. That's a hefty difference, so if you have the non-AMT version and can still somehow enable AMT, you'd get a very limited subset of what AMT usually is.
Agreed; none of this even passes the smell test to me, it seems absurd to bundle this level of functionality and not have a big warning on the can: "You are not really in control of your machine, at all."
I wouldn't have purchased my last Intel chip had I known better about this junk: a side-loaded, full-access, completely opaque OS, made by a company incredibly thick with our mates in the 5 Eyes, etc.
There was a submission on the weekend which posited that ME was made mandatory because of lobbying from the content industry to implement copy protection that the OS cannot tamper with (HDCP etc.).
So are they paying Intel to do that? How much? Why would Intel agree to that? Otherwise seems like a convenient cover for the aforementioned conspiracy theory…
As I understand it, the ME is used to remotely control the machine, as in a datacenter. If a datacenter is buying hundreds of thousands of these, it makes sense to have it on by default so their people don't have to go in and turn anything on. As much as I recognize it as a vulnerability (to the extreme), it doesn't make sense to have it off by default. They should certainly support a way to _permanently_ disable it. I wish there were easy tools to verify the ME is "not accessible", since I don't work in a datacenter and I wouldn't know how to test that it's off.
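On the "easy tools" point: AMT listens on fixed TCP ports (16992/16993 for HTTP/HTTPS, 16994/16995 for redirection), so a crude first check is probing those from another machine on the LAN. It has to be another machine: the ME intercepts those ports before the host OS ever sees the packets, so scanning yourself from the same box proves nothing. A minimal sketch:

    import socket

    AMT_PORTS = (16992, 16993, 16994, 16995)

    def amt_ports_open(host, timeout=2.0):
        found = []
        for port in AMT_PORTS:
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    found.append(port)
            except OSError:
                pass
        return found

    print(amt_ports_open("192.168.1.50"))  # hypothetical target address

An empty result isn't proof the ME is off, of course - just that AMT isn't currently reachable on those ports.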
Things that come with great benefit also come with great risk of abuse.
> ME is used to remotely control the processor like in a datacenter
It's used to remotely manage almost all aspects of the computer; it's a parallel, out-of-band subsystem, complete with its own processor, memory and OS.
It's very useful for managing computers at scale and at physical distances. Imagine making changes to thousands of computers; manual, one-at-a-time, hands-on solutions are very inefficient and error-prone. Imagine a campus or office building where the average distance from the IT support office to the computer is 20 minutes. Staff can spend most of their time in transit: 20 minutes there, 10 minute fix, 20 minutes back. 80% of the IT labor budget is paying people to walk.
But I agree; there's no reason the computer's owners shouldn't have the power to disable it if they choose.
Consumers have also shown zero intention of paying more for secure devices. Until we have a public ME hack with real consequences, I do not expect that to change.
Demonstrably untrue. Consumers pay a premium for access to the Apple SEP and the App Store review/analysis process (Jekyll and XcodeGhost notwithstanding, it's been a major security success).
Maybe someone could clarify some things, because I think the impression that I got from reading about this vulnerability is completely wrong. Isn't vPro just something in server hardware? At least the CPU, Mainboard and NIC all need to be certified/from Intel to support this?
You could get the impression that every single computer with an Intel CPU is vulnerable to being hacked over the network. Which I really doubt.
> I want them to disable that thing by default!
When you say this is enabled by default, do you literally mean that they open up an HTTP server without you doing anything?
I don't know much about the Intel ME but the things people are saying about it just seem totally unbelievable.
vPro/AMT is the "consumer"/workstation version, the server implementation is based on IPMI, but the ME is present on all systems, even those without vPro at all.
vPro is branding for several related products. ME is a platform with its own CPU, memory and OS, on which applications can be run. One common application is AMT, which provides remote management services.
And let Intel know that you and your business won't be buying new Intel machines until they provide a way of completely disabling/removing the ME chip.
You do use it. AFAIK, the ME handles power management, legacy backwards compatibility, and all sorts of other random chipset stuff you don't necessarily want to expose to the main cores, in addition to the DRM and remote administration capabilities.
You can. It's disabled on any non-vPro system. It takes two exploits to exploit the ME remotely on a non-vPro system: a first exploit running on the main cores to re-enable the remote administration, and then the exploits listed here once it's enabled.
Also worth noting that they're not patching it for 1st, 2nd or 3rd generation Core CPUs. I'm sure there's plenty of Sandy Bridge/Ivy Bridge CPUs in the wild, and it's not like you have an option to discontinue use of the Intel ME :(
There's always me_cleaner. It's a bit of a pain and requires hardware access to run but better than being exposed to an unpatchable vuln. I encourage every hackspace to set up an ME removal station (I'm building one for EMF Camp this year, and will document it so others can easily replicate)
Please do! I would love to do this, but am far too much of a sissy to mess with SPI flash. I've subscribed to your RSS feed. Do drop any guides/artifacts in the Noisebridge IRC when you have them :)
Planned obsolescence of otherwise viable product implemented via a certain-to-be-exploited architecture (the ME) followed by strategic withholding of patches?
It doesn't have to have been a full-blown plan from years ago in order to be a viable strategy. Intel can choose planned obsolescence going forward today for selected products by not developing or releasing security patches.
The extent to which this particular strategic business option was discussed during the design phase of the ME is hard to know from the outside. Surely someone within Intel pointed out that the ME was inherently insecure, but as for all the consequences, who knows?
There are many ways to do planned obsolescence. I'm sure there is a way to just manufacture the chips such that they degrade in a few years.
Planned obsolescence through major security bugs doesn't sound very smart to me.
I don't get the vibe that Intel is enjoying this publicity or the presumed replacement of those chips. My understanding is that AMD is quite competitive today in the data center.
The Intel C2xxx chip was faulty, and I wonder if there is more to this story - like CPU degradation due to heat.
Maybe the traces on the CPU just can't hold up to heat.
The reason I say this is that the top-end C2750/C2758 seems to be more at risk than the C2338 dual core. This is unscientific, just from browsing forums, but I think it's heat-related.
Wasn't the C2xxx series also an expensive server SoC?
Man these guys are fumbling.
With an entry-level GPU and slow RAM that works with an i5-2500K, I can play every modern game I have tried on high.
Nothing about me opening a text editor, interpreting most code, or compiling the occasional thing requires anywhere near that much power.
For the overwhelming majority, and I really mean overwhelming, there's really been no progress or point to upgrading anything except the GPU in the past 7 years or so.
In laptops, new stuff uses a LOT less power for the same speed, so there's that, I guess.
I upgraded my 2500k last year (to a Ryzen system) and the extra cores are very nice sometimes but honestly the biggest improvement was the fact that I also went from 8GB RAM to 16GB. There were a couple of games that benefited substantially from faster memory (Rise of the Tomb Raider, Fallout 4) but generally not so much the actual CPU performance.
If Intel had made current processors ultra fast recently, that would be an issue, but the Intel Core i7-2700 is actually not that slow: roughly comparable, at least within 20%, to other 4-core + 4-HT Intel processors.
No real advancement has been made since Sandy Bridge.
Only an incremental ~10% with each gen. That means the current gen is only ~2x as fast when comparing the same lines (i7 to i7).
If you can't make new things better, just gimp the old ones, like Spectre/Meltdown.
2 times faster would be quite good actually. Sadly we're barely at 50% faster per core, and that's including the usual 20-30% frequency bump on newer models. Pure IPC improvement is even lower, at maybe 2-5% per generation (~30% total at same frequency, 2nd gen vs. 7th gen).
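For concreteness, the compounding arithmetic behind the two claims (2nd gen Sandy Bridge to 7th gen Kaby Lake is five steps):

    steps = 5
    print("10% per gen:", round(1.10 ** steps, 2))           # ~1.61x
    print("2% IPC per gen:", round(1.02 ** steps, 2))        # ~1.10x
    print("5% IPC per gen:", round(1.05 ** steps, 2))        # ~1.28x
    print("~28% IPC + ~25% clocks:", round(1.28 * 1.25, 2))  # ~1.6x total

So even the generous 10%-per-generation figure compounds to ~1.6x over five generations, not 2x - which is the "barely 50% faster" ballpark.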
Yeah, you could argue that doubling the core or thread count doubled performance in selected software, but the reality is that for real-world use, excluding specific corner cases, the improvement is hard to notice.
A reason to upgrade is to have a newer platform and the features that would bring, definitely not the CPU.
Yes, Apollo Lake is an Atom SoC. For the simple web development I am doing, performance is about the same; it's mostly lacking RAM versus older Core platforms.
But all-up, that's Windows 10 at 6 watts. Including the display, storage, the whole system — or so the performance counters on the battery tell me.
If you're memory-starved, yes. A typical quad core is not going to be memory-starved, and the effect is pretty mild compared to the effect of a GPU. Or for non-gaming, it either doesn't matter at all or is 90% about core count.
I guess I should not be surprised that the HN community doesn't really seem to care.
Intel put the Management Engine into every CPU with no way for consumers to opt out. That alone is fairly surprising, since they knew there was a big chance it would have exploits and consumers would have no defense.
I'm not sure nobody cares, there are a lot of people who do care and (like me) refuse to buy Intel products. I think the problem is there is no other competitor. Both Intel and AMD (or possibly a three letter agency) have colluded to ensure there are locked management engines in all of their products.
Also, people just don't understand what the ME actually is. It is surprising that the HN community, who are mostly technical, don't see just how atrocious it really is.
One thing to note, however: I tell my non-technical associates that there is a second computer in their laptop. It runs software meant to remotely control their computer. They can't remove it, they can't see what it does, and security bugs are frequently found in it. When I tell them this I get told I am paranoid and being stupid. But when I ask them to just imagine it were true - wouldn't it be awful - they mostly agree.
The issue is people really don't want to know the truth. It reminds me constantly of a quote from the matrix:
"Many of them are so injured, so hopelessly dependent on the system, that they will that they will fight to protect it."
I'm curious what would you consider to count as an acceptable "reaction" from the HN community?
At any rate I'm unlikely to provide one. Personally I probably won't care until a Snowden-like disclosure demonstrates exploitation of the ME in a scenario that directly affects me. I.e., I wouldn't be surprised that nation states are exploiting this in targeted fashion, but nation states already have all kinds of ways to get my data that I'm unlikely to be able to defend myself against if I'm targeted.
I'd be more interested in hearing about how ME vulnerabilities are being used in US State-level dragnet surveillance or perhaps to subtly manipulate the population by changing their Google/Twitter/Facebook results or something of that nature. If you find any evidence of that, let me know.
Imagine: "We, the U-Boot Command, don't care about fears and rumors until an exploit of the Enigma encryption technology is actually demonstrated."
A good exploit is kept as invisible as possible, and when it is publicized, it may already be game over. A Trojan usually takes a lot of measures to stay undetected. Much of the recent crop of router-targeted exploits did not manifest their presence to the users in any way. Even with the Snowden revelations, what was shown was not just illicit mass data collection, but also huge reams of data already illicitly collected.
I care. I removed ME from mine. I'd prefer to use a competitor but the only viable one is AMD and their equivalent tech is less documented and no known way to disable it exists. Disabling/removing ME is possible for intel stuff so intel is actually the better choice if this is important to you.
No, this is a firmware bug. It just happens to be firmware that runs on the ME and not the main cores. The code is stored externally to the CPU along with the BIOS, and looks like it's being patched via a BIOS update.
Sure, but that's "in our CPUs" in, kinda, exactly the same way that nginx is. It's a microcontroller. It's not like the hardware implemented an HTTP server.
> It's not like the hardware implemented an HTTP server.
But yes, yes yes it is.
I don't mean "in our CPUs" in the sense of running nginx. I say "in our CPUs" because
- the Web server is physically inside the CPU die
- you can't remove or change it thanks to code signing, so (to me) it's truly wedged in there
In effect, it's as hardcoded as the electrical circuitry and the transistors are.
Andrew Tanenbaum penned an open letter of surprise and shock because the use of MINIX was an implementation detail ("the licensing works for us, and it's been stared at by tons of professors and a bunch of smart kids for about 20 years, it's good"). Okay, so it's running on 3 little 486-and-a-bit-class x86 cores, and it isn't embedded into the main execution pipelines (...yet. I can see that being attractive, something something "software-defined ICE").
These details don't change the bigger picture - until someone can break the ME signing infra in a way Intel can't easily fix, and we can disable (or, more ideally, take over/pwn) ME for good, it's as good as mask ROM.
> - the Web server is physically inside the CPU die
It's loaded from external storage, and probably runs in external DRAM.
> - you can't remove or change it thanks to code signing, so (to me) it's truly wedged in there
They literally just did, that's what the linked article is about.
> In effect, it's as hardcoded as the electrical circuitry and the transistors are.
If your criterion for hyperbole is the inability of the user to make modifications, then every effective DRM strategy is "as hardcoded as the electrical circuitry and transistors" also.
That's silly. It's a CPU. It runs software. It speaks to devices with drivers. There's no technical meat to your argument.
>> - the Web server is physically inside the CPU die
> It's loaded from external storage
No, it's stored on NAND located physically inside the CPU.
> , and probably runs in external DRAM.
It can access all of main memory (and actively uses this ability as part of operations, probably for MMIO communications with UEFI and SMM), but I do think the little 486+-class cores have a bit of their own dedicated RAM on-chip. This makes sense; you don't want MINIX's operation interfering with whatever OS is running, and besides, if you did use main RAM, masking the used pages (with the MMU), so the OS couldn't simply observe/control everything, would honestly leave too much of a visible dent in the system and probably make more of a stink.
>> - you can't remove or change it thanks to code signing, so (to me) it's truly wedged in there
> They literally just did, that's what the linked article is about.
Right. Now to wait and see how long this jailbreak lasts for...
>> In effect, it's as hardcoded as the electrical circuitry and the transistors are.
> If your criteria for hyperbole is the inability of the user to make modifications, then every effective DRM strategy is "as hardcoded as the electrical circuitry and transistors" also.
I'm taking into account the specifics of this particular scenario. I'm aware of other context, but in this case I'm not generalizing. ME updates are signed, and there's currently no complete "perfect" jailbreak, so the effective summarization is "it's locked down".
Considering the specifics of other DRM implementations - well, my favorite DRM is Widevine, since that runs on Linux and there's nothing like HDCP for audio yet, so... https://news.ycombinator.com/item?id=15796420 :D - but see all the replies :(. Some DRM is indeed that locked down in practice, with no straightforward recourse.
> That's silly. It's a CPU. It runs software. It speaks to devices with drivers. There's no technical meat to your argument.
Technically you're right, for a strict/narrow definition of "CPU" that describes an abstract bridge between a perfect software environment and the squishy/vague real world. I'm not using that definition. I'm also not looking at Intel products as CPUs here (or, okay, not just CPUs), but as devices that contain a component I literally cannot control, with the exception of some vulnerability PoC code that's already been patched. The established status quo is that I cannot own this part of my hardware, that I'm buying the physical package but relinquishing [control over] some aspect[s] of its operation[s] to the manufacturer['s agenda].
The ME runs Minix, which does use drivers, sure. But that's back to looking at hardware exclusively from the software side of things, which is not the basis of this argument.
I worked at a place where we designed and rolled out a control system based on HTTP on the microcontroller boards - the non-safety-critical hardware configuration system.
You could have done it with SNMP, but one of the developers got a really simple HTTP server into a smaller runtime, and it simplified the job of writing the front end.
Unless the tool is wrong - though I think it was generally regarded as a reliable source. In which case maybe someone can recommend a reliable one to test the vulnerability.
Tools usually just check CPUID and system configuration and don't actually test the vulnerabilities - and they don't necessarily interpret everything correctly. You can do that check yourself without running anything: just look at your OS updates and whether your CPU is an out-of-order one, i.e. with speculative execution. The N270 isn't, and therefore isn't vulnerable.
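Incidentally, on Linux (4.15-ish and later) the kernel publishes its own verdict under sysfs, so you can check without executing any exploit code. A minimal sketch - an in-order CPU like the N270 should show "Not affected" for the speculative bugs:

    # Prints the kernel's Meltdown/Spectre assessment for this machine.
    from pathlib import Path

    vuln_dir = Path("/sys/devices/system/cpu/vulnerabilities")
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            print(f"{entry.name}: {entry.read_text().strip()}")
    else:
        print("kernel predates sysfs vulnerability reporting")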
If you want to truly test the speculative vulnerabilities, compile this program: https://github.com/Eugnis/spectre-attack (EDIT: although this one probably won't work on the N270, since it uses rdtscp, which that CPU doesn't have; you'd need to find a version that uses plain rdtsc).
No, the Xeon Phi "accelerators" are usable too; they are basically 486 cores on modern lithography (to allow for higher density/clock speeds), with a vector unit attached. I don't know how hard it would be to boot Linux on one, though...
The host system should be not much more than a PCIe root emulator, though. That's a level one can reach on, e.g., an FPGA with custom logic, which implies that any attempt to insert hardware/firmware-level attacks into the actual logic you care about is near impossible due to the low-level nature of the custom PCIe implementation.
A 10-year-old buffer overflow? Are they not even running static analysis? If I were working with sensitive info, I would take precautions. I said this the last time the ME got patched: what are the chances that this is the last/only bug?
Just a note that if you want to avoid Intel's disastrous Management Engine, there are companies you can support that disable it.
Purism[0] sell nice MBP-style, Debian-based laptops with modern Intel processors with the NSA's 'High Assurance Platform' bit set, and as much of the ME code removed as possible. It still runs briefly at boot, but this is the most-disabled you can currently get on any i3/i5/i7 processor[1].
The last Intel processors where the ME could be removed entirely without bricking were the non-AMT Core 2 Duos (2008-ish), which were used on the ThinkPad T400 (good for your biceps) and the X200/X200T (thick, but compact, even by today's standards).
Various companies (most prominently the Ministry of Freedom in the UK) sell these models with the ME completely removed, and a completely Free Software boot process via LibreBoot (a subset of coreboot). You can find a full list of suppliers on the FSF's 'Respects Your Freedom' hardware page[3]. Most of them will also remove the ME from a compatible laptop you send them, as a service.
These machines are also 'naturally' resistant to both Spectre and Meltdown, and obviously have no ME to exploit. None of the Intel horror-shows of the last few years seem to have touched them.
I previously thought that running an old machine for largely hypothetical freedoms was bizarre. After these CVEs, I'm beginning to re-examine how bizarre it really is. And I do miss those old ThinkPad keyboards :)
But not on AMD machines from 2012 and before. You can buy a high-end motherboard (the KGPE-D16) that can run Libreboot, with two 16-core Opteron 62xx CPUs and 192 GB of RAM. You don't have to go the old and relatively slow ThinkPad route to achieve freedom.
Yes that's a good starting point. And a great option if you have the money for it. This will also support the people that put in hard work to achieve and provide ultimate user freedom.
If you are on a tight budget, however, I recommend buying the motherboard on AliExpress (for about $200), and if you are not comfortable flashing Libreboot yourself, you can ask a company that supplies BIOS chips for this motherboard to flash a custom BIOS, providing the Libreboot ROM binary to them. You can easily swap the BIOS chips yourself. The Libreboot and coreboot websites have lists of compatible hardware (RAM/CPU).
Also, if you are interested in newer liberated hardware, look into the Talos II. They provide a proper workstation that comes with only free software. It will be more difficult to set up since it has a different CPU architecture, but it is definitely the way forward.
This is a great reference - but I confess to being mightily disappointed at the state of the industry with regards to this issue.
Sadly, it seems like it's going to take a major incident before people start to pay attention. I can't believe I'm actually rooting for the black hats to do something so terrible that it wakes us all up.
System76 is probably the most well-known Linux laptop distributor in the US, their offerings are pretty popular. I know a few people with one of their laptops and all of them are pretty satisfied.
Purism is a bit more "extreme", trying to create an all-libre laptop that avoids proprietary software and hardware. Once their equipment is a bit more stable it's something I'm going to look into more; for now they're still in development, though.
This feels like the Onion story “‘No Way To Prevent This,’ Says Only Nation Where This Regularly Happens” that they post after every US mass shooting, just with the picture changed.
Intel will be fixing ME vulnerabilities forever. It has a huge attack surface, but it's too obscure to get serious resources from them.
Finally it happened. Here's to hoping that after being exposed to this kind of risk, enterprises and regular customers start being more inquisitive about what code gets embedded into their hardware and why.
Finally? That ME thing should be nowhere near private and confidential data. There are constantly bugs being found in it [1][2]. Honestly, if you are a large company, organisation, government, etc. and you are using Intel or AMD products, then you are being very irresponsible. There is no excuse; enough information is out there that even a non-technical CTO should know better.
There is quite literally no viable alternative to x86 for 95% (more like 99.9%, but I am being generous) of the server and workstation market. Pretending like there is and anyone choosing x86 is irresponsible is just being a smug fool.
I would say the server market could easily adapt to ARM, POWER, or RISC-V. However, I would concede it's very difficult to replace the desktop/laptop workstation. Chromebooks are making great strides on this, though. Yes, they are still sending lots of data to Google's servers, but they do a lot of open source work. Look at coreboot with depthcharge, which can give you a pretty free (not 100%) ARM notebook.
The newest supercomputer, Summit, ranked as the largest/fastest, runs ppc64le, which definitely makes the argument that there is an alternative to x86 for servers.
I heard Google spends a lot of money and effort to (slowly) move to Power9. It does have a management processor but it's open for inspection and modification.
Maybe other cloud providers, and/or private clouds, would consider that.
What is crazy is that we sent people to Google to discuss moving stuff to their cloud offering, and when I asked about the Power9 thing, nobody knew anything!
Would be incredibly awesome to see POWER9 options on Google's public cloud VMs…
Meanwhile, ARMv8 is already available to the public!
packet.net offers access to a full dedicated dual ThunderX box for $0.50/hr (or $0.10 with spot instances).
Scaleway offers KVM virtual machines (also on ThunderX) for as little as €2.99/month (€0.006/hr).
First gen ThunderX kinda sucks at single-thread performance, but you get many cores and… well, you get to start using non-x86 machines, on public cloud, right now.
If developers start actually using ARM boxes for their projects, the providers will be more likely to expand in this direction, and maybe in a year (or less?) we'll see the much more powerful ThunderX2 boxes available, and hopefully more providers will get into this…
I'd bet that Google has probably 10x more engineers working on improving the situation with their various Intel chipsets (being on laptops, or servers), than those working on "Plan B" (Power9), or "Plan C" (RISC-V) solutions.
As far as I could find out from Intel AMT docs [1], remotely accessible AMT requires an AMT-enabled network adapter.
I suppose built-in adapters of Intel chipsets have this feature (at least if marked as vPro).
This means that there's a quite decent chance that your non-Intel-branded PCI Express NIC is NOT AMT-enabled. Most likely your USB-attached WiFi adapter is also inaccessible to AMT.
This, if correct, means that your home machine, or your laptop, can be protected from this or any future remotely-activated AMT vulnerabilities by disabling the built-in NICs in BIOS, and using a third-party NIC, either for wired or wireless communication.
(For a server fleet, it's different, but you likely don't want to lose AMT remote access if you have a few racks full of servers anyway.)
Unless your machine is branded as vPro, it most likely lacks the FW part to run AMT. It will still have the ME in the chipset, and it may well have the correct CPU and NIC for vPro (usually the NICs with an M, for management, in the model name), but it's missing the firmware.
Outside of OEM machines, the only time I managed to build a vPro-enabled system was in Haswell times, when Intel had desktop motherboards with the correct chipset and NIC combination, and the BIOS to run it. Right around that time Intel exited the motherboard business, and most manufacturers don't bother with shipping the firmware anyway.
Wow, I have never seen this before. All the other incredulities aside, I was very surprised by this:
>"I got another clue when your engineers began asking me to make a number of changes to MINIX, for example, making the memory footprint smaller and adding #ifdefs around pieces of code so they could be statically disabled by setting flags in the main configuration file."
Why would Intel ask Tanenbaum to make changes for them? Doesn't Intel have unlimited resources?
I have multiple machines with AMT and I'm actively using it every day. Some get patches, some won't because they're gen 3 CPUs or lower. Luckily I can be reasonably confident that the local network is secure so the bug isn't exploitable. This time.
I'm sure hoping AMD does better with their support of the PSP.
AMT/vPro is apparently not for servers, and likely operates on the system NIC. The first rule of out-of-band management interfaces should be "use a physically separate interface", which is unfortunately frequently broken (by one vendor even when the procurement specified a separate interface).
The devices which have the feature enabled are probably business devices and the feature is used to manage them. Business devices are good value targets, I guess.
Edit: I guess those devices will probably receive the fix, since they are managed.
As far as I know, while the Management Engine is in all chipsets that accompany Intel CPUs, Apple never shipped any AMT-enabled firmware, and AMT is the more exposed component.
The ME is most definitely there, but AMT is not. And AMT is the one with far more exposed security flaws that can be exploited over the network, by virtue of AMT's purpose, like the ones detailed in the article here. Otherwise, without a shadow of a doubt, the ME is present in every Intel chipset since 2006.
Exploiting the ME is possible even without AMT but it definitely raises the bar in the sophistication of the attack.
The me_cleaner tool might do a good job of disabling the ME in most cases, but since it does so by removing components from the ME FW, it probably doesn't work with every OEM implementation.
me_cleaner removes most of the ME code (including the HTTP parser listed here) and then causes it to crash after bringing up the system, so it's impossible to communicate with the processor running ME. That's about as good as it gets.
I have been told somewhere on the scary internet that the nature and architecture of a Mac makes the ME dysfunctional, because everything else is Apple-made: custom chipsets and so on.
There is a Python script that can take a BIOS image (either from a vendor or dumped from a running system) and remove all ME components that are not absolutely required to operate the CPU. I have never tried it.
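That's me_cleaner, mentioned above. For a feel of what it operates on: the ME firmware in a flash dump sits behind a partition table whose magic is "$FPT", which me_cleaner locates before stripping partitions. Here's a toy sketch of my own (not part of me_cleaner) that merely hunts for that marker as a crude "is there ME firmware in here" check:

    # Toy sketch: report offsets of the ME Flash Partition Table
    # marker ("$FPT") in a SPI flash dump. me_cleaner parses this
    # table properly; this only flags candidate locations.
    import sys

    def find_fpt(path):
        with open(path, "rb") as f:
            data = f.read()
        offsets, start = [], 0
        while (idx := data.find(b"$FPT", start)) != -1:
            offsets.append(idx)
            start = idx + 1
        return offsets

    if __name__ == "__main__":
        for off in find_fpt(sys.argv[1]):
            print(f"possible ME partition table at offset {off:#x}")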
For those who need a step by step tutorial for using me_cleaner with the Raspberry Pi, check out my easy video guide, which assumes no background knowledge.
Neither does Intel's advisory (which is something we have come to expect from Intel - complete nonchalant disregard for details/quality/security/customers).
They could specify whether I'm safe with AMT turned off, or whether they went extra stupid and AMT processes packets even when it's off. My bet is on the latter, because otherwise they'd just tell people to disable AMT. My next processor will be AMD.
I wonder what other (somewhat) laptop-worthy CPUs offer a better management engine story?
* AMD processors do have an equivalent management engine (PSP), but I didn't hear anything about remote exploits for it.
* Beefier ARM CPUs also have something like a management engine (TrustZone, only accessible to the manufacturer). I have no idea if it has any remote-access capabilities on any common hardware. On the RPi, TrustZone is absent.
* Power9 does have a management engine, but it's open and you can upload your own management code. The CPU is not an option for a laptop, and hardly even for a desktop, though.
In order to support reliable mass-remote-update, what is needed is an ME which is disabled by default but can be enabled via a non-reversible opt-in, such as breaking off a pin.
Then a supplier could configure bulk orders to enable the ME and it would be left up to the customer to choose the security-for-convenience tradeoff.
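In code terms, the proposal amounts to something like this toy model (entirely made up, of course; real hardware would use an OTP fuse or a break-off pin):

    # Toy model of the proposed opt-in: ME ships disabled, and the
    # only transition is a one-way enable (an OTP fuse / broken pin).
    class OneTimeOptIn:
        def __init__(self):
            self._fuse_blown = False  # factory default: ME disabled

        def enable_me(self):
            # Irreversible by construction: there is deliberately
            # no way to reset the fuse once it has been blown.
            self._fuse_blown = True

        @property
        def me_enabled(self):
            return self._fuse_blown

A bulk purchaser could order machines with the fuse already blown; everyone else gets a dead ME by default.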
My reasoning was that the step needed to be non-reversible in order to duplicate the current behavior. However, after thinking over your suggestion, I haven't come up with a scenario where that would be important.
Ordinary users, even if they somehow experienced the temptation to disable remote updates, wouldn't have the expertise to act on it. And any malicious actor with physical access to the machine would have other more straightforward attack vectors (like USB vulns).
So I think you're right -- and Intel has even less justification for hard-wiring the ME on by default.
Another theory is that other actors have the will and influence to make sure it stays that way.
We used to answer concerns about mass surveillance with tin foil hat jokes. Then came PRISM, and nobody is laughing now.
Maybe it's going to be the same with this. We will learn many years later that it was enforced by some state or economic entity that benefits greatly from having a standard unpatchable backdoor on most laptops and servers on the planet.
The greed and irresponsibility are boundless. They could set aside 10 billion and have excellent security. 1 billion could have bought a thousand PhDs digging through their processors, and I bet they would dig out boneheaded flaws like Spectre.
Out-of-band management is fine, but have it completely disabled by default, have it run on a co-processor optimized for security, not speed, in a language optimized for security, not speed, and when it's on, have it drop all packets that aren't signed with a unique secret key. Basic stuff.
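The packet-signing part really is decades-old machinery. A minimal sketch with a per-machine shared secret and HMAC-SHA256 (all names and framing invented for illustration):

    # Sketch of "drop all packets that aren't signed with a unique
    # secret key": a 32-byte HMAC-SHA256 tag is prepended to each
    # management frame and verified before anything is processed.
    import hashlib
    import hmac
    import os

    SECRET = os.urandom(32)  # provisioned per machine at enrollment

    def sign(payload):
        tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
        return tag + payload

    def verify(frame):
        tag, payload = frame[:32], frame[32:]
        expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
        if hmac.compare_digest(tag, expected):
            return payload
        return None  # unsigned/forged: drop it

A real protocol would also need replay protection (a counter or nonce inside the signed payload), but that's the point: none of this is exotic.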
Think about managing tens of thousands of servers in a datacenter. The ability to do everything you can do from a local console (and preferably more), without physical access or a KVM switch, is very important.
Remotely managing a corporate desktop or laptop, e.g. fixing an OS-level problem remotely, may also be important.
OTOH I'd prefer this functionality to be clearly delineated, using strong encryption, and with an explicit, reliable "off" switch (preferably a physical one).
> Think about managing tens of thousands of server in a datacenter.
The proper solution is an optional management chip on the motherboard, not the CPU. PCI bus-mastering NICs with wake-on-LAN and other management features have existed for decades; it wouldn't be particularly difficult to add the rest of the ME features.
Frankly, I don't give a fuck about datacenters. I'd rather pay $100 more and NOT have remote functionality in my hardware.
There is no universe where what we have now is a good solution, and no universe where there aren't superior alternatives (like a custom BIOS for datacenter users, or just spending a little more money and shipping two versions).
If this is a problem, configure your OS better, or get a better OS. Otherwise you'll soon need another ME on top of your ME to "fix an ME-level problem remotely".
How many layers of machines do we really need?! Most datacenters already run VMware with Docker inside, running a Java VM, executing JavaScript inside that.
How many people honestly use Intel's ME for managing their servers (or large desktop installations)?
I have yet to see or even hear someone defend the ME on the grounds that they use it daily. With a few SPARC servers as the exception, we have 100+ Intel-based servers, and we don't feel the need to use Intel's Management Engine.
It depends on how expansive "we" is. I don't need or want it. You probably don't need or want it. Large data center operators want it because it saves time and money. Those are Intel's important customers. That's why it exists.
I ask because I've never seen its webserver on my home network.
Heck, I don't even get how it could connect to the internet on a powered off device without Ethernet.
Yes, and it can't be disabled on any newer system. The Active Management Technology (AMT) application, part of the Intel "vPro" brand, is a web server and application code that enables remote users to power on, power off, view information about, and otherwise manage the PC. It can be used remotely even while the PC is powered off (via Wake-on-LAN).
The ME is present on all Intel desktop, mobile (laptop), and server systems since mid 2006.
Before version 6.0 (that is, on systems from 2008/2009 and earlier), the ME can be disabled by setting a couple of values in the SPI flash memory. The ME firmware can then be removed entirely from the flash memory space. libreboot does this on the Intel 4 Series systems that it supports, such as the Libreboot X200 and Libreboot T400. ME firmware versions 6.0 and later, which are found on all systems with an Intel Core i3/i5/i7 CPU and a PCH, include “ME Ignition” firmware that performs some hardware initialization and power management. If the ME’s boot ROM does not find in the SPI flash memory an ME firmware manifest with a valid Intel signature, the whole PC will shut down after 30 minutes.
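The "reachable while powered off" part is less magic than it sounds; it's the same idea as plain Wake-on-LAN, which WoL-capable NICs have done for years. For reference, a minimal standard WoL magic-packet sender (this is ordinary WoL, not the AMT protocol itself):

    # Standard Wake-on-LAN "magic packet": 6 bytes of 0xFF followed
    # by the target MAC address repeated 16 times, as UDP broadcast.
    import socket

    def wake(mac, broadcast="255.255.255.255", port=9):
        mac_hex = mac.replace(":", "").replace("-", "")
        payload = bytes.fromhex("FF" * 6 + mac_hex * 16)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
            s.sendto(payload, (broadcast, port))

    wake("aa:bb:cc:dd:ee:ff")  # NIC must have WoL enabled in firmware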
Isn't this vulnerability based on AMT, which is based on the ME but disabled by default? Even then, every setup I've seen has AMT (a separate Ethernet interface) behind a firewall, accessible only via the local network. The outrage is hardly justified.
There are X thousand Redis servers exposed to the Internet too. This is hardly Intel's fault (having the ports exposed, that is, not the vulnerability).
And again, this is not the main point I’m arguing. What I’m saying is that supposedly “this is something that’s enabled by default on consumer devices” is verifiably wrong.
If you run Redis on a public interface without authentication, it will spit out a bunch of warnings and make you aware of the security implications. The changes antirez has made to Redis, both in terms of secure defaults and notifying users of insecure settings, have directly led to a huge reduction in Internet-exposed Redis instances.
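Exposure is also trivially testable, since Redis speaks a plain-text inline protocol: an unauthenticated instance answers PING with +PONG, while protected-mode or authenticated instances reply with an error. A quick check of my own:

    # Check whether a Redis instance answers unauthenticated commands.
    # No client library needed: Redis accepts "inline" commands, so a
    # raw PING suffices. Protected-mode/authed servers answer with an
    # error line ("-DENIED ..." / "-NOAUTH ...") instead of "+PONG".
    import socket

    def redis_is_open(host, port=6379, timeout=2.0):
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.sendall(b"PING\r\n")
                return s.recv(64).startswith(b"+PONG")
        except OSError:
            return False

    print(redis_is_open("127.0.0.1"))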
And I was trying to address this point:
> Even then, every setup I’ve seen have AMT (a separate Ethernet interface) behind a firewall and is only accessible via local network.
In the past, manufacturers used that defense when a security researcher approached them about a problem, justifying the lack of patching by saying things like "nobody would put this on the Internet". There are simple things a manufacturer can do to encourage good security by the end user (e.g. showing a warning). I don't believe that blaming the end user is a viable path to fixing the problem. This issue isn't specific to Intel, but I would prefer it if the vendor implemented more safeguards to prevent users from inadvertently increasing their attack surface.
One day it's "[thing] shouldn't be exploitable because [mitigation]", the next it's "welp, [mitigation] has a bug in it and they've exploited [thing]."
Right, and network security is always part of the attack surface of an enterprise. What I'm saying, though, is that the component that's vulnerable is 1) disabled by default and 2) nearly impossible for a consumer to enable.
Could the Intel ME be disabled in UEFI firmware setup (if the firmware were to offer a UI for it)? Or is it something that's physically enabled/disabled on the CPU and totally orthogonal to the UEFI firmware?
Ergo, could the computer manufacturer release a firmware update providing such an interface option in firmware setup if they really wanted to? Or are they stuck once the product is released?
You can disable the ME by giving it a firmware image to run that does nothing. The me_cleaner approach is to keep the module that brings up the hardware and then give it a command causing it to crash, which is effectively the same thing. A firmware update can definitely do this too.
I think most devices have similar vulnerabilities which aren't well known and are hard to defend against, like the separate baseband processor in most phones.
Worth reading:
https://news.ycombinator.com/item?id=6722292
So how did they go about making these fixes? Is this another thing where I have to download something from my OEM?
The biggest problem I have by far with any of this is that it's not trivial to update all firmware involved. Everything else is forgivable, people make mistakes.
How can you classify a deliberate architectural decision to trade away security for all for the convenience of some as a mistake? That the ME would eventually be exploited was completely foreseeable, and was surely foreseen and discussed within Intel.
This isn't like some subtle software bug that went undetected. The consequences were known and Intel deliberately chose them.
I finally pushed the button on my Lenovo T450s: there is a setting in the BIOS to delete the AMT. While the best route is to reprogram... I would rather just click one button and set BIOS passwords afterwards.
The ME (AMT) is never actually disabled or deleted as long as the FW is there and running.
Plus you should take advantage of the fact that the 450 is still supported and gets a ME FW update.
Yesterday I updated all my machines with Gen 4 CPUs with a new BIOS, new ME FW, and (surprisingly) new TPM FW. The Gen 3 CPU machines barely got a BIOS update for Meltdown/Spectre, and that's it.
Updating the BIOS if you have this option goes without saying. TBH, the BIOS and the various FW in your machine should always be kept up to date. Just give them a month from launch and let others test to make sure there are no serious issues, then update.
And as long as you are not using it and don't need it, you might as well disable and unconfigure it in the BIOS.
Of course, in this state your machine is ready to be reconfigured and will accept the default "admin" ME password. That means you have to set a good BIOS password, which will prevent someone who has 2 minutes alone with your machine from re-enabling and configuring it without you even noticing.
When dealing with consumer-grade network equipment, this is the same question that always comes to my mind: who decided these devices should have their management features open for WAN access by default?
I'm still not sure if it was an early 2000s fad that nobody really thought about, or it was deliberate (and if so, why).
Most organizations big enough to have an IT department that isn't in the same room like these features, because you can do things like restarting machines remotely to ensure software upgrades or installs happen on schedule.
Unfortunately, many of those places historically didn't have things like separate management LANs or good filtering, because everything was set up around convenience, and the desktop support people probably weren't security experts.
I use this simply because I need power on/off and remoting capabilities on machines running environments where I cannot configure such capabilities otherwise (meaning I have zero recourse: no RDP, no TeamViewer, no VNC, etc.).
The reason these don't show up on Shodan is that the search engine doesn't scan private networks, and AMT has to be explicitly configured to be internet-accessible. You have to configure AMT/ME in the BIOS, and you have to allow the connection through your router. Very few people will actually do that.
No, it means that AMT is not exposed to the internet. The ME itself is not exposed, but the AMT running on top of it is designed to be accessed over the network.
Unless you explicitly expose AMT to the internet you are relatively safe as long as your local network isn't compromised.
So it can be exploited only by an attacker with physical access or at least in the same network. Shodan wouldn't show you any of these machines that have ME configured and AMT blocked from crossing your router into the internet.
In a typical home network “any device on the same network” is a very large attack vector. You just need some unsecured webcam, router or other cheap IoT device; the Mirai botnet proves that those are widely deployed.
Would it suffice to mount and use an external Ethernet adapter, e.g. on PCIe, leaving the built-in one disconnected?
I think the ME intercepts only the built-in Ethernet; can someone confirm?
* For servers or desktops, you can plug in a separate PCI network adapter instead of using the one on the mainboard (please correct me if this is wrong or confirm it as I'm unsure about it). That would at least disconnect the ME from the network by default. But anybody could still walk up to your machine, plug a cable into the mainboard ethernet port and own you at the deepest level.
me_cleaner does not disable the ME. It is a partial disablement of ME functionality, but some functionality remains enabled. The ME firmware is an Intel-signed proprietary binary blob, part of which is instrumental in the system boot process, so complete removal is impossible.
me_cleaner and/or the HAP bit, or the services offered by laptop vendors (which basically do the very same thing for you), may certainly reduce the attack surface and the extent to which the ME poses a threat, but it is not a complete disablement or removal, and you are still reliant on a non-modifiable binary blob to bring up your system; referring to it as removal is misleading.
Since the firmware is proprietary, it's hard to make any guarantees to what extent a reduced-size ME (via me_cleaner and/or the HAP bit) reduces attack surface in practical terms. My understanding is that even with the HAP bit and me_cleaner applied, the ME continues running at least some functionality after system boot is completed.
This is correct but misleading. The me_cleaner approach, with all options, wipes the entirety of the ME firmware except the module needed for hardware bringup. It then causes the ME to crash as soon as hardware bringup has happened. The host system cannot communicate with the ME processor and the ME processor does not execute any further code after this point. This is the current gold standard.
The next stage would be to disassemble the bringup modules and figure out what exactly they do by reverse engineering, and implement that part independently. People are working on this. So far there is no indication that any of the problematic functionality is in the bringup module.
Nothing requires that the ME initialize the hardware: the BIOS can do so as well. So if the initialization sequence is well understood, the ME can be disabled entirely and the initialization done by coreboot or whatever. This is possible on Intel CPUs. Unfortunately, it doesn't work on AMD, as the other cores cannot wake up unless the secure coprocessor clears the reset lines.
It seems likely, to me anyway, that the CPU doesn't have any effective way of verifying what is running on the ME. So reverse-engineered firmware could just lie to the CPU and tell it that its firmware is signed when it isn't. A similar approach is used by the microG project to replace proprietary Google Play Services by spoofing Google's signature.
Leaving those backdoors open in older products should lead to a recall because the flaw was there all along.