
Intel Security Issue Update: Addressing Reboot Issues - bcantrill
https://newsroom.intel.com/news/intel-security-issue-update-addressing-reboot-issues/
======
bcantrill
This is -- to say the least -- frustrating. First, the busted microcode is
still available on the Intel Download Center[1], without any warning that they
recommend that you not, in fact, install it. Second, the press release is
still being evasive: they have not merely "received reports"; they in fact
know that it's causing issues, and the press release is avoiding the much
stronger language that Intel is giving privately (namely, don't install this).

The broken microcode is (at some level, anyway) forgivable; Intel's ongoing
inability to communicate transparently and honestly with its customers during
this crisis of its creation is much less so.

[1] [https://downloadcenter.intel.com/download/27431/Linux-Proces...](https://downloadcenter.intel.com/download/27431/Linux-Processor-Microcode-Data-File?product=52214)

~~~
gatmne
Why should they bother communicating clearly with their customers? Who are
those customers going to turn to? AMD? ARM?

Between Intel's numerous CPU bugs that they refused to refund customers for
and ME, it's crystal clear what Intel thinks about their customers.

~~~
SteveNuts
Customers actually could turn to AMD... their offerings are very competitive
right now.

~~~
djsumdog
I'm thinking of building an AMD dev box. For enterprise consumers, if they're
using 1U or blade servers, they could make the choice to switch to AMD for
future nodes.

~~~
noir_lord
I have a Ryzen developer box at work and it is awesome; paired with 32GB of
RAM and SSDs, it absolutely screams.

My next home PC will be Ryzen 2 at some point this year.

~~~
leonroy
Linux or Windows? I've been doing dev on a large React app recently and the
thought of running _npm install_ on Windows makes me anxious about the
performance vs my Mac - wondering if Windows has gotten better of late with
lots of tiny file I/O.

------
rusbus
I think it's a bit ironic that the text, in a sense, blames Google for these
problems by calling them the "Google Project Zero Exploits" as if Google were
some sort of cyber crime syndicate using its evil powers to exploit Intel.

~~~
9935c101ab17a66
Wow, that's a good point. "...the exploits uncovered by Google Project Zero"
would be much, much more appropriate.

~~~
Animats
The "Intel CPU fundamental design defects uncovered by Google" is more like
it.

~~~
HugoDaniel
"Intel CPU _deliberate_ design defects now uncovered by Google"

------
mrsaint
Uhm, what a mess. This, just when Linux vendors began pushing updated
intel-microcode packages (Ubuntu just released intel-microcode 3.20180108.0). Should
we put the update on hold until this issue is hopefully resolved, or should we
still update as suggested in the last paragraph of this Intel press release,
somehow believing that the random reboots don't apply to "end users"?

~~~
lower
Lenovo has put out an advisory about what to do with the BIOS updates that
contain the microcode:

[https://pcsupport.lenovo.com/de/en/product_security/PS500151](https://pcsupport.lenovo.com/de/en/product_security/PS500151)

 _Withdrawn CPU Microcode Updates: Intel provides to Lenovo the CPU microcode
updates required to address Variant 2, which Lenovo then incorporates into
BIOS/UEFI firmware. Intel recently notified Lenovo of quality issues in two
of these microcode updates, and concerns about one more. These are marked in
the product tables with “Earlier update X withdrawn by Intel” and a footnote
reference to one of the following:

1 – (Kaby Lake U/Y, U23e, H/S/X) Symptom: Intermittent system hang during
system sleep (S3) cycling. If you have already applied the firmware update and
experience hangs during sleep/wake, please flash back to the previous
BIOS/UEFI level, or disable sleep (S3) mode on your system; and then apply the
improved update when it becomes available. If you have not already applied the
update, please wait until the improved firmware level is available.

2 – (Broadwell E) Symptom: Intermittent blue screen during system restart. If
you have already applied the update, Intel suggests continuing to use the
firmware level until an improved one is available. If you have not applied the
update, please wait until the improved firmware level is available.

3 – (Broadwell E, H, U/Y; Haswell standard, Core Extreme, ULT) Symptom: Intel
has received reports of unexpected page faults, which they are currently
investigating. Out of an abundance of caution, Intel requested Lenovo to stop
distributing this firmware._

~~~
leeter
It gets worse: Lenovo shoved out that firmware update as a 'critical' update
back in December, and now it's causing major issues.
[https://forums.lenovo.com/t5/ThinkPad-T400-T500-and-newer-T/...](https://forums.lenovo.com/t5/ThinkPad-T400-T500-and-newer-T/KB4056892-multiple-problems-on-T440s/td-p/3933019)

------
Animats
Spin control. By "higher system reboots" they mean "the OS crashes after their
CPU 'fix'".

------
ksec
Sorry if I have this wrong, but they knew of the issues in June, and they only
fixed them in January.

I can hardly call this rushed.

~~~
dboreham
The fixes weren't widely run until this week.

~~~
itronitron
Presumably because Intel does not have access to the necessary hardware
configurations to test the fixes on?

~~~
ksec
I can't accept this as an excuse when you have possibly the worst CPU bug
in x86 history, or perhaps all CPU history, with ample manpower and
resources, along with a six-month time frame.

Rush isn't a word I would use.

------
sateesh
I think all the more jarring is the smiling face of the spokesperson next to
this announcement. At the very least, couldn't the announcement run without a
photo, or with a serious-looking photo of the spokesperson?

~~~
AngeloAnolin
The photo seems to remind me of someone saying, "that's all you get,
_suckers_" coupled with an evil grin.

------
benjarrell
“We have received reports from a few customers of higher system reboots after
applying firmware updates.”

What does higher mean here?

~~~
Someone
The original statement, as phrased by their engineers, probably was something
like “Our latest firmware regularly crashes your system, triggering reboots”
(plus a few paragraphs with a highly detailed description of why that happened
that only the engineers who wrote the firmware would understand)

This is what they ended up with after a few reviews with legal (“we can’t say
‘our’; they’ll eat us in court”) and marketing (“We need a less emotionally
loaded way to say ‘crash’”)

Legal aimed to maintain just enough meaning in the statement to be able to say
“we warned customers as soon as we could”; marketing aimed to make it a
positive message. I guess that’s why ‘higher’ won over ‘more’.

------
gesman
The happy face on the profile photo of that post is a bit out of step with
customers' feelings...

------
mtgx
Intel's "security pledge" is hilarious, when ME/AMT continues to be used as a
backdoor.

This just in:

[https://business.f-secure.com/intel-amt-security-issue](https://business.f-secure.com/intel-amt-security-issue)

------
partycoder
I think it's hard to see how this will affect Intel in the long term.

When Samsung phones were blowing up, I thought that was it, but somehow people
kept preferring the phones.

~~~
partycoder
Now, in retrospect, the Samsung battery issue affected only a small portion
of users, whereas this will affect every single user in the form of decreased
performance.

------
pQd
Probably related news from Dell: "NOTE 1: 13G, select 12G, and select DSS
server BIOS files have been pulled from
[http://dell.com/support](http://dell.com/support). This note and article will
be updated as soon as more information is available" [1]

The pulled BIOS update files for 13th gen were released on the 5th of January.

[1] [http://www.dell.com/support/article/us/en/04/sln308588/micro...](http://www.dell.com/support/article/us/en/04/sln308588/microprocessor-side-channel-vulnerabilities-cve-2017-5715-cve-2017-5753-cve-2017-5754-impact-on-dell-emc-products-dell-enterprise-servers-storage-and-networking-?lang=en)

------
zimmerfrei
To be fair, it may well be that some sloppy OEM drivers make too many
assumptions about reserved bits in registers (which the new microcode may be
legitimately changing) or about undocumented timing side-effects related to
some instructions (which the new microcode may affect, that being the root
problem in the first place!).

These symptoms are also the classic ones you get when you install an OS on a
new-generation, well-functioning CPU.

~~~
joenathanone
Don't speculate, it adds nothing but confusion.

~~~
zimmerfrei
My comment is no more speculative than those blaming the new microcode for
causing reboots. The really poor quality, bad update process, and unjustified
binary-only nature of OEM firmware are too often overlooked.

------
compsciphd
My broadwell (i5-2500k) Windows desktop has started blue screening like crazy
if I do large continuous network transfers (i.e. saturating gigabit ethernet).
I didn't have this problem before I rebooted for my most recent update.

I thought it might be memory, but an 8+ hour memory scan (the Windows internal
one, not the normal Linux one) didn't tickle any bad bits, and it's not erroring
in any one component; each time, it seems to be a different one. (The first
time I caught the blue screen, it was the network driver, which made sense, so I
upgraded it, just in case.) But then it started being NTFS and other things.
Wondering if it's just limited to those arches, or others.

~~~
Narishma
> My broadwell (i5-2500k)

That's Sandy Bridge, 3 generations older than Broadwell.

~~~
compsciphd
and apparently with the latest news, it perhaps wasn't all in my head

[https://newsroom.intel.com/news/firmware-updates-and-initial...](https://newsroom.intel.com/news/firmware-updates-and-initial-performance-data-for-data-center-systems/)

------
ericzawo
Doesn’t exactly reassure you, does it.

------
Twirrim
Well this promises to be fun, especially for cloud providers (and those
running instances on the cloud, who now potentially get to suffer through host
instability)

~~~
jlgaddis
Intel's adaptation of Netflix's Chaos Monkey [0]?

[0]:
[https://en.wikipedia.org/wiki/Chaos_Monkey](https://en.wikipedia.org/wiki/Chaos_Monkey)

~~~
numbsafari
I was thinking the same thing as I read this. “Chaos engineering” has served
some customers very well here.

From speaking in my circles, I get the impression that those of us without our
own data centers to worry about are much better off than those who have them.

------
rmetzler
I heard from my colleague that he had boot issues with a server (new CentOS 7
kernel) and a RAID he set up today.

------
monochromatic
“We rushed a patch out and it’s causing problems we don’t understand yet.
We’ll rush out an updated patch as soon as we can, so please don’t hesitate to
install that.”

~~~
dang
Please don't use quotation marks to make it look like you're quoting someone
when you're not.

~~~
miles
It's clear that monochromatic's "quotation" was ironic.

A possible convention for ironic quotes:

[https://english.stackexchange.com/a/3471/207671](https://english.stackexchange.com/a/3471/207671)

> _To avoid the potential for confusion between ironic quotes and direct
> quotations, some style guides specify single quotation marks for [irony],
> and double quotation marks for verbatim speech._

------
DrScump
I _just_ had a hard crash/reboot on a Dell running Windows 10 on an AMD
processor, followed by a couple of auto updates. There was no indication of
updates being available before the crash.

~~~
mtremsal
Out of curiosity, how do you suppose an Intel microcode update caused a blue
screen on your system running an AMD processor?

~~~
drewm1980
These issues aren't limited to Intel, and the people developing the
mitigations are probably sharing ideas with each other.

