
Memtest86+ probably kills system controller on Lenovo Thinkpad T500 laptop - edward
https://bugs.debian.org/900399
======
pedrocr
The conclusion is that it's probably just a random fault, possibly triggered
by the memtest running the laptop hot:

_"So, we have to seriously consider the possibility that two laptops died at
the same time just by a coincidence."_

~~~
9935c101ab17a66
There's a thorough update and report of troubleshooting at the bottom which
strongly indicates it's not just a random fault. Did you read it?

~~~
pedrocr
I read the whole content on the page. There are three emails, all from the
same person. I quoted his latest conclusions.

------
gigatexal
The sleuthing done in the big report is amazing.

~~~
dsfyu404ed
Never underestimate the power of competent and interested people with free
time.

------
hueving
Title needs to be updated. It sounds like the load likely triggered a failure
in already-faulty hardware. Nothing specific to memtest.

~~~
hirsin
I think it's more damning of the hardware with memtest in the title. It's
literally a test program - by definition your hardware should pass it. If you
fail to pass the test, and fail catastrophically, that's on you.

------
hd4
Even the person submitting the report conceded that it was more to do with
faulty hardware (VCC3SW malfunction) than anything wrong with memtest86. This
needs to be flagged.

~~~
TazeTSchnitzel
Why should it be flagged? The faulty hardware is the interesting part!

~~~
Hydraulix989
lm_sensors when misused can also kill hardware by "probing" registers the
wrong way when looking for sensors.

[https://wiki.archlinux.org/index.php/lm_sensors](https://wiki.archlinux.org/index.php/lm_sensors)

Probing is different than testing, though, and people don't expect a general
system memory testing tool to kill hardware (whereas lm_sensors has all sorts
of warnings and takes the safest route possible by default).

A tool that is designed to test general system RAM should not be touching
memory-mapped I/O devices, and it's up to that tool to avoid them.

------
jnwatson
If the submitter was right, the problem was that the area of memory where a
power controller's registers were mapped was not marked as "reserved", so it
is entirely reasonable for memtest to poke at that memory.

In other words, the failure is actually in the laptop's BIOS that provides the
memory map.

------
londons_explore
I had a T420 which mysteriously died when running memtest and would never
power on again.

Disassembly showed that the power button itself had no power, and I suspect it
might have suffered the same fate.

------
watersb
I think this bug is worthy of HN attention.

While unverified, the failure mode is something that I had not considered
before.

I did not get far with electrical engineering, way too deep into software as a
career. I am getting back into simple things with Arduino and my hardware
hobby is leading me to apply systems-level thinking in a different domain.

If someone sends me a working T500 I will be happy to try memtest86+ on it.

------
chx
Note the T500 is ten years old.

~~~
ErunamoJAZZ
yeah, but no one could run memtest in a young laptop.

~~~
wolfgke
> yeah, but no one could run memtest in a young laptop.

Would you explain this a little bit more?

~~~
ErunamoJAZZ
Failure rates are usually high early in a product's life, low in the middle,
and high again in the long term. For hardware, this means a laptop either
fails soon (and is caught at the factory) or fails much later. A 10-year-old
laptop has a high probability of failing, so the probability that someone
wants to run memtest on it is very high too.
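
This shape is often called the bathtub curve. A rough sketch, assuming it can be modeled as the sum of a decreasing Weibull hazard (early failures), a constant rate (random failures), and an increasing Weibull hazard (wear-out). All shape, scale, and rate values below are made up for illustration and are not measured laptop data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative failure rate per year at age t_years (t_years > 0)."""
    # shape < 1: hazard falls with age (early-life failures)
    infant = weibull_hazard(t_years, shape=0.5, scale=20.0)
    # constant background rate (hypothetical value)
    random_failures = 0.02
    # shape > 1: hazard rises with age (wear-out)
    wearout = weibull_hazard(t_years, shape=5.0, scale=12.0)
    return infant + random_failures + wearout


for age in (0.1, 3.0, 10.0):
    print(f"age {age:>4} years: hazard {bathtub_hazard(age):.3f} per year")
```

With these assumed parameters the rate is high for a brand-new machine, lowest in mid-life, and high again around ten years.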

~~~
9935c101ab17a66
To be fair, that doesn't explain what you said earlier. It's probably what you
meant, but it isn't what you said.

------
thomas
Update the title please

