
Flash Memory Wear Killing Older Teslas Due to Excessive Data Logging: Report - rbanffy
https://www.tomshardware.com/news/flash-memory-wear-killing-older-teslas-due-to-excessive-data-logging-report
======
Unklejoe
I'm going to be a little petty here, so sorry in advance:

It's common for Tesla fans to mock other automotive manufacturers for being
"old and slow" - as if they're incompetent because they take so long to
develop new features and adapt to changes in technology. It's like a "why does
it take 5 years for BMW to update their infotainment system when I can do the
same thing with a Raspberry Pi in 5 minutes?" mentality.

What I think goes unappreciated by some is the level of testing that normally
goes into automotive grade development.

That's not to say that modern ICE cars are more reliable than a Tesla, but
their electronics are usually designed for a higher level of robustness.

Here are my issues:

- Why is the car rendered useless if the infotainment system fails? This
should be a non-critical component. If it is critical and the critical part
can't be isolated, then the system as a whole should have been tested more
extensively.

- Why does the infotainment system fail if a log message write fails, and why
was this failure mode not tested? It is very common knowledge that flash wears
out, and log messages aren't critical.

- Why were they not using automotive grade flash?

This, coupled with the recent story about the non-automotive-grade displays
failing, really demonstrates that maybe, just maybe, the rest of the
automotive industry isn't totally clueless. Maybe they're just more scared of
a failure.

Who knows - it's probably a mixture of both.

~~~
alanthedev
> - Why were they not using automotive grade flash?

Cool. I did not realize there were such products.

~~~
Unklejoe
Well, for a given chip, it usually just refers to a wider temperature range.

But there are other types of flash that are better suited for this
environment. I see parallel and SPI NOR flash a lot in ruggedized embedded
devices (usually with a minimum of 100,000 erase cycles), but they're
generally a lower capacity.

I don't know the exact cause of this failure, but there are ways to design
around unreliable flash too. They could partition it such that the firmware
image is in a read-only partition, and it runs out of a ramdisk, with only
user settings and log messages stored in a R/W partition. That may be what
they're doing, and perhaps the issue is due to the software crashing when the
write fails. I'm not sure.
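
To illustrate the "don't crash when the write fails" part, here is a minimal sketch, not anything taken from Tesla's firmware. The class name `BestEffortLog` and its parameters are made up for this example. The idea is simply that a failed log write is caught, the logger marks itself as degraded, and the most recent lines are kept in a bounded RAM buffer so the rest of the system keeps running.

```python
import collections


class BestEffortLog:
    """Append log lines to a file, falling back to RAM if the write fails."""

    def __init__(self, path, ram_lines=1000):
        self.path = path
        self.degraded = False  # set once the backing store has failed
        self.ram = collections.deque(maxlen=ram_lines)  # bounded fallback buffer

    def write(self, line):
        """Return True if the line reached storage, False otherwise."""
        self.ram.append(line)  # always keep the most recent lines in RAM
        if self.degraded:
            return False  # storage already known bad; do not retry
        try:
            with open(self.path, "a", encoding="utf-8") as f:
                f.write(line + "\n")
            return True
        except OSError:
            # Worn-out or read-only flash would surface as an I/O error here.
            # Logging is non-critical, so record the fault and carry on.
            self.degraded = True
            return False
```

A caller can check `degraded` to raise a "service required" warning instead of taking the whole unit down.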

~~~
mayniac
>I don't know the exact cause of this failure, but there are ways to design
around unreliable flash too. They could partition it such that the firmware
image is in a read-only partition, and it runs out of a ramdisk, with only
user settings and log messages stored in a R/W partition. That may be what
they're doing, and perhaps the issue is due to the software crashing when the
write fails. I'm not sure.

Not in this field so I might be wrong, but I'm guessing they could also have:

* Used redundant storage: fail over when the first flash chip dies, and log a fault telling the user to get it looked at. Flash is cheap anyway

* Used removable storage. PCIe, M.2, proprietary, whatever. If it dies then at least you can replace it relatively cheaply.
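
A rough sketch of the first idea, with everything in it hypothetical: `FakeFlash` is a stand-in for a chip that starts failing after a fixed number of writes, and `FailoverStore` moves to the next device on the first I/O error and sets a flag that a real system would turn into a service warning.

```python
class FakeFlash:
    """Hypothetical stand-in for a flash chip that dies after max_writes."""

    def __init__(self, max_writes):
        self.max_writes = max_writes
        self.writes = 0
        self.data = {}

    def write(self, key, value):
        if self.writes >= self.max_writes:
            raise OSError("flash worn out")
        self.writes += 1
        self.data[key] = value


class FailoverStore:
    """Write to the active device; on failure, switch to the next one."""

    def __init__(self, primary, secondary):
        self.devices = [primary, secondary]
        self.active = 0            # index of the device currently in use
        self.fault_logged = False  # stands in for a "get it looked at" warning

    def write(self, key, value):
        while self.active < len(self.devices):
            try:
                self.devices[self.active].write(key, value)
                return True
            except OSError:
                self.active += 1   # give up on this device for good
                self.fault_logged = True
        return False               # every device has failed
```

Note that this only fails over for new writes. Anything that has to survive the first chip dying (firmware, settings) would need to be mirrored to both devices from the start.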

I'm interested in what other problems will surface with Teslas in the next few
years. This seems like a huge design failure, I doubt it's the only one.

------
0xcde4c3db
It seems strange to me that data-logging flash wouldn't be a dedicated FRU
module, instead of being in the same chip(s) as the actual firmware. I see how
this could easily happen as a BOM-reducing or size-reducing measure for
something like a cloud thermostat, but for a _luxury car_?

~~~
danesparza
This twitter thread from last year might enlighten you:

[https://twitter.com/atomicthumbs/status/1032939617404645376](https://twitter.com/atomicthumbs/status/1032939617404645376)

~~~
microtherion
Discussed heavily on HN at the time:
[https://news.ycombinator.com/item?id=17835760](https://news.ycombinator.com/item?id=17835760)

------
Merrill
Which auto maker's electronics are likely to be reliable for the 25-year
nominal life of the vehicle (US fleet average age is 11.8 years and rising)?

And which will have spare parts available for that long?

I'm thinking about buying a new car, but I'm worried.

~~~
JohnJamesRambo
My family’s ownership outcomes strongly suggest getting a Toyota. I’m driving
my Dad’s old 2001 Tundra pickup with 350k miles on the same engine. I was
amazed when I bought it from him how every switch still worked, the original
AC still blew cold. This is not my normal experience buying used cars from
other manufacturers. Usually at about 150k miles they turn into a clown car of
various systems failing. Someone put thought into each component on the truck
and how it would age.

~~~
kipchak
Generally speaking, Toyota is conservative in terms of rolling out new
technologies and sticks with what's proven. For example the base Camry has a
2.5L engine and 8-speed auto transmission and the base Accord a 1.5 turbo and
10-speed auto. Toyota also goes more in depth in terms of QA and testing on
individual parts than others.

From an article on the Toyota - BMW Supra/M4 collaboration,

"...BMW couldn’t believe how extensive some of our quality and efficiency
studies were as parts came into shape one by one. We would take every bit down
to a fastener or rivet, and put it through our stringent quality control and a
dozen other testing, we’d ship thousands of parts back to Japan for analysis.
That is normal to us."

[http://club4ag.com/chief-engineer-tetsuya-tada-reveals-the-a...](http://club4ag.com/chief-engineer-tetsuya-tada-reveals-the-a90-supra-through-a-viewpoint-of-joint-development-with-bmw/)

~~~
cardiffspaceman
That article seems to be more balanced, shall we say, in terms of what was
normal for which team. Toyota's engineering seems to be closer to production
while BMW's seems to be closer to design.

_I almost started to think if they had an infinite budget funding to the task of design._

And it seemed like the opinion from Toyota was that Toyota has a body style in
mind at first but BMW lets the body style be influenced by its contents.

_BMW's fundamental difference in approach was that they wanted to design a
package, and from there they would naturally evolve a shape and size of the
body from that packaging, a functionally oriented goal. ... Our company
(Toyota) with my tenure and experience, the focus was always design elements
being the priority. We would first spend a lot of time on the shape and appeal
of the car from visual perspective ..._

But let the article speak for itself. I kept my quotes short.

~~~
rasz
It might have changed directions recently. The 2020 BMW 2 Series is a series
of cut&paste design cues from the 2019 Toyota Supra.

------
bookofjoe
>Because a Tesla is highly dependent on its electronics, once the flash memory
in a Tesla’s infotainment unit goes bad, it essentially bricks the entire car.
Yikes!

~~~
slg
I really dislike how this issue is being presented in the headline and in
quotes like this, because it is fundamentally no different than a critical
part going bad in an ICE vehicle. The Tesla isn't "bricked" when this flash
memory goes bad any more than a Toyota is "bricked" when its transmission dies.
You fix the faulty part and get the car back on the road.

That said, if there is a systemic problem with this part failing faster than
it should, which certainly seems to be the case, Tesla should be pushed to
either recall the affected vehicles or waive the service costs for those
owners who are impacted. But that isn't any different than if there was a
design flaw in a critical part of an ICE car's manufacturing.

~~~
vorpalhex
In an ICE vehicle, something like a transmission is very expensive and takes
up amazing amounts of space, and so any kind of redundancy is very difficult.

But we're well aware that flash memory degrades, corrupts and gets a bit
screwy. Flash memory chips are also dirt cheap and quite small - couldn't
Tesla have baked in a bit more redundancy in this case?

In our transmission example, not every fault in the transmission is
catastrophic, but with electronics, a single bad joint or bridged trace can
take the whole thing down.

~~~
slg
I already implied that this is a design flaw and stated clearly that Tesla
should be responsible for fixing this (and it appears they did things to
mitigate if not eliminate this flaw when they redesigned the MCU).

I am not defending Tesla here. I am simply pointing out that this is being
covered in a fundamentally different way than other cars. My last car was
recalled twice for electrical wiring issues. One of the problems could have
resulted in a fire and the other one would have resulted in the exact same
"bricking" that you see here. Neither of those issues would be covered or
dissected like this Tesla issue.

------
vijaybritto
A failure in the infotainment will brick the entire car?! I thought they would
be made up of individual systems talking to each other in messages. Tesla
needs Erlang!

~~~
takumo
This is the case for many, if not most, modern/high-tech cars, especially EVs.

The on-board systems have become very tightly coupled, if not monolithic, to
the point where a broken info unit will prevent the car from starting.

I had the infotainment system replaced in my e-Golf a few weeks ago, it took
two technicians three days to install and configure to work with all the car's
on-board systems.

Edit: If I _had_ to guess, I’d say this is because the car industry
(especially electric) has been pushing to innovate features so rapidly that
the time hasn't been available to engineer, test and prove decoupled systems.

~~~
vijaybritto
Why won't they think about these things? Do they really not care, or is it
that we don't understand something very important in making cars?! I want to
know!

~~~
lazyguy2
It reduces cost and makes it easier to give a nice user experience when people
are test driving. It makes it easier to manage all the different aspects of
the car from a single interface.

The primary reason the car is made, in the first place, is to be sold at a
profitable price. That is priority #1.

------
michaelt
_> The Tesla firmware wasn't very big at the beginning, [...] the firmware has
grown, leaving very little room for the logging to take place. That means
individual sectors suffer from lots of data being written in a short amount of
time, further accelerating the wear._

My understanding was modern flash memory wear levelling would swap regularly-
written and seldom-written sectors around, allowing for even wear across the
disk even if it was 95% full all the time. Is it established that Tesla
doesn't have this feature?
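
For a sense of scale, here is a back-of-the-envelope estimate. It assumes ideal wear levelling, so every block wears at the same rate, and it folds the "nearly full" effect into a write amplification factor, since a controller with little free space has to move more data around per host write. All the numbers below are hypothetical, chosen only to show the arithmetic, and are not measurements from any car.

```python
def flash_lifetime_years(capacity_gb, pe_cycles, host_gb_per_day, write_amplification):
    """Estimate years until the rated erase budget is used up.

    Assumes ideal wear levelling (every block wears at the same rate).
    """
    total_endurance_gb = capacity_gb * pe_cycles              # total GB the cells can absorb
    flash_gb_per_day = host_gb_per_day * write_amplification  # what actually hits the cells
    return total_endurance_gb / flash_gb_per_day / 365.0


# Hypothetical numbers for illustration only.
roomy = flash_lifetime_years(8, 3000, 5, write_amplification=1.5)
nearly_full = flash_lifetime_years(8, 3000, 5, write_amplification=4.0)
print(f"{roomy:.1f} years vs {nearly_full:.1f} years")
```

So even with wear levelling working as intended, the lifetime scales inversely with how much is logged per day and with the write amplification, which is why heavy logging on a small, mostly full chip can still wear it out within a car's lifetime.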

~~~
codeulike
It does, but the article says that the car logs a lot of data, so all sectors
wear out anyway. See here:
[https://insideevs.com/news/376037/tesla-mcu-emmc-memory-issu...](https://insideevs.com/news/376037/tesla-mcu-emmc-memory-issue/)

------
omgwtfbyobbq
In case anyone's interested, here's a TMC thread with good discussion on it.

[https://teslamotorsclub.com/tmc/threads/preventive-emmc-repl...](https://teslamotorsclub.com/tmc/threads/preventive-emmc-replacement-on-mcu1.152489/)

And a doc the OP put together on replacement.

[https://docs.google.com/document/d/1ZH8oP4AgdVxmCN0saKA1Dxc2...](https://docs.google.com/document/d/1ZH8oP4AgdVxmCN0saKA1Dxc22DUtA-v6apqyZ5gbblE/edit)

------
gowld
> Today, it looks like Tesla will fix the problem if you're within the
> warranty, and outside-of-warranty repairs can cost $1,800 to $3,000

Any other car manufacturer would issue a recall notice and repair it for free,
regardless of warranty status.

I smell a class action lawsuit -- Tesla installed software that invisibly
destroys user-inaccessible hardware throughout the entire warranty period, so
that it fails afterward, and then charged people to fix it.

~~~
dawnerd
I doubt any car company would recall this. There are plenty of production cars
with flaws that completely destroy the engine that were never recalled, like
the Northstar engine Cadillac used.

~~~
happycube
The Cadillac "High Tech" V8 used in the early 80's was even worse for that...

------
joezydeco
Previous discussion:

[https://news.ycombinator.com/item?id=19912065](https://news.ycombinator.com/item?id=19912065)

------
evancox100
This article is unreadable due to the ads

~~~
crankylinuxuser
Adblock on the web is equivalent to antivirus on Windows.

If you're not running it, you absolutely should be, precisely because it is a
security issue.

~~~
brokensegue
Serious question: do people still run antivirus on Windows other than MSFT's
free one?

~~~
ryanmercer
I don't. It's all bloatware these days and always trying to upsell you some
other service.

------
_ph_
Which is why any flash drive that is regularly written to should be
exchangeable. Be it a car or a laptop. Putting things on SD or M.2 seems like
a much more robust solution.

~~~
rasz
Apple thinks otherwise.

~~~
_ph_
I wrote my post because I disagree with Apple on this point. And any other
company building their laptops in this fashion.

------
mjevans
As a suggestion for a general re-design to solve this issue...

Cluster the consumable components together. Flash and battery. Attach the
storage via iSCSI or something similar, have the car's main computer perform
encryption/decryption (rather than trusting a component expected to be
swapped).

In the case of a battery swap the bulk contents of the (encrypted) external
storage can also be cloned. That can probably happen in the timespan that a
battery is topped up to charge, but would be more difficult for automated
swaps at refueling points. A home/cloud backup and transferring only the new
data might be sufficient.

------
flash_zombie
> Today, it looks like Tesla will fix the problem if you're within the
> warranty, and outside-of-warranty repairs can cost $1,800 to $3,000,
> depending on your location. Tesla’s method is to replace the entire MCU.

This is ridiculous. Can we all agree that this is ridiculous? Because that's
what it is. If we can at least all agree that this is indeed ridiculous,
_perhaps_ the collapse of modern civilization can be avoided.

~~~
uniformlyrandom
Which part? The price? Yes, it is ridiculous.

The method of replacing entire MCU? Maybe not, flash is likely not the only
component in there that has an age limit.

~~~
flash_zombie
Replacing the entire MCU. The price for that isn't necessarily ridiculous.

> Maybe not, flash is likely not the only component in there that has an age
> limit.

The MCU itself should outlast the lifetime of the car in the vast majority of
the cases.

------
LeonM
> With electric vehicles now on the market in masses and coming of age, we’re
> starting to see the real culprits of aging

The issue with the flash memory resides in the MCU (_media_ control unit),
which has nothing to do with the electric drivetrain. Cars with combustion
engines can have this issue too.

> once the flash memory in a Tesla’s infotainment unit goes bad, it
> essentially bricks the entire car.

Essentially, yes. But as far as I am aware (correct me if I'm wrong) a Tesla
with a bricked MCU will still drive.

I am annoyed that articles like this make it sound as though these types of
faults only happen to electric cars, which is not the case. It might be a Tesla
issue, as they are both ambitious and inexperienced in car component design,
but it could have happened with other brands as well.

> Fortunately, Jason Hughes, known as a ‘Tesla hacker,’ can service these
> units at a much lower cost.

Yes, many people can, there is no secret technique to it. Just like a local
garage can fix any part with any brand of car for much cheaper than the
dealership. That is because dealerships have to replace parts, rather than
repair them. There are many good reasons why they do this, I won't go into
details here.

Sure, a conventional car mechanic probably can't do it, but anyone with
electronic repair experience can. That is a shift that is happening with all
car brands. As cars are shifting towards computers on wheels, we must learn to
distinguish mechanical repairs from electronic repairs.

~~~
conception
> It might be a Tesla issue, as they are both ambitious and inexperienced in
> car component design, but it could have happened with other brands as well.

For me this is a trend Tesla has been showing (e.g.
[https://www.thedrive.com/tech/27989/teslas-screen-saga-shows...](https://www.thedrive.com/tech/27989/teslas-screen-saga-shows-why-automotive-grade-matters)).
They are trying to use off-the-shelf parts but aren't designing for what
their cars will actually be experiencing. It -could- happen to other brands,
but it is happening with Teslas. This, combined with their reported code
practices
([https://twitter.com/atomicthumbs/status/1032939617404645376](https://twitter.com/atomicthumbs/status/1032939617404645376)),
generally puts me in the camp of recommending people lease, don't buy, and
definitely don't buy used. These first/second gen Teslas are going to be trash
in a few more years.

