
Issue 62938 – Barometer driver hangs and kills accelerometer on its way - cryptoz
https://code.google.com/p/android/issues/detail?id=62938
======
simias
Aaaah I2C, one of the simplest protocols there is and also probably the one
that gave me the most debugging time.

If someone wants to look into it and is willing to tinker with the guts of the
phone, the first thing is to see the state of the bus when the devices stop
responding. I2C's "idle" level is high thanks to pull-ups; the devices only
ever drive the bus low, through open-drain outputs.
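
A rough way to read what the scope shows while the bus should be idle, as a sketch only: `classify_bus` is a made-up helper, and the diagnoses are common rules of thumb rather than certainties.

```python
def classify_bus(scl_high: bool, sda_high: bool) -> str:
    """Interpret the levels seen on an I2C bus that is supposed to be idle.

    Hypothetical helper: the returned hints are typical causes, not proof.
    """
    if scl_high and sda_high:
        # Nobody is pulling a line down, so the pull-ups hold both lines high.
        return "idle: both lines released"
    if scl_high and not sda_high:
        # Often a slave that lost count of clock pulses in the middle of a byte.
        return "SDA held low: a device is probably stuck mid-transfer"
    if not scl_high and sda_high:
        # Clock stretching that never ends, or a master that stopped mid-clock.
        return "SCL held low: endless clock stretch or stuck master"
    # Both low can also mean a short or an unpowered pull-up rail.
    return "both lines low: stuck device, short, or missing pull-up supply"


if __name__ == "__main__":
    print(classify_bus(scl_high=True, sda_high=False))
```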

Sometimes a device's state machine will go fubar (either because of a hardware
bug or programming error) and will lock the bus down, basically making it
impossible for anybody else to use it.

If this happens, the next step is to disconnect/reset all other devices on the
bus to find out which one is screwing up (that's the difficulty with I2C:
since the two lines can be driven by any master or slave, you can never really
know who's doing what when things go wrong).

Another thing to look for is the level of the line. Since there are many
devices and pull-ups on the wire, it's not uncommon to have messed-up levels (0
is really 0.2 or 1 is really 0.8, if the pull-up is too strong or too weak
respectively). Depending on temperature and other factors, that can lead
certain interfaces to sample bad values.
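
To put rough numbers on "bad values": the I2C-bus specification defines the input thresholds as 0.3 × VDD (low) and 0.7 × VDD (high). A minimal sketch of checking a scope reading against them follows. `logic_level` is a hypothetical helper, and a given chip's datasheet may specify different thresholds, so treat the defaults as assumptions.

```python
def logic_level(v_measured: float, vdd: float,
                vil_frac: float = 0.3, vih_frac: float = 0.7) -> str:
    """Classify a measured line voltage against VDD-relative thresholds.

    Defaults follow the 0.3 * VDD / 0.7 * VDD thresholds of the I2C-bus
    specification; override them with the values from the chip's datasheet.
    """
    if v_measured <= vil_frac * vdd:
        return "low"
    if v_measured >= vih_frac * vdd:
        return "high"
    # In between, different receivers may disagree about what they sampled.
    return "undefined"


if __name__ == "__main__":
    # Readings in volts on a bus assumed to run at 1.8 V.
    for volts in (0.2, 0.9, 1.7):
        print(volts, logic_level(volts, vdd=1.8))
```

A reading that lands in the "undefined" band is exactly the case where one device may see a 0 and another a 1.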

And then well... you have to capture the transaction that causes the lock up
and try to understand what goes wrong...

As a quick fix it might just be possible to force a reset of the bogus chip
when a lockup is detected, that would prevent having to restart everything.
There's usually a GPIO for that (if they wired it...).
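
The quick fix could look roughly like this. It is only a sketch: `probe` and `pulse_reset` are hypothetical callables standing in for "does the chip still answer on the bus" and "toggle its reset GPIO", and neither corresponds to a real driver API.

```python
from typing import Callable


def recover_if_locked(probe: Callable[[], bool],
                      pulse_reset: Callable[[], None],
                      max_attempts: int = 3) -> bool:
    """Reset a chip until it answers again, giving up after max_attempts.

    probe:       hypothetical check, returns True when the chip responds.
    pulse_reset: hypothetical action, pulses the chip's reset GPIO.
    Returns True if the chip responds at the end, False otherwise.
    """
    for _ in range(max_attempts):
        if probe():
            return True
        pulse_reset()
    # One last check after the final reset pulse.
    return probe()
```

Capping the number of attempts matters: if the reset line isn't actually wired, retrying forever just replaces one hang with another.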

I hope you have a good oscilloscope!

Quite frankly I can empathize with the dev not wanting to look into this bug,
by the looks of it that's the kind of minor bug that'll take several days to
track down and fix.

~~~
mik3y
Very good summary, though let me throw in some doubt:

    
    
> it might just be possible to force a reset of the bogus chip when a lockup
> is detected [..] There's usually a GPIO for that (if they wired it...).
    

Yes, _if_ they wired it. In my experience, actual designs that follow this
best practice are frustratingly rare. It's as if the hardware designers think,
"Oh, it's just I2C, what could go wrong?"

If they didn't do it, the chips are probably only connected to a master board
level reset, and you're basically screwed.

To find out whether that's the case on the affected devices, absent schematics
or a scope, I'd grep around the kernel sources and look for a definition of
such a pin. (If sources aren't available, try symbols.)
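
A rough version of that grep, as a sketch: the regular expression is a guess at common naming (`reset-gpios`, `gpio_reset` and so on), not something taken from the actual device tree or driver, so expect both misses and false hits. `find_reset_pin_hints` is a hypothetical helper name.

```python
import os
import re

# Assumed naming patterns for a reset pin; real trees may use other names.
RESET_HINT = re.compile(r"reset[-_]?gpios?|gpio[-_]?reset", re.IGNORECASE)


def find_reset_pin_hints(root, extensions=(".c", ".h", ".dts", ".dtsi")):
    """Walk a source tree and collect lines that look like reset-pin definitions.

    Returns a list of (path, line_number, stripped_line) tuples.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps odd encodings from aborting the scan.
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if RESET_HINT.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys
    for path, lineno, text in find_reset_pin_hints(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{path}:{lineno}: {text}")
```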

~~~
joezydeco
You're being _way_ too kind to these chips.

Look at the datasheets for the BMP280 barometer chip, the AK8963 compass, and
the MPU6500 accelerometer (close enough):

http://datasheet.octopart.com/BMP280-Bosch-datasheet-13691204.pdf
http://www.akm.com/akm/en/file/datasheet/AK8963.pdf
http://dlnmh9ip6v2uc.cloudfront.net/datasheets/Components/General%20IC/PS-MPU-6000A.pdf

The compass (AK8963) has a reset line, the other two have no reset line at
all. Your best bet is to drop VCC, but what are the chances the hardware guy
just tied them to the power bus and left it at that?

~~~
kosma
> The compass (AK8963) has a reset line, the other two have no reset line at
> all. Your best bet is to drop VCC, but what are the chances the hardware guy
> just tied them to the power bus and left it at that?

That is, if you also have even more circuitry to disable the bus pull-ups;
otherwise they will continue to power the device via clamping diodes,
potentially causing further confusion. The hardware guys don't implement I2C
slave power control for a reason: it gets _bloody expensive_ real fast, both
in terms of BOM cost and PCB real estate.

Having said that, the lack of a RESET line on many I2C slaves is just
ridiculous. I have a sticker on my monitor that says "fix the hanging MMA8451Q
bug"; it's been there for at least three months. The very thought of debugging
I2C transactions makes me stop even trying.

~~~
joezydeco
Good point about the pull-ups. Forgot about that.

------
bichiliad
Pedantic, but: can we replace the title with something like "Bug in AOSP
breaks barometer, accelerometer usage."

~~~
tsuraan
No worries, it won't be long before a helpful mod comes along to change the
title to "Issue 62938: Barometer driver hangs and kills accelerometer on its
way." Enjoy the context we have until that happens.

~~~
hnha
What context? Right now the title is empty clickbait: "This bug in AOSP is
impeding science. How to best get Google's attention?"

~~~
tsuraan
Context being what it affects and why anybody should care. The current
title accurately states what is affected (AOSP), and does a less great job of
saying who should care (all scientists and probably the entirety of humanity,
apparently). The "correct" title gives no indication of what is affected, and
little motivation for caring. The title probably will change (they seem to do
that, anyhow), but it will probably change for the worse. This is an argument
that's been going for years, and it's not going anywhere, but I'm still going
to gripe about it from time to time.

------
iam
It would be a lot easier to get their attention if there was an actual 'adb
bugreport' attached to the bug.

There's very little context on what the bug is and how to reproduce it besides
some vague references to PressureNET (which means very little to someone who
hasn't used that before).

------
btian
How do you know it's a bug in AOSP? It could well be a driver issue.

~~~
pyre
It affects the Nexus 5 (and to a lesser extent the Nexus 4); shouldn't that be
stock Android?

~~~
ajross
The Android Open Source Project refers to the components released by Google as
open source. This is the framework for the most part, plus a few Java apps.
The drivers are part of the per-device "BSP" layer, and not part of AOSP.
They're sometimes delivered as source (certainly the kernel components are),
but often not (the userspace HAL libraries are almost never source-visible).

This bug looks to be between the kernel driver and sensor HAL to me. It might
be fixable in code we can see, but none of that is part of "AOSP".

------
sucramb
This is a hardware issue that only affects some of the phones. I did a factory
reset and installed no additional apps, and the phone still stopped
auto-rotating every day, needing a reboot. I RMA'd the phone, and the new one
does not have the accelerometer problem anymore. However, it has the focus
problem in low-light conditions, which the first one didn't have. Not sure I
will buy another LG phone in the future.

------
chinpokomon
Ah, I'm glad I stumbled upon this. I've seen this exact problem on my device
and didn't know what was triggering it. I've been running a widget to track
barometric changes, specifically SyPressure, and since I installed it, at
least 3 times I've seen the orientation sensor stop working. I hadn't made the
correlation since I was in other applications when it would fail.

------
lifeisstillgood
It's probably embarrassing to admit but I read the title and wondered if
Barometer was an Uber competitor and some driver had gone postal. It did not
quite fit the meaning but it was the best I could do till I read the bug.

But the whole embedded market seems like this now: everything just the wrong
side of reliable, and abstractions an impossibility. Maybe I am just too new
to the area.

------
VLM
Look on the bright side, my first interpretation of the subject line was
"solder in a new chip" not "This can only be fixed via reboot."

------
taopao
Please, won't anybody think of the science?!

------
pawn
Promise them cake. That worked for GLaDOS...

