
Continuous low-power music recognition - stablemap
https://research.google.com/pubs/pub46522.html
======
awalton
...this doesn't seem like it is at all worth giving up 1% of battery life. I
don't think it's even worth giving away 0.25% of battery life to detect
whatever 70k songs are stored in the database, nor is it worth the space
the database is taking up on the device. The question of "what song is this"
comes up maybe once a month at best, and apps like Shazam already exist and
have much deeper databases to search through. In other words, it does a worse
job than existing solutions and uses power constantly.

It feels more like yet another solution looking for a problem. Worse, it
_screams_ like a foot in the door for telling users it's okay for microphones
around you to always be listening, à la Amazon Echo. It also weakly smells like
it'll immediately be used to send packets of what songs were listened to and
similar frequency info off-device to be collected by the Google Big Data
machine to be sold to RIAA members, as yet another way of extracting ad
dollars from Android.

~~~
sly010
I would argue that it's not worth my 30 seconds to unlock my phone and open
Shazam (which btw turns on the display and the radio, which then triggers
other apps to poll for notifications, so there goes your 1%)

I would also bet the database is probably not much bigger than the Shazam app
itself.

I frankly just find it a well balanced solution both in terms of UX and in
terms of engineering.

~~~
awalton
> it's not worth my 30 seconds to unlock my phone

It takes me ~4 "seconds" (saying "one-one thousand" aloud) to do it when
typing in a password (sorry, don't have a stopwatch handy). Just did it 10
times just to test. It'd be even faster with pin/fingerprint/face unlocking,
or with no locking at all.

I also used Shazam and Google's "what song is this" feature for somewhere less
than 5 minutes on the initial few seconds of 10 songs and my battery indicator
didn't move a single percentage during my testing. Shazam scored 10/10, Google
missed one (a song from an esoteric Chicago band called Terminal Bliss - all
of the songs I chose were intentionally fairly obscure to try to find missing
info in the databases).

But I can say with fair assurance that Google's not shipping their entire
database on phones, just whatever's most popular up to some (probably
size-driven) limit; any given CD ships with about ten songs, and any given year
sees tens of thousands of CD releases (about 75,000 albums in 2010 alone).
Worse, the entire feature's use case is better suited to _less popular music_,
since you're less likely to know the song's name if it's new or esoteric - so
you can almost immediately guess that those 70k songs in that database are just
~30 years of various genres' chart-topping hits.
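
As a rough back-of-the-envelope check on that estimate, here is a minimal sketch that uses only the figures quoted in this comment (about ten songs per CD, about 75,000 albums in 2010, a 70k-song database, a ~30-year span). The variable names are made up for illustration, and the per-album figure is the commenter's approximation, not a measured value.

```python
# Back-of-the-envelope check using only the figures quoted in this comment.
SONGS_PER_ALBUM = 10        # "about ten songs" per CD (the commenter's approximation)
ALBUMS_2010 = 75_000        # "about 75,000 albums in 2010 alone"
ON_DEVICE_SONGS = 70_000    # size of the on-device database discussed in the thread
YEARS_COVERED = 30          # the "~30 years" guessed in the comment

# Approximate number of songs released in a single year.
songs_released_2010 = SONGS_PER_ALBUM * ALBUMS_2010

# How much of one year's output the whole database could hold.
share_of_one_year = ON_DEVICE_SONGS / songs_released_2010

# If the database is spread evenly over ~30 years, songs available per year.
songs_per_year = ON_DEVICE_SONGS // YEARS_COVERED

print(f"Songs released in 2010 (approx.): {songs_released_2010:,}")
print(f"Database as a share of that one year: {share_of_one_year:.1%}")
print(f"Songs per year if spread over {YEARS_COVERED} years: {songs_per_year:,}")
```

Under these assumptions the whole database would hold under a tenth of a single year's releases, which is consistent with the guess that it is limited to popular tracks.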

All you've done is cement my convictions that this is absolutely a solution in
search of a problem.

~~~
Nemi
> It takes me ~4 "seconds" (saying "one-one thousand" aloud) to do it when
> typing in a password (sorry, don't have a stopwatch handy). Just did it 10
> times just to test. It'd be even faster with pin/fingerprint/face unlocking,
> or with no locking at all.

Ah, but did you test this in your quiet office or home, sitting at your desk
with both hands free?

Let's try a real-world environment. Something like, say, driving in your car,
in moderate traffic, with both hands on the wheel while listening to your
radio. That ~4 seconds becomes a barrier-to-entry just large enough to abandon
the idea of finding out what the name is to that song you are listening to.

Because, let's be honest, wondering what the song is isn't that important. But
it is nice. Which is why this feature has some proponents.

------
xr4ti
> Now Playing, has a daily battery usage of less than 1% on average, respects
> user privacy by running entirely on-device and can passively recognize a
> wide range of music.

Maybe I am missing something, but this seems to be much less respectful of my
privacy than an app that only listens to the ambient sound around me when I
explicitly give my consent (by opening/activating the app).

~~~
Ajedi32
What's your actual privacy concern?

If the whole process happens entirely on-device, then this reveals absolutely
no information of any kind to anyone, correct?

~~~
pavel_lishin
> If the whole process happens entirely on-device

Sure, if. Which I can verify, but I don't want to have to. I don't want to
have to verify that a corporation isn't fucking me at every turn.

And it's possible that a vulnerability will be discovered that lets people
stream the audio somewhere, or listen to key words and send just those
portions. I don't know enough about Android to know whether it's more
plausible for that to happen, or if that sort of vulnerability would likely
grant an attacker access to my mic even without this feature.

I don't think it's unreasonable for me to expect my device to not always be
listening to me.

~~~
Ajedi32
> Sure, if. Which I can verify, but I don't want to have to.

Wouldn't that be a risk even if this feature didn't exist? How do you know
your phone isn't currently listening for music/speech and sending that data to
the cloud without your knowledge?

> And it's possible that a vulnerability will be discovered

Again though, that's already true regardless of whether this feature exists or
not. Is there any reason to believe this feature is more likely to have a
vulnerability than any other feature on your device? Why would you trust this
code less than, for example, the code for your WiFi driver?

------
manol74
Great work, but super scary for me. The "service" of music recognition will
probably be extended very quickly to a constantly running "environment
recognition", just fingerprinting the audio around you, places, speakers, etc.

~~~
obelos
Yes, this has a “proof of concept” feel to it.

------
ufmace
I have the Pixel 2 with this feature, and it works remarkably well. No
noticeable power drain, and it identifies a lot of songs that I never would
have thought to ask about, like TV show intros and such. It's handy that if I
ever wonder what a song that's playing is, I can pull out my phone and it's
usually already there instead of having to unlock it, ask it what's playing,
and wait a bit, if the internet is even working there. Pretty cool that they
can pull this off with no network communication.

------
spdustin
Seems to me that it’s another part of a fingerprint for a user that can locate
them, and possibly place them with others in the vicinity who have ID’d the
same audio. Retail outlets have playlists that could be ID’d, possibly with
enough granularity to indicate your path through a shopping mall or your dwell
time in a particular department.

I know the recognition is done on-device for some percentage of tracks, but
it’s been unclear to me what happens to the running tally of
times/locations/tracks identified afterward.

~~~
itp
It's not sent anywhere or even stored locally on the device. Some people view
that as a missing feature, actually, so there are apps that add this
functionality[0], so you can review the songs you heard and find it later.

[0] [http://www.androidpolice.com/2017/10/26/add-track-id-history...](http://www.androidpolice.com/2017/10/26/add-track-id-history-pixel-2-now-playing-feature-app/)

------
hugecannon
> author = {Beat Gfeller and ...

A case of nominative determinism?

[https://en.wikipedia.org/wiki/Nominative_determinism](https://en.wikipedia.org/wiki/Nominative_determinism)

~~~
piquadrat
Beat is a _very_ common name in Switzerland. I know about 10, none of them
work in music :)

~~~
StavrosK
Is it pronounced bay-at?

~~~
piquadrat
From
[https://en.wikipedia.org/wiki/Beat_(name)](https://en.wikipedia.org/wiki/Beat_\(name\)):

> pronounced "BEH-awe-t"

------
e12e
I wonder how well this performs. I see I have roughly 2000 songs on my SD
card. I find it a little hard to believe that 40x that is enough to cover a
big enough slice to be able to actually answer the question of "what's this"
when it comes up?

[ed: I actually think I'd have plenty of room for 70k songs in high quality
ogg vorbis vbr or equivalent - that'd frankly be more interesting... ]
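
For a sense of scale, the two numbers in this comment can be put side by side with a rough storage estimate. This is only a sketch: the average track length (4 minutes) and average Vorbis VBR bitrate (160 kbit/s) are assumptions picked for illustration, not figures from the thread or the paper, and the result shifts directly with them.

```python
# Rough sizing sketch. Every constant below except the two song counts is an
# assumption made for illustration, not a figure from the thread or the paper.
MY_SONGS = 2_000             # songs on the commenter's SD card
DATABASE_SONGS = 70_000      # on-device database size discussed in the thread
AVG_SONG_SECONDS = 4 * 60    # assumed average track length: 4 minutes
AVG_BITRATE_BPS = 160_000    # assumed average Vorbis VBR bitrate: 160 kbit/s

# How many times larger the database is than the personal collection.
ratio = DATABASE_SONGS / MY_SONGS

# Bytes per full-length song at the assumed bitrate (bits / 8).
bytes_per_song = AVG_SONG_SECONDS * AVG_BITRATE_BPS // 8

# Total size of 70k full songs, in decimal gigabytes.
total_gb = DATABASE_SONGS * bytes_per_song / 1e9

print(f"Database vs. personal collection: {ratio:.0f}x")
print(f"Approx. size per song: {bytes_per_song / 1e6:.1f} MB")
print(f"Approx. size of 70k full songs: {total_gb:.0f} GB")
```

Whether that counts as "plenty of room" depends on the card and on the bitrate actually chosen; halving the assumed bitrate halves the total.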

------
Omniusaspirer
Only being able to compare to 70k songs isn't really going to be adequate the
second you step off a mainstream playlist.

------
em3rgent0rdr
Will they use this tech for non-music purposes? Such as for enhancing Google
ads or for sending keywords to the NSA?

~~~
adrianN
Automatically identifying whether you're at an illegal public performance and
sending lawyers your way to extract the songs from your ears.

~~~
xr4ti
Well, change the word "music" to "conversation" and "song" to "person" (or
"phrase"), and things start to get pretty scary.

------
svantana
Using neural nets for this purpose makes it likely to be susceptible to
adversarial examples. Fairly harmless in this use case, but if a similar
system would be used for ContentID on YouTube, that could be exploited to
subvert it (i.e. spam it with false positives and contest the verdicts en
masse).

------
bhouston
Is this why Shazam sold? Its basic functionality was going to be built into
Android/Google Assistant?

~~~
jclardy
It already is available on both the iPhone and Android through Siri/Google
Assistant. Just ask them "What song is this?" This just seems to be an
extension of it so you don't need to ask, the phone can just display the album
info passively when you pick up the phone. Of course Apple's integration is
already using Shazam for the actual identification, but you don't need the app
installed.

Hard to compete with that if you aren't integrated at an OS level.

------
iforgotpassword
Is this Pixel 2 exclusive or can I install this on my Xiaomi? I skimmed the
paper but didn't find any name for that app.

------
qualitytime
This is the kind of HN post that drives me up the wall.

You say less power to get us to buy more?

How about saving all that energy and using it for real world building.

~~~
taneq
To be fair, they say:

> Since everything runs locally on the device without sending either audio or
> fingerprints to a server, the privacy of the user is respected and the whole
> system can run in airplane mode.

This is far more respectful of user privacy than we usually see from Google.
I, for one, am impressed.

~~~
Harvey-Specter
What does this have to do with the comment you replied to?

~~~
taneq
A complaint about marketing ("get us to buy more"), on an article about
always-on audio processing in a phone, implies a concern about privacy-
invading user tracking. I was pointing out that (on the face of it, at least)
this doesn't seem to do that.

