
Designing an audio adblocker - dest
https://www.adblockradio.com/blog/2018/11/15/designing-audio-ad-block-radio-podcast/
======
notafraudster
A response to some of the many arguments expressed in this thread:

"Then don't listen to it". Alternatively, I'll do what I do now, which is
listen to it but skip the ads and wait for software that can automatically
skip them to come around, which is the same thing I did for websites.

"That's selfish" Yes, just like the millions of people who use adblock
software on their browser. Of course adblock software came after web browsers
built in popup blockers in response to the X-10 Spycam being 50% of web
advertising circa 2001-2003, so really you could argue the web browsers are
selfish and we're all selfish if we don't turn off the popup blockers. Really,
even if a website says "By using this website, we're going to install malware
on your computer", you're selfish if you reject the malware. If selfish means
that I make decisions that make things better for me at the expense of others,
then yes, I am selfish when it comes to ads. I view ads as toxic and bad for
my health, and I take measures to avoid them.

"But it'll lead to the collapse of podcast publishing". Maybe. Of course
publishers repeatedly said adblocking would lead to the collapse of the web,
and as best as I can tell mostly idiots doing a pivot to Facebook Video caused
what collapse did occur (because, as selfish actors, they pursued a greater
revenue stream at the expense of quality and integrity -- oops, turns out
there's no money there because the numbers are all fake). Adblocking also
ushered in sustainable models for journalism, blogging, etc. Certainly all the
websites I read have found alternate revenue models just fine. Maybe that's
because the ones that didn't went under, but I apparently didn't notice.

If podcasting collapses and goes back to the dark ages of being a non-
commercial hobby medium like it was three years ago, like, okay?

"Stop comparing podcasts to websites, podcast ads aren't as bad" Maybe, but
you don't get to decide for me my threshold for ignoring or blocking ads. Ads
weren't as bad when I started using AdBlock Plus in 2004, but they were bad
enough for me to decide to use it. Others may have had their breaking point
later because they are less sensitive to ads. Some people might still browse
the web with ads on and claim they don't care. You do you.

The strange thing to me is people expressing that they'd rather block ads, but
they're too noble to do so, so they get mad at others for blocking ads? Uh,
okay?

~~~
Hoasi
> "That's selfish" Yes, just like the millions of people who use Adblock
> software on their browser.

I don't see this as selfish, not when it comes to online ads. I think this
"moral" argument is pushed by people who generally benefit from ads and have
less incentive to consider ethical issues raised by ad tech. From their
perspective, advertising is free money. Publishing has a cost and people's
work and effort should be compensated, all right. But you are not ripping
someone off just because they picked the wrong revenue model. You are just
protecting yourself by preventing your data being used by third parties
without your consent. And you should.

And besides, there are ethical ad options. Given the choice, I would support
publishers using these ads, in the absence of a better solution.

~~~
krelian
If there was a paywall and it was priced too high in your view, what would
you do? I assume you'd simply decide not to pay and move on. Online ads are
essentially an honor system: I'll let you consume my content but you'll need
to watch ads. Yet here you decide to circumvent the system instead of just
walking away. Why?

~~~
lelandbatey
The idea that there's an "honor system" always surprises me. Where is it
stated that the cost of the content is watching the ads? That's not how ads
work: nearly universally, what I'm told is "HEY, come get this great free
content!" But when I reach for the content it's snatched away and the "but
first a word from our sponsors" diatribe begins.

You can't have it both ways, either the content is free and I can consume it
on my terms, or inform me of what's expected for this exchange and I'll make a
decision about whether I want to pay the cost or not.

~~~
krelian
>Where is it stated that the cost of the content is watching the ads?

Does it need to be stated? When I leave my bike outside where is it stated
that it's mine and you are not allowed to take a ride? Where is it stated that
you shouldn't have loud conversations in the theater?

You know full well that ads pay for the content but choose to play the
ignorant. You're doing this simply because you can and there are no
consequences and, as of now, those that do watch the ads are financing the
service you consume.

------
nimbius
It's worth investigating the increased use of subaudible tones to demarcate
commercials. These subaudibles are originally intended to be captured by your
phones and tablets in order to allow the company to track you across multiple
devices through their microphones. The tones are also used to detect
commercial ad presence for analytics companies or suites used by advertisers.

[https://en.wikipedia.org/wiki/Subaudible_tone](https://en.wikipedia.org/wiki/Subaudible_tone)

One could use these tones to 'detect' a commercial and avoid it, or strip out
these tones entirely to promote operational security amongst devices.

~~~
cwkoss
>One could use these tones to 'detect' a commercial and avoid it, or strip out
these tones entirely to promote operational security amongst devices.

Or one could layer superfluous erroneous tones into every stream to poison the
dataset.

~~~
dest
Good point. Note I have designed Adblock Radio to rely on audio only, so that
it gives the smallest possible margin for radio operators to circumvent it.

I do not rely on metadata, nor do I use the usual time schedules for ad
breaks.

~~~
shkkmo
That's great, but can't you use the subaudible tones as an additional blocking
metric? Or even as an assessment metric to flag potential false negatives?

~~~
dest
It may be possible, research is needed on that topic

------
brucemoose
This makes me wonder, what are other viable revenue sources for content
creators besides ads?

Personally I don't mind listening to a few ads on a podcast because I assume
it is helping the creator cover their living expenses so that they may
continue creating content that I enjoy. And spoken ads (hopefully) are not
tracking me, although I suppose there are ways to overlay inaudible tones that
other devices can pick up, etc.

~~~
packet_nerd
I find ads in general, and particularly audio ones, to be a super annoying
cognitive burden. I have to interrupt my train of thought and consciously
block the message they're trying to pipe into my brain.

I'm happy to pay for things if it's low friction and reasonably priced, but
will never accept being advertised to (in any form really, but especially
audio and video ads).

An audio adblocker seems like a great thing, and I'm excited about this. If
enough of us use it, hopefully we can force creators to abandon ads and find a
model where they work for us, not the ad companies.

I wish there was a way to block billboards too. Back when Google was making
Google Glass, I thought it'd be possible to block out billboard advertisements
someday. Too bad that didn't work. Or, just move to a state that bans them, I
guess.

~~~
wilsonnb3
> If enough of us use it, hopefully we can force creators to abandon ads and
> find a model where they work for us, not the ad companies.

Why not just stop listening to what they’re creating?

~~~
hopler
Because the content is desirable.

------
JamesBaxter
I probably wouldn't use this as I would feel bad about it but as a UK listener
I've listened to SO MANY adverts from American podcasts for products I can't
buy.

I pay for Stitcher Premium and export the feed to an app that's actually
useable.

I recently paid to support Thrilling Adventure Hour on Patreon and find it
pretty crappy they still have adverts on the paid feed.

~~~
Jonnax
I remember listening to a podcast (can't remember the name) but the adverts
were tailored to my location.

I think the MP3 gets generated based on location.

~~~
sethhochberg
A lot of audio ad broadcast software uses a combination of audible or
inaudible tones to act as "triggers" to mark a location where an ad can be
injected using some kind of location/time/etc aware data. The ad itself is
very rarely baked into the stream unless it is something you're listening to a
host read themselves.

In podcasts, the ad injection software will often pause the audio at the point
where the ad tone occurred while it plays the ad, and resume it afterwards for
continuity.

AdSwizz's AIS suite is one example of the tools commonly used for this:
[https://www.adswizz.com/ad-insertion-suite/](https://www.adswizz.com/ad-insertion-suite/)

(I work in the internet radio space, lots of old-school internet radio streams
are similar).

~~~
dexterdog
How does this work with a typical podcast player? The media download is
typically just an mp3. The podcast software maker would have to be in on the
scam otherwise it's an mp3 that plays from beginning to end.

~~~
Fogest
When your player requests the mp3 file in this example the server on the fly
can generate an mp3 file that your player then downloads. This means that they
can serve different ads in old podcast episodes or even remove them in old
episodes.

I heard about this tech on a podcast actually.

------
andjd
Panoply has developed software for podcasts that allows them to inject
targeted ads into downloads, which makes podcast advertising closer to the
nightmare that we see on the web.

However, that could offer another avenue for ad-blocking. If you download
multiple podcasts from different IPs, you can isolate the ads by finding the
parts of the audio that are different.

~~~
joshfriend
That's a great idea if the injected ads are exactly the same length and always
get put in the same place. If you have a 30 second ad in one download and 40s
in another, 10 seconds of legitimate podcast audio gets interpreted as an
advertisement.

~~~
jldugger
It works even if they don't, you just need a more intense algorithm to do
similarity matching across all offsets.
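
A minimal sketch of that kind of offset-tolerant matching, using Python's
standard `difflib.SequenceMatcher`. It assumes each download has already been
reduced to a sequence of per-frame fingerprints that are identical wherever
the underlying audio is identical; how to compute such fingerprints is left
open here, and `find_inserted_regions` and `min_match` are made-up names for
illustration.

```python
from difflib import SequenceMatcher


def find_inserted_regions(a, b, min_match=3):
    """Return (start, end) index ranges of `a` that have no counterpart in `b`.

    `a` and `b` are sequences of per-frame fingerprints (hypothetical: any
    hashable summary of a short audio frame). Matching runs shorter than
    `min_match` frames are treated as coincidental and ignored.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    regions = []
    cursor = 0  # first index of `a` not yet explained by a trusted match
    for block in matcher.get_matching_blocks():
        if 0 < block.size < min_match:
            continue  # too short to trust; leave it inside the unmatched gap
        if block.a > cursor:
            regions.append((cursor, block.a))
        cursor = max(cursor, block.a + block.size)
    return regions
```

Because the matcher aligns the shared content on both sides of the gap, a 30
second ad in one download and a 40 second ad in the other should not cause
legitimate audio to be flagged, as long as the fingerprints of the shared
content really are equal.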

~~~
dest
With maximally repeated sequences, you can
[https://ieeexplore.ieee.org/abstract/document/6012115](https://ieeexplore.ieee.org/abstract/document/6012115)

------
shittyadmin
I'd be really excited and willing to pay much more than the normal mobile app
price to have this for my podcasts.

Nothing pisses me off quite like the time waste that is podcast advertising. I
skip them, but I was dreaming of a system like this...

~~~
ucaetano
> Nothing pisses me off quite like the time waste that is podcast advertising.

Why would you be more willing to pay to an app developer who will shut down
the revenue stream for the content creator, than to listen to the ads and
therefore fund the content creator?

~~~
shittyadmin
Unlike with web advertising, you're still downloading the ad in this case,
there's no way for the creator or the advertiser to tell that you blocked it
after.

I've never bought anything from an ad, never intentionally clicked one or
visited a link from a podcast ad. There's truly no difference for me. Their
revenue stream will remain intact, the kind of people who do that will keep
doing that.

~~~
ucaetano
The revenue stream won't remain intact. This lowers the effectiveness of ads
in the long run, reducing the price of ads and reducing the revenue for
content producers.

There's no way to spin this around: this reduces revenue for the creator of
the free content you're consuming (and for all other creators).

Everyone loses, except for the app maker, that makes money (if they are
selling the app) by sucking it out of the content creators.

~~~
rebuilder
I'm going to spin it around: It reduces revenue from ads, thereby reducing
advertising. Now the question is, how much content do you think is both
something you truly want to listen to and something few people are willing to
directly support financially. Personally, I'm starting to think that maybe all
the crud advertising enables is just another downside to ads, not the benefit
in the cost-benefit analysis.

~~~
ucaetano
So you'd kill advertising and content for everyone else just because you're
personally willing to fork up money?

~~~
rebuilder
I don't have that power, and even if I did, I wouldn't ban advertising. That's
not what I'm suggesting at all. I'm saying I'm growing skeptical that the
content advertising finances is actually valuable to the consumer, and am
starting to think it may actually be harmful.

A lot of media is designed to be addictive, hooking the consumer in order to
create a captive audience for ad delivery. I'm no longer sure the ads
themselves, or even the tracking they often entail, are the worst thing about
the whole system. That's why I'm not too bothered about giving people the
tools to not see or hear ads if they so choose.

~~~
jeena
> that the content advertising finances is actually valuable to the consumer

They are not, advertisement never creates value for the society, it's a 0-sum
game. They never add value they only shift the stream of money from a producer
which makes a good product which would sell without ads to a producer which
has money to invest in ads to trick people to buy the worse product.

~~~
ucaetano
> advertisement never creates value for the society, it's a 0-sum game

You're plainly, absolutely, completely, wrong. But, as you made the claim,
I'll wait for your burden of proof.

> They never add value they only shift the stream of money from a producer
> which makes a good product which would sell without ads to a producer which
> has money to invest in ads to trick people to buy the worse product.

You just stated a very narrow case, as if it represented the whole.

Your argument is essentially "if you're selling well without ads, then ads
will not add value", which is self-evident, but irrelevant.

~~~
jeena
How am I wrong that ads are a 0-sum game for the society? Perhaps I'm
ignorant, but I really don't see where the value is for society. I understand
that it's not for the advertiser, but for society?

~~~
ucaetano
Advertising enables product discovery, price discovery, and so on. That should
be enough to answer your question.

Putting up a sign in front of your shop saying "New Mousetrap - Improved
Design" is advertising.

If you make a better mousetrap and nobody knows about it, your mousetrap is
irrelevant.

Sure, not all advertising generates benefits to society, just as not all
products do.

~~~
jeena
There is a difference though, if I want to know something then I search for
information, I go from one shop to another and do research to discover the
best product. Which is kind of a pull mechanism.

What advertising does is a push mechanism, where they try to flood my brain
with nonsense so that I get confused and don't buy the best product for me,
but instead buy something the advertiser sells.

Product and price discovery can be done by me doing research, so saying that
it's advertising enabling that is just dishonest.

~~~
ucaetano
Sure, and putting up a sign in front of your store is advertising.

Advertising is communication, you might not like it, but it is communication.
It is product and price discovery.

------
SmellyGeekBoy
I aggressively block ads and all other forms of tracking in my browser, but I
have to admit that I don't generally have a problem with podcast ads.
Listening to some podcaster blather on for a minute about some product or
service at least gives me the warm fuzzy feeling that they're earning
_something_ for their efforts.

More podcasters really do need to get on the premium ad-free model though. I
pay money to make ads go away on a few platforms (YouTube, Spotify etc) and
would gladly put my money where my mouth is and do the same for podcasts.
Websites, too, if anyone can come up with a decent way to make it work - AND
STOP TRACKING ME!

~~~
cortesoft
> Listening to some podcaster blather on for a minute about some product or
> service at least gives me the warm fuzzy feeling that they're earning
> something for their efforts.

You don't feel the same way about a website? What is different about podcasts?
Genuinely curious, because I have a similar initial reaction but am not
exactly sure why.

~~~
pavel_lishin
I'm not the person you're asking, but most websites offload the ads to a
different service on a different server. It's the difference between your
favorite podcast performer reading some copy, and a movie on TV fading out to
showcase the great deals at _Luke's Honda Dealership & Taco Shop Come On Down
For The Tastiest Deals In Town_.

It also strongly indicates that the podcast author has vetted the ads, and is
at least implicitly approving of the product; on the web, it's very likely
that the author has absolutely no idea what ads are being shown to their
visitors.

------
Animats
I built a device to do this many decades ago. An electronics magazine
published a schematic for a "voice/music discriminator". This was a simple
analog circuit that ran the audio through a low-pass filter and then detected
sharp cutoffs. Voice has those, but music without lyrics doesn't. It's not
perfect, but it cuts off ad blithering before it gets annoying.
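
For anyone who wants to play with the idea in software, here is a rough
digital analogue, not the original circuit: rectify the signal, smooth it with
a one-pole low-pass filter to get an envelope, and count how often that
envelope drops sharply from "loud" to "quiet". The function names and the
`high`, `low` and `alpha` values are assumptions that would need tuning on
real audio, and samples are assumed to be floats in [-1, 1].

```python
def envelope(samples, alpha=0.01):
    """Rectify the signal and smooth it with a one-pole low-pass filter."""
    level = 0.0
    out = []
    for s in samples:
        level += alpha * (abs(s) - level)
        out.append(level)
    return out


def count_cutoffs(samples, high=0.3, low=0.05, alpha=0.01):
    """Count how often the envelope falls from above `high` to below `low`."""
    drops = 0
    armed = False  # True once the envelope has been above `high`
    for level in envelope(samples, alpha):
        if level > high:
            armed = True
        elif level < low and armed:
            drops += 1
            armed = False
    return drops
```

A high count per minute would suggest speech with its pauses between words,
while a count near zero would suggest continuous music.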

(I should find that thing. It's in my garage somewhere.)

~~~
dest
I would be really curious. If some day you find it, please drop me an email!

------
qaute
This post is very well written! I like the colloquialism, exhaustive list of
experiments (both successes and failures, to better educate others), future
work and call to action, and a pervasive lack of fluff. In particular, this
reads like many contemporary scientific papers, but is much clearer (I can
understand everything with one pass - whoa!), albeit absent tabled data and
extreme technical details. I wish more scientists wrote like this.

~~~
dest
Thank you for the kind words

------
baldfat
Personally I want the same option I get from Google. I can see ads on YouTube
or I can pay for an ad-free YouTube. Whether a creator puts their own ads
inside the video is up to them. Personally I like when they put them at the
end, like Smarter Every Day does.

If there was an outlet where I could just pay for ad-free podcasts I would be
happy to do it. It also would pay far more than what ads pay. I still remember
that the story around the release of YouTube Red was how it was hurting
creators, but it was a cynical argument to claim that a paying customer would
bring in less than ads do.

------
cwkoss
Would be fun to record 'antivertisements' to replace the ads with. Ex. Instead
of an Uber ad, it's a 15-second spiel about a recent scandal they've been
involved in.

Punishing advertising dollars is the only way to reduce marketshare of the
industry in the long term.

~~~
dest
This concept reminds me of a nice YouTube video where the presenter suggests
associating brands with unpleasant situations in professional meetings.

It's in French, but there are subtitles.
[https://youtu.be/AqCB6tGR3hs?t=658](https://youtu.be/AqCB6tGR3hs?t=658)

------
aquabeagle
I want this for my TV in the form of an HDMI pass-through device that mutes
the audio and/or greatly dims the picture while commercials are being played.

~~~
dest
There is already software to skip TV ads. Not perfect, but good enough
apparently. For a short review see
[https://www.adblockradio.com/blog/2018/12/10/ad-blocking-tv/](https://www.adblockradio.com/blog/2018/12/10/ad-blocking-tv/)

You could set that up on a Raspberry Pi, to play IPTV.

A pure HDMI pass-through may be expensive because of the HDMI acquisition
side.

------
aitva
Great work! It is very interesting to see the different algorithms you've used
and the changes you were forced to make due to legal trouble.

------
dirtyid
Detecting speech and music is particularly intriguing for different variable
playback speed for speech bits and 1x for music bits.

~~~
dest
I do not understand your sentence :/

~~~
kevinh
I think that they're saying it would be interesting to use music detection to
have the podcast playback switch from an accelerated playback speed during
speech to normal speed when music is detected.
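
As a small sketch of what that could look like once segments are labeled: the
label names and the 1.5x default are assumptions for illustration, not
anything Adblock Radio is known to expose.

```python
def playback_rate(label, speech_rate=1.5):
    """Speed up talk, keep music at normal speed."""
    return 1.0 if label == "music" else speech_rate


def listening_time(segments, speech_rate=1.5):
    """Total wall-clock seconds for a list of (label, duration_seconds)."""
    return sum(
        duration / playback_rate(label, speech_rate)
        for label, duration in segments
    )
```

For example, five minutes of talk, three minutes of music and another two and
a half minutes of talk would take 480 seconds to play instead of 630.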

~~~
dest
OK, this makes sense, good idea

------
anhonestopinion
I like ads on podcasts: not invasive, support the shows and can't put malware
on my PC. Honestly I find an adblock for podcasts a total dick move (outside
of the intellectual exercise part of course).

~~~
CivBase
> Honestly I find an adblock for podcasts a total dick move

If you expect payment for a service, give your "customers" an opportunity to
support you directly. It can be a premium account, merchandise, or even a
simple donation service. Don't force your "customers" to become the product.

I don't see why podcasts should get a pass.

~~~
umvi
> If you expect payment for a service, give your "customers" an opportunity to
> support you directly.

...but if that option isn't available, that makes stealing ok? "Hmm, I like
this podcast but I don't like the ads. There's no premium support option.
Therefore... I will consume the content of the podcast without the ads."

I find this line of thinking a bit immoral, to be honest. The moral option is
to not listen to the podcast. And before you jump on me asking if it's immoral
to go to the bathroom during a TV ad break, I'm talking strictly about
algorithmic ad-blocking.

It would be like if I had a place set up where you can get a copy of one of my
indie video games I've been privately developing after watching an ad. My
place has no option to flat out pay for copies of my indie games. But... you
really like my indie games so you decide it's okay to just walk in and steal
copies of my indie games until I provide you with an option to pay for them.

~~~
yoavm
Not listening to the ads is so far from stealing, I am almost not sure if
you're serious. It's my right to decide whether to rewind a minute, to lower
the volume, to skip a whole boring episode, etc. It's also my right to process
the content with an app that modifies it, for my own use.

~~~
umvi
It's not the act of not listening to them that's stealing, it's the act of
algorithmically excising them that's stealing, _especially_ (but not
exclusively) if there is a premium option available.

For example, watching non-premium Crunchyroll with an ad-blocker enabled is
dishonest, IMO.

------
microdrum
This is fantastic. Can we use this as a proxy when we input RSS feeds into our
podcast apps? I use Overcast on iOS, for instance. I'd love to remove all
audio ads from Overcast, both dynamically inserted and normal "live reads,"
which are also fairly toxic.

~~~
dest
Dynamically inserted ads could be automatically removed. Live reads are more
difficult. It would require natural language processing, which is out of reach
for now (but help is welcome!)

Adblock Radio as a proxy is totally feasible (Pi-Hole like,
[https://pi-hole.net/](https://pi-hole.net/)). I am currently working on this.

------
choocroot
That's what I've been dreaming for ages!

I've been musing about a system/software that could listen on my PC's audio
and control/adjust/mute the volume when it detects ads.

Such a system could also work for video content played on a PC!

This system could also be implemented on a dongle-like Raspberry Pi that would
take the PC's audio output (3.5mm jack in/out), process the audio, and play
back the "clean" audio to its jack output: that could be seen as an audio
firewall.

I think that working with analog audio (instead of official radio's streams)
is the way to go to avoid legal issues: they cannot plug the "analog hole"!

------
jccalhoun
I would really like this if only because I listen to a radio station through
iHeartRadio and they run the same 3-4 commercials during breaks for weeks at a
time. Just give me some different ads and I might not want to skip them!

------
eyeareque
I will be very interested to see how well this does. This has been something I
wish existed for a long time. Great work!

I’d much more prefer to pay the podcaster directly than to listen to ads. But
this doesn’t seem to be an option.

------
celticninja
In the UK this is called a TV Licence, it funds the BBC radio stations (local
and national) that feature no adverts at all. I can't stand commercial radio
and have never listened to one through choice.

------
drewmol
Hey op, my 2¢: just mute them, everywhere. Duplicate the audio stream, detect
ads, use that to mute the volume of whatever is playing the ads.
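
A minimal sketch of the "detect, then mute" step, assuming some classifier
already emits one label per second. The streak thresholds are illustrative
guesses, and this is not presented as how any existing player does it.

```python
def mute_schedule(labels, enter=2, leave=2):
    """Turn per-second labels ("ad" or anything else) into mute flags.

    Muting starts after `enter` consecutive "ad" labels and stops after
    `leave` consecutive non-ad labels, so a single misprediction does not
    toggle the volume.
    """
    muted = False
    streak = 0  # consecutive labels that disagree with the current state
    flags = []
    for label in labels:
        is_ad = label == "ad"
        if is_ad != muted:
            streak += 1
            threshold = enter if is_ad else leave
            if streak >= threshold:
                muted = is_ad
                streak = 0
        else:
            streak = 0
        flags.append(muted)
    return flags
```

The price of this kind of smoothing is latency: with `enter=2` the first
second of every ad break still gets through.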

~~~
dest
It's what I do on adblockradio.com/player

I am currently working on Sonos / Google Home / Amazon Alexa

~~~
drewmol
> Sonos / Google Home / Amazon Alexa

Interesting, are you planning to mute Sonos/Home/Alexa by analyzing its audio
output stream? If so: do you plan to use the device's audio controls?

Are you planning on using the Sonos/Home/Alexa microphone for detection of
other sources?

~~~
dest
I plan to "man in the middle" the HTTP stream.

------
dexterdog
Why not have a service that maintains metadata for popular podcasts with the
timing of the commercial parts so players that know how to read it can auto-
skip? The players can charge for the option and the metadata can be supplied
by the content creators so they get a portion of the proceeds. If the content
providers don't want to supply the information it can be crowd-sourced.
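
On the player side this could be very little code. A sketch, assuming the
hypothetical metadata is just a list of `(start, end)` ad ranges in seconds
per episode:

```python
def next_position(position, ad_ranges):
    """If `position` (seconds) falls inside an ad range, jump to its end."""
    for start, end in sorted(ad_ranges):
        if start <= position < end:
            position = end  # may land in an adjacent range; keep scanning
    return position
```

A player would call this on every tick (or on seek) and jump whenever the
returned value differs from the current position.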

------
Espadrille
Seems interesting, I will try it. If it is working I may finally listen again
to radio in my car after having stopped it for many years.

------
drewmol
made a comment when Apple acquired Shazam a few months ago:
[https://news.ycombinator.com/item?id=18066724](https://news.ycombinator.com/item?id=18066724)

>...Unfortunately all product paths I can foresee will surely lead to targeted
replacement ads, played at higher volume

Any thoughts on other ways to monetize? ;-)

------
goodmachine
"I started Adblock Radio in the end of 2015, a few months after completing my
PhD studies in fusion plasma physics."

Now I feel sad.

~~~
dest
What makes you sad? The fact that there is not enough funding for fusion
plasma research? Or my unconventional career choices? :)

------
nnd
I think the way to go is directly supporting Podcast makers (outside of the
platforms like Patreon who have dubious censoring practices). The ad-based
model is inherently flawed and luckily it seems that most podcasts providing
valuable content are moving away from it.

~~~
agentdrtran
Care to give some examples? Patreon allows pretty much everyone.

~~~
nnd
There has recently been a controversy with Patreon removing some content
creators, accusing them of "hate speech". Here is a pretty comprehensive
summary:
[https://podnews.net/update/patreon-controversy](https://podnews.net/update/patreon-controversy)

------
Fogest
I have an automated system that is recording mixes from various radio stations
and I was wondering if it's possible to get this to run on a saved file? Say I
had an mp3, is there a way to run it through such program and get a resulting
ad-free version?

------
donclark
I would love to implement this on my Google Home mini when I listen to radio
stations. If the radio stations are not going to provide an option to pay to
not hear ads, I will gladly pay for a service to block ads.

~~~
dest
I am working on a "Man in the middle" device that would filter audio streams
before they reach players like Google Home.

------
fpgaminer
Great article, thank you for sharing.

About the machine learning model; forgive me if any of these questions were
covered in the article:

1) Why classify as three categories (music, talk, ads) versus just two (ads,
content)? Seems like that might help simplify things, since I assume all we
care about is whether something is an ad or not.

2) So based on the description you're feeding 4 seconds worth of buffer into
the model for classification, every second; the model being a stateless RNN.
I'm curious, did you try just using a (1D) CNN instead? That would allow you
to use a more robust architecture, versus RNNs which tend to be finicky. (And
RNN doesn't seem to provide benefit, since you aren't using its state, other
than potentially being a smaller model).

3) AFAIK there are loss functions which penalize false positives (in this
case, penalizing incorrectly tagging content as ads). Was this experimented
with?

4) This one is more of a curious idea: 4 seconds worth of buffer might lack
enough context, which we can see play out in some of the failings alluded to
in "Future improvements". So I'm curious...

Suppose, if your architecture isn't set up this way already, that you've got a
layer before your final layer that's just a little bit bigger than the number
of categories. Say, 32 features. (A lot of RNN architectures are built with an
output matrix that's HiddenState => Outputs (n=3). So I'm suggesting
HiddenState => Embedding [n=32] => Outputs [n=3]).

Train that architecture like before. Now, take that trained model and chop off
the output layer so that its output is those 32 element vectors. We can now
build a secondary, context model on top of it. It takes as input those 32
element vectors (one vector per second) and outputs the same classifications
(ad/not-ad). But since its input is much smaller, you can train an RNN with a
much, much deeper BPTT.

Hopefully this model can not only be more intelligent by using context, but
also take the place of your hysteresis.

Since the underlying model is fixed, you can pre-compute its outputs on your
dataset, so there's absolutely no cost while training the higher order model
(and the pre-computed dataset will be really tiny; 32 elements every second
would be only 100mb for your 10 day corpus.)
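
The back-of-the-envelope arithmetic behind that figure, assuming one vector
per second stored as 4-byte floats (the storage format is my assumption):

```python
def embedding_cache_bytes(days, features=32, bytes_per_value=4,
                          vectors_per_second=1):
    """Size of the pre-computed embedding dataset, in bytes."""
    seconds = days * 24 * 60 * 60
    return seconds * vectors_per_second * features * bytes_per_value


# 10 days of audio: 864,000 vectors of 32 floats each.
print(embedding_cache_bytes(10) / 1e6, "MB")
```

That comes to 110,592,000 bytes, so a little over 100 MB, small enough to keep
in memory while training the second model.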

Now, this is somewhat dependent on your corpus being of the nature "here's a
long section of audio where the ad is tagged precisely". i.e. it can't just be a
mixed bag of "this whole audio segment is ad/not-ad". If you don't have that
kind of data you'd need to create it (assuming my crazy idea is worth the
effort). Hopefully you wouldn't need much. This higher order model shouldn't
need to be very complex, which means it doesn't need a big corpus.

Or you could try to use a CTC loss function, in which case you just need a
dataset that's large chunks of audio and vague labels like "this length of
audio has 1 ad somewhere in it."
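
To make question 3 concrete: one common option (not necessarily the only one,
and not something the article says it uses) is a class-weighted cross-entropy,
where errors on samples whose true class is content cost more than errors on
true ads. The weights below are arbitrary illustrative values.

```python
import math


def weighted_cross_entropy(probs, true_label, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class.

    `probs` maps each class name to the model's predicted probability.
    """
    return -class_weights[true_label] * math.log(probs[true_label])


# Content that the model thinks is probably an ad is penalized 3x harder
# than it would be with uniform weights.
probs = {"ads": 0.7, "talk": 0.2, "music": 0.1}
weights = {"ads": 1.0, "talk": 3.0, "music": 3.0}
loss = weighted_cross_entropy(probs, "talk", weights)
```

Most deep learning frameworks accept per-class weights on their standard
cross-entropy loss, so this usually does not need a custom loss function.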

By the way, this isn't intended as advice; I'm by no means a domain expert
here. I'm really just thinking out loud. And since none of the other comments
at the time of this writing are discussing anything other than the morality of
this software, I thought I'd take the time to inject some more technical
discussion.

> hip-hop music, easily mispredicted as advertisements

Well given the endemic use of name dropping and product placement in popular
hip-hop, I'm not sure that's a misprediction ... :P

Joking aside, one thing that might be useful is to find a stream with
positively no ads (e.g. Spotify Premium, Apple Music, etc). Play that through
a trained classifier. If the classifier ever detects an ad, add that audio
sample to your corpus.

~~~
dest
Thank you for your high-quality technical questions.

1) Audio streams are naturally segmented between those three states.
Separating talk from music is rather easy. The most challenging part is to
separate talk from ads (spoken ads) and music from ads (musical ads).
Filtering ads and talk gives you a music-only experience, which is good when
you want to work for example.

2) 3) I am not an expert on RNNs. My understanding is that the LSTM keeps the
state between each prediction. I will hopefully get back to you with more
precise answers.

4) Your idea about the embedded layer N=32 looks very smart. The dataset is,
strictly speaking, a mixed bag of 10-second 100% ads, 100% speech and 100%
music (with some slight tolerance at the edges of the track). But when
labeling data, I have often tried to label contiguous segments of a minute to
several minutes. Though, to not spoil the dataset, I often get a discontinuity
on transitions (e.g. music -> ads). So in conclusion I would need to create
the dataset you describe. Not a big deal I guess.

X) Streams without ads are quite common. E.g.
[http://www.radiomeuh.com/](http://www.radiomeuh.com/) or
[https://www.fip.fr/](https://www.fip.fr/) The thing is that you get a very
big corpus of whitelisted data. Too big actually. The solution I have used for
a while is to monitor the radio metadata (using
[https://github.com/adblockradio/webradio-metadata](https://github.com/adblockradio/webradio-metadata))
and download music with youtube-dl. It worked quite well to bootstrap ;)

Is there any way we could keep in touch apart from Hacker News? Feel free to
email me if you feel like it.

------
jonbarker
The reason ads for podcasts don't deserve blocking and are OK is that they are
completely opt-in. You have to enter a code or go to a URL in order for the
podcaster to get credit.

------
allenleein
The reason why audio adblocker should definitely exist:

those ads will be outdated one day (company/product shut down...).

------
l9k
I would suggest integrating a simple local music player that plays during ads
and allows selecting a specific playlist.

~~~
dest
I would not know how to do it on the web player, such as
[https://www.adblockradio.com/player/](https://www.adblockradio.com/player/)
Do you have an idea?

On standalone players, it is a really great idea and it is already being
considered.

------
polskibus
I wonder if this could work with Spotify free tier.

~~~
helb
I've tried this a few years back when Spotify started repeating some annoying
ad over and over. It's quite "low-tech" compared to adblockradio, but it
worked pretty well.

[https://github.com/serialoverflow/blockify](https://github.com/serialoverflow/blockify)

------
3rdkulturekyd
Like the sound of this!!

