
PiHole-Google: Completely Block Google and Its Services - Jerry2
https://github.com/nickspaargaren/pihole-google
======
ravenstine
This is a nice idea, but the one thing I haven't been able to divorce myself
from is YouTube. I really hate how Google has allowed such a wealth of
constant information that completely dwarfs alternative video hosting sites.
As censorious as Google can be (now "up next" is always some video from CNN or
Fox), blocking YouTube from my network would mean cutting myself off from a
large portion of the world.

~~~
kalleboo
It’s fascinating how differently the YouTube algorithm treats people. I've not
once seen a Fox or CNN video recommended; I didn't even know they had a
presence on YouTube.

~~~
whatshisface
Personalization makes it incredibly hard to "watch the watchers," because
everyone is getting a slightly different view of what Google is doing. I would
like to see a program where users submitted data about their recommendations
to researchers so that we could uncover Google's opinions. It would have a lot
of financial value to YouTubers and would make it harder for Google to abuse
their role as censor.

I could imagine shadow-burning YouTubers without banning them by shrinking
their recommendation audience.

Further, it would be good for Google. Every little shift in the weather is
going to get blamed on them whether they deserve it or not, now that it's
common knowledge that they wield this power in more than zero cases. Google is
about to discover why judges write opinions. Administering justice from secret
meetings leads to popular dissent more than it leads to justice.

~~~
distances
I clear my YouTube search and watch history about once a week. Partly because
of privacy, but also because a single binge of, say, metal casting videos does
not mean I want them recommended ever again in the future.

~~~
Nextgrid
They use browser fingerprinting and/or IP addresses as well. Even on a brand
new browser session, doing something even slightly related to the previous
session brings back its entire recommendation history.

~~~
maccard
> doing something even slightly related to the previous session brings back
> its entire recommendation history.

Are you sure that it does, and that it's not just a case of "hey we've never
seen this person before, but they watched X, let's immediately start with
recommendations Y and Z because other people who watched X were engaged with
it?"

~~~
Nextgrid
I used to think that and gave them the benefit of the doubt, but then I
realised that some of the videos being recommended had nothing to do with the
one currently watched other than the fact I watched similar ones previously.

------
geokon
The problem is that JS, fonts and other CDN-hosted stuff won't load and websites
will hang or work weirdly - particularly Stack Overflow. Because it's all over
HTTPS you can't MITM it and inject your own with OpenWRT/piholes. Decentraleyes (a
Firefox browser extension) fixes some of this, but not all. If anyone has any
additional suggestions, please let me know (it makes life bearable in China
without a VPN)

~~~
maccard
Are there any extensions that modify external resources and point them towards
a "trusted" CDN? E.g. requesting

<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>

would automatically remap to

https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js
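
The remapping asked about here can be sketched in a few lines of Python. This
is only an illustration of the idea, not how any existing extension works: the
`CDN_REMAP` table and the `remap_cdn_url` name are hypothetical, and the sketch
assumes the path layout is identical on both CDNs, which happens to hold for
the jQuery URL above but not in general.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: blocked CDN host -> replacement host.
# Assumes both hosts serve the file under the same path, as in the
# jQuery example above; other libraries would need per-path rules.
CDN_REMAP = {
    "ajax.googleapis.com": "cdnjs.cloudflare.com",
}


def remap_cdn_url(url: str) -> str:
    """Return url with its host swapped for a mapped CDN, or unchanged."""
    parts = urlsplit(url)
    new_host = CDN_REMAP.get(parts.hostname or "")
    if new_host is None:
        return url  # not a host we know how to remap
    return urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )
```

A real tool would also have to deal with subresource integrity hashes and
version mismatches, which this sketch ignores.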

~~~
WorldMaker
It is great that you could locally cache the top X fonts in Google Fonts and
never have to redownload them from Google's CDN. It's just too bad that having
fonts locally installed or not can be a signal to trackers; otherwise it
would be a lot easier to recommend that everyone just install larger font
banks.

~~~
maccard
Could we bundle more fonts with Firefox? Or provide a browser opt out for that
behaviour...

~~~
WorldMaker
Bundling more fonts with browsers and operating systems by default is probably
the biggest way to do that.

(The corresponding problem with that being how many people would then blame
that as browser bloat and complain about the size of all the fonts and how
much they "clutter" one's font system.)

The browser would have to be pretty tricksy to solve the tracking problem with
local fonts, because the tracking techniques themselves are pretty tricksy.

Such as: Render text to a GL target as fast as possible and hit detect the
metrics of the asked for font versus the fallback font.

You would think techniques by the browser to minimize FOUT (flash of unstyled
text) would mitigate this sort of tracking, but some of the techniques
involve timing between JS load and DOM Ready events.

Admittedly there are easier tests than font loading tests for deanonymization
on the web, but obviously if the goal is to de-Google it is worth keeping in
mind.

------
julianlam
I find Google Container to be an excellent plugin to segregate my Google
account from the rest of my browsing. It's not an official plugin from
Mozilla, but it is forked from the Facebook container plugin.

~~~
mattlondon
Same here - been using Google Container for 6+ months now and very happy with
it. Highly recommended - you can do this yourself with just normal containers
in Firefox, but this comes preconfigured with all the non-obvious domains you
might not know about. No connection - just a satisfied user.

https://addons.mozilla.org/en-US/firefox/addon/google-container/

Only problem with it is now reCAPTCHA sites are a huge pain to use since you
have to answer about 15 challenges before you can get in (since you look totally
unknown to Google outside of the container). It is often better to just ignore
these sites now, but it is not always possible.

------
3xblah
Another approach is whitelisting. Like a default firewall rule of "block all"
and a set of specific exceptions, I find this approach can be easier to
manage. Probably not going to work for everyone but works for me.

Figure out what domains I need to access for the content I am after[1] and
just allow those. "Block" everything else. For example, I might need something
like .googlevideo.com once in a while but I will never need something like
googletagmanager.net.

[1] To do this, I just go through the logs of a local authoritative nameserver
that I run solely for this purpose, i.e. collecting lists of needed domains.
Then I add the necessary DNS data to /etc/hosts or another local authoritative
server, e.g., tinydns. I believe unbound or pdns_recursor can serve static
data as well.
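
The "block all, allow a few" rule described above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the commenter's
actual setup: the `is_allowed` name, the leading-dot convention for "this
domain and its subdomains", and the sample entries are all hypothetical.

```python
from typing import Iterable


def is_allowed(hostname: str, allowlist: Iterable[str]) -> bool:
    """Default-deny check: True only if hostname matches an allowlist entry.

    Entries starting with "." match the domain itself and any subdomain;
    all other entries must match exactly.
    """
    name = hostname.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("."):
            if name == entry[1:] or name.endswith(entry):
                return True
        elif name == entry:
            return True
    return False


# Illustrative entries only; a real list would come from the DNS logs.
ALLOWLIST = {".googlevideo.com", "news.ycombinator.com"}
```

Matching on the dotted suffix rather than a plain substring keeps a lookalike
such as `evilgooglevideo.com` from slipping through.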

Does the author mention avoiding using Google as a third party DNS service? In
the beginning, PiHole, i.e., preconfigured dnsmasq, was pointed at some third
party DNS service, maybe Google. Not sure what the default configuration is
today. If it was Google, then is there any irony in that a project designed to
block ads is by default having its users send their IP and ISP location to an
advertising company probably hundreds if not thousands of times over in a
single day of web use?

~~~
jasode
_> Another approach is whitelisting. Like a default firewall rule of "block
all" and a set of specific exceptions, I find this approach can be easier to
manage._

I tried the whitelisting approach but quickly found out this breaks many
websites with shopping cart and credit-card checkouts because they use
payments api gateways. Because the url for the card processing gateway is a
different company from the ecommerce site you're visiting, it has a totally
different spelling so you _can't predict_ what to put in a whitelist
beforehand. In turn, if you do whitelist the payment gateway url, you might
then find out it makes _another api call_ to a fraud detection url which is
another totally different url that you didn't know you had to whitelist.

Whitelisting DNS entries is workable for use inside of a single virtual
machine that deliberately restricts a web browser to access a few websites
like youtube.

However, I don't see how it's possible to use the whitelisting strategy on a
PiHole that globally filters the entire family accessing it with multiple
desktops and smartphones. It's not easy to tell if a spinning hourglass or
beachball is happening because a website is slow or whether the whitelist
is missing some url entries. The family members would constantly be visiting
new and legitimate urls so it seems very cumbersome to try and keep up with
adding new whitelist entries for everybody.

~~~
3xblah
For commercial web use, I use a DNS cache just like the website creator would
expect; I use a popular browser in these instances, too. Nothing out of the
ordinary. For exactly the reason you mention. If something goes wrong I want
to be able to say I am the "typical user", not an enlightened one.

_However_, I rarely use the web for commercial purposes. Almost all use is
non-commercial.

I do not use a Pi-Hole. I do like dnsmasq. I prefer djbdns. I use older
hardware running Net/OpenBSD as routers and newer hardware running OpenWRT.

I also do not use popular graphical browsers much. I probably would not use
whitelisting if I was doing all web use via a popular graphical browser. I
get reasonably consistent speed across all websites by using text-only browsers
and tcp/http clients.

Cannot really speak for other users. Everyone is different. For me,
whitelisting works well.

------
waltwalther
I have been running a pi-hole server at my home for almost a year. We have, at
times, around thirty devices on our network, (thermostat (non-nest), several
Google Home devices, numerous phones, 4 desktops, 4 laptops, 3 ipads, 1 TV, a
chromecast/roku/firestick, a few smart receptacles, and a Xfinity modem) and
sometimes the traffic is pretty neat to examine. It's interesting to see which
devices phone home.

Whenever a necessary site is blocked it only takes a few seconds to whitelist
it. I can also easily blacklist sites. The GUI is very easy to access and use.
We have never had an issue with YouTube (YT premium) or anything else really,
but occasionally a link will be blocked because of Google or other ad traffic.
This has never happened with YT or any other streaming services.

One thing to remember is VPN traffic ignores the Pi-Hole server. Even when the
router/computer/device DNS is set to use it. This has never been an issue for
us, as only a handful of devices here are using VPN. I suppose it could be
under the right circumstances, but it is easily fixable.

------
leovander
>GAFAM

Never seen it listed out like that, I thought it was FAANG. Or is FAANG only
used in reference to top salaries in the Bay Area?

~~~
Nextgrid
GAFA is the French-speaking equivalent of FAANG (although they all seem to
omit Amazon and Netflix).

~~~
michaelbazos
GAFA(M) omits Netflix but not Amazon, since it stands for Google Amazon
Facebook Apple (Microsoft)

~~~
Nextgrid
Yes I understand - I was talking more about the ordering of them. English-
speaking websites tend to use FA(A?)NG while French-speaking ones tend to use
GAFA(M?).

------
kyrra
If you don't mind blocking everything hosted on GCP as well:

> dig TXT +short _netblocks{,2,3}.google.com | tr ' ' '\n' | egrep "(ip4:|ip6:)"

Gives you a full list of all of Google's IP blocks. You can just blackhole
those.
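
For anyone who would rather post-process the `dig` output in a script, here is
a rough Python sketch of the same filtering step. The function name is
hypothetical, and the sample record in the usage note uses documentation
address ranges, not real Google ranges. Whether these records really cover
everything is debated in the replies.

```python
from __future__ import annotations


def extract_netblocks(txt_records: list[str]) -> list[str]:
    """Pull the CIDR blocks out of SPF-style TXT record strings.

    Keeps only the ip4: and ip6: mechanisms, mirroring the egrep above.
    """
    blocks = []
    for record in txt_records:
        # dig +short wraps each TXT record in double quotes
        for token in record.strip().strip('"').split():
            if token.startswith(("ip4:", "ip6:")):
                blocks.append(token.split(":", 1)[1])
    return blocks
```

For example, feeding it the made-up record
`"v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 ~all"` yields the two CIDR strings
without their `ip4:`/`ip6:` prefixes.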

~~~
jedberg
That's just their SPF record. It's only a list of IPs that google.com email
might come from (or any domain that imports those records)

~~~
_wmd
It's in SPF format, but it's also everything. See e.g.
[https://cloud.google.com/appengine/kb/](https://cloud.google.com/appengine/kb/)

Another method is using GeoIP's ASN database, but they also run many ASNs so
it would require a little effort to ensure you have them all

------
jedberg
Has anyone actually used this? Does the web become completely unusable? I
suspect blocking their fonts and their CDN for jquery would be enough to make
most of the web unusable.

~~~
toastal
You can use the Decentraleyes add-on to deal with jQuery on a CDN

~~~
jedberg
I can, but getting my whole house to use it including the iPhones may be a bit
tough (this is a Pi-hole add on so it needs to work without device changes)

~~~
toastal
I get that. Using uMatrix, it becomes really obvious how many websites are
reliant on jQuery and likely don't really need it.

------
tomatotomato37
How does this deal with recaptcha? That thing is the bane of my web browsing
experience, but at least with my current umatrix setup I can toggle it back on
in 30 seconds if I need to pass the challenge; if I need to remote into the
DNS every time I hit a challenge it is a no go for me

~~~
jedberg
It says at the bottom that you might want to whitelist recaptcha

------
godelski
This makes me think of the "Cutting the Big 5" article [0] that was on here a
few months back. While I agree with a lot of the sentiment here it seems that
a complete block is actually impractical. Instead I would love to see these
kinds of projects not just blanket cover all of FAMG, but rather target the
most nefarious ones. I definitely don't know if this is even possible. But is
there a way to use services but cut out a significant portion of the tracking?
Those are the curated lists I'd love to see.

[0] https://gizmodo.com/i-cut-the-big-five-tech-giants-from-my-life-it-was-hel-1831304194

------
user17843
This is overkill.

A way simpler solution is to simply not have a registered account with those
companies. That's where the problems start, when they tie certain browsing and
telemetry data to your true identity.

For everything else a good content blocker + the typical pihole list that
includes telemetry domains is enough protection.

I am registered with Apple and Amazon, and there's no way for me to change
that because there is simply no one else that delivers this kind of value.

Long-term I could see the possibility of leaving Amazon, but there is a
security-advantage when using amazon because otherwise I would leave all my
personal data to countless small vendors who regularly get hacked, etc.

~~~
csydas
This is actually not feasible as a solution because of shadow profiles. Google
et al. track you even when you are not logged in. Simply landing on a page is
enough to capture your use habits and infer browsing/purchasing patterns from
them. Look at Google Purchases, revealed to many just a bit ago. It was
retroactive for sure, just scanning our inboxes, which Google does have access
to, but it can use known information to find seemingly anonymous data from
referred info in the Anon chain.

It's not really a choice to say "just don't use it", because even appearing on
a site with Google tie ins feeds mineable information.

~~~
user17843
You contradict yourself. Google Purchase history requires a google account,
they can't connect it to you if you are not logged in.

The reason google pushes the log-in in their browser is exactly because they
want to be able to tie this all to your account.

~~~
csydas
No, that was an example, not a requirement. Google has this history they
associate whether or not you have a Google account. The account just
solidifies it. You're still being tracked and identified without the account.

~~~
user17843
and that's an assumption you make that requires evidence. Your claim is that
there is not only this kind of identification happening, but that it happens
even if I have the common tracking blockers. Otherwise the blocklist in this
thread would be completely overkill, just as I said.

------
madads
I would like to simply block irrelevant YouTube ads while my toddler indulges
in ‘Land before time’ episodes. Is that possible? Last time I checked they use
some randomised domains to load ads...

~~~
Anarch157a
Firefox with uBlock Origin takes care of that. I haven't seen a YouTube ad in
years.

Yes, they use randomised hostnames, but there are other parts of the URL that
are not.

If you don't want to use a browser extension for pattern matching the whole
URL, you're gonna need a transparent proxy in your network's gateway.

------
idlewords
I like that this site punts on reCAPTCHA. The web is not usable without Google
services, which fact is useful in talking about the impossibility of consent
as a model for regulating Google.

~~~
CreatedForThis
Alternatives exist. The web might not be entirely usable without them, but it
can be up to a certain point. We've separated the list into multiple
categories, so that it is easier to block only some major parts of their
services. And we've indicated the domain to whitelist in case you have issues
with reCAPTCHA.

~~~
idlewords
My issues with recaptcha don't matter. I can't use a large part of the web
without enabling recaptcha.

------
cortesoft
If you block AWS, Google Cloud, and Azure... that is pretty much the whole
internet.

~~~
CreatedForThis
But that isn’t the point of this blocklist tho.

------
oil25
Nice idea, but you're better off accomplishing it with TLD wildcards or AXFR
transfers than a hosts list, since new sub-domains are always being created
and rotated.
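
As a rough sketch of the wildcard approach: dnsmasq's `address=/domain/ip`
option answers for the domain and all of its subdomains, so one rule per
registered domain can stand in for many hosts-file lines. The helper below is
hypothetical and simply generates such rules from a list of domains; wiring
the output into a particular Pi-hole install is left as an assumption.

```python
def dnsmasq_wildcard_rules(domains, sinkhole="0.0.0.0"):
    """Build one dnsmasq address= rule per unique domain.

    Each rule covers the domain and every subdomain under it.
    """
    seen = set()
    rules = []
    for domain in domains:
        # normalise case, surrounding whitespace and a trailing root dot
        domain = domain.strip().lower().strip(".")
        if domain and domain not in seen:
            seen.add(domain)
            rules.append(f"address=/{domain}/{sinkhole}")
    return rules
```

The resulting lines could be written to a file in dnsmasq's configuration
directory; newly created subdomains are then covered without updating the
list.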

------
zeckalpha
Hosted by Microsoft, suggests blocking Microsoft, too.

~~~
CreatedForThis
I've proposed the idea to move it elsewhere. Either to an alternative
service like GitLab, or even to a self-hosted Gitea instance.

------
tempodox
> Simply go into to your blocklist settings

Whatever does that mean? Is that a browser thing or a firewall thing or
something else entirely?

~~~
arnarbi
Presumably it's a setting on Pi-Hole: https://pi-hole.net/

------
bronlund
Why is Apple on this list? What have they done?

~~~
CreatedForThis
"by their size, they are particularly influential on the American and European
Internet both economically and politically and socially and are regularly the
subject of criticism or prosecution on tax matters, abuses of dominant
positions and the non-respect of Internet users' privacy." In short, that's
what it is. But we can pinpoint each case one by one. This filter list only
concentrates on Google for the moment, though.

------
ggg2
this will not help much as some google products use IPs (4 and 6) directly
too.

~~~
dredmorbius
CIDR blocks are your friend, at the firewall.

~~~
jakeogh
Know of a lib to manage them?

~~~
dredmorbius
Not offhand. The Routeviews project is a useful way to turn them up, though.

[http://www.routeviews.org/routeviews/](http://www.routeviews.org/routeviews/)

Particularly reverse DNS queries.
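
On the library question: Python's standard-library `ipaddress` module can
merge and query CIDR blocks. The sketch below is only an illustration with
hypothetical helper names; feeding the result to an actual firewall is not
shown.

```python
import ipaddress


def collapse_blocks(cidrs):
    """Merge overlapping or adjacent CIDR strings into a minimal list."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    # collapse_addresses rejects mixed IP versions, so handle them apart
    v4 = [n for n in nets if n.version == 4]
    v6 = [n for n in nets if n.version == 6]
    return list(ipaddress.collapse_addresses(v4)) + list(
        ipaddress.collapse_addresses(v6)
    )


def is_blocked(addr, blocks):
    """True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in blocks)
```

Collapsing first keeps the eventual firewall rule set small, since two
adjacent blocks become a single larger one.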

------
throw2016
At the moment things like youtube, twitter have become culture and 'technical
solutions' to both unrestrained greed, surveillance and this rich fabric of
human communication seem to miss the big picture of their cultural value.

The value of these platforms is not technical; it comes entirely from the
human element and everybody should be able to participate without opening
themselves to surveillance and abuse.

Like everything else to run a civilized society we need laws and it's
unfortunate that this basic first principle of organizing human society needs
to be reiterated and debated right until 2019 because of propaganda by Koch
brothers and their ilk on a self serving libertarianism which is as fantastic
as a disneyland version of reality.

