Hacker News
Pihole-Antitelemetry (github.com/moralcode)
142 points by luxpir on April 10, 2021 | 75 comments

> For a list of domains that should not break anything, use telemetry-domains.txt

For sure, because it's empty.

I find that a whitelist is easier to manage than a blacklist. I am not surfing the entire web every day. Why should every URL in existence be accessible by default? Instead, I prefer that every URL be blocked by default. No different from any other firewall configuration. With logging of blocked HTTP requests and DNS lookups, it is easy to discover telemetry.

> No different from any other firewall configuration.

Depends on whose firewall config, but I don’t suppose most firewalls are set up as whitelists for outgoing connections…

I appreciate your dedication but this seems like a huge pain. Like a restrictive Little Snitch config but with worse UI most likely.

I think that GUIs are a huge pain. I prefer config files. Also, Little Snitch is not portable across platforms. Mac-only.

Most firewalls are set up as whitelists for incoming connections. Similarly, I set up DNS and HTTP "firewalls" as whitelists for outgoing connections. Zonefiles define the RRs that applications are able to query and lists/maps/tables define the hostnames/URLs that are accepted by the proxy.
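A minimal sketch of the default-deny idea described above, assuming the whitelist is a plain set of domain names and that subdomains of a listed domain are also allowed (the comment doesn't say either way). `ALLOWED`, `is_allowed` and `filter_query` are made-up names for illustration, not anyone's actual setup.

```python
# Hypothetical whitelist; a real one would live in zone files or proxy tables.
ALLOWED = {"news.ycombinator.com", "github.com"}


def is_allowed(hostname, allowed=ALLOWED):
    """Return True only if hostname or one of its parent domains is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" against the whitelist.
    return any(".".join(labels[i:]) in allowed for i in range(len(labels)))


def filter_query(hostname, log):
    """Default deny: anything not listed is blocked and logged for review."""
    if is_allowed(hostname):
        return True
    log.append(hostname)
    return False
```

Reviewing the blocked-name log is then how new telemetry hosts get noticed, and how legitimate sites get added.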

I’d be divorced in short order if my family had to constantly ask me to add sites to the whitelist.

I can’t see this being workable for the vast majority of people.

You could probably put each person on their own vlan and let them manage their own lists.

It might postpone the divorce by about 5 minutes while you try to explain it.

I don't understand why you would use macOS if you care about privacy. Why do you use it? (Just to be clear, this isn't "boo macOS sucks", I'm genuinely curious why so many people use it, especially in IT-inclined communities.)

This has to be a mistake lol. I had kind of assumed that the lists that come with my pi-hole block most telemetry stuff already.

This was started 11 days ago. I would also start out with all domains in beta. If you add the list to pi-hole you will eventually get the domains once they are "stable" either way.

Even the beta list is only 24 lines long...

Yeah... about that. Seems less useful.

A reminder that this is not limited to PiHole. I use an EdgeRouter and this list can be dropped in when the right plug-in is used (1).

(1) https://github.com/britannic/blacklist

Thanks for the pointer.

> edgeos-dnsmasq-blacklist has been tested on the EdgeRouter ERLite-3, ERPoe-5, ER-X, ER4, UniFi Security Gateway USG3 and USG4 routers

A nice HOSTS resource that I frequently post in similar discussions (not affiliated, just a user for a looooong time) is:


Hasn't broken anything yet on my machines, and does a pretty good job (with my AdBlockPlus, uBlock Origin, NoScript, Privacy Badger).
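For anyone feeding HOSTS-style lists into another tool, here is a rough sketch of pulling the domain names out. It assumes each line is either `<ip> <name>` or just `<name>`, with `#` comments, and only reads the first name on a line; `parse_blocklist` is a hypothetical helper, not part of Pi-hole.

```python
def parse_blocklist(text):
    """Extract domains from hosts-style or one-domain-per-line text."""
    domains = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        # Hosts format is "<ip> <name>"; plain lists have just "<name>".
        name = fields[1] if len(fields) >= 2 else fields[0]
        name = name.lower()
        if name not in ("localhost", "localhost.localdomain"):
            domains.append(name)
    return domains
```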

Some devices started to use hardcoded IPs to phone home, so purely domain name based blocklists won't work with them. Is there a similar project with upgradable IP lists to deal with them too? That would be technically a firewall, but then ideally it should also implement DNS based blocking since we're protecting also from the inside. It would be a nice product to build around any of those ARM small boards with dual Ethernet and WiFi such as the NanoPi R1 and similar ones.

Not exactly what you’re asking but dnscrypt-proxy has IP based block lists. You list IPs and then any domain that resolves to one of those IPs is blocked. Works when companies set up domains that actually resolve to some 3rd party data tracker.
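To illustrate the idea (this is not dnscrypt-proxy's code, only a sketch of the check described): given the addresses a name resolved to, refuse the answer if any of them falls in a blocked range. `BLOCKED_NETS` and `answer_is_blocked` are hypothetical names, and the ranges are documentation-reserved addresses, not real trackers.

```python
import ipaddress

# Hypothetical blocked ranges (reserved example addresses, not real trackers).
BLOCKED_NETS = [
    ipaddress.ip_network(n) for n in ("203.0.113.0/24", "198.51.100.7/32")
]


def answer_is_blocked(resolved_ips, blocked=BLOCKED_NETS):
    """Return True if any resolved address falls inside a blocked network."""
    for ip_text in resolved_ips:
        ip = ipaddress.ip_address(ip_text)
        if any(ip in net for net in blocked):
            return True
    return False
```

Note this only affects DNS answers. A device that connects to a hardcoded IP without asking DNS never reaches this check.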

Actually blocking IPs as you’ve said is a harder problem sadly.

Why? If the devices use hardcoded IPs, then those should be fairly static so fairly easy to maintain in some list.

I'd think that the best workaround for doing these kinds of shenanigans will be using some form of DoH, in which case the countermeasure would be to set up an HTTP proxy which wouldn't allow http connections to "naked" IP addresses.
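A sketch of the "no naked IPs" check such a proxy could apply, assuming it sees the full request URL (a bare `host:port` from a CONNECT request would need different parsing). `targets_bare_ip` is a made-up name.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_bare_ip(url):
    """Return True if the URL's host is an IP literal rather than a name."""
    host = urlsplit(url).hostname  # strips the port and any IPv6 brackets
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal, so it is a hostname
    return True
```

This would not stop DoH to a server reached by hostname; that name would have to be blocked at the DNS level as well.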

DNS based block lists are incredibly easy to implement and maintain and require very little resources. All of the complaints from corporate IT admins about DoH demonstrate this. (I believe Chrome still won't default to DoH for corporate managed browsers)

Any normal home user can set up dnscrypt-proxy or PiHole and have it 'protect' their whole home network, but actually filtering your whole network's traffic based on IP is out of reach for most.

Blocking the IP means having something in the traffic flow. This would likely be a firewall if your aim is to block any "weird" connection from your network. But both firewalls and proxies are substantially more challenging than your run of the mill RaspberryPi Zero and PiHole.

That would be the goal. Malicious actors aren't going away anytime soon, so I would expect more and more devices in the future to use either encrypted or off standard DNS queries to different ports, if not downloading ads and uploading telemetry disguised as system upgrades. We'll likely get to a point in which we'll need to block connections address by address, in the hope they won't set up their malware on addresses and ports we can't block to keep the device functionality.

Some routers (DD-WRT I think?) now can hijack any non-encrypted DNS queries and send them where you want. That will be my next step.

You can DNAT outbound port 53 connections to an internal server. Any router/firewall with configurable NAT can do this. This is a must with some smart TVs for example.

My Mikrotik router does this easily. You can tell it (with a Firewall NAT rule) "any outbound connection to port 53 is to be redirected to this internal IP and port" -- and this internal IP and port is where my PiHole is.
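On a Linux-based router the same redirect can be expressed with iptables. Below is a sketch that only builds the argument lists; the interface name and Pi-hole address are assumptions to replace with your own, and `dns_redirect_rules` is a hypothetical helper. Nothing is executed here.

```python
def dns_redirect_rules(lan_iface, pihole_ip):
    """Build iptables argument lists that DNAT LAN DNS traffic to pihole_ip."""
    rules = []
    for proto in ("udp", "tcp"):
        rules.append([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-i", lan_iface,
            "!", "-s", pihole_ip,  # let the Pi-hole itself reach upstream DNS
            "-p", proto, "--dport", "53",
            "-j", "DNAT", "--to-destination", pihole_ip + ":53",
        ])
    return rules


for rule in dns_redirect_rules("br0", "192.168.1.2"):
    print(" ".join(rule))
```

If the Pi-hole sits on the same subnet as the clients, a matching masquerade rule is usually also needed so that replies appear to come from the address the client originally queried.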

Yup, with the Adblock package on OpenWrt this was a one-click option in the GUI. Doesn’t help with DoH unfortunately, but it definitely helps in general.

Or... point your DNS resolver to a nextdns.io and enable their ad/telemetry blocking lists (thousands of entries regularly updated)

1 When exceeding the free monthly quota, NextDNS will continue to answer DNS queries like a classic non-blocking DNS service.

This was linked from the repo. https://github.com/nextdns/metadata/tree/master/privacy Looks like you can just add those to pihole as well.

There's also filterlists[0]

[0] https://filterlists.com/

Hadn’t seen these before.

Bunch of JSON files there - any advice on which ones to use??

Those JSONs all link to the real source. This metadata is probably NextDNS specific. If you open the JSON, copy the link and add it to your pi-hole/AdGuard Home setup and you're set.
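If you'd rather not open each file by hand, here is a schema-agnostic sketch: it walks whatever JSON it is given and collects every string that looks like an http(s) URL. The field names in `SAMPLE` are invented for the example and are not the actual NextDNS metadata layout.

```python
import json


def find_urls(node):
    """Recursively collect every http(s) string value in parsed JSON."""
    if isinstance(node, str):
        return [node] if node.startswith(("http://", "https://")) else []
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        return [url for child in node for url in find_urls(child)]
    return []


# Invented sample; real files may be structured differently.
SAMPLE = '{"name": "Example list", "source": {"url": "https://example.com/list.txt"}}'
urls = find_urls(json.loads(SAMPLE))
```

Since it picks up any link in the file (project homepages included), check the result before pasting it into pi-hole.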

Is this effective against CNAME masking?

I recently found that my mesh wifi was logging all outgoing traffic. In a 4 person household where we are all online, the two Android devices absolutely dominate the logs with Telemetry. Samsung Smart TVs are pretty chatty too.

Internet tracking is completely out of control.

Why not add these to the default tracking lists used in Pi-hole and call it a day? Been using Pi-hole on (1) a Zero W and (2) a 3B+ for over a year now at two places. Around 2 million domains in the list and 70%-80% domains blocked like always.

Wouldn't it be more efficient to send imaginary data instead of completely blocking telemetry? Blocking your own telemetry data results in Google collecting just a bit less info about you. They can still make a decent profile of you from data they collect from other users' devices.

On the other hand if you poison the well you compromise other user data as well. Detecting and filtering out invalid data takes time and effort and by the time it is detected the bogus data has already been replicated and used to drive decisions. BTW would it be legal to inject bogus telemetry?

There was a Chrome plug-in maybe a year or two ago that did something similar. It automatically clicked every ad on the page. It ended up impacting revenue / billing enough that Google removed it.

Here it is: https://adnauseam.io/ Works on Firefox just fine.

I run unbound with dnssec on a raspberry pi to block domains. This list will be a fine addition to my collection. Sadly it is empty.
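For the unbound route, a sketch of turning a list of domains into an include file. It assumes an unbound version that supports the `always_nxdomain` local-zone type; `unbound_block_lines` is a hypothetical helper, and the domains in the usage example are placeholders.

```python
def unbound_block_lines(domains):
    """Render one local-zone directive per domain for an unbound include file."""
    names = set(d.strip().lower().rstrip(".") for d in domains if d.strip())
    lines = ["server:"]
    for name in sorted(names):
        # Answer NXDOMAIN for the name and everything beneath it.
        lines.append('    local-zone: "%s." always_nxdomain' % name)
    return "\n".join(lines) + "\n"


print(unbound_block_lines(["telemetry.example.com", "ads.example.net"]))
```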

Presumably the entries in the beta file will be moved to it once they are out of beta.

My sentiments exactly.

I'm tempted to set up a wifi with logging of all ips / names then connect a fresh android with no external apps up. See what appears.

But help me understand, obiwan, why is an empty file useful in this instance?

I did this for an M1 mac running macOS 11 (Big Sur) recently, and posted a summary and the actual pcaps:


Thanks for this. I recently was handed a Mac for work and was gobsmacked after running Little Snitch at an earlier post's suggestion.

Something along these lines is the idea.

"Fresh android" like, a Samsung phone? A Lineage phone? A google Pixel? A GrapheneOS Pixel?

Great start, now do Microsoft.

I find Microsoft is absolutely crazy. I have a Win 10 PC I mainly use for Prime Video and some idle browsing when I can't be bothered to turn on my main PC (which runs Linux).

All my browsers have some form of adblock extension. uBlock for Firefox/Linux and Edge/Win10, and AdGuard on Safari/MacOS.

According to the stats of my pi-hole over the last 24 hours, more than 50% of the queries originating from my Win 10 PC were blocked (6277 blocked out of 11930 total).

For comparison, my Mac, which is the computer I've used the most for actual browsing since last Friday afternoon, only had 1292 blocked queries out of 7100.
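Putting the two sets of numbers above on the same scale (a trivial calculation, using only the figures quoted in this comment):

```python
def blocked_share(blocked, total):
    """Return the blocked fraction of DNS queries as a percentage."""
    return 100.0 * blocked / total


win10 = blocked_share(6277, 11930)
mac = blocked_share(1292, 7100)
print("Win 10: %.1f%%, Mac: %.1f%%" % (win10, mac))
```

That works out to roughly 52.6% blocked for the Win 10 PC versus about 18.2% for the Mac.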

The Linux PC usually has extremely low numbers of blocked queries. It's probably thanks to the combination of uBlock and uMatrix and it running Arch, so practically nothing even tries to phone home.

My numbers are lower, but I've used OOSU10 to disable a lot of stuff.

Not sure if it's still the best tool for this, but in case someone wants to try it: https://www.oo-software.com/en/shutup10

They write the system processes.

Blocking some interceptable calls to some of their servers doesn't do shit.

Lobby them or your government to protect your privacy, or switch.

This is great.

I just set up a pi-hole here a few weeks ago.

Are there other good lists that people recommend I add?

I run these: https://www.github.developerdan.com/hosts/

Of course the PiHole default ‘Steven Black’ list is also a combination of many well maintained lists and so even if you don’t add lists, his project is regularly adding new sources.

I have found https://firebog.net/ to be a good source for generally non-disruptive lists, which you can pick and choose from based on your needs. Hope this helps.

I find good recommendations on reddit.

Pi-hole sub or something else?

Does anyone know about a well maintained (anti-) gaming blocklist?

> Research shows Google collects 20x more data from Android than Apple collects from iOS.

But "open-source", amirite? Thanks Google.

There was a discussion here a few days ago that showed how misleading this statistic can be, by pointing out that Apple is sending home geolocation while Android isn't. The conversation needs more nuance than who sends the most bytes.

AOSP is open source but all of Google's apps and Google Play Services are proprietary.

What, if anything, can be done about devices that hard-code DNS servers and get around your pi-hole?

You can force all port 53 traffic to your server.

There is nothing you can do about devices/apps that really want to use their own servers (DNS over HTTPS, pinned certificate), short of keeping them offline.

Is forcing all port 53 traffic to your pi-hole something that can be done on the pi itself? Are there any websites you could link to that would go into more detail?

You'd do this on the router, or the unit providing NAT, which is probably not the Pi.

Cf. Edgerouter: https://www.reddit.com/r/Ubiquiti/comments/6lndq4/question_r...

You have to do this on your router, so it's model-specific. Searching "<your router model> Pi-hole redirect" will likely turn up something of assistance.

Great work. Adding these to my domain blocklist now. Much appreciated.

It is crazy that people have to resort to such solutions. Why isn't telemetry illegal? If you were going to track someone in real life, you'd end up in jail in no time, but on the internet it is fine?

Tracking someone in real life is not illegal.

You can sell someone a physical device that makes a lot of noise and then sit outside their home and write down each time they use it. Nobody would be able to stop you.

It is called stalking and it is very much illegal.

Because people agree to it when they sign the "terms of service" of this software and these OSes.

This is bogus circular logic and you know it. People want to use thing X and will blindly press "I agree" because they simply see it as a door handle before entering somewhere.

Having such long ToS-es that "protect" the company against any eventuality should be by itself illegal.

It's a rigged system is what this whole thing is. Let's not pretend otherwise, please.

Life by design is rigged and there is little we can do to change it. Natural selection may help with some problems, but it is also cruel in its nature. Knowing it does not mean we should not try to make the world a better place, we should. But there are fundamental limitations like IQ, free will, laws of physics etc that we should not forget about. Going back to the ToS problem, they could probably offer a more expensive version with a ToS aimed at more demanding customers, so that they could opt out by paying more. I think it would be fair.

> they could probably offer a more expensive version with a ToS aimed at more demanding customers

This is already on offer, but not from Google.

I know, I was answering the question literally from a legal standpoint, nothing more. Chill.

Almost all click-through terms of service are, IMO, designed to prevent people from reading them.

https://www.eff.org/wp/clicks-bind-ways-users-agree-online-t... and many many others if you search for "click through terms of service readability"

Would people read ToS, if they were more "attractive"? Well, maybe some, but then again, it would be a chore anyway. I do not see a way to make people spend a substantial amount of time on it, if they are not absolutely forced to do it (for example if the stakes are high). However, I do not have any papers to back it up, it is just my hunch.
