
Jumping out of Google's bubble. It's more than just a matter of self-respect - behindai
http://oleginspired.zecamp.com/c1otdqmWfdAS4f
======
reimertz
4 years ago, when I told my non-techy friends about the risks of using Facebook
and Google, and of allowing apps full access to photos/camera/location etc., I
was a tinfoil-hat guy. I don't blame them.

Today those same people are the ones discussing and raising their concerns.

I think privacy will be the next big selling point of any platform and there
is already one company that figured it out; Apple.

~~~
flokie
You mean one company that figured out _marketing_ that before the rest. Apple
is no saint.

~~~
InTheArena
We can hope that Apple is just as ahead of the curve now as they were then.
Apple may have figured out marketing, but let's not equate that with selling
you as a good to other corporations and governments. There's a lot I don't
like about their model. I think it's completely disingenuous to claim that
Apple is paying App Store developers, as opposed to users buying goods.
However, their privacy stance is something that should not be subject to
whataboutisms.

------
kazinator
> _Google use your personal data to put you in a bubble of your own interests.
> Creating illusion of objectivity, it narrows our perspective down and slows
> down the process of personal development._

I don't agree. People want this. Long before computers, let alone Google,
people had no problem absorbing themselves in themselves and their interests.

But, wait, really? Google uses my personal data to put me in a bubble of my
own interests? A large chunk of the results for anything I search for are
SEO-d garbage, and many are irrelevant in other ways. _That's_ supposed to be
a bubble consisting of my interests?

> _Google ignores website privacy. Even if the website owner doesn't want to
> expose any content to search engines, Google will come, get it, show it in
> its search results, and make money off it._

Absolute, unadulterated nonsense. The Google spider identifies itself via HTTP
headers. It's trivial to refuse service to specific clients.

Example log:

    66.249.79.219 - - [23/Jun/2019:15:36:58 -0700] "GET /robots.txt HTTP/1.1" 200 66 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

I could easily set up a filter that returns a 404 when the request comes from
Googlebot. Also note that it's accessing robots.txt. I don't have a link to
that anywhere; it is fetching it spontaneously. Why would it do that if it had
no intention of honoring the content?
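That kind of filter is a few lines of logic. A minimal sketch, assuming a
hypothetical `filter_request` hook in front of whatever actually serves the
page (the function name and the 404 body are illustrative, not from any
particular web framework); the User-Agent substring matched is the one from
the log line above:

```python
# Refuse service to a crawler based on its self-identifying User-Agent.
# "Googlebot" is the substring that appears in the UA string Google's
# spider sends, as seen in the access log above.

def filter_request(user_agent: str, serve_page) -> tuple[int, str]:
    """Return an (HTTP status, body) pair; 404 for requests from Googlebot."""
    if "Googlebot" in user_agent:
        return 404, "Not Found"
    return 200, serve_page()

status, _ = filter_request(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    lambda: "hello",
)
print(status)  # 404
```

The same check is routinely done in web-server configuration instead of
application code, but the principle is identical: the spider announces itself,
so refusing it is trivial.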

I sometimes put private content at a URL to make it easily accessible from
somewhere. No links to it exist; you have to know the URL. I've never had such
a thing accidentally indexed.

If people leave documents in searchable directories, don't use robots.txt, and
don't filter out the Google indexer, they shouldn't be surprised if those
documents end up indexed.
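The robots.txt mechanism the comment describes can be demonstrated with
Python's standard-library parser — the same logic a well-behaved crawler
applies before fetching a page. The rules string here is a made-up example:

```python
# Sketch of how a crawler consults robots.txt before fetching a URL,
# using the standard-library robots.txt parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A crawler honoring these rules skips /private/ but may fetch elsewhere.
print(rp.can_fetch("Googlebot", "https://example.com/private/doc.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/public/page.html"))  # True
```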

~~~
behindai
> Google ignores websites privacy. Even if the website owner doesn't want to
> expose any content for search engines, Google will come, get it, and show it
> on its search results and will make money off it. Absolute, unadulterated
> nonsense. The Google spider identifies itself via HTTP headers. It's trivial
> to refuse service to specific clients.

The Google spider ignores rules in robots.txt and shows results from closed
directories. I could write a separate article about how we fight against it.

~~~
kazinator
If that is so, why does the indexer bother fetching these files at all? It's a
waste of bandwidth and cycles to be doing this extra fetch of robots.txt for
countless sites.

~~~
behindai
Are you kidding me? The average robots.txt is about 500 bytes per website,
while the average web page is about 2 MB. So a crawler that respectfully
downloads robots.txt saves far more bandwidth by not downloading the excluded
pages.

------
reaperducer
I'm still in the (long) process of de-Googling my life.

Does anyone know of a list of addresses that I can plop into my hosts file to
block Big G's tracking on non-Google sites? Something that is updated
regularly?

~~~
Fnoord
Privacy Badger by EFF should do the trick. On top of that I can recommend
Firefox + uMatrix.

If you want DNS-based blocking, Pi-hole indeed. If you want hosts-based
blocking, you're going to have to update the file manually (AFAIK; there are
probably managers that auto-update it for your OS). DNS-based blocking is
network-wide, covering your Android TV, your wife's smartphone, and so on. I
have my DNS server available via a WireGuard VPN. It also does DNS over TLS,
which adds security.
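For the hosts-based route, each entry maps a tracking hostname to a
non-routable address. A sketch, using a few well-known Google tracking
domains (illustrative examples, nowhere near a complete list):

```
0.0.0.0 google-analytics.com
0.0.0.0 www.google-analytics.com
0.0.0.0 www.googletagmanager.com
0.0.0.0 doubleclick.net
```

`0.0.0.0` makes the lookup fail fast, which is generally preferable to
`127.0.0.1` (the latter can cause timeouts if something is listening locally).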

Put Google in its own Firefox Container via Google Container extension.

You still need some other measures to avoid tracking and fingerprinting, and
using these will increase the number of CAPTCHAs you'll have to solve. I can
think of e.g. CanvasBlocker, and setting privacy.resistFingerprinting to
true.

There's also a Firefox extension that removes UTM parameters from URLs:
Tracking Token Stripper. Not sure about avoiding AMP.

------
elorant
I'd like to completely de-Google, but the thing is that I have sites that run
AdSense and I'm happy with it. Not that I'd mind a viable alternative, though.

------
AlphaWeaver
Anybody know why this fell off the front page so quickly?

~~~
Jerry2
A massive pro-Google bias.

