
Noiszy – A Chrome plugin that creates meaningless web tracking data - yarapavan
https://noiszy.com/
======
_wmd
There have been a few of these plugins floating around recently, and really
everything that needs to be said about them appears in the comments already. Fake
traffic is wasteful, hard to make look authentic, and only serves to create
more records of the end user around the web rather than less (e.g. your laptop
IP was generating fake traffic? That probably means you had the lid open and
were doing something with it at that time).

I'd much rather see a browser with some kind of built-in distributed cache,
something along the lines of FreeNet, but trading perfect anonymity for
performance. Given a large chunk of disk space, and a handful of browsers
talking to each other in a local area (e.g. same ISP), it should be viable to
concoct a scheme where after a handful of browsers request a particular page,
the remaining browsers are confident enough that the data cached in their
local group is representative of the data sourced from the origin network.

There are a million issues to iron out with a scheme like that, e.g. bad
actors injecting crap into the cache, handling staleness, interactions with
dynamic content and API endpoints etc., but I think something like this would
have a much greater privacy benefit by denying at least some traffic to the
origin networks, or simply by keeping some of that traffic within the boundary
of the local ISP's network (and if the local ISP is evil, requests between the
nodes could be encrypted as in FreeNet).
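
A minimal sketch of the "confident enough" check described above, assuming each peer hands back its cached copy of a page and a browser only trusts the copy when enough peers hold byte-identical content. The function names and the quorum of 3 are hypothetical, and staleness and dynamic content are ignored entirely.

```python
import hashlib
from collections import Counter
from typing import List, Optional


def digest(body: bytes) -> str:
    """Content hash used to compare copies held by different peers."""
    return hashlib.sha256(body).hexdigest()


def trusted_copy(peer_copies: List[bytes], quorum: int = 3) -> Optional[bytes]:
    """Return a cached body only if at least `quorum` peers agree on it.

    Returns None when there is no agreement, in which case the browser
    would fall back to fetching from the origin.
    """
    if not peer_copies:
        return None
    counts = Counter(digest(body) for body in peer_copies)
    best_digest, votes = counts.most_common(1)[0]
    if votes < quorum:
        return None
    for body in peer_copies:
        if digest(body) == best_digest:
            return body
    return None
```

A single bad actor injecting a modified page is outvoted here, but a group of colluding peers larger than the quorum is not, which is one of the issues that would still need ironing out.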

~~~
guelo
That would be a cache of what's popular. That would actually help on the
surveillance side because it would effectively be a filter of the common
traffic allowing them to focus on your unique traffic.

~~~
_wmd
That's an excellent point!

------
liotier
Distributed spidering for a community-fed search engine index would provide
comparable end-user benefits while being considerably more socially useful.

~~~
ballenf
Or a local meetup group for subscribers to a compatible ISP (all DOCSIS 3.x
cable modems, e.g.) where you bring in your cable modems, throw them in a pile,
and then take home a random one. Do it monthly.

Combine that with a script that takes all non-sensitive browser history and
cookies and does the same -- swap out with a random stranger.

Neither is practical and the former would require (at least for my provider)
that I go through a sometimes cumbersome process of registering the new
device's MAC address.

~~~
luhn
How would the former affect web trackers?

------
watty
It's kind of funny their website includes TypeKit, SquareSpace, and Google Tag
Manager scripts - all of which can be (and probably are) tracking various things.
They may not be connected to "you", may be anonymized, or may be a unique number
representing you.

~~~
noiszytech
Hi - Noiszy creator here. You're right about the scripts - we're using GTM and
Google Analytics. I think it's 100% legit for sites (like mine) to want to
know how much traffic they're getting - this to me is a good use of data. I
work in the analytics field and actually I love data - I just don't like AI
using data in creepy ways. The thing is, there is a big moral grey area
between "count visits" and "creepy targeting based on everything you do". My
hope is that by disrupting the data, we can make the conversation happen about
the _right_ things to do with algorithms. I don't have the answers but I hope
to be a part of the discussion.

~~~
watty
I get it. Everyone, everywhere, is collecting data - whether it's CNN.com or
Noiszy.com or Google Analytics. While you may be using Google Analytics for
basic tracking of anonymous visitors, Google may be using it for "creepy
targeting based on everything you do".

I'm really not opinionated (I think the plugin is interesting and am
completely ok with being tracked) I just found it a bit hypocritical to create
a product against tracking while simultaneously providing tracking data to
Google.

~~~
noiszytech
I think of it as a product to deter AI processing of your data, not a product
against data itself. I, too, am ok with being tracked - but being ok with data
collection =/= blanket approval of all data processing & use.

Thanks for raising this point.

~~~
fenwick67
Google Analytics most certainly does feed user data into "an AI" and use it to
track users and sell ads targeting them. I'm not sure what you're getting at
here.

~~~
delroth
Does Google Analytics' privacy policy even allow Google to do that?

~~~
fenwick67
They are certainly allowed to according to their TOS:
[https://www.google.com/analytics/terms/us.html](https://www.google.com/analytics/terms/us.html)

"Google and its wholly owned subsidiaries may retain and use, subject to the
terms of its privacy policy (located at
[https://www.google.com/policies/privacy/](https://www.google.com/policies/privacy/)),
information collected in Your use of the Service"

Also read:
[https://www.google.com/policies/privacy/partners/](https://www.google.com/policies/privacy/partners/)

------
codydh
I don't mean to be a curmudgeon, but why put out this kind of privacy-minded
plugin for Chrome, one of the browsers I'd probably least trust to respect my
privacy?

~~~
gingerbread-man
Google has actually blocked similar "noise generation" apps from the Chrome
Store in the past. As I recall, there were a few extensions out there that
would 'click' on every ad displayed. Obviously that would have created serious
problems for Google, so they nixed it.

~~~
literallycancer
[https://adnauseam.io/](https://adnauseam.io/)

I wasn't convinced it was very useful and thought it would only take some easy
heuristics to filter out when I first saw it, but if Google removed it from
the store, it must have been a pain to deal with.

Maybe I'll even end up using it.

~~~
always_good
This website reminds me of the KenM post where he suggests that he's getting
back at Papa Johns by not tipping the pizza delivery driver.

Generating click fraud on your favorite web comic's website or other small
websites you may be visiting is pretty sad. Adsense bans for life.

~~~
literallycancer
I don't read webcomics, and somehow I don't think people like gwern who write
long-form content do it for the ad revenue. And even if there are ads on the
website, if they respect the DNT header, AdNauseam won't auto-click them.

(I'm not sure what the tipping anecdote is supposed to mean, but I don't
understand why US customers put up with tipping either, the business is
responsible for its operating budget, and there is no legitimate reason to
bother the customer with it.)

~~~
thrill
How do you determine if a server is respecting a DNT header?

~~~
dexterdog
You don't.

~~~
obmelvin
I believe that is GP's point.

------
slifty
Hello all! Recent creator of a similar tool that has been getting a lot of
buzz, I am here to throw some constructive thoughts out there!

These ideas are useless from a technical level (for all the reasons that have
been mentioned already.)

Where they are useful is at a social level. People are energized and ready to
fight. Many of them didn't know about this issue. Many of them didn't know
that there are things they can do as individuals to fight back. Your tool (and
mine) are getting attention because they open eyes and tap into pain.

As useless as noise might be, people understand the idea and that makes it
accessible. That means people will try it, get it, and share it.

We need to leverage that attention in order to teach those people things they
need to understand about privacy. Our tools should be seen as a gateway into
impactful approaches like Tor, VPN, HTTPS Everywhere, Privacy Badger, and the
EFF at large.

Tooting my own horn: that's what I've been doing with
[https://slifty.github.io/internet_noise/index.html](https://slifty.github.io/internet_noise/index.html)

In all interviews I make sure to explain that while this is an amusing form of
protest, it is not effective and people who care need to go take the steps
outlined on the project page.

A website can do this. A chrome plugin, however, risks being harmful with
minimal benefit. It minimizes the potential for communication to your
audience, it is also harder to access which means you are touching a more
narrow audience.

Here's the good news! The project I linked to is open source --
[https://github.com/slifty/internet_noise/](https://github.com/slifty/internet_noise/)
-- you could contribute to it directly and then update your plugin so that
instead of generating noise and hijacking their browser information you just
direct them to the website version of the concept.

~~~
noiszytech
Hey there - I like your project. Sounds like our tools are complementary, and
you make an interesting point about the browser-vs-plugin based approach. I do
think it's worthwhile to specifically browse news sites - I believe they're
the worst "filter bubble" offenders right now, and this can help break that.
The more efforts there are in this space, the better. Thanks for sharing your
project!

------
marvinkennis
I created something similar yesterday afternoon [0]. Instead of opening a new
tab, it just requests pages through a hidden iFrame and drops the headers so
the request goes through.

It doesn't click on anything, because it would be awkward if this by chance
started sharing things on logged-in Facebook profiles etc. I plan on adding
sequential requests over the weekend so the traffic is more realistic.

It's open source and on Github, so you can download, install and modify it
from there if you wish [1].

[0] [https://chrome.google.com/webstore/detail/decoy-requests/aeh...](https://chrome.google.com/webstore/detail/decoy-requests/aehfkpmplnnelkljnpekbbiddaigocmf)

[1] [https://github.com/marvinkennis/Decoy-Requests](https://github.com/marvinkennis/Decoy-Requests)

~~~
noiszytech
Cool! I really think this is a "the more the merrier" space. More tools =
good.

------
jstapels
Not to sound pessimistic, but as more and more people get forced into metered
bandwidth, how is using a plugin that generates extra random traffic "sticking
it to the 'man'"?

Edit: Yes, I know this is supposed to mask your actual viewing habits. But
security through obscurity has never really panned out for anyone in the end.

~~~
dullgiulio
This is not security through obscurity, this is obfuscation. Like everything
in security, it's not all-or-nothing. It adds valuable noise, it won't hide
your tracks.

------
amelius
Will this not be counterproductive, i.e. distribute your personal data to even
more websites?

By the way, I once got blocked by Google after installing a plugin that did
automatic random searches in the background.

~~~
agumonkey
Maybe install it in a different profile?

~~~
amelius
IIRC, my IP address got blocked for some time.

~~~
agumonkey
Yes, Google can still monitor IP behavior, but you don't leak cookies for
instance.

------
bognition
> Read and change all your data on the websites you visit

While I applaud the authors for trying to solve a problem, I will not install
this plugin unless it's open source. There is no way I'm going to grant the
above permission to some random plugin just because they tell a compelling
story.

~~~
TjWallas
Same here with a tiny twist: opensource _and_ has a SHA512 mentioned somewhere
in the repo that could be verified against whatever comes from the chrome
appstore.
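
A minimal sketch of that check, assuming you have already obtained the extension package as a local file and the repo publishes a hex-encoded SHA-512. The function names are hypothetical; only the `hashlib` and `hmac` calls are standard library.

```python
import hashlib
import hmac


def sha512_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large packages need not fit in memory."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the local file's digest with the one published in the repo."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha512_of_file(path), expected)
```

This only says something about the exact file that was hashed; a later update from the store would have to be checked again.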

~~~
throwaway2048
Would be meaningless because the chrome store auto-updates addons.

~~~
TjWallas
Questionable action, but one could freeze extensions via extended file system
attributes & permissions or work around the auto-updates through the hosts
file or other means... [https://serverfault.com/questions/354606/where-do-i-find-the...](https://serverfault.com/questions/354606/where-do-i-find-the-update-url-for-google-chrome-extensions/354635#354635)

... Or even a combination of the above.

------
soared
IMO this is taking the wrong approach. Why spread your data across more sites,
when the problem is the individual sites you visit - not random ones you
don't? This (dead) project of mine sends similar data to the site you are
currently on - but instead of noise it is purposely malicious and will ruin
the website owner's tracking abilities if used at scale.

[https://hello-kill.github.io](https://hello-kill.github.io)

------
glenneroo
I've been thinking about building something similar since the first Snowden
leaks. I figured encrypted traffic to various locations would be useful
considering that security agencies store everything until they can decrypt it
at a later date. Unfortunately I'm not well-versed enough in implementing
proper encryption and I'd probably just end up shooting myself in the foot.

Has anyone else ever thought about doing this?

~~~
arekkas
[https://news.ycombinator.com/item?id=14003270](https://news.ycombinator.com/item?id=14003270)
:)

~~~
d--b
[https://news.ycombinator.com/item?id=13643873](https://news.ycombinator.com/item?id=13643873)
:)

------
ajuc
Using this will raise some flags (your online behaviour will be different from
most).

------
scotchio
Somewhat related:

A long time ago I tried to delete my Facebook and realized it only deactivates
until next login. You have to specifically request that they permanently
delete it or something. And it's not even clear whether FB still stores
your info after that whole process (a little sketchy, to be fair...).

So I came up with this idea that I'd make a service called socialfacewash.com
that just completely trashes your digital profile (liking random things,
changing your info, and just basically obfuscating what FB thinks they know
about you).

I never built it though, but I kind of wish I did. Still own domain if someone
wants it.

Trend seems to be that we're not going to have protections over our own
digital privacy/data for a very long time. Maybe a service that could at least
mask, lie, trash, obscure our footprint for everyone else would be nice to
have.

~~~
bartkappenburg
Someone already created that:
[http://suicidemachine.org/](http://suicidemachine.org/)

They even got a cease and desist letter from FB.

------
yarapavan
Here is the author's guest post providing background for this plugin -
[https://mathbabe.org/2017/03/31/guest-post-make-your-browsin...](https://mathbabe.org/2017/03/31/guest-post-make-your-browsing-noiszy/)

------
akerro
[https://github.com/dhowe/AdNauseam](https://github.com/dhowe/AdNauseam) does
the same but for ad-tracking. It hides ads in the web browser but clicks them in
the background (it just sends a request to the tracking server saying the user clicked the ad).

~~~
neuralFatigue
I think Google removed it from the Chrome Web Store recently. Bummer.

~~~
akerro
so at least we know it works!

------
JustSomeNobody
So, it's going to be noise among a patterned world. How will they not be able
to filter this out?

People are creatures of habit. Example: I read HN with my coffee every
morning. Interjecting meaningless data doesn't prevent them from finding my
patterns.

~~~
disiplus
so we need a plugin that creates patterns. could we crowdsource patterns
also? let's say i share my pattern with everybody and you share yours.

~~~
JustSomeNobody
I bet the ML guys and gals would be able to suss out my pattern from a fake
pattern.

~~~
snerbles
Start an arms race, use ML to generate fake traffic.

------
Johnny555
I'm not sure I understand the point of this -- marketers will just discard
this random data that doesn't match a human access pattern.

If this anonymized my tracking data for sites I visited, that would be useful,
but sending a bunch of random hits doesn't seem like it will keep anyone from
tracking _my_ activity which is what I want to shield.

------
Animats
It generates meaningless tracking data by doing meaningless browsing
automatically. So it sucks bandwidth. That's not a good solution.

Returning fake responses to tracking cookies is more efficient. For phones,
returning bogus info to app requests for contact lists and location is
especially effective.

~~~
noiszytech
It does suck some bandwidth - and your point is well taken, that would be a
great technique too. It loads about one new page per minute, so it shouldn't
be a crazy amount (unless pages contain video, for example). The thing is,
anything less than visiting from your own browser on your own machine and
doing user-like things (like waiting and then clicking again) is _easily_
filter-out-able for most tracking tools. The idea here is to purposefully
associate the "noise" with your online footprint.
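
To illustrate the pacing described above, here is a minimal sketch that draws randomized waits averaging about one page per minute. The exponential distribution and the function name are assumptions for illustration; nothing in this thread says how Noiszy actually times its visits.

```python
import random
from typing import List, Optional


def visit_delays(n: int, mean_seconds: float = 60.0,
                 seed: Optional[int] = None) -> List[float]:
    """Return n waiting times (in seconds) between simulated page loads.

    Exponentially distributed gaps average `mean_seconds` but vary a lot,
    which looks less mechanical than a fixed one-minute timer.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_seconds) for _ in range(n)]
```

A caller would sleep for each delay in turn before loading the next page; whether that is enough to look user-like to a tracker is a separate question.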

------
justshashank
I was working on something similar last night. Hack it and make some
noise. No strings attached, kill it once you think it's obsolete.

[https://github.com/shashanksurya/HaughtyDog](https://github.com/shashanksurya/HaughtyDog)

------
Sir_Cmpwn
This would probably be better as a daemon, not a browser plugin.

------
arekkas
This is exactly what I had in mind when Snowden happened. I never wanted to
take this on, because of multiple reasons. Very awesome to see this, installed
it immediately.

------
Insanity
Maybe I am being too naive, but could someone explain to me why just using a
VPN would not be good enough to hide your traffic from these companies /
governments?

I understand that perhaps they _could_ ask the VPN provider to give logs
(depending on the provider, if they keep logs or not). But that would not work
for the algorithms used to target you as an individual.

Could someone please explain that to me? :-)

~~~
jehna1
VPN commonly only hides your IP for the time you're using it. There are many
other factors that can be used to track you.

First is cookies. Say you log in to Facebook with your browser from your own
IP address and an hour later you fire up VPN to browse the web. One of the
websites you load has Facebook's tracking pixel and - unless you cleared your
cookies - boom: Facebook knows you have visited that site.

Even if you clear your cookies, you may have some long-living stuff (like
Flash cookies etc) that can be used to track your browser.

And even if you clear everything or use incognito, sites can use some clever
heuristics (CPU power, enabled plugins, timezone, browser version, webrtc,
etc.) to track your browsing and to match that it's you.
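
A rough sketch of why those heuristics work even without cookies or a stable IP, assuming a tracker simply hashes whatever attributes the browser exposes. The attribute names are hypothetical, and real fingerprinting scripts collect far more than this.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Derive a stable identifier from browser attributes alone.

    The attributes are serialized in a canonical order so the same
    browser always produces the same identifier.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# The same browser seen once from a home IP and once through a VPN:
# neither the IP nor any cookie is part of the input, so the identifier matches.
browser = {
    "timezone": "Europe/Helsinki",
    "browser_version": "57.0",
    "plugins": ["Flash", "PDF Viewer"],
    "cpu_cores": 4,
}
same_browser_via_vpn = dict(browser)
print(fingerprint(browser) == fingerprint(same_browser_via_vpn))
```

The more unusual the combination of attributes, the fewer other browsers share the identifier.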

~~~
Insanity
Well, for the first part, just restrict internet access without turning the
VPN on. (For me, I can not access any website when the VPN is not active).

The last part of what you wrote is incredibly interesting. I had no idea they
could find out the CPU power or the enabled plugins.

Though, to be fair, I think I'd fall in a pretty large pond to match. And
either way, that information does not seem very useful (to me) for targeted
advertisements.

Thank you for your answer :-)

~~~
programd
The EFF lets you see all the fun ways you can be tracked by your browser
fingerprint:

[https://panopticlick.eff.org/](https://panopticlick.eff.org/)

~~~
Neliquat
And yet that is a tiny fraction of fingerprinting tech. Many devs and
companies roll their own as well. It is hard to anticipate or defend against
all, or even most.

------
iask
An interesting way to look at this, but wouldn't it be possible to pluck this
pattern out from relevant data?

~~~
danbruc
It most likely is because it is incredibly hard to create realistic fake
traffic. The moment you start using such a tool is probably very easy to
detect due to a sudden jump in traffic. Then you can use prior data to figure
out what you were actually doing - sites you visit, times you are active,
how you switch between sites, how long you stay on sites, and so on.
Randomly following links will show very different distributions.

You have to fake your browsing behavior with pretty high accuracy but if you
do that, you defeat the entire purpose, you just created a copy of your
browsing behavior. Maybe if there is no prior knowledge of your browsing
behavior and the fake traffic is human-like enough, but even in that case it
seems not too unlikely that one could separate the two traffic sources. If you
want to hide your browsing behavior, the fake traffic must be different from
yours. But if there is a difference, the difference can potentially be
exploited to separate the sources.

~~~
jerf
If I were writing this, I would try to get it into as many hands as possible,
and do some sort of backfeeding of data to generate "real" profiles of user
activities, then send back down suitably randomized profiles of other real
users that you use for some period of time.

It's important to conceive of this as an arms race, and I think one of the
underappreciated things about arms races is that if you successfully
anticipate the next three things your opponent will do and counter them, you
can end up discouraging them from even fighting in the arms race. I don't know
how random the page visits are, but if they are random in, well, almost any
sense, eventually they will be something the trackers can filter out. "Well,
this person shows a really strong signal for football, and weak signals for
anime, robotics, accounting, and a hundred other things. Show them the
football ads." Even today that wouldn't necessarily pose much of a challenge.

By contrast, if you get profiles from volunteers that are real, and then start
mixing them up so that everybody downloading this extension shows ten or fifty
equally strong signals of interest, of which only one is real, that'll
scramble the data being collected something fierce, and require the data
collectors to jump straight to very sophisticated teasing out of what's really
true, which initially won't even be worth it until a lot more people are using
this stuff. There's a lot of elaborations you can think of from there
(temporal correlations, i.e., college basketball interest should be spiking
now, vs. football), etc.

I have no idea what this extension is doing, because I can't seem to find any
data about what it is doing, so maybe it's doing some of this stuff, but I
expect they'd be talking about at least the data they'll need to collect to
make this really work if it was going to work this way, so I assume without
proof that it is not this.

In general, it's worth pointing out that a lot of systems can't be fooled by
uniformly random data, because all real-world systems already have to be able
to filter that out because all real-world systems experience noise, of which
at least a significant component is probably more-or-less uniformly random. If
you really want to scramble a system you need to be more clever.
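
A toy sketch of the difference, assuming a tracker keeps a per-topic visit count and only acts on topics above some share of total visits. The function names, the topic labels, and the thresholds are all made up for illustration.

```python
import random
from collections import Counter
from typing import Dict, List


def observed_profile(real_visits: Dict[str, int], noise_topics: List[str],
                     noise_visits: int, seed: int = 0) -> Counter:
    """Real visit counts plus uniformly random visits across noise topics."""
    rng = random.Random(seed)
    profile = Counter(real_visits)
    for _ in range(noise_visits):
        profile[rng.choice(noise_topics)] += 1
    return profile


def strong_signals(profile: Counter, min_share: float = 0.2) -> List[str]:
    """Topics whose share of all visits reaches the tracker's threshold."""
    total = sum(profile.values())
    return sorted(t for t, n in profile.items() if n / total >= min_share)


# Uniform noise: 100 random visits spread over 100 topics barely register.
topics = [f"topic{i}" for i in range(100)]
noisy = observed_profile({"football": 60}, topics, noise_visits=100)
print(strong_signals(noisy))

# Borrowed profiles: ten equally strong interests, only one of them real.
mixed = Counter({f"interest{i}": 60 for i in range(10)})
print(len(strong_signals(mixed, min_share=0.05)))
```

In the first case the threshold alone recovers the real interest; in the second, every decoy clears the same bar, so the tracker needs something smarter than a threshold to pick out the real one.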

~~~
danbruc
_By contrast, if you get profiles from volunteers that are real, and then
start mixing them up so that everybody downloading this extension shows ten or
fifty equally strong signals of interest, of which only one is real, that'll
scramble the data being collected something fierce [...]_

This would also increase your traffic ten or fifty fold in a first
approximation. You would make a single household look like a family with
several people sharing the Internet connection. Netflix for example can
separate several people using the same account, I read about that during the
Netflix Prize [1].

But overall I agree with your point, you probably can throw a wrench into the
machinery, maybe even to the point that it is not worth pursuing tracking any
longer. But it certainly is not easy and randomly clicking on links will
definitely not do the trick.

[1]
[https://en.wikipedia.org/wiki/Netflix_Prize](https://en.wikipedia.org/wiki/Netflix_Prize)

~~~
jerf
"This would also increase your traffic ten or fifty fold in a first
approximation."

I suspect if you worked at it you could mathematically prove that will be a
necessary condition for any _effective_ data collection scrambling technique,
under an assumption that we can't use proxying to other nodes. And I do mean
"mathematically" fully literally.

I'm eliminating the possibility of proxy, where you try to set up a situation
where you create a P2P network and trade page views around, because I think
that only works with a _really_ abstract view of how the tracking works. In
practice, as soon as you want to use a site in an authenticated fashion you're
getting tracked via that authenticated account, so I _think_ I can argue the
only real possibility for scrambling the data is for it to source from the
same network location as the real data you are generating.

On that note, it occurs to me this plugin probably ought to be automatically
creating authenticated accounts on services you don't care about; the
authenticated status of an account creates a shining signal that may be too
bright to mask. Considering that a lot of the data you're trying to mask would
be coming from Facebook, that would be a problem.

And now that I think of _that_, use of this plugin really ought to result in
your Facebook profile getting pretty badly scrambled, too.

Man, this is a big challenge. There's a part of me that is actually quite sad
I can't drop everything and start going to work on this right now for pay.
This sounds wicked fun, but way bigger than I could ever dream to take on.

------
nkkollaw
Excuse the ignorance, but are we safe in Europe..? Are we only talking about
American ISPs?

What if an American ISP had a branch in Europe, could they sell our data, or
the history must have been generated on American soil, or something like that?

------
edem
How is this supposed to work? If I click on `Start` it loads a page then
nothing happens. I have to click `Start` repeatedly to make it visit new pages
despite that it says it is `Running`.

------
ComodoHacker
I believe random noise can be filtered out easily with AI. To be effective,
this is gonna be an AI vs AI arms race, much like AV vs malware.

~~~
kleer001
No need to bring a grenade to a fist fight.

If the overall history data is aggregated you won't really need to filter out
the noise. Random browsing will just disappear under the threshold of noise
and the 'real' browsing will stand out.

------
MR4D
How about a plugin that routes any request to an adserver thru Tor?

That would really screw up things for any location tracking.

------
xyz-x
Is this plugin available for Firefox?

------
IgorPartola
What exactly is the point? If the FBI suspects you in a case of the
international heist of carrots and your search history includes "concealed
carrot transportation" and "circumventing carrot museum security", it doesn't
matter what else you googled. It only matters that this is included in it.

Edit: it was a 24-carrot job.

~~~
chasingtheflow
I don't think it's about hiding from FBI inquiries but obscuring the value of
data that ISPs, Google, Facebook, et al are collecting and selling without
your consent.

------
accountface
Doesn't seem like the best idea when it comes to environmental sustainability.

~~~
monochromatic
What does one have to do with the other?

~~~
accountface
Bandwidth has a physical cost

~~~
monochromatic
I assume it's de minimis, but do you have a reason to think otherwise?

------
grandalf
I had an idea a few years ago to make a chrome plugin that would send
encrypted emails (containing a randomly generated message) to all sorts of
Muslim country email addresses, helping to reduce the fruitfulness of
encryption circumvention and surveillance.

------
lowonkarma
Finally a DDOS solution for the Chrome browser

------
En_gr_Student
radar detector detectors ... they are going to eventually happen here. sadly.

------
apahwa
this needs to be open source. this plugin could be very dangerous to install

------
akirayamaoka
Traffic filters easily remove any plugin attempts to make a noise.

------
cfarre
Seems a nice idea

------
doodpants
Turns out it's only for Chrome, and the front page doesn't even mention this.
Could we at least change the HN title from "A browser plugin..." to "A Chrome
plugin..."?

~~~
jchw
It's hosted on Chrome Web Store, but otherwise it could very well work just
fine in Edge, Opera and even possibly Firefox. They're all converging toward
supporting WebExtensions, a mostly-Chrome compatible API.

~~~
throwaway2048
Could? or does?

~~~
jchw
Didn't try. It almost definitely will work on Opera since Opera is based on
Chrome and there's ways to install straight from the Chrome App Store. If the
author released a direct download, it'd be easy to test in Firefox and Edge.

------
siegecraft
It's a nice marketing stunt -- not meant to be pejorative, I hope it inspires
more innovation in this space. But I find it ironic that a plugin that
purports to make it harder to track you has google analytics.

~~~
TeMPOraL
Dogfooding, I guess :D.

------
romanovcode
inb4 it gets removed from Chrome Web Store!

------
mtkd
"creates meaningless web tracking data"

or

creates a massive botnet which will eventually be used for some nefarious PPC
scam or worse

you decide ...

