
You’re not anonymous. I know your name, email, and company. - theinfonaut
http://42floors.com/blog/youre-not-anonymous-i-know-your-name-email-and-company/
======
pygy_
Several people mention Ghostery[0] against trackers. It offers only partial
protection. It is possible to fingerprint a browser without any custom
tracking data.

<https://panopticlick.eff.org/> <-- check how unique your browser is.
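To make the fingerprinting point concrete, here is a minimal sketch of how a handful of passively readable attributes can be combined into a stable identifier with no cookie involved. The attribute names and values are hypothetical; a real script would collect far more signals than this.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of observable browser attributes into one short ID."""
    # Sort keys so the same attributes always serialize identically.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values, for illustration only.
visit_a = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:20.0) Firefox/20.0",
    "screen": "1920x1080x24",
    "timezone_offset": -60,
    "plugins": ["Flash 11.2", "Java 1.7"],
    "fonts": ["DejaVu Sans", "Liberation Serif"],
}
visit_b = dict(visit_a)                        # same browser, cookies cleared
visit_c = dict(visit_a, screen="1366x768x24")  # a different machine

print(fingerprint(visit_a) == fingerprint(visit_b))  # same ID without any cookie
print(fingerprint(visit_a) == fingerprint(visit_c))  # one changed attribute, new ID
```

Any change to an attribute (a browser update, for instance) yields a different hash, which is why lying about these details would blunt the technique.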

Instead of a script to embed, these firms could provide an API to identify
users from the server side. The script that captures the profile would be
served by the sites themselves rather than from third party services.

Toast.

A possible solution would be to anonymize the browser fingerprint, at least in
private mode, i.e. lie about the details of the system.

Google, Mozilla, Opera, can you hear me?

--

[0] <http://www.ghostery.com/>

~~~
dsr_
Not only does panopticlick say my browser is unique, but it said that last
time I visited it, several months ago.

Maybe it's forgotten. Maybe it lies. Maybe every time I rev Firefox Nightly I
change identity.

What is true is that every time I leave an email address, it's tagged with the
name of the site where I left it.

~~~
pygy_
The gmail + trick can be defeated, though. If you own a domain, you can use
arbitrary addresses and it becomes more reliable.
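A rough sketch of the difference, assuming a hypothetical catch-all domain `example.org`: a plus tag can be stripped mechanically by whoever harvests the address, while a per-site local part on your own domain carries no removable marker.

```python
def gmail_plus_address(user: str, site: str) -> str:
    """Tag an address with the site it was given to, using the + trick."""
    return f"{user}+{site.lower()}@gmail.com"


def tagged_address(site: str, domain: str = "example.org") -> str:
    """Per-site address on a catch-all domain (domain is hypothetical)."""
    return f"{site.lower()}@{domain}"


def strip_plus_tag(address: str) -> str:
    """What a harvester can do: drop everything between '+' and '@'."""
    local, _, domain = address.partition("@")
    return f"{local.split('+', 1)[0]}@{domain}"


plus = gmail_plus_address("jane", "Shop")
own = tagged_address("Shop")
print(plus, "->", strip_plus_tag(plus))  # the tag is lost
print(own, "->", strip_plus_tag(own))    # nothing to strip
```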

I use Google Apps (possibly not a good idea but soo convenient :/) with a
catchall. I didn't catch any offender so far...

However (off topic), spammers are spoofing addresses as if they were coming
from my domain. I receive two to three dozen automatic replies from mail
servers (this address does not exist...).

I've properly set up DKIM and SPF records, making it obvious that these mails
are spoofed, but I'm afraid my domain will end up on grey/black lists...
Anyone out here familiar with this kind of issue?

~~~
bathat
I think a lot of sites have caught on to the + trick, though, because more and
more disallow the + character in their email validators.

~~~
pyre
More like laziness. They probably only allow something like:

    
    
      ^[a-z0-9_.]+@(?:[a-z0-9-]+\.)+[a-z0-9]+$
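To see what a validator of this shape does to plus-addressed mail, here is a small check. The pattern is an assumed example of a restrictive validator (local part limited to lowercase letters, digits, `_` and `.`), not one taken from any real site.

```python
import re

# Assumed "lazy" pattern: no '+' (or uppercase) allowed before the '@'.
LAZY_EMAIL = re.compile(r"^[a-z0-9_.]+@(?:[a-z0-9-]+\.)+[a-z0-9]+$")


def accepts(address: str) -> bool:
    """Return True if the restrictive pattern accepts the address."""
    return LAZY_EMAIL.match(address) is not None


for candidate in ("jane@example.com", "jane+shop@example.com", "Jane@example.com"):
    print(candidate, accepts(candidate))
```

Both of the rejected addresses are legal under the mail RFCs, which is the point jfoster makes below about checking what constitutes valid input.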

~~~
jfoster
In my experience it's been ignorance rather than laziness. I suppose you could
argue that the root cause of the ignorance is laziness. If someone is writing
a validator, they probably ought to check what constitutes valid input.

------
eli
I'm skeptical of this unnamed company's actual abilities. In the _initial_
email how are they able to identify anything about your visitors _before_
you've installed the tracking code? Since they apparently can see search terms
used to reach your site the only thing I can think of is their code is running
on some site that links to you (perhaps an off-brand search engine?) and
they're tracking outbound clicks. Or it's fake.

It's pretty easy to guess company name from IP address, especially if you
don't care about accuracy. You can kinda sorta do this in Google Analytics
under Audience > Technology > Network. That seems to be roughly what they're
doing in the screenshots posted. IMHO, this is not the most serious privacy
issue on the web.
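Guessing a company from an IP address amounts to a range lookup. A minimal sketch, assuming a hypothetical table of network owners (the ranges below are reserved documentation networks, not real allocations):

```python
import ipaddress
from typing import Optional

# Hypothetical registry data, for illustration only.
NETWORK_OWNERS = {
    ipaddress.ip_network("192.0.2.0/24"): "Example Corp",
    ipaddress.ip_network("198.51.100.0/24"): "Example ISP (residential)",
}


def guess_company(ip: str) -> Optional[str]:
    """Return the owner of the first network containing the address, if any."""
    addr = ipaddress.ip_address(ip)
    for network, owner in NETWORK_OWNERS.items():
        if addr in network:
            return owner
    return None


print(guess_company("192.0.2.44"))   # office network: a plausible company name
print(guess_company("203.0.113.9"))  # unknown range: no guess
```

The second table entry shows why accuracy suffers: a visitor on a residential ISP resolves to the ISP, not to an employer.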

I would be very curious to hear exactly what percentage of visitors it is able
to supply Name and Email for (and how many of those fields look bogus). This
sort of individual-level tracking across sites is obviously possible, but I
don't think it's common. Google/DoubleClick do not, as far as I know, do any
sort of tracking at the level of an individual's name or email address (And
why would they? It's asking for regulatory problems and it doesn't really help
them much -- they target ads to groups of similar people based on
demographics, not to particular named individuals.)

~~~
paulgb
For users without showdead, the user darrennix (who appears to be the same
Darren Nix who wrote the article) posted this comment. Why the mods or system
would kill it I have no idea.

> It's a fair question and one that I asked myself. If the entire service is a
> fake, then it is an extremely elaborate one because the name and emails of
> the individuals it did identify (which I noted was a small percentage) were
> real.

~~~
SoftwareMaven
I don't know what kind of numbers we are talking about here, but if a user
clicked through the OP's site to a tracked site, there would be referrer
information that could be backtracked.

~~~
eli
It wouldn't be able to include what search terms were used to arrive at OP's
site like the email seems to be showing.

------
rsobers
HubSpot (and pretty much any other marketing automation tool) has this
feature, too. They look up company name and location by IP address and build an
anonymous "prospect" record representing each visitor so that salespeople and
marketers can detect whether prospects from a given company are hitting the
site for information.

The second a prospect submits a web form, all that previous web activity is
tied to their email address (and any other info you collected via the form).
You now have a real lead.
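The flow described above can be sketched as a record keyed by a cookie ID that collects page views anonymously and gains an identity only when a form is submitted. The class and field names are hypothetical and do not reflect any vendor's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Prospect:
    cookie_id: str
    company_guess: Optional[str] = None  # e.g. from an IP lookup
    email: Optional[str] = None          # unknown until a form is submitted
    pages: List[str] = field(default_factory=list)


class Tracker:
    def __init__(self) -> None:
        self.prospects: Dict[str, Prospect] = {}

    def page_view(self, cookie_id: str, url: str,
                  company_guess: Optional[str] = None) -> Prospect:
        """Record a page view against the (still anonymous) prospect."""
        prospect = self.prospects.setdefault(
            cookie_id, Prospect(cookie_id, company_guess))
        prospect.pages.append(url)
        return prospect

    def form_submit(self, cookie_id: str, email: str) -> Prospect:
        """Attach an email; all earlier activity is now tied to it."""
        prospect = self.prospects.setdefault(cookie_id, Prospect(cookie_id))
        prospect.email = email
        return prospect


tracker = Tracker()
tracker.page_view("c1", "/pricing", company_guess="Example Corp")
tracker.page_view("c1", "/docs")
lead = tracker.form_submit("c1", "jane@example.com")
print(lead)
```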

I don't see any privacy issues with this.

What I _would_ see an issue with is if the tracking company were sending the
IP address and cookie back to a central database to query "Does anyone _else_
know who this visitor is?" and then provide PII to any company who uses the
tracking service.

The moment you start giving my PII to a company that I didn't voluntarily give
it to is when I feel a line has been crossed.

~~~
paulgb
> What I would see an issue with is if the tracking company were sending the
> IP address and cookie back to a central database to query "Does anyone
> _else_ know who this visitor is?" and then provide PII to any company who uses
> the tracking service.

That appears to be exactly what's happening. The email mentions "access to our
entire network of identified data ([...] we can identify any visitor [...] if
that person has filled out a web form from any other website we are
tracking)".

~~~
rsobers
Well, then this _is_ F'd up. I guess I didn't read the post carefully enough.
:-/

------
nostromo
Just looked through Zendesk's network calls -- looks like it's probably
Demandbase.
<http://www.demandbase.com/landing-page/demandbase-real-time-id-service/>

Surprisingly, AdBlockPlus doesn't seem to block it.

Edit: actually it's LeadLander.com as pointed out by NiekvdMaas here
<http://news.ycombinator.com/item?id=4891764>

~~~
r4vik
AdBlock is for blocking (annoying) ads, not trackers.

~~~
fl3tch
Depends on the filter list. EasyPrivacy is designed to block trackers, and if
you look at the list

<https://easylist-downloads.adblockplus.org/easyprivacy.txt>

you'll see that Demandbase is there.
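For readers who want to check a list programmatically, here is a simplified sketch that handles only plain `||domain^` rules. Real Adblock Plus lists also contain options, path rules and exceptions, which this ignores, and the sample rules are illustrative, not copied from EasyPrivacy.

```python
from urllib.parse import urlsplit


def blocked_domains(filter_lines):
    """Collect domains from plain '||domain^' rules, skipping everything else."""
    domains = set()
    for line in filter_lines:
        line = line.strip()
        if line.startswith("||") and line.endswith("^"):
            domains.add(line[2:-1].lower())
    return domains


def is_blocked(url, domains):
    """True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


# Illustrative rules; the last one carries options and is skipped by this sketch.
sample_rules = [
    "! a comment line",
    "||demandbase.com^",
    "||tracker.example^$third-party",
]
domains = blocked_domains(sample_rules)
print(is_blocked("http://api.demandbase.com/ip.json", domains))
print(is_blocked("http://news.ycombinator.com/item?id=1", domains))
```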

------
jfriedly
This sounds eerily familiar. Around a decade ago, a data analytics company
called Pharmatrak was actually found guilty of breaking federal wiretapping
statutes for doing something very similar. [1] In their case, they had built a
network tracking HTTP GET requests to pharmaceutical companies' websites with
a web bug [2] and an attached cookie. But because some of the pharmaceutical
companies were using GET as the method on HTML forms (remember, this was ten
years ago), the users actually ended up making GET requests with personally
identifying information in the URL-encoded parameters. Since these GET
requests were logged by Pharmatrak, and neither party (neither the users nor the
pharmaceutical companies) had consented to giving away personal information
to them, Pharmatrak was found guilty of wiretapping.
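The mechanism is easy to reproduce: a form submitted with GET puts its fields in the URL, and anything that logs that URL (for example via the Referer sent with a web bug request) can read them back. A minimal sketch with a made-up site and made-up field names:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical form fields; with method="GET" they end up in the URL itself.
form_fields = {"name": "Jane Doe", "email": "jane@example.com", "topic": "refill"}
page_url = "http://pharma.example/signup?" + urlencode(form_fields)


def pii_from_logged_url(url, keys=("name", "email")):
    """Recover selected form fields from a URL found in a tracker's log."""
    query = parse_qs(urlsplit(url).query)
    return {key: query[key][0] for key in keys if key in query}


print(page_url)
print(pii_from_logged_url(page_url))
# A POST form would leave the URL bare, so nothing is recoverable from it.
print(pii_from_logged_url("http://pharma.example/signup"))
```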

Pharmatrak eventually won on appeal though, arguing that they had no intention
of collecting personal information, which exonerated them because only
intentional eavesdropping is a crime.

The company in the OP's article could make no such arguments though. I suspect
that their main difference is that they make no assurances of confidentiality
to the websites using their software the way Pharmatrak did. Which 1) is just
really creepy, and 2) sets them up for trouble with users in California,
because California's wiretapping statutes say that it's a crime unless _both_
parties agree to it. [3]

[1] <http://cyberlaw.stanford.edu/packets001737.shtml>

[2] <http://en.wikipedia.org/wiki/Web_bug>

[3] I'm not sure if this applies to police, but it definitely does to private
parties:
<http://www.citmedialaw.org/legal-guide/california-recording-law>

Edit: Added third reference.

------
andrewljohnson
If I found out a site I used employed this tool, I'd both trash them publicly
and never use their service again.

~~~
antoncohen
sencha.com, activestate.com, sandisk.com, clustrix.com, and about 2000 others
use LeadLander. I checked the privacy policies of those four sites and none of
them say they are giving away your personal information. On the contrary, they
all explicitly say they aren't.

"We do not share any information about you or your company to unaffiliated
third parties, except as necessary to administer the communications we offer
and as permitted by law. We may use a third party service provider to for
communications; that company is prohibited from using our users’ personally
identifiable information for any other purpose. If you follow us on Twitter,
Facebook or on other social media services, we may use information provided by
these services to customize our communications to you. We will not share the
personally identifiable information you provide with other third parties
unless we give you prior notice and choice." -
<http://www.sencha.com/legal/privacy/>

Nearly every company using LeadLander is breaking the law because their posted
privacy policies do not state that they are giving a third party your personal
information, and that third party is giving it to others.

Edit: It looks like <http://formalyzer.com/formalyze_call.js> is the specific
js file that uploads personal information. Of the sites I listed only
clustrix.com is loading that (on the contact form). The other sites seem to be
using LeadLander without the form tracking.

~~~
jholman
As I understand it, in the US the FTC enforces privacy policy violations. If
you don't promise your customer anything, then you're more or less off the
hook (as far as I know). But if you do have a privacy policy, and you violate
it, then you're misleading consumers.

IANAL, YMMV, etc.

~~~
antoncohen
California law requires a privacy policy to be posted:

<http://en.wikipedia.org/wiki/Online_Privacy_Protection_Act>

------
seiji
Is the weasel company's javascript (and/or flash bug) logging all form input
back to its own servers to capture name/email when you sign up somewhere else?
Are they capturing credit card numbers too?

We can tell the world all day long this is Bad and Unsafe, but within six
months it'll be more popular than ad retargeting and the meebo crapbar
(because, hey, analytics!).

~~~
eli
I can't imagine many legitimate sites participating in this scheme. It would
certainly violate many publicly posted privacy policies.

~~~
seiji
Sadly, opportunistic jerkfaces aren't limited by our privacy-hat-wearing
engineer imaginations. They can devise much, much worse schemes we would
dismiss in five seconds out of "ethical" concerns. (privacy=dead, remember? do
anything to track people and manipulate them into giving you money. if you
aren't selling anything, sell the tracking as leadgen.)

~~~
ams6110
I pretty much assume that anything I post to any website could someday come
back to haunt me. Expect no privacy on the internet, and you won't be
disappointed.

------
ChuckMcM
Can someone provide a regex that would identify this tracker? I'd like to run
it through our index and see if I can come up with a list of sites that employ
it.

~~~
k3n
Probably not since none of us know who this firm is -- and thus the
hostname(s) and/or IP(s) used; we'd probably need to contact the author for
that info. Once we know that, the regex would be dead-simple...

~~~
ChuckMcM
Well Darren saw the tracker, and he reads HN, so perhaps.

~~~
systemtrigger
try:
http://www\.leadlander\.com/trackingcode\.asp.*&id=.*
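Running a pattern like this over an index of page sources could look roughly as follows. The sample markup and the query parameters in it are invented for illustration; only the host and path come from the pattern suggested above.

```python
import re

TRACKER = re.compile(r"http://www\.leadlander\.com/trackingcode\.asp.*&id=.*")


def uses_tracker(html: str) -> bool:
    """True if the page source references the tracking script."""
    return TRACKER.search(html) is not None


def sites_using_tracker(pages: dict) -> list:
    """Given {site: html}, return the sorted sites whose source matches."""
    return sorted(site for site, html in pages.items() if uses_tracker(html))


# Hypothetical index entries.
index = {
    "a.example": '<script src="http://www.leadlander.com/trackingcode.asp?x=1&id=123"></script>',
    "b.example": '<script src="/static/app.js"></script>',
}
print(sites_using_tracker(index))
```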

~~~
garazy
We've got trends for trackers that we can detect and are currently not loaded
by third party JS -

To name a few -

<http://trends.builtwith.com/analytics/LeadLander>
<http://trends.builtwith.com/analytics/Hubspot>
<http://trends.builtwith.com/analytics/Marketo>

There are a lot of them out there now, and nearly all of the big ones
continue to grow in popularity.

------
keesj
The initial data is fake.

Proof: <http://o7.no/Z0huP7>

I get emailed by them for every startup I'm involved with and that first email
is mostly the same every time as you can see in that screenshot. (Compare it
with the one posted in the article and you'll see).

They seem to be targeting startups and make it look like some big VC firms are
visiting your site to get you interested. I'm not sure how they come up with
the 'search terms', but I guess they could just look at your META-tags or make
them up.

In their email they do say it's a "mock example", but still I find it very
deceptive.

------
angersock
I suppose it's too much to ask that we as developers and engineers show some
fucking backbone and refuse to work on or with these tools and projects? And
publicly shame those who do?

~~~
peterwwillis
But engineers that build components for bombs and missiles should keep
chuggin' along.

~~~
angersock
Those are all overt means of oppression, whose use and abuse is so obvious in
most cases that restraint in their usage is exercised by those who wield them.

Something like this is a quiet, terrible thing slinking about unnoticed until
it is rather too late. I believe these things have more potential to cause
harm than any missile built and kept in stasis.

~~~
peterwwillis
My point was that shaming engineers for choosing what they work on is stupid
because often they work on _components_ and don't ever build a complete
missile, or "quiet, terrible, slinking thing" by themselves.

The developers at these kinds of places don't need to know what they're
building. They have many tasks assigned to them and one of them is to write an
API that collects a single piece of data. Many kinds of data are collected
from many places and put into a database. Reports are made and cross-
referenced by an analyst. Final reports are generated and fed to a guy who
deals with direct marketing or advertising or sales. Any of these jobs could
also be done by contractors or third parties.

You can't just tell people how to make a living without understanding what the
hell you're talking about. That's my $0.02 anyway.

(P.S. people that work on missiles are often academic researchers and work for
both the private and public sector on the same thing for many different
clients, and aren't told what it's used for. the more you know...)

~~~
angersock
Anybody competent enough to build a system like this, even somewhat smaller
units thereof, is more than clever enough to see the forest despite the trees
and recognize that their work could be used for bad purposes.

Again, let's not argue over defense contractors or some damn fool thing--when
you work for Google, when you work for AT&T, when you work for Palantir or
HBGary or whoever, you don't get to say "lol not my department I made swing
apps and file dialogs" when you find out they've done something bad.

We need to speak out when people work on harmful technologies.

~~~
peterwwillis
What's one of the "bad purposes" you're talking about, anyway? What's the
threshold of "badness"? How do you define what is worth quitting your job, and
what might merely annoy a user? Can you even quantify it? Is it illegal, and
where? What is 'it', anyway?

I'm talking about things like identifying if somebody is gay or republican or
kinky and using the information for profit. Aside from selling it to
background-check websites and the like, and the fact that it's information
people willingly give up about themselves to entities unknown, I have trouble
understanding how you can be so offended you think people should lose their
jobs rather than develop potential parts of a potential system that could
maybe harm someone at some point.

Your assumption about the "cleverness" of developers is misguided. If a guy is
told to write a small piece of code which simply takes HTTP requests from
JavaScript and plugs them into a database, he has no idea what the fuck that
could be used for. The guy maintaining the database also may not know what the
fuck he's looking at, it may just be numbers. Are you really so willfully
ignorant as to believe every single outcome of every single human action is
cut & dry?

~~~
angersock
Have you ever read "Scroogled" by Cory Doctorow or Stallman's "The Right To
Read"? Both are a bit absurd at first glance but viewed today seem oddly
prescient.

I'll even accept your assertion (for the sake of argument only, mind you!)
that engineers at a company might only work on some small fragment of JS
munging numbers in a database.

At some point, though, an engineer needs to implement the API for a saleable
product using that information, or code up a dashboard with element names like
"#user-site-history" or "#tracked-profile-visits", or at the very least see
the marketing materials the sales folks use to show that the product is
competitive due to this information gathering.

Your assertion makes publicity even more important--eventually, _some_
engineer or admin is going to have to get their hands dirty and _that_ is when
they need to speak out.

~

To go back and answer your "so what if we have targeted advertising" directly:
there is currently no heavily established legal framework of which I am aware
that protects metadata about users gathered for the purposes of advertising. I
do not know if Google or Facebook is prevented from giving up (for whatever
reason!) the results of their ad engine's analysis of user browsing to anyone
at a whim.

We (Americans, at any rate) are very lucky that our government at least goes
through the motions of liberty enough to not overtly round up deviants and
send them off to the camps or send drones after them--this is far from the
case in various other countries.

As far as the idea that the information is given up willfully, we're talking
about techniques and technology that are really only ten or fifteen years
old...the average consumer has not had time to build up any sort of reasonable
intuition about what they are sharing or not sharing, or how that information
can be linked to other facts about their lives. To say that they've
"willingly" given up this information is, I suggest, somewhat misleading.

------
dskhatri
Dataium does this too, as covered by WSJ's recent article on the subject [1]

The article goes into depth about how much personal information is sent along
to advertisers including a popular dating site's apparently anonymized
information about drug use, and sexual orientation.

I think we need a non-profit service that defines a set of privacy licenses
(akin to CreativeCommons' licenses) which companies can opt to label their
websites/apps with. There would be no policing/auditing [2], but companies
found to violate the privacy licenses would be obliged to donate a sum to an
organization like the EFF.

That the privacy policies would be encompassed by one simple privacy licence
badge would allow users to quickly and easily identify a company's privacy
policies. I believe users would gravitate toward using services that display
this license.

Edit: it appears such a service is in the works - <http://privacycommons.org>

[1]
<http://online.wsj.com/article/SB10001424127887324784404578143144132736214.html>

[2] The auditing process would likely become complex, costly and corruptible

~~~
yk
I like the idea, but unfortunately the damage is already done by the time you
see a site without a privacy badge (since the browser has already executed any
JS, sent headers, etc.). But perhaps there could be a plugin like NoScript that
looks for a given privacy setting on the site and allows or blocks JS and
third-party content. (If there is no widespread use of the plugin, then no one
will include the privacy badge; if there is widespread use, then there is a
strong incentive to abuse the badge. So one would need some way to really
enforce the privacy badge...)

------
z0mbak
quote: At 42Floors, we’ve made the decision not to use any visitor
identification tools...

facts (detected by Ghostery at 42floors.com): ClickTale, Facebook Connect,
Google +1, Google Analytics, MixPanel, Optimizely, Twitter Button

~~~
chewxy
None of them can be used to single out and identify individual users (except
GA... which I believe can be done if you are clever with it)

~~~
cbs
>None of them can be used to single out and identify individual users

...by 42 floors. They're still telling all those networks that I visited the
website.

~~~
Hovertruck
I can't speak for all of those other sites, but at Chartbeat we're vehemently
against tracking any sort of personally identifiable information.

------
eranation
Going to site A, not providing any info, then going to site B, C and D and
seeing ads to site A haunting you is one thing, capturing your name and email
is a new level. If you don't use a tracking blocker, clearing cookies is not
always going to work, these persistent trackers are quite sophisticated, they
use local storage if possible, IP address, header information and whatever is
possible to be able to identify someone, there is a huge industry behind it.
But this one is taking it a little bit too far, scary.

On the other side, most startups including YC ones, use some sort of tracking
for analytics to improve usability and internal flow, so advocating against
all trackers and for all users installing a blocker is a double-edged sword.

~~~
jordo37
Transparency: I'm a co-founder at Perfect Audience and we believe strongly in
the benefits of retargeting for the end user, for the advertiser and for the
content publisher.

I don't see a moral issue with retargeting because at its heart it's anonymous
- all we know about a user is a string of sites and maybe search words.
However, as soon as that data is correlated against personal information, as
soon as the real world data and the digital paper trail are correlated and
identifiable it becomes sufficiently creepy to me. Who knows - maybe 5 years
from now this will seem innocent and benign compared to the mind-reading
banners on the bus stops but this seems like a line in the sand I am willing
to draw today.

~~~
eranation
Yes, I agree this is where the line passes, I don't see a big moral issue with
ad retargeting as long as there is an opt out option and a privacy policy
somewhere to read. We don't like it when we see ads we don't like (or worse
when people looking behind our shoulders can know a lot about us just based on
the ads we get on our laptop in the coffee house), but we all like it when we
use it to promote our own projects, or when an actually relevant ad shows up

------
losvedir
This is why I've deleted my facebook account and browse with Noscript
disabling javascript (except for whitelist), RequestPolicy blocking cross-site
requests (except for whitelist), and CookieMonster blocking cookies (except
for whitelist).

It wouldn't completely work here (e.g. EFF's panopticlick could still fairly
uniquely identify me, or IP address would give away info if I'm not going
through my VPN), but it improves things.

It feels kind of extreme, but it's worth it to me. My experience is not broken
that much, and I feel like various sites are aggregating less about me. These
tracking technologies are not such an issue now, but I foresee at least the
possibility of abuse in the future, so I figure I'll do what I can now if it's
not too much hassle.

Lastly, at its heart most of this is about advertising, something I know I'm
very susceptible to (try as I might to convince myself I'm not). So the better
I am at blocking out these things, I think the less money I'll spend in the
long run on frivolous nice-to-haves.

~~~
huhtenberg
Thanks for the RequestPolicy pointer.

(edit) Does anyone know of a Firefox/Chrome add-on that strips referrer info
from cross-site requests? That'd be the simplest way to deal with all
externally hosted .js and images that double as trackers.
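The rule such an add-on would apply can be sketched in a few lines: keep the Referer header for same-host requests and drop it for everything else. This is an illustration of the policy only, not how any particular extension is implemented.

```python
from urllib.parse import urlsplit


def outgoing_headers(page_url: str, request_url: str, headers: dict) -> dict:
    """Return a copy of headers, without Referer if the request leaves the page's host."""
    same_host = urlsplit(page_url).hostname == urlsplit(request_url).hostname
    if same_host:
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() != "referer"}


page = "http://shop.example/cart?item=42"
headers = {"Referer": page, "Accept": "*/*"}
print(outgoing_headers(page, "http://shop.example/style.css", headers))
print(outgoing_headers(page, "http://tracker.example/bug.gif", headers))
```

Note that this only hides the referring URL; the third party still sees the request itself, the IP address and any cookies it has set.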

~~~
copypasteweb
<https://addons.mozilla.org/en-US/firefox/addon/refcontrol/>

------
Bockit
This kind of thing is what I've always seen as the potential end result of
things like google analytics and also facebook connect. Both products that
have javascript running on a vast number of websites, with the potential to
link to personally identifiable information, in a similar manner to that
discussed in article.

I can't imagine that I'm alone in this train of thought.

~~~
k3n
You're not, and it's why I largely avoid (at all costs) turnkey solutions that
certain websites employ for parts of their site. For instance, the sites that
use something like zoho.com or disqus.com for blog comments; even though
they're overt about their usage (as opposed to hidden tracking code), I'd
rather not be heard at all than to willingly yield my personal information.

------
pippy
I had to give Dick Smith (A NZ retailer) my phone number before I bought an
external the other day.

"Do I _have_ to give you my number before I buy this?"

"yes, but it's for return purposes only"

Of course I received 'promotional' txts the next week. I was hesitant to give
it to them for just this reason, and because I acknowledged I had a phone
number I felt obligated to give it to him. Dick Smith is a member of a larger
chain; it's no stretch of the imagination to hook up CCTV cameras to an OpenCV
instance and send txts to customers when they walk in.

No matter the law, the morals people hold, or what customers want, large companies are
always motivated by profit margins. The Consumer Guarantees Act, the Privacy
Act, the Bill of Rights Act all become murky when you're dealing with new
technology, and law will find it hard to keep up.

~~~
chris_wot
I'm surprised that NZ doesn't have stringent laws prohibiting this. In
Australia, if you give info for a pacific reason, that's the _only_ thing it
can be used for. Heavy fines can result if it's used in any other way.

~~~
Evbn
Isn't everything done in Australia for a Pacific reason?

~~~
chris_wot
Darned iPad...

------
isalmon
I recognize these screenshots - it's definitely Leadlander. I'm not sure if
they do what he claims they do, but they can identify by your IP which company
you belong to (assuming you're connecting from the office). There are a lot of
companies doing that right now actually.

------
px1999
Read the article, thought that it was something interesting but probably not
that applicable to me because I clear cookies on (frequent) browser close,
don't enter my details into many sketchy sites, use multiple different
(isolated) instances of my browser for different purposes.

Today, I get an email from a site that I visited yesterday and haven't heard
from in 6+ months. It's too much of a coincidence for me to assume it's random
so I dig into their website a little and they're using one of these services.

TL;DR: even though I'm relatively paranoid with giving out details online, one
of these networks seems to have successfully identified me and provided my
email to a website that I visited, who then reached out and tried to sell me
shit.

------
jpxxx
Tacky, cynical, nasty, and inevitable.

~~~
wilfra
My thoughts exactly. It's terrible, but completely expected in an age when
sites like RottenTomatoes and TripAdvisor already know who I am, which of my
friends are on their site etc when I haven't even signed up - all from deep
Facebook Connect integration.

~~~
paulgb
Both RottenTomatoes and TripAdvisor require me to authorize them on Facebook
before they show me any social data. Are you sure you didn't authorize them in
the past and forgot?

~~~
TillE
I'm almost certain that RottenTomatoes will display your name if you're logged
in to Facebook, regardless of whether you've given them permission.

I remember being disturbed when I saw that recently, and immediately sought
out and installed a social widget blocker.

~~~
thefreeman
If that's happening the widget is most likely an iframe loaded from facebook,
and not accessible to the RottenTomatoes server

------
datamaze
Can you please let us know the name of the company?

------
angryasian
ghostery blocks trackers and analytics

~~~
cpeterso
I use the Ghostery add-on for Firefox, but note that if you enable "GhostRank"
then the add-on will send every URL you visit to Evidon. This is purportedly
for "tracking the trackers", but it does give one pause.

~~~
graue
Their FAQ says otherwise: <https://www.ghostery.com/faq#q16>

“When a user opts-in to GhostRank, Ghostery sends the following information
each time a tracker is encountered:

    
    
        the tracker identified by Ghostery
        the blocking state of the tracker
        domains identified as serving trackers
        the time it takes for the tracker to load
        the tracker’s position on the page
        the browser in which Ghostery has been installed
        Ghostery version information”
    

Nothing about the URL you visit. Do you have reason to believe they're lying?

~~~
cpeterso
Ghostery's Alert Bubble does not report any trackers on this Hacker News page,
yet _every_ time I reload this page, my HTTP monitor (Charles Proxy) logs a
ping to ghostery.com:

    
    
      http://l.ghostery.com/api/page/?d=news.ycombinator.com%2Fitem&l=304&s=0&ua=firefox&rnd=7639974
      http://l.ghostery.com/api/page/?d=news.ycombinator.com%2Fitem&l=426&s=0&ua=firefox&rnd=5747246
      http://l.ghostery.com/api/page/?d=news.ycombinator.com%2Fitem&l=346&s=0&ua=firefox&rnd=8989043
    

Why does Evidon need to know about pages that have no trackers? The FAQ says
the _domains_ serving trackers will be identified, not the complete URL path
(minus query string parameters) for pages that have no trackers.
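Decoding one of the logged pings makes the point visible. The sketch below only parses the URL as captured above; what the `l`, `s` and `rnd` parameters mean is not documented here and is left uninterpreted.

```python
from urllib.parse import parse_qs, urlsplit

ping = ("http://l.ghostery.com/api/page/?d=news.ycombinator.com%2Fitem"
        "&l=304&s=0&ua=firefox&rnd=7639974")


def decode_ping(url: str):
    """Split a logged ping into its host and decoded query parameters."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.hostname, params


host, params = decode_ping(ping)
print(host)
print(params["d"])  # the domain plus path of the page being viewed
print(params)
```

The `d` parameter decodes to the domain and path of the visited page, with the query string (`?id=...`) left out, which matches the description above.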

------
nikunjk
I had a similar experience with FlightFox. I entered my origin and destination
and got distracted and closed the tab. I get an email several hours later
asking me to start the contest with the exact two places. Creepy, much?

------
alxbrun
I agree it shouldn't happen. But honestly, is this really worse than what FB
does?

~~~
rybosome
This is way, way, way worse.

If I'm understanding it correctly, the vendor is offering the following
service: place a JS snippet on your website. When a user visits your site,
data will be pushed to the vendor's server about the user and what they do on
your site (probably keyed on IP and as many other things as they can use to
fingerprint). In return for you sharing this data with the vendor, the vendor
will give you all of the data on this same user that was contributed by their
other clients.
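Under that reading, the vendor's side reduces to a shared profile store that every client site both writes to and reads from. A minimal sketch, with hypothetical class, method and site names, and a visitor key standing in for whatever fingerprint the vendor actually uses:

```python
class VendorNetwork:
    """Hypothetical shared store: one profile per visitor across all client sites."""

    def __init__(self):
        self._profiles = {}  # visitor key -> merged fields from every site

    def report(self, site, visitor_key, **fields):
        """A client site contributes whatever it learned about the visitor."""
        profile = self._profiles.setdefault(visitor_key, {"seen_on": []})
        if site not in profile["seen_on"]:
            profile["seen_on"].append(site)
        profile.update(fields)

    def identify(self, site, visitor_key):
        """A client site asks what the whole network knows about the visitor."""
        self.report(site, visitor_key)
        return dict(self._profiles[visitor_key])


network = VendorNetwork()
# The visitor fills in a form on one client site...
network.report("medicalforum.example", "fp-1", name="Jane Doe", email="jane@example.com")
# ...and is later recognized on an unrelated one that never asked for anything.
profile = network.identify("salon.example", "fp-1")
print(profile)
```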

Here is an extreme (yet possible) scenario. You go to a medical forum that
uses this software and create an account using your personal email address and
real name, both of which you select NOT to be displayed to the public. You
then post a message asking about a specific type of back pain you're having. A
few hours/days later, you're browsing for a gift for someone, and visit the
website of a salon that also uses this software. They can identify that your
browser visited medicalforum.com, see the email address and real name you
created an account with (since they were passed your form submission directly,
without regard to what privacy settings you used for the forum), and see the
topic you posted on back pain. So just to be helpful, they email you an
advertisement: "Hi {your real name}, we see that you're having some back pain
- bring this email in to {salon} for 15% off a massage!"

EDIT: To add, how do you know that you can trust the vendor not to display
seriously private data? What if an online store uses this JS, and the vendor
has your credit card info, possibly not-so-securely stored? Your information
becoming public would be as simple as Asshole Q. Pirate making a fake site
with some link-bait, and creating an account with the vendor.

~~~
mesozoic
Or worse you go and apply for health insurance online and get denied because
of this.

------
peterwwillis
Advertising and marketing companies aren't the only ones that do this. Any
corporation which owns more than a couple of websites collects bits of
information about its users from each site and then builds profiles of them,
often then selling the information.

Say you own a sports website, a fashion website, a political website, and a
gaming website. The user only specifies a tiny bit of information on each
website. Each bit is collected into a single user profile, which they can
refer to for things like figuring out what product advertisements to show them.
They use the same techniques to identify users that don't have accounts, and
still collect their viewing/interacting habits and add them to the profile.

Sometimes they'll send you an e-mail telling you to check out their gaming
website if you're not signed up, because the comments you write in their other
websites' forums have to do with gaming. Sometimes they just sell the
information to a gaming company. In the case of Target, they might send your
teenage daughter a list of baby products for the little one you didn't know
she was expecting.

This is not some horrifying violation of privacy. There is a price for all the
free shit you get from the internet. Usually it's paid for by all the personal
information you leak onto the net. They're just mopping it up and selling it
back to you.

------
physcab
I may be one of the few and perhaps I've just been desensitized with all the
social network invasion, but I don't find this stuff that reprehensible. At
worst, it's moderately annoying because it's one more email that I have to
archive but it's definitely on the lowest totem pole of annoyances. Recruiters
have been cold calling and emailing me for years based off of my LinkedIn and
Github profiles and all I have to do is tell them "no thanks" and my life goes
on.

What's the big deal?

~~~
marshray
We used to call the Soviet Union the "Evil Empire" for collecting about one
tenth or less of this data on its people. That and rounding them up en masse
to send to the labor camps.

In the US we had this thing called "McCarthyism". For a while back in the
1950's you could very easily be fired from a professional job for having read
certain materials (mainly those from the Evil Empire of course) or having had
certain political discussions when in college.

Just wait a few years until you need to find a health insurance plan that will
agree to pay for the expensive medical treatments that will let you live a few
decades extra. We'll see if your past web surfing and consumer habits make you
worth keeping around.

------
anonymouz
This is outrageous and very much illegal in the EU.

------
Jach
In other words, the situation hasn't changed since the 90s.
<http://www.unc.edu/depts/jomc/academics/dri/idog.html> (By the way, is anyone
using <http://samy.pl/evercookie/> in practice?)

------
inthewoods
I'm rather amazed that any company would put this on their website. What you
may be doing, in fact, is likely identifying customers to your competitors.
Cross-shopping is very common in most product categories - so it is quite
possible that you're giving up your customer to a rival.

------
wlesieutre
Anyone know if this system respects Do Not Track settings?

------
joey_muller
Like you said, it shouldn't happen, but it's inevitable.

------
new_test
Please... A clever use of GA + Wolfram Alpha can reveal a lot of potentially
identifiable information already. You can't expect the Internet to become a
big part of our society and, at the same time, remain a place for complete
anonymity.

~~~
morsch
I can see how being a big part of society would _exert pressure_ to realign
with society's default approach towards anonymity (1) but I don't see how that
would imply any final status of online anonymity. Other forces might exert
pressure in the other direction. The technology itself might resist some
pressure. Access to certain technologies changes the societies themselves.

(1) And of course society in general used to be and still is pretty anonymous.
I can easily buy a newspaper in almost perfect anonymity through regular
channels; apparently I need to take special precautions to get the same status
online.

------
CookWithMe
Shouldn't we be able to make these systems useless by filling them with loads
of fake data/spam?

I.e. if I have to fill out a form somewhere, I would not only submit it once,
but several times (ideally automated), ideally with realistic data, i.e. other
businesses in my area (so geo-location won't raise a red flag then).

If I visit the next website which employs the same network, they can't really
identify me - they have a big set of businesses I could possibly be (or they
just take the last one, which would be fake).

At least currently, they do not seem to verify whether the filled out form can
be properly validated, i.e. if the user clicked on a confirmation mail or
similar.
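
A rough sketch of the idea, assuming the tracker simply records every submission it sees. The `submit_with_decoys` helper and the decoy data are hypothetical; `submit` stands in for whatever actually posts the form.

```python
import random


def submit_with_decoys(submit, real, decoys, n_decoys=5, rng=random):
    """Send the real entry once, hidden among n_decoys plausible fakes, in random order."""
    batch = rng.sample(decoys, n_decoys) + [real]
    rng.shuffle(batch)  # so "take the last one" does not reliably pick the real entry
    for entry in batch:
        submit(entry)
    return batch


# Stand-in for the tracker: it just remembers everything submitted.
seen_by_tracker = []

# Hypothetical decoys from the same area, so geo-location does not stand out.
decoys = [{"company": f"Local Business {i}", "city": "Springfield"} for i in range(20)]
real = {"company": "My Actual Co", "city": "Springfield"}

submit_with_decoys(seen_by_tracker.append, real, decoys, n_decoys=5, rng=random.Random(0))
```

After this runs, the tracker holds six equally plausible candidates for the same browser, which is the whole trick. It stops working as soon as they require a confirmed email.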

Anonymity by obscurity :)

------
bcoates
It's sufficient to disable third party cookies and not browse while cookied by
major social networks to prevent this, right?

Ignoring the part where you can be "tracked" by company, but that's just
looking up public IP records.

~~~
lincolnwebs
As far as I know, yes. This is what I do. Ironically, doing so and thereby
solving the issue he's blogging about makes it impossible to comment via
Disqus on the 42floors blog.

------
laumars
Custom hosts files can be used to block trackers across all browsers and
applications. I personally use: <http://someonewhocares.org/hosts/>
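
For anyone unfamiliar with the mechanism, a small sketch of what a blocking hosts file amounts to: names mapped to a null or loopback address never reach the real server. The `blocked_hosts` helper and the sample entries are made up for illustration and are not taken from the list linked above.

```python
def blocked_hosts(hosts_text):
    """Return hostnames that a hosts file maps to a null or loopback address."""
    sink_addresses = {"0.0.0.0", "127.0.0.1", "::1"}
    blocked = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        if address in sink_addresses:
            blocked.update(n.lower() for n in names if n != "localhost")
    return blocked


# Hypothetical entries in the usual hosts format: address, then one or more names.
sample = """
127.0.0.1 localhost
# trackers (made-up names)
0.0.0.0 tracker.example
127.0.0.1 ads.example  pixel.example  # inline comment
"""
blocked = blocked_hosts(sample)
```

Note that hosts entries match exact hostnames only, so `cdn.tracker.example` would still resolve unless it has its own line.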

------
Eeko
How the article is designed - it took me a while to understand that 42floors
was not the company performing the tracking. I initially went there to find a
name of the company (to put it on permaban in my NoScript), yet the only
organization popping up while skimming through the page was 42floors. I was a
bit spooked when I checked the noscript-list for blocked resources and saw the
url I thought was tracking me.

After that, I looked at the URL-bar and it took that long for me to click.

------
inetsee
I find it interesting that this article should show up the very same day I
found out about TAILS - The Amnesiac Incognito Live System
(<https://tails.boum.org/>), a live Linux distribution that uses Tor and
other tools to enhance your online privacy. The more I read about online
tracking efforts like this, the more I want to set up a wall around my
computer.

------
gggggggg
I would never use this on my site, but I feel that if I were to, it should have
a big tick-box agreement with a simple one-sentence explanation.

------
jconley
Fond memories of the movie Minority Report spring to mind. Startups are, in
fact, working on this exact end-game facial recognition based ad technology
right now.

<http://www.youtube.com/watch?v=6-ZLw2Q7U2M>

and the company: <http://www.immersivelabs.com/>

------
mcantelon
This will be done in meatspace via facial recognition and the state will
likely demand access to this data. Disney are pioneering this sort of tech:

[http://occupycorporatism.com/disney-biometrics-and-the-
depar...](http://occupycorporatism.com/disney-biometrics-and-the-department-
of-defense/)

------
hoodoof
I am okay with companies displaying ads to me - this is what pays for the web
to exist. If however things like this continue to exist then I will take up
all options offered to opt out of identification and ad networks. Google and
Microsoft etc should take note to shut this sort of behaviour down.

------
kibwen
So, how feasible is it these days to do all of your browsing through a VPN?
Not that a VPN's going to save you from the attacks mentioned here, but hey,
maybe it's time to start getting serious about my privacy.

~~~
corford
Totally feasible and you can set it up in less than 15 minutes (either
yourself with a cheap vps and openvpn or from one of the hundreds of existing
VPN providers). However, if you're serious about your privacy that's only one
piece of the puzzle. You also need to invest some time in managing what sites
you are happy to accept cookies from and get serious about deciding how far
you are willing to go (in terms of inconvenience) to blacklist everything else
(using incognito mode, ghostery, noscript, adblock etc.)

Edit: also remember that all a VPN gives you is an encrypted tunnel between
your PC and the VPN end point. This means your ISP can't snoop on your traffic
and ad providers can't see your real IP (and thus location) but that's all it
gives you. It isn't a solution for anonymous surfing. If you want that, use
tor.

------
freshhawk
I noticed ghostery blocked 11 tracking cookies when I went to read this.

------
ChuckMcM
See <http://news.ycombinator.com/item?id=4954972> for an update

------
kragen
The Tor Browser Bundle is probably the best current tool for anonymous
browsing.

------
lcusack
I'm not knowledgeable in this area but would using a VPN prevent this?

~~~
troels
No, how so?

------
adolfoabegg
Doing this in Spain would be illegal.

------
allsop8184
This is just terrifying.

------
skurks
wowza

------
martinced
Thanks for figuring out how they do it...

That said, for a (very) long time I've been using separate Linux user accounts to:
check my professional email + G+, surf my personal email + G+ + FB (my FB is
using a fake but plausible name) and a third one to surf the Web.

The one surfing the Web is linked to a fake online identity: entirely made up,
with fake friends / fake G+ circles, fake StackOverflow / OpenID and basically
fake everything.

I then only ever surf using a transparent proxy for anything "work related":
the IP can't be linked to my fake IP.

It's not difficult to set up: I did set up the transparent company Web proxy
(VPN would do too) myself and basically Linux user accounts take care of the
rest.

Now I'll start using different browsers too and, why not, maybe Tor in one of
the accounts.

I take it I could take all this a step further and whitelist websites that my
"personal" account is allowed to connect to (using iptables' owner match with
--uid-owner).
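
A sketch of what that per-user whitelist could look like, written as a small generator that only prints the commands rather than running them. The uid `1001` and the addresses are placeholders, and the `owner_whitelist_rules` helper is hypothetical. The owner match only applies to locally generated traffic, so the rules go in the OUTPUT chain.

```python
def owner_whitelist_rules(uid, allowed_ips):
    """Build iptables commands letting the given uid reach only allowed_ips."""
    rules = [
        f"iptables -A OUTPUT -m owner --uid-owner {uid} -d {ip} -j ACCEPT"
        for ip in allowed_ips
    ]
    # Everything else from this uid is rejected; rule order matters, so this goes last.
    rules.append(f"iptables -A OUTPUT -m owner --uid-owner {uid} -j REJECT")
    return rules


# Placeholder uid and documentation-range addresses (assumptions, not real sites).
rules = owner_whitelist_rules(1001, ["198.51.100.10", "198.51.100.11"])
for rule in rules:
    print(rule)
```

In practice the account would also need its DNS resolver (and probably loopback) on the allow list, and since sites change addresses, a whitelist by IP needs regular upkeep.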

~~~
untog
That seems extreme. Why not open your e-mail and social networks in
incognito/private windows? Personally, I use a browser add-on to remove
trackers, but I realise that isn't 100% foolproof.

~~~
cwe
Some require cookies to be enabled (Facebook I know does), so the networks
won't let you log in in a private mode.

~~~
untog
Chrome Incognito mode (at least) allows cookies, though- it just isolates them
to the incognito window and destroys them on close. I just logged into FB and
it worked fine.

~~~
WickyNilliams
Correct, cookies are stored for the duration of an incognito session.

I find it a little annoying though that cookies are not sandboxed by tab,
rather than by window/session. If I log into FB in incognito, and then do a
little more private browsing in a new tab the FB cookies are still accessible
in the other tab.

I guess I could be more vigilant with my browsing habits but I think this is a
fair feature to implement in the browser rather than forcing the user to jump
through more hoops to protect privacy. On a side note, when will chrome finally offer
API hooks to allow NoScript to be developed for it?!

