
The Internet With A Human Face - NelsonMinar
http://idlewords.com/bt14.htm
======
ChrisNorstrom
The formatting of this article itself is something worth studying. It's
brilliantly seductive and makes you want to read all of it. The pictures/slides
by each paragraph were like rewards, continually luring me to the next
paragraph. For the first time in a long time, I read every single word. Not
skimmed.

Although I disagree with the idea of regulating how long behavioral data is
saved. Not all behavioral data is sensitive. Rather we should consider fully
disclosing to users either how long their data will be saved or what data has
been collected on them or both. Any other regulations may be too burdensome to
the startup.

=== Examples ===

His suggestion that all behavioral data be deleted after a certain period of
time means every little piece of data collected must also have a timestamp,
inflating databases and costing money.

A program must be written that seeks out timestamped data ready to expire and
delete it.
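
As a rough illustration of the kind of expiry job described above, here is a minimal sketch. The `behavior_log` table, its columns, and the 90-day window are all hypothetical, chosen only for the example; nothing here comes from the talk itself.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400


def purge_expired(conn, retention_days, now=None):
    """Delete rows older than retention_days and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = int(now.timestamp()) - retention_days * SECONDS_PER_DAY
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM behavior_log WHERE recorded_at < ?", (cutoff,)
        )
    return cur.rowcount


def make_demo_db(now):
    """Build an in-memory table with a few timestamped rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE behavior_log ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, action TEXT, "
        "recorded_at INTEGER)"  # Unix epoch seconds
    )
    ages_in_days = [(1, "search", 400), (1, "click", 10), (2, "search", 91)]
    rows = [
        (user, action, int((now - timedelta(days=age)).timestamp()))
        for user, action, age in ages_in_days
    ]
    conn.executemany(
        "INSERT INTO behavior_log (user_id, action, recorded_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    now = datetime(2014, 6, 1, tzinfo=timezone.utc)
    conn = make_demo_db(now)
    print("rows deleted:", purge_expired(conn, retention_days=90, now=now))
```

The per-row cost here is one integer column plus a scheduled delete. The sketch does nothing about copies of the data in reports, caches, or backups.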

If the deleted data is connected with other pieces of data or reports
elsewhere we're going to run into complex problems.

These obligations must be handed down from company to company during
acquisitions. A company selling data about to expire will get acquired for a
lot less than a company with fresh data. This may in turn cause a series of
unforeseen consequences in the acquisition market.

=== Solution ===

Rather than controlling and manipulating what can and cannot be done, it may
be best to just create transparent policies and let the free market converse
its way towards a compromise.

~~~
coldtea
> _let the free market converse its way towards a compromise._

There's no "free market" that's an ideology (a made-up idea of how the world
is that obscures one's thinking).

With government: I have more money than you, and I have friends in Washington.
I'll use it to push things my way.

(And lest somebody suggest: "sure, the problem is government", here's the
government less version:)

Without government: I have more money and/or power than you, and I have you
beat to a pulp, and also spend it to make people go along with my
propositions. I'll use it to push things my way.

Usually it's a mix of 1 and 2. In most Latin American countries, for example,
there's not much of a "free market" with regard to their exports/materials
because stronger countries force them (with military might, diplomatic
pressure, juntas, putting friendly lackeys in power, bribing, etc.) to go
their way. A powerful country can spend tens of millions of dollars just to
put a favorable candidate in power in a smaller country (easily
recouped in a day's worth of profits from resource and trade agreements).

And of course with things such as a patent system, intellectual property, etc.,
there's no free market either. The IP owner sets the terms, and you cannot offer
the same thing for a reduced price even if you can.

~~~
munro
> There's no "free market" that's an ideology (a made-up idea of how the world
> is that obscures one's thinking).

Everything is a made up idea that obscures one's thinking. I think I have to
drive on roads, but really nothing is stopping me. I could get through traffic
faster if I just started driving on the shoulder, but the problem is if
everyone did that it would be chaos, and would be worse overall. Systems have
the potential to create a net good.

~~~
coldtea
Well, the "have to drive on the road" is a "law" or an imperative not an
ideology. People know it's a made up convention so that we don't hit each
other or pedestrians.

People talking about free market, on the other hand, think of it as a real,
concrete thing, and even further, that it has this and that properties.
Thinking thusly about a made-up thing can have dire consequences -- like when
hallucinating on drugs and jumping from a building to avoid a huge snake.

------
pja
This is a great quote, and very timely with the recent metafilter kerfuffle:

"If you don't run your own ad network, advertising is a scary business. You
bring your user data to the altar and sacrifice it to AdSense. If the AdSense
gods are pleased, they rain earnings down upon you."

"But if the AdSense gods are angry, there is wailing, and gnashing of teeth.
You rend your garments and ask forgiveness, but you can never be sure what you
did wrong. Maybe you pray to Matt Cutts, the intercessionary saint at Google,
who has been known to descend from the clouds and speak with a human voice."

------
ronaldx
I'm now more hard-line than this on data privacy:

I have come to believe that businesses should not be legally allowed to store
any consumer data unless it's obvious to the consumer that it's absolutely
required for the primary function of the service, and they should only be
allowed to store data for that one function, with an exception if the consumer
explicitly and voluntarily opts-in for each additional function.

Large internet companies have been collecting swathes of data with the claim
that they are secretly using it to improve people's lives. But it seems to me
that A/B testing has failed to improve anyone's life.

Example: I use search engines to search for something I'm looking for.

I do not benefit from being shown 'targeted' ads, nor from the search engine
identifying the most populist answers which it uses to spoon-feed me later
rather than serve what I asked for, nor from the search engine identifying
which particular arrangement of pixels will leave me personally more addicted.

Businesses are welcome to use my data in ways which are in my interest, but
_they should not get to decide_ which of these uses are in my interest.

~~~
delluminatus
I don't think this will happen, for two reasons.

1. The issue of what is "in your interest" is not so black and white. For
instance, one of the reasons Google is still so popular is because it more
consistently returns the results we want. On many DDG discussions (or Bing-
related discussions when it was still being discussed), this has been raised
as a deal-breaking issue for many people. Google's results are often more
relevant and useful. This is in part because of Google's vast store of
consumer data. Now, you may argue that their results are better for other
reasons, but the fact is that using their consumer data is an integral part of
their ranking algorithm, and _people like their ranking algorithm_.

2. On a more cynical note, it seems that governments and corporations are
aligned in their interest to collect as much data about citizens as possible.
I doubt that the U.S. government will mandate a reduction in storage of
consumer data, when they themselves benefit regularly from that data thanks to
widespread use of NSLs and other legal and extralegal demands.

~~~
ronaldx
1. Sure. My point is that I should have the right to choose what's in my
interest, rather than have a company tell me what's in my interest. At a
minimum, companies should explicitly say exactly what they are doing with my
data. That's part of our agreement, after all. I use DDG for this reason.
Actually I'm not super-happy with DDG in this respect: I believe there is some
shady blurring of their promotional message vs the small print. But I would
rather support DDG than Google, for now.

I don't think DDG needs to or should be catering for the people who find a
lack of super-personalised results deal-breaking:

a) the world certainly doesn't need DDG to ape Google. There needs to be more
competition in this space, and DDG's position distinguishes it. I hope and
believe there is a market for a variety of search engines (as there are around
the world).

b) in blind-testing, I'm not aware there is any evidence that Google does
better.

2. I hold out hope that the EU data protection principles will one day be
properly upheld within the EU. I have no such hope for the US government.

------
coldtea
> _There was an ad for the new Pixies album. This was the one ad that was well
> targeted; I love the Pixies. I got the torrent right away._

I laughed very hard at this!

In all, an excellent article. I disagree with blind faith in technology to
solve all our problems and not create new ones.

People often forget that technologies are tools (and not always neutral tools,
as another naive belief has it: some inventions have larger inherent "harm
potential"), and that policy matters as much, or even more, as does the kind
of cultural landscape in which we guide our use of the tools.

(Remember the classic xkcd comic: [http://xkcd.com/538/](http://xkcd.com/538/)
).

~~~
skj
> I laughed very hard at this!

Really? It made me think he was a bit of an asshole.

------
gammarator
The thesis the talk pivots around is this one, in my reading:

"Investor storytime only works if you can argue that advertising in the future
is going to be effective and lucrative in ways it just isn't today. If the
investors stop believing this, the money will dry up."

~~~
krakensden
It's worth noting that web advertising is still small potatoes compared to
more traditional channels- and the traditional channels make it astonishingly
difficult to measure effectiveness. There's an awful lot of money in selling
Coca-Cola, and Pinboard & Facebook haven't scratched the surface.

~~~
maxerickson
Google alone is ~10% of global advertising revenues. That's at least an
interestingly shaped potato.

[http://investor.google.com/financial/tables.html](http://investor.google.com/financial/tables.html)

(a nice friendly and solid number for global ad revenues is harder to come by,
but it's ~$500 billion)

------
woah
We read this, nod wisely, and go back to working on our centralized services
for VC's who hope to own part of a monopoly. "One day, this will change" we
think to ourselves.

------
L_Rahman
Still reading the talk, but as an aside wanted to point out that the way the
transcript is formatted with the words alongside the slides is probably the
best way I've seen a talk presented in text form on the internet.

~~~
alecdbrooks
He's uploaded talks in this format before. Thoreau 2.0 [0] is a particularly
good one.

[0]:
[https://static.pinboard.in/xoxo_talk_thoreau.htm](https://static.pinboard.in/xoxo_talk_thoreau.htm)

Video:
[https://www.youtube.com/watch?v=eky5uKILXtM](https://www.youtube.com/watch?v=eky5uKILXtM)

~~~
lemming
I also like "Our Comrade the electron":
[https://static.pinboard.in/webstock_2014.htm](https://static.pinboard.in/webstock_2014.htm)

I'm generally not a writing geek, but Maciej is one of the (very) few people
whose writing makes me wildly jealous.

~~~
L_Rahman
This might be my favorite of the three of his talks that I've read. Is there a
list with URLs out there of his talks?

~~~
lemming
Not that I'm aware of. His blog is also highly recommended, see for example
[http://idlewords.com/2010/03/scott_and_scurvy.htm](http://idlewords.com/2010/03/scott_and_scurvy.htm).

------
angersock
There's a rather hilarious portion (in an otherwise soul-crushing deck): the
author is trying to figure out what this massive dragnet and mining of their
information has actually gotten, and so they look at all of the ads they get
served. This brings forth this gem:

_"There was an ad for the new Pixies album. This was the one ad that was
well targeted; I love the Pixies. I got the torrent right away."_

~~~
jay_neyer
The thing I wondered was whether that was deliberate or not. Charles Duhigg
talks about this in his book on habits. Target has the data to identify
expecting mothers. Rather than bombarding them with strictly baby-related
photos, they carefully include other, seemingly irrelevant ads so it doesn't
come off as intrusive.

Although I will say that in my own ad experience, the ads have generally been
more related to my interests than this author describes.

~~~
thesteamboat
I thought I read something recently stating that the Target anecdote had been
somewhat overstated. Unfortunately, I could not find the article I had in mind
-- but here's a similar one.
[http://www.ft.com/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabd...](http://www.ft.com/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabdc0.html)

Choice quote:

Hearing the anecdote, it’s easy to assume that Target’s algorithms are
infallible – that everybody receiving coupons for onesies and wet wipes is
pregnant. This is vanishingly unlikely. Indeed, it could be that pregnant
women receive such offers merely because everybody on Target’s mailing list
receives such offers. We should not buy the idea that Target employs mind-
readers before considering how many misses attend each hit.

In Charles Duhigg’s account, Target mixes in random offers, such as coupons
for wine glasses, because pregnant customers would feel spooked if they
realised how intimately the company’s computers understood them.

Fung has another explanation: Target mixes up its offers not because it would
be weird to send an all-baby coupon-book to a woman who was pregnant but
because the company knows that many of those coupon books will be sent to
women who aren’t pregnant after all.

None of this suggests that such data analysis is worthless: it may be highly
profitable. Even a modest increase in the accuracy of targeted special offers
would be a prize worth winning. But profitability should not be conflated with
omniscience.

------
moultano
[https://support.google.com/accounts/answer/162743?hl=en](https://support.google.com/accounts/answer/162743?hl=en)

I thought it worth noting that Google does strip personal identifiers after 18
months which is in line with one of his proposed fixes.

~~~
ForHackernews
Not if you have a Google account, they don't.

> If you have Web History enabled, this data may also be stored in your Google
> Account until you delete the record of your search.

> When you create a Google Account, Google Web History is automatically turned
> on.[0]

So for everyone who has a Google account, and who hasn't taken specific steps
to disable "web history" (read: 99.9% of users) they're storing that data
indefinitely.

[0]
[https://support.google.com/accounts/answer/54068?rd=1](https://support.google.com/accounts/answer/54068?rd=1)

~~~
Centigonal
On that note:

The site you linked links to
[https://history.google.com/history/lookup](https://history.google.com/history/lookup)
, where you can see all of the searches Google associates to your account (if
you have web history enabled).

For me, this goes all the way back to 2006.

~~~
ForHackernews
If you turn it off, and then turn it back on again, are all your searches
still there?

~~~
jljljl
There is also a "Delete All" option.

Which is frustrating, I wish I could set it to just keep the last 3 months of
data, instead of having all or nothing.

------
corford
Fantastic as always. Every time I read one of Maciej's talks or essays I get a
little closer to throwing in the towel and pursuing a more meaningful
existence. It's going to happen one day and I can't wait to read the post that
forces it.

~~~
dkarapetyan
That post will never come. You have to do it starting now. There is no right
time and if you keep waiting for it then it will never happen.

------
marknutter
_This_ is how you share slides!

------
sirdogealot
Just a note about the whole "buying $500 triggered the police to be alerted
because of money laundering" notion...

The actual quote from the mashable article was:

>When her husband tried to buy $500 worth of Amazon gift cards with cash in
order to get a stroller, a notice at the Rite Aid counter said the company had
a legal obligation to report excessive transactions to the authorities.

So in reality, they made a big stink about the fact that they noticed that
Rite Aid practices safe KYC laws and would report suspicious money-laundering
activities to the government. As they must. Lest they expose themselves to
money laundering charges as well.

That isn't to say that the clerk who sold them the $500 gift cards immediately
picked up the phone as they were walking out the door and called the cops...
that just means that if they notice you regularly buying $10,000+ worth of
gift cards with cash only AND they just don't like the look of you in general,
that they may pick up the phone and call the police.

------
lightyrs
Ditto the kudos on the formatting. This piece really resonated with me. As for
solutions, I have none. Hopefully someone smarter and more resourceful than me
will be inspired by this talk.

------
thaumaturgy
This was excellent. It described some of the reluctance I've had towards
social networks since 2000 at least.

It's also a little bit funny that it was written by the guy behind
pinboard.in, a nice social bookmarking service (where many people went when
del.icio.us died). But that makes me trust the service more, not less.

Which probably means I am stupid.

~~~
abrowne
Pinboard does describe itself as "antisocial bookmarking" (and "Social
Bookmarking for Introverts"). Delicious seemed to me to be about making
bookmarking social. Pinboard seems to be about bookmarking with some social
options.

[Edited to add more context.]

------
bowlofpetunias
Much of what the author suggests in terms of regulation already exists in most
European countries, and most of it pre-dates the commercial rise of the
internet.

Sure, much of the wording and enforcement is lagging behind today's reality,
but the principles are clear: data about me is _my_ data. Companies are not
free to collect, collate and keep anything that they can get their hands on.

Facebook, Google et al are breaking the laws of countries they operate in on a
massive scale. The backlash is being tempered by massive lobbying from those
companies and the US government.

Not to mention the media propaganda (media are part of the advertising mafia),
as seen in the recent wave of scaremongering bullshit about the "right to be
forgotten" verdict.

~~~
fixermark
Can you expand a bit upon which media coverage of the "right to be forgotten"
verdict is scaremongering bullshit?

We may be tapping different media sources; most of what I've seen seems pretty
level.

------
gone35
_This festive map shows seismic hazard in Northern California, where pretty
much all the large Internet companies are based, along with a zillion
startups. The ones that aren't here have their headquarters in an even
deadlier zone up in Cascadia. (...)

So even if you don't agree with my politics, maybe you'll agree with my
geology. Let's not build a vast, distributed global network only to put
everything in one place!_

That slide[1] hits close to home. I'm painfully aware of how hard (and almost
pointless/powerless) it is to reason about long-term geological risks, esp
compared to less catastrophic and more (short-term) predictable hazards like
hurricanes, tornadoes or blizzards; but from time to time I idly question the
wisdom, from a civilizational point of view, of having so many concentrated,
incredibly talented people living directly atop one of the most dangerous
fault regions on earth[2].

But again, it's pointless to think about it as an individual, so better get
back to work and keep living day by day, I guess. _Wovon man nicht sprechen
kann..._

[1]
[https://static.pinboard.in/bt14/bt14.069.jpg](https://static.pinboard.in/bt14/bt14.069.jpg)

[2]
[http://peer.berkeley.edu/pdf/Senate_testimonial-8-07.pdf](http://peer.berkeley.edu/pdf/Senate_testimonial-8-07.pdf)

~~~
fixermark
Maciej is over-estimating the risk potential here. Or, to be more precise---
the companies that have headquarters in the Valley are aware of the risk
potential and have contingency plans to mitigate it.

Not that I'm against seeing some great software companies grow out of other
geographic locations; I'm merely noting that it's not necessary to do so to
avoid the threat of earthquake-related disruption. Threat known and accounted
for.

~~~
PhasmaFelis
Perhaps. An awful lot of wealthy and reputable companies thought they were
solidly prepared against cracking attempts, too. It's harder to take their
word for it now than it was five years ago.

------
nl
It's too bad that Maciej Ceglowski (the author) is banned on HN, over some
infraction I never understood.

~~~
Mithrandir
Apparently he _was_ banned for a short period of time, but not anymore.

[https://news.ycombinator.com/item?id=7617576](https://news.ycombinator.com/item?id=7617576)

[https://news.ycombinator.com/item?id=4108205](https://news.ycombinator.com/item?id=4108205)

~~~
nl
Ok - I clearly hadn't followed it closely enough.

Maciej - please come back.

------
dkarapetyan
One issue is that big data is just too big for the small minds that are tasked
with gaining insights from it. It is a really weird hysteria that I can't
explain. Even network engineers are starting to collect every packet and
archive it in some kind of distributed data store like HDFS. What insights are
they going to gain from it exactly? If the goal is security then work on
better network infrastructure and tools, e.g. libressl. Collecting all that
data is not going to get you any closer to making better/smarter networks or
allow you to fight DDoS attacks any better because the underlying network
infrastructure is what makes it possible in the first place.

------
bambax
Very well written, but a little unconvincing: he does a great job showing the
ads he's bombarded with on Youtube are completely irrelevant to his personal
situation.

Of course, it could be that advertisers are inept. That's what he implies.

But it could be, they don't have access to all the data he/we fear they do.
Maybe this data doesn't even exist in the first place, or is downright
unusable.

------
Karunamon
This is going to be a fairly controversial opinion, and I fully expect this
text to end up about one shade off white approximately five minutes after
posting (at least if my last few attempts mean anything, hence the following
wall of text), but I notice a few things, without fail, whenever this topic
comes up.

1) The advocates for all of these restrictions on data are incapable of doing
so without resorting to flawed arguments, if not outright scaremongering.

A couple selected pieces from this article:

* "If you've ever wondered why Facebook is such a joyless place.."

Can't say I have. Perhaps your Facebook interactions are all dour and joyless
because the people you interact with are dour and joyless online? In any case,
it's hardly proper to speak this opinion as if it were fact.

* Comparison of ad targeting data to the "pink files" collected for the express purpose of destroying LGBTs, or data collected by various secret police.

Why this is problematic is left as an exercise for the reader.

I'm going to coin a phrase here. You know "reductio ad absurdum"? I'm going to
call this "reductio ad missionem malum" - reduction to the worst case
scenario. A close relative of the slippery slope.

In this case, the thesis appears to be "Lots of data is collected, therefore
burn it all to the ground because it can be misused by the bad guys with
guns."

2) Their arguments are not followed to their logical conclusion.

Much hand wringing is done (not by the author necessarily, but in general)
about the world when everyone's "youthful indiscretions" and "every mistake"
are available online, permanently, for the world to see. Fast forward a decade
or so after that line is crossed. What happens when everyone has dirt on
everyone? Does that not greatly lessen this impact? It's pretty hard for the
guys in suits and dark glasses to blackmail you for something when _that
information is already out there_.

It would be a type of "post-privacy" society, in other words. It is
fundamentally different from what we have now, where we have a public
personality and a private personality. This carries some positives, some
negatives, and I feel it has yet to be discussed in an objective way.

3) Instead of advocating for greater privacy controls that make sense, they
instead advocate for measures like the EU's misaimed and "feel good" "right to
be forgotten" law.

Look at what the author advocates for:

* "Limit what kind of behavioral data websites can store. When I say behavioral data, I mean the kinds of things computers notice about you in passing—your search history, what you click on, what cell tower you're using."

I for one GREATLY LOOK FORWARD</s> to the day when bureaucrats tell me how my
nginx access logs must be formatted and stored (after all, they contain,
fairly explicitly "what you click on"). I also look forward with the same
enthusiasm to how such a thing will ever be enforced and to find out how much
money will be allocated to this particular measure.

I think this is approaching the problem from the wrong angle.

The author says that this is an implementation problem. I, for one, do not
want that implementation decided by people who don't understand how technology
works. Unfortunately, that seems to be the case whenever you have tech-by-
legislative-diktat.

Let me give you an example by way of the next item:

* "Enforce the right to delete. I should be able to delete my account and leave no trace in your system, modulo some reasonable allowance for backups."

Remember how I said the EU law was misaimed and "feel good"? A case where this
would do more harm than good: Hacker News. Or indeed any other discussion
forum or mailing list archive. Imagine a prolific and quality contributor
here, such as tpacek or even PG. Now imagine that for whatever reason,
one of these people wants to be "forgotten" and invokes this law.

Imagine what this would do to every single thread that person has ever
participated in. Context would be utterly annihilated. This would eviscerate,
in the most disgusting sense of the word, most any discussion forum.

I came up with this edge case in less than 60 seconds of thought, and I don't
even begin to rank on the list of smartest people on HN. If I can locate such
a problem with so little effort, that means both that the people who wrote
this law don't know what the fuck they're on about, and it also means that
worse edge cases probably exist.

I would say that the legislation needs to target behavior, not tech. Ensuring
that private data remains so even after mergers, acquisitions, etc? Excellent.
Penalties on companies that misstep? Great idea. Ensuring that government
types have to go through the full judicial processes (none of this secret-
court-rubber-stamp malarkey) to access this data? Awesome!

Making me destroy my website because someone wants to disappear themselves?
Less so.

~~~
lovemenot
The author is right in the diagnosis, but as you showed, wrong in his
prognosis. It's a serious problem and getting worse. I don't agree that when
everyone has dirt on everyone it all just nets to zero. Rather, that would be
a sick society. Sure, one ought to reserve judgement, but in practice people
usually crave judgement. For better or worse, we are categorising beasts.

My alternative prognosis is much, much more data. Instead of unenforceable
rules, I propose digital chaff. When everyone has near infinite information on
them from many and varied sources, if nearly all of it is bullshit only those
already in the know can discern the truth and even then with rapidly
diminishing confidence levels. Such chaff would poison many business models so
it'd be a disruptive change. My proposal gets to the heart of the author's
concern about the disparity between memory and storage, which I share.

------
quadrangle
This tells the truth.

------
twvance
Amazing post. Thank you!

------
pradeep89
> America built 75,000 kilometers of interstate highways

Liked the use of kilometers over miles

------
davidhariri
I really like this essay

------
quadrangle
One thing: this guy says he couldn't figure out how to block YouTube ads.
Ridiculous. It was years before I learned they even had any. Adblock Plus or
Adblock Edge both fully block them if you have EasyList (the default).

