
Yandex can now search for Facebook posts - iamtechaddict
http://en.itar-tass.com/non-political/714496
======
beagle3
Only the public posts.

However, Facebook has a long history of changing their privacy policy and
settings, and always in a way that (by default) makes public some things that
previously were private. Usually with a couple of weeks to a month notice, not
more.

So, unless you are diligent enough to review/delete your private stuff every
time Facebook makes such a change, it is likely that in the future, stuff you
marked "private" today will, in fact, be publicly indexed and searchable by
Yandex tomorrow.

Which is why, as has always been the case, you should assume everything you do
on Facebook is public. Either because Facebook will decide to make it that way
one day; or because of a security breach of some sort (hasn't happened yet, as
far as I know), or because NSA/MI5/Mossad has access to it and will use it
against you at the most inopportune moment (for you; it's a great moment for
them).

~~~
ctrl
This is ridiculous speculation. Source?

~~~
aroch
Well, they went from allowing you to have a private profile (i.e. not
searchable, with non-friends seeing no info) to making
name/age/gender/profpic/email/friends list public. There's quite some evidence
that Facebook is using private messages in targeted ads (meaning the ad buyer
now knows private information about you).

~~~
cmelbye
Please don't mislead people. Ad buyers do NOT receive any private information
from Facebook users when they run an ad. They tell Facebook what kind of user
they want to see their ads, and Facebook shows the ads to those particular
users. There is no exchange of information.

~~~
aroch
Pretty much what Beagle said. Targeted ads on Facebook are the social network
equivalent of a spear phishing email. It's trivial to target a very specific
set of people, or even just one individual, if you've done your research, and
then follow them around the web.

------
DangerousPie
The title makes this sound much more severe than it is. According to the
article they only get access to _public_ posts, i.e. posts they would have
been able to access by crawling the site anyway.

~~~
gojomo
Note, though, that Facebook's robots.txt and Terms-of-Service typically
prohibit the crawling of these 'public' posts.

Users shouldn't be surprised that others can see them; they may still be
surprised at the new level of discoverability by strangers and via other
sites.

Developers may still be a bit miffed at the mixed meaning of 'public' here –
world-readable to the extent it benefits Facebook (and negotiated partners),
but not allowing downstream automated indexing/analysis/excerpting by the
general public.

~~~
nly
Honoring robots.txt isn't a legal requirement, and I don't think a crawler can
read or agree to TOS. Personally I'm in favour of anything publicly accessible
being fair-game for indexing purposes, but this kind of news about database
access makes me uncomfortable.

~~~
gojomo
Re: _honoring robots.txt isn't a legal requirement_

In common-law jurisdictions (like the US), I wouldn't be so sure of that. It
has a lot of precedent behind it, as a longstanding convention for indicating
site-owner/rightsholder wishes. Ignoring it – deploying software designed or
configured to be oblivious to it – could create legal risk.

A literal reading of copyright law and laws about 'authorized' use of computer
systems would assess all bulk copying/reuse of web content without explicit
advance permission as illegal. It's the force of traditional/reasonable
industry practices (like robots.txt), and offsetting considerations like fair
use, that make it legally defensible.

------
startupclarity
I still think Yandex is underrated outside of Russia. Its webmaster tools[1]
seem to get better and better whereas Google's seem to remove more and more
information.

[1] [http://webmaster.yandex.com/](http://webmaster.yandex.com/)

------
na85
Surprised something like this hasn't happened sooner. That social networks
like FB sell personal user data to advertisers is the industry's worst-kept
secret.

At least now they're dropping any pretense that they give a rat's ass about
privacy.

~~~
ryanmerket
Facebook doesn't sell data to advertisers. Do you have a source for your
claim?

~~~
beagle3
Don't know if this is still the case, but in the past it was possible to pay
Facebook for ads targeting e.g. 25-30 year olds in the US, and color that ad
with a cookie, so that when you saw the same browser later, you'd know it was
a 25-30 year old from the US.

It went much deeper than that: marital status, month of year, a few other
things I can't remember. So, while they weren't directly offering and selling
this information as is, you could buy it from them by buying ad space on the
demographics you cared about. (Yes, this also required that you had other
access to ad networks, real-time exchanges, etc., but if you really wanted
info on people, it was chump change.)
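
The tagging trick described above can be sketched as a toy in-memory simulation. This is only an illustration under loud assumptions: `Browser`, `AdBuyer`, `serve_ad`, and `recognize` are hypothetical names, no real ad-platform or Facebook API is involved, and the platform's targeting is reduced to the caller passing in the campaign's segment.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Browser:
    """Stand-in for a user's browser; holds only the buyer's own cookies."""
    cookies: dict = field(default_factory=dict)


class AdBuyer:
    """Hypothetical ad buyer that tags every browser its creative loads in."""

    def __init__(self):
        self.segment_by_cookie = {}

    def serve_ad(self, browser, campaign_segment):
        # Assumption: the platform only shows this creative to users who match
        # campaign_segment, so any browser loading it is implicitly in it.
        cookie_id = browser.cookies.setdefault("buyer_id", uuid.uuid4().hex)
        self.segment_by_cookie[cookie_id] = campaign_segment

    def recognize(self, browser):
        # Later, on some other site, look the cookie up again.
        return self.segment_by_cookie.get(browser.cookies.get("buyer_id"))


buyer = AdBuyer()
targeted = Browser()
buyer.serve_ad(targeted, "US, age 25-30")

print(buyer.recognize(targeted))   # US, age 25-30
print(buyer.recognize(Browser()))  # None (never saw the targeted ad)
```

The platform never hands over a record here; the buyer infers the segment purely from the fact that its ad was displayed.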

------
ryanmerket
Access to the firehose != access to database. Bad reporting.

------
lucb1e
I have a Facebook account nowadays, mostly for private chats, reading friends'
statuses and a few groups with classmates, but any posts I make are strictly
set to public. I know I shouldn't expect them to be private on Facebook and
this forces me to think a little about what I post. I'm pseudonymous there,
since I don't use my real name on Facebook, but still.

------
antihero
What part of "public" do people not understand?

~~~
TallGuyShort
The part that is subject to change at Facebook's whim

~~~
res0nat0r
From public to "more public?"

~~~
TallGuyShort
No, from private to public. They are notorious for changing privacy policies
with effects that are not obvious to most users.

------
NigelTufnel
A year ago Yandex released an iOS app, "Wonder", that could search Facebook
and other social networks. Facebook blocked the app one month later.

Good to know that now Yandex and Facebook have signed an agreement.

------
jonknee
FWIW, here's Facebook's robots.txt file. While this was a firehose agreement
and not subject to robots.txt, it is an interesting look:

[https://facebook.com/robots.txt](https://facebook.com/robots.txt)

They give the Internet Archive the most access, but oddly go out of their way
to block their TOS and privacy policy from them and not anyone else. Sneaky.

Update: scratch that, I missed that they were Allow statements for
ia_archiver. The Internet Archive has by far the least access.

~~~
mapleoin
_They give the Internet Archive the most access, but oddly go out of their way
to block their TOS and privacy policy from them and not anyone else. Sneaky._

It's actually the opposite. They allow IA access to their terms and policy,
but block everything else:

    User-agent: ia_archiver
    Allow: /about/privacy
    Allow: /full_data_use_policy
    Allow: /legal/terms
    Allow: /policy.php
    Disallow: /
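
For anyone who wants to check how a crawler would read those ia_archiver rules, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It parses only the excerpt quoted above (not Facebook's full robots.txt), and the `/some.user/posts` path is a made-up example, not a real URL.

```python
from urllib.robotparser import RobotFileParser

# Only the ia_archiver group quoted in this thread, not the full file.
RULES = """\
User-agent: ia_archiver
Allow: /about/privacy
Allow: /full_data_use_policy
Allow: /legal/terms
Allow: /policy.php
Disallow: /
"""


def build_parser(rules_text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(RULES)
base = "https://facebook.com"

# The policy pages are explicitly allowed for ia_archiver.
print(parser.can_fetch("ia_archiver", base + "/legal/terms"))       # True

# Everything else falls through to "Disallow: /".
print(parser.can_fetch("ia_archiver", base + "/some.user/posts"))   # False
```

Python's parser applies the first matching rule in order, which is why the Allow lines have to come before `Disallow: /` for this to work.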

~~~
jonknee
Duh, all the others were Disallow statements and I just missed that the IA
ones were Allow. Thanks for the tip.

------
guard-of-terra
I guess it's only for Russian and Turkish-language posts.

~~~
slezyr
Right, most likely only Russian/Ukrainian languages. I don't think they will
index English posts.

------
linux_devil
I deactivated my account a long time ago; do I need to worry?

~~~
gtirloni
I don't know if you need to worry but you shouldn't assume that deactivating
your account means your data is not there anymore.

I deleted my FB account once and waited the amount of time they say is
needed for data to be deleted. In fact, I stayed out of FB for >6 months. When
I decided to create an account again using the same email address, at the
first login, FB was ready to remind me of all the friends I "probably knew"
(sure enough, they were all people I had added as friends before deleting the
account). So that information is unlikely to be deleted.

To get a somewhat vanilla experience I had to create an account using a
different email address. Then it behaved as if it didn't know me (too much).

------
staticelf
I deleted my Facebook account about a year ago now.

feelsgoodman.jpg

